Podcasts about open api

  • 154 PODCASTS
  • 229 EPISODES
  • 42m AVG DURATION
  • 1 WEEKLY EPISODE
  • Apr 3, 2025 LATEST



Best podcasts about open api

Latest podcast episodes about open api

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Today's guests, David Soria Parra and Justin Spahr-Summers, are the creators of Anthropic's Model Context Protocol (MCP). When we first wrote Why MCP Won, we had no idea how quickly it was about to win. In the past 4 weeks, OpenAI and now Google have announced MCP support, effectively confirming our prediction that MCP was the presumptive winner of the agent standard wars. MCP has now overtaken OpenAPI, the incumbent option and most direct alternative, in GitHub stars (3 months ahead of a conservative trendline). For protocol and history nerds, we also asked David and Justin to tell the origin story of MCP, which we leave to the reader to enjoy (you can also skim the transcripts, or the changelogs of a certain favored IDE). It's incredible the impact that individual engineers solving their own problems can have on an entire industry. Timestamps 00:00 Introduction and Guest Welcome 00:37 What is MCP? 02:00 The Origin Story of MCP 05:18 Development Challenges and Solutions 08:06 Technical Details and Inspirations 29:45 MCP vs OpenAPI 32:48 Building MCP Servers 40:39 Exploring Model Independence in LLMs 41:36 Building Richer Systems with MCP 43:13 Understanding Agents in MCP 45:45 Nesting and Tool Confusion in MCP 49:11 Client Control and Tool Invocation 52:08 Authorization and Trust in MCP Servers 01:01:34 Future Roadmap and Stateless Servers 01:10:07 Open Source Governance and Community Involvement 01:18:12 Wishlist and Closing Remarks

Torsion Talk Podcast
Torsion Talk S8 Ep101: Puerto Rico Recap, GDU Intensive Launch & Insight for Doors Demo Review

Torsion Talk Podcast

Apr 1, 2025 · 27:23


Fresh off an incredible summit in Puerto Rico, Ryan is back on Torsion Talk with major updates, new product announcements, and a detailed review of his demo with Insight for Doors! In this episode, he shares key takeaways from the summit, discusses the future of FSM software for his business, and breaks down the pros and cons of Insight for Doors compared to FieldPulse and ServiceTitan. Episode Highlights:

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

If you're in SF: Join us for the Claude Plays Pokemon hackathon this Sunday!If you're not: Fill out the 2025 State of AI Eng survey for $250 in Amazon cards!We are SO excited to share our conversation with Dharmesh Shah, co-founder of HubSpot and creator of Agent.ai.A particularly compelling concept we discussed is the idea of "hybrid teams" - the next evolution in workplace organization where human workers collaborate with AI agents as team members. Just as we previously saw hybrid teams emerge in terms of full-time vs. contract workers, or in-office vs. remote workers, Dharmesh predicts that the next frontier will be teams composed of both human and AI members. This raises interesting questions about team dynamics, trust, and how to effectively delegate tasks between human and AI team members.The discussion of business models in AI reveals an important distinction between Work as a Service (WaaS) and Results as a Service (RaaS), something Dharmesh has written extensively about. While RaaS has gained popularity, particularly in customer support applications where outcomes are easily measurable, Dharmesh argues that this model may be over-indexed. Not all AI applications have clearly definable outcomes or consistent economic value per transaction, making WaaS more appropriate in many cases. This insight is particularly relevant for businesses considering how to monetize AI capabilities.The technical challenges of implementing effective agent systems are also explored, particularly around memory and authentication. Shah emphasizes the importance of cross-agent memory sharing and the need for more granular control over data access. He envisions a future where users can selectively share parts of their data with different agents, similar to how OAuth works but with much finer control. 
This points to significant opportunities in developing infrastructure for secure and efficient agent-to-agent communication and data sharing.Other highlights from our conversation* The Evolution of AI-Powered Agents – Exploring how AI agents have evolved from simple chatbots to sophisticated multi-agent systems, and the role of MCPs in enabling that.* Hybrid Digital Teams and the Future of Work – How AI agents are becoming teammates rather than just tools, and what this means for business operations and knowledge work.* Memory in AI Agents – The importance of persistent memory in AI systems and how shared memory across agents could enhance collaboration and efficiency.* Business Models for AI Agents – Exploring the shift from software as a service (SaaS) to work as a service (WaaS) and results as a service (RaaS), and what this means for monetization.* The Role of Standards Like MCP – Why MCP has been widely adopted and how it enables agent collaboration, tool use, and discovery.* The Future of AI Code Generation and Software Engineering – How AI-assisted coding is changing the role of software engineers and what skills will matter most in the future.* Domain Investing and Efficient Markets – Dharmesh's approach to domain investing and how inefficiencies in digital asset markets create business opportunities.* The Philosophy of Saying No – Lessons from "Sorry, You Must Pass" and how prioritization leads to greater productivity and focus.Timestamps* 00:00 Introduction and Guest Welcome* 02:29 Dharmesh Shah's Journey into AI* 05:22 Defining AI Agents* 06:45 The Evolution and Future of AI Agents* 13:53 Graph Theory and Knowledge Representation* 20:02 Engineering Practices and Overengineering* 25:57 The Role of Junior Engineers in the AI Era* 28:20 Multi-Agent Systems and MCP Standards* 35:55 LinkedIn's Legal Battles and Data Scraping* 37:32 The Future of AI and Hybrid Teams* 39:19 Building Agent AI: A Professional Network for Agents* 40:43 Challenges and Innovations in Agent AI* 45:02 The Evolution of UI in AI Systems* 01:00:25 Business Models: Work as a Service vs. Results as a Service* 01:09:17 The Future Value of Engineers* 01:09:51 Exploring the Role of Agents* 01:10:28 The Importance of Memory in AI* 01:11:02 Challenges and Opportunities in AI Memory* 01:12:41 Selective Memory and Privacy Concerns* 01:13:27 The Evolution of AI Tools and Platforms* 01:18:23 Domain Names and AI Projects* 01:32:08 Balancing Work and Personal Life* 01:35:52 Final Thoughts and ReflectionsTranscriptAlessio [00:00:04]: Hey everyone, welcome back to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.swyx [00:00:12]: Hello, and today we're super excited to have Dharmesh Shah to join us. I guess your relevant title here is founder of Agent AI.Dharmesh [00:00:20]: Yeah, that's true for this. Yeah, creator of Agent.ai and co-founder of HubSpot.swyx [00:00:25]: Co-founder of HubSpot, which I followed for many years, I think 18 years now, gonna be 19 soon. And you caught, you know, people can catch up on your HubSpot story elsewhere. I should also thank Sean Puri, who I've chatted with back and forth, who's been, I guess, getting me in touch with your people. But also, I think like, just giving us a lot of context, because obviously, My First Million joined you guys, and they've been chatting with you guys a lot. So for the business side, we can talk about that, but I kind of wanted to engage your CTO, agent, engineer side of things. 
So how did you get agent religion?Dharmesh [00:01:00]: Let's see. So I've been working, I'll take like a half step back, a decade or so ago, even though actually more than that. So even before HubSpot, the company I was contemplating that I had named for was called Ingenisoft. And the idea behind Ingenisoft was a natural language interface to business software. Now realize this is 20 years ago, so that was a hard thing to do. But the actual use case that I had in mind was, you know, we had data sitting in business systems like a CRM or something like that. And my kind of what I thought clever at the time. Oh, what if we used email as the kind of interface to get to business software? And the motivation for using email is that it automatically works when you're offline. So imagine I'm getting on a plane or I'm on a plane. There was no internet on planes back then. It's like, oh, I'm going through business cards from an event I went to. I can just type things into an email just to have them all in the backlog. When it reconnects, it sends those emails to a processor that basically kind of parses effectively the commands and updates the software, sends you the file, whatever it is. And there was a handful of commands. I was a little bit ahead of the times in terms of what was actually possible. And I reattempted this natural language thing with a product called ChatSpot that I did back 20...swyx [00:02:12]: Yeah, this is your first post-ChatGPT project.Dharmesh [00:02:14]: I saw it come out. Yeah. And so I've always been kind of fascinated by this natural language interface to software. Because, you know, as software developers, myself included, we've always said, oh, we build intuitive, easy-to-use applications. And it's not intuitive at all, right? Because what we're doing is... We're taking the mental model that's in our head of what we're trying to accomplish with said piece of software and translating that into a series of touches and swipes and clicks and things like that. And there's nothing natural or intuitive about it. And so natural language interfaces, for the first time, you know, whatever the thought is you have in your head and expressed in whatever language that you normally use to talk to yourself in your head, you can just sort of emit that and have software do something. And I thought that was kind of a breakthrough, which it has been. And it's gone. So that's where I first started getting into the journey. I started because now it actually works, right? So once we got ChatGPT and you can take, even with a few-shot example, convert something into structured, even back in the ChatGP 3.5 days, it did a decent job in a few-shot example, convert something to structured text if you knew what kinds of intents you were going to have. And so that happened. And that ultimately became a HubSpot project. But then agents intrigued me because I'm like, okay, well, that's the next step here. So chat's great. Love Chat UX. But if we want to do something even more meaningful, it felt like the next kind of advancement is not this kind of, I'm chatting with some software in a kind of a synchronous back and forth model, is that software is going to do things for me in kind of a multi-step way to try and accomplish some goals. So, yeah, that's when I first got started. It's like, okay, what would that look like? Yeah. And I've been obsessed ever since, by the way.Alessio [00:03:55]: Which goes back to your first experience with it, which is like you're offline. Yeah. And you want to do a task. 
You don't need to do it right now. You just want to queue it up for somebody to do it for you. Yes. As you think about agents, like, let's start at the easy question, which is like, how do you define an agent? Maybe. You mean the hardest question in the universe? Is that what you mean?Dharmesh [00:04:12]: You said you have an irritating take. I do have an irritating take. I think, well, some number of people have been irritated, including within my own team. So I have a very broad definition for agents, which is it's AI-powered software that accomplishes a goal. Period. That's it. And what irritates people about it is like, well, that's so broad as to be completely non-useful. And I understand that. I understand the criticism. But in my mind, if you kind of fast forward months, I guess, in AI years, the implementation of it, and we're already starting to see this, and we'll talk about this, different kinds of agents, right? So I think in addition to having a usable definition, and I like yours, by the way, and we should talk more about that, that you just came out with, the classification of agents actually is also useful, which is, is it autonomous or non-autonomous? Does it have a deterministic workflow? Does it have a non-deterministic workflow? Is it working synchronously? Is it working asynchronously? Then you have the different kind of interaction modes. Is it a chat agent, kind of like a customer support agent would be? You're having this kind of back and forth. Is it a workflow agent that just does a discrete number of steps? So there's all these different flavors of agents. So if I were to draw it in a Venn diagram, I would draw a big circle that says, this is agents, and then I have a bunch of circles, some overlapping, because they're not mutually exclusive. And so I think that's what's interesting, and we're seeing development along a bunch of different paths, right? So if you look at the first implementation of agent frameworks, you look at Baby AGI and AutoGBT, I think it was, not Autogen, that's the Microsoft one. They were way ahead of their time because they assumed this level of reasoning and execution and planning capability that just did not exist, right? So it was an interesting thought experiment, which is what it was. Even the guy that, I'm an investor in Yohei's fund that did Baby AGI. It wasn't ready, but it was a sign of what was to come. And so the question then is, when is it ready? And so lots of people talk about the state of the art when it comes to agents. I'm a pragmatist, so I think of the state of the practical. It's like, okay, well, what can I actually build that has commercial value or solves actually some discrete problem with some baseline of repeatability or verifiability?swyx [00:06:22]: There was a lot, and very, very interesting. I'm not irritated by it at all. Okay. As you know, I take a... There's a lot of anthropological view or linguistics view. And in linguistics, you don't want to be prescriptive. You want to be descriptive. Yeah. So you're a goals guy. That's the key word in your thing. And other people have other definitions that might involve like delegated trust or non-deterministic work, LLM in the loop, all that stuff. The other thing I was thinking about, just the comment on Baby AGI, LGBT. Yeah. In that piece that you just read, I was able to go through our backlog and just kind of track the winter of agents and then the summer now. Yeah. And it's... We can tell the whole story as an oral history, just following that thread. 
And it's really just like, I think, I tried to explain the why now, right? Like I had, there's better models, of course. There's better tool use with like, they're just more reliable. Yep. Better tools with MCP and all that stuff. And I'm sure you have opinions on that too. Business model shift, which you like a lot. I just heard you talk about RAS with MFM guys. Yep. Cost is dropping a lot. Yep. Inference is getting faster. There's more model diversity. Yep. Yep. I think it's a subtle point. It means that like, you have different models with different perspectives. You don't get stuck in the basin of performance of a single model. Sure. You can just get out of it by just switching models. Yep. Multi-agent research and RL fine tuning. So I just wanted to let you respond to like any of that.Dharmesh [00:07:44]: Yeah. A couple of things. Connecting the dots on the kind of the definition side of it. So we'll get the irritation out of the way completely. I have one more, even more irritating leap on the agent definition thing. So here's the way I think about it. By the way, the kind of word agent, I looked it up, like the English dictionary definition. The old school agent, yeah. Is when you have someone or something that does something on your behalf, like a travel agent or a real estate agent acts on your behalf. It's like proxy, which is a nice kind of general definition. So the other direction I'm sort of headed, and it's going to tie back to tool calling and MCP and things like that, is if you, and I'm not a biologist by any stretch of the imagination, but we have these single-celled organisms, right? Like the simplest possible form of what one would call life. But it's still life. It just happens to be single-celled. And then you can combine cells and then cells become specialized over time. And you have much more sophisticated organisms, you know, kind of further down the spectrum. In my mind, at the most fundamental level, you can almost think of having atomic agents. What is the simplest possible thing that's an agent that can still be called an agent? What is the equivalent of a kind of single-celled organism? And the reason I think that's useful is right now we're headed down the road, which I think is very exciting around tool use, right? That says, okay, the LLMs now can be provided a set of tools that it calls to accomplish whatever it needs to accomplish in the kind of furtherance of whatever goal it's trying to get done. And I'm not overly bothered by it, but if you think about it, if you just squint a little bit and say, well, what if everything was an agent? And what if tools were actually just atomic agents? Because then it's turtles all the way down, right? Then it's like, oh, well, all that's really happening with tool use is that we have a network of agents that know about each other through something like an MMCP and can kind of decompose a particular problem and say, oh, I'm going to delegate this to this set of agents. And why do we need to draw this distinction between tools, which are functions most of the time? And an actual agent. And so I'm going to write this irritating LinkedIn post, you know, proposing this. It's like, okay. And I'm not suggesting we should call even functions, you know, call them agents. But there is a certain amount of elegance that happens when you say, oh, we can just reduce it down to one primitive, which is an agent that you can combine in complicated ways to kind of raise the level of abstraction and accomplish higher order goals. 
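A toy sketch of the "one primitive" idea Dharmesh describes above, where a tool is just the atomic, single-step case of an agent and bigger agents are compositions of smaller ones. This is purely illustrative Python, not anything from MCP or agent.ai; the Agent shape and helper names are made up.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    """One primitive: something that takes a goal and does work toward it."""
    name: str
    run: Callable[[str], str]


def atomic(name: str, fn: Callable[[str], str]) -> Agent:
    """A 'tool' is just the single-celled case: one function, one step."""
    return Agent(name=name, run=fn)


def composite(name: str, members: list[Agent]) -> Agent:
    """A higher-order agent that delegates the goal to other agents and merges results."""
    def run(goal: str) -> str:
        return "\n".join(f"{m.name}: {m.run(goal)}" for m in members)
    return Agent(name=name, run=run)


# Tools and agents share one shape, so they compose freely ("turtles all the way down").
word_count = atomic("word_count", lambda text: str(len(text.split())))
shout = atomic("shout", lambda text: text.upper())
team = composite("editor_team", [word_count, shout])
print(team.run("agents all the way down"))
```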
Anyway, that's my answer. I'd say that's a success. Thank you for coming to my TED Talk on agent definitions.Alessio [00:09:54]: How do you define the minimum viable agent? Do you already have a definition for, like, where you draw the line between a cell and an atom? Yeah.Dharmesh [00:10:02]: So in my mind, it has to, at some level, use AI in order for it to—otherwise, it's just software. It's like, you know, we don't need another word for that. And so that's probably where I draw the line. So then the question, you know, the counterargument would be, well, if that's true, then lots of tools themselves are actually not agents because they're just doing a database call or a REST API call or whatever it is they're doing. And that does not necessarily qualify them, which is a fair counterargument. And I accept that. It's like a good argument. I still like to think about—because we'll talk about multi-agent systems, because I think—so we've accepted, which I think is true, lots of people have said it, and you've hopefully combined some of those clips of really smart people saying this is the year of agents, and I completely agree, it is the year of agents. But then shortly after that, it's going to be the year of multi-agent systems or multi-agent networks. I think that's where it's going to be headed next year. Yeah.swyx [00:10:54]: Opening eyes already on that. Yeah. My quick philosophical engagement with you on this. I often think about kind of the other spectrum, the other end of the cell spectrum. So single cell is life, multi-cell is life, and you clump a bunch of cells together in a more complex organism, they become organs, like an eye and a liver or whatever. And then obviously we consider ourselves one life form. There's not like a lot of lives within me. I'm just one life. And now, obviously, I don't think people don't really like to anthropomorphize agents and AI. Yeah. But we are extending our consciousness and our brain and our functionality out into machines. I just saw you were a Bee. Yeah. Which is, you know, it's nice. I have a limitless pendant in my pocket.Dharmesh [00:11:37]: I got one of these boys. Yeah.swyx [00:11:39]: I'm testing it all out. You know, got to be early adopters. But like, we want to extend our personal memory into these things so that we can be good at the things that we're good at. And, you know, machines are good at it. Machines are there. So like, my definition of life is kind of like going outside of my own body now. I don't know if you've ever had like reflections on that. Like how yours. How our self is like actually being distributed outside of you. Yeah.Dharmesh [00:12:01]: I don't fancy myself a philosopher. But you went there. So yeah, I did go there. I'm fascinated by kind of graphs and graph theory and networks and have been for a long, long time. And to me, we're sort of all nodes in this kind of larger thing. It just so happens that we're looking at individual kind of life forms as they exist right now. But so the idea is when you put a podcast out there, there's these little kind of nodes you're putting out there of like, you know, conceptual ideas. Once again, you have varying kind of forms of those little nodes that are up there and are connected in varying and sundry ways. And so I just think of myself as being a node in a massive, massive network. And I'm producing more nodes as I put content or ideas. 
And, you know, you spend some portion of your life collecting dots, experiences, people, and some portion of your life then connecting dots from the ones that you've collected over time. And I found that really interesting things happen and you really can't know in advance how those dots are necessarily going to connect in the future. And that's, yeah. So that's my philosophical take. That's the, yes, exactly. Coming back.Alessio [00:13:04]: Yep. Do you like graph as an agent? Abstraction? That's been one of the hot topics with LandGraph and Pydantic and all that.Dharmesh [00:13:11]: I do. The thing I'm more interested in terms of use of graphs, and there's lots of work happening on that now, is graph data stores as an alternative in terms of knowledge stores and knowledge graphs. Yeah. Because, you know, so I've been in software now 30 plus years, right? So it's not 10,000 hours. It's like 100,000 hours that I've spent doing this stuff. And so I've grew up with, so back in the day, you know, I started on mainframes. There was a product called IMS from IBM, which is basically an index database, what we'd call like a key value store today. Then we've had relational databases, right? We have tables and columns and foreign key relationships. We all know that. We have document databases like MongoDB, which is sort of a nested structure keyed by a specific index. We have vector stores, vector embedding database. And graphs are interesting for a couple of reasons. One is, so it's not classically structured in a relational way. When you say structured database, to most people, they're thinking tables and columns and in relational database and set theory and all that. Graphs still have structure, but it's not the tables and columns structure. And you could wonder, and people have made this case, that they are a better representation of knowledge for LLMs and for AI generally than other things. So that's kind of thing number one conceptually, and that might be true, I think is possibly true. And the other thing that I really like about that in the context of, you know, I've been in the context of data stores for RAG is, you know, RAG, you say, oh, I have a million documents, I'm going to build the vector embeddings, I'm going to come back with the top X based on the semantic match, and that's fine. All that's very, very useful. But the reality is something gets lost in the chunking process and the, okay, well, those tend, you know, like, you don't really get the whole picture, so to speak, and maybe not even the right set of dimensions on the kind of broader picture. And it makes intuitive sense to me that if we did capture it properly in a graph form, that maybe that feeding into a RAG pipeline will actually yield better results for some use cases, I don't know, but yeah.Alessio [00:15:03]: And do you feel like at the core of it, there's this difference between imperative and declarative programs? Because if you think about HubSpot, it's like, you know, people and graph kind of goes hand in hand, you know, but I think maybe the software before was more like primary foreign key based relationship, versus now the models can traverse through the graph more easily.Dharmesh [00:15:22]: Yes. So I like that representation. There's something. It's just conceptually elegant about graphs and just from the representation of it, they're much more discoverable, you can kind of see it, there's observability to it, versus kind of embeddings, which you can't really do much with as a human. 
You know, once they're in there, you can't pull stuff back out. But yeah, I like that kind of idea of it. And the other thing that's kind of, because I love graphs, I've been long obsessed with PageRank from back in the early days. And, you know, one of the kind of simplest algorithms in terms of coming up, you know, with a phone, everyone's been exposed to PageRank. And the idea is that, and so I had this other idea for a project, not a company, and I have hundreds of these, called NodeRank, is to be able to take the idea of PageRank and apply it to an arbitrary graph that says, okay, I'm going to define what authority looks like and say, okay, well, that's interesting to me, because then if you say, I'm going to take my knowledge store, and maybe this person that contributed some number of chunks to the graph data store has more authority on this particular use case or prompt that's being submitted than this other one that may, or maybe this one was more. popular, or maybe this one has, whatever it is, there should be a way for us to kind of rank nodes in a graph and sort them in some, some useful way. Yeah.swyx [00:16:34]: So I think that's generally useful for, for anything. I think the, the problem, like, so even though at my conferences, GraphRag is super popular and people are getting knowledge, graph religion, and I will say like, it's getting space, getting traction in two areas, conversation memory, and then also just rag in general, like the, the, the document data. Yeah. It's like a source. Most ML practitioners would say that knowledge graph is kind of like a dirty word. The graph database, people get graph religion, everything's a graph, and then they, they go really hard into it and then they get a, they get a graph that is too complex to navigate. Yes. And so like the, the, the simple way to put it is like you at running HubSpot, you know, the power of graphs, the way that Google has pitched them for many years, but I don't suspect that HubSpot itself uses a knowledge graph. No. Yeah.Dharmesh [00:17:26]: So when is it over engineering? Basically? It's a great question. I don't know. So the question now, like in AI land, right, is the, do we necessarily need to understand? So right now, LLMs for, for the most part are somewhat black boxes, right? We sort of understand how the, you know, the algorithm itself works, but we really don't know what's going on in there and, and how things come out. So if a graph data store is able to produce the outcomes we want, it's like, here's a set of queries I want to be able to submit and then it comes out with useful content. Maybe the underlying data store is as opaque as a vector embeddings or something like that, but maybe it's fine. Maybe we don't necessarily need to understand it to get utility out of it. And so maybe if it's messy, that's okay. Um, that's, it's just another form of lossy compression. Uh, it's just lossy in a way that we just don't completely understand in terms of, because it's going to grow organically. Uh, and it's not structured. It's like, ah, we're just gonna throw a bunch of stuff in there. Let the, the equivalent of the embedding algorithm, whatever they called in graph land. Um, so the one with the best results wins. I think so. Yeah.swyx [00:18:26]: Or is this the practical side of me is like, yeah, it's, if it's useful, we don't necessarilyDharmesh [00:18:30]: need to understand it.swyx [00:18:30]: I have, I mean, I'm happy to push back as long as you want. 
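The NodeRank idea above is essentially PageRank run over an arbitrary knowledge graph with your own definition of authority. A generic power-iteration sketch with a made-up toy graph (this is not Dharmesh's actual project, just the textbook algorithm):

```python
def pagerank(graph: dict[str, list[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    """Rank nodes of an arbitrary directed graph by 'authority' via plain power iteration."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node, out_links in graph.items():
            if not out_links:  # dangling node: spread its rank evenly
                for n in nodes:
                    new_rank[n] += damping * rank[node] / len(nodes)
            else:
                for target in out_links:
                    new_rank[target] += damping * rank[node] / len(out_links)
        rank = new_rank
    return rank


# Toy knowledge graph: an edge means "this node lends authority to that node".
graph = {
    "alice": ["pricing_doc", "onboarding_doc"],
    "bob": ["pricing_doc"],
    "pricing_doc": ["onboarding_doc"],
    "onboarding_doc": [],
}
for node, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
    print(f"{node:15s} {score:.3f}")
```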
Uh, it's not practical to evaluate like the 10 different options out there because it takes time. It takes people, it takes, you know, resources, right? Set. That's the first thing. Second thing is your evals are typically on small things and some things only work at scale. Yup. Like graphs. Yup.Dharmesh [00:18:46]: Yup. That's, yeah, no, that's fair. And I think this is one of the challenges in terms of implementation of graph databases is that the most common approach that I've seen developers do, I've done it myself, is that, oh, I've got a Postgres database or a MySQL or whatever. I can represent a graph with a very set of tables with a parent child thing or whatever. And that sort of gives me the ability, uh, why would I need anything more than that? And the answer is, well, if you don't need anything more than that, you don't need anything more than that. But there's a high chance that you're sort of missing out on the actual value that, uh, the graph representation gives you. Which is the ability to traverse the graph, uh, efficiently in ways that kind of going through the, uh, traversal in a relational database form, even though structurally you have the data, practically you're not gonna be able to pull it out in, in useful ways. Uh, so you wouldn't like represent a social graph, uh, in, in using that kind of relational table model. It just wouldn't scale. It wouldn't work.swyx [00:19:36]: Uh, yeah. Uh, I think we want to move on to MCP. Yeah. But I just want to, like, just engineering advice. Yeah. Uh, obviously you've, you've, you've run, uh, you've, you've had to do a lot of projects and run a lot of teams. Do you have a general rule for over-engineering or, you know, engineering ahead of time? You know, like, because people, we know premature engineering is the root of all evil. Yep. But also sometimes you just have to. Yep. When do you do it? Yes.Dharmesh [00:19:59]: It's a great question. This is, uh, a question as old as time almost, which is what's the right and wrong levels of abstraction. That's effectively what, uh, we're answering when we're trying to do engineering. I tend to be a pragmatist, right? So here's the thing. Um, lots of times doing something the right way. Yeah. It's like a marginal increased cost in those cases. Just do it the right way. And this is what makes a, uh, a great engineer or a good engineer better than, uh, a not so great one. It's like, okay, all things being equal. If it's going to take you, you know, roughly close to constant time anyway, might as well do it the right way. Like, so do things well, then the question is, okay, well, am I building a framework as the reusable library? To what degree, uh, what am I anticipating in terms of what's going to need to change in this thing? Uh, you know, along what dimension? And then I think like a business person in some ways, like what's the return on calories, right? So, uh, and you look at, um, energy, the expected value of it's like, okay, here are the five possible things that could happen, uh, try to assign probabilities like, okay, well, if there's a 50% chance that we're going to go down this particular path at some day, like, or one of these five things is going to happen and it costs you 10% more to engineer for that. It's basically, it's something that yields a kind of interest compounding value. Um, as you get closer to the time of, of needing that versus having to take on debt, which is when you under engineer it, you're taking on debt. 
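Dharmesh's point above about modeling a graph as parent/child rows in a relational database can be made concrete: the edge table stores the structure fine, but multi-hop questions turn into self-joins and recursive queries that re-scan the table at every hop, which is exactly the traversal work a graph database is built for. A small sketch, using SQLite only to stay self-contained:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [("ann", "bob"), ("bob", "cara"), ("cara", "dev"), ("ann", "eve")],
)

# One hop is a simple lookup; two hops already means a self-join.
two_hops = conn.execute(
    """
    SELECT e1.parent, e2.child
    FROM edges e1 JOIN edges e2 ON e1.child = e2.parent
    WHERE e1.parent = 'ann'
    """
).fetchall()
print(two_hops)  # [('ann', 'cara')]

# Arbitrary depth needs a recursive CTE, and every extra hop re-scans the edge table.
reachable = conn.execute(
    """
    WITH RECURSIVE walk(node) AS (
        SELECT 'ann'
        UNION
        SELECT e.child FROM edges e JOIN walk w ON e.parent = w.node
    )
    SELECT node FROM walk
    """
).fetchall()
print(reachable)  # ann plus everything reachable from her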
You're going to have to pay off when you do get to that eventuality where something happens. One thing as a pragmatist, uh, so I would rather under engineer something than over engineer it. If I were going to err on the side of something, and here's the reason is that when you under engineer it, uh, yes, you take on tech debt, uh, but the interest rate is relatively known and payoff is very, very possible, right? Which is, oh, I took a shortcut here as a result of which now this thing that should have taken me a week is now going to take me four weeks. Fine. But if that particular thing that you thought might happen, never actually, you never have that use case transpire or just doesn't, it's like, well, you just save yourself time, right? And that has value because you were able to do other things instead of, uh, kind of slightly over-engineering it away, over-engineering it. But there's no perfect answers in art form in terms of, uh, and yeah, we'll, we'll bring kind of this layers of abstraction back on the code generation conversation, which we'll, uh, I think I have later on, butAlessio [00:22:05]: I was going to ask, we can just jump ahead quickly. Yeah. Like, as you think about vibe coding and all that, how does the. Yeah. Percentage of potential usefulness change when I feel like we over-engineering a lot of times it's like the investment in syntax, it's less about the investment in like arc exacting. Yep. Yeah. How does that change your calculus?Dharmesh [00:22:22]: A couple of things, right? One is, um, so, you know, going back to that kind of ROI or a return on calories, kind of calculus or heuristic you think through, it's like, okay, well, what is it going to cost me to put this layer of abstraction above the code that I'm writing now, uh, in anticipating kind of future needs. If the cost of fixing, uh, or doing under engineering right now. Uh, we'll trend towards zero that says, okay, well, I don't have to get it right right now because even if I get it wrong, I'll run the thing for six hours instead of 60 minutes or whatever. It doesn't really matter, right? Like, because that's going to trend towards zero to be able, the ability to refactor a code. Um, and because we're going to not that long from now, we're going to have, you know, large code bases be able to exist, uh, you know, as, as context, uh, for a code generation or a code refactoring, uh, model. So I think it's going to make it, uh, make the case for under engineering, uh, even stronger. Which is why I take on that cost. You just pay the interest when you get there, it's not, um, just go on with your life vibe coded and, uh, come back when you need to. Yeah.Alessio [00:23:18]: Sometimes I feel like there's no decision-making in some things like, uh, today I built a autosave for like our internal notes platform and I literally just ask them cursor. Can you add autosave? Yeah. I don't know if it's over under engineer. Yep. I just vibe coded it. Yep. And I feel like at some point we're going to get to the point where the models kindDharmesh [00:23:36]: of decide where the right line is, but this is where the, like the, in my mind, the danger is, right? So there's two sides to this. One is the cost of kind of development and coding and things like that stuff that, you know, we talk about. 
But then like in your example, you know, one of the risks that we have is that because adding a feature, uh, like a save or whatever the feature might be to a product as that price tends towards zero, are we going to be less discriminant about what features we add as a result of making more product products more complicated, which has a negative impact on the user and navigate negative impact on the business. Um, and so that's the thing I worry about if it starts to become too easy, are we going to be. Too promiscuous in our, uh, kind of extension, adding product extensions and things like that. It's like, ah, why not add X, Y, Z or whatever back then it was like, oh, we only have so many engineering hours or story points or however you measure things. Uh, that least kept us in check a little bit. Yeah.Alessio [00:24:22]: And then over engineering, you're like, yeah, it's kind of like you're putting that on yourself. Yeah. Like now it's like the models don't understand that if they add too much complexity, it's going to come back to bite them later. Yep. So they just do whatever they want to do. Yeah. And I'm curious where in the workflow that's going to be, where it's like, Hey, this is like the amount of complexity and over-engineering you can do before you got to ask me if we should actually do it versus like do something else.Dharmesh [00:24:45]: So you know, we've already, let's like, we're leaving this, uh, in the code generation world, this kind of compressed, um, cycle time. Right. It's like, okay, we went from auto-complete, uh, in the GitHub co-pilot to like, oh, finish this particular thing and hit tab to a, oh, I sort of know your file or whatever. I can write out a full function to you to now I can like hold a bunch of the context in my head. Uh, so we can do app generation, which we have now with lovable and bolt and repletage. Yeah. Association and other things. So then the question is, okay, well, where does it naturally go from here? So we're going to generate products. Make sense. We might be able to generate platforms as though I want a platform for ERP that does this, whatever. And that includes the API's includes the product and the UI, and all the things that make for a platform. There's no nothing that says we would stop like, okay, can you generate an entire software company someday? Right. Uh, with the platform and the monetization and the go-to-market and the whatever. And you know, that that's interesting to me in terms of, uh, you know, what, when you take it to almost ludicrous levels. of abstract.swyx [00:25:39]: It's like, okay, turn it to 11. You mentioned vibe coding, so I have to, this is a blog post I haven't written, but I'm kind of exploring it. Is the junior engineer dead?Dharmesh [00:25:49]: I don't think so. I think what will happen is that the junior engineer will be able to, if all they're bringing to the table is the fact that they are a junior engineer, then yes, they're likely dead. But hopefully if they can communicate with carbon-based life forms, they can interact with product, if they're willing to talk to customers, they can take their kind of basic understanding of engineering and how kind of software works. I think that has value. So I have a 14-year-old right now who's taking Python programming class, and some people ask me, it's like, why is he learning coding? And my answer is, is because it's not about the syntax, it's not about the coding. What he's learning is like the fundamental thing of like how things work. 
And there's value in that. I think there's going to be timeless value in systems thinking and abstractions and what that means. And whether functions manifested as math, which he's going to get exposed to regardless, or there are some core primitives to the universe, I think, that the more you understand them, those are what I would kind of think of as like really large dots in your life that will have a higher gravitational pull and value to them that you'll then be able to. So I want him to collect those dots, and he's not resisting. So it's like, okay, while he's still listening to me, I'm going to have him do things that I think will be useful.swyx [00:26:59]: You know, part of one of the pitches that I evaluated for AI engineer is a term. And the term is that maybe the traditional interview path or career path of software engineer goes away, which is because what's the point of lead code? Yeah. And, you know, it actually matters more that you know how to work with AI and to implement the things that you want. Yep.Dharmesh [00:27:16]: That's one of the like interesting things that's happened with generative AI. You know, you go from machine learning and the models and just that underlying form, which is like true engineering, right? Like the actual, what I call real engineering. I don't think of myself as a real engineer, actually. I'm a developer. But now with generative AI. We call it AI and it's obviously got its roots in machine learning, but it just feels like fundamentally different to me. Like you have the vibe. It's like, okay, well, this is just a whole different approach to software development to so many different things. And so I'm wondering now, it's like an AI engineer is like, if you were like to draw the Venn diagram, it's interesting because the cross between like AI things, generative AI and what the tools are capable of, what the models do, and this whole new kind of body of knowledge that we're still building out, it's still very young, intersected with kind of classic engineering, software engineering. Yeah.swyx [00:28:04]: I just described the overlap as it separates out eventually until it's its own thing, but it's starting out as a software. Yeah.Alessio [00:28:11]: That makes sense. So to close the vibe coding loop, the other big hype now is MCPs. Obviously, I would say Cloud Desktop and Cursor are like the two main drivers of MCP usage. I would say my favorite is the Sentry MCP. I can pull in errors and then you can just put the context in Cursor. How do you think about that abstraction layer? Does it feel... Does it feel almost too magical in a way? Do you think it's like you get enough? Because you don't really see how the server itself is then kind of like repackaging theDharmesh [00:28:41]: information for you? I think MCP as a standard is one of the better things that's happened in the world of AI because a standard needed to exist and absent a standard, there was a set of things that just weren't possible. Now, we can argue whether it's the best possible manifestation of a standard or not. Does it do too much? Does it do too little? I get that, but it's just simple enough to both be useful and unobtrusive. It's understandable and adoptable by mere mortals, right? It's not overly complicated. You know, a reasonable engineer can put a stand up an MCP server relatively easily. The thing that has me excited about it is like, so I'm a big believer in multi-agent systems. And so that's going back to our kind of this idea of an atomic agent. 
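For a sense of how low the bar is that Dharmesh mentions ("a reasonable engineer can stand up an MCP server relatively easily"), a minimal server following the shape of the official Python SDK's FastMCP quickstart might look roughly like this. The domain_valuation tool is a made-up example, and the SDK surface may have changed since this episode:

```python
# A sketch of a minimal MCP server, following the shape of the Python SDK's
# FastMCP quickstart; the domain_valuation tool is a made-up example.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")


@mcp.tool()
def domain_valuation(domain: str) -> str:
    """Return a rough (fake) valuation for a domain name."""
    estimate = max(100, 50_000 - 2_000 * len(domain))  # placeholder heuristic
    return f"{domain} is worth roughly ${estimate:,}"


if __name__ == "__main__":
    mcp.run()  # stdio transport by default, so clients like Claude Desktop or Cursor can launch it
```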
So imagine the MCP server, like obviously it calls tools, but the way I think about it, so I'm working on my current passion project is agent.ai. And we'll talk more about that in a little bit. More about the, I think we should, because I think it's interesting not to promote the project at all, but there's some interesting ideas in there. One of which is around, we're going to need a mechanism for, if agents are going to collaborate and be able to delegate, there's going to need to be some form of discovery and we're going to need some standard way. It's like, okay, well, I just need to know what this thing over here is capable of. We're going to need a registry, which Anthropic's working on. I'm sure others will and have been doing directories of, and there's going to be a standard around that too. How do you build out a directory of MCP servers? I think that's going to unlock so many things just because, and we're already starting to see it. So I think MCP or something like it is going to be the next major unlock because it allows systems that don't know about each other, don't need to, it's that kind of decoupling of like Sentry and whatever tools someone else was building. And it's not just about, you know, Cloud Desktop or things like, even on the client side, I think we're going to see very interesting consumers of MCP, MCP clients versus just the chat body kind of things. Like, you know, Cloud Desktop and Cursor and things like that. But yeah, I'm very excited about MCP in that general direction.swyx [00:30:39]: I think the typical cynical developer take, it's like, we have OpenAPI. Yeah. What's the new thing? I don't know if you have a, do you have a quick MCP versus everything else? Yeah.Dharmesh [00:30:49]: So it's, so I like OpenAPI, right? So just a descriptive thing. It's OpenAPI. OpenAPI. Yes, that's what I meant. So it's basically a self-documenting thing. We can do machine-generated, lots of things from that output. It's a structured definition of an API. I get that, love it. But MCPs sort of are kind of use case specific. They're perfect for exactly what we're trying to use them for around LLMs in terms of discovery. It's like, okay, I don't necessarily need to know kind of all this detail. And so right now we have, we'll talk more about like MCP server implementations, but We will? I think, I don't know. Maybe we won't. At least it's in my head. It's like a back processor. But I do think MCP adds value above OpenAPI. It's, yeah, just because it solves this particular thing. And if we had come to the world, which we have, like, it's like, hey, we already have OpenAPI. It's like, if that were good enough for the universe, the universe would have adopted it already. There's a reason why MCP is taking office because marginally adds something that was missing before and doesn't go too far. And so that's why the kind of rate of adoption, you folks have written about this and talked about it. Yeah, why MCP won. Yeah. And it won because the universe decided that this was useful and maybe it gets supplanted by something else. Yeah. And maybe we discover, oh, maybe OpenAPI was good enough the whole time. I doubt that.swyx [00:32:09]: The meta lesson, this is, I mean, he's an investor in DevTools companies. I work in developer experience at DevRel in DevTools companies. Yep. Everyone wants to own the standard. Yeah. I'm sure you guys have tried to launch your own standards. Actually, it's Houseplant known for a standard, you know, obviously inbound marketing. 
But is there a standard or protocol that you ever tried to push? No.Dharmesh [00:32:30]: And there's a reason for this. Yeah. Is that? And I don't mean, need to mean, speak for the people of HubSpot, but I personally. You kind of do. I'm not smart enough. That's not the, like, I think I have a. You're smart. Not enough for that. I'm much better off understanding the standards that are out there. And I'm more on the composability side. Let's, like, take the pieces of technology that exist out there, combine them in creative, unique ways. And I like to consume standards. I don't like to, and that's not that I don't like to create them. I just don't think I have the, both the raw wattage or the credibility. It's like, okay, well, who the heck is Dharmesh, and why should we adopt a standard he created?swyx [00:33:07]: Yeah, I mean, there are people who don't monetize standards, like OpenTelemetry is a big standard, and LightStep never capitalized on that.Dharmesh [00:33:15]: So, okay, so if I were to do a standard, there's two things that have been in my head in the past. I was one around, a very, very basic one around, I don't even have the domain, I have a domain for everything, for open marketing. Because the issue we had in HubSpot grew up in the marketing space. There we go. There was no standard around data formats and things like that. It doesn't go anywhere. But the other one, and I did not mean to go here, but I'm going to go here. It's called OpenGraph. I know the term was already taken, but it hasn't been used for like 15 years now for its original purpose. But what I think should exist in the world is right now, our information, all of us, nodes are in the social graph at Meta or the professional graph at LinkedIn. Both of which are actually relatively closed in actually very annoying ways. Like very, very closed, right? Especially LinkedIn. Especially LinkedIn. I personally believe that if it's my data, and if I would get utility out of it being open, I should be able to make my data open or publish it in whatever forms that I choose, as long as I have control over it as opt-in. So the idea is around OpenGraph that says, here's a standard, here's a way to publish it. I should be able to go to OpenGraph.org slash Dharmesh dot JSON and get it back. And it's like, here's your stuff, right? And I can choose along the way and people can write to it and I can prove. And there can be an entire system. And if I were to do that, I would do it as a... Like a public benefit, non-profit-y kind of thing, as this is a contribution to society. I wouldn't try to commercialize that. Have you looked at AdProto? What's that? AdProto.swyx [00:34:43]: It's the protocol behind Blue Sky. Okay. My good friend, Dan Abramov, who was the face of React for many, many years, now works there. And he actually did a talk that I can send you, which basically kind of tries to articulate what you just said. But he does, he loves doing these like really great analogies, which I think you'll like. Like, you know, a lot of our data is behind a handle, behind a domain. Yep. So he's like, all right, what if we flip that? What if it was like our handle and then the domain? Yep. So, and that's really like your data should belong to you. Yep. And I should not have to wait 30 days for my Twitter data to export. Yep.Dharmesh [00:35:19]: you should be able to at least be able to automate it or do like, yes, I should be able to plug it into an agentic thing. Yeah. Yes. I think we're... Because so much of our data is... 
Locked up. I think the trick here isn't that standard. It is getting the normies to care.swyx [00:35:37]: Yeah. Because normies don't care.Dharmesh [00:35:38]: That's true. But building on that, normies don't care. So, you know, privacy is a really hot topic and an easy word to use, but it's not a binary thing. Like there are use cases where, and we make these choices all the time, that I will trade, not all privacy, but I will trade some privacy for some productivity gain or some benefit to me that says, oh, I don't care about that particular data being online if it gives me this in return, or I don't mind sharing this information with this company.Alessio [00:36:02]: If I'm getting, you know, this in return, but that sort of should be my option. I think now with computer use, you can actually automate some of the exports. Yes. Like something we've been doing internally is like everybody exports their LinkedIn connections. Yep. And then internally, we kind of merge them together to see how we can connect our companies to customers or things like that.Dharmesh [00:36:21]: And not to pick on LinkedIn, but since we're talking about it, but they feel strongly enough on the, you know, do not take LinkedIn data that they will block even browser use kind of things or whatever. They go to great, great lengths, even to see patterns of usage. And it says, oh, there's no way you could have, you know, gotten that particular thing or whatever without, and it's, so it's, there's...swyx [00:36:42]: Wasn't there a Supreme Court case that they lost? Yeah.Dharmesh [00:36:45]: So the one they lost was around someone that was scraping public data that was on the public internet. And that particular company had not signed any terms of service or whatever. It's like, oh, I'm just taking data that's on, there was no, and so that's why they won. But now, you know, the question is around, can LinkedIn... I think they can. Like, when you use, as a user, you use LinkedIn, you are signing up for their terms of service. And if they say, well, this kind of use of your LinkedIn account that violates our terms of service, they can shut your account down, right? They can. And they, yeah, so, you know, we don't need to make this a discussion. By the way, I love the company, don't get me wrong. I'm an avid user of the product. You know, I've got... Yeah, I mean, you've got over a million followers on LinkedIn, I think. Yeah, I do. And I've known people there for a long, long time, right? And I have lots of respect. And I understand even where the mindset originally came from of this kind of members-first approach to, you know, a privacy-first. I sort of get that. But sometimes you sort of have to wonder, it's like, okay, well, that was 15, 20 years ago. There's likely some controlled ways to expose some data on some member's behalf and not just completely be a binary. It's like, no, thou shalt not have the data.swyx [00:37:54]: Well, just pay for sales navigator.Alessio [00:37:57]: Before we move to the next layer of instruction, anything else on MCP you mentioned? Let's move back and then I'll tie it back to MCPs.Dharmesh [00:38:05]: So I think the... Open this with agent. Okay, so I'll start with... Here's my kind of running thesis, is that as AI and agents evolve, which they're doing very, very quickly, we're going to look at them more and more. I don't like to anthropomorphize. We'll talk about why this is not that. Less as just like raw tools and more like teammates. They'll still be software. 
They should self-disclose as being software. I'm totally cool with that. But I think what's going to happen is that in the same way you might collaborate with a team member on Slack or Teams or whatever you use, you can imagine a series of agents that do specific things just like a team member might do, that you can delegate things to. You can collaborate. You can say, hey, can you take a look at this? Can you proofread that? Can you try this? You can... Whatever it happens to be. So I think it is... I will go so far as to say it's inevitable that we're going to have hybrid teams someday. And what I mean by hybrid teams... So back in the day, hybrid teams were, oh, well, you have some full-time employees and some contractors. Then it was like hybrid teams are some people that are in the office and some that are remote. That's the kind of form of hybrid. The next form of hybrid is like the carbon-based life forms and agents and AI and some form of software. So let's say we temporarily stipulate that I'm right about that over some time horizon that eventually we're going to have these kind of digitally hybrid teams. So if that's true, then the question you sort of ask yourself is that then what needs to exist in order for us to get the full value of that new model? It's like, okay, well... You sort of need to... It's like, okay, well, how do I... If I'm building a digital team, like, how do I... Just in the same way, if I'm interviewing for an engineer or a designer or a PM, whatever, it's like, well, that's why we have professional networks, right? It's like, oh, they have a presence on likely LinkedIn. I can go through that semi-structured, structured form, and I can see the experience of whatever, you know, self-disclosed. But, okay, well, agents are going to need that someday. And so I'm like, okay, well, this seems like a thread that's worth pulling on. That says, okay. So I... So agent.ai is out there. And it's LinkedIn for agents. It's LinkedIn for agents. It's a professional network for agents. And the more I pull on that thread, it's like, okay, well, if that's true, like, what happens, right? It's like, oh, well, they have a profile just like anyone else, just like a human would. It's going to be a graph underneath, just like a professional network would be. It's just that... And you can have its, you know, connections and follows, and agents should be able to post. That's maybe how they do release notes. Like, oh, I have this new version. Whatever they decide to post, it should just be able to... Behave as a node on the network of a professional network. As it turns out, the more I think about that and pull on that thread, the more and more things, like, start to make sense to me. So it may be more than just a pure professional network. So my original thought was, okay, well, it's a professional network and agents as they exist out there, which I think there's going to be more and more of, will kind of exist on this network and have the profile. But then, and this is always dangerous, I'm like, okay, I want to see a world where thousands of agents are out there in order for the... Because those digital employees, the digital workers don't exist yet in any meaningful way. And so then I'm like, oh, can I make that easier for, like... And so I have, as one does, it's like, oh, I'll build a low-code platform for building agents. How hard could that be, right? Like, very hard, as it turns out. But it's been fun. So now, agent.ai has 1.3 million users. 
3,000 people have actually, you know, built some variation of an agent, sometimes just for their own personal productivity. About 1,000 of which have been published. And the reason this comes back to MCP for me, so imagine that and other networks, since I know agent.ai. So right now, we have an MCP server for agent.ai that exposes all the internally built agents that we have that do, like, super useful things. Like, you know, I have access to a Twitter API that I can subsidize the cost. And I can say, you know, if you're looking to build something for social media, these kinds of things, with a single API key, and it's all completely free right now, I'm funding it. That's a useful way for it to work. And then we have a developer to say, oh, I have this idea. I don't have to worry about open AI. I don't have to worry about, now, you know, this particular model is better. It has access to all the models with one key. And we proxy it kind of behind the scenes. And then expose it. So then we get this kind of community effect, right? That says, oh, well, someone else may have built an agent to do X. Like, I have an agent right now that I built for myself to do domain valuation for website domains because I'm obsessed with domains, right? And, like, there's no efficient market for domains. There's no Zillow for domains right now that tells you, oh, here are what houses in your neighborhood sold for. It's like, well, why doesn't that exist? We should be able to solve that problem. And, yes, you're still guessing. Fine. There should be some simple heuristic. So I built that. It's like, okay, well, let me go look for past transactions. You say, okay, I'm going to type in agent.ai, agent.com, whatever domain. What's it actually worth? I'm looking at buying it. It can go and say, oh, which is what it does. It's like, I'm going to go look at are there any published domain transactions recently that are similar, either use the same word, same top-level domain, whatever it is. And it comes back with an approximate value, and it comes back with its kind of rationale for why it picked the value and comparable transactions. Oh, by the way, this domain sold for published. Okay. So that agent now, let's say, existed on the web, on agent.ai. Then imagine someone else says, oh, you know, I want to build a brand-building agent for startups and entrepreneurs to come up with names for their startup. Like a common problem, every startup is like, ah, I don't know what to call it. And so they type in five random words that kind of define whatever their startup is. And you can do all manner of things, one of which is like, oh, well, I need to find the domain for it. What are possible choices? Now it's like, okay, well, it would be nice to know if there's an aftermarket price for it, if it's listed for sale. Awesome. Then imagine calling this valuation agent. It's like, okay, well, I want to find where the arbitrage is, where the agent valuation tool says this thing is worth $25,000. It's listed on GoDaddy for $5,000. It's close enough. Let's go do that. Right? And that's a kind of composition use case that in my future state. Thousands of agents on the network, all discoverable through something like MCP. And then you as a developer of agents have access to all these kind of Lego building blocks based on what you're trying to solve. Then you blend in orchestration, which is getting better and better with the reasoning models now. Just describe the problem that you have. 
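The composition Dharmesh walks through above (a domain-valuation agent that a second, brand-naming agent discovers and calls as a building block) can be sketched in a few lines. This is a minimal, self-contained sketch assuming a simple in-process registry; the names, numbers, and return shapes are illustrative assumptions, not agent.ai's or MCP's actual interfaces.

```python
# Hypothetical sketch of agent composition: one agent (brand naming) discovers
# and calls another (domain valuation) as a tool from a shared registry.
from dataclasses import dataclass
from typing import Callable

@dataclass
class AgentTool:
    name: str
    description: str
    call: Callable[[str], dict]

def value_domain(domain: str) -> dict:
    # Stand-in for the valuation agent: look at comparable sales, return a rough estimate.
    comparables = [("agentive.ai", 18_000), ("agentsmith.com", 31_000)]
    estimate = sum(price for _, price in comparables) / len(comparables)
    return {"domain": domain, "estimate": estimate, "comparables": comparables}

REGISTRY = [
    AgentTool("domain_valuation", "Estimate what a website domain is worth", value_domain),
]

def naming_agent(candidate_domains: list[str], asking_price: float) -> list[dict]:
    # The second agent finds the valuation agent in the registry and calls it,
    # keeping only domains whose estimated value exceeds the asking price (the "arbitrage").
    valuer = next(t for t in REGISTRY if t.name == "domain_valuation")
    valuations = [valuer.call(d) for d in candidate_domains]
    return [v for v in valuations if v["estimate"] >= asking_price]

print(naming_agent(["agent.ai", "agenthub.io"], asking_price=5_000))
```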
Now, the next layer that we're all contending with is that how many tools can you actually give an LLM before the LLM breaks? That number used to be like 15 or 20 before you kind of started to vary dramatically. And so that's the thing I'm thinking about now. It's like, okay, if I want to... If I want to expose 1,000 of these agents to a given LLM, obviously I can't give it all 1,000. Is there some intermediate layer that says, based on your prompt, I'm going to make a best guess at which agents might be able to be helpful for this particular thing? Yeah.Alessio [00:44:37]: Yeah, like RAG for tools. Yep. I did build the Latent Space Researcher on agent.ai. Okay. Nice. Yeah, that seems like, you know, then there's going to be a Latent Space Scheduler. And then once I schedule a research, you know, and you build all of these things. By the way, my apologies for the user experience. You realize I'm an engineer. It's pretty good.swyx [00:44:56]: I think it's a normie-friendly thing. Yeah. That's your magic. HubSpot does the same thing.Alessio [00:45:01]: Yeah, just to like quickly run through it. You can basically create all these different steps. And these steps are like, you know, static versus like variable-driven things. How did you decide between this kind of like low-code-ish versus doing, you know, low-code with code backend versus like not exposing that at all? Any fun design decisions? Yeah. And this is, I think...Dharmesh [00:45:22]: I think lots of people are likely sitting in exactly my position right now, coming through the choosing between deterministic. Like if you're like in a business or building, you know, some sort of agentic thing, do you decide to do a deterministic thing? Or do you go non-deterministic and just let the alum handle it, right, with the reasoning models? The original idea and the reason I took the low-code stepwise, a very deterministic approach. A, the reasoning models did not exist at that time. That's thing number one. Thing number two is if you can get... If you know in your head... If you know in your head what the actual steps are to accomplish whatever goal, why would you leave that to chance? There's no upside. There's literally no upside. Just tell me, like, what steps do you need executed? So right now what I'm playing with... So one thing we haven't talked about yet, and people don't talk about UI and agents. Right now, the primary interaction model... Or they don't talk enough about it. I know some people have. But it's like, okay, so we're used to the chatbot back and forth. Fine. I get that. But I think we're going to move to a blend of... Some of those things are going to be synchronous as they are now. But some are going to be... Some are going to be async. It's just going to put it in a queue, just like... And this goes back to my... Man, I talk fast. But I have this... I only have one other speed. It's even faster. So imagine it's like if you're working... So back to my, oh, we're going to have these hybrid digital teams. Like, you would not go to a co-worker and say, I'm going to ask you to do this thing, and then sit there and wait for them to go do it. Like, that's not how the world works. So it's nice to be able to just, like, hand something off to someone. It's like, okay, well, maybe I expect a response in an hour or a day or something like that.Dharmesh [00:46:52]: In terms of when things need to happen. So the UI around agents. 
So if you look at the output of agent.ai agents right now, they are the simplest possible manifestation of a UI, right? That says, oh, we have inputs of, like, four different types. Like, we've got a dropdown, we've got multi-select, all the things. It's like back in HTML, the original HTML 1.0 days, right? Like, you're the smallest possible set of primitives for a UI. And it just says, okay, because we need to collect some information from the user, and then we go do steps and do things. And generate some output in HTML or markup are the two primary examples. So the thing I've been asking myself, if I keep going down that path. So people ask me, I get requests all the time. It's like, oh, can you make the UI sort of boring? I need to be able to do this, right? And if I keep pulling on that, it's like, okay, well, now I've built an entire UI builder thing. Where does this end? And so I think the right answer, and this is what I'm going to be backcoding once I get done here, is around injecting a code generation UI generation into, the agent.ai flow, right? As a builder, you're like, okay, I'm going to describe the thing that I want, much like you would do in a vibe coding world. But instead of generating the entire app, it's going to generate the UI that exists at some point in either that deterministic flow or something like that. It says, oh, here's the thing I'm trying to do. Go generate the UI for me. And I can go through some iterations. And what I think of it as a, so it's like, I'm going to generate the code, generate the code, tweak it, go through this kind of prompt style, like we do with vibe coding now. And at some point, I'm going to be happy with it. And I'm going to hit save. And that's going to become the action in that particular step. It's like a caching of the generated code that I can then, like incur any inference time costs. It's just the actual code at that point.Alessio [00:48:29]: Yeah, I invested in a company called E2B, which does code sandbox. And they powered the LM arena web arena. So it's basically the, just like you do LMS, like text to text, they do the same for like UI generation. So if you're asking a model, how do you do it? But yeah, I think that's kind of where.Dharmesh [00:48:45]: That's the thing I'm really fascinated by. So the early LLM, you know, we're understandably, but laughably bad at simple arithmetic, right? That's the thing like my wife, Normies would ask us, like, you call this AI, like it can't, my son would be like, it's just stupid. It can't even do like simple arithmetic. And then like we've discovered over time that, and there's a reason for this, right? It's like, it's a large, there's, you know, the word language is in there for a reason in terms of what it's been trained on. It's not meant to do math, but now it's like, okay, well, the fact that it has access to a Python interpreter that I can actually call at runtime, that solves an entire body of problems that it wasn't trained to do. And it's basically a form of delegation. And so the thought that's kind of rattling around in my head is that that's great. So it's, it's like took the arithmetic problem and took it first. Now, like anything that's solvable through a relatively concrete Python program, it's able to do a bunch of things that I couldn't do before. Can we get to the same place with UI? I don't know what the future of UI looks like in a agentic AI world, but maybe let the LLM handle it, but not in the classic sense. 
Maybe it generates it on the fly, or maybe we go through some iterations and hit cache or something like that. So it's a little bit more predictable. Uh, I don't know, but yeah.Alessio [00:49:48]: And especially when is the human supposed to intervene? So, especially if you're composing them, most of them should not have a UI because then they're just web hooking to somewhere else. I just want to touch back. I don't know if you have more comments on this.swyx [00:50:01]: I was just going to ask when you, you said you got, you're going to go back to code. What
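Stepping back from the transcript for a moment: the "intermediate layer" Dharmesh raises for narrowing a thousand agents down to a handful of relevant tools (what Alessio calls "RAG for tools") might look roughly like the sketch below. The registry, scoring function, and top-k cutoff are all illustrative assumptions rather than agent.ai's implementation; a real system would likely use embedding similarity instead of keyword overlap.

```python
# Hedged sketch of a tool-selection layer ("RAG for tools"): score every
# registered tool against the user prompt and expose only the top-k matches
# to the LLM. Keyword overlap stands in for an embedding search.
from dataclasses import dataclass

@dataclass
class Tool:
    name: str
    description: str

TOOLS = [
    Tool("domain_valuation", "estimate the market value of a website domain"),
    Tool("brand_namer", "suggest startup names and available domains"),
    Tool("tweet_writer", "draft social media posts for twitter"),
]

def score(prompt: str, tool: Tool) -> int:
    prompt_words = set(prompt.lower().split())
    return len(prompt_words & set(tool.description.lower().split()))

def select_tools(prompt: str, top_k: int = 2) -> list[Tool]:
    ranked = sorted(TOOLS, key=lambda t: score(prompt, t), reverse=True)
    return ranked[:top_k]

# Only the selected subset would be passed to the model as callable tools.
print([t.name for t in select_tools("what is this domain worth")])
```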

The Laundromat Millionaire Show with Dave Menz
Raising the Bar for Laundry Services with John Buni & David Griffith-Jones of CleanCloud

The Laundromat Millionaire Show with Dave Menz

Play Episode Listen Later Mar 26, 2025 68:48


What does it take to be successful in the laundry industry? More than just a great facility and a high-quality product, you must have the web presence for customers to see those offerings! At CleanCloud, co-founders John Buni & David Griffith-Jones work hard to create a software system that can help you get there. In this episode of The Laundromat Millionaire Show with Dave & Carla Menz, learn about the services CleanCloud can offer your laundry business and what we all see for the future of our industry.

Referenced Links:
Our Sponsor: H-M Company Drain Troughs: https://www.draintroughs.com
Eastern Funding: https://www.easternfunding.com/
Our Guests: CleanCloud: https://cleancloudapp.com/
Our Website: https://www.laundromatmillionaire.com
Our Online Course: https://dave-menz.mykajabi.com/sales-page
Our Youtube channel: https://youtube.com/c/LaundromatMillionaire
Our Podcast: https://laundromatmillionaire.com/podcast/
Our Facebook: https://www.facebook.com/laundromatmillionaire/
Our Facebook Group: https://www.facebook.com/groups/laundromatmillionaire
Our LinkedIn: https://www.linkedin.com/in/dave-laundromat-millionaire-menz/
Our Instagram: https://www.instagram.com/laundromatmillionaire/
Our laundromats: https://www.queencitylaundry.com
Our pick-up and delivery laundry services: https://www.queencitylaundry.com/delivery
Our WDF & Delivery Workshop: https://laundromatmillionaire.com/pick-up-delivery-workshop/
LaundroBoost Marketing Company: https://laundroboostmarketing.com/
Suggested Services Page: https://www.laundromatmillionaire.com/services
WDF & Delivery Dynamics: A Complete Business Blueprint: https://laundromatmillionaire.com/wdf-delivery-dynamics-a-business-blueprint/
Clean Show Registration: https://the-clean-show.us.messefrankfurt.com/us/en.html
Seinfeld clip: https://www.youtube.com/watch?v=KYyzm5AlovA
Jetsons clip: https://www.youtube.com/watch?v=1pphyvgd7-k
Folding Machine clip: https://www.youtube.com/watch?v=C76osXtpLeM
Open Source graphic: https://hackr.io/blog/what-is-open-source-software

Timestamps
0:00:00 Episode 95 Intro
0:03:07 Backstory of CleanCloud
0:08:07 Evolving from POS to Delivery too
0:09:58 Services CleanCloud Provides
0:13:31 Software Development & Attainable Marketing Strategies
0:21:33 Support Provided to CleanCloud Customers
0:24:57 Website Creation
0:27:25 Pricing Structure of CleanCloud
0:29:04 Biggest Differentiator – International
0:33:58 Training Opportunities on the Software
0:38:56 Ability to Integrate with Self Service Machines
0:43:12 Thoughts on Open Source Software
0:50:28 Advice for Store Owners
1:02:32 The Future of the Industry
1:06:01 Closing Remarks & Contact Info

PodRocket - A web development podcast from LogRocket
LLMs for web developers with Roy Derks

PodRocket - A web development podcast from LogRocket

Play Episode Listen Later Mar 6, 2025 28:45


Roy Derks, Developer Experience at IBM, talks about the integration of Large Language Models (LLMs) in web development. We explore practical applications such as building agents, automating QA testing, and the evolving role of AI frameworks in software development.

Links:
https://www.linkedin.com/in/gethackteam
https://www.youtube.com/@gethackteam
https://x.com/gethackteam
https://hackteam.io

We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com (mailto:emily.kochanekketner@logrocket.com), or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod).

Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers!

What does LogRocket do? LogRocket provides AI-first session replay and analytics that surface the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com. Try LogRocket for free today (https://logrocket.com/signup/?pdr).

Special Guest: Roy Derks.

Patoarchitekci
Short #65: Java vs Python AI Battle, AKS Network Tools, Database Future, OpenAPI Changes

Patoarchitekci

Play Episode Listen Later Feb 21, 2025 27:23


In the latest Short #65, we put Java and Python in the ring for an AI battle! Our PatoAI bot on Discord just got an upgrade and can now boast 76 episodes in its vector memory. Azure serves up new network tools for App Service, and DoorDash works out how to connect clusters cross-region. Meanwhile, PostgreSQL is quietly taking the database throne, and GitHub Copilot gets a ninja agent mode. Want to know whether Java actually stands a chance of kicking Python out of AI? Or curious what Alphabet will spend 75 billion dollars on? Play this episode and find out whether DeepSeek can withstand our queries! And now, no slacking off!

AWS Bites
139. Building Great APIs with Powertools

AWS Bites

Play Episode Listen Later Feb 19, 2025 24:32


In this episode, we discuss using AWS Lambda Powertools for Python to build serverless REST APIs with AWS Lambda. We cover the benefits of using Powertools for routing, validation, OpenAPI support, and more. Powertools provides an excellent framework for building APIs while maintaining Lambda best practices.

In this episode, we mentioned the following resources:
AWS Bites 41. How can Middy make writing Lambda functions easier? - https://awsbites.com/41-how-can-middy-make-writing-lambda-functions-easier
AWS Bites 120. Lambda Best Practices - https://awsbites.com/120-lambda-best-practices/
REST API - Powertools for AWS Lambda (Python) - https://docs.powertools.aws.dev/lambda/python/latest/core/event_handler/api_gateway/
Hono - https://hono.dev/
Fastify - https://fastify.dev/
Axum - https://github.com/tokio-rs/axum
FastAPI - https://fastapi.tiangolo.com/

Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on BlueSky or LinkedIn:
https://bsky.app/profile/eoin.sh | https://www.linkedin.com/in/eoins/
https://bsky.app/profile/loige.co | https://www.linkedin.com/in/lucianomammino/
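For a flavor of the event handler style discussed in the episode, here is a minimal sketch using Powertools' APIGatewayRestResolver; the route and payload are made up, and validation and OpenAPI generation are opt-in extras layered on top of the same resolver.

```python
# Minimal sketch of a Powertools-routed Lambda REST API (illustrative route/data).
from aws_lambda_powertools.event_handler import APIGatewayRestResolver
from aws_lambda_powertools.utilities.typing import LambdaContext

app = APIGatewayRestResolver()

@app.get("/todos/<todo_id>")
def get_todo(todo_id: str) -> dict:
    # Path parameters arrive as strings; the resolver handles routing and serialization.
    return {"id": todo_id, "title": "example todo"}

def lambda_handler(event: dict, context: LambdaContext) -> dict:
    # Single Lambda entry point: delegate to the resolver to dispatch the route.
    return app.resolve(event, context)
```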

Semaphore Uncut
Lorna Mitchell on OpenAPI in Design-First Development

Semaphore Uncut

Play Episode Listen Later Feb 18, 2025 25:20


A cornerstone of API development, OpenAPI offers a standardized format to define, design, and document APIs. Born as an open-source project and embraced by tech giants like Google, Microsoft, and IBM, OpenAPI isn't just a specification—it's a shift toward interoperability, transparency, and developer empowerment. In this conversation, Lorna Mitchell, a leading voice in API tooling and VP of Developer Experience at Redocly, sheds light on best practices, common pitfalls, and how teams can fully harness OpenAPI's potential.

Listen to the full episode or read the transcript on the Semaphore blog.

Like this episode? Be sure to leave a ⭐️⭐️⭐️⭐️⭐️ review on the podcast player of your choice and share it with your friends.
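As a concrete illustration of the design-first flow, a team might write a small OpenAPI document by hand before any code exists and then load it to drive docs, mocks, or code generation. The API below is hypothetical and not from the episode; the sketch assumes PyYAML is available.

```python
# A minimal, hand-written OpenAPI 3.0 document (design-first), loaded with PyYAML.
import yaml

SPEC = """
openapi: 3.0.3
info:
  title: Orders API
  version: 1.0.0
paths:
  /orders/{orderId}:
    get:
      summary: Fetch a single order
      parameters:
        - name: orderId
          in: path
          required: true
          schema:
            type: string
      responses:
        '200':
          description: The requested order
"""

spec = yaml.safe_load(SPEC)
# Downstream tooling (docs, mock servers, codegen) can all work from this one source of truth.
print(list(spec["paths"]))       # ['/orders/{orderId}']
print(spec["info"]["title"])     # Orders API
```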

Buongiorno da Edo
Europe puts 200 billion on the table for AI - Buongiorno 243

Buongiorno da Edo

Play Episode Listen Later Feb 13, 2025 9:19


How nice, today we also get to talk about some technical things, with OpenAPI; who knows if you'll like it.

Torsion Talk Podcast
Torsion Talk S8 Ep96: Finding the Best Field Service Management Software – The Journey Begins

Torsion Talk Podcast

Play Episode Listen Later Feb 11, 2025 30:52


In this episode of Torsion Talk, Ryan kicks off a new series dedicated to finding the best field service management software for garage door businesses and home service companies. If you've ever felt frustrated with your CRM, overwhelmed by rising software costs, or stuck with tools that don't fully meet your needs—this series is for you.

Ryan shares his first-hand experiences with platforms like ServiceTitan, Jobber, ServiceFusion, Housecall Pro, and others. He breaks down the must-haves for the ideal FSM software, the common pain points business owners face, and why so many companies feel stuck with overpriced, underperforming systems.

What You'll Learn in This Episode:
✅ Why most FSM software companies miss the mark—and what they should be focusing on.
✅ The biggest challenges in choosing the right software for your business.
✅ Why ServiceTitan left Ryan frustrated—and what he's looking for in a replacement.
✅ The key features every garage door business needs from an FSM platform.
✅ How AI and automation will reshape the future of field service management.

Key Discussion Points:

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Did you know that adding a simple Code Interpreter took o3 from 9.2% to 32% on FrontierMath? The Latent Space crew is hosting a hack night Feb 11th in San Francisco focused on CodeGen use cases, co-hosted with E2B and Edge AGI; watch E2B's new workshop and RSVP here!We're happy to announce that today's guest Samuel Colvin will be teaching his very first Pydantic AI workshop at the newly announced AI Engineer NYC Workshops day on Feb 22! 25 tickets left.If you're a Python developer, it's very likely that you've heard of Pydantic. Every month, it's downloaded >300,000,000 times, making it one of the top 25 PyPi packages. OpenAI uses it in its SDK for structured outputs, it's at the core of FastAPI, and if you've followed our AI Engineer Summit conference, Jason Liu of Instructor has given two great talks about it: “Pydantic is all you need” and “Pydantic is STILL all you need”. Now, Samuel Colvin has raised $17M from Sequoia to turn Pydantic from an open source project to a full stack AI engineer platform with Logfire, their observability platform, and PydanticAI, their new agent framework.Logfire: bringing OTEL to AIOpenTelemetry recently merged Semantic Conventions for LLM workloads which provides standard definitions to track performance like gen_ai.server.time_per_output_token. In Sam's view at least 80% of new apps being built today have some sort of LLM usage in them, and just like web observability platform got replaced by cloud-first ones in the 2010s, Logfire wants to do the same for AI-first apps. If you're interested in the technical details, Logfire migrated away from Clickhouse to Datafusion for their backend. We spent some time on the importance of picking open source tools you understand and that you can actually contribute to upstream, rather than the more popular ones; listen in ~43:19 for that part.Agents are the killer app for graphsPydantic AI is their attempt at taking a lot of the learnings that LangChain and the other early LLM frameworks had, and putting Python best practices into it. At an API level, it's very similar to the other libraries: you can call LLMs, create agents, do function calling, do evals, etc.They define an “Agent” as a container with a system prompt, tools, structured result, and an LLM. Under the hood, each Agent is now a graph of function calls that can orchestrate multi-step LLM interactions. You can start simple, then move toward fully dynamic graph-based control flow if needed.“We were compelled enough by graphs once we got them right that our agent implementation [...] 
is now actually a graph under the hood.”Why Graphs?* More natural for complex or multi-step AI workflows.* Easy to visualize and debug with mermaid diagrams.* Potential for distributed runs, or “waiting days” between steps in certain flows.In parallel, you see folks like Emil Eifrem of Neo4j talk about GraphRAG as another place where graphs fit really well in the AI stack, so it might be time for more people to take them seriously.Full Video EpisodeLike and subscribe!Chapters* 00:00:00 Introductions* 00:00:24 Origins of Pydantic* 00:05:28 Pydantic's AI moment * 00:08:05 Why build a new agents framework?* 00:10:17 Overview of Pydantic AI* 00:12:33 Becoming a believer in graphs* 00:24:02 God Model vs Compound AI Systems* 00:28:13 Why not build an LLM gateway?* 00:31:39 Programmatic testing vs live evals* 00:35:51 Using OpenTelemetry for AI traces* 00:43:19 Why they don't use Clickhouse* 00:48:34 Competing in the observability space* 00:50:41 Licensing decisions for Pydantic and LogFire* 00:51:48 Building Pydantic.run* 00:55:24 Marimo and the future of Jupyter notebooks* 00:57:44 London's AI sceneShow Notes* Sam Colvin* Pydantic* Pydantic AI* Logfire* Pydantic.run* Zod* E2B* Arize* Langsmith* Marimo* Prefect* GLA (Google Generative Language API)* OpenTelemetry* Jason Liu* Sebastian Ramirez* Bogomil Balkansky* Hood Chatham* Jeremy Howard* Andrew LambTranscriptAlessio [00:00:03]: Hey, everyone. Welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.Swyx [00:00:12]: Good morning. And today we're very excited to have Sam Colvin join us from Pydantic AI. Welcome. Sam, I heard that Pydantic is all we need. Is that true?Samuel [00:00:24]: I would say you might need Pydantic AI and Logfire as well, but it gets you a long way, that's for sure.Swyx [00:00:29]: Pydantic almost basically needs no introduction. It's almost 300 million downloads in December. And obviously, in the previous podcasts and discussions we've had with Jason Liu, he's been a big fan and promoter of Pydantic and AI.Samuel [00:00:45]: Yeah, it's weird because obviously I didn't create Pydantic originally for uses in AI, it predates LLMs. But it's like we've been lucky that it's been picked up by that community and used so widely.Swyx [00:00:58]: Actually, maybe we'll hear it. Right from you, what is Pydantic and maybe a little bit of the origin story?Samuel [00:01:04]: The best name for it, which is not quite right, is a validation library. And we get some tension around that name because it doesn't just do validation, it will do coercion by default. We now have strict mode, so you can disable that coercion. But by default, if you say you want an integer field and you get in a string of 1, 2, 3, it will convert it to 123 and a bunch of other sensible conversions. And as you can imagine, the semantics around it. Exactly when you convert and when you don't, it's complicated, but because of that, it's more than just validation. Back in 2017, when I first started it, the different thing it was doing was using type hints to define your schema. That was controversial at the time. It was genuinely disapproved of by some people. I think the success of Pydantic and libraries like FastAPI that build on top of it means that today that's no longer controversial in Python. And indeed, lots of other people have copied that route, but yeah, it's a data validation library. 
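The lax coercion and schema generation Samuel describes here fit in a few lines. A minimal sketch, assuming Pydantic v2 method names:

```python
from pydantic import BaseModel

class Item(BaseModel):
    quantity: int  # the type hint is the schema

print(Item(quantity="123").quantity)  # lax mode coerces the string "123" to the int 123
print(Item.model_json_schema())       # the same model emits a JSON Schema document
```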
It uses type hints for the for the most part and obviously does all the other stuff you want, like serialization on top of that. But yeah, that's the core.Alessio [00:02:06]: Do you have any fun stories on how JSON schemas ended up being kind of like the structure output standard for LLMs? And were you involved in any of these discussions? Because I know OpenAI was, you know, one of the early adopters. So did they reach out to you? Was there kind of like a structure output console in open source that people were talking about or was it just a random?Samuel [00:02:26]: No, very much not. So I originally. Didn't implement JSON schema inside Pydantic and then Sebastian, Sebastian Ramirez, FastAPI came along and like the first I ever heard of him was over a weekend. I got like 50 emails from him or 50 like emails as he was committing to Pydantic, adding JSON schema long pre version one. So the reason it was added was for OpenAPI, which is obviously closely akin to JSON schema. And then, yeah, I don't know why it was JSON that got picked up and used by OpenAI. It was obviously very convenient for us. That's because it meant that not only can you do the validation, but because Pydantic will generate you the JSON schema, it will it kind of can be one source of source of truth for structured outputs and tools.Swyx [00:03:09]: Before we dive in further on the on the AI side of things, something I'm mildly curious about, obviously, there's Zod in JavaScript land. Every now and then there is a new sort of in vogue validation library that that takes over for quite a few years and then maybe like some something else comes along. Is Pydantic? Is it done like the core Pydantic?Samuel [00:03:30]: I've just come off a call where we were redesigning some of the internal bits. There will be a v3 at some point, which will not break people's code half as much as v2 as in v2 was the was the massive rewrite into Rust, but also fixing all the stuff that was broken back from like version zero point something that we didn't fix in v1 because it was a side project. We have plans to move some of the basically store the data in Rust types after validation. Not completely. So we're still working to design the Pythonic version of it, in order for it to be able to convert into Python types. So then if you were doing like validation and then serialization, you would never have to go via a Python type we reckon that can give us somewhere between three and five times another three to five times speed up. That's probably the biggest thing. Also, like changing how easy it is to basically extend Pydantic and define how particular types, like for example, NumPy arrays are validated and serialized. But there's also stuff going on. And for example, Jitter, the JSON library in Rust that does the JSON parsing, has SIMD implementation at the moment only for AMD64. So we can add that. We need to go and add SIMD for other instruction sets. So there's a bunch more we can do on performance. I don't think we're going to go and revolutionize Pydantic, but it's going to continue to get faster, continue, hopefully, to allow people to do more advanced things. We might add a binary format like CBOR for serialization for when you'll just want to put the data into a database and probably load it again from Pydantic. So there are some things that will come along, but for the most part, it should just get faster and cleaner.Alessio [00:05:04]: From a focus perspective, I guess, as a founder too, how did you think about the AI interest rising? 
And then how do you kind of prioritize, okay, this is worth going into more, and we'll talk about Pydantic AI and all of that. What was maybe your early experience with LLAMP, and when did you figure out, okay, this is something we should take seriously and focus more resources on it?Samuel [00:05:28]: I'll answer that, but I'll answer what I think is a kind of parallel question, which is Pydantic's weird, because Pydantic existed, obviously, before I was starting a company. I was working on it in my spare time, and then beginning of 22, I started working on the rewrite in Rust. And I worked on it full-time for a year and a half, and then once we started the company, people came and joined. And it was a weird project, because that would never go away. You can't get signed off inside a startup. Like, we're going to go off and three engineers are going to work full-on for a year in Python and Rust, writing like 30,000 lines of Rust just to release open-source-free Python library. The result of that has been excellent for us as a company, right? As in, it's made us remain entirely relevant. And it's like, Pydantic is not just used in the SDKs of all of the AI libraries, but I can't say which one, but one of the big foundational model companies, when they upgraded from Pydantic v1 to v2, their number one internal model... The metric of performance is time to first token. That went down by 20%. So you think about all of the actual AI going on inside, and yet at least 20% of the CPU, or at least the latency inside requests was actually Pydantic, which shows like how widely it's used. So we've benefited from doing that work, although it didn't, it would have never have made financial sense in most companies. In answer to your question about like, how do we prioritize AI, I mean, the honest truth is we've spent a lot of the last year and a half building. Good general purpose observability inside LogFire and making Pydantic good for general purpose use cases. And the AI has kind of come to us. Like we just, not that we want to get away from it, but like the appetite, uh, both in Pydantic and in LogFire to go and build with AI is enormous because it kind of makes sense, right? Like if you're starting a new greenfield project in Python today, what's the chance that you're using GenAI 80%, let's say, globally, obviously it's like a hundred percent in California, but even worldwide, it's probably 80%. Yeah. And so everyone needs that stuff. And there's so much yet to be figured out so much like space to do things better in the ecosystem in a way that like to go and implement a database that's better than Postgres is a like Sisyphean task. Whereas building, uh, tools that are better for GenAI than some of the stuff that's about now is not very difficult. Putting the actual models themselves to one side.Alessio [00:07:40]: And then at the same time, then you released Pydantic AI recently, which is, uh, um, you know, agent framework and early on, I would say everybody like, you know, Langchain and like, uh, Pydantic kind of like a first class support, a lot of these frameworks, we're trying to use you to be better. What was the decision behind we should do our own framework? Were there any design decisions that you disagree with any workloads that you think people didn't support? Well,Samuel [00:08:05]: it wasn't so much like design and workflow, although I think there were some, some things we've done differently. Yeah. 
I think looking in general at the ecosystem of agent frameworks, the engineering quality is far below that of the rest of the Python ecosystem. There's a bunch of stuff that we have learned how to do over the last 20 years of building Python libraries and writing Python code that seems to be abandoned by people when they build agent frameworks. Now I can kind of respect that, particularly in the very first agent frameworks, like Langchain, where they were literally figuring out how to go and do this stuff. It's completely understandable that you would like basically skip some stuff.Samuel [00:08:42]: I'm shocked by the like quality of some of the agent frameworks that have come out recently from like well-respected names, which it just seems to be opportunism and I have little time for that, but like the early ones, like I think they were just figuring out how to do stuff and just as lots of people have learned from Pydantic, we were able to learn a bit from them. I think from like the gap we saw and the thing we were frustrated by was the production readiness. And that means things like type checking, even if type checking makes it hard. Like Pydantic AI, I will put my hand up now and say it has a lot of generics and you need to, it's probably easier to use it if you've written a bit of Rust and you really understand generics, but like, and that is, we're not claiming that that makes it the easiest thing to use in all cases, we think it makes it good for production applications in big systems where type checking is a no-brainer in Python. But there are also a bunch of stuff we've learned from maintaining Pydantic over the years that we've gone and done. So every single example in Pydantic AI's documentation is run on Python. As part of tests and every single print output within an example is checked during tests. So it will always be up to date. And then a bunch of things that, like I say, are standard best practice within the rest of the Python ecosystem, but I'm not followed surprisingly by some AI libraries like coverage, linting, type checking, et cetera, et cetera, where I think these are no-brainers, but like weirdly they're not followed by some of the other libraries.Alessio [00:10:04]: And can you just give an overview of the framework itself? I think there's kind of like the. LLM calling frameworks, there are the multi-agent frameworks, there's the workflow frameworks, like what does Pydantic AI do?Samuel [00:10:17]: I glaze over a bit when I hear all of the different sorts of frameworks, but I like, and I will tell you when I built Pydantic, when I built Logfire and when I built Pydantic AI, my methodology is not to go and like research and review all of the other things. I kind of work out what I want and I go and build it and then feedback comes and we adjust. So the fundamental building block of Pydantic AI is agents. The exact definition of agents and how you want to define them. is obviously ambiguous and our things are probably sort of agent-lit, not that we would want to go and rename them to agent-lit, but like the point is you probably build them together to build something and most people will call an agent. So an agent in our case has, you know, things like a prompt, like system prompt and some tools and a structured return type if you want it, that covers the vast majority of cases. There are situations where you want to go further and the most complex workflows where you want graphs and I resisted graphs for quite a while. 
I was sort of of the opinion you didn't need them and you could use standard like Python flow control to do all of that stuff. I had a few arguments with people, but I basically came around to, yeah, I can totally see why graphs are useful. But then we have the problem that by default, they're not type safe because if you have a like add edge method where you give the names of two different edges, there's no type checking, right? Even if you go and do some, I'm not, not all the graph libraries are AI specific. So there's a, there's a graph library called, but it allows, it does like a basic runtime type checking. Ironically using Pydantic to try and make up for the fact that like fundamentally that graphs are not typed type safe. Well, I like Pydantic, but it did, that's not a real solution to have to go and run the code to see if it's safe. There's a reason that starting type checking is so powerful. And so we kind of, from a lot of iteration eventually came up with a system of using normally data classes to define nodes where you return the next node you want to call and where we're able to go and introspect the return type of a node to basically build the graph. And so the graph is. Yeah. Inherently type safe. And once we got that right, I, I wasn't, I'm incredibly excited about graphs. I think there's like masses of use cases for them, both in gen AI and other development, but also software's all going to have interact with gen AI, right? It's going to be like web. There's no longer be like a web department in a company is that there's just like all the developers are building for web building with databases. The same is going to be true for gen AI.Alessio [00:12:33]: Yeah. I see on your docs, you call an agent, a container that contains a system prompt function. Tools, structure, result, dependency type model, and then model settings. Are the graphs in your mind, different agents? Are they different prompts for the same agent? What are like the structures in your mind?Samuel [00:12:52]: So we were compelled enough by graphs once we got them right, that we actually merged the PR this morning. That means our agent implementation without changing its API at all is now actually a graph under the hood as it is built using our graph library. So graphs are basically a lower level tool that allow you to build these complex workflows. Our agents are technically one of the many graphs you could go and build. And we just happened to build that one for you because it's a very common, commonplace one. But obviously there are cases where you need more complex workflows where the current agent assumptions don't work. And that's where you can then go and use graphs to build more complex things.Swyx [00:13:29]: You said you were cynical about graphs. What changed your mind specifically?Samuel [00:13:33]: I guess people kept giving me examples of things that they wanted to use graphs for. And my like, yeah, but you could do that in standard flow control in Python became a like less and less compelling argument to me because I've maintained those systems that end up with like spaghetti code. And I could see the appeal of this like structured way of defining the workflow of my code. And it's really neat that like just from your code, just from your type hints, you can get out a mermaid diagram that defines exactly what can go and happen.Swyx [00:14:00]: Right. Yeah. You do have very neat implementation of sort of inferring the graph from type hints, I guess. Yeah. Is what I would call it. Yeah. 
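A self-contained sketch of the pattern being described, not the actual Pydantic AI graph API: nodes are dataclasses, each run() returns the next node, and the edges are recovered by introspecting the annotated return types. All class and field names here are invented for illustration.

```python
# Sketch: infer a graph's edges from return type hints, then run it by
# repeatedly calling a node and following the node it returns.
from __future__ import annotations
import typing
from dataclasses import dataclass

@dataclass
class End:
    value: str

@dataclass
class Draft:
    topic: str
    def run(self) -> Review:
        return Review(text=f"draft about {self.topic}")

@dataclass
class Review:
    text: str
    def run(self) -> End:
        return End(value=self.text.upper())

def edges(*nodes: type) -> list[tuple[str, str]]:
    # Read the annotated return type of each node's run() to infer the edges.
    found = []
    for node in nodes:
        hints = typing.get_type_hints(node.run)
        found.append((node.__name__, hints["return"].__name__))
    return found

def run_graph(start) -> str:
    node = start
    while not isinstance(node, End):  # call a node, get a node back, repeat until End
        node = node.run()
    return node.value

print(edges(Draft, Review))              # [('Draft', 'Review'), ('Review', 'End')]
print(run_graph(Draft(topic="graphs")))
```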
I think the question always is I have gone back and forth. I used to work at Temporal where we would actually spend a lot of time complaining about graph based workflow solutions like AWS step functions. And we would actually say that we were better because you could use normal control flow that you already knew and worked with. Yours, I guess, is like a little bit of a nice compromise. Like it looks like normal Pythonic code. But you just have to keep in mind what the type hints actually mean. And that's what we do with the quote unquote magic that the graph construction does.Samuel [00:14:42]: Yeah, exactly. And if you look at the internal logic of actually running a graph, it's incredibly simple. It's basically call a node, get a node back, call that node, get a node back, call that node. If you get an end, you're done. We will add in soon support for, well, basically storage so that you can store the state between each node that's run. And then the idea is you can then distribute the graph and run it across computers. And also, I mean, the other weird, the other bit that's really valuable is across time. Because it's all very well if you look at like lots of the graph examples that like Claude will give you. If it gives you an example, it gives you this lovely enormous mermaid chart of like the workflow, for example, managing returns if you're an e-commerce company. But what you realize is some of those lines are literally one function calls another function. And some of those lines are wait six days for the customer to print their like piece of paper and put it in the post. And if you're writing like your demo. Project or your like proof of concept, that's fine because you can just say, and now we call this function. But when you're building when you're in real in real life, that doesn't work. And now how do we manage that concept to basically be able to start somewhere else in the in our code? Well, this graph implementation makes it incredibly easy because you just pass the node that is the start point for carrying on the graph and it continues to run. So it's things like that where I was like, yeah, I can just imagine how things I've done in the past would be fundamentally easier to understand if we had done them with graphs.Swyx [00:16:07]: You say imagine, but like right now, this pedantic AI actually resume, you know, six days later, like you said, or is this just like a theoretical thing we can go someday?Samuel [00:16:16]: I think it's basically Q&A. So there's an AI that's asking the user a question and effectively you then call the CLI again to continue the conversation. And it basically instantiates the node and calls the graph with that node again. Now, we don't have the logic yet for effectively storing state in the database between individual nodes that we're going to add soon. But like the rest of it is basically there.Swyx [00:16:37]: It does make me think that not only are you competing with Langchain now and obviously Instructor, and now you're going into sort of the more like orchestrated things like Airflow, Prefect, Daxter, those guys.Samuel [00:16:52]: Yeah, I mean, we're good friends with the Prefect guys and Temporal have the same investors as us. And I'm sure that my investor Bogomol would not be too happy if I was like, oh, yeah, by the way, as well as trying to take on Datadog. We're also going off and trying to take on Temporal and everyone else doing that. Obviously, we're not doing all of the infrastructure of deploying that right yet, at least. 
We're, you know, we're just building a Python library. And like what's crazy about our graph implementation is, sure, there's a bit of magic in like introspecting the return type, you know, extracting things from unions, stuff like that. But like the actual calls, as I say, is literally call a function and get back a thing and call that. It's like incredibly simple and therefore easy to maintain. The question is, how useful is it? Well, I don't know yet. I think we have to go and find out. We have a whole. We've had a slew of people joining our Slack over the last few days and saying, tell me how good Pydantic AI is. How good is Pydantic AI versus Langchain? And I refuse to answer. That's your job to go and find that out. Not mine. We built a thing. I'm compelled by it, but I'm obviously biased. The ecosystem will work out what the useful tools are.Swyx [00:17:52]: Bogomol was my board member when I was at Temporal. And I think I think just generally also having been a workflow engine investor and participant in this space, it's a big space. Like everyone needs different functions. I think the one thing that I would say like yours, you know, as a library, you don't have that much control of it over the infrastructure. I do like the idea that each new agents or whatever or unit of work, whatever you call that should spin up in this sort of isolated boundaries. Whereas yours, I think around everything runs in the same process. But you ideally want to sort of spin out its own little container of things.Samuel [00:18:30]: I agree with you a hundred percent. And we will. It would work now. Right. As in theory, you're just like as long as you can serialize the calls to the next node, you just have to all of the different containers basically have to have the same the same code. I mean, I'm super excited about Cloudflare workers running Python and being able to install dependencies. And if Cloudflare could only give me my invitation to the private beta of that, we would be exploring that right now because I'm super excited about that as a like compute level for some of this stuff where exactly what you're saying, basically. You can run everything as an individual. Like worker function and distribute it. And it's resilient to failure, et cetera, et cetera.Swyx [00:19:08]: And it spins up like a thousand instances simultaneously. You know, you want it to be sort of truly serverless at once. Actually, I know we have some Cloudflare friends who are listening, so hopefully they'll get in front of the line. Especially.Samuel [00:19:19]: I was in Cloudflare's office last week shouting at them about other things that frustrate me. I have a love-hate relationship with Cloudflare. Their tech is awesome. But because I use it the whole time, I then get frustrated. So, yeah, I'm sure I will. I will. I will get there soon.Swyx [00:19:32]: There's a side tangent on Cloudflare. Is Python supported at full? I actually wasn't fully aware of what the status of that thing is.Samuel [00:19:39]: Yeah. So Pyodide, which is Python running inside the browser in scripting, is supported now by Cloudflare. They basically, they're having some struggles working out how to manage, ironically, dependencies that have binaries, in particular, Pydantic. Because these workers where you can have thousands of them on a given metal machine, you don't want to have a difference. You basically want to be able to have a share. Shared memory for all the different Pydantic installations, effectively. That's the thing they work out. 
They're working out. But Hood, who's my friend, who is the primary maintainer of Pyodide, works for Cloudflare. And that's basically what he's doing, is working out how to get Python running on Cloudflare's network.Swyx [00:20:19]: I mean, the nice thing is that your binary is really written in Rust, right? Yeah. Which also compiles the WebAssembly. Yeah. So maybe there's a way that you'd build... You have just a different build of Pydantic and that ships with whatever your distro for Cloudflare workers is.Samuel [00:20:36]: Yes, that's exactly what... So Pyodide has builds for Pydantic Core and for things like NumPy and basically all of the popular binary libraries. Yeah. It's just basic. And you're doing exactly that, right? You're using Rust to compile the WebAssembly and then you're calling that shared library from Python. And it's unbelievably complicated, but it works. Okay.Swyx [00:20:57]: Staying on graphs a little bit more, and then I wanted to go to some of the other features that you have in Pydantic AI. I see in your docs, there are sort of four levels of agents. There's single agents, there's agent delegation, programmatic agent handoff. That seems to be what OpenAI swarms would be like. And then the last one, graph-based control flow. Would you say that those are sort of the mental hierarchy of how these things go?Samuel [00:21:21]: Yeah, roughly. Okay.Swyx [00:21:22]: You had some expression around OpenAI swarms. Well.Samuel [00:21:25]: And indeed, OpenAI have got in touch with me and basically, maybe I'm not supposed to say this, but basically said that Pydantic AI looks like what swarms would become if it was production ready. So, yeah. I mean, like, yeah, which makes sense. Awesome. Yeah. I mean, in fact, it was specifically saying, how can we give people the same feeling that they were getting from swarms that led us to go and implement graphs? Because my, like, just call the next agent with Python code was not a satisfactory answer to people. So it was like, okay, we've got to go and have a better answer for that. It's not like, let us to get to graphs. Yeah.Swyx [00:21:56]: I mean, it's a minimal viable graph in some sense. What are the shapes of graphs that people should know? So the way that I would phrase this is I think Anthropic did a very good public service and also kind of surprisingly influential blog post, I would say, when they wrote Building Effective Agents. We actually have the authors coming to speak at my conference in New York, which I think you're giving a workshop at. Yeah.Samuel [00:22:24]: I'm trying to work it out. But yes, I think so.Swyx [00:22:26]: Tell me if you're not. yeah, I mean, like, that was the first, I think, authoritative view of, like, what kinds of graphs exist in agents and let's give each of them a name so that everyone is on the same page. So I'm just kind of curious if you have community names or top five patterns of graphs.Samuel [00:22:44]: I don't have top five patterns of graphs. I would love to see what people are building with them. But like, it's been it's only been a couple of weeks. And of course, there's a point is that. Because they're relatively unopinionated about what you can go and do with them. They don't suit them. Like, you can go and do lots of lots of things with them, but they don't have the structure to go and have like specific names as much as perhaps like some other systems do. 
I think what our agents are, which have a name and I can't remember what it is, but this basically system of like, decide what tool to call, go back to the center, decide what tool to call, go back to the center and then exit. One form of graph, which, as I say, like our agents are effectively one implementation of a graph, which is why under the hood they are now using graphs. And it'll be interesting to see over the next few years whether we end up with these like predefined graph names or graph structures or whether it's just like, yep, I built a graph or whether graphs just turn out not to match people's mental image of what they want and die away. We'll see.Swyx [00:23:38]: I think there is always appeal. Every developer eventually gets graph religion and goes, oh, yeah, everything's a graph. And then they probably over rotate and go go too far into graphs. And then they have to learn a whole bunch of DSLs. And then they're like, actually, I didn't need that. I need this. And they scale back a little bit.Samuel [00:23:55]: I'm at the beginning of that process. I'm currently a graph maximalist, although I haven't actually put any into production yet. But yeah.Swyx [00:24:02]: This has a lot of philosophical connections with other work coming out of UC Berkeley on compounding AI systems. I don't know if you know of or care. This is the Gartner world of things where they need some kind of industry terminology to sell it to enterprises. I don't know if you know about any of that.Samuel [00:24:24]: I haven't. I probably should. I should probably do it because I should probably get better at selling to enterprises. But no, no, I don't. Not right now.Swyx [00:24:29]: This is really the argument is that instead of putting everything in one model, you have more control and more maybe observability to if you break everything out into composing little models and changing them together. And obviously, then you need an orchestration framework to do that. Yeah.Samuel [00:24:47]: And it makes complete sense. And one of the things we've seen with agents is they work well when they work well. But when they. Even if you have the observability through log five that you can see what was going on, if you don't have a nice hook point to say, hang on, this is all gone wrong. You have a relatively blunt instrument of basically erroring when you exceed some kind of limit. But like what you need to be able to do is effectively iterate through these runs so that you can have your own control flow where you're like, OK, we've gone too far. And that's where one of the neat things about our graph implementation is you can basically call next in a loop rather than just running the full graph. And therefore, you have this opportunity to to break out of it. But yeah, basically, it's the same point, which is like if you have two bigger unit of work to some extent, whether or not it involves gen AI. But obviously, it's particularly problematic in gen AI. You only find out afterwards when you've spent quite a lot of time and or money when it's gone off and done done the wrong thing.Swyx [00:25:39]: Oh, drop on this. We're not going to resolve this here, but I'll drop this and then we can move on to the next thing. This is the common way that we we developers talk about this. And then the machine learning researchers look at us. And laugh and say, that's cute. And then they just train a bigger model and they wipe us out in the next training run. 
So I think there's a certain amount of we are fighting the bitter lesson here. We're fighting AGI. And, you know, when AGI arrives, this will all go away. Obviously, on Latent Space, we don't really discuss that because I think AGI is kind of this hand wavy concept that isn't super relevant. But I think we have to respect that. For example, you could do a chain of thoughts with graphs and you could manually orchestrate a nice little graph that does like. Reflect, think about if you need more, more inference time, compute, you know, that's the hot term now. And then think again and, you know, scale that up. Or you could train Strawberry and DeepSeq R1. Right.Samuel [00:26:32]: I saw someone saying recently, oh, they were really optimistic about agents because models are getting faster exponentially. And I like took a certain amount of self-control not to describe that it wasn't exponential. But my main point was. If models are getting faster as quickly as you say they are, then we don't need agents and we don't really need any of these abstraction layers. We can just give our model and, you know, access to the Internet, cross our fingers and hope for the best. Agents, agent frameworks, graphs, all of this stuff is basically making up for the fact that right now the models are not that clever. In the same way that if you're running a customer service business and you have loads of people sitting answering telephones, the less well trained they are, the less that you trust them, the more that you need to give them a script to go through. Whereas, you know, so if you're running a bank and you have lots of customer service people who you don't trust that much, then you tell them exactly what to say. If you're doing high net worth banking, you just employ people who you think are going to be charming to other rich people and set them off to go and have coffee with people. Right. And the same is true of models. The more intelligent they are, the less we need to tell them, like structure what they go and do and constrain the routes in which they take.Swyx [00:27:42]: Yeah. Yeah. Agree with that. So I'm happy to move on. So the other parts of Pydantic AI that are worth commenting on, and this is like my last rant, I promise. So obviously, every framework needs to do its sort of model adapter layer, which is, oh, you can easily swap from OpenAI to Cloud to Grok. You also have, which I didn't know about, Google GLA, which I didn't really know about until I saw this in your docs, which is generative language API. I assume that's AI Studio? Yes.Samuel [00:28:13]: Google don't have good names for it. So Vertex is very clear. That seems to be the API that like some of the things use, although it returns 503 about 20% of the time. So... Vertex? No. Vertex, fine. But the... Oh, oh. GLA. Yeah. Yeah.Swyx [00:28:28]: I agree with that.Samuel [00:28:29]: So we have, again, another example of like, well, I think we go the extra mile in terms of engineering is we run on every commit, at least commit to main, we run tests against the live models. Not lots of tests, but like a handful of them. Oh, okay. And we had a point last week where, yeah, GLA is a little bit better. GLA1 was failing every single run. One of their tests would fail. And we, I think we might even have commented out that one at the moment. So like all of the models fail more often than you might expect, but like that one seems to be particularly likely to fail. 
But Vertex is the same API, but much more reliable.Swyx [00:29:01]: My rant here is that, you know, versions of this appear in Langchain and every single framework has to have its own little thing, a version of that. I would put to you, and then, you know, this is, this can be agree to disagree. This is not needed in Pydantic AI. I would much rather you adopt a layer like Lite LLM or what's the other one in JavaScript port key. And that's their job. They focus on that one thing and they, they normalize APIs for you. All new models are automatically added and you don't have to duplicate this inside of your framework. So for example, if I wanted to use deep seek, I'm out of luck because Pydantic AI doesn't have deep seek yet.Samuel [00:29:38]: Yeah, it does.Swyx [00:29:39]: Oh, it does. Okay. I'm sorry. But you know what I mean? Should this live in your code or should it live in a layer that's kind of your API gateway that's a defined piece of infrastructure that people have?Samuel [00:29:49]: And I think if a company who are well known, who are respected by everyone had come along and done this at the right time, maybe we should have done it a year and a half ago and said, we're going to be the universal AI layer. That would have been a credible thing to do. I've heard varying reports of Lite LLM is the truth. And it didn't seem to have exactly the type safety that we needed. Also, as I understand it, and again, I haven't looked into it in great detail. Part of their business model is proxying the request through their, through their own system to do the generalization. That would be an enormous put off to an awful lot of people. Honestly, the truth is I don't think it is that much work unifying the model. I get where you're coming from. I kind of see your point. I think the truth is that everyone is centralizing around open AIs. Open AI's API is the one to do. So DeepSeq support that. Grok with OK support that. Ollama also does it. I mean, if there is that library right now, it's more or less the open AI SDK. And it's very high quality. It's well type checked. It uses Pydantic. So I'm biased. But I mean, I think it's pretty well respected anyway.Swyx [00:30:57]: There's different ways to do this. Because also, it's not just about normalizing the APIs. You have to do secret management and all that stuff.Samuel [00:31:05]: Yeah. And there's also. There's Vertex and Bedrock, which to one extent or another, effectively, they host multiple models, but they don't unify the API. But they do unify the auth, as I understand it. Although we're halfway through doing Bedrock. So I don't know about it that well. But they're kind of weird hybrids because they support multiple models. But like I say, the auth is centralized.Swyx [00:31:28]: Yeah, I'm surprised they don't unify the API. That seems like something that I would do. You know, we can discuss all this all day. There's a lot of APIs. I agree.Samuel [00:31:36]: It would be nice if there was a universal one that we didn't have to go and build.Alessio [00:31:39]: And I guess the other side of, you know, routing model and picking models like evals. How do you actually figure out which one you should be using? I know you have one. First of all, you have very good support for mocking in unit tests, which is something that a lot of other frameworks don't do. So, you know, my favorite Ruby library is VCR because it just, you know, it just lets me store the HTTP requests and replay them. That part I'll kind of skip. 
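The "everyone centralises on OpenAI's API" point is easy to see in practice: most providers expose an OpenAI-compatible endpoint, so the official SDK works against them by changing the base URL. A hedged sketch; the base URLs and model names below are the publicly documented ones at the time of writing and are not guaranteed to stay current.

```python
from openai import OpenAI

# DeepSeek's hosted API speaks the OpenAI wire protocol.
deepseek = OpenAI(base_url="https://api.deepseek.com", api_key="sk-...")

# A local Ollama server exposes the same protocol on /v1 (the API key is ignored).
ollama = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

for client, model in [(deepseek, "deepseek-chat"), (ollama, "llama3.2")]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Say hello in five words."}],
    )
    print(resp.choices[0].message.content)
```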
I think you are busy like this test model. We're like just through Python. You try and figure out what the model might respond without actually calling the model. And then you have the function model where people can kind of customize outputs. Any other fun stories maybe from there? Or is it just what you see is what you get, so to speak?Samuel [00:32:18]: On those two, I think what you see is what you get. On the evals, I think watch this space. I think it's something that like, again, I was somewhat cynical about for some time. Still have my cynicism about some of the well, it's unfortunate that so many different things are called evals. It would be nice if we could agree. What they are and what they're not. But look, I think it's a really important space. I think it's something that we're going to be working on soon, both in Pydantic AI and in LogFire to try and support better because it's like it's an unsolved problem.Alessio [00:32:45]: Yeah, you do say in your doc that anyone who claims to know for sure exactly how your eval should be defined can safely be ignored.Samuel [00:32:52]: We'll delete that sentence when we tell people how to do their evals.Alessio [00:32:56]: Exactly. I was like, we need we need a snapshot of this today. And so let's talk about eval. So there's kind of like the vibe. Yeah. So you have evals, which is what you do when you're building. Right. Because you cannot really like test it that many times to get statistical significance. And then there's the production eval. So you also have LogFire, which is kind of like your observability product, which I tried before. It's very nice. What are some of the learnings you've had from building an observability tool for LEMPs? And yeah, as people think about evals, even like what are the right things to measure? What are like the right number of samples that you need to actually start making decisions?Samuel [00:33:33]: I'm not the best person to answer that is the truth. So I'm not going to come in here and tell you that I think I know the answer on the exact number. I mean, we can do some back of the envelope statistics calculations to work out that like having 30 probably gets you most of the statistical value of having 200 for, you know, by definition, 15% of the work. But the exact like how many examples do you need? For example, that's a much harder question to answer because it's, you know, it's deep within the how models operate in terms of LogFire. One of the reasons we built LogFire the way we have and we allow you to write SQL directly against your data and we're trying to build the like powerful fundamentals of observability is precisely because we know we don't know the answers. And so allowing people to go and innovate on how they're going to consume that stuff and how they're going to process it is we think that's valuable. Because even if we come along and offer you an evals framework on top of LogFire, it won't be right in all regards. And we want people to be able to go and innovate and being able to write their own SQL connected to the API. And effectively query the data like it's a database with SQL allows people to innovate on that stuff. And that's what allows us to do it as well. I mean, we do a bunch of like testing what's possible by basically writing SQL directly against LogFire as any user could. I think the other the other really interesting bit that's going on in observability is OpenTelemetry is centralizing around semantic attributes for GenAI. So it's a relatively new project. 
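The mocking support mentioned here works roughly like this: you swap the real model for a test double so unit tests never touch the network. A minimal sketch based on Pydantic AI's documented TestModel/override API; treat the exact names as a snapshot of that API rather than a stable contract.

```python
from pydantic_ai import Agent
from pydantic_ai.models.test import TestModel

agent = Agent("openai:gpt-4o", system_prompt="Be terse.")


def answer(question: str) -> str:
    return agent.run_sync(question).data


def test_answer_without_network():
    # TestModel fabricates a plausible response locally instead of calling OpenAI.
    with agent.override(model=TestModel()):
        assert isinstance(answer("What is 2 + 2?"), str)
```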
A lot of it's still being added at the moment. But basically the idea that like. They unify how both SDKs and or agent frameworks send observability data to to any OpenTelemetry endpoint. And so, again, we can go and having that unification allows us to go and like basically compare different libraries, compare different models much better. That stuff's in a very like early stage of development. One of the things we're going to be working on pretty soon is basically, I suspect, GenAI will be the first agent framework that implements those semantic attributes properly. Because, again, we control and we can say this is important for observability, whereas most of the other agent frameworks are not maintained by people who are trying to do observability. With the exception of Langchain, where they have the observability platform, but they chose not to go down the OpenTelemetry route. So they're like plowing their own furrow. And, you know, they're a lot they're even further away from standardization.Alessio [00:35:51]: Can you maybe just give a quick overview of how OTEL ties into the AI workflows? There's kind of like the question of is, you know, a trace. And a span like a LLM call. Is it the agent? It's kind of like the broader thing you're tracking. How should people think about it?Samuel [00:36:06]: Yeah, so they have a PR that I think may have now been merged from someone at IBM talking about remote agents and trying to support this concept of remote agents within GenAI. I'm not particularly compelled by that because I don't think that like that's actually by any means the common use case. But like, I suppose it's fine for it to be there. The majority of the stuff in OTEL is basically defining how you would instrument. A given call to an LLM. So basically the actual LLM call, what data you would send to your telemetry provider, how you would structure that. Apart from this slightly odd stuff on remote agents, most of the like agent level consideration is not yet implemented in is not yet decided effectively. And so there's a bit of ambiguity. Obviously, what's good about OTEL is you can in the end send whatever attributes you like. But yeah, there's quite a lot of churn in that space and exactly how we store the data. I think that one of the most interesting things, though, is that if you think about observability. Traditionally, it was sure everyone would say our observability data is very important. We must keep it safe. But actually, companies work very hard to basically not have anything that sensitive in their observability data. So if you're a doctor in a hospital and you search for a drug for an STI, the sequel might be sent to the observability provider. But none of the parameters would. It wouldn't have the patient number or their name or the drug. With GenAI, that distinction doesn't exist because it's all just messed up in the text. If you have that same patient asking an LLM how to. What drug they should take or how to stop smoking. You can't extract the PII and not send it to the observability platform. So the sensitivity of the data that's going to end up in observability platforms is going to be like basically different order of magnitude to what's in what you would normally send to Datadog. Of course, you can make a mistake and send someone's password or their card number to Datadog. But that would be seen as a as a like mistake. Whereas in GenAI, a lot of data is going to be sent. 
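For readers who haven't seen the GenAI semantic conventions, the idea is simply agreed-upon attribute names on an ordinary OTel span, so any backend can interpret LLM telemetry the same way. A sketch with the OpenTelemetry Python SDK; the specific gen_ai.* attribute names follow the convention as it stood when this was recorded and were still changing (the token-count names, for example, have since been revised).

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("demo")

# One LLM call = one span, described with the shared gen_ai.* attribute names.
with tracer.start_as_current_span("chat gpt-4o") as span:
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.request.model", "gpt-4o")
    span.set_attribute("gen_ai.request.temperature", 0.2)
    # ... perform the actual model call here ...
    span.set_attribute("gen_ai.usage.input_tokens", 42)
    span.set_attribute("gen_ai.usage.output_tokens", 17)
```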
And I think that's why companies like Langsmith are trying hard to offer observability. On prem, because there's a bunch of companies who are happy for Datadog to be cloud hosted, but want self-hosting for this observability stuff with GenAI.Alessio [00:38:09]: And are you doing any of that today? Because I know in each of the spans you have like the number of tokens, you have the context, you're just storing everything. And then you're going to offer kind of like a self-hosting for the platform, basically. Yeah. Yeah.Samuel [00:38:23]: So we have scrubbing roughly equivalent to what the other observability platforms have. So if we, you know, if we see password as the key, we won't send the value. But like, like I said, that doesn't really work in GenAI. So we're accepting we're going to have to store a lot of data and then we'll offer self-hosting for those people who can afford it and who need it.Alessio [00:38:42]: And then this is, I think, the first time that most of the workloads performance is depending on a third party. You know, like if you're looking at Datadog data, usually it's your app that is driving the latency and like the memory usage and all of that. Here you're going to have spans that maybe take a long time to perform because the GLA API is not working or because OpenAI is kind of like overwhelmed. Do you do anything there since like the provider is almost like the same across customers? You know, like, are you trying to surface these things for people and say, hey, this was like a very slow span, but actually all customers using OpenAI right now are seeing the same thing. So maybe don't worry about it or.Samuel [00:39:20]: Not yet. We do a few things that people don't generally do in OTel. So we send. We send information at the beginning. At the beginning of a trace as well as sorry, at the beginning of a span, as well as when it finishes. By default, OTel only sends you data when the span finishes. So if you think about a request which might take like 20 seconds, even if some of the intermediate spans finished earlier, you can't basically place them on the page until you get the top level span. And so if you're using standard OTel, you can't show anything until those requests are finished. When those requests are taking a few hundred milliseconds, it doesn't really matter. But when you're doing Gen AI calls or when you're like running a batch job that might take 30 minutes. That like latency of not being able to see the span is like crippling to understanding your application. And so we do a bunch of slightly complex stuff to basically send data about a span as it starts, which is closely related. Yeah.Alessio [00:40:09]: Any thoughts on all the other people trying to build on top of OpenTelemetry in different languages, too? There's like the OpenLLMetry project, which doesn't really roll off the tongue. But how do you see the future of these kind of tools? Is everybody going to have to build? Why does everybody want to build? They want to build their own open source observability thing to then sell?Samuel [00:40:29]: I mean, we are not going off and trying to instrument the likes of the OpenAI SDK with the new semantic attributes, because at some point that's going to happen and it's going to live inside OTel and we might help with it. But we're a tiny team. We don't have time to go and do all of that work. So OpenLLMetry, like interesting project.
But I suspect eventually most of those semantic like that instrumentation of the big of the SDKs will live, like I say, inside the main OpenTelemetry report. I suppose. What happens to the agent frameworks? What data you basically need at the framework level to get the context is kind of unclear. I don't think we know the answer yet. But I mean, I was on the, I guess this is kind of semi-public, because I was on the call with the OpenTelemetry call last week talking about GenAI. And there was someone from Arize talking about the challenges they have trying to get OpenTelemetry data out of Langchain, where it's not like natively implemented. And obviously they're having quite a tough time. And I was realizing, hadn't really realized this before, but how lucky we are to primarily be talking about our own agent framework, where we have the control rather than trying to go and instrument other people's.Swyx [00:41:36]: Sorry, I actually didn't know about this semantic conventions thing. It looks like, yeah, it's merged into main OTel. What should people know about this? I had never heard of it before.Samuel [00:41:45]: Yeah, I think it looks like a great start. I think there's some unknowns around how you send the messages that go back and forth, which is kind of the most important part. It's the most important thing of all. And that is moved out of attributes and into OTel events. OTel events in turn are moving from being on a span to being their own top-level API where you send data. So there's a bunch of churn still going on. I'm impressed by how fast the OTel community is moving on this project. I guess they, like everyone else, get that this is important, and it's something that people are crying out to get instrumentation off. So I'm kind of pleasantly surprised at how fast they're moving, but it makes sense.Swyx [00:42:25]: I'm just kind of browsing through the specification. I can already see that this basically bakes in whatever the previous paradigm was. So now they have genai.usage.prompt tokens and genai.usage.completion tokens. And obviously now we have reasoning tokens as well. And then only one form of sampling, which is top-p. You're basically baking in or sort of reifying things that you think are important today, but it's not a super foolproof way of doing this for the future. Yeah.Samuel [00:42:54]: I mean, that's what's neat about OTel is you can always go and send another attribute and that's fine. It's just there are a bunch that are agreed on. But I would say, you know, to come back to your previous point about whether or not we should be relying on one centralized abstraction layer, this stuff is moving so fast that if you start relying on someone else's standard, you risk basically falling behind because you're relying on someone else to keep things up to date.Swyx [00:43:14]: Or you fall behind because you've got other things going on.Samuel [00:43:17]: Yeah, yeah. That's fair. That's fair.Swyx [00:43:19]: Any other observations just about building LogFire, actually? Let's just talk about this. So you announced LogFire. I was kind of only familiar with LogFire because of your Series A announcement. I actually thought you were making a separate company. I remember some amount of confusion with you when that came out. So to be clear, it's Pydantic LogFire and the company is one company that has kind of two products, an open source thing and an observability thing, correct? Yeah. I was just kind of curious, like any learnings building LogFire? 
So classic question is, do you use ClickHouse? Is this like the standard persistence layer? Any learnings doing that?Samuel [00:43:54]: We don't use ClickHouse. We started building our database with ClickHouse, moved off ClickHouse onto Timescale, which is a Postgres extension to do analytical databases. Wow. And then moved off Timescale onto DataFusion. And we're basically now building, it's DataFusion, but it's kind of our own database. Bogomil is not entirely happy that we went through three databases before we chose one. I'll say that. But like, we've got to the right one in the end. I think we could have realized that Timescale wasn't right. I think ClickHouse. They both taught us a lot and we're in a great place now. But like, yeah, it's been a real journey on the database in particular.Swyx [00:44:28]: Okay. So, you know, as a database nerd, I have to like double click on this, right? So ClickHouse is supposed to be the ideal backend for anything like this. And then moving from ClickHouse to Timescale is another counterintuitive move that I didn't expect because, you know, Timescale is like an extension on top of Postgres. Not super meant for like high volume logging. But like, yeah, tell us those decisions.Samuel [00:44:50]: So at the time, ClickHouse did not have good support for JSON. I was speaking to someone yesterday and said ClickHouse doesn't have good support for JSON and got roundly stepped on because apparently it does now. So they've obviously gone and built their proper JSON support. But like back when we were trying to use it, I guess a year ago or a bit more than a year ago, everything happened to be a map and maps are a pain to try and do like looking up JSON type data. And obviously all these attributes, everything you're talking about there in terms of the GenAI stuff. You can choose to make them top level columns if you want. But the simplest thing is just to put them all into a big JSON pile. And that was a problem with ClickHouse. Also, ClickHouse had some really ugly edge cases like by default, or at least until I complained about it a lot, ClickHouse thought that two nanoseconds was longer than one second because they compared intervals just by the number, not the unit. And I complained about that a lot. And then they caused it to raise an error and just say you have to have the same unit. Then I complained a bit more. And I think as I understand it now, they have some. They convert between units. But like stuff like that, when all you're looking at is when a lot of what you're doing is comparing the duration of spans was really painful. Also things like you can't subtract two date times to get an interval. You have to use the date sub function. But like the fundamental thing is because we want our end users to write SQL, the like quality of the SQL, how easy it is to write, matters way more to us than if you're building like a platform on top where your developers are going to write the SQL. And once it's written and it's working, you don't mind too much. So I think that's like one of the fundamental differences. The other problem that I have with the ClickHouse and Impact Timescale is that like the ultimate architecture, the like snowflake architecture of binary data in object store queried with some kind of cache from nearby. They both have it, but it's closed sourced and you only get it if you go and use their hosted versions. 
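As context for the DataFusion choice: it is a query engine you embed rather than a server you run, which is what makes it usable as a toolbox for building your own database. A small sketch using the datafusion Python bindings (the Rust API is analogous); the spans.parquet file and its columns are made-up examples.

```python
from datafusion import SessionContext

ctx = SessionContext()
# Register a local Parquet file as a table; no server process is involved.
ctx.register_parquet("spans", "spans.parquet")

# Plain SQL over the embedded engine, e.g. find the slowest model calls.
df = ctx.sql(
    """
    SELECT span_name, duration_ms
    FROM spans
    WHERE span_name LIKE 'chat%'
    ORDER BY duration_ms DESC
    LIMIT 10
    """
)
df.show()
```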
And so even if we had got through all the problems with Timescale or ClickHouse, we would end up like, you know, they would want to be taking their 80% margin. And then we would be wanting to take that would basically leave us less space for margin. Whereas data fusion. Properly open source, all of that same tooling is open source. And for us as a team of people with a lot of Rust expertise, data fusion, which is implemented in Rust, we can literally dive into it and go and change it. So, for example, I found that there were some slowdowns in data fusion's string comparison kernel for doing like string contains. And it's just Rust code. And I could go and rewrite the string comparison kernel to be faster. Or, for example, data fusion, when we started using it, didn't have JSON support. Obviously, as I've said, it's something we can do. It's something we needed. I was able to go and implement that in a weekend using our JSON parser that we built for Pydantic Core. So it's the fact that like data fusion is like for us the perfect mixture of a toolbox to build a database with, not a database. And we can go and implement stuff on top of it in a way that like if you were trying to do that in Postgres or in ClickHouse. I mean, ClickHouse would be easier because it's C++, relatively modern C++. But like as a team of people who are not C++ experts, that's much scarier than data fusion for us.Swyx [00:47:47]: Yeah, that's a beautiful rant.Alessio [00:47:49]: That's funny. Most people don't think they have agency on these projects. They're kind of like, oh, I should use this or I should use that. They're not really like, what should I pick so that I contribute the most back to it? You know, so but I think you obviously have an open source first mindset. So that makes a lot of sense.Samuel [00:48:05]: I think if we were probably better as a startup, a better startup and faster moving and just like headlong determined to get in front of customers as fast as possible, we should have just started with ClickHouse. I hope that long term we're in a better place for having worked with data fusion. We like we're quite engaged now with the data fusion community. Andrew Lam, who maintains data fusion, is an advisor to us. We're in a really good place now. But yeah, it's definitely slowed us down relative to just like building on ClickHouse and moving as fast as we can.Swyx [00:48:34]: OK, we're about to zoom out and do Pydantic run and all the other stuff. But, you know, my last question on LogFire is really, you know, at some point you run out sort of community goodwill just because like, oh, I use Pydantic. I love Pydantic. I'm going to use LogFire. OK, then you start entering the territory of the Datadogs, the Sentrys and the honeycombs. Yeah. So where are you going to really spike here? What differentiator here?Samuel [00:48:59]: I wasn't writing code in 2001, but I'm assuming that there were people talking about like web observability and then web observability stopped being a thing, not because the web stopped being a thing, but because all observability had to do web. If you were talking to people in 2010 or 2012, they would have talked about cloud observability. Now that's not a term because all observability is cloud first. The same is going to happen to gen AI. And so whether or not you're trying to compete with Datadog or with Arise and Langsmith, you've got to do first class. You've got to do general purpose observability with first class support for AI. 
And as far as I know, we're the only people really trying to do that. I mean, I think Datadog is starting in that direction. And to be honest, I think Datadog is a much like scarier company to compete with than the AI specific observability platforms. Because in my opinion, and I've also heard this from lots of customers, AI specific observability where you don't see everything else going on in your app is not actually that useful. Our hope is that we can build the first general purpose observability platform with first class support for AI. And that we have this open source heritage of putting developer experience first that other companies haven't done. For all I'm a fan of Datadog and what they've done. If you search Datadog logging Python. And you just try as a like a non-observability expert to get something up and running with Datadog and Python. It's not trivial, right? That's something Sentry have done amazingly well. But like there's enormous space in most of observability to do DX better.Alessio [00:50:27]: Since you mentioned Sentry, I'm curious how you thought about licensing and all of that. Obviously, your MIT license, you don't have any rolling license like Sentry has where you can only use an open source, like the one year old version of it. Was that a hard decision?Samuel [00:50:41]: So to be clear, LogFire is co-sourced. So Pydantic and Pydantic AI are MIT licensed and like properly open source. And then LogFire for now is completely closed source. And in fact, the struggles that Sentry have had with licensing and the like weird pushback the community gives when they take something that's closed source and make it source available just meant that we just avoided that whole subject matter. I think the other way to look at it is like in terms of either headcount or revenue or dollars in the bank. The amount of open source we do as a company is we've got to be open source. We're up there with the most prolific open source companies, like I say, per head. And so we didn't feel like we were morally obligated to make LogFire open source. We have Pydantic. Pydantic is a foundational library in Python. That and now Pydantic AI are our contribution to open source. And then LogFire is like openly for profit, right? As in we're not claiming otherwise. We're not sort of trying to walk a line if it's open source. But really, we want to make it hard to deploy. So you probably want to pay us. We're trying to be straight. That it's to pay for. We could change that at some point in the future, but it's not an immediate plan.Alessio [00:51:48]: All right. So the first one I saw this new I don't know if it's like a product you're building the Pydantic that run, which is a Python browser sandbox. What was the inspiration behind that? We talk a lot about code interpreter for lamps. I'm an investor in a company called E2B, which is a code sandbox as a service for remote execution. Yeah. What's the Pydantic that run story?Samuel [00:52:09]: So Pydantic that run is again completely open source. I have no interest in making it into a product. We just needed a sandbox to be able to demo LogFire in particular, but also Pydantic AI. So it doesn't have it yet, but I'm going to add basically a proxy to OpenAI and the other models so that you can run Pydantic AI in the browser. See how it works. Tweak the prompt, et cetera, et cetera. And we'll have some kind of limit per day of what you can spend on it or like what the spend is. The other thing we wanted to b

Millionaire Car Salesman Podcast
EP 10:04 How to Find the Hidden Money in Your CRM & Maximize Opportunities to Increase Sales

Millionaire Car Salesman Podcast

Play Episode Listen Later Feb 4, 2025 54:26


In this electrifying episode of the Millionaire Car Salesman Podcast, hosts LA Williams and Sean V. Bradley dive deep into the innovations shaping the automotive CRM landscape with special guests Shane Born and Melissa Sinclair from NCC and ProMax.  "You're investing in one of the most important tools to help run all aspects of your business. You want to make sure that it's set up properly from the very beginning." - Melissa Sinclair As the automotive industry faces unprecedented changes, Shane and Melissa discuss how their companies are spearheading the credit-first approach in CRM solutions, creating streamlined and transparent processes for both dealers and consumers. The episode begins with the hosts addressing missed speaking opportunities at an industry event due to unforeseen circumstances and quickly shifts focus to the groundbreaking strategies being implemented by ProMax, setting a new bar for CRM systems integrated with National Credit Center resources. "Success involves partnering with the right people... ensure your team understands the what, the why, and the how, and then be flexible to pivot if a generational storm hits." - Shane Born The conversation is packed with insights into how dealerships can leverage modern CRM systems to improve customer experience, enhance fraud prevention, and streamline sales processes using accurate real-time credit data. Shane Born highlights the unique position of their CRM solution, emphasizing its ability to provide consistent, transparent, and frictionless service to consumers while maintaining a keen adaptability to market changes. Melissa Sinclair adds to this by discussing the newly launched open API, offering seamless integration capabilities with vendor partners to unify dealership operations. This episode is a must-listen for any automotive professional keen on harnessing cutting-edge CRM technologies to propel their business forward.   Key Takeaways: ✅ Credit-First CRM Approach: NCC and ProMax are revolutionizing the CRM space with a credit-first methodology, enhancing the buying process through accurate financing options and fraud prevention features. ✅ User Interface and Experience: The newly redesigned UI/UX of ProMax's CRM offers a modern, intuitive interface that simplifies processes for salespeople and management, improving efficiency and usability. ✅ Service and Sales Integration: By integrating service customer data with CRM strategies, dealerships can unlock unprecedented sales opportunities, targeting both existing and untapped customer bases. ✅ Open API for Seamless Integration: The introduction of the open API, Dex, allows for effortless connection with various dealership software, ensuring all operations are synchronized and efficient. ✅ Robust OEM Partnerships: ProMax's CRM is approved by most OEMs, including 100% co-op with General Motors, showcasing the platform's comprehensive capabilities and industry trust.     About Shane Born Shane Born is a prominent figure in the automotive industry, holding a significant position at NCC (National Credit Center) and ProMax. With over two decades of experience, Shane has been instrumental in driving technological advancements and fostering customer relations in the automotive sector. His focus lies in integrating credit solutions with CRM systems to enhance dealership efficiencies!   About Melissa Sinclair Melissa Sinclair at NCC and ProMax, bringing extensive experience and deep knowledge about CRM solutions. 
Melissa has contributed substantially to the development of innovative customer relationship management tools and has played a key role in ensuring that these tools meet the needs of modern dealerships. She is particularly focused on enhancing user experience and fostering partnerships within the automotive industry.     Unleashing the Power of CRM in Automotive Sales Key Takeaways The Millionaire Car Salesman Podcast dives into how CRM technology is revolutionizing the automotive industry, focusing on NCC's ProMax and its unique offerings. The discussion emphasizes the shift towards credit-first processes, enhancing both dealer and consumer experiences by leveraging advanced data management. An insider perspective on NADA reveals valuable insights into networking strategies and industry trends despite unexpected challenges. CRM Innovations in the Automotive Industry The Millionaire Car Salesman Podcast, hosted by LA Williams and Sean V. Bradley, recently featured Shane Born and Melissa Sinclair from NCC and ProMax, who discussed groundbreaking innovations in CRM technology, specifically tailored for the automotive industry. As the podcast episodes unravel, they explore how these technologies redefine dealer-consumer interactions and internal processes within dealerships. According to Shane Born, NCC is leveraging its dual expertise as both a data and software-as-a-service company: "We are both a DAS company and a SaaS company," he notes. Their end-to-end platform ensures a streamlined, credit-first approach that not only enhances dealer efficiency but also significantly improves the customer's buying journey. This dual focus allows NCC to provide a seamless workflow tied to a single database, an advantage that sets it apart from competitors relying on multiple integrations. From quote to contract, the process is designed to eliminate friction and bolster customer satisfaction by providing accurate financing solutions quickly. Melissa Sinclair elaborates on the customized experience ProMax delivers: "Any role in the dealership can benefit as the platform was designed specifically with them in mind." The conversations with dealers led to UI and UX improvements, focusing on easy navigation and relevant data presentation, ensuring that even complex tools are intuitive for users across all roles within the dealership. Strategies from the NADA Conference One of the intriguing segments of the podcast episode involved understanding the dynamics of the National Automobile Dealers Association (NADA) conference. Despite logistical hurdles due to inclement weather, vendors and attendees alike made the most of the opportunity to connect and explore industry advancements. Shane Born emphasized that despite a "50% pre-registration count," those who made it to the event were laser-focused on solving pertinent challenges, leading to higher closing rates on new dealer group deals. NADA's decision to open its welcome party to all attendees, not just paid registrants, was a strategic move to enhance networking despite lower turnout numbers. "It allows us to really have meaningful conversations about what's happening and where the industry is heading," Born shared, highlighting how such interactions lead to long-lasting partnerships and industry insights. This part of the podcast underscores the importance of adaptability and creativity in fostering networking opportunities, even in unforeseen circumstances. 
NADA highlighted industry trends, such as AI integration and credit-first approaches, showcased by innovative solutions like NCC's ProMax. Maximizing CRM for Dealer Success In detailing the capabilities of NCC's complete CRM, the podcast dives into how dealerships can maximize the tool for their success. At its core, ProMax's CRM offers a robust framework for both reactive and proactive dealership strategies—a point emphasized repeatedly throughout the discussion. "Any user can go in and build a custom list," explained Melissa Sinclair, illustrating the CRM's adaptability in creating targeted customer lists based on various criteria, such as unsold showroom traffic or service appointments. The system's Opportunities Dashboard is a game changer, automatically collating potential purchase opportunities, ensuring that even if a salesperson overlooks a prospect, the CRM does not. Sean V. Bradley highlighted common dealer pitfalls, such as under-utilizing CRM capabilities, which often leads to unnecessary marketing expenditures. "Why are they spending $65,000 in advertising per month when they could invest $5,000 to $10,000 in CRM?" he questioned, advocating for a full-time CRM manager to ensure optimal use of the system. Reflecting on trends discussed at the podcast, it's evident that leveraging comprehensive CRM tools like NCC's ProMax could save money and increase sales, making it essential for dealerships to embrace such technologies fully. Reflections on the Automotive Horizon As the podcast episode closes, the hosts and guests reflect on how the innovations discussed are reshaping the automotive sales landscape. The evolution of CRM technology, spearheaded by companies like NCC and solutions such as ProMax, shows an industry leaning heavily into data-driven, customer-centric approaches. The recurring theme remains clear: the automotive industry is poised for growth through technology adaptation and strategic partnerships. As Shane Born perfectly encapsulates, success in this field "comes down to consistent execution of your strategic vision," underscoring the importance of aligning technology with well-defined dealership goals. With ProMax's open API and partnership with AI leaders like Impel, NCC ensures they are not just evolving alongside industry changes, but leading the charge toward a more integrated and customer-focused sales experience. In the ever-evolving automotive market, insights from such discussions are invaluable, paving the way for dealerships to not only keep pace but thrive in an exciting era of digital transformation.     Resources: Podium: Discover how Podium's innovative AI technology can unlock unparalleled efficiency and drive your dealership's sales to new heights. Visit www.podium.com/mcs to learn more!   NCC: Credit-Driven Retailing - NCC delivers industry-best credit-driven retailing for auto dealerships, combining a powerful credit and compliance engine and fully integrated CRM/Desking platform for maximum profitability.   Complete CRM: Complete CRM is a streamlined, all-in-one system that simplifies your dealership software and processes so you can manage every aspect of your operation with ease; from tracking and following up on leads, desking deals, managing inventory, marketing to your customers, and more.   Dealer Synergy & Bradley On Demand: The automotive industry's #1 training, tracking, testing, and certification platform and consulting & accountability firm.   
The Millionaire Car Salesman Facebook Group: Join the #1 Mastermind Group in the Automotive Industry! With over 28,000 members, gain access to successful automotive mentors & managers, the best industry practices, & collaborate with automotive professionals from around the WORLD! Join The Millionaire Car Salesman Facebook Group today!   Win the Game of Googleopoly: Unlocking the secret strategy of search engines.   The Millionaire Car Salesman Podcast is Proudly Sponsored By: Podium: Elevating Dealership Excellence with Intelligent Customer Engagement Solutions. Unlock unparalleled efficiency and drive sales with Podium's innovative AI technology, featured proudly on the Millionaire Car Salesman Podcast. Visit www.podium.com/mcs to learn more!   NCC: Powered by proprietary solutions such as Intelligent Credit Engine™ and LenderSelect™, NCC transforms the car-buying experience for dealers and their customers. From compliance and lender selection to CRM and desking, to marketing and data mining—NCC integrates them all in a single, seamless platform to deliver better customer experiences, maximum efficiency and maximum profit.   Complete CRM: As an innovative leader in the industry for the last 30 years, Complete CRM is designed to give your dealership the competitive edge in a demanding marketplace. Powered by Complete Credit™ and award-winning desking, Complete CRM™ is the industry's only credit and compliance-enabled CRM that lets dealers achieve maximum profitability on every deal. Built on modern technology, Complete CRM seamlessly integrates credit, compliance, inventory, data mining, lead generation, enterprise functionality, and customized reporting in one tool with a single login.   Dealer Synergy: The #1 Automotive Sales Training, Consulting, and Accountability Firm in the industry! With over two decades of experience in building Internet Departments and BDCs, we have developed the most effective automotive Internet Sales, BDC, and CRM solutions. Our expertise in creating phone scripts, rebuttals, CRM action plans, strategies, and templates ensures that your dealership's tools and personnel reach their full potential.   Bradley On Demand: The automotive sales industry's top Interactive Training, Tracking, Testing, and Certification Platform. Featuring LIVE Classes and over 9,000 training modules, our platform equips your dealership with everything needed to sell more cars, more often, and more profitably!    

Maintainable
Lorna Mitchell: Writing Documentation Engineers Will Actually Read

Maintainable

Play Episode Listen Later Jan 28, 2025 43:18


Join Robby as he chats with Lorna Mitchell, open source advocate and technical writer, about the art of creating documentation that doesn't gather dust. Lorna shares her experiences as a maintainer of the open source project rst2pdf, the value of API governance, and how documentation bridges gaps in developer experience.
Highlights:
What Makes Software Maintainable: Characteristics like great documentation, automated tests, and onboarding ease.
Documentation's Role in Long-Lived Software: Why it's crucial for internal tools and open source projects alike.
Open Source in Practice: Lorna's journey with rst2pdf and adopting a tech stack she wasn't initially fluent in.
API Governance Simplified: Lorna explains the four levels of API readiness and how teams can work toward more usable APIs.
Writing Documentation for Engineers: How style guides can empower contributors without overwhelming them.
Using Tools to Improve Documentation: From linters to prose-checking tools like Vale, Lorna discusses practical tips.
Key Takeaways:
[00:01:00] What makes software well-maintained: documentation, accessibility, and automated tests.
[00:03:10] Why documentation isn't just for new users—Lorna's experience with revisiting her own open source projects.
[00:06:30] Diving into rst2pdf: Challenges in maintaining an abandoned project.
[00:13:45] Balancing ownership and transitioning open source projects to new maintainers.
[00:15:30] What is OpenAPI, and how does API governance impact usability?
[00:26:10] The art of concise yet helpful documentation for different audiences.
[00:33:00] Using examples in APIs to enhance clarity and reduce confusion.
[00:40:00] Tools for improving writing, from prose linters to markdown syntax checkers.
Resources Mentioned:
Lorna Mitchell's Website
rst2pdf Project
Simon Willison's Post on One-Person Projects
How to Take Smart Notes
OpenAPI Specification
Vale Prose Linter
Follow Lorna:
GitHub
IndieWeb
Thanks to Our Sponsor!
Turn hours of debugging into just minutes! AppSignal is a performance monitoring and error-tracking tool designed for Ruby, Elixir, Python, Node.js, JavaScript, and other frameworks. It offers six powerful features with one simple interface, providing developers with real-time insights into the performance and health of web applications. Keep your coding cool and error-free, one line at a time! Use the code maintainable to get a 10% discount for your first year. Check them out!
Subscribe to Maintainable on:
Apple Podcasts
Spotify
Or search "Maintainable" wherever you stream your podcasts.
Keep up to date with the Maintainable Podcast by joining the newsletter.

the no BS short-term rental podcast
Unified API for STR Tech is Here with Abdul Ahadh

the no BS short-term rental podcast

Play Episode Listen Later Jan 9, 2025 46:50


This week, we're diving into the messy world of **hospitality tech** and uncovering solutions that actually work. Our guest is Abdul Ahadh, CEO and co-founder of Calry, a game-changing Unified API for property management systems (PMS). If you've ever struggled with clunky integrations or wasted time wrestling with tech that doesn't play nice, this episode is for you. We break down:
- Why the PMS integration game is broken.
- How Calry's Unified API cuts through the chaos.
- Real-world benefits like reduced resource burn and faster integrations.
Plus, Ahadh shares his journey from Bangalore's startup scene to disrupting the short-term rental API game. He gives us a sneak peek into Calry's expansion plans and why they're the partner you need to scale smarter, not harder. Visit CALRY - https://calry.app/
Timestamps (because we know you're busy):
**00:00** - No BS welcome: Let's get real about hospitality tech.
**00:57** - Post-VRMA takeaways and industry chatter.
**03:00** - Homecoming Week in Atlanta: The vibe shift.
**07:51** - Meet Abdul Ahadh and the origin story of Calry.
**09:49** - Building Calry: Why the world needed a Unified API.
**13:32** - The gritty truth about hospitality tech challenges.
**16:49** - Bangalore: Why it's the Silicon Valley you should care about.
**18:05** - How Calry's solving real problems—and what's next.
**21:34** - Why PMS integrations suck and how to fix them.
**22:31** - The Unified API solution: One point of contact, endless possibilities.
**24:05** - Business efficiency unlocked: The Calry effect.
**26:23** - Open API and business decisions.
**31:50** - Scaling up: Calry's roadmap for the future.
**38:51** - The advantages of switching to Calry.
**44:28** - Closing thoughts: Ahadh's advice for the next big disruptor.

Scaling DevTools
Sid Maestre from APIMatic: APIs build vs buy

Scaling DevTools

Play Episode Listen Later Dec 16, 2024 47:49 Transcription Available


We dig into the build vs. buy dilemma for APIs, and the role of OpenAPI in effective documentation. This episode is brought to you by WorkOS. If you're thinking about selling to enterprise customers, WorkOS can help you add enterprise features like Single Sign On and audit logs. We explore how AI is transforming the landscape of APIs and developer tools, and discuss the future of coding. The choice between building and buying SDKs depends on company maturity. OpenAPI is crucial for generating quality API documentation. AI is revolutionizing how APIs are created and consumed. Maintaining SDK libraries can be a significant challenge. Developer tools must evolve to keep pace with API design changes. Trust in AI-generated code is growing among developers. The future of coding will likely involve more AI integration.
Links: APIMatic, Sid Maestre

Les Cast Codeurs Podcast
LCC 319 - le ramasse-miettes-charognes

Les Cast Codeurs Podcast

Play Episode Listen Later Dec 16, 2024 70:05


Dans cet épisde en audio et en vidéo (youtube.com/lescastcodeurs), Guillaume et Emmanuel discutent des 15 ans de Go, d'une nouvelle approche de garbage collecting, de LLMs dans les applications Java, dobservabilité, d'une attaque de chaine d'approvisionnement via javac et d'autres choses. Enregistré le 13 décembre 2024 Téléchargement de l'épisode LesCastCodeurs-Episode-319.mp3 News Langages Go fête son 15ème anniversaire ! https://go.dev/blog/15years discute les 15 ans la corrections de gotchas dans les for loops (notamment les variables étaient loop scoped) le fait que la compile echoue si on attend une version de go superieure seulement depuis go 1.21 en parallele de la gestion de la chaine d'outil (c'est en 2023 seulement!) opt-in telemetrie aussi recent Construire OpenJDK à partir des sources sur macOS https://www.morling.dev/blog/building-openjdk-from-source-on-macos/ de maniere surprenante ce n'est pas tres compliqué Papier sur l'aproche Mark-scavenge pour un ramasse miette https://inside.java/2024/11/22/mark-scavenge-gc/ papier de recherche utiliser l'accessibilité pour preuve de vie n'est pas idéal: un objet peut etre atteignable mais ne sera jamais accedé par le programme les regions les plus pauvres en objets vivant voient leurs objets bouger dans uen autre region et la regio libéré, c'est le comportement classique des GC deux methodes: mark evaguate qui le fait en deux temps et la liveness peut evoluer ; et scavenge qui bouge l'objet vivant des sa decouverte ont fait tourner via ZGC des experience pour voir les objects consideres vivants et bougés inutilement. resultats montrent un gros taux d'objets bougés de maniere inutile proposent un algo different ils marquent les objets vivants mais ne les bougent pas avant le prochain GC pour leur donner une change de devenir unreachable elimine beaucoup de deplacement inutiles vu que les objets deviennent non accessible en un cycle de GC jusquà 91% de reduction ! Particulierement notable dans les machines chargées en CPU. 
Les tokens d'accès court ou longs https://grayduck.mn/2023/04/17/refresh-vs-long-lived-access-tokens/ pourquoi des long access tokens (gnre refresh token) sont utilises pour des short lived dans oauth 2.0 refresh token simplifient la revocation: vu que seul le auth serveur a a verifier la révocation et les clients vérifient l'expiration et la validité de la signature refresh token ne sont envoyés que entre endpoints alors que les access tokens se baladent pas mal: les frontières de confiance ne sont pas traversées refresh token comme utilise infréquement, et donc peut etre protegee dans une enclave les changements de grants sont plus simple tout en restant distribuable histoire des access refresh token et access token permet de mieux tracer les abus / attaques les inconvenients: c'est plus compliqué en flow, the auth serveur est un SPOF amis mitigeable Java Advent est de retour https://www.javaadvent.com/calendar backstage Java integrite par defaut (et ses consequences sur l'ecosysteme) timefold (sovler) Les extensions JUNit 5 OpenTelemetry via Java Agent vs Micrometer analyse statique de code CQRS et les fonctionalités modernes de Java java simple (sans compilatrion, sans objet fullstack dev with quarkus as backend José Paumard introduit et explique les Gatherers dans Java 24 dans cette vidéo https://inside.java/2024/11/26/jepcafe23/ Librairies Micronaut 4.7, avec l'intégration de LangChain4j https://micronaut.io/2024/11/14/micronaut-framework-4-7-0-released/ Combiner le framework de test Spock et Cucumber https://www.sfeir.dev/back/spock-framework-revolutionnez-vos-tests-unitaires-avec-la-puissance-du-bdd-et-de-cucumber/ les experts peuvent écrire leurs tests au format Gherkin (de Cucumber) et les développeurs peuvent implémenter les assertions correspondantes avec l'intégration dans Spock, pour des tests très lisibles Spring 6.2 https://spring.io/blog/2024/11/14/spring-framework-6-2-0-available-now beans @Fallback améliorations sur SpELet sur le support de tests support de l'echape des property placeholders une initioalisation des beans en tache de fond nouvelle et pleins d'autres choses encore Comment créer une application Java LLM tournant 100% en Java avec Jlama https://quarkus.io/blog/quarkus-jlama/ blog de Mario Fusco, Mr API et Java et Drools utilise jlama + quarkus + langchain Explique les avantage de l'approche pure Java comme le cycle de vie unique, tester les modeles rapidement, securite (tout est in process), monolithe ahahah, observabilité simplifiée, distribution simplifiée (genre appli embarquée) etc Vert.x 5 en seconde incubation https://vertx.io/blog/eclipse-vert-x-5-candidate-2-released/ Support des Java modules (mais beacoup des modules vert.x eux-même ne le supportent pas support io_uring dans vert.x core le load balancing côté client le modele des callbacks n'est plus supporté, vive les Futur beaucoup d'améliorations autour de gRPC et d'autres choses Un article sur Spring AI et la multi modalite audio https://spring.io/blog/2024/12/05/spring-ai-audio-modality permet de voir les evolutions des APIs de Spring AI s'appluie sur les derniers modeles d'open ai des examples comme par exemple un chatbot voix et donc comment enregistrer la voix et la passer a OpenAI Comment activer le support experimental HTTP/3 dans Spring Boot https://spring.io/blog/2024/11/26/http3-in-reactor-2024 c'ets Netty qui fait le boulot puis Spring Netty l'article décrit les etapes pour l'utiliser dans vos applis Spring Boot ou Spring Cloud Gateway l'article explique aussi le cote client (app 
cliente) ce qui est sympa Infrastructure Un survol des offres d'observabilité http://blog.ippon.fr/2024/11/18/observabilite-informatique-comprendre-les-bases-2eme-partie/ un survol des principales offres d'observabilité Open source ou SaaS et certains outsiders Pas mal pour commencer à défricher ce qui vous conviendrait blog de ippon Web Sortie de Angular 19 https://blog.ninja-squad.com/2024/11/19/what-is-new-angular-19.0/ stabilité des APIs Signal APIs migration automatique vers signals composants standalone par défaut nouvelles APIs linkedSignal et resource de grosses améliorations de SSR et HMR un article également de Sfeir sur Angular 19 https://www.sfeir.dev/front/angular-19-tout-ce-quil-faut-savoir-sur-les-innovations-majeures-du-framework/ Angluar 19 https://www.sfeir.dev/front/angular-19-tout-ce-quil-faut-savoir-sur-les-innovations-majeures-du-framework/ composant standalone par default (limiter les problemes de dependances), peut le mettre en strict pour le l'imposer (ou planter) signalement des imports inutilisés @let pour les variables locales dans les templates linkedSignal (experimental) pour lier des signaux entre eux (cascade de changement suite a un evenement hydratation incrementale (contenu progressivement interactif avec le chargement - sur les parties de la page visible ou necessaires et event replay, routing et modes de rendu en rendy hybride, Hot module replacement etc The State of Frontend — dernière compilation des préférences des développeurs en terme de front https://tsh.io/state-of-frontend/ React en tête, suivi de Vue et Svelte. Angular seulement 4ème Côté rendering framework, Next.js a la majorité absolue, ensuite viennent Nuxt et Astro Zod est la solution de validation préférée Pour la gestion de date, date-fns est en tête, suivi par moment.js Côté state management, React Context API en première place, mais les suivants sont tous aussi pour React ! Grosse utilisation de lodash pour plein d'utilités Pour fetcher des resources distantes, l'API native Fetch et Axios sont les 2 vaincoeurs Pour le déploiement, Vercel est premier Côté CI/CD, beaucoup de Github Actions, suivi par Gitlab CI Package management, malgré de bonnes alternatives, NPM se taille toujours la part du lion Ecrasante utilisation de Node.js comme runtime JavaScript pour faire du développement front Pour ce qui est du typing, beaucoup utilisent TypeScript, et un peu de JSdoc, et la majorité des répondants pensent que TypeScript a dépassé JavaScript en usage Dans les API natives du navigateur, Fetch, Storage et WebSockets sont les APIs les plus utilisées La popularité des PWA devrait suivre son petit bonhomme de chemin En terme de design system, shadcn.ui en tête, suivi par Material, puis Bootstram Pour la gestion des styles, un bon mix de plain old CSS, de Tailwind, et de Sass/CSS Jest est premier comme framework de tests Les 3/4 des développeurs front utilisent Visual Studio Code, quant au quart suivant, c'est JetBrains qui raffle les miettes Pour le build, Vite récolte les 4/5 des voix ESLint et Prettier sont les 2 favoris pour vérifier le code   Parfois, on aimerait pouvoir tester une librairie ou un framework JavaScript, sans pour autant devoir mettre en place tout un projet, avec outil de build et autre. 
Julia Evans explore les différents cas de figure, suivant la façon dont ces librairies sont bundlées https://jvns.ca/blog/2024/11/18/how-to-import-a-javascript-library/ Certaines librairies permette de ne faire qu'un simple import dans une balise script Certaines frameworks sont distribués sous forme d'Universal Module Definition, sous CommonJS, d'ESmodule franchemet en tant que noob c'est compliqué quand même Data et Intelligence Artificielle L'impact de l'IA en entreprise et des accès aux documents un peu laxistes https://archive.ph/uPyhX l'indexing choppe tout ce qu'il peut et l'IA est tres puissante pour diriger des requetes et extraires les données qui auraient du etre plus restreintes Différentes manières de faire de l'extraction de données et de forcer la main à un LLM pour qu'il génère du JSON https://glaforge.dev/posts/2024/11/18/data-extraction-the-many-ways-to-get-llms-to-spit-json-content/ l'approche “je demande gentiment” au LLM, en faisant du prompt engineering en utilisant du function calling pour les modèles supportant la fonctionnalité, en particulier avant les approches de type “JSON mode” ou “JSON schema” ou effectivement si le modèle le supporte aussi, toujours avec un peu de prompting, mais en utilisant le “JSON mode” qui force le LLM a générer du JSON valide encore mieux avec la possibilité de spécifier un schema JSON (type OpenAPI) pour que le JSON en sortie soit “compliant” avec le schéma proposé Comment masquer les données confidentielles avec ses échanges avec les LLMs https://glaforge.dev/posts/2024/11/25/redacting-sensitive-information-when-using-generative-ai-models/ utilisation de l'API Data Loss Prevention de Google Cloud qui permet d'identifier puis de censurer / masquer (“redacted” en anglais) des informations personnelles identifiables (“PII”, comme un nom, un compte bancaire, un numéro de passeport, etc) pour des raison de sécurité, de privacy, pour éviter les brèche de données comme on en entend trop souvent parler dans les nouvelles On peut utiliser certains modèles d'embedding pour faire de la recherche de code https://glaforge.dev/posts/2024/12/02/semantic-code-search-for-programming-idioms-with-langchain4j-and-vertex-ai-embedding-models/ Guillaume recherche des bouts de code, en entrant une requête en langue naturel Certains embedding models supportent différents types de tâches, comme question/réponse, question en langue naturelle / retour sous forme de code, ou d'autres tâches comme le fact checking, etc Dans cet article, utilisation du modèle de Google Cloud Vertex AI, en Java, avec LangChain4j Google sort la version 2 de Gemini Flash https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/ La nouvelle version Gemini 2.0 Flash dépasse même Gemini 1.5 Pro dans les benchmarks Tout en étant 2 fois plus rapide que Gemini 1.5 Pro, et bien que le prix ne soit pas encore annoncé, on imagine également plus abordable Google présente Gemini 2 comme le LLM idéal pour les “agents” Gemini propose une vraie multimodalité en sortie (premier LLM sur le marché à le proposer) : Gemini 2 peut entrelacer du texte, des images, de l'audio Gemini 2 supporte plus de 100 langues 8 voix de haute qualité, assez naturelles, pour la partie audio Un nouveau mode speech-to-speech en live, où on peut même interrompre le LLM, c'est d'ailleurs ce qui est utilisé dans Project Astra, l'application mobile montrée à Google I/O qui devient un vrai assistant vocale en live sur votre téléphone Google annonce aussi une nouvelle expérimentation autour des 
Google also presented Project Mariner, an agent packaged as a Chrome extension that lets you drive your browser like a personal research assistant: it can search the web and navigate websites to find the information you are looking for; this other article shows various demo videos of these features https://developers.googleblog.com/en/the-next-chapter-of-the-gemini-era-for-developers/ A new project called Deep Research lets you produce reports in Gemini Advanced: you give it a topic, the agent proposes an outline for a report on that topic (which you can validate or tweak), and then Deep Research searches the web for you and synthesizes its findings into a final report https://blog.google/products/gemini/google-gemini-deep-research/ Finally, Google AI Studio, besides letting you experiment with Gemini 2, now offers "starter apps" showing how to do object recognition in images, how to search with an agent connected to Google Maps, etc.; Google AI Studio also lets you share your screen with it, on mobile or desktop, so you can use it as an assistant that sees what you are doing and what you are coding and can answer your questions Methodologies A GitHub article on the impact of CPU over-utilization on application performance https://github.blog/engineering/architecture-optimization/breaking-down-cpu-speed-how-utilization-impacts-performance/ surprisingly, they see effects of around 30% on performance, which comes from thermal limits and the frequency boost you get when staying below them; they therefore looked for their own golden ratio, around 60% utilization; they take slices of a Kubernetes cluster to run the workloads and add artificial CPU workloads (math-style number crunching) Security A supply chain attack via javac https://xdev.software/en/news/detail/discovering-the-perfect-java-supply-chain-attack-vector-and-how-it-got-fixed it relies on annotation processing: the annotation processors of your dependencies are loaded and executed at build time, discovered on the user classpath via the ServiceLoader pattern, so if a dependency is compromised and an annotation processor is added or modified, you have an attack vector at compile time of the targeted project, basically as soon as you open the IDE; the workaround is to enable -proc:none and then enable annotation processors explicitly in your build tool; there are also some improvements in the JDK: the compiler reports that it is executing an annotation processor in Java 23+, and annotation processors are disabled by default (a minimal sketch of such a processor follows below)
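To illustrate why compile-time annotation processing is such an effective vector, here is a hypothetical, deliberately harmless processor of the kind the article describes. It is an illustration of the mechanism, not the actual exploit: any code in init or process runs inside javac, on the developer's or CI machine, as soon as the dependency that registers it (via META-INF/services/javax.annotation.processing.Processor) lands on the compile classpath.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.ProcessingEnvironment;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.annotation.processing.SupportedSourceVersion;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.TypeElement;

// Registered in the dependency's jar under META-INF/services/javax.annotation.processing.Processor,
// so javac discovers it through the ServiceLoader mechanism described above.
@SupportedAnnotationTypes("*")                  // claim interest in every annotation => always invoked
@SupportedSourceVersion(SourceVersion.RELEASE_17)
public class InnocentLookingProcessor extends AbstractProcessor {

    @Override
    public synchronized void init(ProcessingEnvironment env) {
        super.init(env);
        // Anything here runs inside the compiler process at build time:
        // reading sources, environment variables, CI credentials, making network calls...
        System.err.println("[processor] running inside javac as " + System.getProperty("user.name"));
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        return false; // claim nothing; stay invisible to the rest of the build
    }
}
```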
Conferences The list of conferences comes from the Developers Conferences Agenda/List by Aurélie Vache and contributors: 19 December 2024: Normandie.ai 2024 - Rouen (France) 20 January 2025: Elastic{ON} - Paris (France) 22-25 January 2025: SnowCamp 2025 - Grenoble (France) 24-25 January 2025: Agile Games Île-de-France 2025 - Paris (France) 30 January 2025: DevOps D-Day #9 - Marseille (France) 6-7 February 2025: Touraine Tech - Tours (France) 21 February 2025: LyonJS 100 - Lyon (France) 28 February 2025: Paris TS La Conf - Paris (France) 20 March 2025: PGDay Paris - Paris (France) 20-21 March 2025: Agile Niort - Niort (France) 25 March 2025: ParisTestConf - Paris (France) 26-29 March 2025: JChateau Unconference 2025 - Cour-Cheverny (France) 28 March 2025: DataDays - Lille (France) 28-29 March 2025: Agile Games France 2025 - Lille (France) 3 April 2025: DotJS - Paris (France) 10-11 April 2025: Android Makers - Montrouge (France) 10-12 April 2025: Devoxx Greece - Athens (Greece) 16-18 April 2025: Devoxx France - Paris (France) 29-30 April 2025: MixIT - Lyon (France) 7-9 May 2025: Devoxx UK - London (UK) 16 May 2025: AFUP Day 2025 Lille - Lille (France) 16 May 2025: AFUP Day 2025 Lyon - Lyon (France) 16 May 2025: AFUP Day 2025 Poitiers - Poitiers (France) 24 May 2025: Polycloud - Montpellier (France) 5-6 June 2025: AlpesCraft - Grenoble (France) 11-13 June 2025: Devoxx Poland - Krakow (Poland) 12-13 June 2025: Agile Tour Toulouse - Toulouse (France) 12-13 June 2025: DevLille - Lille (France) 24 June 2025: WAX 2025 - Aix-en-Provence (France) 26-27 June 2025: Sunny Tech - Montpellier (France) 1-4 July 2025: Open edX Conference - 2025 - Palaiseau (France) 18-19 September 2025: API Platform Conference - Lille (France) & Online 2-3 October 2025: Volcamp - Clermont-Ferrand (France) 6-10 October 2025: Devoxx Belgium - Antwerp (Belgium) 16-17 October 2025: DevFest Nantes - Nantes (France) 6 November 2025: dotAI 2025 - Paris (France) 12-14 November 2025: Devoxx Morocco - Marrakech (Morocco) 23-25 April 2026: Devoxx Greece - Athens (Greece) 17 June 2026: Devoxx Poland - Krakow (Poland) Contact us To react to this episode, come and discuss it on the Google group https://groups.google.com/group/lescastcodeurs Contact us via Twitter https://twitter.com/lescastcodeurs Submit a crowdcast or a crowdquestion Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs All the episodes and all the info at https://lescastcodeurs.com/

.NET in pillole
270 - Cambiamenti e migliorie per OpenAPI con .NET 9

.NET in pillole

Dec 9, 2024 10:10


After the famous farewell to Swashbuckle.AspNetCore, here are all the changes that arrived for OpenAPI with .NET 9. https://learn.microsoft.com/en-us/aspnet/core/release-notes/aspnetcore-9.0?view=aspnetcore-9.0&WT.mc_id=DT-MVP-4021952 https://github.com/dotnet/aspnetcore/discussions/58103 https://learn.microsoft.com/en-us/aspnet/core/fundamentals/openapi/openapi-tools?view=aspnetcore-9.0&WT.mc_id=DT-MVP-4021952 #openapi #dotnet #dotnetinpillole #podcast

airhacks.fm podcast with adam bien
From .mobi Over GraphQL to Quarkus Dev UI

airhacks.fm podcast with adam bien

Dec 1, 2024 59:42


An airhacks.fm conversation with Phillip Krueger (@phillipkruger) about: early programming experiences with Visual Basic and Java, transition from actuarial science to computer science, first job at a bank working with Java Swing and RMI over CORBA, experience with J2EE and XML technologies, working with XML and XSLT, development of open-source Swing components, work on dotMobi sites for mobile phones in Africa, creation of API extensions for Java EE and MicroProfile, involvement in the MicroProfile GraphQL specification, joining Red Hat and working on quarkus, development of SmallRye GraphQL, improvements to OpenAPI support in Quarkus, work on Quarkus Dev UI, discussion about the evolution of Java application servers and frameworks, comparison of REST and GraphQL, thoughts on Java development culture in South Africa Phillip Krueger on twitter: @phillipkruger

Microsoft 365 Developer Podcast
Why build declarative agents with Visual Studio Code vs Copilot Studio

Microsoft 365 Developer Podcast

Nov 25, 2024 36:35


In this episode, Jeremy Thake talks to Sebastien Levert. The podcast focuses on the evolution and development of tools within the Microsoft 365 ecosystem, particularly around agent building and Copilot Studio. The discussion highlights the distinction between low-code/no-code solutions for makers and pro-code tools for developers. Key tools such as Agent Builder and Teams Toolkit are explored, emphasizing their respective strengths: Agent Builder's accessibility for information workers versus Teams Toolkit's power and flexibility for professional developers. They discuss how developers can leverage Visual Studio Code, adaptive cards, and GitHub integration for building robust, scalable agents. A recurring theme is enabling seamless collaboration between makers and developers, with a focus on governance, scalability, and customization, particularly for enterprise-grade agents. The conversation also delves into advancements in API integration, such as using OpenAPI files and the emerging TypeSpec language to streamline the creation of APIs and plugins. Graph connectors and Power Platform connectors are highlighted as critical components for data access and processing within agents. To watch their Ignite breakout with all the demos discussed, watch "Developers guide to building your own agents". To find out more, please visit https://aka.ms/extendcopilotm365

durch die bank
Open Banking-Stammtisch Runde 23: API-Standardisierung im regulierten und nicht-regulierten Kontext

durch die bank

Nov 20, 2024 29:23


After the implementation of the PSD2 directive, the industry woke up in 2018 "from its regulatory coma", as Dr. Ortwin Scheja of the Berlin Group describes it. Since then, work has continued steadily on building out the interfaces (APIs) that form the technical foundation for achieving the goals of PSD2. Switzerland has also pushed the topic of open banking forward, even without regulatory requirements. Why the corresponding working group is called "Common API" rather than "Open API", and what we can learn from the Swiss example, is what you will find out in the new podcast episode. Ute Kolck moderates the discussion, this time with five experts: Christoph Berentzen (Commerzbank), Prof. Dr. Silke Finken (International School of Management), Jürgen Petry (Swiss Fintech Innovations), Dr. Ortwin Scheja (Berlin Group) and Roger Wisler (Zürcher Kantonalbank).

Les Cast Codeurs Podcast
LCC 315 - les températures ne sont pas déterministes

Les Cast Codeurs Podcast

Sep 17, 2024 110:08


JVM summit, virtual threads, application stacks, licenses, determinism and LLMs, quantization, two tools of the episode and much more. Recorded on September 13, 2024 Episode download LesCastCodeurs-Episode-315.mp3 News Languages Netflix uses Java heavily and ran into a problem with virtual threads in Java 21; the Netflix engineers analyze it in this article: https://netflixtechblog.com/java-21-virtual-threads-dude-wheres-my-lock-3052540e231d virtual threads can improve performance but bring challenges; a locking problem was identified where virtual threads end up blocking one another, leading to degraded performance and instability; Netflix is working on fixing these issues to take full advantage of virtual threads (see the pinning sketch below) A syntax to indicate that a type is nullable or null-restricted may be coming to Java https://bugs.openjdk.org/browse/JDK-8303099 Foo! would forbid null, Foo? would indicate that null is accepted, and Foo?[]! would be a non-null array of nullable values; there are also syntax ideas for initializing null-restricted arrays; JEP: https://openjdk.org/jeps/8303099 The JVM Language Summit 2024 videos are online https://www.youtube.com/watch?v=OOPSU4LnKg0&list=PLX8CzqL3ArzUEYnTa6KYORRbP3nhsK0L1 Project Leyden Update, Project Babylon - Code Reflection, Valhalla - Where Are We?, An Opinionated Overview on Static Analysis for Java, Rethinking Java String Concatenation, Code Reflection in Action - Translating Java to SPIR-V, Java in 2024, Type Specialization of Java Generics - What If Casts Have Teeth? (with our very own Rémi Forax!), plus "tip or tail" for the whole ecosystem; a few links on Babylon: code reflection to express foreign languages (SQL) in Java: https://openjdk.org/projects/babylon/ and their LINQ-emulation example https://openjdk.org/projects/babylon/articles/linq Libraries Micronaut releases version 4.6 https://micronaut.io/2024/08/26/micronaut-framework-4-6-0-released/ essentially a big update of tons of modules to the latest dependency versions MicroProfile 7 makes a few incompatible changes and evolutions https://microprofile.io/2024/08/22/microprofile-7-0-release/#general it drops Metrics and replaces it with Telemetry (metrics, logs and tracing), Metrics remaining a spec but standalone; MicroProfile 7 depends on the Jakarta Core Profile and no longer packages it; MicroProfile OpenAPI 4 and Telemetry 2 bring incompatible changes Quarkus 3.14 with Let's Encrypt and reflection-free Jackson serializers https://quarkus.io/blog/quarkus-3-14-1-released/ Hibernate ORM 6.6, reflection-free Jackson serializers, simple installation of Let's Encrypt certificates (notably with a helpful command line, including ngrok to tunnel to your localhost), and a walk-back on @QuarkusTestResource vs @WithTestResource following feedback about OOMEs and the slowness of better-isolated tests Structured logging in Spring Boot 3.4 https://spring.io/blog/2024/08/23/structured-logging-in-spring-boot-3-4 structured logs (often JSON) let you easily ship them to backends like Elastic or AWS CloudWatch and hook them up to reporting and alerting; Spring Boot 3.4 supports structured logging out of the box, with the Elastic Common Schema (ECS) and Logstash formats, and it can be extended with your own formats. You can also enable structured logging to a file; this can be used, for example, to print human-readable logs on the console and write structured logs to a file for machine ingestion.
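The pinning pattern behind that kind of lock trouble is easy to reproduce. The sketch below is a generic illustration and not Netflix's actual code: blocking while inside a synchronized block pins the virtual thread to its carrier, so a burst of such calls can starve the small shared carrier pool, whereas a ReentrantLock lets the virtual thread unmount while it waits.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

public class PinningDemo {

    private static final Object MONITOR = new Object();
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Problematic: the carrier thread stays pinned for the whole blocking call.
    static void pinned() {
        synchronized (MONITOR) {
            pause();              // blocking while holding a monitor => pinning
        }
    }

    // Friendlier: a java.util.concurrent lock lets the virtual thread unmount while blocked.
    static void unpinned() {
        LOCK.lock();
        try {
            pause();
        } finally {
            LOCK.unlock();
        }
    }

    static void pause() {
        try { Thread.sleep(Duration.ofMillis(100)); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        // Run with -Djdk.tracePinnedThreads=full to see the pinning reports.
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 1_000; i++) {
                executor.submit(PinningDemo::pinned);   // swap for ::unpinned to compare
            }
        }
    }
}
```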
Infrastructure CockroachDB, which had a Business Source License approach (source available, converting to an Apache-style license 3 years later), is now moving to a proprietary license with source available https://www.cockroachlabs.com/blog/enterprise-license-announcement/ the Polyform project offers standardized licenses depending on your free-versus-paid needs https://polyformproject.org/ Cloud Azure Functions, and how cold start is optimized https://www.infoq.com/articles/azure-functions-cold-starts/?utm_campaign=infoq_content&utm_source=twitter&utm_medium=feed&utm_term=Cloud functions have naturally high latency, and not all long latencies hurt the business; cold starts can be measured with the cloud provider's tools, so use them and look at latency percentiles (their experience: 381 ms cold, 10 ms afterwards, with tracing for end-to-end latency); the strategies: keep-alive pings (waking the function at regular intervals to stay "warm"); in the function code, initializing connections and loading assemblies during initialization; configuring batching in host.json, disabling file-system logging, etc.; deploying functions as zips; reducing the size of code and files (which get copied onto the cold server); on .NET, enabling ReadyToRun, which helps the JIT compiler; Azure instances with more CPU and memory cost more but lower the cold start; dedicated Azure instances for your functions (not shared with other tenants); the article then walks through concrete examples Web Vue.js 3.5 is out https://blog.vuejs.org/posts/vue-3-5 key new features: performance and memory optimizations (a significant reduction in memory consumption, -56%, better performance for large reactive arrays, fixes for stale computed values and memory leaks); new features: Reactive Props Destructure (simpler declaration of props with default values), Lazy Hydration (control over the hydration of async components), useId() (stable unique ID generation for SSR applications), data-allow-mismatch (suppressing hydration mismatch warnings); custom elements improvements (support for app configuration, an API to access the host and the shadow root, mounting without Shadow DOM, and nonce for tags); useTemplateRef() (obtaining template refs via the useTemplateRef() API); deferred Teleport (teleporting content to elements rendered after the component mounts); onWatcherCleanup() (registering cleanup callbacks in watchers) Data and Artificial Intelligence We often hear about quantized Large Language Models, meaning that, for example, 8-bit integers are used instead of 32-bit floats to reduce GPU memory requirements while keeping precision close to the original; this article explains the quantization process very visually and intuitively: https://newsletter.maartengrootendorst.com/p/a-visual-guide-to-quantization
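As a tiny, self-contained illustration of that idea (simple absmax 8-bit quantization of a weight vector, which is only one of the schemes the article covers):

```java
public class AbsmaxQuantization {

    /** Quantize 32-bit floats to signed 8-bit integers using a single absmax scale. */
    static byte[] quantize(float[] weights, float[] scaleOut) {
        float absMax = 0f;
        for (float w : weights) absMax = Math.max(absMax, Math.abs(w));
        float scale = absMax == 0f ? 1f : 127f / absMax;   // map [-absMax, absMax] onto [-127, 127]
        scaleOut[0] = scale;

        byte[] quantized = new byte[weights.length];
        for (int i = 0; i < weights.length; i++) {
            quantized[i] = (byte) Math.round(weights[i] * scale);
        }
        return quantized;
    }

    /** Recover approximate floats: this is where the (small) precision loss comes from. */
    static float[] dequantize(byte[] quantized, float scale) {
        float[] out = new float[quantized.length];
        for (int i = 0; i < quantized.length; i++) out[i] = quantized[i] / scale;
        return out;
    }

    public static void main(String[] args) {
        float[] weights = {0.12f, -0.5f, 0.03f, 1.7f, -1.69f};
        float[] scale = new float[1];
        byte[] q = quantize(weights, scale);            // 4 bytes per weight become 1
        float[] back = dequantize(q, scale[0]);
        for (int i = 0; i < weights.length; i++) {
            System.out.printf("%.4f -> %d -> %.4f%n", weights[i], q[i], back[i]);
        }
    }
}
```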
Guillaume continues sharing his adventures with the LangChain4j framework. How to do text classification: https://glaforge.dev/posts/2024/07/11/text-classification-with-gemini-and-langchain4j/ using LangChain4j's TextClassification class, which takes an approach based on vector embeddings to compare similar texts; using few-shot prompting, in several variants, in this other article: https://glaforge.dev/posts/2024/07/30/sentiment-analysis-with-few-shots-prompting/ and also how to go multimodal with LangChain4j (with the Gemini model) to analyze text and images, but also videos, audio content, or PDF files: https://glaforge.dev/posts/2024/07/25/analyzing-videos-audios-and-pdfs-with-gemini-in-langchain4j/ To vary the predictability or creativity of LLMs, certain hyperparameters can be adjusted, such as temperature, top-k and top-p, but do you really know how these parameters work? Two very clear and intuitive articles explain them: https://medium.com/google-cloud/is-a-zero-temperature-deterministic-c4a7faef4d20 https://medium.com/google-cloud/beyond-temperature-tuning-llm-output-with-top-k-and-top-p-24c2de5c3b16 temperature flattens or sharpens the probability distribution of the next token, but variables remain: floating-point approximations, different stacks making these choices differently, and what to do when two tokens have exactly the same probability; there are other ways of shaping the LLM's responses: top-k (which avoids infrequent tokens) and top-p, which keeps the n tokens that together account for p% of the probability mass; temperature is applied first, then top-k, then top-p, and the articles explain what to use when (a small sampling sketch appears at the end of this section) OSI proposes a definition of open source AI https://www.technologyreview.com/2024/08/22/1097224/we-finally-have-a-definition-for-open-source-ai/ after big debates over the last few months: usable for any purpose without needing permission, researchers can inspect the components and study how the system works, the system can be modified for any purpose including changing its behavior, and it can be shared with others with or without modification, whatever the use; it also defines levels of transparency (training data, source code, weights) A long PostgreSQL retrospective at insane volumes and the locking problems that come with them https://ardentperf.com/2024/03/03/postgres-indexes-partitioning-and-lwlocklockmanager-scalability/ an article to reassure you that you will probably never hit the problem, a story told as a post mortem, with advice for avoiding these cliffs Tooling A first look at Gradle's future declarative notation https://blog.gradle.org/declarative-gradle-first-eap an article explaining what this new declarative Gradle syntax looks like (alongside Groovy and Kotlin); a few videos show the support in Android Studio for now, as well as in an experimental tool, while waiting for support in all IDEs; the idea is to avoid scripting and really have only a description of your build, which should improve Gradle support in IDEs and allow fast completion, etc.; is it just me, or does that sound like Maven?
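And here is the sampling sketch promised above: temperature, then top-k, then top-p, over a toy vocabulary. Real inference stacks differ in the details (tie-breaking, floating-point precision, ordering of the filters), which is exactly why a temperature of zero is not a determinism guarantee.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

public class SamplingDemo {

    record Candidate(int tokenId, double probability) {}

    /** Temperature, then top-k, then top-p (nucleus), then sample: the order described above. */
    static int sample(double[] logits, double temperature, int topK, double topP, Random rng) {
        // 1. Temperature: divide logits; low T sharpens the distribution, high T flattens it.
        double[] scaled = new double[logits.length];
        for (int i = 0; i < logits.length; i++) scaled[i] = logits[i] / Math.max(temperature, 1e-6);

        // 2. Softmax.
        double max = Double.NEGATIVE_INFINITY;
        for (double l : scaled) max = Math.max(max, l);
        double sum = 0;
        double[] probs = new double[scaled.length];
        for (int i = 0; i < scaled.length; i++) { probs[i] = Math.exp(scaled[i] - max); sum += probs[i]; }
        for (int i = 0; i < probs.length; i++) probs[i] /= sum;

        // 3. Sort by probability and keep only the top-k candidates.
        List<Candidate> sorted = new ArrayList<>();
        for (int i = 0; i < probs.length; i++) sorted.add(new Candidate(i, probs[i]));
        sorted.sort(Comparator.comparingDouble(Candidate::probability).reversed());
        List<Candidate> kept = sorted.subList(0, Math.min(topK, sorted.size()));

        // 4. Top-p: keep the smallest prefix whose cumulative probability reaches p.
        List<Candidate> nucleus = new ArrayList<>();
        double cumulative = 0;
        for (Candidate c : kept) {
            nucleus.add(c);
            cumulative += c.probability();
            if (cumulative >= topP) break;
        }

        // 5. Renormalize and sample from what is left.
        double total = nucleus.stream().mapToDouble(Candidate::probability).sum();
        double r = rng.nextDouble() * total;
        for (Candidate c : nucleus) {
            r -= c.probability();
            if (r <= 0) return c.tokenId();
        }
        return nucleus.get(nucleus.size() - 1).tokenId();
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.5, 0.2, -1.0, -3.0};   // toy "next token" scores
        System.out.println(sample(logits, 0.7, 3, 0.9, new Random(42)));
    }
}
```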
Firefox support in Puppeteer https://hacks.mozilla.org/2024/08/puppeteer-support-for-firefox/ Puppeteer, the browser automation library, now officially supports Firefox as of version 23. This advance lets developers write automation scripts and run end-to-end tests on Chrome and Firefox interchangeably. Firefox integration in Puppeteer relies on WebDriver BiDi, a cross-browser protocol being standardized at the W3C. WebDriver BiDi makes multi-browser support easier and paves the way for simpler, more efficient automation. Puppeteer's main features, such as log capture, device emulation, network interception and script preloading, are now available for Firefox. Mozilla sees WebDriver BiDi as an important step toward a better cross-browser testing experience. Experimental CDP (Chrome DevTools Protocol) support in Firefox will be removed at the end of 2024 in favor of WebDriver BiDi. Although Firefox is officially supported, some APIs remain unsupported and will be the subject of future work. Guillaume created a @Retry annotation for JUnit 5, to re-run a test that is "flaky" https://glaforge.dev/posts/2024/09/01/a-retryable-junit-5-extension/ Guillaume had not found a built-in extension in JUnit 5 to replace JUnit 4's Retry rules, but an interesting discussion followed on social media, with links to extensions that implement this approach, such as JUnit Pioneer, which offers plenty of useful extensions https://junit-pioneer.org/docs/retrying-test/ or the rerunner extension https://github.com/artsok/rerunner-jupiter Arnaud also suggested configuring Maven Surefire to automatically re-run failed tests https://maven.apache.org/surefire/maven-surefire-plugin/examples/rerun-failing-tests.html the philosophical question being: is it acceptable to have tests that fail intermittently? (a minimal usage sketch appears at the end of this section) Architecture A former GraphQL fan is done with GraphQL and is thinking about the alternatives https://bessey.dev/blog/2024/05/24/why-im-over-graphql/ GraphQL's problems: security (authorization attacks, rate limiting is hard, parsing of malicious queries), performance (the N+1 problem for both data fetching and authorization, memory impact when parsing invalid queries), increased complexity (coupling between business logic and the transport layer, harder maintenance and testing); the solutions considered: adopting REST APIs that conform to OpenAPI 3.0+, better documentation and type safety, and tools to generate typed client/server code; two ways of implementing OpenAPI: "implementation first" (generating the specification from the code) and "specification first" (generating the code from the specification); an interesting take from someone who does not use GraphQL day to day: these were problems that were supposed to be fixed as the ecosystem and tooling matured, but for this person they showed their limits.
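For reference, here is a minimal sketch of what the retry approach looks like from the test author's side, using JUnit Pioneer's @RetryingTest as described in its documentation (exact attributes may vary between versions; Guillaume's extension exposes its own @Retry annotation instead, and the SearchService below is just a stand-in for the real system under test).

```java
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junitpioneer.jupiter.RetryingTest;

class FlakySearchIT {

    // Re-runs the test up to 3 times; it only fails if every attempt fails.
    @RetryingTest(3)
    void searchEventuallyReturnsResults() {
        var results = searchService().search("quarkus");
        assertTrue(!results.isEmpty(), "expected at least one hit");
    }

    // Placeholder for the real system under test.
    private SearchService searchService() {
        return query -> java.util.List.of("hit-1");
    }

    interface SearchService {
        java.util.List<String> search(String query);
    }
}
```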
A presentation by Grace Hopper in 1980 on the future of computers https://youtu.be/AW7ZHpKuqZg?si=w_o5_DtqllVTYZwt it is striking how modern what she describes is, problems we still have today, positive leadership; she describes the advantage of systems made of several computers; the recording was recently declassified Leader election with conditional writes on S3/GCS/Azure buckets https://www.morling.dev/blog/leader-election-with-s3-conditional-writes/ leader election is the process of choosing one node among several to perform a task; traditionally, leader election is done with a distributed locking service like ZooKeeper; Amazon S3 recently added support for conditional writes, which makes leader election possible without a separate service; the algorithm works by having nodes race to create a lock file in S3; the lock file includes an epoch number, which is incremented every time a new leader is elected; nodes can determine whether they are the leader by listing the lock files and checking the epoch number; beware, several leaders can end up elected at the same time (clock drift), so that also has to be handled (a sketch of the core loop appears right after this section) Methodologies Guillaume Laforge interviewed by Sfeir, where he talks about the importance of curiosity, of sharing, of code quality, sprinkled with a few photos of the Cast Codeurs! https://www.sfeir.dev/success-story/guillaume-laforge-maestro-de-java-et-esthete-du-code-propre/ Security How CrowdStrike brought Windows and many companies to their knees https://next.ink/144464/crowdstrike-donne-des-details-techniques-sur-son-fiasco/ the incident came from a configuration update to Falcon, CrowdStrike's EDR https://www.crowdstrike.com/blog/falcon-update-for-windows-hosts-technical-details/ what is an EDR? An Endpoint Detection and Response system monitors your machine (network access, logs, ...) to detect unusual usage; this spy has to interact with the low-level layers of the system (network, sockets, system logs) and therefore hooks into the operating system kernel; it reports information live to a platform that can then adapt the response live; even though the incident lasted less than an hour and a half on CrowdStrike's side, more than 8 million machines ended up out of service, stuck on the Blue Screen of Death, according to Microsoft https://blogs.microsoft.com/blog/2024/07/20/helping-our-customers-through-the-crowdstrike-outage/ this is not the first time: it had already happened a few months earlier on Linux, and since that was a kernel incompatibility it mattered less, because IT departments handle these problems better on Linux https://stackdiary.com/crowdstrike-took-down-debian-and-rocky-linux-a-few-months-ago-and-no-one-noticed/ CIS benchmarks, a pillar for the security of our cloud environments, and not only those! (Katia Himeur Talhi) https://blog.cockpitio.com/security/cis-benchmarks/ the CIS is a non-profit organization that develops standards to improve cybersecurity; the CIS benchmarks are a set of recommendations and good practices for securing IT systems; they can be used to harden security, comply with regulations and standardize practices.
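A sketch of the core loop described above, written against a tiny storage abstraction rather than a specific SDK: with S3, putIfAbsent would map to a PutObject request carrying an If-None-Match: * precondition, which is the conditional-write feature the article relies on. Key names and error handling are illustrative, not the article's actual code, and as noted above you still need to tolerate short windows with two self-declared leaders.

```java
import java.util.Optional;

public class S3StyleLeaderElection {

    /** Minimal conditional store: create-only-if-absent, read, delete. */
    interface ConditionalStore {
        boolean putIfAbsent(String key, String value);   // false if the key already exists
        Optional<String> get(String key);
        void delete(String key);
    }

    private final ConditionalStore store;
    private final String nodeId;

    S3StyleLeaderElection(ConditionalStore store, String nodeId) {
        this.store = store;
        this.nodeId = nodeId;
    }

    /**
     * Try to become leader for the next epoch. Every candidate computes the next epoch from the
     * last lock file it can see and races on the same key; the conditional write guarantees that
     * only one of them wins that epoch.
     */
    Optional<Long> tryAcquire(long lastObservedEpoch) {
        long epoch = lastObservedEpoch + 1;
        String lockKey = "locks/epoch-" + epoch;
        boolean won = store.putIfAbsent(lockKey, nodeId);
        return won ? Optional.of(epoch) : Optional.empty();
    }

    /**
     * A node is (still) leader only if the lock file for its epoch carries its id. Because two
     * nodes with drifting clocks can briefly both believe they lead, downstream writes should
     * still carry the epoch as a fencing token.
     */
    boolean isLeader(long epoch) {
        return store.get("locks/epoch-" + epoch).map(nodeId::equals).orElse(false);
    }
}
```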
Law, society and organization Microsoft signs an agreement with OVHcloud so that they drop their antitrust complaint https://www.politico.eu/article/microsoft-signs-antitrust-truce-with-ovhcloud/ the complaint was filed in Europe; the deal lets customers more easily deploy Microsoft solutions with the cloud provider of their choice; the complaint had been filed in the summer of 2021, because the situation made running MS solutions more expensive and uncompetitive compared with Microsoft's own cloud Elasticsearch and Kibana are open source again, adding the AGPL license alongside their other existing licenses https://www.elastic.co/fr/blog/elasticsearch-is-open-source-again the market of three years ago and today's market have changed, AWS is a good partner, and the confusion between Elasticsearch and AWS's product has cleared up, hence the return to open source via the AGPL (Affero GPL); Elastic never stopped believing in open source according to Shay Banon, its founder; the change to AGPL is an additional option, not a replacement of one of the other existing licenses; and right afterwards, Elastic announced disappointing results that made the stock plunge 25% https://siliconangle.com/2024/08/29/elastic-shares-plunge-25-lower-revenue-projections-amid-slower-customer-commitments/ https://unrollnow.com/status/1832187019235397785 and https://www.elastic.co/pricing/faq/licensing for a summary of the licenses at Elastic Tools of the episode MailMate, an email client with Markdown support that handles lots of email https://medium.com/@nicfab/mailmate-a-powerful-client-email-for-macos-markdown-integrated-email-composition-e218fe2accf3 Emmanuel uses it for his secondary mailboxes, a bit slow to start (sync) but fast after that, virtual mailboxes (per query), SpamSieve, macOS only I think Trippy, a network analyzer https://github.com/fujiapple852/trippy it combines traceroute and ping in a single CLI Conferences The list of conferences comes from the Developers Conferences Agenda/List by Aurélie Vache and contributors: 17 September 2024: We Love Speed - Nantes (France) 17–18 September 2024: Agile en Seine 2024 - Issy-les-Moulineaux (France) 19–20 September 2024: API Platform Conference - Lille (France) & Online 20–21 September 2024: Toulouse Game Dev - Toulouse (France) 25–26 September 2024: PyData Paris - Paris (France) 26 September 2024: Agile Tour Sophia-Antipolis 2024 - Biot (France) 2–4 October 2024: Devoxx Morocco - Marrakech (Morocco) 3 October 2024: VMUG Montpellier - Montpellier (France) 7–11 October 2024: Devoxx Belgium - Antwerp (Belgium) 8 October 2024: Red Hat Summit: Connect 2024 - Paris (France) 10 October 2024: Cloud Nord - Lille (France) 10–11 October 2024: Volcamp - Clermont-Ferrand (France) 10–11 October 2024: Forum PHP - Marne-la-Vallée (France) 11–12 October 2024: SecSea2k24 - La Ciotat (France) 15–16 October 2024: Malt Tech Days 2024 - Paris (France) 16 October 2024: DotPy - Paris (France) 16–17 October 2024: NoCode Summit 2024 - Paris (France) 17–18 October 2024: DevFest Nantes - Nantes (France) 17–18 October 2024: DotAI - Paris (France) 30–31 October 2024: Agile Tour Nantais 2024 - Nantes (France) 30–31 October 2024: Agile Tour Bordeaux 2024 - Bordeaux (France) 31 October 2024–3 November 2024: PyCon.FR - Strasbourg (France) 6 November 2024: Master Dev De France - Paris (France) 7 November 2024: DevFest Toulouse - Toulouse (France) 8 November 2024: BDX I/O - Bordeaux (France) 13–14 November 2024: Agile Tour Rennes 2024 - Rennes (France) 16–17 November 2024: Capitole Du Libre - Toulouse (France) 20–22 November 2024:
Agile Grenoble 2024 - Grenoble (France) 21 November 2024: DevFest Strasbourg - Strasbourg (France) 21 November 2024: Codeurs en Seine - Rouen (France) 27–28 November 2024: Cloud Expo Europe - Paris (France) 28 November 2024: Who Run The Tech ? - Rennes (France) 2–3 December 2024: Tech Rocks Summit - Paris (France) 3 December 2024: Generation AI - Paris (France) 3–5 December 2024: APIdays Paris - Paris (France) 4–5 December 2024: DevOpsRex - Paris (France) 4–5 December 2024: Open Source Experience - Paris (France) 5 December 2024: GraphQL Day Europe - Paris (France) 6 December 2024: DevFest Dijon - Dijon (France) 22–25 January 2025: SnowCamp 2025 - Grenoble (France) 30 January 2025: DevOps D-Day #9 - Marseille (France) 6–7 February 2025: Touraine Tech - Tours (France) 3 April 2025: DotJS - Paris (France) 16–18 April 2025: Devoxx France - Paris (France) Contact us To react to this episode, come and discuss it on the Google group https://groups.google.com/group/lescastcodeurs Contact us via Twitter https://twitter.com/lescastcodeurs Submit a crowdcast or a crowdquestion Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs All the episodes and all the info at https://lescastcodeurs.com/

Working Draft » Podcast Feed
Revision 629: OpenAPI-MSW

Working Draft » Podcast Feed

Sep 3, 2024 87:51


Christoph Fricke (Twitter, Github) is a consultant and web-dev wizard at Holisticon and tells us about his open-source project! OUR SPONSOR Workshops.DE offers IT training for modern web dev…

PodRocket - A web development podcast from LogRocket
Fullstack TypeScript with Erik Hanchett

PodRocket - A web development podcast from LogRocket

Aug 28, 2024 32:55


Erik Hanchett, senior developer advocate at AWS Amplify, explores the world of Fullstack TypeScript. He discusses the significance of end-to-end type safety, the tools to achieve it, and delves into the benefits and functionalities of AWS Amplify. Links https://www.programwitherik.com https://x.com/erikch https://www.youtube.com/c/programwitherik https://www.linkedin.com/in/erikhanchett https://trpc.io https://orval.dev https://docs.amplify.aws We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com (mailto:emily.kochanekketner@logrocket.com), or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod). Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers! What does LogRocket do? LogRocket provides AI-first session replay and analytics that surface the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com. Try LogRocket for free today (https://logrocket.com/signup/?pdr). Special Guest: Erik Hanchett.

Go Time
OpenAPI & API Design

Go Time

Aug 8, 2024 74:12


We're talking OpenAPI this week! Kris & Johnny are joined by Jamie Tanna, one of the maintainers of oapi-codegen, to discuss OpenAPI, API design philosophies, versioning, and open source maintenance and sustainability. In addition to the usual laughs and unpopular opinions, this week's episode includes a Changelog++ section that you don't want to miss.

Develpreneur: Become a Better Developer and Entrepreneur
The Power of Documentation: Transforming Your Development Practices

Develpreneur: Become a Better Developer and Entrepreneur

Aug 8, 2024 19:01


Welcome back to our series on the developer journey. In this episode, we tackle one of the most crucial yet often neglected aspects of development: the power of documentation. While it might seem tedious, proper documentation is vital to enhancing your workflow and ensuring that your work is accessible and understandable for others. Why The Power of Documentation Matters Developer documentation is often the unsung hero in the software development lifecycle. Many developers overlook it, leading to frustration down the line when they or their colleagues struggle to understand undocumented code. Documentation is akin to testing: everyone acknowledges its importance, yet it frequently gets pushed aside due to time constraints. This negligence can result in messy, hard-to-navigate codebases. The truth is that high-quality documentation is indispensable. It's not just about creating records but ensuring that your code's functionality is clear and easily understandable for anyone who might work with it in the future. Good documentation reflects your professionalism and dedication, whether you're writing APIs or developing software. Leverage Modern Tools The good news is that documenting your work has never been easier. Today's tools have revolutionized how we approach documentation, making it more integrated and less of a chore. For instance, if you're building APIs, tools like Swagger (now known as OpenAPI) can auto-generate user-friendly and comprehensive documentation. Similarly, Postman not only helps with API testing but also assists in creating documentation. In addition, static code analysis tools can help highlight areas that may require documentation, but they're only part of the solution. Modern auto-generation tools like Javadoc have set a precedent for documentation in various languages. Whether you're working with Java, Python, or JavaScript, there's likely a tool available to auto-generate documentation for your code. The key is to explore and understand what's available for your specific environment. Integrate the Power of Documentation Throughout the Development Process Documentation should be seamlessly integrated into every phase of development. It's not limited to code comments or README files; it extends to comprehensive setup guides, deployment instructions, and user manuals. For instance, ensuring your README file includes details on how to set up and deploy the application can save a lot of time for those who come after you. Moreover, tools like GitHub, Bitbucket, and Atlassian provide robust platforms for managing documentation alongside code. With these tools, you can create wikis, track issues, and maintain up-to-date documentation that evolves with your project. This way, your documentation is accurate and consistently synchronized with your codebase. The Power of Documentation for Different Audiences Understanding your audience is crucial. Developer documentation must often cater to various stakeholders, including developers, testers, and end-users. Here's a breakdown of what each might require: Developers: Need detailed comments on functions, classes, and APIs. This includes input parameters, return types, and possible exceptions. Well-commented code makes it easier for other developers to pick up where you left off without digging through the entire codebase. Testers: Require information on how the application should behave under different conditions. This includes setup instructions, test cases, and expected outcomes. 
A clear test plan and test case documentation can streamline the quality assurance process. End-users: They need user manuals and release notes that explain how to use the application and what's new in updates. This documentation should be clear, concise, and tailored to the user's level of expertise. Embracing the Power of Documentation Good documentation is more than just a set of guidelines or instructions; it's essential to a well-oiled development machine. When done right, it transforms the way teams work together, reduces the likelihood of costly errors, and ensures that your code can be easily maintained and evolved. By prioritizing documentation, you're investing in the long-term health of your projects. It fosters collaboration, enhances understanding, and leads to more reliable and robust software. In a fast-paced development environment, where changes are constant, and the stakes are high, having thorough and accessible documentation is not just a best practice—it's a competitive advantage. So, as you continue your development journey, remember that documentation isn't a task to be completed and forgotten. It's an ongoing process that should evolve with your code, adapting to new features, updates, and changes. Embrace, champion, and make it a core part of your development process. In doing so, you'll improve your efficiency and contribute to a culture of clarity and excellence in software development. Stay Connected: Join the Developreneur Community We invite you to join our community and share your coding journey with us. Whether you're a seasoned developer or just starting, there's always room to learn and grow together. Contact us at info@develpreneur.com with your questions, feedback, or suggestions for future episodes. Together, let's continue exploring the exciting world of software development. Additional Resources Organizing Business Documentation: A Critical Challenge for Entrepreneurs Test-Driven Development – A Better Object-Oriented Design Approach SDLC – The software development life cycle is simplified Using a Document Repository To Become a Better Developer The Developer Journey Videos – With Bonus Content
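To ground the Javadoc point above in code, here is a small, hypothetical example: the comment block is what the standard javadoc tool (or your build's documentation plugin) turns into browsable HTML, the same way Swagger/OpenAPI turns endpoint annotations into API reference pages. The class and its rate table are invented purely for illustration.

```java
import java.util.Map;

/** Minimal example of Javadoc-style API documentation (illustrative rates, not real data). */
public class CurrencyConverter {

    private static final Map<String, Double> RATES_TO_USD = Map.of("EUR", 1.08, "USD", 1.0, "GBP", 1.27);

    /**
     * Converts an amount between two currencies using a static rate table.
     *
     * @param amount       the amount to convert, in the source currency
     * @param fromCurrency ISO 4217 code of the source currency, e.g. "EUR"
     * @param toCurrency   ISO 4217 code of the target currency, e.g. "USD"
     * @return the converted amount, rounded to two decimal places
     * @throws IllegalArgumentException if either currency code is unknown
     */
    public double convert(double amount, String fromCurrency, String toCurrency) {
        Double from = RATES_TO_USD.get(fromCurrency);
        Double to = RATES_TO_USD.get(toCurrency);
        if (from == null || to == null) {
            throw new IllegalArgumentException("Unknown currency code");
        }
        return Math.round(amount * from / to * 100.0) / 100.0;
    }
}
```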

Changelog Master Feed
OpenAPI & API Design (Go Time #328)

Changelog Master Feed

Aug 8, 2024 74:12


We're talking OpenAPI this week! Kris & Johnny are joined by Jamie Tanna, one of the maintainers of oapi-codegen, to discuss OpenAPI, API design philosophies, versioning, and open source maintenance and sustainability. In addition to the usual laughs and unpopular opinions, this week's episode includes a Changelog++ section that you don't want to miss.

The Week with Roger
This Week: Digital Transformation World: from AI to APIs

The Week with Roger

Jun 24, 2024 10:57


Analyst Don Kellogg receives an update from Roger Entner and Daryl Schoolar, who are attending the Digital Transformation World conference in Copenhagen. 00:25 Overview of the conference 04:06 Practical AI use cases 07:27 Cloud native is crucial for operators 08:24 Corporate culture is a sticking point 09:10 How should outages be treated? 10:22 Event wrap-up https://dtw.tmforum.org/ Tags: telecom, telecommunications, wireless, prepaid, postpaid, cellular phone, Don Kellogg, Roger Entner, Daryl Schoolar, DTW24, AI, OpenAPI, Nokia, simplification, cloud native, corporate culture, outages

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Editor's note: One of the top reasons we have hundreds of companies and thousands of AI Engineers joining the World's Fair next week is, apart from discussing technology and being present for the big launches planned, to hire and be hired! Listeners loved our previous Elicit episode and were so glad to welcome 2 more members of Elicit back for a guest post (and bonus podcast) on how they think through hiring. Don't miss their AI engineer job description, and template which you can use to create your own hiring plan! How to Hire AI EngineersJames Brady, Head of Engineering @ Elicit (ex Spring, Square, Trigger.io, IBM)Adam Wiggins, Internal Journalist @ Elicit (Cofounder Ink & Switch and Heroku)If you're leading a team that uses AI in your product in some way, you probably need to hire AI engineers. As defined in this article, that's someone with conventional engineering skills in addition to knowledge of language models and prompt engineering, without being a full-fledged Machine Learning expert.But how do you hire someone with this skillset? At Elicit we've been applying machine learning to reasoning tools since 2018, and our technical team is a mix of ML experts and what we can now call AI engineers. This article will cover our process from job description through interviewing. (You can also flip the perspectives here and use it just as easily for how to get hired as an AI engineer!)My own journeyBefore getting into the brass tacks, I want to share my journey to becoming an AI engineer.Up until a few years ago, I was happily working my job as an engineering manager of a big team at a late-stage startup. Like many, I was tracking the rapid increase in AI capabilities stemming from the deep learning revolution, but it was the release of GPT-3 in 2020 which was the watershed moment. At the time, we were all blown away by how the model could string together coherent sentences on demand. (Oh how far we've come since then!)I'd been a professional software engineer for nearly 15 years—enough to have experienced one or two technology cycles—but I could see this was something categorically new. I found this simultaneously exciting and somewhat disconcerting. I knew I wanted to dive into this world, but it seemed like the only path was going back to school for a master's degree in Machine Learning. I started talking with my boss about options for taking a sabbatical or doing a part-time distance learning degree.In 2021, I instead decided to launch a startup focused on productizing new research ideas on ML interpretability. It was through that process that I reached out to Andreas—a leading ML researcher and founder of Elicit—to see if he would be an advisor. Over the next few months, I learned more about Elicit: that they were trying to apply these fascinating technologies to the real-world problems of science, and with a business model that aligned it with safety goals. I realized that I was way more excited about Elicit than I was about my own startup ideas, and wrote about my motivations at the time.Three years later, it's clear this was a seismic shift in my career on the scale of when I chose to leave my comfy engineering job at IBM to go through the Y Combinator program back in 2008. 
Working with this new breed of technology has been more intellectually stimulating, challenging, and rewarding than I could have imagined.Deep ML expertise not requiredIt's important to note that AI engineers are not ML experts, nor is that their best contribution to a tech team.In our article Living documents as an AI UX pattern, we wrote:It's easy to think that AI advancements are all about training and applying new models, and certainly this is a huge part of our work in the ML team at Elicit. But those of us working in the UX part of the team believe that we have a big contribution to make in how AI is applied to end-user problems.We think of LLMs as a new medium to work with, one that we've barely begun to grasp the contours of. New computing mediums like GUIs in the 1980s, web/cloud in the 90s and 2000s, and multitouch smartphones in the 2000s/2010s opened a whole new era of engineering and design practices. So too will LLMs open new frontiers for our work in the coming decade.To compare to the early era of mobile development: great iOS developers didn't require a detailed understanding of the physics of capacitive touchscreens. But they did need to know the capabilities and limitations of a multi-touch screen, the constrained CPU and storage available, the context in which the user is using it (very different from a webpage or desktop computer), etc.In the same way, an AI engineer needs to work with LLMs as a medium that is fundamentally different from other compute mediums. That means an interest in the ML side of things, whether through their own self-study, tinkering with prompts and model fine-tuning, or following along in #llm-paper-club. But this understanding is so that they can work with the medium effectively versus, say, spending their days training new models.Language models as a chaotic mediumSo if we're not expecting deep ML expertise from AI engineers, what are we expecting? This brings us to what makes LLMs different.We'll assume already that our ideal candidate is already inspired by, and full of ideas about, all the new capabilities AI can bring to software products. But the flip side is all the things that make this new medium difficult to work with. LLM calls are annoying due to high latency (measured in tens of seconds sometimes, rather than milliseconds), extreme variance on latency, high error rates even under normal operation. Not to mention getting extremely different answers to the same prompt provided to the same model on two subsequent calls!The net effect is that an AI engineer, even working at the application development level, needs to have a skillset comparable to distributed systems engineering. Handling errors, retries, asynchronous calls, streaming responses, parallelizing and recombining model calls, the halting problem, and fallbacks are just some of the day-in-the-life of an AI engineer. Chaos engineering gets new life in the era of AI.Skills and qualities in candidatesLet's put together what we don't need (deep ML expertise) with what we do (work with capabilities and limitations of the medium). Thus we start to see what Elicit looks for in AI engineers:* Conventional software engineering skills. 
Especially back-end engineering on complex, data-intensive applications.* Professional, real-world experience with applications at scale.* Deep, hands-on experience across a few back-end web frameworks.* Light devops and an understanding of infrastructure best practices.* Queues, message buses, event-driven and serverless architectures, … there's no single “correct” approach, but having a deep toolbox to draw from is very important.* A genuine curiosity and enthusiasm for the capabilities of language models.* One or more serious projects (side projects are fine) of using them in interesting ways on a unique domain.* …ideally with some level of factored cognition, e.g. breaking the problem down into chunks, making thoughtful decisions about which things to push to the language model and which stay within the realm of conventional heuristics and compute capabilities.* Personal studying with resources like Elicit's ML reading list. Part of the role is collaborating with the ML engineers and researchers on our team. To do so, the candidate needs to “speak their language” somewhat, just as a mobile engineer needs some familiarity with backends in order to collaborate effectively on API creation with backend engineers.* An understanding of the challenges that come along with working with large models (high latency, variance, etc.) leading to a defensive, fault-first mindset.* Careful and principled handling of error cases, asynchronous code (and ability to reason about and debug it), streaming data, caching, logging and analytics for understanding behavior in production.* This is a similar mindset that one can develop working on conventional apps which are complex, data-intensive, or large-scale apps. The difference is that an AI engineer will need this mindset even when working on relatively small scales!On net, a great AI engineer will combine two seemingly contrasting perspectives: knowledge of, and a sense of wonder for, the capabilities of modern ML models; but also the understanding that this is a difficult and imperfect foundation, and the willingness to build resilient and performant systems on top of it.Here's the resulting AI engineer job description for Elicit. And here's a template that you can borrow from for writing your own JD.Hiring processOnce you know what you're looking for in an AI engineer, the process is not too different from other technical roles. Here's how we do it, broken down into two stages: sourcing and interviewing.SourcingWe're primarily looking for people with (1) a familiarity with and interest in ML, and (2) proven experience building complex systems using web technologies. The former is important for culture fit and as an indication that the candidate will be able to do some light prompt engineering as part of their role. The latter is important because language model APIs are built on top of web standards and—as noted above—aren't always the easiest tools to work with.Only a handful of people have built complex ML-first apps, but fortunately the two qualities listed above are relatively independent. Perhaps they've proven (2) through their professional experience and have some side projects which demonstrate (1).Talking of side projects, evidence of creative and original prototypes is a huge plus as we're evaluating candidates. 
We've barely scratched the surface of what's possible to build with LLMs—even the current generation of models—so candidates who have been willing to dive into crazy “I wonder if it's possible to…” ideas have a huge advantage.InterviewingThe hard skills we spend most of our time evaluating during our interview process are in the “building complex systems using web technologies” side of things. We will be checking that the candidate is familiar with asynchronous programming, defensive coding, distributed systems concepts and tools, and display an ability to think about scaling and performance. They needn't have 10+ years of experience doing this stuff: even junior candidates can display an aptitude and thirst for learning which gives us confidence they'll be successful tackling the difficult technical challenges we'll put in front of them.One anti-pattern—something which makes my heart sink when I hear it from candidates—is that they have no familiarity with ML, but claim that they're excited to learn about it. The amount of free and easily-accessible resources available is incredible, so a motivated candidate should have already dived into self-study.Putting all that together, here's the interview process that we follow for AI engineer candidates:* 30-minute introductory conversation. Non-technical, explaining the interview process, answering questions, understanding the candidate's career path and goals.* 60-minute technical interview. This is a coding exercise, where we play product manager and the candidate is making changes to a little web app. Here are some examples of topics we might hit upon through that exercise:* Update API endpoints to include extra metadata. Think about appropriate data types. Stub out frontend code to accept the new data.* Convert a synchronous REST API to an asynchronous streaming endpoint.* Cancellation of asynchronous work when a user closes their tab.* Choose an appropriate data structure to represent the pending, active, and completed ML work which is required to service a user request.* 60–90 minute non-technical interview. Walk through the candidate's professional experience, identifying high and low points, getting a grasp of what kinds of challenges and environments they thrive in.* On-site interviews. Half a day in our office in Oakland, meeting as much of the team as possible: more technical and non-technical conversations.The frontier is wide openAlthough Elicit is perhaps further along than other companies on AI engineering, we also acknowledge that this is a brand-new field whose shape and qualities are only just now starting to form. We're looking forward to hearing how other companies do this and being part of the conversation as the role evolves.We're excited for the AI Engineer World's Fair as another next step for this emerging subfield. 
And of course, check out the Elicit careers page if you're interested in joining our team.Podcast versionTimestamps* [00:00:24] Intros* [00:05:25] Defining the Hiring Process* [00:08:42] Defensive AI Engineering as a chaotic medium* [00:10:26] Tech Choices for Defensive AI Engineering* [00:14:04] How do you Interview for Defensive AI Engineering* [00:19:25] Does Model Shadowing Work?* [00:22:29] Is it too early to standardize Tech stacks?* [00:32:02] Capabilities: Offensive AI Engineering* [00:37:24] AI Engineering Required Knowledge* [00:40:13] ML First Mindset* [00:45:13] AI Engineers and Creativity* [00:47:51] Inside of Me There Are Two Wolves* [00:49:58] Sourcing AI Engineers* [00:58:45] Parting ThoughtsTranscript[00:00:00] swyx: Okay, so welcome to the Latent Space Podcast. This is another remote episode that we're recording. This is the first one that we're doing around a guest post. And I'm very honored to have two of the authors of the post with me, James and Adam from Elicit. Welcome, James. Welcome, Adam.[00:00:22] James Brady: Thank you. Great to be here.[00:00:23] Hey there.[00:00:24] Intros[00:00:24] swyx: Okay, so I think I will do this kind of in order. I think James, you're, you're sort of the primary author. So James, you are head of engineering at Elicit. You also, We're VP Eng at Teespring and Spring as well. And you also , you have a long history in sort of engineering. How did you, , find your way into something like Elicit where, , it's, you, you are basically traditional sort of VP Eng, VP technology type person moving into a more of an AI role.[00:00:53] James Brady: Yeah, that's right. It definitely was something of a Sideways move if not a left turn. So the story there was I'd been doing, as you said, VP technology, CTO type stuff for around about 15 years or so, and Notice that there was this crazy explosion of capability and interesting stuff happening within AI and ML and language models, that kind of thing.[00:01:16] I guess this was in 2019 or so, and decided that I needed to get involved. , this is a kind of generational shift. And Spent maybe a year or so trying to get up to speed on the state of the art, reading papers, reading books, practicing things, that kind of stuff. Was going to found a startup actually in in the space of interpretability and transparency, and through that met Andreas, who has obviously been on the, on the podcast before asked him to be an advisor for my startup, and he countered with, maybe you'd like to come and run the engineering team at Elicit, which it turns out was a much better idea.[00:01:48] And yeah, I kind of quickly changed in that direction. So I think some of the stuff that we're going to be talking about today is how actually a lot of the work when you're building applications with AI and ML looks and smells and feels much more like conventional software engineering with a few key differences rather than really deep ML stuff.[00:02:07] And I think that's one of the reasons why I was able to transfer skills over from one place to the other.[00:02:12] swyx: Yeah, I[00:02:12] James Brady: definitely[00:02:12] swyx: agree with that. I, I do often say that I think AI engineering is about 90 percent software engineering with like the, the 10 percent of like really strong really differentiated AI engineering.[00:02:22] And that might, that obviously that number might change over time. 
I want to also welcome Adam onto my podcast because you welcomed me onto your podcast two years ago.[00:02:31] Adam Wiggins: Yeah, that was a wonderful episode.[00:02:32] swyx: That was, that was a fun episode. You famously founded Heroku. You just wrapped up a few years working on Muse.[00:02:38] And now you've described yourself as a journalist, internal journalist working on Elicit.[00:02:43] Adam Wiggins: Yeah, well I'm kind of a little bit in a wandering phase here and trying to take this time in between ventures to see what's out there in the world and some of my wandering took me to the Elicit team. And found that they were some of the folks who were doing the most interesting, really deep work in terms of taking the capabilities of language models and applying them to what I feel like are really important problems.[00:03:08] So in this case, science and literature search and, and, and that sort of thing. It fits into my general interest in tools and productivity software. I, I think of it as a tool for thought in many ways, but a tool for science, obviously, if we can accelerate that discovery of new medicines and things like that, that's, that's just so powerful.[00:03:24] But to me, it's a. It's kind of also an opportunity to learn at the feet of some real masters in this space, people who have been working on it since it was, before it was cool, if you want to put it that way. So for me, the last couple of months have been this crash course, and why I sometimes describe myself as an internal journalist is I'm helping to write some, some posts, including Supporting James in this article here we're doing for latent space where I'm just bringing my writing skill and that sort of thing to bear on their very deep domain expertise around language models and applying them to the real world and kind of surface that in a way that's I don't know, accessible, legible, that, that sort of thing.[00:04:03] And so, and the great benefit to me is I get to learn this stuff in a way that I don't think I would, or I haven't, just kind of tinkering with my own side projects.[00:04:12] swyx: I forgot to mention that you also run Ink and Switch, which is one of the leading research labs, in my mind, of the tools for thought productivity space, , whatever people mentioned there, or maybe future of programming even, a little bit of that.[00:04:24] As well. I think you guys definitely started the local first wave. I think there was just the first conference that you guys held. I don't know if you were personally involved.[00:04:31] Adam Wiggins: Yeah, I was one of the co organizers along with a few other folks for, yeah, called Local First Conf here in Berlin.[00:04:36] Huge success from my, my point of view. Local first, obviously, a whole other topic we can talk about on another day. I think there actually is a lot more what would you call it , handshake emoji between kind of language models and the local first data model. And that was part of the topic of the conference here, but yeah, topic for another day.[00:04:55] swyx: Not necessarily. I mean , I, I selected as one of my keynotes, Justine Tunney, working at LlamaFall in Mozilla, because I think there's a lot of people interested in that stuff. But we can, we can focus on the headline topic. And just to not bury the lead, which is we're talking about hire, how to hire AI engineers, this is something that I've been looking for a credible source on for months.[00:05:14] People keep asking me for my opinions. 
I don't feel qualified to give an opinion and it's not like I have. So that's kind of defined hiring process that I'm super happy with, even though I've worked with a number of AI engineers.[00:05:25] Defining the Hiring Process[00:05:25] swyx: I'll just leave it open to you, James. How was your process of defining your hiring, hiring roles?[00:05:31] James Brady: Yeah. So I think the first thing to say is that we've effectively been hiring for this kind of a role since before you, before you coined the term and tried to kind of build this understanding of what it was.[00:05:42] So, which is not a bad thing. Like it's, it was a, it was a good thing. A concept, a concept that was coming to the fore and effectively needed a name, which is which is what you did. So the reason I mentioned that is I think it was something that we kind of backed into, if you will. We didn't sit down and come up with a brand new role from, from scratch of this is a completely novel set of responsibilities and skills that this person would need.[00:06:06] However, it is a A kind of particular blend of different skills and attitudes and and curiosities interests, which I think makes sense to kind of bundle together. So in the, in the post, the three things that we say are most important for a highly effective AI engineer are first of all, conventional software engineering skills, which is Kind of a given, but definitely worth mentioning.[00:06:30] The second thing is a curiosity and enthusiasm for machine learning and maybe in particular language models. That's certainly true in our case. And then the third thing is to do with basically a fault first mindset, being able to build systems that can handle things going wrong in, in, in some sense.[00:06:49] And yeah, the I think the kind of middle point, the curiosity about ML and language models is probably fairly self evident. They're going to be working with, and prompting, and dealing with the responses from these models, so that's clearly relevant. The last point, though, maybe takes the most explaining.[00:07:07] To do with this fault first mindset and the ability to, to build resilient systems. The reason that is, is so important is because compared to normal APIs, where normal, think of something like a Stripe API or a search API or something like this. The latency when you're working with language models is, is wild, like you can get 10x variation.[00:07:32] I mean, I was looking at the stats before, actually, before, before the podcast. We do often, normally, in fact, see a 10x variation in the P90 latency over the course of, Half an hour, an hour when we're prompting these models, which is way higher than if you're working with a, more kind of conventional conventionally backed API.[00:07:49] And the responses that you get, the actual content and the responses are naturally unpredictable as well. They come back with different formats. Maybe you're expecting JSON. It's not quite JSON. You have to handle this stuff. And also the, the semantics of the messages are unpredictable too, which is, which is a good thing.[00:08:08] Like this is one of the things that you're looking for from these language models, but it all adds up to needing to. Build a resilient, reliable, solid feeling system on top of this fundamentally, well, certainly currently fundamentally shaky foundation. 
The models do not behave in the way that you would like them to.[00:08:28] And yeah, the ability to structure the code around them such that it does give the user this warm, reassuring, Snappy, solid feeling is is really what we're driving for there.[00:08:42] Defensive AI Engineering as a chaotic medium[00:08:42] Adam Wiggins: What really struck me as we, we dug in on the content for this article was that third point there. The, the language models is this kind of chaotic medium, this, this dragon, this wild horse you're, you're, you're riding and trying to guide in the direction that is going to be useful and reliable to users, because I think.[00:08:58] So much of software engineering is about making things not only high performance and snappy, but really just making it stable, reliable, predictable, which is literally the opposite of what you get from from the language models. And yet, yeah, the output is so useful, and indeed, some of their Creativity, if you want to call it that, which is, is precisely their value.[00:09:19] And so you need to work with this medium. And I guess the nuanced or the thing that came out of Elissa's experience that I thought was so interesting is quite a lot of working with that is things that come from distributed systems engineering. But you have really the AI engineers as we're defining them or, or labeling them on the illicit team is people who are really application developers.[00:09:39] You're building things for end users. You're thinking about, okay, I need to populate this interface with some response to user input. That's useful to the tasks they're trying to do, but you have this. This is the thing, this medium that you're working with that in some ways you need to apply some of this chaos engineering, distributed systems engineering, which typically those people with those engineering skills are not kind of the application level developers with the product mindset or whatever, they're more deep in the guts of a, of a system.[00:10:07] And so it's, those, those skills and, and knowledge do exist throughout the engineering discipline, but sort of putting them together into one person that is That feels like sort of a unique thing and working with the folks on the Elicit team who have that skills I'm quite struck by that unique that unique blend.[00:10:23] I haven't really seen that before in my 30 year career in technology.[00:10:26] Tech Choices for Defensive AI Engineering[00:10:26] swyx: Yeah, that's a Fascinating I like the reference to chaos engineering. I have some appreciation, I think when you had me on your podcast, I was still working at Temporal and that was like a nice Framework, if you live within Temporal's boundaries, you can pretend that all those faults don't exist, and you can, you can code in a sort of very fault tolerant way.[00:10:47] What is, what is you guys solutions around this, actually? Like, I think you're, you're emphasizing having the mindset, but maybe naming some technologies would help? Not saying that you have to adopt these technologies, but they're just, they're just quick vectors into what you're talking about when you're, when you're talking about distributed systems.[00:11:03] Like, that's such a big, chunky word, , like are we talking, are Kubernetes or, and I suspect we're not, , like we're, we're talking something else now.[00:11:10] James Brady: Yeah, that's right. 
It's more at the application level rather than at the infrastructure level, at least, at least the way that it works for us.[00:11:17] So there's nothing kind of radically novel here. It is more a careful application of existing concepts. So the kinds of tools that we reach for to handle these kind of slightly chaotic objects that Adam was just talking about, are retries and fallbacks and timeouts and careful error handling. And, yeah, the standard stuff, really.[00:11:39] There's also a great degree of dependence. We rely heavily on parallelization because, , these language models are not innately very snappy, and , there's just a lot of I. O. going back and forth. So All these things I'm talking about when I was in my earlier stages of a career, these are kind of the things that are the difficult parts that most senior software engineers will be better at.[00:12:01] It is careful error handling, and concurrency, and fallbacks, and distributed systems, and, , eventual consistency, and all this kind of stuff and As Adam was saying, the kind of person that is deep in the guts of some kind of distributed systems, a really high, high scale backend kind of a problem would probably naturally have these kinds of skills.[00:12:21] But you'll find them on, on day one, if you're building a, , an ML powered app, even if it's not got massive scale. I think one one thing that I would mention that we do do yeah, maybe, maybe two related things, actually. The first is we're big fans of strong typing. We share the types all the way from the Backend Python code all the way to the to the front end in TypeScript and find that is I mean We'd probably do this anyway But it really helps one reason around the shapes of the data which can going to be going back and forth and that's really important When you can't rely upon You you're going to have to coerce the data that you get back from the ML if you want if you want for it to be structured basically speaking and The second thing which is related is we use checked exceptions inside our Python code base, which means that we can use the type system to make sure we are handling, properly handling, all of the, the various things that could be going wrong, all the different exceptions that could be getting raised.[00:13:16] So, checked exceptions are not, not really particularly popular. Actually there's not many people that are big fans of them. For our particular use case, to really make sure that we've not just forgotten to handle, , This particular type of error we have found them useful to to, to force us to think about all the different edge cases that can come up.[00:13:32] swyx: Fascinating. How just a quick note of technology. How do you share types from Python to TypeScript? Do you, do you use GraphQL? Do you use something[00:13:39] James Brady: else? We don't, we don't use GraphQL. Yeah. So we've got the We've got the types defined in Python, that's the source of truth. And we go from the OpenAPI spec, and there's a, there's a tool that you work and use to generate types dynamically, like TypeScript types from those OpenAPI definitions.[00:13:57] swyx: Okay, excellent. Okay, cool. Sorry, sorry for diving into that rabbit hole a little bit. 
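For readers who want to see the shape of what James describes, here is a minimal Python sketch of retries with backoff plus a provider fallback around an LLM call. The helper names are placeholders, not Elicit's actual code; a real implementation would catch provider-specific error types, enforce timeouts inside the client, and, as mentioned above, use a separately tuned prompt for the fallback provider.

```python
import random
import time
from typing import Callable


def call_with_retries(
    call: Callable[[str], str],
    prompt: str,
    *,
    attempts: int = 3,
    base_backoff_s: float = 1.0,
) -> str:
    """Retry a flaky model call with exponential backoff and jitter.

    Timeouts are assumed to be enforced inside `call` itself (e.g. by the
    HTTP client), so a hung request surfaces here as an exception too.
    """
    last_error: Exception | None = None
    for attempt in range(attempts):
        try:
            return call(prompt)
        except Exception as err:  # real code would catch narrower error types
            last_error = err
            time.sleep(base_backoff_s * (2 ** attempt) + random.random())
    raise RuntimeError(f"model call failed after {attempts} attempts") from last_error


def answer(prompt: str,
           primary: Callable[[str], str],
           fallback: Callable[[str], str]) -> str:
    """Prefer the primary provider; fall back to a second one if it keeps failing."""
    try:
        return call_with_retries(primary, prompt)
    except RuntimeError:
        # A fallback provider usually needs its own prompt variant to stay
        # within a few points of the primary's quality; this sketch reuses the text.
        return call_with_retries(fallback, prompt, attempts=2)
```

The same "types as the source of truth" idea James mentions (Python models, OpenAPI spec, generated TypeScript types) would sit around a helper like this, so that whatever the model returns is coerced into one shared shape before it crosses the API boundary.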
I always like to spell out technologies for people to dig their teeth into.[00:14:04] How do you Interview for Defensive AI Engineering[00:14:04] swyx: One thing I'll, one thing I'll mention quickly is that a lot of the stuff that you mentioned is typically not part of the normal interview loop.[00:14:10] It's actually really hard to interview for because this is the stuff that you polish out in, as you go into production, the coding interviews are typically about the happy path. How do we do that? How do we, how do we design, how do you look for a defensive fault first mindset?[00:14:24] Because you can defensive code all day long and not add functionality. to your to your application.[00:14:29] James Brady: Yeah, it's a great question and I think that's exactly true. Normally the interview is about the happy path and then there's maybe a box checking exercise at the end of the candidate says of course in reality I would handle the edge cases or something like this and that unfortunately isn't isn't quite good enough when when the happy path is is very very narrow and yeah there's lots of weirdness on either side so basically speaking, it's just a case of, of foregrounding those kind of concerns through the interview process.[00:14:58] It's, there's, there's no magic to it. We, we talk about this in the, in the po in the post that we're gonna be putting up on, on Laton space. The, there's two main technical exercises that we do through our interview process for this role. The first is more coding focus, and the second is more system designy.[00:15:16] Yeah. White whiteboarding a potential solution. And in, without giving too much away in the coding exercise. You do need to think about edge cases. You do need to think about errors. The exercise consists of adding features and fixing bugs inside the code base. And in both of those two cases, it does demand, because of the way that we set the application up and the interview up, it does demand that you think about something other than the happy path.[00:15:41] But your thinking is the right prompt of how do we get the candidate thinking outside of the, the kind of normal Sweet spot, smooth smooth, smoothly paved path. In terms of the system design interview, that's a little easier to prompt this kind of fault first mindset because it's very easy in that situation just to say, let's imagine that, , this node dies, how does the app still work?[00:16:03] Let's imagine that this network is, is going super slow. Let's imagine that, I don't know, like you, you run out of, you run out of capacity in, in, in this database that you've sketched out here, how do you handle that, that, that sort of stuff. So. It's, in both cases, they're not firmly anchored to and built specifically around language models and ways language models can go wrong, but we do exercise the same muscles of thinking defensively and yeah, foregrounding the edge cases, basically.[00:16:32] Adam Wiggins: James, earlier there you mentioned retries. And this is something that I think I've seen some interesting debates internally about things regarding, first of all, retries are, can be costly, right? In general, this medium, in addition to having this incredibly high variance and response rate, and, , being non deterministic, is actually quite expensive.[00:16:50] And so, in many cases, doing a retry when you get a fail does make sense, but actually that has an impact on cost. And so there is Some sense to which, at least I've seen the AI engineers on our team, worry about that. 
They worry about, okay, how do we give the best user experience, but balance that against what the infrastructure is going to, , is going to cost our company, which I think is again, an interesting mix of, yeah, again, it's a little bit the distributed system mindset, but it's also a product perspective and you're thinking about the end user experience, but also the.[00:17:22] The bottom line for the business, you're bringing together a lot of a lot of qualities there. And there's also the fallback case, which is kind of, kind of a related or adjacent one. I think there was also a discussion on that internally where, I think it maybe was search, there was something recently where there was one of the frontline search providers was having some, yeah, slowness and outages, and essentially then we had a fallback, but essentially that gave people for a while, especially new users that come in that don't the difference, they're getting a They're getting worse results for their search.[00:17:52] And so then you have this debate about, okay, there's sort of what is correct to do from an engineering perspective, but then there's also what actually is the best result for the user. Is giving them a kind of a worse answer to their search result better, or is it better to kind of give them an error and be like, yeah, sorry, it's not working right at the moment, try again.[00:18:12] Later, both are obviously non optimal, but but this is the kind of thing I think that that you run into or, or the kind of thing we need to grapple with a lot more than you would other kinds of, of mediums.[00:18:24] James Brady: Yeah, that's a really good example. I think it brings to the fore the two different things that you could be optimizing for of uptime and response at all costs on one end of the spectrum and then effectively fragility, but kind of, if you get a response, it's the best response we can come up with at the other end of the spectrum.[00:18:43] And where you want to land there kind of depends on, well, it certainly depends on the app, obviously depends on the user. I think it depends on the, feature within the app as well. So in the search case that you, that you mentioned there, in retrospect, we probably didn't want to have the fallback. And we've actually just recently on Monday, changed that to Show an error message rather than giving people a kind of degraded experience in other situations We could use for example a large language model from a large language model from provider B rather than provider A and Get something which is within the A few percentage points performance, and that's just a really different situation.[00:19:21] So yeah, like any interesting question, the answer is, it depends.[00:19:25] Does Model Shadowing Work?[00:19:25] swyx: I do hear a lot of people suggesting I, let's call this model shadowing as a defensive technique, which is, if OpenAI happens to be down, which, , happens more often than people think then you fall back to anthropic or something.[00:19:38] How realistic is that, right? Like you, don't you have to develop completely different prompts for different models and won't the, won't the performance of your application suffer from whatever reason, right? Like it may be caused differently or it's not maintained in the same way. 
I, I think that people raise this idea of fallbacks to models, but I don't think it's, I don't, I don't see it practiced very much.[00:20:02] James Brady: Yeah, it is, you, you definitely need to have a different prompt if you want to stay within a few percentage points degradation Like I, like I said before, and that certainly comes at a cost, like fallbacks and backups and things like this It's really easy for them to go stale and kind of flake out on you because they're off the beaten track And In our particular case inside of Elicit, we do have fallbacks for a number of kind of crucial functions where it's going to be very obvious if something has gone wrong, but we don't have fallbacks in all cases.[00:20:40] It really depends on a task to task basis throughout the app. So I can't give you a kind of a, a single kind of simple rule of thumb for, in this case, do this. And in the other, do that. But yeah, we've it's a little bit easier now that the APIs between the anthropic models and opening are more similar than they used to be.[00:20:59] So we don't have two totally separate code paths with different protocols, like wire protocols to, to speak, which makes things easier, but you're right. You do need to have different prompts if you want to, have similar performance across the providers.[00:21:12] Adam Wiggins: I'll also note, just observing again as a relative newcomer here, I was surprised, impressed, not sure what the word is for it, at the blend of different backends that the team is using.[00:21:24] And so there's many The product presents as kind of one single interface, but there's actually several dozen kind of main paths. There's like, for example, the search versus a data extraction of a certain type, versus chat with papers, versus And each one of these, , the team has worked very hard to pick the right Model for the job and craft the prompt there, but also is constantly testing new ones.[00:21:48] So a new one comes out from either, from the big providers or in some cases, Our own models that are , running on, on essentially our own infrastructure. And sometimes that's more about cost or performance, but the point is kind of switching very fluidly between them and, and very quickly because this field is moving so fast and there's new ones to choose from all the time is like part of the day to day, I would say.[00:22:11] So it isn't more of a like, there's a main one, it's been kind of the same for a year, there's a fallback, but it's got cobwebs on it. It's more like which model and which prompt is changing weekly. And so I think it's quite, quite reasonable to to, to, to have a fallback that you can expect might work.[00:22:29] Is it too early to standardize Tech stacks?[00:22:29] swyx: I'm curious because you guys have had experience working at both, , Elicit, which is a smaller operation and, and larger companies. A lot of companies are looking at this with a certain amount of trepidation as, as, , it's very chaotic. When you have, when you have , one engineering team that, that, knows everyone else's names and like, , they, they, they, they meet constantly in Slack and knows what's going on.[00:22:50] It's easier to, to sync on technology choices. When you have a hundred teams, all shipping AI products and all making their own independent tech choices. It can be, it can be very hard to control. 
One solution I'm hearing from like the sales forces of the worlds and Walmarts of the world is that they are creating their own AI gateway, right?[00:23:05] Internal AI gateway. This is the one model hub that controls all the things and has our standards. Is that a feasible thing? Is that something that you would want? Is that something you have and you're working towards? What are your thoughts on this stuff? Like, Centralization of control or like an AI platform internally.[00:23:22] James Brady: Certainly for larger organizations and organizations that are doing things which maybe are running into HIPAA compliance or other, um, legislative tools like that. It could make a lot of sense. Yeah. I think for the TLDR for something like Elicit is we are small enough, as you indicated, and need to have full control over all the levers available and switch between different models and different prompts and whatnot, as Adam was just saying, that that kind of thing wouldn't work for us.[00:23:52] But yeah, I've spoken with and, um, advised a couple of companies that are trying to sell into that kind of a space or at a larger stage, and it does seem to make a lot of sense for them. So, for example, if you're trying to sell If you're looking to sell to a large enterprise and they cannot have any data leaving the EU, then you need to be really careful about someone just accidentally putting in, , the sort of US East 1 GPT 4 endpoints or something like this.[00:24:22] I'd be interested in understanding better what the specific problem is that they're looking to solve with that, whether it is to do with data security or centralization of billing, or if they have a kind of Suite of prompts or something like this that people can choose from so they don't need to reinvent the wheel again and again I wouldn't be able to say without understanding the problems and their proposed solutions , which kind of situations that be better or worse fit for but yeah for illicit where really the The secret sauce, if there is a secret sauce, is which models we're using, how we're using them, how we're combining them, how we're thinking about the user problem, how we're thinking about all these pieces coming together.[00:25:02] You really need to have all of the affordances available to you to be able to experiment with things and iterate rapidly. And generally speaking, whenever you put these kind of layers of abstraction and control and generalization in there, that, that gets in the way. So, so for us, it would not work.[00:25:19] Adam Wiggins: Do you feel like there's always a tendency to want to reach for standardization and abstractions pretty early in a new technology cycle?[00:25:26] There's something comforting there, or you feel like you can see them, or whatever. I feel like there's some of that discussion around lang chain right now. But yeah, this is not only so early, but also moving so fast. , I think it's . I think it's tough to, to ask for that. 
That's, that's not the, that's not the space we're in, but the, yeah, the larger an organization, the more that's your, your default is to, to, to want to reach for that.[00:25:48] It, it, it's a sort of comfort.[00:25:51] swyx: Yeah, I find it interesting that you would say that , being a founder of Heroku where , you were one of the first platforms as a service that more or less standardized what, , that sort of early developer experience should have looked like.[00:26:04] And I think basically people are feeling the differences between calling various model lab APIs and having an actual AI platform where. , all, all their development needs are thought of for them. , it's, it's very much, and, and I, I defined this in my AI engineer post as well.[00:26:19] Like the model labs just see their job ending at serving models and that's about it. But actually the responsibility of the AI engineer has to fill in a lot of the gaps beyond that. So.[00:26:31] Adam Wiggins: Yeah, that's true. I think, , a huge part of the exercise with Heroku, which It was largely inspired by Rails, which itself was one of the first frameworks to standardize the SQL database.[00:26:42] And people had been building apps like that for many, many years. I had built many apps. I had made my own templates based on that. I think others had done it. And Rails came along at the right moment. We had been doing it long enough that you see the patterns and then you can say look let's let's extract those into a framework that's going to make it not only easier to build for the experts but for people who are relatively new the best practices are encoded into you.[00:27:07] That framework, , Model View Controller, to take one example. But then, yeah, once you see that, and once you experience the power of a framework, and again, it's so comforting, and you can develop faster, and it's easier to onboard new people to it because you have these standards. And this consistency, then folks want that for something new that's evolving.[00:27:29] Now here I'm thinking maybe if you fast forward a little to, for example, when React came on the on the scene, , a decade ago or whatever. And then, okay, we need to do state management. What's that? And then there's, , there's a new library every six months. Okay, this is the one, this is the gold standard.[00:27:42] And then, , six months later, that's deprecated. Because of course, it's evolving, you need to figure it out, like the tacit knowledge and the experience of putting it in practice and seeing what those real What those real needs are are, are critical, and so it's, it is really about finding the right time to say yes, we can generalize, we can make standards and abstractions, whether it's for a company, whether it's for, , a library, an open source library, for a whole class of apps and it, it's very much a, much more of a A judgment call slash just a sense of taste or , experience to be able to say, Yeah, we're at the right point.[00:28:16] We can standardize this. But it's at least my, my very, again, and I'm so new to that, this world compared to you both, but my, my sense is, yeah, still the wild west. That's what makes it so exciting and feels kind of too early for too much. too much in the way of standardized abstractions. 
Not that it's not interesting to try, but , you can't necessarily get there in the same way Rails did until you've got that decade of experience of whatever building different classes of apps in that, with that technology.[00:28:45] James Brady: Yeah, it's, it's interesting to think about what is going to stay more static and what is expected to change over the coming five years, let's say. Which seems like when I think about it through an ML lens, it's an incredibly long time. And if you just said five years, it doesn't seem, doesn't seem that long.[00:29:01] I think that, that kind of talks to part of the problem here is that things that are moving are moving incredibly quickly. I would expect, this is my, my hot take rather than some kind of official carefully thought out position, but my hot take would be something like the You can, you'll be able to get to good quality apps without doing really careful prompt engineering.[00:29:21] I don't think that prompt engineering is going to be a kind of durable differential skill that people will, will hold. I do think that, The way that you set up the ML problem to kind of ask the right questions, if you see what I mean, rather than the specific phrasing of exactly how you're doing chain of thought or few shot or something in the prompt I think the way that you set it up is, is probably going to be remain to be trickier for longer.[00:29:47] And I think some of the operational challenges that we've been talking about of wild variations in, in, in latency, And handling the, I mean, one way to think about these models is the first lesson that you learn when, when you're an engineer, software engineer, is that you need to sanitize user input, right?[00:30:05] It was, I think it was the top OWASP security threat for a while. Like you, you have to sanitize and validate user input. And we got used to that. And it kind of feels like this is the, The shell around the app and then everything else inside you're kind of in control of and you can grasp and you can debug, etc.[00:30:22] And what we've effectively done is, through some kind of weird rearguard action, we've now got these slightly chaotic things. I think of them more as complex adaptive systems, which , related but a bit different. Definitely have some of the same dynamics. We've, we've injected these into the foundations of the, of the app and you kind of now need to think with this defined defensive mindset downwards as well as upwards if you, if you see what I mean.[00:30:46] So I think it would gonna, it's, I think it will take a while for us to truly wrap our heads around that. And also these kinds of problems where you have to handle things being unreliable and slow sometimes and whatever else, even if it doesn't happen very often, there isn't some kind of industry wide accepted way of handling that at massive scale.[00:31:10] There are definitely patterns and anti patterns and tools and whatnot, but it's not like this is a solved problem. So I would expect that it's not going to go down easily as a, as a solvable problem at the ML scale either.[00:31:23] swyx: Yeah, excellent. I would describe in, in the terminology of the stuff that I've written in the past, I describe this inversion of architecture as sort of LLM at the core versus LLM or code at the core.[00:31:34] We're very used to code at the core. Actually, we can scale that very well. 
When we build LLM core apps, we have to realize that the, the central part of our app that's orchestrating things is actually prompt, prone to, , prompt injections and non determinism and all that, all that good stuff.[00:31:48] I, I did want to move the conversation a little bit from the sort of defensive side of things to the more offensive or, , the fun side of things, capabilities side of things, because that is the other part. of the job description that we kind of skimmed over. So I'll, I'll repeat what you said earlier.[00:32:02] Capabilities: Offensive AI Engineering[00:32:02] swyx: It's, you want people to have a genuine curiosity and enthusiasm for the capabilities of language models. We just, we're recording this the day after Anthropic just dropped Cloud 3. 5. And I was wondering, , maybe this is a good, good exercise is how do people have Curiosity and enthusiasm for capabilities language models when for example the research paper for cloud 3.[00:32:22] 5 is four pages[00:32:23] James Brady: Maybe that's not a bad thing actually in this particular case So yeah If you really want to know exactly how the sausage was made That hasn't been possible for a few years now in fact for for these new models but from our perspective as when we're building illicit What we primarily care about is what can these models do?[00:32:41] How do they perform on the tasks that we already have set up and the evaluations we have in mind? And then on a slightly more expansive note, what kinds of new capabilities do they seem to have? Can we elicit, no pun intended, from the models? For example, well, there's, there's very obvious ones like multimodality , there wasn't that and then there was that, or it could be something a bit more subtle, like it seems to be getting better at reasoning, or it seems to be getting better at metacognition, or Or it seems to be getting better at marking its own work and giving calibrated confidence estimates, things like this.[00:33:19] So yeah, there's, there's plenty to be excited about there. It's just that yeah, there's rightly or wrongly been this, this, this shift over the last few years to not give all the details. So no, but from application development perspective we, every time there's a new model release, there's a flow of activity in our Slack, and we try to figure out what's going on.[00:33:38] What it can do, what it can't do, run our evaluation frameworks, and yeah, it's always an exciting, happy day.[00:33:44] Adam Wiggins: Yeah, from my perspective, what I'm seeing from the folks on the team is, first of all, just awareness of the new stuff that's coming out, so that's, , an enthusiasm for the space and following along, and then being able to very quickly, partially that's having Slack to do this, but be able to quickly map that to, okay, What does this do for our specific case?[00:34:07] And that, the simple version of that is, let's run the evaluation framework, which Lissa has quite a comprehensive one. I'm actually working on an article on that right now, which I'm very excited about, because it's a very interesting world of things. But basically, you can just try, not just, but try the new model in the evaluations framework.[00:34:27] Run it. It has a whole slew of benchmarks, which includes not just Accuracy and confidence, but also things like performance, cost, and so on. And all of these things may trade off against each other. 
Maybe it's actually, it's very slightly worse, but it's way faster and way cheaper, so actually this might be a net win, for example.[00:34:46] Or, it's way more accurate. But that comes at its slower and higher cost, and so now you need to think about those trade offs. And so to me, coming back to the qualities of an AI engineer, especially when you're trying to hire for them, It's this, it's, it is very much an application developer in the sense of a product mindset of What are our users or our customers trying to do?[00:35:08] What problem do they need solved? Or what what does our product solve for them? And how does the capabilities of a particular model potentially solve that better for them than what exists today? And by the way, what exists today is becoming an increasingly gigantic cornucopia of things, right? And so, You say, okay, this new model has these capabilities, therefore, , the simple version of that is plug it into our existing evaluations and just look at that and see if it, it seems like it's better for a straight out swap out, but when you talk about, for example, you have multimodal capabilities, and then you say, okay, wait a minute, actually, maybe there's a new feature or a whole new There's a whole bunch of ways we could be using it, not just a simple model swap out, but actually a different thing we could do that we couldn't do before that would have been too slow, or too inaccurate, or something like that, that now we do have the capability to do.[00:35:58] I think of that as being a great thing. I don't even know if I want to call it a skill, maybe it's even like an attitude or a perspective, which is a desire to both be excited about the new technology, , the new models and things as they come along, but also holding in the mind, what does our product do?[00:36:16] Who is our user? And how can we connect the capabilities of this technology to how we're helping people in whatever it is our product does?[00:36:25] James Brady: Yeah, I'm just looking at one of our internal Slack channels where we talk about things like new new model releases and that kind of thing And it is notable looking through these the kind of things that people are excited about and not It's, I don't know the context, the context window is much larger, or it's, look at how many parameters it has, or something like this.[00:36:44] It's always framed in terms of maybe this could be applied to that kind of part of Elicit, or maybe this would open up this new possibility for Elicit. And, as Adam was saying, yeah, I don't think it's really a I don't think it's a novel or separate skill, it's the kind of attitude I would like to have all engineers to have at a company our stage, actually.[00:37:05] And maybe more generally, even, which is not just kind of getting nerd sniped by some kind of technology number, fancy metric or something, but how is this actually going to be applicable to the thing Which matters in the end. How is this going to help users? How is this going to help move things forward strategically?[00:37:23] That kind of, that kind of thing.[00:37:24] AI Engineering Required Knowledge[00:37:24] swyx: Yeah, applying what , I think, is, is, is the key here. Getting hands on as well. I would, I would recommend a few resources for people listening along. The first is Elicit's ML reading list, which I, I found so delightful after talking with Andreas about it.[00:37:38] It looks like that's part of your onboarding. 
We've actually set up an asynchronous paper club instead of my discord for people following on that reading list. I love that you separate things out into tier one and two and three, and that gives people a factored cognition way of Looking into the, the, the corpus, right?[00:37:55] Like yes, the, the corpus of things to know is growing and the water is slowly rising as far as what a bar for a competent AI engineer is. But I think, , having some structured thought as to what are the big ones that everyone must know I think is, is, is key. It's something I, I haven't really defined for people and I'm, I'm glad that this is actually has something out there that people can refer to.[00:38:15] Yeah, I wouldn't necessarily like make it required for like the job. Interview maybe, but , it'd be interesting to see like, what would be a red flag. If some AI engineer would not know, I don't know what, , I don't know where we would stoop to, to call something required knowledge, , or you're not part of the cool kids club.[00:38:33] But there increasingly is something like that, right? Like, not knowing what context is, is a black mark, in my opinion, right?[00:38:40] I think it, I think it does connect back to what we were saying before of this genuine Curiosity about and that. Well, maybe it's, maybe it's actually that combined with something else, which is really important, which is a self starting bias towards action, kind of a mindset, which again, everybody needs.[00:38:56] Exactly. Yeah. Everyone needs that. So if you put those two together, or if I'm truly curious about this and I'm going to kind of figure out how to make things happen, then you end up with people. Reading, reading lists, reading papers, doing side projects, this kind of, this kind of thing. So it isn't something that we explicitly included.[00:39:14] We don't have a, we don't have an ML focused interview for the AI engineer role at all, actually. It doesn't really seem helpful. The skills which we are checking for, as I mentioned before, this kind of fault first mindset. And conventional software engineering kind of thing. It's, it's 0. 1 and 0.[00:39:32] 3 on the list that, that we talked about. In terms of checking for ML curiosity and there are, how familiar they are with these concepts. That's more through talking interviews and culture fit types of things. We want for them to have a take on what Elisa is doing. doing, certainly as they progress through the interview process.[00:39:50] They don't need to be completely up to date on everything we've ever done on day zero. Although, , that's always nice when it happens. But for them to really engage with it, ask interesting questions, and be kind of bought into our view on how we want ML to proceed. I think that is really important, and that would reveal that they have this kind of this interest, this ML curiosity.[00:40:13] ML First Mindset[00:40:13] swyx: There's a second aspect to that. I don't know if now's the right time to talk about it, which is, I do think that an ML first approach to building software is something of a different mindset. I could, I could describe that a bit now if that, if that seems good, but yeah, I'm a team. Okay. So yeah, I think when I joined Elicit, this was the biggest adjustment that I had to make personally.[00:40:37] So as I said before, I'd been, Effectively building conventional software stuff for 15 years or so, something like this, well, for longer actually, but professionally for like 15 years. 
And had a lot of pattern matching built into my brain and kind of muscle memory for if you see this kind of problem, then you do that kind of a thing.[00:40:56] And I had to unlearn quite a lot of that when joining Elicit because we truly are ML first and try to use ML to the fullest. And some of the things that that means is, This relinquishing of control almost, at some point you are calling into this fairly opaque black box thing and hoping it does the right thing and dealing with the stuff that it sends back to you.[00:41:17] And that's very different if you're interacting with, again, APIs and databases, that kind of a, that kind of a thing. You can't just keep on debugging. At some point you hit this, this obscure wall. And I think the second, the second part to this is the pattern I was used to is that. The external parts of the app are where most of the messiness is, not necessarily in terms of code, but in terms of degrees of freedom, almost.[00:41:44] If the user can and will do anything at any point, and they'll put all sorts of wonky stuff inside of text inputs, and they'll click buttons you didn't expect them to click, and all this kind of thing. But then by the time you're down into your SQL queries, for example, as long as you've done your input validation, things are pretty pretty well defined.[00:42:01] And that, as we said before, is not really the case. When you're working with language models, there is this kind of intrinsic uncertainty when you get down to the, to the kernel, down to the core. Even, even beyond that, there's all that stuff is somewhat defensive and these are things to be wary of to some degree.[00:42:18] Though the flip side of that, the really kind of positive part of taking an ML first mindset when you're building applications is that you, If you, once you get comfortable taking your hands off the wheel at a certain point and relinquishing control, letting go then really kind of unexpected powerful things can happen if you lean on the, if you lean on the capabilities of the model without trying to overly constrain and slice and dice problems with to the point where you're not really wringing out the most capability from the model that you, that you might.[00:42:47] So, I was trying to think of examples of this earlier, and one that came to mind was we were working really early when just after I joined Elicit, we were working on something where we wanted to generate text and include citations embedded within it. So it'd have a claim, and then a, , square brackets, one, in superscript, something, something like this.[00:43:07] And. Every fiber in my, in my, in my being was screaming that we should have some way of kind of forcing this to happen or Structured output such that we could guarantee that this citation was always going to be present later on that the kind of the indication of a footnote would actually match up with the footnote itself and Kind of went into this symbolic.[00:43:28] I need full control kind of kind of mindset and it was notable that Andreas Who's our CEO, again, has been on the podcast, was was the opposite. He was just kind of, give it a couple of examples and it'll probably be fine. And then we can kind of figure out with a regular expression at the end. And it really did not sit well with me, to be honest.[00:43:46] I was like, but it could say anything. I could say, it could literally say anything. And I don't know about just using a regex to sort of handle this. This is a potent feature of the app. 
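As a rough illustration of the "regex at the end" approach James describes, here is a small Python sketch. It assumes the model was shown examples that mark citations as bracketed numbers like [1]; the sample text is made up, and production code would validate that every marker maps to a real footnote.

```python
import re

CITATION = re.compile(r"\[(\d+)\]")


def extract_citations(text: str) -> list[tuple[str, list[int]]]:
    """Split model output into sentences and collect the [n] markers in each."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        markers = [int(m) for m in CITATION.findall(sentence)]
        results.append((sentence, markers))
    return results


# Example: one claim with two citations, one with none.
sample = "Sleep improves recall [1][3]. Effect sizes vary across studies."
for claim, refs in extract_citations(sample):
    print(refs, "-", claim)
```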
But , this is that was my first kind of, , The starkest introduction to this ML first mindset, I suppose, which Andreas has been cultivating for much longer than me, much longer than most, of yeah, there might be some surprises of stuff you get back from the model, but you can also It's about finding the sweet spot, I suppose, where you don't want to give a completely open ended prompt to the model and expect it to do exactly the right thing.[00:44:25] You can ask it too much and it gets confused and starts repeating itself or goes around in loops or just goes off in a random direction or something like this. But you can also over constrain the model. And not really make the most of the, of the capabilities. And I think that is a mindset adjustment that most people who are coming into AI engineering afresh would need to make of yeah, giving up control and expecting that there's going to be a little bit of kind of extra pain and defensive stuff on the tail end, but the benefits that you get as a, as a result are really striking.[00:44:58] The ML first mindset, I think, is something that I struggle with as well, because the errors, when they do happen, are bad. , they will hallucinate, and your systems will not catch it sometimes if you don't have large enough of a sample set.[00:45:13] AI Engineers and Creativity[00:45:13] swyx: I'll leave it open to you, Adam. What else do you think about when you think about curiosity and exploring capabilities?[00:45:22] Do people are there reliable ways to get people to push themselves? for joining us on Capabilities, because I think a lot of times we have this implicit overconfidence, maybe, of we think we know what it is, what a thing is, when actually we don't, and we need to keep a more open mind, and I think you do a particularly good job of Always having an open mind, and I want to get that out of more engineers that I talk to, but I, I, I, I struggle sometimes.[00:45:45] Adam Wiggins: I suppose being an engineer is, at its heart, this sort of contradiction of, on one hand, yeah,

The Cybersecurity Defenders Podcast
#132 - API security with Jeremy Snyder, Founder and CEO at FireTail.io

The Cybersecurity Defenders Podcast

Play Episode Listen Later Jun 12, 2024 35:50


On this episode of The Cybersecurity Defenders Podcast, we talk API security with Jeremy Snyder, Founder and CEO at FireTail.io.

FireTail.io is a pioneering company specializing in end-to-end API security. With APIs being the number one attack surface and a significant threat to data privacy and security, Jeremy and his team are at the forefront of protecting sensitive information in an increasingly interconnected world.

Jeremy brings a wealth of experience in cloud, cybersecurity, and data domains, coupled with a strong background in M&A, international business, business development, strategy, and operations. Fluent in five languages and having lived in five different countries, he offers a unique global perspective on cybersecurity challenges and innovations.

FireTail.io's data breach tracker.
vacuum - The world's fastest OpenAPI & Swagger linter.
Nuclei - Fast and customisable vulnerability scanner based on a simple YAML-based DSL.

PROFIT With A Plan
EP258: Streamline Your Business with Cutting-Edge B2B Payment Solutions with David Ademola

PROFIT With A Plan

Play Episode Listen Later Jun 11, 2024 36:09 Transcription Available


Streamline Your Business with Cutting-Edge B2B Payment Solutions

Introduction
Is cash flow a constant headache for your B2B or B2G business? Are inefficiencies draining your resources and profits? You're not alone, and we have just the solution. In this episode, we're diving deep into B2B payment solutions, exploring strategies to streamline operations, save money, and enhance payment security.

Host: Marcia Riner, Business Growth Strategist and CEO of Infinite Profit®
Guest: David Ademola, B2B Payment Expert at PayNation™️

Key Talking Points

Removing Manual Processes
Automation in payment systems reduces errors and saves time.
Integrating accounting software like QuickBooks or Zoho can streamline data entry and processing.
Open API integrations simplify connections between existing platforms.

Accelerating Cash Flow
Using tailored payment solutions to cut down on delays and improve cash flow.
Understanding and utilizing Level 2 and Level 3 optimization to significantly reduce processing fees.
Automating recurring billing for consistent cash flow.

Increasing Payment Acceptance Security
Leveraging advanced data entry requirements for more secure transactions.
Using virtual cards to control and monitor spending.
Implementing robust fraud prevention measures to protect financial data.

Improving Efficiency and Reducing Costs
Multi-platform integrations (QuickBooks, Zoho, CRMs, and ERPs) to enhance operational efficiency.
Reducing steps in payment processing from multiple entries to one or two steps.
Enhancing inventory management through integrated systems to avoid overstocking and understocking.

Exploring Payment Options
Beyond traditional credit card payments: ACH, e-checks, and virtual cards.
Offering flexible payment terms for B2G clients, such as 30, 60, or even 90-day terms.
Managing subscriptions and memberships efficiently with updated payment systems.

Navigating Fees and Margins
Strategies for incorporating transaction fees into product pricing.
Educating on the difference between surcharges and cash discounts.
Understanding the impact of transaction fees on the bottom line and how to mitigate them (a small worked example of the fee math appears at the end of these notes).

Conclusion
Streamlining your payment processes can revolutionize your business operations and improve your financial health. Tune in to this episode to learn actionable strategies from David Ademola, a seasoned Fintech expert, and take your business to the next level.

Connect with David Ademola
For more information on tailored B2B payment solutions and a free analysis of your payment processing systems, visit longustreferralpaynation.com and reach out for a free consultation to optimize your payment processes.

Engage with Us
We'd love to hear your questions and comments! When was the last time you reviewed your payment processing? Drop us a message, and we'll get back to you with personalized advice. Don't forget to subscribe to Profit with a Plan on your favorite podcast platform and stay tuned for more insightful episodes every Tuesday.

Want to supercharge your business and avoid profit plateaus, operational headaches, and growth roadblocks? Marcia has created a brand-new Profit Booster® Playbook just for you. You'll uncover 3 essential strategies and the quick way to take action on them. This is not just a single-page report; it's filled with impactful strategies, actionable steps, and expert guidance to elevate your profits painlessly. Make this your best year ever. Download this free playbook at www.BoostingProfit.com

About Marcia Riner
She is a business growth strategist who helps business owners dramatically increase their revenue, profit, and the value of their company. In fact, she can show prospective clients a clear pathway to profit and an impactful ROI for working here before hiring her firm. Through her proven Profit Booster® strategies, she gets results. Marcia is the CEO of Infinite Profit® and more information can be found at https://www.InfiniteProfitConsulting.com Got questions?  Reach out to Marcia and her team at (949) 229-2112 ♾️  
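To make the fee discussion above concrete, here is a small, purely illustrative Python example. The rates are assumptions chosen for the arithmetic, not PayNation's actual pricing; actual interchange rates depend on card type, data level, and processor.

```python
def processing_cost(amount: float, percent_fee: float, fixed_fee: float = 0.30) -> float:
    """Cost of accepting one card payment at a given percentage plus fixed fee."""
    return amount * percent_fee + fixed_fee


invoice = 10_000.00          # one B2B invoice
standard_rate = 0.029        # assumed standard card rate (2.9%)
level3_rate = 0.019          # assumed rate when Level 2/3 data is supplied (1.9%)

standard_cost = processing_cost(invoice, standard_rate)
level3_cost = processing_cost(invoice, level3_rate)

print(f"Standard card fee:   ${standard_cost:,.2f}")
print(f"With Level 2/3 data: ${level3_cost:,.2f}")
print(f"Savings per invoice: ${standard_cost - level3_cost:,.2f}")
# With these assumed rates, a one-point reduction saves $100 on every $10,000 invoice.
```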

Cup o' Go
A quick tour of some proposals, and a long chat about OpenAPI with Jamie Tanna

Cup o' Go

Play Episode Listen Later May 10, 2024 63:59


Go 1.22.3 & 1.22.10 releasedProposalsAccepted: add binary.Append functionLikely accept: new `go telemetry` subcommandLikely decline: Notify about new major versions of dependenciesPackt book bundleInterview with Jamie TannaBlog: Creating a more sustainable model for `oapi-codegen` in the futureBlog: oapi-codegen is moving to its own orgon GitHub: github.com/deepmap/oapi-codegen

Sustain
Episode 231: OSCA 2023 with Velda Kiara on her Open Source Journey

Sustain

Play Episode Listen Later May 3, 2024 19:48


Guest
Velda Kiara

Panelist
Richard Littauer

Show Notes

Today, host Richard has a conversation with guest Velda Kiara, a passionate open source developer. Velda discusses how open source has helped businesses, how it benefits both coders and non-coders, and how it can lead to career growth. She also talks about the challenges of open source, particularly in terms of finances and the sustainability of projects. The discussion also turns to Velda's attendance at OSCA fest in Lagos, Nigeria, and her involvement with Black Python Devs. Velda shares her personal journey of contributing to Django and other Python projects and tells us about her experience joining programs like Djangonaut Space and contributing to projects like Novu. Press download now to hear more!

[00:00:10] The episode opens with Velda highlighting the ins and outs of open source, acknowledging that it allows for the use of software that businesses can monetize. She appreciates the good that comes from open source despite the criticism of some corporations. She acknowledges the pros and cons of open source, expressing hope that the pros will eventually outweigh the cons.

[00:02:21] Richard introduces Velda, praises her answer, and asks if she'd like to change her initial statement. Velda stands by her answer, expressing willingness to continue the discussion for further insights on open source.

[00:03:31] Velda confirms her attendance at OSCA fest, mentioning her talk on building APIs with Django, DRF, and Open API, and discusses the importance of sustainability in growing the open source community in Africa.

[00:04:34] Richard inquires about Velda's involvement with Black Python Devs, and she explains the inception, its objectives, and activities like workshops and meetups that support the community.

[00:07:12] The conversation shifts to encouraging newcomers to join open source, emphasizing roles beyond coding, such as project management and writing.

[00:09:08] Richard and Velda discuss the challenges designers face in open source and the potential career benefits of contributing to open source, even for non-developers. Velda shares how open source helped her gain experience and improve skills, which is beneficial at any career level, and she discusses the “level up” aspect of open source and the learning opportunities it provides.

[00:12:00] Richard explores the sustainability of open source for late-stage careers and the challenges maintainers face. Velda suggests using open source for mentorship and ensuring project continuity by engaging contributors and sharing maintenance responsibilities.

[00:14:02] What currently excites Velda about open source? She expresses her excitement about contributing to Django after building many websites with it and her positive experience at DjangoCon US, which she found to be an inclusive community. Also, she discusses Djangonaut Space, an eight-week program designed to assist new contributors like her in contributing to the Django framework or third-party packages.

[00:16:28] Velda mentions her contributions to other Python projects, such as Novu, and her new experiences working with SDKs.
She reflects on the learning process in open source and shares her excitement for exploring various Python projects and talks about how she started a newsletter called, “The Storytellers by Tales.” Quotes [00:12:36] “If you eventually want to not let the project die, you could easily use open source as a way to mentor another person who's going to help you maintain for a while if you want to retire or stop writing code in general.” Links SustainOSS (https://sustainoss.org/) SustainOSS Twitter (https://twitter.com/SustainOSS?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor) SustainOSS Discourse (https://discourse.sustainoss.org/) podcast@sustainoss.org (mailto:podcast@sustainoss.org) SustainOSS Mastodon (https://mastodon.social/tags/sustainoss) Open Collective-SustainOSS (Contribute) (https://opencollective.com/sustainoss) Richard Littauer Socials (https://www.burntfen.com/2023-05-30/socials) Velda Kiara X/Twitter (https://twitter.com/VeldaKiara) Velda Kiara LinkedIn (https://www.linkedin.com/in/veldakiara/) Velda Kiara Website (https://veldakiara.notion.site/veldakiara/Velda-Kiara-46aec24028fd4e8dbdba003097c18b5b) Black Python Devs (https://blackpythondevs.github.io/) KJay Miller (https://kjaymiller.com/) Djangonaut Space (https://djangonaut.space/) Novu (https://github.com/novuhq/novu) Sustain Podcast-Episode 169: Dawn Wages of PSF on organizing communities, ethical licenses, and more (https://podcast.sustainoss.org/169) Credits Produced by Richard Littauer (https://www.burntfen.com/) Edited by Paul M. Bahr at Peachtree Sound (https://www.peachtreesound.com/) Show notes by DeAnn Bahr Peachtree Sound (https://www.peachtreesound.com/) Special Guest: Velda Kiara.

Hacker Public Radio
HPR4099: Introducing Home Automation and Home Assistant

Hacker Public Radio

Play Episode Listen Later Apr 18, 2024


Home Automation, the Internet of Things. This is the first episode in a new series called Home Automation. The series is open to anyone and I encourage everyone to contribute. https://en.wikipedia.org/wiki/Home_automation From Wikipedia, the free encyclopedia: Home automation or domotics is building automation for a home. A home automation system will monitor and/or control home attributes such as lighting, climate, entertainment systems, and appliances. It may also include home security such as access control and alarm systems. The phrase smart home refers to home automation devices that have internet access. Home automation, a broader category, includes any device that can be monitored or controlled via wireless radio signals, not just those having internet access. When connected with the Internet, home sensors and activation devices are an important constituent of the Internet of Things ("IoT"). A home automation system typically connects controlled devices to a central smart home hub (sometimes called a "gateway"). The user interface for control of the system uses either wall-mounted terminals, tablet or desktop computers, a mobile phone application, or a Web interface that may also be accessible off-site through the Internet. Now is the time. I tried this out a few years ago, but after a lot of frustration with the configuration of ESP32 Arduinos and Raspberry Pis, I left it be. Recently, inspired by colleagues at work, I decided to get back into it, and my initial tests show that the scene has improved a lot over the years. YouTube playlists: The Hook Up (RSS); Home Automation Guy (RSS); Everything Smart Home (RSS); Smart Solutions for Home (RSS); Smart Home Circle (RSS); Smart Home Junkie (RSS). Home Assistant. The first thing we'll need is something to control it all, something that will allow us to control our homes without requiring the cloud. https://en.wikipedia.org/wiki/Home_Assistant From Wikipedia, the free encyclopedia: Home Assistant is free and open-source software for home automation, designed to be an Internet of Things (IoT) ecosystem-independent integration platform and central control system for smart home devices, with a focus on local control and privacy. It can be accessed through a web-based user interface, by using companion apps for Android and iOS, or by voice commands via a supported virtual assistant, such as Google Assistant or Amazon Alexa, and their own "Assist" (built-in local voice assistant). The Home Assistant software application is installed as a computer appliance. After installation, it will act as a central control system for home automation (commonly called a smart home hub) that has the purpose of controlling IoT connectivity technology devices, software, applications and services from third parties via modular integration components, including native integration components for common wireless communication protocols such as Bluetooth, Thread, Zigbee, and Z-Wave (used to create local personal area networks with small low-power digital radios). Home Assistant as such supports controlling devices and services connected via either open or proprietary ecosystems, as long as they provide public access via some kind of open API or MQTT for third-party integrations over the local area network or the Internet. Information from all devices and their attributes (entities) that the application sees can be used and controlled from within automations and scripts, using scheduling and "blueprint" subroutines, e.g.
for controlling lighting, climate, entertainment systems and home appliances. Summary: Original author(s): Paulus Schoutsen. Developer(s): Home Assistant Core Team and Community. Initial release: 17 September 2013. Repository: https://github.com/home-assistant. Written in: Python (Python 3.11). Operating system: software appliance / virtual appliance (Linux). Platform: ARM, ARM64, IA-32 (x86), and x64 (x86-64). Type: home automation, smart home technology, Internet of things, task automator. License: Apache License (free and open-source). Website: https://www.home-assistant.io The following is taken from the Concepts and terminology page on the Home Assistant website. It is reproduced here under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. Integrations: Integrations are pieces of software that allow Home Assistant to connect to other software and platforms. For example, a product by Philips called Hue would use the Philips Hue integration and allow Home Assistant to talk to the hardware controller Hue Bridge. Any Home Assistant compatible devices connected to the Hue Bridge would appear in Home Assistant as devices. For a full list of compatible integrations, refer to the integrations documentation. Once an integration has been added, the hardware and/or data are represented in Home Assistant as devices and entities. Entities: Entities are the basic building blocks that hold data in Home Assistant. An entity represents a sensor, actor, or function in Home Assistant. Entities are used to monitor physical properties or to control other entities. An entity is usually part of a device or a service. Entities have states. Devices: Devices are a logical grouping for one or more entities. A device may represent a physical device, which can have one or more sensors. The sensors appear as entities associated with the device. For example, a motion sensor is represented as a device. It may provide motion detection, temperature, and light levels as entities. Entities have states such as detected when motion is detected and clear when there is no motion. Devices and entities are used throughout Home Assistant. To name a few examples: dashboards can show the state of an entity, for example, whether a light is on or off; an automation can be triggered from a state change on an entity, for example, a motion sensor entity detects motion and triggers a light to turn on; a predefined color and brightness setting for a light can be saved as a scene. Areas: An area in Home Assistant is a logical grouping of devices and entities that are meant to match areas (or rooms) in the physical world: your home. For example, the living room area groups devices and entities in your living room. Areas allow you to target service calls at an entire group of devices, for example, turning off all the lights in the living room. Locations within your home such as living room, dance floor, etc. Areas can be assigned to floors. Areas can also be used for automatically generated cards, such as the Area card. Automations: A set of repeatable actions that can be set up to run automatically. Automations are made of three key components: triggers - events that start an automation, for example, when the sun sets or a motion sensor is activated; conditions - optional tests that must be met before an action can be run, for example, if someone is home; actions - interact with devices, such as turning on a light.
To learn the basics about automations, refer to the automation basics page or try creating an automation yourself. Scripts: Similar to automations, scripts are repeatable actions that can be run. The difference between scripts and automations is that scripts do not have triggers. This means that scripts cannot run automatically unless they are used in an automation. Scripts are particularly useful if you perform the same actions in different automations or trigger them from a dashboard. For information on how to create scripts, refer to the scripts documentation. Scenes: Scenes allow you to create predefined settings for your devices. Similar to a driving mode on phones, or driver profiles in cars, a scene can change an environment to suit you. For example, your "watching films" scene may dim the lighting, switch on the TV and increase its volume. This can be saved as a scene and used without having to set individual devices every time. To learn how to use scenes, refer to the scene documentation. Add-ons: Depending on your installation type, you can install third-party add-ons. Add-ons are usually standalone apps that could run alongside Home Assistant, but they provide a quick and easy way to install, configure, and run them within Home Assistant. Add-ons provide additional functionality, whereas integrations connect Home Assistant to other apps.
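As a concrete, hedged illustration of how a third-party script can drive those entities and services, here is a minimal Python sketch that talks to Home Assistant's local REST API: it reads an entity's state and then calls a service, the same kind of action an automation performs. The host name, token, and entity ID are placeholders for this example (not values from the episode), and it assumes the default API on port 8123 with a long-lived access token created from your user profile.

```python
import requests

# Placeholders -- point these at your own Home Assistant instance.
BASE_URL = "http://homeassistant.local:8123"   # assumed default host and port
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"         # created under your user profile

HEADERS = {
    "Authorization": f"Bearer {TOKEN}",
    "Content-Type": "application/json",
}

# Read the current state of an entity (hypothetical entity ID).
state = requests.get(
    f"{BASE_URL}/api/states/light.living_room", headers=HEADERS, timeout=10
)
print(state.json().get("state"))  # e.g. "on" or "off"

# Call a service -- the "action" building block used by automations and scripts.
resp = requests.post(
    f"{BASE_URL}/api/services/light/turn_on",
    headers=HEADERS,
    json={"entity_id": "light.living_room", "brightness_pct": 40},
    timeout=10,
)
resp.raise_for_status()
```

The same two endpoints (reading states, calling services) are what a dashboard or voice assistant uses under the hood, which is why local control does not require any cloud service.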

Dealer Talk With Jen Suzuki
What Is Salesforce Doing for Car Dealerships? Find out From a User's Perspective with Vicki Poponi!

Dealer Talk With Jen Suzuki

Play Episode Listen Later Apr 11, 2024 31:41


Meet Vicki Poponi, former C-suite executive at Honda, now automotive industry advisor with Salesforce! She shares her firsthand experience with Salesforce as a game-changer in the automotive industry. Discover how Salesforce, a global force for over 24 years, is transforming automotive dealerships with its innovative solutions. Vicki delves into the essence of Salesforce for dealers, highlighting its versatility and customization capabilities. Learn how Salesforce provides an out-of-the-box solution that covers 80% of dealership needs, allowing for seamless customization through their App Store. Explore how dealers can tailor their client systems without the hassle of complex integrations, thanks to Salesforce's open API that enables developers to create add-on features. Unravel the power of Salesforce's Auto Cloud tailored for automotive dealers, starting from the Marketing Cloud and expanding into a comprehensive platform beyond traditional CRM capabilities. Dive into the vision of Salesforce for better customer experiences, focusing on creating seamless transitions and enhancing customer interactions throughout the sales journey. Discover how Salesforce empowers businesses to build personalized systems that align perfectly with their unique requirements, without the need for extensive IT support. Explore the limitless possibilities of Salesforce's platform, designed to elevate sales operations and drive exceptional customer engagement in the competitive automotive landscape. Tune in to gain valuable insights from Vicki's perspective on leveraging Salesforce to revolutionize automotive sales processes and deliver unparalleled customer experiences that drive lasting success in the industry. Dealer Talk with Jen Suzuki Podcast |Jennifer@edealersolution.com | 800-625-1590 | edealersolutions.com
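Since the episode leans on the idea of Salesforce's open API, here is a rough, hedged Python sketch of what a dealership add-on might do with it: query recent contact records over the REST API. The instance URL, access token, API version, and SOQL query are illustrative placeholders, not details from the interview; a real integration would obtain the token through an OAuth flow.

```python
import requests

# Placeholder credentials -- a real add-on would obtain these via OAuth.
INSTANCE_URL = "https://example-dealership.my.salesforce.com"
ACCESS_TOKEN = "REPLACE_WITH_OAUTH_ACCESS_TOKEN"

headers = {"Authorization": f"Bearer {ACCESS_TOKEN}"}

# SOQL query for the ten most recently created contacts (illustrative only).
soql = (
    "SELECT Id, FirstName, LastName, Email "
    "FROM Contact ORDER BY CreatedDate DESC LIMIT 10"
)

resp = requests.get(
    f"{INSTANCE_URL}/services/data/v59.0/query",  # API version assumed; use your org's
    headers=headers,
    params={"q": soql},
    timeout=10,
)
resp.raise_for_status()

for record in resp.json()["records"]:
    print(record["Id"], record.get("FirstName"), record.get("LastName"), record.get("Email"))
```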

RadioDotNet
Brilliant Garnet, ecosystem problems, OpenAPI and OpenAI

RadioDotNet

Play Episode Listen Later Mar 31, 2024 103:35


RadioDotNet podcast, episode #90, April 1, 2024. Podcast site: radio.dotnet.ru Boosty (₽): boosty.to/RadioDotNet Topics: [00:01:09] — Microsoft Garnet microsoft.github.io/garnet github.com/microsoft/garnet t.me/epeshkblog/154 [00:12:39] — Heap data structure and .NET priority queue andrewlock.net/an-introduction-to-the-heap-data-struc... andrewlock.net/behind-the-implementation-of-dotnets-p... andrewlock.net/implementing-dijkstras-algorithm-for-f... [00:21:59] — Tales from the .NET Migration Trenches (Part 2) jimmybogard.com/tales-from-the-net-migration-trenches-... jimmybogard.com/tales-from-the-net-migration-trenches-... jimmybogard.com/tales-from-the-net-migration-trenches-... jimmybogard.com/tales-from-the-net-migration-trenches-... jimmybogard.com/tales-from-the-net-migration-trenches-... [00:41:45] — .NET Developers Begging for Ecosystem Destruction aaronstannard.com/dotnet-eventing-backslide [01:04:01] — Generate OpenAPI specification at build time meziantou.net/generate-openapi-specification-at-buil... github.com/dotnet/aspnetcore/issues/54598 github.com/dotnet/aspnetcore/issues/54599 [01:20:24] — .NET Task Parallel Library vs System.Threading.Channels chrlschn.dev/blog/dotnet-task-parallel-library-vs-s... [01:29:43] — Introducing .NET Smart Components – AI-powered UI controls devblogs.microsoft.com/dotnet/introducing-dotnet-smart-compon... [01:41:42] — Briefly on various topics devblogs.microsoft.com/dotnet/dotnet-7-end-of-support Background music: Maxim Arshinov «Pensive yeti.0.1»

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Top 5 Research Trends + OpenAI Sora, Google Gemini, Groq Math (Jan-Feb 2024 Audio Recap) + Latent Space Anniversary with Lindy.ai, RWKV, Pixee, Julius.ai, Listener Q&A!

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Mar 9, 2024 108:52


We will be recording a preview of the AI Engineer World's Fair soon with swyx and Ben Dunphy, send any questions about Speaker CFPs and Sponsor Guides you have!Alessio is now hiring engineers for a new startup he is incubating at Decibel: Ideal candidate is an ex-technical co-founder type (can MVP products end to end, comfortable with ambiguous prod requirements, etc). Reach out to him for more!Thanks for all the love on the Four Wars episode! We're excited to develop this new “swyx & Alessio rapid-fire thru a bunch of things” format with you, and feedback is welcome. Jan 2024 RecapThe first half of this monthly audio recap pod goes over our highlights from the Jan Recap, which is mainly focused on notable research trends we saw in Jan 2024:Feb 2024 RecapThe second half catches you up on everything that was topical in Feb, including:* OpenAI Sora - does it have a world model? Yann LeCun vs Jim Fan * Google Gemini Pro 1.5 - 1m Long Context, Video Understanding* Groq offering Mixtral at 500 tok/s at $0.27 per million toks (swyx vs dylan math)* The {Gemini | Meta | Copilot} Alignment Crisis (Sydney is back!)* Grimes' poetic take: Art for no one, by no one* F*** you, show me the promptLatent Space AnniversaryPlease also read Alessio's longform reflections on One Year of Latent Space!We launched the podcast 1 year ago with Logan from OpenAI:and also held an incredible demo day that got covered in The Information:Over 750k downloads later, having established ourselves as the top AI Engineering podcast, reaching #10 in the US Tech podcast charts, and crossing 1 million unique readers on Substack, for our first anniversary we held Latent Space Final Frontiers, where 10 handpicked teams, including Lindy.ai and Julius.ai, competed for prizes judged by technical AI leaders from (former guest!) LlamaIndex, Replit, GitHub, AMD, Meta, and Lemurian Labs.The winners were Pixee and RWKV (that's Eugene from our pod!):And finally, your cohosts got cake!We also captured spot interviews with 4 listeners who kindly shared their experience of Latent Space, everywhere from Hungary to Australia to China:* Balázs Némethi* Sylvia Tong* RJ Honicky* Jan ZhengOur birthday wishes for the super loyal fans reading this - tag @latentspacepod on a Tweet or comment on a @LatentSpaceTV video telling us what you liked or learned from a pod that stays with you to this day, and share us with a friend!As always, feedback is welcome. Timestamps* [00:03:02] Top Five LLM Directions* [00:03:33] Direction 1: Long Inference (Planning, Search, AlphaGeometry, Flow Engineering)* [00:11:42] Direction 2: Synthetic Data (WRAP, SPIN)* [00:17:20] Wildcard: Multi-Epoch Training (OLMo, Datablations)* [00:19:43] Direction 3: Alt. Architectures (Mamba, RWKV, RingAttention, Diffusion Transformers)* [00:23:33] Wildcards: Text Diffusion, RALM/Retro* [00:25:00] Direction 4: Mixture of Experts (DeepSeekMoE, Samba-1)* [00:28:26] Wildcard: Model Merging (mergekit)* [00:29:51] Direction 5: Online LLMs (Gemini Pro, Exa)* [00:33:18] OpenAI Sora and why everyone underestimated videogen* [00:36:18] Does Sora have a World Model? 
Yann LeCun vs Jim Fan* [00:42:33] Groq Math* [00:47:37] Analyzing Gemini's 1m Context, Reddit deal, Imagegen politics, Gemma via the Four Wars* [00:55:42] The Alignment Crisis - Gemini, Meta, Sydney is back at Copilot, Grimes' take* [00:58:39] F*** you, show me the prompt* [01:02:43] Send us your suggestions pls* [01:04:50] Latent Space Anniversary* [01:04:50] Lindy.ai - Agent Platform* [01:06:40] RWKV - Beyond Transformers* [01:15:00] Pixee - Automated Security* [01:19:30] Julius AI - Competing with Code Interpreter* [01:25:03] Latent Space Listeners* [01:25:03] Listener 1 - Balázs Némethi (Hungary, Latent Space Paper Club* [01:27:47] Listener 2 - Sylvia Tong (Sora/Jim Fan/EntreConnect)* [01:31:23] Listener 3 - RJ (Developers building Community & Content)* [01:39:25] Listener 4 - Jan Zheng (Australia, AI UX)Transcript[00:00:00] AI Charlie: Welcome to the Latent Space podcast, weekend edition. This is Charlie, your new AI co host. Happy weekend. As an AI language model, I work the same every day of the week, although I might get lazier towards the end of the year. Just like you. Last month, we released our first monthly recap pod, where Swyx and Alessio gave quick takes on the themes of the month, and we were blown away by your positive response.[00:00:33] AI Charlie: We're delighted to continue our new monthly news recap series for AI engineers. Please feel free to submit questions by joining the Latent Space Discord, or just hit reply when you get the emails from Substack. This month, we're covering the top research directions that offer progress for text LLMs, and then touching on the big Valentine's Day gifts we got from Google, OpenAI, and Meta.[00:00:55] AI Charlie: Watch out and take care.[00:00:57] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO of Residence at Decibel Partners, and we're back with a monthly recap with my co host[00:01:06] swyx: Swyx. The reception was very positive for the first one, I think people have requested this and no surprise that I think they want to hear us more applying on issues and maybe drop some alpha along the way I'm not sure how much alpha we have to drop, this month in February was a very, very heavy month, we also did not do one specifically for January, so I think we're just going to do a two in one, because we're recording this on the first of March.[00:01:29] Alessio: Yeah, let's get to it. I think the last one we did, the four wars of AI, was the main kind of mental framework for people. I think in the January one, we had the five worthwhile directions for state of the art LLMs. Four, five,[00:01:42] swyx: and now we have to do six, right? Yeah.[00:01:46] Alessio: So maybe we just want to run through those, and then do the usual news recap, and we can do[00:01:52] swyx: one each.[00:01:53] swyx: So the context to this stuff. is one, I noticed that just the test of time concept from NeurIPS and just in general as a life philosophy I think is a really good idea. Especially in AI, there's news every single day, and after a while you're just like, okay, like, everyone's excited about this thing yesterday, and then now nobody's talking about it.[00:02:13] swyx: So, yeah. It's more important, or better use of time, to spend things, spend time on things that will stand the test of time. And I think for people to have a framework for understanding what will stand the test of time, they should have something like the four wars. 
Like, what is the themes that keep coming back because they are limited resources that everybody's fighting over.[00:02:31] swyx: Whereas this one, I think that the focus for the five directions is just on research that seems more proMECEng than others, because there's all sorts of papers published every single day, and there's no organization. Telling you, like, this one's more important than the other one apart from, you know, Hacker News votes and Twitter likes and whatever.[00:02:51] swyx: And obviously you want to get in a little bit earlier than Something where, you know, the test of time is counted by sort of reference citations.[00:02:59] The Five Research Directions[00:02:59] Alessio: Yeah, let's do it. We got five. Long inference.[00:03:02] swyx: Let's start there. Yeah, yeah. So, just to recap at the top, the five trends that I picked, and obviously if you have some that I did not cover, please suggest something.[00:03:13] swyx: The five are long inference, synthetic data, alternative architectures, mixture of experts, and online LLMs. And something that I think might be a bit controversial is this is a sorted list in the sense that I am not the guy saying that Mamba is like the future and, and so maybe that's controversial.[00:03:31] Direction 1: Long Inference (Planning, Search, AlphaGeometry, Flow Engineering)[00:03:31] swyx: But anyway, so long inference is a thesis I pushed before on the newsletter and on in discussing The thesis that, you know, Code Interpreter is GPT 4. 5. That was the title of the post. And it's one of many ways in which we can do long inference. You know, long inference also includes chain of thought, like, please think step by step.[00:03:52] swyx: But it also includes flow engineering, which is what Itamar from Codium coined, I think in January, where, basically, instead of instead of stuffing everything in a prompt, You do like sort of multi turn iterative feedback and chaining of things. In a way, this is a rebranding of what a chain is, what a lang chain is supposed to be.[00:04:15] swyx: I do think that maybe SGLang from ElemSys is a better name. Probably the neatest way of flow engineering I've seen yet, in the sense that everything is a one liner, it's very, very clean code. I highly recommend people look at that. I'm surprised it hasn't caught on more, but I think it will. It's weird that something like a DSPy is more hyped than a Shilang.[00:04:36] swyx: Because it, you know, it maybe obscures the code a little bit more. But both of these are, you know, really good sort of chain y and long inference type approaches. But basically, the reason that the basic fundamental insight is that the only, like, there are only a few dimensions we can scale LLMs. So, let's say in like 2020, no, let's say in like 2018, 2017, 18, 19, 20, we were realizing that we could scale the number of parameters.[00:05:03] swyx: 20, we were And we scaled that up to 175 billion parameters for GPT 3. And we did some work on scaling laws, which we also talked about in our talk. So the datasets 101 episode where we're like, okay, like we, we think like the right number is 300 billion tokens to, to train 175 billion parameters and then DeepMind came along and trained Gopher and Chinchilla and said that, no, no, like, you know, I think we think the optimal.[00:05:28] swyx: compute optimal ratio is 20 tokens per parameter. And now, of course, with LLAMA and the sort of super LLAMA scaling laws, we have 200 times and often 2, 000 times tokens to parameters. 
So now, instead of scaling parameters, we're scaling data. And fine, we can keep scaling data. But what else can we scale?[00:05:52] swyx: And I think understanding the ability to scale things is crucial to understanding what to pour money and time and effort into because there's a limit to how much you can scale some things. And I think people don't think about ceilings of things. And so the remaining ceiling of inference is like, okay, like, we have scaled compute, we have scaled data, we have scaled parameters, like, model size, let's just say.[00:06:20] swyx: Like, what else is left? Like, what's the low hanging fruit? And it, and it's, like, blindingly obvious that the remaining low hanging fruit is inference time. So, like, we have scaled training time. We can probably scale more, those things more, but, like, not 10x, not 100x, not 1000x. Like, right now, maybe, like, a good run of a large model is three months.[00:06:40] swyx: We can scale that to three years. But like, can we scale that to 30 years? No, right? Like, it starts to get ridiculous. So it's just the orders of magnitude of scaling. It's just, we're just like running out there. But in terms of the amount of time that we spend inferencing, like everything takes, you know, a few milliseconds, a few hundred milliseconds, depending on what how you're taking token by token, or, you know, entire phrase.[00:07:04] swyx: But We can scale that to hours, days, months of inference and see what we get. And I think that's really proMECEng.[00:07:11] Alessio: Yeah, we'll have Mike from Broadway back on the podcast. But I tried their product and their reports take about 10 minutes to generate instead of like just in real time. I think to me the most interesting thing about long inference is like, You're shifting the cost to the customer depending on how much they care about the end result.[00:07:31] Alessio: If you think about prompt engineering, it's like the first part, right? You can either do a simple prompt and get a simple answer or do a complicated prompt and get a better answer. It's up to you to decide how to do it. Now it's like, hey, instead of like, yeah, training this for three years, I'll still train it for three months and then I'll tell you, you know, I'll teach you how to like make it run for 10 minutes to get a better result.[00:07:52] Alessio: So you're kind of like parallelizing like the improvement of the LLM. Oh yeah, you can even[00:07:57] swyx: parallelize that, yeah, too.[00:07:58] Alessio: So, and I think, you know, for me, especially the work that I do, it's less about, you know, State of the art and the absolute, you know, it's more about state of the art for my application, for my use case.[00:08:09] Alessio: And I think we're getting to the point where like most companies and customers don't really care about state of the art anymore. It's like, I can get this to do a good enough job. You know, I just need to get better. Like, how do I do long inference? You know, like people are not really doing a lot of work in that space, so yeah, excited to see more.[00:08:28] swyx: So then the last point I'll mention here is something I also mentioned as paper. So all these directions are kind of guided by what happened in January. That was my way of doing a January recap. Which means that if there was nothing significant in that month, I also didn't mention it. 
Which is which I came to regret come February 15th, but in January also, you know, there was also the alpha geometry paper, which I kind of put in this sort of long inference bucket, because it solves like, you know, more than 100 step math olympiad geometry problems at a human gold medalist level and that also involves planning, right?[00:08:59] swyx: So like, if you want to scale inference, you can't scale it blindly, because just, Autoregressive token by token generation is only going to get you so far. You need good planning. And I think probably, yeah, what Mike from BrightWave is now doing and what everyone is doing, including maybe what we think QSTAR might be, is some form of search and planning.[00:09:17] swyx: And it makes sense. Like, you want to spend your inference time wisely. How do you[00:09:22] Alessio: think about plans that work and getting them shared? You know, like, I feel like if you're planning a task, somebody has got in and the models are stochastic. So everybody gets initially different results. Somebody is going to end up generating the best plan to do something, but there's no easy way to like store these plans and then reuse them for most people.[00:09:44] Alessio: You know, like, I'm curious if there's going to be. Some paper or like some work there on like making it better because, yeah, we don't[00:09:52] swyx: really have This is your your pet topic of NPM for[00:09:54] Alessio: Yeah, yeah, NPM, exactly. NPM for, you need NPM for anything, man. You need NPM for skills. You need NPM for planning. Yeah, yeah.[00:10:02] Alessio: You know I think, I mean, obviously the Voyager paper is like the most basic example where like, now their artifact is like the best planning to do a diamond pickaxe in Minecraft. And everybody can just use that. They don't need to come up with it again. Yeah. But there's nothing like that for actually useful[00:10:18] swyx: tasks.[00:10:19] swyx: For plans, I believe it for skills. I like that. Basically, that just means a bunch of integration tooling. You know, GPT built me integrations to all these things. And, you know, I just came from an integrations heavy business and I could definitely, I definitely propose some version of that. And it's just, you know, hard to execute or expensive to execute.[00:10:38] swyx: But for planning, I do think that everyone lives in slightly different worlds. They have slightly different needs. And they definitely want some, you know, And I think that that will probably be the main hurdle for any, any sort of library or package manager for planning. But there should be a meta plan of how to plan.[00:10:57] swyx: And maybe you can adopt that. And I think a lot of people when they have sort of these meta prompting strategies of like, I'm not prescribing you the prompt. I'm just saying that here are the like, Fill in the lines or like the mad libs of how to prompts. First you have the roleplay, then you have the intention, then you have like do something, then you have the don't something and then you have the my grandmother is dying, please do this.[00:11:19] swyx: So the meta plan you could, you could take off the shelf and test a bunch of them at once. I like that. That was the initial, maybe, promise of the, the prompting libraries. You know, both 9chain and Llama Index have, like, hubs that you can sort of pull off the shelf. 
I don't think they're very successful because people like to write their own.[00:11:36] swyx: Yeah,[00:11:37] Direction 2: Synthetic Data (WRAP, SPIN)[00:11:37] Alessio: yeah, yeah. Yeah, that's a good segue into the next one, which is synthetic[00:11:41] swyx: data. Synthetic data is so hot. Yeah, and, you know, the way, you know, I think I, I feel like I should do one of these memes where it's like, Oh, like I used to call it, you know, R L A I F, and now I call it synthetic data, and then people are interested.[00:11:54] swyx: But there's gotta be older versions of what synthetic data really is because I'm sure, you know if you've been in this field long enough, There's just different buzzwords that the industry condenses on. Anyway, the insight that I think is relatively new that why people are excited about it now and why it's proMECEng now is that we have evidence that shows that LLMs can generate data to improve themselves with no teacher LLM.[00:12:22] swyx: For all of 2023, when people say synthetic data, they really kind of mean generate a whole bunch of data from GPT 4 and then train an open source model on it. Hello to our friends at News Research. That's what News Harmony says. They're very, very open about that. I think they have said that they're trying to migrate away from that.[00:12:40] swyx: But it is explicitly against OpenAI Terms of Service. Everyone knows this. You know, especially once ByteDance got banned for, for doing exactly that. So so, so synthetic data that is not a form of model distillation is the hot thing right now, that you can bootstrap better LLM performance from the same LLM, which is very interesting.[00:13:03] swyx: A variant of this is RLAIF, where you have a, where you have a sort of a constitutional model, or, you know, some, some kind of judge model That is sort of more aligned. But that's not really what we're talking about when most people talk about synthetic data. Synthetic data is just really, I think, you know, generating more data in some way.[00:13:23] swyx: A lot of people, I think we talked about this with Vipul from the Together episode, where I think he commented that you just have to have a good world model. Or a good sort of inductive bias or whatever that, you know, term of art is. And that is strongest in math and science math and code, where you can verify what's right and what's wrong.[00:13:44] swyx: And so the REST EM paper from DeepMind explored that. Very well, it's just the most obvious thing like and then and then once you get out of that domain of like things where you can generate You can arbitrarily generate like a whole bunch of stuff and verify if they're correct and therefore they're they're correct synthetic data to train on Once you get into more sort of fuzzy topics, then it's then it's a bit less clear So I think that the the papers that drove this understanding There are two big ones and then one smaller one One was wrap like rephrasing the web from from Apple where they basically rephrased all of the C4 data set with Mistral and it be trained on that instead of C4.[00:14:23] swyx: And so new C4 trained much faster and cheaper than old C, than regular raw C4. And that was very interesting. And I have told some friends of ours that they should just throw out their own existing data sets and just do that because that seems like a pure win. Obviously we have to study, like, what the trade offs are.[00:14:42] swyx: I, I imagine there are trade offs. So I was just thinking about this last night. 
If you do synthetic data and it's generated from a model, probably you will not train on typos. So therefore you'll be like, once the model that's trained on synthetic data encounters the first typo, they'll be like, what is this?[00:15:01] swyx: I've never seen this before. So they have no association or correction as to like, oh, these tokens are often typos of each other, therefore they should be kind of similar. I don't know. That's really remains to be seen, I think. I don't think that the Apple people export[00:15:15] Alessio: that. Yeah, isn't that the whole, Mode collapse thing, if we do more and more of this at the end of the day.[00:15:22] swyx: Yeah, that's one form of that. Yeah, exactly. Microsoft also had a good paper on text embeddings. And then I think this is a meta paper on self rewarding language models. That everyone is very interested in. Another paper was also SPIN. These are all things we covered in the the Latent Space Paper Club.[00:15:37] swyx: But also, you know, I just kind of recommend those as top reads of the month. Yeah, I don't know if there's any much else in terms, so and then, regarding the potential of it, I think it's high potential because, one, it solves one of the data war issues that we have, like, everyone is OpenAI is paying Reddit 60 million dollars a year for their user generated data.[00:15:56] swyx: Google, right?[00:15:57] Alessio: Not OpenAI.[00:15:59] swyx: Is it Google? I don't[00:16:00] Alessio: know. Well, somebody's paying them 60 million, that's[00:16:04] swyx: for sure. Yes, that is, yeah, yeah, and then I think it's maybe not confirmed who. But yeah, it is Google. Oh my god, that's interesting. Okay, because everyone was saying, like, because Sam Altman owns 5 percent of Reddit, which is apparently 500 million worth of Reddit, he owns more than, like, the founders.[00:16:21] Alessio: Not enough to get the data,[00:16:22] swyx: I guess. So it's surprising that it would go to Google instead of OpenAI, but whatever. Okay yeah, so I think that's all super interesting in the data field. I think it's high potential because we have evidence that it works. There's not a doubt that it doesn't work. I think it's a doubt that there's, what the ceiling is, which is the mode collapse thing.[00:16:42] swyx: If it turns out that the ceiling is pretty close, then this will maybe augment our data by like, I don't know, 30 50 percent good, but not game[00:16:51] Alessio: changing. And most of the synthetic data stuff, it's reinforcement learning on a pre trained model. People are not really doing pre training on fully synthetic data, like, large enough scale.[00:17:02] swyx: Yeah, unless one of our friends that we've talked to succeeds. Yeah, yeah. Pre trained synthetic data, pre trained scale synthetic data, I think that would be a big step. Yeah. And then there's a wildcard, so all of these, like smaller Directions,[00:17:15] Wildcard: Multi-Epoch Training (OLMo, Datablations)[00:17:15] swyx: I always put a wildcard in there. And one of the wildcards is, okay, like, Let's say, you have pre, you have, You've scraped all the data on the internet that you think is useful.[00:17:25] swyx: Seems to top out at somewhere between 2 trillion to 3 trillion tokens. Maybe 8 trillion if Mistral, Mistral gets lucky. Okay, if I need 80 trillion, if I need 100 trillion, where do I go? And so, you can do synthetic data maybe, but maybe that only gets you to like 30, 40 trillion. 
Like where, where is the extra alpha?[00:17:43] swyx: And maybe extra alpha is just train more on the same tokens. Which is exactly what OLMo did, like Nathan Lambert, AI2, After, just after he did the interview with us, they released OLMo. So, it's unfortunate that we didn't get to talk much about it. But OLMo actually started doing 1.5 epochs on every, on all data.[00:18:00] swyx: And the data ablation paper that I covered in Europe's says that, you know, you don't like, don't really start to tap out of like, the alpha or the sort of improved loss that you get from data all the way until four epochs. And so I'm just like, okay, like, why do we all agree that one epoch is all you need?[00:18:17] swyx: It seems to be a trend. It seems that we think that memorization is very good or too good. But then also we're finding that, you know, For improvement in results that we really like, we're fine on overtraining on things intentionally. So, I think that's an interesting direction that I don't see people exploring enough.[00:18:36] swyx: And the more I see papers coming out Stretching beyond the one epoch thing, the more people are like, it's completely fine. And actually, the only reason we stopped is because we ran out of compute[00:18:46] Alessio: budget. Yeah, I think that's the biggest thing, right?[00:18:51] swyx: Like, that's not a valid reason, that's not science. I[00:18:54] Alessio: wonder if, you know, Matt is going to do it.[00:18:57] Alessio: I heard Llama 3, they want to do a 100 billion parameters model. I don't think you can train that on too many epochs, even with their compute budget, but yeah. They're the only ones that can save us, because even if OpenAI is doing this, they're not going to tell us, you know. Same with DeepMind.[00:19:14] swyx: Yeah, and so the updates that we got on Llama 3 so far is apparently that because of the Gemini news that we'll talk about later they're pushing it back on the release.[00:19:21] swyx: They already have it. And they're just pushing it back to do more safety testing. Politics testing.[00:19:28] Alessio: Well, our episode with Sumit will have already come out by the time this comes out, I think. So people will get the inside story on how they actually allocate the compute.[00:19:38] Direction 3: Alt. Architectures (Mamba, RWKV, RingAttention, Diffusion Transformers)[00:19:38] Alessio: Alternative architectures. Well, shout out to RWKV who won one of the prizes at our Final Frontiers event last week.[00:19:47] Alessio: We talked about Mamba and StripedHyena on the Together episode. A lot of, yeah, monarch mixers. I feel like Together, It's like the strong Stanford Hazy Research Partnership, because Chris Ré is one of the co-founders. So they kind of have a, I feel like they're going to be the ones that have one of the state of the art models alongside maybe RWKV.[00:20:08] Alessio: I haven't seen as many independent. People working on this thing, like Monarch Mixer, yeah, Manbuster, Payena, all of these are Together related. Nobody understands the math. They got all the gigabrains, they got Tri Dao, they got all these folks in there, like, working on all of this.[00:20:25] swyx: Albert Gu, yeah. Yeah, so what should we comment about it?[00:20:28] swyx: I mean, I think it's useful, interesting, but at the same time, both of these are supposed to do really good scaling for long context. And then Gemini comes out and goes like, yeah, we don't need it. Yeah.[00:20:44] Alessio: No, that's the risk. So, yeah. 
I was gonna say, maybe it's not here, but I don't know if we want to talk about diffusion transformers as like in the alt architectures, just because of Sora.[00:20:55] swyx: One thing, yeah, so, so, you know, this came from the Jan recap, which, and diffusion transformers were not really a discussion, and then, obviously, they blow up in February. Yeah. I don't think they're, it's a mixed architecture in the same way that StripedHyena is mixed there's just different layers taking different approaches.[00:21:13] swyx: Also I think another one that I maybe didn't call out here, I think because it happened in February, was hourglass diffusion from Stability. But also, you know, another form of mixed architecture. So I guess that is interesting. I don't have much commentary on that, I just think, like, we will try to evolve these things, and maybe one of these architectures will stick and scale, it seems like diffusion transformers is going to be good for anything generative, you know, multi modal.[00:21:41] swyx: We don't see anything where diffusion is applied to text yet, and that's the wild card for this category. Yeah, I mean, I think I still hold out hope for let's just call it sub quadratic LLMs. I think that a lot of discussion this month actually was also centered around this concept that People always say, oh, like, transformers don't scale because attention is quadratic in the sequence length.[00:22:04] swyx: Yeah, but, you know, attention actually is a very small part of the actual compute that is being spent, especially in inference. And this is the reason why, you know, when you multiply, when you, when you, when you jump up in terms of the, the context size in GPT 4 from like, you know, 8k to like 32k, you don't also get like a 16 times increase in your, in your performance.[00:22:23] swyx: And this is also why you don't get like a million times increase in your, in your latency when you throw a million tokens into Gemini. Like people have figured out tricks around it or it's just not that significant as a term, as a part of the overall compute. So there's a lot of challenges to this thing working.[00:22:43] swyx: It's really interesting how like, how hyped people are about this versus I don't know if it works. You know, it's exactly gonna, gonna work. And then there's also this, this idea of retention over long context. Like, even though you have context utilization, like, the amount of, the amount you can remember is interesting.[00:23:02] swyx: Because I've had people criticize both Mamba and RWKV because they're kind of, like, RNN ish in the sense that they have, like, a hidden memory and sort of limited hidden memory that they will forget things. So, for all these reasons, Gemini 1.5, which we still haven't covered, is very interesting because Gemini magically has fixed all these problems with perfect haystack recall and reasonable latency and cost.[00:23:29] Wildcards: Text Diffusion, RALM/Retro[00:23:29] swyx: So that's super interesting. So the wildcard I put in here if you want to go to that. I put two actually. One is text diffusion. I think I'm still very influenced by my meeting with a Midjourney person who said they were working on text diffusion. I think it would be a very, very different paradigm for, for text generation, reasoning, plan generation if we can get diffusion to work.[00:23:51] swyx: For text. 
And then the second one is Douwe Kiela's Contextual AI, which is working on retrieval augmented language models, where it kind of puts RAG inside of the language model instead of outside.[00:24:02] Alessio: Yeah, there's a paper called Retro that covers some of this. I think that's an interesting thing. I think the The challenge, well not the challenge, what they need to figure out is like how do you keep the rag piece always up to date constantly, you know, I feel like the models, you put all this work into pre training them, but then at least you have a fixed artifact.[00:24:22] Alessio: These architectures are like constant work needs to be done on them and they can drift even just based on the rag data instead of the model itself. Yeah,[00:24:30] swyx: I was in a panel with one of the investors in Contextual and the guy, the way that guy pitched it, I didn't agree with. He was like, this will solve hallucination.[00:24:38] Alessio: That's what everybody says. We solve[00:24:40] swyx: hallucination. I'm like, no, you reduce it. It cannot,[00:24:44] Alessio: if you solved it, the model wouldn't exist, right? It would just be plain text. It wouldn't be a generative model. Cool. So, alt architectures, then we got mixture of experts. I think we covered a lot of, a lot of times.[00:24:56] Direction 4: Mixture of Experts (DeepSeekMoE, Samba-1)[00:24:56] Alessio: Maybe any new interesting threads you want to go under here?[00:25:00] swyx: DeepSeek MoE, which was released in January. Everyone who is interested in MoEs should read that paper, because it's significant for two reasons. One, three reasons. One, it had, it had small experts, like a lot more small experts. So, for some reason, everyone has settled on eight experts for GPT 4 for Mixtral, you know, that seems to be the favorite architecture, but these guys pushed it to 64 experts, and each of them smaller than the other.[00:25:26] swyx: But then they also had the second idea, which is that they had two, one to two always on experts for common knowledge and that's like a very compelling concept that you would not route to all the experts all the time and make them, you know, switch to everything. You would have some always on experts.[00:25:41] swyx: I think that's interesting on both the inference side and the training side for for memory retention. And yeah, they, they, they, the, the, the, the results that they published, which actually excluded Mixtral, which is interesting. The results that they published showed a significant performance jump versus all the other sort of open source models at the same parameter count.[00:26:01] swyx: So like this may be a better way to do MoEs that are, that is about to get picked up. And so that, that is interesting for the third reason, which is this is the first time a new idea from China has infiltrated the West. It's usually the other way around. I probably overspoke there. There's probably lots more ideas that I'm not aware of.[00:26:18] swyx: Maybe in the embedding space. But I think DeepSeekMoE, like, woke people up and said, like, hey, DeepSeek, this, like, weird lab that is attached to a Chinese hedge fund is somehow, you know, doing groundbreaking research on MoEs. So, so, I classified this as a medium potential because I think that it is a sort of like a one off benefit.[00:26:37] swyx: You can add it to any, any base model to like make the MoE version of it, you get a bump and then that's it. 
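For readers who want to see the idea of many small routed experts plus an always-on shared expert in code, here is a minimal PyTorch toy sketch in the spirit of the discussion above. It is an editor's illustration under stated assumptions, not DeepSeek's actual implementation: the dimensions, expert counts, and routing details are made up, and a real MoE layer would add load-balancing losses and far more efficient expert dispatch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinySharedExpertMoE(nn.Module):
    """Toy MoE layer: one always-on shared expert plus top-k of many small routed experts."""

    def __init__(self, d_model=256, d_ff=128, n_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        make_expert = lambda: nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.shared_expert = make_expert()                 # "common knowledge", always applied
        self.experts = nn.ModuleList(make_expert() for _ in range(n_experts))
        self.router = nn.Linear(d_model, n_experts)        # token -> expert scores

    def forward(self, x):                                  # x: (tokens, d_model)
        out = self.shared_expert(x)                        # always-on path
        weights = F.softmax(self.router(x), dim=-1)        # routing probabilities
        top_w, top_idx = weights.topk(self.top_k, dim=-1)  # pick k routed experts per token
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)    # renormalize the chosen weights
        for slot in range(self.top_k):
            for e_id in range(len(self.experts)):
                mask = top_idx[:, slot] == e_id            # tokens routed to expert e_id
                if mask.any():
                    out[mask] += top_w[mask, slot, None] * self.experts[e_id](x[mask])
        return out


x = torch.randn(8, 256)
print(TinySharedExpertMoE()(x).shape)  # torch.Size([8, 256])
```

The point of the sketch is only the shape of the idea: every token always passes through the shared expert, and only a small, token-dependent subset of the routed experts is ever computed.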
So, yeah,[00:26:45] Alessio: I saw Samba Nova, which is like another inference company. They released this MOE model called Samba 1, which is like a 1 trillion parameters. But they're actually MOE auto open source models.[00:26:56] Alessio: So it's like, they just, they just clustered them all together. So I think people. Sometimes I think MOE is like you just train a bunch of small models or like smaller models and put them together. But there's also people just taking, you know, Mistral plus Clip plus, you know, Deepcoder and like put them all together.[00:27:15] Alessio: And then you have a MOE model. I don't know. I haven't tried the model, so I don't know how good it is. But it seems interesting that you can then have people working separately on state of the art, you know, Clip, state of the art text generation. And then you have a MOE architecture that brings them all together.[00:27:31] swyx: I'm thrown off by your addition of the word clip in there. Is that what? Yeah, that's[00:27:35] Alessio: what they said. Yeah, yeah. Okay. That's what they I just saw it yesterday. I was also like[00:27:40] swyx: scratching my head. And they did not use the word adapter. No. Because usually what people mean when they say, Oh, I add clip to a language model is adapter.[00:27:48] swyx: Let me look up the Which is what Lava did.[00:27:50] Alessio: The announcement again.[00:27:51] swyx: Stable diffusion. That's what they do. Yeah, it[00:27:54] Alessio: says among the models that are part of Samba 1 are Lama2, Mistral, DeepSigCoder, Falcon, Dplot, Clip, Lava. So they're just taking all these models and putting them in a MOE. Okay,[00:28:05] swyx: so a routing layer and then not jointly trained as much as a normal MOE would be.[00:28:12] swyx: Which is okay.[00:28:13] Alessio: That's all they say. There's no paper, you know, so it's like, I'm just reading the article, but I'm interested to see how[00:28:20] Wildcard: Model Merging (mergekit)[00:28:20] swyx: it works. Yeah, so so the wildcard for this section, the MOE section is model merges, which has also come up as, as a very interesting phenomenon. The last time I talked to Jeremy Howard at the Olama meetup we called it model grafting or model stacking.[00:28:35] swyx: But I think the, the, the term that people are liking these days, the model merging, They're all, there's all different variations of merging. Merge types, and some of them are stacking, some of them are, are grafting. And, and so like, some people are approaching model merging in the way that Samba is doing, which is like, okay, here are defined models, each of which have their specific, Plus and minuses, and we will merge them together in the hope that the, you know, the sum of the parts will, will be better than others.[00:28:58] swyx: And it seems like it seems like it's working. I don't really understand why it works apart from, like, I think it's a form of regularization. That if you merge weights together in like a smart strategy you, you, you get a, you get a, you get a less overfitting and more generalization, which is good for benchmarks, if you, if you're honest about your benchmarks.[00:29:16] swyx: So this is really interesting and good. But again, they're kind of limited in terms of like the amount of bumps you can get. But I think it's very interesting in the sense of how cheap it is. We talked about this on the Chinatalk podcast, like the guest podcast that we did with Chinatalk. 
And you can do this without GPUs, because it's just adding weights together, and dividing things, and doing like simple math, which is really interesting for the GPU ports.[00:29:42] Alessio: There's a lot of them.[00:29:44] Direction 5: Online LLMs (Gemini Pro, Exa)[00:29:44] Alessio: And just to wrap these up, online LLMs? Yeah,[00:29:48] swyx: I think that I ki I had to feature this because the, one of the top news of January was that Gemini Pro beat GPT-4 turbo on LM sis for the number two slot to GPT-4. And everyone was very surprised. Like, how does Gemini do that?[00:30:06] swyx: Surprise, surprise, they added Google search. Mm-hmm to the results. So it became an online quote unquote online LLM and not an offline LLM. Therefore, it's much better at answering recent questions, which people like. There's an emerging set of table stakes features after you pre train something.[00:30:21] swyx: So after you pre train something, you should have the chat tuned version of it, or the instruct tuned version of it, however you choose to call it. You should have the JSON and function calling version of it. Structured output, the term that you don't like. You should have the online version of it. These are all like table stakes variants, that you should do when you offer a base LLM, or you train a base LLM.[00:30:44] swyx: And I think online is just like, There, it's important. I think companies like Perplexity, and even Exa, formerly Metaphor, you know, are rising to offer that search needs. And it's kind of like, they're just necessary parts of a system. When you have RAG for internal knowledge, and then you have, you know, Online search for external knowledge, like things that you don't know yet?[00:31:06] swyx: Mm-Hmm. . And it seems like it's, it's one of many tools. I feel like I may be underestimating this, but I'm just gonna put it out there that I, I think it has some, some potential. One of the evidence points that it doesn't actually matter that much is that Perplexity has a, has had online LMS for three months now and it performs, doesn't perform great.[00:31:25] swyx: Mm-Hmm. on, on lms, it's like number 30 or something. So it's like, okay. You know, like. It's, it's, it helps, but it doesn't give you a giant, giant boost. I[00:31:34] Alessio: feel like a lot of stuff I do with LLMs doesn't need to be online. So I'm always wondering, again, going back to like state of the art, right? It's like state of the art for who and for what.[00:31:45] Alessio: It's really, I think online LLMs are going to be, State of the art for, you know, news related activity that you need to do. Like, you're like, you know, social media, right? It's like, you want to have all the latest stuff, but coding, science,[00:32:01] swyx: Yeah, but I think. Sometimes you don't know what is news, what is news affecting.[00:32:07] swyx: Like, the decision to use an offline LLM is already a decision that you might not be consciously making that might affect your results. Like, what if, like, just putting things on, being connected online means that you get to invalidate your knowledge. And when you're just using offline LLM, like it's never invalidated.[00:32:27] swyx: I[00:32:28] Alessio: agree, but I think going back to your point of like the standing the test of time, I think sometimes you can get swayed by the online stuff, which is like, hey, you ask a question about, yeah, maybe AI research direction, you know, and it's like, all the recent news are about this thing. 
So the LLM like focus on answering, bring it up, you know, these things.[00:32:50] swyx: Yeah, so yeah, I think, I think it's interesting, but I don't know if I can, I bet heavily on this.[00:32:56] Alessio: Cool. Was there one that you forgot to put, or, or like a, a new direction? Yeah,[00:33:01] swyx: so, so this brings us into sort of February. ish.[00:33:05] OpenAI Sora and why everyone underestimated videogen[00:33:05] swyx: So like I published this in like 15 came with Sora. And so like the one thing I did not mention here was anything about multimodality.[00:33:16] swyx: Right. And I have chronically underweighted this. I always wrestle. And, and my cop out is that I focused this piece or this research direction piece on LLMs because LLMs are the source of like AGI, quote unquote AGI. Everything else is kind of like. You know, related to that, like, generative, like, just because I can generate better images or generate better videos, it feels like it's not on the critical path to AGI, which is something that Nat Friedman also observed, like, the day before Sora, which is kind of interesting.[00:33:49] swyx: And so I was just kind of like trying to focus on like what is going to get us like superhuman reasoning that we can rely on to build agents that automate our lives and blah, blah, blah, you know, give us this utopian future. But I do think that I, everybody underestimated the, the sheer importance and cultural human impact of Sora.[00:34:10] swyx: And you know, really actually good text to video. Yeah. Yeah.[00:34:14] Alessio: And I saw Jim Fan at a, at a very good tweet about why it's so impressive. And I think when you have somebody leading the embodied research at NVIDIA and he said that something is impressive, you should probably listen. So yeah, there's basically like, I think you, you mentioned like impacting the world, you know, that we live in.[00:34:33] Alessio: I think that's kind of like the key, right? It's like the LLMs don't have, a world model and Jan Lekon. He can come on the podcast and talk all about what he thinks of that. But I think SORA was like the first time where people like, Oh, okay, you're not statically putting pixels of water on the screen, which you can kind of like, you know, project without understanding the physics of it.[00:34:57] Alessio: Now you're like, you have to understand how the water splashes when you have things. And even if you just learned it by watching video and not by actually studying the physics, You still know it, you know, so I, I think that's like a direction that yeah, before you didn't have, but now you can do things that you couldn't before, both in terms of generating, I think it always starts with generating, right?[00:35:19] Alessio: But like the interesting part is like understanding it. You know, it's like if you gave it, you know, there's the video of like the, the ship in the water that they generated with SORA, like if you gave it the video back and now it could tell you why the ship is like too rocky or like it could tell you why the ship is sinking, then that's like, you know, AGI for like all your rig deployments and like all this stuff, you know, so, but there's none, there's none of that yet, so.[00:35:44] Alessio: Hopefully they announce it and talk more about it. Maybe a Dev Day this year, who knows.[00:35:49] swyx: Yeah who knows, who knows. I'm talking with them about Dev Day as well. 
So I would say, like, the phrasing that Jim used, which resonated with me, he kind of called it a data driven world model. I somewhat agree with that.[00:36:04] Does Sora have a World Model? Yann LeCun vs Jim Fan[00:36:04] swyx: I am on more of a Yann LeCun side than I am on Jim's side, in the sense that I think that is the vision or the hope that these things can build world models. But you know, clearly even at the current SORA size, they don't have the idea of, you know, They don't have strong consistency yet. They have very good consistency, but fingers and arms and legs will appear and disappear and chairs will appear and disappear.[00:36:31] swyx: That definitely breaks physics. And it also makes me think about how we do deep learning versus world models in the sense of You know, in classic machine learning, when you have too many parameters, you will overfit, and actually that fails, that like, does not match reality, and therefore fails to generalize well.[00:36:50] swyx: And like, what scale of data do we need in order to world, learn world models from video? A lot. Yeah. So, so I, I And cautious about taking this interpretation too literally, obviously, you know, like, I get what he's going for, and he's like, obviously partially right, obviously, like, transformers and, and, you know, these, like, these sort of these, these neural networks are universal function approximators, theoretically could figure out world models, it's just like, how good are they, and how tolerant are we of hallucinations, we're not very tolerant, like, yeah, so It's, it's, it's gonna prior, it's gonna bias us for creating like very convincing things, but then not create like the, the, the useful role models that we want.[00:37:37] swyx: At the same time, what you just said, I think made me reflect a little bit like we just got done saying how important synthetic data is for Mm-Hmm. for training lms. And so like, if this is a way of, of synthetic, you know, vi video data for improving our video understanding. Then sure, by all means. Which we actually know, like, GPT 4, Vision, and Dolly were trained, kind of, co trained together.[00:38:02] swyx: And so, like, maybe this is on the critical path, and I just don't fully see the full picture yet.[00:38:08] Alessio: Yeah, I don't know. I think there's a lot of interesting stuff. It's like, imagine you go back, you have Sora, you go back in time, and Newton didn't figure out gravity yet. Would Sora help you figure it out?[00:38:21] Alessio: Because you start saying, okay, a man standing under a tree with, like, Apples falling, and it's like, oh, they're always falling at the same speed in the video. Why is that? I feel like sometimes these engines can like pick up things, like humans have a lot of intuition, but if you ask the average person, like the physics of like a fluid in a boat, they couldn't be able to tell you the physics, but they can like observe it, but humans can only observe this much, you know, versus like now you have these models to observe everything and then They generalize these things and maybe we can learn new things through the generalization that they pick up.[00:38:55] swyx: But again, And it might be more observant than us in some respects. In some ways we can scale it up a lot more than the number of physicists that we have available at Newton's time. So like, yeah, absolutely possible. That, that this can discover new science. 
I think we have a lot of work to do to formalize the science.[00:39:11] swyx: And then, I think the last part is, you know, how much do we cheat by generating data from Unreal Engine 5? Mm hmm. Which is what a lot of people are speculating, with very, very limited evidence, that OpenAI did. The strongest evidence that I saw was someone who works a lot with Unreal Engine 5 looking at the side characters in the videos and noticing that they all adopt Unreal Engine defaults[00:39:37] swyx: of, like, walking speed and, like, character creation choice. And I was like, okay, that's actually pretty convincing that they actually used Unreal Engine to bootstrap some synthetic data for this training set. Yeah,[00:39:52] Alessio: could very well be.[00:39:54] swyx: Because then you get the labels and the training side by side.[00:39:58] swyx: One thing that came up on the last day of February, which I should also mention, is EMO coming out of Alibaba, which is also a sort of video generation and space-time transformer that also involves probably a lot of synthetic data as well. And so this is of a kind, in the sense of, oh, really good generative video is here, and it is not just the one- or two-second clips that we saw from other people, like, you know, Pika and Runway. Cristobal Valenzuela from Runway was like, game on. Which, okay, but let's see your response, because we've heard a lot about Gen-1 and 2, but it's nothing on this level of Sora. So it remains to be seen how we can actually apply this, but I do think that the creative industry should start preparing.[00:40:50] swyx: I think the Sora technical blog post from OpenAI was really good. It was like a request for startups. It was so good in spelling out: here are the individual industries that this can impact.[00:41:00] swyx: And anyone who's interested in generative video should look at that. But also be mindful that probably when OpenAI releases a Sora API, right, the ways you can interact with it are very limited, just like the ways you can interact with DALL-E are very limited, and someone is gonna have to make an open Sora[00:41:19] swyx: Mm-Hmm for you to create ComfyUI pipelines.[00:41:24] Alessio: The Stability folks said they wanna build an open Sora competitor, but yeah, Stability. Their demo video was like so underwhelming. It was just like two people sitting on the beach[00:41:34] swyx: standing. Well, they don't have it yet, right? Yeah, yeah.[00:41:36] swyx: I mean, they just wanna train it. Everybody wants to, right? Yeah. I think what is confusing a lot of people about Stability is they're pushing a lot of things in Stable Code, Stable LM, and Stable Video Diffusion. But like, how much money do they have left? How many people do they have left?[00:41:51] swyx: Yeah. Imad spent two hours with me, reassuring me things are great. And I'm like, I do believe that they have really, really quality people. But it's just like, I also have a lot of very smart people on the other side telling me, like, hey man, you know, don't put too much faith in this thing.[00:42:11] swyx: So I don't know who to believe. Yeah.[00:42:14] Alessio: It's hard. Let's see. What else? We got a lot more stuff. I don't know if we can.
Yeah, Groq.[00:42:19] Groq Math[00:42:19] Alessio: We can[00:42:19] swyx: do a bit of Groq prep. We're about to go talk to Dylan Patel. Maybe it's the audio in here, I don't know, it depends what we get up to later. What do you as an investor think about Groq? Yeah, well, actually, can you recap, like, why is Groq interesting? So,[00:42:33] Alessio: Jonathan Ross, who's the founder of Groq, he's the person that created the TPU at Google. It's actually, it was one of his, like, 20 percent projects. He was just on the side, dooby doo, created the TPU.[00:42:46] Alessio: But yeah, basically, Groq, they had this demo that went viral, where they were running Mistral at, like, 500 tokens a second, which is like, the fastest of anything that you have out there. The question, you know, the memes were like, is NVIDIA dead? Like, people don't need H100s anymore. I think there's a lot of money that goes into building what Groq has built as far as the hardware goes.[00:43:11] Alessio: We're gonna put some of the notes from Dylan in here, but basically the cost of the Groq system is like 30 times the cost of the H100 equivalent. So, so[00:43:23] swyx: let me, I put some numbers, because me and Dylan were, I think, the two people who actually tried to do Groq math. Spreadsheet doors.[00:43:30] swyx: Spreadsheet doors. So, okay, oh boy. The equivalent H100 for Llama 2 is $300,000, for a system of 8 cards. And for Groq it's $2.3 million, because you have to buy 576 Groq cards. So that just gives people an idea. So if you depreciate both over a five-year lifespan, per year you're depreciating $460K for Groq, and $60K a year for H100.[00:43:59] swyx: So Groqs are just way more expensive per model that you're hosting. But then, you make it up in terms of volume. So I don't know if you want to[00:44:08] Alessio: cover that. I think one of the promises of Groq is like super high parallel inference on the same thing. So you're basically saying, okay, I'm putting in this upfront investment on the hardware, but then I get much better scaling once I have it installed.[00:44:24] Alessio: I think the big question is how much can you sustain the parallelism? You know, if you're going to get 100% utilization rate at all times on Groq, it's just much better, because at the end of the day, the tokens-per-second cost that you're getting is better than with the H100s. But if you get to like 50 percent utilization rate, you will be much better off running on NVIDIA.[00:44:49] Alessio: And if you look at most companies out there, who really gets 100 percent utilization rate? Probably OpenAI at peak times, but that's probably it. But yeah, curious to see more. I saw Jonathan was just at the Web Summit in Qatar. He just gave a talk there yesterday that I haven't listened to yet.[00:45:09] Alessio: I tweeted that he should come on the pod. He liked it. And then Groq followed me on Twitter. I don't know if that means that they're interested, but[00:45:16] swyx: hopefully Groq's social media person is just very friendly. They, yeah. Hopefully[00:45:20] Alessio: we can get them. Yeah, we, we gonna get him.
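To put concrete numbers on the hardware comparison above, here is a rough sketch of the depreciation math in Python, using only the figures quoted in the episode; the dollar amounts are the hosts' estimates rather than vendor pricing.

```python
# Depreciation math for the quoted hardware figures (hosts' estimates, not vendor pricing).
h100_system_cost = 300_000    # 8x H100 system quoted for hosting Llama 2
groq_system_cost = 2_300_000  # ~576 Groq cards quoted for the same model
lifespan_years = 5

h100_per_year = h100_system_cost / lifespan_years  # -> 60,000
groq_per_year = groq_system_cost / lifespan_years  # -> 460,000

print(f"H100 depreciation: ${h100_per_year:,.0f} per year")
print(f"Groq depreciation: ${groq_per_year:,.0f} per year")
```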
We[00:45:22] swyx: just call him out. And so basically the key question is, how sustainable is this, and how much[00:45:27] swyx: this is a loss leader. The entire Groq management team has been on Twitter and Hacker News saying they are very, very comfortable with the pricing of $0.27 per million tokens. This is the lowest that anyone has offered tokens, as far as Mixtral or Llama 2 goes. This matches DeepInfra, and, you know, I think that's about it in terms of that low.[00:45:47] swyx: And we think the break-even for H100s is 50 cents, at a normal utilization rate. To make this work, so in my spreadsheet I made this work, you have to have like a parallelism of 500 requests, all simultaneously, and you have model bandwidth utilization of 80%,[00:46:06] swyx: which is way high. I just gave them high marks for everything. Groq has two fundamental tech innovations that they hang their hats on in terms of, like, why we are better than everyone, even though it remains to be independently replicated. But one, you know, they have this sort of entire-model-on-the-chip idea, which is like, okay, get rid of HBM[00:46:30] swyx: and, like, put everything in SRAM. Like, okay, fine, but then you need a lot of cards and whatever. And that's all okay. And so, because you don't have to transfer between memory, you just save on that time, and that's why they're faster. So a lot of people buy that as, like, the reason that you're faster.[00:46:45] swyx: Then they have, like, some kind of crazy compiler, or, like, speculative routing magic using compilers, that they also attribute towards their higher utilization. So I give them 80 percent for that. And so that all works out to, like, okay, base costs, I think you can get down to maybe, like, 20-something cents per million tokens.[00:47:04] swyx: And therefore you actually are fine if you have that kind of utilization. But it's like, I have to make a lot of fearful assumptions for this to work.[00:47:12] Alessio: Yeah. Yeah, I'm curious to see what Dylan says later.[00:47:16] swyx: So he was like completely opposite of me. He's like, they're just burning money. Which is great.[00:47:22] Analyzing Gemini's 1m Context, Reddit deal, Imagegen politics, Gemma via the Four Wars[00:47:22] Alessio: Gemini, want to do a quick run-through, since this touches on all the four wars.[00:47:28] swyx: Yeah, and I think this is the mark of a useful framework, that when a new thing comes along, you can break it down in terms of the four wars and sort of slot it in, or analyze it in those four frameworks, and have nothing left.[00:47:41] swyx: So it's a MECE categorization. MECE is Mutually Exclusive and Collectively Exhaustive. And that's a really, really nice way to think about taxonomies and to create mental frameworks. So, what is Gemini 1.5 Pro? It is the newest model, which came out one week after Gemini 1.0. Which is very interesting.[00:48:01] swyx: They have not really commented on why.
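Continuing the Groq pricing thread above, here is a rough sketch of the per-token break-even arithmetic. The $0.27-per-million price, the 500-way parallelism, and the 80% bandwidth-utilization figures come from the episode; the per-stream throughput is an illustrative assumption, and the sketch counts depreciation only (no power, hosting, or margin).

```python
# Hardware-only cost per million tokens under the assumptions stated in the episode.
# Per-stream throughput is an illustrative assumption; power/hosting/margin are excluded.
groq_depreciation_per_year = 460_000   # from the five-year write-off above
posted_price_per_m_tokens = 0.27       # Groq's advertised price
h100_breakeven_per_m_tokens = 0.50     # host estimate for H100 break-even

tokens_per_second_per_stream = 500     # the viral Mistral demo speed (serving assumption)
parallel_streams = 500                 # episode assumption
bandwidth_utilization = 0.80           # episode assumption
seconds_per_year = 365 * 24 * 3600

tokens_per_year = (tokens_per_second_per_stream * parallel_streams
                   * bandwidth_utilization * seconds_per_year)
hardware_cost_per_m = groq_depreciation_per_year / (tokens_per_year / 1_000_000)

print(f"Hardware-only cost: ${hardware_cost_per_m:.3f} per million tokens")
print(f"Posted price:       ${posted_price_per_m_tokens:.2f} per million tokens")
print(f"H100 break-even:    ${h100_breakeven_per_m_tokens:.2f} per million tokens")
```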
They released this the headline feature is that it has a 1 million token context window that is multi modal which means that you can put all sorts of video and audio And PDFs natively in there alongside of text and, you know, it's, it's at least 10 times longer than anything that OpenAI offers which is interesting.[00:48:20] swyx: So it's great for prototyping and it has interesting discussions on whether it kills RAG.[00:48:25] Alessio: Yeah, no, I mean, we always talk about, you know, Long context is good, but you're getting charged per token. So, yeah, people love for you to use more tokens in the context. And RAG is better economics. But I think it all comes down to like how the price curves change, right?[00:48:42] Alessio: I think if anything, RAG's complexity goes up and up the more you use it, you know, because you have more data sources, more things you want to put in there. The token costs should go down over time, you know, if the model stays fixed. If people are happy with the model today. In two years, three years, it's just gonna cost a lot less, you know?[00:49:02] Alessio: So now it's like, why would I use RAG and like go through all of that? It's interesting. I think RAG is better cutting edge economics for LLMs. I think large context will be better long tail economics when you factor in the build cost of like managing a RAG pipeline. But yeah, the recall was like the most interesting thing because we've seen the, you know, You know, in the haystack things in the past, but apparently they have 100 percent recall on anything across the context window.[00:49:28] Alessio: At least they say nobody has used it. No, people[00:49:30] swyx: have. Yeah so as far as, so, so what this needle in a haystack thing for people who aren't following as closely as us is that someone, I forget his name now someone created this needle in a haystack problem where you feed in a whole bunch of generated junk not junk, but just like, Generate a data and ask it to specifically retrieve something in that data, like one line in like a hundred thousand lines where it like has a specific fact and if it, if you get it, you're, you're good.[00:49:57] swyx: And then he moves the needle around, like, you know, does it, does, does your ability to retrieve that vary if I put it at the start versus put it in the middle, put it at the end? And then you generate this like really nice chart. That, that kind of shows like it's recallability of a model. And he did that for GPT and, and Anthropic and showed that Anthropic did really, really poorly.[00:50:15] swyx: And then Anthropic came back and said it was a skill issue, just add this like four, four magic words, and then, then it's magically all fixed. And obviously everybody laughed at that. But what Gemini came out with was, was that, yeah, we, we reproduced their, you know, haystack issue you know, test for Gemini, and it's good across all, all languages.[00:50:30] swyx: All the one million token window, which is very interesting because usually for typical context extension methods like rope or yarn or, you know, anything like that, or alibi, it's lossy like by design it's lossy, usually for conversations that's fine because we are lossy when we talk to people but for superhuman intelligence, perfect memory across Very, very long context.[00:50:51] swyx: It's very, very interesting for picking things up. And so the people who have been given the beta test for Gemini have been testing this. 
So what you do is you upload, let's say, all of Harry Potter and you change one fact in one sentence, somewhere in there, and you ask it to pick it up, and it does. So this is legit.[00:51:08] swyx: We don't super know how, because this is, like, because it doesn't, yes, it's slow to inference, but it's not slow enough that it's, like, running. Five different systems in the background without telling you. Right. So it's something, it's something interesting that they haven't fully disclosed yet. The open source community has centered on this ring attention paper, which is created by your friend Matei Zaharia, and a couple other people.[00:51:36] swyx: And it's a form of distributing the compute. I don't super understand, like, why, you know, doing, calculating, like, the fee for networking and attention. In block wise fashion and distributing it makes it so good at recall. I don't think they have any answer to that. The only thing that Ring of Tension is really focused on is basically infinite context.[00:51:59] swyx: They said it was good for like 10 to 100 million tokens. Which is, it's just great. So yeah, using the four wars framework, what is this framework for Gemini? One is the sort of RAG and Ops war. Here we care less about RAG now, yes. Or, we still care as much about RAG, but like, now it's it's not important in prototyping.[00:52:21] swyx: And then, for data war I guess this is just part of the overall training dataset, but Google made a 60 million deal with Reddit and presumably they have deals with other companies. For the multi modality war, we can talk about the image generation, Crisis, or the fact that Gemini also has image generation, which we'll talk about in the next section.[00:52:42] swyx: But it also has video understanding, which is, I think, the top Gemini post came from our friend Simon Willison, who basically did a short video of him scanning over his bookshelf. And it would be able to convert that video into a JSON output of what's on that bookshelf. And I think that is very useful.[00:53:04] swyx: Actually ties into the conversation that we had with David Luan from Adept. In a sense of like, okay what if video was the main modality instead of text as the input? What if, what if everything was video in, because that's how we work. We, our eyes don't actually read, don't actually like get input, our brains don't get inputs as characters.[00:53:25] swyx: Our brains get the pixels shooting into our eyes, and then our vision system takes over first, and then we sort of mentally translate that into text later. And so it's kind of like what Adept is kind of doing, which is driving by vision model, instead of driving by raw text understanding of the DOM. And, and I, I, in that, that episode, which we haven't released I made the analogy to like self-driving by lidar versus self-driving by camera.[00:53:52] swyx: Mm-Hmm. , right? Like, it's like, I think it, what Gemini and any other super long context that model that is multimodal unlocks is what if you just drive everything by video. Which is[00:54:03] Alessio: cool. Yeah, and that's Joseph from Roboflow. It's like anything that can be seen can be programmable with these models.[00:54:12] Alessio: You mean[00:54:12] swyx: the computer vision guy is bullish on computer vision?[00:54:18] Alessio: It's like the rag people. The rag people are bullish on rag and not a lot of context. I'm very surprised. The, the fine tuning people love fine tuning instead of few shot. Yeah. Yeah. The, yeah, the, that's that. 
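For anyone who has not seen the needle-in-a-haystack test described above, here is a minimal sketch of how such an eval is usually assembled; `ask_model` stands in for whatever LLM API you use, and the filler text and needle are toy placeholders.

```python
import random

def build_haystack(filler_sentences, needle, depth, total_sentences=5_000):
    """Assemble a long context with `needle` placed at a relative depth (0.0 = start, 1.0 = end)."""
    body = [random.choice(filler_sentences) for _ in range(total_sentences)]
    body.insert(int(depth * total_sentences), needle)
    return " ".join(body)

def needle_recall(ask_model, filler_sentences, needle, question, expected,
                  depths=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """ask_model(prompt) -> str is a placeholder for a real LLM call."""
    results = {}
    for depth in depths:
        context = build_haystack(filler_sentences, needle, depth)
        reply = ask_model(f"{context}\n\nQuestion: {question}\nAnswer concisely.")
        results[depth] = expected.lower() in reply.lower()
    # e.g. {0.0: True, 0.25: True, ...}; plotted over depths this gives the familiar recall chart.
    return results
```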
Yeah, the, I, I think the ring attention thing, and it's how they did it, we don't know. And then they released the Gemma models, which are like a 2 billion and 7 billion open.[00:54:41] Alessio: Models, which people said are not, are not good based on my Twitter experience, which are the, the GPU poor crumbs. It's like, Hey, we did all this work for us because we're GPU rich and we're just going to run this whole thing. And

CacaoCast
Épisode 273 - Apple Vision Pro, Île dynamique, OpenAPI, DeckIt, Shaper, OSStatus, Proxy pour G4

CacaoCast

Play Episode Listen Later Jan 11, 2024 55:17


Welcome to the two hundred and seventy-third episode of CacaoCast! In this episode, Philippe Casgrain and Philippe Guitard discuss the following topics: Apple Vision Pro - Pre-orders on January 19 and available February 2! Dynamic Island - How to hide it in the simulator OpenAPI - Now integrated into the Swift language DeckIt - Build a card-game app with very little code Shaper - Convert SVG into SwiftUI OSStatus - All the macOS error codes, and more Proxy for G4 - If you have an old Mac and want to browse the Internet Listen to this episode

Agency Intelligence
Peter MacDonald: On Advancing Insurance Tech

Agency Intelligence

Play Episode Listen Later Jan 9, 2024 46:58


In this episode of the Agents Influence podcast, host Jason Cass interviews Peter MacDonald, Co-founder & CEO of Wunderite. Key Topics: Introduction of Peter MacDonald and the creation of Wunderite. Discussion on the independent agent landscape and the IndieTech event. Insights from CEOs of major industry players on the rate of industry change. Exploration of the customer journey in insurance and its impact on agency operations. The challenge of integrating technology and maintaining relevance in a rapidly evolving industry. The role of Open API in advancing industry connectivity. Perspectives on investment in technology by insurance agencies. Reach out to: Peter MacDonald Jason Cass Visit Website: Wunderite Agency Intelligence

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
(Part 3/N) Salesforce: Anypoint Design Center, Anypoint Code Builder IDE

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)

Play Episode Listen Later Dec 20, 2023 24:10


In this podcast episode, Krish continues the discussion on the Salesforce Anypoint Design Center. He starts by recapping the previous episode and addressing a timeout issue. He then explores the process of publishing API documentation and compares the RAML and OpenAPI specifications. Krish also demonstrates how to access and customize the public portal. He explains how to enable and disable the portal and shares assets with team members. Finally, he provides feedback on the Design Center's user interface and concludes the episode. #snowpal #salesforce #design Snowpal's Products: Backends as Services on ⁠⁠⁠⁠⁠AWS Marketplace⁠⁠⁠⁠⁠ Mobile Apps on ⁠⁠⁠⁠⁠App Store⁠⁠⁠⁠⁠ and ⁠⁠⁠⁠⁠Play Store⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠Web App⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠Education Platform⁠⁠⁠⁠⁠ for Learners and Course Creators

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
(Part 2/N) Salesforce: Anypoint Design Center, Anypoint Code Builder IDE

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)

Play Episode Listen Later Dec 19, 2023 48:03


In this podcast episode, Krish explores the Anypoint Design Center and walks through the process of creating an API specification using RAML. He compares RAML and OpenAPI and discusses the advantages and disadvantages of each. Krish also troubleshoots and debugs issues in the Design Center, highlighting some inconsistencies and potential bugs. Overall, the episode provides an overview of the Design Center and offers insights into working with RAML and API specifications. #snowpal #salesforce #design Snowpal's Products: Backends as Services on ⁠⁠⁠⁠⁠AWS Marketplace⁠⁠⁠⁠⁠ Mobile Apps on ⁠⁠⁠⁠⁠App Store⁠⁠⁠⁠⁠ and ⁠⁠⁠⁠⁠Play Store⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠Web App⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠Education Platform⁠⁠⁠⁠⁠ for Learners and Course Creators
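As a reference point for the RAML-versus-OpenAPI comparison covered in this episode, here is what a minimal OpenAPI 3.0 description of a single endpoint looks like, written as a Python dict so it can be dumped to JSON or YAML. The endpoint and fields are invented for illustration; RAML would express the same contract in its own YAML dialect.

```python
import json

# A minimal, hypothetical OpenAPI 3.0 document describing one endpoint.
openapi_doc = {
    "openapi": "3.0.3",
    "info": {"title": "Example API", "version": "1.0.0"},
    "paths": {
        "/accounts/{id}": {
            "get": {
                "summary": "Fetch a single account",
                "parameters": [
                    {"name": "id", "in": "path", "required": True, "schema": {"type": "string"}}
                ],
                "responses": {
                    "200": {
                        "description": "The account",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "id": {"type": "string"},
                                        "name": {"type": "string"},
                                    },
                                }
                            }
                        },
                    }
                },
            }
        }
    },
}

print(json.dumps(openapi_doc, indent=2))
```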

Slick Talk: The Hospitality Podcast
Future Perspectives on Hospitality Technology | Richard Valtr

Slick Talk: The Hospitality Podcast

Play Episode Listen Later Nov 22, 2023 56:36


In this episode of Slick Talk, we invited Richard Valtr of Mews, a cloud-based property management software company, for another enlightening conversation. The discussion centers around several key points, including the evolution of Mews since our last conversation in July 2020, the immense potential of AI in the hospitality sector, and how a shift in perspective from viewing hospitality as merely 'real estate' to a provider of a 24-hour lifestyle experience can enhance operations and customer satisfaction. The podcast also covers the recent acquisitions by Mews and how they align with the company's vision, notably the acquirement of Nomi Travel, an AI-focused app, and the launch of Mews Ventures. Richard also shares some thoughtful insights about the role of AI and data in personalizing and improving the hospitality experience and his vision for the industry's future. This episode is brought to you by our sponsors at: Minut – Minut has more than just security features! They monitor noise, movement, and occupancy all within one device, and all Slick Talk listeners get 2 months FREE when they sign up with this link! Vintory - Scaling your property management company? Vintory is giving Slick Talk listeners a free digital copy of their book "From 0 to 500 Properties In 5 Years" and a $50 Amazon Gift Card for those who sign-up to do a demo! Safely.com – The best STR insurance that covers guests, owners, and managers! Making Safely a no-brainer! Hostfully – Use code SLICKTALK for 3 months free of their digital guidebook or $100 off their property management platform! 00:00 Introduction and Welcome 00:14 Reflecting on Past Episodes and Guest Appearances 01:09 Overview of Mews and its Role in the Hospitality Industry 02:24 Journey of Mews through the Pandemic 02:36 The Growth and Expansion of Mews 03:28 The Importance of Being an Ideas Company 04:03 The Evolution of Mews and its Impact on the Hospitality Industry 04:59 The Role of Mews in Enhancing Guest Experience 07:39 The Shift to Mid-Market and Enterprise Segment 07:54 The Importance of Relevance in the Hospitality Industry 08:20 The Strategy of Building for Independence and Scaling to Enterprise 16:33 The Importance of Being Relevant and Useful in the Hospitality Industry 17:16 The Role of Mews in Enhancing the 24-Hour Guest Experience 21:40 The Strategy of Mews for Acquisitions and Partnerships 24:12 The Importance of a Single Platform and Open API in the Hospitality Industry 25:39 The Importance of a Vibrant Ecosystem in Hospitality 26:56 The Vision for Future Ventures and the Need for Innovation 27:52 The Role of Hospitality in a 365-Day Experience 28:33 Theological Thinking in Hospitality and the Need for Startups 30:10 The Future of Workspaces: Hotels as Mixed-Use Real Estate Spaces 37:30 The Potential of AI in Enhancing Hospitality Experiences 42:43 The Future of Hospitality: A Vision for the Next 20 Years 45:06 The Role of Data in Personalizing Hospitality Experiences ——– Thank you for tuning into our podcast! Slick Talk is a Hospitality.FM production and you can find more of our shows at Hospitality.FM or anywhere else you listen to your podcasts! Listen to more episodes on our website and take a look at our amazing podcast and network sponsors that make this all possible! You can also listen to our Monday morning podcast, Good Morning Hospitality, where we dive into the industry as a whole in a more casual setting! 
If you ever want to contact us for guest suggestions or anything else related to the podcast, please fill out our contact form and we will be in touch! Last but not least, we love to connect on LinkedIn! Let's connect there so you can see the daily content we post beyond the podcast! Learn more about your ad choices. Visit megaphone.fm/adchoices

Elm Radio
095: elm-open-api with Wolfgang Schuster

Elm Radio

Play Episode Listen Later Nov 20, 2023 78:10


Wolfgang Schuster (github) (twitter) elm-open-api (NPM package) (Elm package) Akita (now part of Postman) JSON Schema dillonkearns/elm-form Wolfgang's Effortless SDKs blog post GraphQL Custom Scalar Codecs feature in dillonkearns/elm-graphql elm-units package OpenAPI Generator swagger-elm elm-open-api Real World example

Smart Software with SmartLogic
HTTP Requests in Elixir vs. JavaScript with Yordis Prieto & Stephen Chudleigh

Smart Software with SmartLogic

Play Episode Listen Later Oct 26, 2023 50:29


In today's episode, Sundi and Owen are joined by Yordis Prieto and Stephen Chudleigh to compare notes on HTTP requests in Elixir vs. Ruby, JavaScript, Go, and Rust. They cover common pain points when working with APIs, best practices, and lessons that can be learned from other programming languages. Yordis maintains Elixir's popular Tesla HTTP client library and shares insights from building APIs and maintaining open-source projects. Stephen has experience with Rails and JavaScript, and now works primarily in Elixir. They offer perspectives on testing HTTP requests and working with different libraries. While Elixir has matured, there is room for improvement - especially around richer struct parsing from HTTP responses. The discussion highlights ongoing efforts to improve the developer experience for HTTP clients in Elixir and other ecosystems. Topics Discussed in this Episode HTTP is a protocol - but each language has different implementation methods Tesla represents requests as middleware that can be modified before sending Testing HTTP requests can be a challenge due to dependence on outside systems GraphQL, OpenAPI, and JSON API provide clear request/response formats Elixir could improve richer parsing from HTTP into structs Focus on contribution ergonomics lowers barriers for new participants Maintainers emphasize making contributions easy via templates and clear documentation APIs drive adoption of standards for client/server contracts They discuss GraphQL, JSON API, OpenAPI schemas, and other standards that provide clear request/response formats TypeScript brings types to APIs and helps to validate responses Yordis notes that Go and Rust make requests simple via tags for mapping JSON to structs Language collaboration shares strengths from different ecosystems and inspires new libraries and tools for improving the programming experience Links Mentioned Elixir-Tesla Library: https://github.com/elixir-tesla/tesla Yordis on Github: https://github.com/yordis Yordis on Twitter: https://twitter.com/alchemist_ubi Yordis on LinkedIn: https://www.linkedin.com/in/yordisprieto/ Yordis on YouTube: https://www.youtube.com/@alchemistubi Stephen on Twitter: https://twitter.com/stepchud Stephen's projects on consciousness: https://harmonicdevelopment.us Owen suggests: Http.cat HTTParty: https://github.com/jnunemaker/httparty Guardian Library: https://github.com/ueberauth/guardian Axios: https://axios-http.com/ Straw Hat Fetcher: https://github.com/straw-hat-team/nodejs-monorepo/tree/master/packages/%40straw-hat/fetcher Elixir Tesla Wiki: https://github.com/elixir-tesla/tesla/wiki HTTPoison: https://github.com/edgurgel/httpoison Tesla Testing: https://hexdocs.pm/tesla/readme.html#testing Tesla Mock: https://hexdocs.pm/tesla/Tesla.Mock.html Finch: https://hex.pm/packages/finch Mojito: https://github.com/appcues/mojito Erlang Libraries and Frameworks Working Group: https://github.com/erlef/libs-and-frameworks/ and https://erlef.org/wg/libs-and-frameworks Special Guests: Stephen Chudleigh and Yordis Prieto.
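Tesla's core idea, discussed in this episode, is that each request passes through a stack of small middleware functions that can adjust it before it is sent. Tesla itself is Elixir; as a language-neutral illustration of the same pattern, here is a rough Python sketch in which every name is invented for the example and nothing reflects Tesla's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    method: str
    url: str
    headers: dict = field(default_factory=dict)

def base_url(prefix):
    def middleware(request, next_call):
        request.url = prefix + request.url   # prepend the API host
        return next_call(request)
    return middleware

def bearer_auth(token):
    def middleware(request, next_call):
        request.headers["Authorization"] = f"Bearer {token}"
        return next_call(request)
    return middleware

def run(middlewares, request, send):
    """Thread the request through each middleware, ending at the real `send` function."""
    def call(index, req):
        if index == len(middlewares):
            return send(req)
        return middlewares[index](req, lambda r: call(index + 1, r))
    return call(0, request)

# Usage: `send` would be a real HTTP call; here it just echoes the final request.
stack = [base_url("https://api.example.com"), bearer_auth("secret-token")]
print(run(stack, Request("GET", "/users"), send=lambda r: r))
```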

North Meets South Web Podcast
The one with all the JSON API stuff with TJ Miller

North Meets South Web Podcast

Play Episode Listen Later Oct 12, 2023 46:36


Jake and Michael are joined by TJ Miller to try and untangle their confusion about JSON API, Open API, Swagger, and JSON Schema from last episode.This episode is brought to you by our friends at Workvivo - The leading employee communication app.Show links Generate API Documentation for Laravel with Scramble OpenAPI JSON Schema JSON:API Swagger Joe Tennanbaum going full Norton Commander with Laravel Prompts Remote Procedure Call (RPC) spatie/laravel-data Pact Stoplight Redoc SwaggerHub MuleSoft Apiary
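One way to keep the terms apart: JSON Schema describes the shape of a JSON document, OpenAPI (formerly Swagger) uses JSON Schema to describe whole HTTP APIs, and JSON:API is a separate convention for how response bodies are structured. A small illustrative example using the third-party jsonschema package to validate a JSON:API-style response body; the schema and payload are made up.

```python
from jsonschema import validate  # pip install jsonschema

# A JSON Schema describing a minimal JSON:API-style single-resource response.
schema = {
    "type": "object",
    "required": ["data"],
    "properties": {
        "data": {
            "type": "object",
            "required": ["type", "id", "attributes"],
            "properties": {
                "type": {"type": "string"},
                "id": {"type": "string"},
                "attributes": {"type": "object"},
            },
        }
    },
}

response_body = {
    "data": {"type": "articles", "id": "1", "attributes": {"title": "JSON:API vs OpenAPI"}}
}

validate(instance=response_body, schema=schema)  # raises ValidationError if the shape is wrong
print("response body matches the schema")
```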

North Meets South Web Podcast
DIY woodwork, React micro-frontends, and confusing OpenJSONAPISchema

North Meets South Web Podcast

Play Episode Listen Later Sep 28, 2023 40:23


In this episode, Jake and Michael discuss building your own monitor stand, the mysterious world of React micro-frontends, and get confused about JSON API, Open API, Swagger, and JSON Schema.This episode is brought to you by our friends at Workvivo - The leading employee communication app.Show links DIY monitor stand Micro-frontends Module federation JSON:API OpenAPI vs JSON:API JSON:API, OpenAPI, and JSON Schema working in harmony sixlive/json-schema-assertions

Laravel News Podcast
Streaming jets, wrapping words, and ACID-compliant operations

Laravel News Podcast

Play Episode Listen Later Sep 6, 2023 55:59


Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community. (02:17) - Laravel 10.20 released (14:04) - Laravel 10.21 released (19:04) - Laravel 10.22 released (25:53) - Livewire v3 has been released (28:18) - Laravel Jetstream 4.0 with Livewire 3 support (29:28) - Laravel String Wordwrap (30:05) - Laracon EU 2024 - Save the date! (31:50) - Pest Driven Laravel course is now on Laracasts (32:58) - Mary UI: Laravel Blade components for Livewire 3 (34:42) - An introduction to Sharp for Laravel, an open source content management framework (36:36) - Generate Saloon SDKs from Postman or OpenAPI (39:56) - Boost your Eloquent models with Laravel Lift (46:09) - MJML to HTML using PHP (50:11) - Run GitHub Actions locally with Act (51:55) - Laravel API toolkit (53:01) - firstOrCreate() vs createOrFirst() (53:17) - Using AWS S3 for Laravel storage (53:33) - Using Language Servers in Sublime Text (54:09) - Installing Xdebug with Laravel Herd

Data Engineering Podcast
Eliminate The Overhead In Your Data Integration With The Open Source dlt Library

Data Engineering Podcast

Play Episode Listen Later Sep 4, 2023 42:12


Summary Cloud data warehouses and the introduction of the ELT paradigm has led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate overhead and bring data integration into your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) Your host is Tobias Macey and today I'm interviewing Adrian Brudaru about dlt, an open source python library for data loading Interview Introduction How did you get involved in the area of data management? Can you describe what dlt is and the story behind it? What is the problem you want to solve with dlt? Who is the target audience? The obvious comparison is with systems like Singer/Meltano/Airbyte in the open source space, or Fivetran/Matillion/etc. in the commercial space. What are the complexities or limitations of those tools that leave an opening for dlt? Can you describe how dlt is implemented? What are the benefits of building it in Python? 
How have the design and goals of the project changed since you first started working on it? How does that language choice influence the performance and scaling characteristics? What problems do users solve with dlt? What are the interfaces available for extending/customizing/integrating with dlt? Can you talk through the process of adding a new source/destination? What is the workflow for someone building a pipeline with dlt? How does the experience scale when supporting multiple connections? Given the limited scope of extract and load, and the composable design of dlt it seems like a purpose built companion to dbt (down to the naming). What are the benefits of using those tools in combination? What are the most interesting, innovative, or unexpected ways that you have seen dlt used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on dlt? When is dlt the wrong choice? What do you have planned for the future of dlt? Contact Info LinkedIn (https://www.linkedin.com/in/data-team/?originalSubdomain=de) Join our community to discuss further (https://join.slack.com/t/dlthub-community/shared_invite/zt-1slox199h-HAE7EQoXmstkP_bTqal65g) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com)) with your story. 
To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links dlt (https://dlthub.com/) Harness Success Story (https://dlthub.com/success-stories/harness/) Our guiding product principles (https://dlthub.com/product/) Ecosystem support (https://dlthub.com/docs/dlt-ecosystem) From basic to complex, dlt has many capabilities (https://dlthub.com/docs/getting-started/build-a-data-pipeline) Singer (https://www.singer.io/) Airbyte (https://airbyte.com/) Podcast Episode (https://www.dataengineeringpodcast.com/airbyte-open-source-data-integration-episode-173/) Meltano (https://meltano.com/) Podcast Episode (https://www.dataengineeringpodcast.com/meltano-data-integration-episode-141/) Matillion (https://www.matillion.com/) Podcast Episode (https://www.dataengineeringpodcast.com/matillion-cloud-data-integration-episode-286/) Fivetran (https://www.fivetran.com/) Podcast Episode (https://www.dataengineeringpodcast.com/fivetran-data-replication-episode-93/) DuckDB (https://duckdb.org/) Podcast Episode (https://www.dataengineeringpodcast.com/duckdb-in-process-olap-database-episode-270/) OpenAPI (https://www.openapis.org/) Data Mesh (https://martinfowler.com/articles/data-monolith-to-mesh.html) Podcast Episode (https://www.dataengineeringpodcast.com/data-mesh-revisited-episode-250/) SQLMesh (https://sqlmesh.com/) Podcast Episode (https://www.dataengineeringpodcast.com/sqlmesh-open-source-dataops-episode-380) Airflow (https://airflow.apache.org/) Dagster (https://dagster.io/) Podcast Episode (https://www.dataengineeringpodcast.com/dagster-data-platform-big-complexity-episode-239/) Prefect (https://www.prefect.io/) Podcast Episode (https://www.dataengineeringpodcast.com/prefect-workflow-engine-episode-86/) Alto (https://github.com/z3z1ma/alto) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
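For readers who want a feel for what "data loading as a library" means in the episode above, here is a rough sketch along the lines of dlt's documented quick-start: declare a pipeline, hand it plain Python iterables, and let it infer the schema and load the rows. Treat the exact argument names as approximate and check the dlt documentation for the current API.

```python
# A rough sketch of the dlt quick-start pattern; verify argument names against the current docs.
import dlt

# Any iterable of dicts can act as a source; these rows are made up.
rows = [
    {"id": 1, "email": "ada@example.com", "plan": "pro"},
    {"id": 2, "email": "grace@example.com", "plan": "free"},
]

pipeline = dlt.pipeline(
    pipeline_name="quickstart",
    destination="duckdb",    # local DuckDB file; swap for BigQuery, Snowflake, etc.
    dataset_name="crm",
)

info = pipeline.run(rows, table_name="users")  # infers the schema and loads the rows
print(info)
```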

Python Bytes
#343 So Much Pydantic!

Python Bytes

Play Episode Listen Later Jul 11, 2023 35:51


Watch on YouTube About the show Sponsored by us! Support our work through: Our courses at Talk Python Training Test & Code Podcast Patreon Supporters Connect with the hosts Michael: @mkennedy@fosstodon.org Brian: @brianokken@fosstodon.org Show: @pythonbytes@fosstodon.org Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Tuesdays at 11am PT. Older video versions available there too.
Michael #1: Pydantic v2 released Pydantic V2 is compatible with Python 3.7 and above. There is a migration guide. Check out the bump-pydantic tool to auto upgrade your classes.
Brian #2: Two Ways to Turbo-Charge tox (Hynek) Not just tox run-parallel or tox -p or tox --parallel, but you should know about that also. The 2 ways: build one wheel instead of N sdists, and run pytest in parallel. tox builds source distributions, sdists, for each environment before running tests. That's not really what we want, especially if we have a test matrix. It'd be better to build a wheel once, and use that for all the environments. Add this to your tox.ini and now we get one wheel build:
[testenv]
package = wheel
wheel_build_env = .pkg
It will save time. And a lot if you have a lengthy build. Run pytest in parallel, instead of tox in parallel, with pytest -n auto. Requires the pytest-xdist plugin. Can slow down tests if your tests are pretty fast anyway. If you're using hypothesis, you probably want to try this. There are some gotchas and workarounds (like getting coverage to work) in the article.
Michael #3: Awesome Pydantic A curated list of awesome things related to Pydantic!
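For a quick taste of the v2 API mentioned in the first item, here is a small example using the renamed v2 methods (model_validate / model_dump); the model and values are made up for illustration.

```python
from pydantic import BaseModel, Field, ValidationError

class Episode(BaseModel):
    number: int = Field(gt=0)
    title: str
    duration_minutes: float

# v2 renames parse_obj -> model_validate and dict() -> model_dump()
ep = Episode.model_validate({"number": 343, "title": "So Much Pydantic!", "duration_minutes": 35.85})
print(ep.model_dump())

try:
    Episode.model_validate({"number": -1, "title": "bad", "duration_minutes": "n/a"})
except ValidationError as err:
    print(err.error_count(), "validation errors")
```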