Podcasts about rlhf

  • 83PODCASTS
  • 382EPISODES
  • 38mAVG DURATION
  • 1WEEKLY EPISODE
  • Feb 6, 2026LATEST

POPULARITY

20192020202120222023202420252026


Best podcasts about rlhf

Show all podcasts related to rlhf

Latest podcast episodes about rlhf

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
The First Mechanistic Interpretability Frontier Lab — Myra Deng & Mark Bissell of Goodfire AI

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Feb 6, 2026 68:01


From Palantir and Two Sigma to building Goodfire into the poster-child for actionable mechanistic interpretability, Mark Bissell (Member of Technical Staff) and Myra Deng (Head of Product) are trying to turn “peeking inside the model” into a repeatable production workflow by shipping APIs, landing real enterprise deployments, and now scaling the bet with a recent $150M Series B funding round at a $1.25B valuation.In this episode, we go far beyond the usual “SAEs are cool” take. We talk about Goodfire's core bet: that the AI lifecycle is still fundamentally broken because the only reliable control we have is data and we post-train, RLHF, and fine-tune by “slurping supervision through a straw,” hoping the model picks up the right behaviors while quietly absorbing the wrong ones. Goodfire's answer is to build a bi-directional interface between humans and models: read what's happening inside, edit it surgically, and eventually use interpretability during training so customization isn't just brute-force guesswork.Mark and Myra walk through what that looks like when you stop treating interpretability like a lab demo and start treating it like infrastructure: lightweight probes that add near-zero latency, token-level safety filters that can run at inference time, and interpretability workflows that survive messy constraints (multilingual inputs, synthetic→real transfer, regulated domains, no access to sensitive data). We also get a live window into what “frontier-scale interp” means operationally (i.e. steering a trillion-parameter model in real time by targeting internal features) plus why the same tooling generalizes cleanly from language models to genomics, medical imaging, and “pixel-space” world models.We discuss:* Myra + Mark's path: Palantir (health systems, forward-deployed engineering) → Goodfire early team; Two Sigma → Head of Product, translating frontier interpretability research into a platform and real-world deployments* What “interpretability” actually means in practice: not just post-hoc poking, but a broader “science of deep learning” approach across the full AI lifecycle (data curation → post-training → internal representations → model design)* Why post-training is the first big wedge: “surgical edits” for unintended behaviors likereward hacking, sycophancy, noise learned during customization plus the dream of targeted unlearning and bias removal without wrecking capabilities* SAEs vs probes in the real world: why SAE feature spaces sometimes underperform classifiers trained on raw activations for downstream detection tasks (hallucination, harmful intent, PII), and what that implies about “clean concept spaces”* Rakuten in production: deploying interpretability-based token-level PII detection at inference time to prevent routing private data to downstream providers plus the gnarly constraints: no training on real customer PII, synthetic→real transfer, English + Japanese, and tokenization quirks* Why interp can be operationally cheaper than LLM-judge guardrails: probes are lightweight, low-latency, and don't require hosting a second large model in the loop* Real-time steering at frontier scale: a demo of steering Kimi K2 (~1T params) live and finding features via SAE pipelines, auto-labeling via LLMs, and toggling a “Gen-Z slang” feature across multiple layers without breaking tool use* Hallucinations as an internal signal: the case that models have latent uncertainty / “user-pleasing” circuitry you can detect and potentially mitigate more directly than black-box methods* Steering vs prompting: the emerging view that activation steering and in-context learning are more closely connected than people think, including work mapping between the two (even for jailbreak-style behaviors)* Interpretability for science: using the same tooling across domains (genomics, medical imaging, materials) to debug spurious correlations and extract new knowledge up to and including early biomarker discovery work with major partners* World models + “pixel-space” interpretability: why vision/video models make concepts easier to see, how that accelerates the feedback loop, and why robotics/world-model partners are especially interesting design partners* The north star: moving from “data in, weights out” to intentional model design where experts can impart goals and constraints directly, not just via reward signals and brute-force post-training—Goodfire AI* Website: https://goodfire.ai* LinkedIn: https://www.linkedin.com/company/goodfire-ai/* X: https://x.com/GoodfireAIMyra Deng* Website: https://myradeng.com/* LinkedIn: https://www.linkedin.com/in/myra-deng/* X: https://x.com/myra_dengMark Bissell* LinkedIn: https://www.linkedin.com/in/mark-bissell/* X: https://x.com/MarkMBissellFull Video EpisodeTimestamps00:00:00 Introduction00:00:05 Introduction to the Latent Space Podcast and Guests from Goodfire00:00:29 What is Goodfire? Mission and Focus on Interpretability00:01:01 Goodfire's Practical Approach to Interpretability00:01:37 Goodfire's Series B Fundraise Announcement00:02:04 Backgrounds of Mark and Myra from Goodfire00:02:51 Team Structure and Roles at Goodfire00:05:13 What is Interpretability? Definitions and Techniques00:05:30 Understanding Errors00:07:29 Post-training vs. Pre-training Interpretability Applications00:08:51 Using Interpretability to Remove Unwanted Behaviors00:10:09 Grokking, Double Descent, and Generalization in Models00:10:15 404 Not Found Explained00:12:06 Subliminal Learning and Hidden Biases in Models00:14:07 How Goodfire Chooses Research Directions and Projects00:15:00 Troubleshooting Errors00:16:04 Limitations of SAEs and Probes in Interpretability00:18:14 Rakuten Case Study: Production Deployment of Interpretability00:20:45 Conclusion00:21:12 Efficiency Benefits of Interpretability Techniques00:21:26 Live Demo: Real-Time Steering in a Trillion Parameter Model00:25:15 How Steering Features are Identified and Labeled00:26:51 Detecting and Mitigating Hallucinations Using Interpretability00:31:20 Equivalence of Activation Steering and Prompting00:34:06 Comparing Steering with Fine-Tuning and LoRA Techniques00:36:04 Model Design and the Future of Intentional AI Development00:38:09 Getting Started in Mechinterp: Resources, Programs, and Open Problems00:40:51 Industry Applications and the Rise of Mechinterp in Practice00:41:39 Interpretability for Code Models and Real-World Usage00:43:07 Making Steering Useful for More Than Stylistic Edits00:46:17 Applying Interpretability to Healthcare and Scientific Discovery00:49:15 Why Interpretability is Crucial in High-Stakes Domains like Healthcare00:52:03 Call for Design Partners Across Domains00:54:18 Interest in World Models and Visual Interpretability00:57:22 Sci-Fi Inspiration: Ted Chiang and Interpretability01:00:14 Interpretability, Safety, and Alignment Perspectives01:04:27 Weak-to-Strong Generalization and Future Alignment Challenges01:05:38 Final Thoughts and Hiring/Collaboration Opportunities at GoodfireTranscriptShawn Wang [00:00:05]: So welcome to the Latent Space pod. We're back in the studio with our special MechInterp co-host, Vibhu. Welcome. Mochi, Mochi's special co-host. And Mochi, the mechanistic interpretability doggo. We have with us Mark and Myra from Goodfire. Welcome. Thanks for having us on. Maybe we can sort of introduce Goodfire and then introduce you guys. How do you introduce Goodfire today?Myra Deng [00:00:29]: Yeah, it's a great question. So Goodfire, we like to say, is an AI research lab that focuses on using interpretability to understand, learn from, and design AI models. And we really believe that interpretability will unlock the new generation, next frontier of safe and powerful AI models. That's our description right now, and I'm excited to dive more into the work we're doing to make that happen.Shawn Wang [00:00:55]: Yeah. And there's always like the official description. Is there an understatement? Is there an unofficial one that sort of resonates more with a different audience?Mark Bissell [00:01:01]: Well, being an AI research lab that's focused on interpretability, there's obviously a lot of people have a lot that they think about when they think of interpretability. And I think we have a pretty broad definition of what that means and the types of places that can be applied. And in particular, applying it in production scenarios, in high stakes industries, and really taking it sort of from the research world into the real world. Which, you know. It's a new field, so that hasn't been done all that much. And we're excited about actually seeing that sort of put into practice.Shawn Wang [00:01:37]: Yeah, I would say it wasn't too long ago that Anthopic was like still putting out like toy models or superposition and that kind of stuff. And I wouldn't have pegged it to be this far along. When you and I talked at NeurIPS, you were talking a little bit about your production use cases and your customers. And then not to bury the lead, today we're also announcing the fundraise, your Series B. $150 million. $150 million at a 1.25B valuation. Congrats, Unicorn.Mark Bissell [00:02:02]: Thank you. Yeah, no, things move fast.Shawn Wang [00:02:04]: We were talking to you in December and already some big updates since then. Let's dive, I guess, into a bit of your backgrounds as well. Mark, you were at Palantir working on health stuff, which is really interesting because the Goodfire has some interesting like health use cases. I don't know how related they are in practice.Mark Bissell [00:02:22]: Yeah, not super related, but I don't know. It was helpful context to know what it's like. Just to work. Just to work with health systems and generally in that domain. Yeah.Shawn Wang [00:02:32]: And Mara, you were at Two Sigma, which actually I was also at Two Sigma back in the day. Wow, nice.Myra Deng [00:02:37]: Did we overlap at all?Shawn Wang [00:02:38]: No, this is when I was briefly a software engineer before I became a sort of developer relations person. And now you're head of product. What are your sort of respective roles, just to introduce people to like what all gets done in Goodfire?Mark Bissell [00:02:51]: Yeah, prior to Goodfire, I was at Palantir for about three years as a forward deployed engineer, now a hot term. Wasn't always that way. And as a technical lead on the health care team and at Goodfire, I'm a member of the technical staff. And honestly, that I think is about as specific as like as as I could describe myself because I've worked on a range of things. And, you know, it's it's a fun time to be at a team that's still reasonably small. I think when I joined one of the first like ten employees, now we're above 40, but still, it looks like there's always a mix of research and engineering and product and all of the above. That needs to get done. And I think everyone across the team is, you know, pretty, pretty switch hitter in the roles they do. So I think you've seen some of the stuff that I worked on related to image models, which was sort of like a research demo. More recently, I've been working on our scientific discovery team with some of our life sciences partners, but then also building out our core platform for more of like flexing some of the kind of MLE and developer skills as well.Shawn Wang [00:03:53]: Very generalist. And you also had like a very like a founding engineer type role.Myra Deng [00:03:58]: Yeah, yeah.Shawn Wang [00:03:59]: So I also started as I still am a member of technical staff, did a wide range of things from the very beginning, including like finding our office space and all of this, which is we both we both visited when you had that open house thing. It was really nice.Myra Deng [00:04:13]: Thank you. Thank you. Yeah. Plug to come visit our office.Shawn Wang [00:04:15]: It looked like it was like 200 people. It has room for 200 people. But you guys are like 10.Myra Deng [00:04:22]: For a while, it was very empty. But yeah, like like Mark, I spend. A lot of my time as as head of product, I think product is a bit of a weird role these days, but a lot of it is thinking about how do we take our frontier research and really apply it to the most important real world problems and how does that then translate into a platform that's repeatable or a product and working across, you know, the engineering and research teams to make that happen and also communicating to the world? Like, what is interpretability? What is it used for? What is it good for? Why is it so important? All of these things are part of my day-to-day as well.Shawn Wang [00:05:01]: I love like what is things because that's a very crisp like starting point for people like coming to a field. They all do a fun thing. Vibhu, why don't you want to try tackling what is interpretability and then they can correct us.Vibhu Sapra [00:05:13]: Okay, great. So I think like one, just to kick off, it's a very interesting role to be head of product, right? Because you guys, at least as a lab, you're more of an applied interp lab, right? Which is pretty different than just normal interp, like a lot of background research. But yeah. You guys actually ship an API to try these things. You have Ember, you have products around it, which not many do. Okay. What is interp? So basically you're trying to have an understanding of what's going on in model, like in the model, in the internal. So different approaches to do that. You can do probing, SAEs, transcoders, all this stuff. But basically you have an, you have a hypothesis. You have something that you want to learn about what's happening in a model internals. And then you're trying to solve that from there. You can do stuff like you can, you know, you can do activation mapping. You can try to do steering. There's a lot of stuff that you can do, but the key question is, you know, from input to output, we want to have a better understanding of what's happening and, you know, how can we, how can we adjust what's happening on the model internals? How'd I do?Mark Bissell [00:06:12]: That was really good. I think that was great. I think it's also a, it's kind of a minefield of a, if you ask 50 people who quote unquote work in interp, like what is interpretability, you'll probably get 50 different answers. And. Yeah. To some extent also like where, where good fire sits in the space. I think that we're an AI research company above all else. And interpretability is a, is a set of methods that we think are really useful and worth kind of specializing in, in order to accomplish the goals we want to accomplish. But I think we also sort of see some of the goals as even more broader as, as almost like the science of deep learning and just taking a not black box approach to kind of any part of the like AI development life cycle, whether that. That means using interp for like data curation while you're training your model or for understanding what happened during post-training or for the, you know, understanding activations and sort of internal representations, what is in there semantically. And then a lot of sort of exciting updates that were, you know, are sort of also part of the, the fundraise around bringing interpretability to training, which I don't think has been done all that much before. A lot of this stuff is sort of post-talk poking at models as opposed to. To actually using this to intentionally design them.Shawn Wang [00:07:29]: Is this post-training or pre-training or is that not a useful.Myra Deng [00:07:33]: Currently focused on post-training, but there's no reason the techniques wouldn't also work in pre-training.Shawn Wang [00:07:38]: Yeah. It seems like it would be more active, applicable post-training because basically I'm thinking like rollouts or like, you know, having different variations of a model that you can tweak with the, with your steering. Yeah.Myra Deng [00:07:50]: And I think in a lot of the news that you've seen in, in, on like Twitter or whatever, you've seen a lot of unintended. Side effects come out of post-training processes, you know, overly sycophantic models or models that exhibit strange reward hacking behavior. I think these are like extreme examples. There's also, you know, very, uh, mundane, more mundane, like enterprise use cases where, you know, they try to customize or post-train a model to do something and it learns some noise or it doesn't appropriately learn the target task. And a big question that we've always had is like, how do you use your understanding of what the model knows and what it's doing to actually guide the learning process?Shawn Wang [00:08:26]: Yeah, I mean, uh, you know, just to anchor this for people, uh, one of the biggest controversies of last year was 4.0 GlazeGate. I've never heard of GlazeGate. I didn't know that was what it was called. The other one, they called it that on the blog post and I was like, well, how did OpenAI call it? Like officially use that term. And I'm like, that's funny, but like, yeah, I guess it's the pitch that if they had worked a good fire, they wouldn't have avoided it. Like, you know what I'm saying?Myra Deng [00:08:51]: I think so. Yeah. Yeah.Mark Bissell [00:08:53]: I think that's certainly one of the use cases. I think. Yeah. Yeah. I think the reason why post-training is a place where this makes a lot of sense is a lot of what we're talking about is surgical edits. You know, you want to be able to have expert feedback, very surgically change how your model is doing, whether that is, you know, removing a certain behavior that it has. So, you know, one of the things that we've been looking at or is, is another like common area where you would want to make a somewhat surgical edit is some of the models that have say political bias. Like you look at Quen or, um, R1 and they have sort of like this CCP bias.Shawn Wang [00:09:27]: Is there a CCP vector?Mark Bissell [00:09:29]: Well, there's, there are certainly internal, yeah. Parts of the representation space where you can sort of see where that lives. Yeah. Um, and you want to kind of, you know, extract that piece out.Shawn Wang [00:09:40]: Well, I always say, you know, whenever you find a vector, a fun exercise is just like, make it very negative to see what the opposite of CCP is.Mark Bissell [00:09:47]: The super America, bald eagles flying everywhere. But yeah. So in general, like lots of post-training tasks where you'd want to be able to, to do that. Whether it's unlearning a certain behavior or, you know, some of the other kind of cases where this comes up is, are you familiar with like the, the grokking behavior? I mean, I know the machine learning term of grokking.Shawn Wang [00:10:09]: Yeah.Mark Bissell [00:10:09]: Sort of this like double descent idea of, of having a model that is able to learn a generalizing, a generalizing solution, as opposed to even if memorization of some task would suffice, you want it to learn the more general way of doing a thing. And so, you know, another. A way that you can think about having surgical access to a model's internals would be learn from this data, but learn in the right way. If there are many possible, you know, ways to, to do that. Can make interp solve the double descent problem?Shawn Wang [00:10:41]: Depends, I guess, on how you. Okay. So I, I, I viewed that double descent as a problem because then you're like, well, if the loss curves level out, then you're done, but maybe you're not done. Right. Right. But like, if you actually can interpret what is a generalizing or what you're doing. What is, what is still changing, even though the loss is not changing, then maybe you, you can actually not view it as a double descent problem. And actually you're just sort of translating the space in which you view loss and like, and then you have a smooth curve. Yeah.Mark Bissell [00:11:11]: I think that's certainly like the domain of, of problems that we're, that we're looking to get.Shawn Wang [00:11:15]: Yeah. To me, like double descent is like the biggest thing to like ML research where like, if you believe in scaling, then you don't need, you need to know where to scale. And. But if you believe in double descent, then you don't, you don't believe in anything where like anything levels off, like.Vibhu Sapra [00:11:30]: I mean, also tendentially there's like, okay, when you talk about the China vector, right. There's the subliminal learning work. It was from the anthropic fellows program where basically you can have hidden biases in a model. And as you distill down or, you know, as you train on distilled data, those biases always show up, even if like you explicitly try to not train on them. So, you know, it's just like another use case of. Okay. If we can interpret what's happening in post-training, you know, can we clear some of this? Can we even determine what's there? Because yeah, it's just like some worrying research that's out there that shows, you know, we really don't know what's going on.Mark Bissell [00:12:06]: That is. Yeah. I think that's the biggest sentiment that we're sort of hoping to tackle. Nobody knows what's going on. Right. Like subliminal learning is just an insane concept when you think about it. Right. Train a model on not even the logits, literally the output text of a bunch of random numbers. And now your model loves owls. And you see behaviors like that, that are just, they defy, they defy intuition. And, and there are mathematical explanations that you can get into, but. I mean.Shawn Wang [00:12:34]: It feels so early days. Objectively, there are a sequence of numbers that are more owl-like than others. There, there should be.Mark Bissell [00:12:40]: According to, according to certain models. Right. It's interesting. I think it only applies to models that were initialized from the same starting Z. Usually, yes.Shawn Wang [00:12:49]: But I mean, I think that's a, that's a cheat code because there's not enough compute. But like if you believe in like platonic representation, like probably it will transfer across different models as well. Oh, you think so?Mark Bissell [00:13:00]: I think of it more as a statistical artifact of models initialized from the same seed sort of. There's something that is like path dependent from that seed that might cause certain overlaps in the latent space and then sort of doing this distillation. Yeah. Like it pushes it towards having certain other tendencies.Vibhu Sapra [00:13:24]: Got it. I think there's like a bunch of these open-ended questions, right? Like you can't train in new stuff during the RL phase, right? RL only reorganizes weights and you can only do stuff that's somewhat there in your base model. You're not learning new stuff. You're just reordering chains and stuff. But okay. My broader question is when you guys work at an interp lab, how do you decide what to work on and what's kind of the thought process? Right. Because we can ramble for hours. Okay. I want to know this. I want to know that. But like, how do you concretely like, you know, what's the workflow? Okay. There's like approaches towards solving a problem, right? I can try prompting. I can look at chain of thought. I can train probes, SAEs. But how do you determine, you know, like, okay, is this going anywhere? Like, do we have set stuff? Just, you know, if you can help me with all that. Yeah.Myra Deng [00:14:07]: It's a really good question. I feel like we've always at the very beginning of the company thought about like, let's go and try to learn what isn't working in machine learning today. Whether that's talking to customers or talking to researchers at other labs, trying to understand both where the frontier is going and where things are really not falling apart today. And then developing a perspective on how we can push the frontier using interpretability methods. And so, you know, even our chief scientist, Tom, spends a lot of time talking to customers and trying to understand what real world problems are and then taking that back and trying to apply the current state of the art to those problems and then seeing where they fall down basically. And then using those failures or those shortcomings to understand what hills to climb when it comes to interpretability research. So like on the fundamental side, for instance, when we have done some work applying SAEs and probes, we've encountered, you know, some shortcomings in SAEs that we found a little bit surprising. And so have gone back to the drawing board and done work on that. And then, you know, we've done some work on better foundational interpreter models. And a lot of our team's research is focused on what is the next evolution beyond SAEs, for instance. And then when it comes to like control and design of models, you know, we tried steering with our first API and realized that it still fell short of black box techniques like prompting or fine tuning. And so went back to the drawing board and we're like, how do we make that not the case and how do we improve it beyond that? And one of our researchers, Ekdeep, who just joined is actually Ekdeep and Atticus are like steering experts and have spent a lot of time trying to figure out like, what is the research that enables us to actually do this in a much more powerful, robust way? So yeah, the answer is like, look at real world problems, try to translate that into a research agenda and then like hill climb on both of those at the same time.Shawn Wang [00:16:04]: Yeah. Mark has the steering CLI demo queued up, which we're going to go into in a sec. But I always want to double click on when you drop hints, like we found some problems with SAEs. Okay. What are they? You know, and then we can go into the demo. Yeah.Myra Deng [00:16:19]: I mean, I'm curious if you have more thoughts here as well, because you've done it in the healthcare domain. But I think like, for instance, when we do things like trying to detect behaviors within models that are harmful or like behaviors that a user might not want to have in their model. So hallucinations, for instance, harmful intent, PII, all of these things. We first tried using SAE probes for a lot of these tasks. So taking the feature activation space from SAEs and then training classifiers on top of that, and then seeing how well we can detect the properties that we might want to detect in model behavior. And we've seen in many cases that probes just trained on raw activations seem to perform better than SAE probes, which is a bit surprising if you think that SAEs are actually also capturing the concepts that you would want to capture cleanly and more surgically. And so that is an interesting observation. I don't think that is like, I'm not down on SAEs at all. I think there are many, many things they're useful for, but we have definitely run into cases where I think the concept space described by SAEs is not as clean and accurate as we would expect it to be for actual like real world downstream performance metrics.Mark Bissell [00:17:34]: Fair enough. Yeah. It's the blessing and the curse of unsupervised methods where you get to peek into the AI's mind. But sometimes you wish that you saw other things when you walked inside there. Although in the PII instance, I think weren't an SAE based approach actually did prove to be the most generalizable?Myra Deng [00:17:53]: It did work well in the case that we published with Rakuten. And I think a lot of the reasons it worked well was because we had a noisier data set. And so actually the blessing of unsupervised learning is that we actually got to get more meaningful, generalizable signal from SAEs when the data was noisy. But in other cases where we've had like good data sets, it hasn't been the case.Shawn Wang [00:18:14]: And just because you named Rakuten and I don't know if we'll get it another chance, like what is the overall, like what is Rakuten's usage or production usage? Yeah.Myra Deng [00:18:25]: So they are using us to essentially guardrail and inference time monitor their language model usage and their agent usage to detect things like PII so that they don't route private user information.Myra Deng [00:18:41]: And so that's, you know, going through all of their user queries every day. And that's something that we deployed with them a few months ago. And now we are actually exploring very early partnerships, not just with Rakuten, but with other people around how we can help with potentially training and customization use cases as well. Yeah.Shawn Wang [00:19:03]: And for those who don't know, like it's Rakuten is like, I think number one or number two e-commerce store in Japan. Yes. Yeah.Mark Bissell [00:19:10]: And I think that use case actually highlights a lot of like what it looks like to deploy things in practice that you don't always think about when you're doing sort of research tasks. So when you think about some of the stuff that came up there that's more complex than your idealized version of a problem, they were encountering things like synthetic to real transfer of methods. So they couldn't train probes, classifiers, things like that on actual customer data of PII. So what they had to do is use synthetic data sets. And then hope that that transfer is out of domain to real data sets. And so we can evaluate performance on the real data sets, but not train on customer PII. So that right off the bat is like a big challenge. You have multilingual requirements. So this needed to work for both English and Japanese text. Japanese text has all sorts of quirks, including tokenization behaviors that caused lots of bugs that caused us to be pulling our hair out. And then also a lot of tasks you'll see. You might make simplifying assumptions if you're sort of treating it as like the easiest version of the problem to just sort of get like general results where maybe you say you're classifying a sentence to say, does this contain PII? But the need that Rakuten had was token level classification so that you could precisely scrub out the PII. So as we learned more about the problem, you're sort of speaking about what that looks like in practice. Yeah. A lot of assumptions end up breaking. And that was just one instance where you. A problem that seems simple right off the bat ends up being more complex as you keep diving into it.Vibhu Sapra [00:20:41]: Excellent. One of the things that's also interesting with Interp is a lot of these methods are very efficient, right? So where you're just looking at a model's internals itself compared to a separate like guardrail, LLM as a judge, a separate model. One, you have to host it. Two, there's like a whole latency. So if you use like a big model, you have a second call. Some of the work around like self detection of hallucination, it's also deployed for efficiency, right? So if you have someone like Rakuten doing it in production live, you know, that's just another thing people should consider.Mark Bissell [00:21:12]: Yeah. And something like a probe is super lightweight. Yeah. It's no extra latency really. Excellent.Shawn Wang [00:21:17]: You have the steering demos lined up. So we were just kind of see what you got. I don't, I don't actually know if this is like the latest, latest or like alpha thing.Mark Bissell [00:21:26]: No, this is a pretty hacky demo from from a presentation that someone else on the team recently gave. So this will give a sense for, for technology. So you can see the steering and action. Honestly, I think the biggest thing that this highlights is that as we've been growing as a company and taking on kind of more and more ambitious versions of interpretability related problems, a lot of that comes to scaling up in various different forms. And so here you're going to see steering on a 1 trillion parameter model. This is Kimi K2. And so it's sort of fun that in addition to the research challenges, there are engineering challenges that we're now tackling. Cause for any of this to be sort of useful in production, you need to be thinking about what it looks like when you're using these methods on frontier models as opposed to sort of like toy kind of model organisms. So yeah, this was thrown together hastily, pretty fragile behind the scenes, but I think it's quite a fun demo. So screen sharing is on. So I've got two terminal sessions pulled up here. On the left is a forked version that we have of the Kimi CLI that we've got running to point at our custom hosted Kimi model. And then on the right is a set up that will allow us to steer on certain concepts. So I should be able to chat with Kimi over here. Tell it hello. This is running locally. So the CLI is running locally, but the Kimi server is running back to the office. Well, hopefully should be, um, that's too much to run on that Mac. Yeah. I think it's, uh, it takes a full, like each 100 node. I think it's like, you can. You can run it on eight GPUs, eight 100. So, so yeah, Kimi's running. We can ask it a prompt. It's got a forked version of our, uh, of the SG line code base that we've been working on. So I'm going to tell it, Hey, this SG line code base is slow. I think there's a bug. Can you try to figure it out? There's a big code base, so it'll, it'll spend some time doing this. And then on the right here, I'm going to initialize in real time. Some steering. Let's see here.Mark Bissell [00:23:33]: searching for any. Bugs. Feature ID 43205.Shawn Wang [00:23:38]: Yeah.Mark Bissell [00:23:38]: 20, 30, 40. So let me, uh, this is basically a feature that we found that inside Kimi seems to cause it to speak in Gen Z slang. And so on the left, it's still sort of thinking normally it might take, I don't know, 15 seconds for this to kick in, but then we're going to start hopefully seeing him do this code base is massive for real. So we're going to start. We're going to start seeing Kimi transition as the steering kicks in from normal Kimi to Gen Z Kimi and both in its chain of thought and its actual outputs.Mark Bissell [00:24:19]: And interestingly, you can see, you know, it's still able to call tools, uh, and stuff. It's um, it's purely sort of it's it's demeanor. And there are other features that we found for interesting things like concision. So that's more of a practical one. You can make it more concise. Um, the types of programs, uh, programming languages that uses, but yeah, as we're seeing it come in. Pretty good. Outputs.Shawn Wang [00:24:43]: Scheduler code is actually wild.Vibhu Sapra [00:24:46]: Yo, this code is actually insane, bro.Vibhu Sapra [00:24:53]: What's the process of training in SAE on this, or, you know, how do you label features? I know you guys put out a pretty cool blog post about, um, finding this like autonomous interp. Um, something. Something about how agents for interp is different than like coding agents. I don't know while this is spewing up, but how, how do we find feature 43, two Oh five. Yeah.Mark Bissell [00:25:15]: So in this case, um, we, our platform that we've been building out for a long time now supports all the sort of classic out of the box interp techniques that you might want to have like SAE training, probing things of that kind, I'd say the techniques for like vanilla SAEs are pretty well established now where. You take your model that you're interpreting, run a whole bunch of data through it, gather activations, and then yeah, pretty straightforward pipeline to train an SAE. There are a lot of different varieties. There's top KSAEs, batch top KSAEs, um, normal ReLU SAEs. And then once you have your sparse features to your point, assigning labels to them to actually understand that this is a gen Z feature, that's actually where a lot of the kind of magic happens. Yeah. And the most basic standard technique is look at all of your d input data set examples that cause this feature to fire most highly. And then you can usually pick out a pattern. So for this feature, If I've run a diverse enough data set through my model feature 43, two Oh five. Probably tends to fire on all the tokens that sounds like gen Z slang. You know, that's the, that's the time of year to be like, Oh, I'm in this, I'm in this Um, and, um, so, you know, you could have a human go through all 43,000 concepts andVibhu Sapra [00:26:34]: And I've got to ask the basic question, you know, can we get examples where it hallucinates, pass it through, see what feature activates for hallucinations? Can I just, you know, turn hallucination down?Myra Deng [00:26:51]: Oh, wow. You really predicted a project we're already working on right now, which is detecting hallucinations using interpretability techniques. And this is interesting because hallucinations is something that's very hard to detect. And it's like a kind of a hairy problem and something that black box methods really struggle with. Whereas like Gen Z, you could always train a simple classifier to detect that hallucinations is harder. But we've seen that models internally have some... Awareness of like uncertainty or some sort of like user pleasing behavior that leads to hallucinatory behavior. And so, yeah, we have a project that's trying to detect that accurately. And then also working on mitigating the hallucinatory behavior in the model itself as well.Shawn Wang [00:27:39]: Yeah, I would say most people are still at the level of like, oh, I would just turn temperature to zero and that turns off hallucination. And I'm like, well, that's a fundamental misunderstanding of how this works. Yeah.Mark Bissell [00:27:51]: Although, so part of what I like about that question is you, there are SAE based approaches that might like help you get at that. But oftentimes the beauty of SAEs and like we said, the curse is that they're unsupervised. So when you have a behavior that you deliberately would like to remove, and that's more of like a supervised task, often it is better to use something like probes and specifically target the thing that you're interested in reducing as opposed to sort of like hoping that when you fragment the latent space, one of the vectors that pops out.Vibhu Sapra [00:28:20]: And as much as we're training an autoencoder to be sparse, we're not like for sure certain that, you know, we will get something that just correlates to hallucination. You'll probably split that up into 20 other things and who knows what they'll be.Mark Bissell [00:28:36]: Of course. Right. Yeah. So there's no sort of problems with like feature splitting and feature absorption. And then there's the off target effects, right? Ideally, you would want to be very precise where if you reduce the hallucination feature, suddenly maybe your model can't write. Creatively anymore. And maybe you don't like that, but you want to still stop it from hallucinating facts and figures.Shawn Wang [00:28:55]: Good. So Vibhu has a paper to recommend there that we'll put in the show notes. But yeah, I mean, I guess just because your demo is done, any any other things that you want to highlight or any other interesting features you want to show?Mark Bissell [00:29:07]: I don't think so. Yeah. Like I said, this is a pretty small snippet. I think the main sort of point here that I think is exciting is that there's not a whole lot of inter being applied to models quite at this scale. You know, Anthropic certainly has some some. Research and yeah, other other teams as well. But it's it's nice to see these techniques, you know, being put into practice. I think not that long ago, the idea of real time steering of a trillion parameter model would have sounded.Shawn Wang [00:29:33]: Yeah. The fact that it's real time, like you started the thing and then you edited the steering vector.Vibhu Sapra [00:29:38]: I think it's it's an interesting one TBD of what the actual like production use case would be on that, like the real time editing. It's like that's the fun part of the demo, right? You can kind of see how this could be served behind an API, right? Like, yes, you're you only have so many knobs and you can just tweak it a bit more. And I don't know how it plays in. Like people haven't done that much with like, how does this work with or without prompting? Right. How does this work with fine tuning? Like, there's a whole hype of continual learning, right? So there's just so much to see. Like, is this another parameter? Like, is it like parameter? We just kind of leave it as a default. We don't use it. So I don't know. Maybe someone here wants to put out a guide on like how to use this with prompting when to do what?Mark Bissell [00:30:18]: Oh, well, I have a paper recommendation. I think you would love from Act Deep on our team, who is an amazing researcher, just can't say enough amazing things about Act Deep. But he actually has a paper that as well as some others from the team and elsewhere that go into the essentially equivalence of activation steering and in context learning and how those are from a he thinks of everything in a cognitive neuroscience Bayesian framework, but basically how you can precisely show how. Prompting in context, learning and steering exhibit similar behaviors and even like get quantitative about the like magnitude of steering you would need to do to induce a certain amount of behavior similar to certain prompting, even for things like jailbreaks and stuff. It's a really cool paper. Are you saying steering is less powerful than prompting? More like you can almost write a formula that tells you how to convert between the two of them.Myra Deng [00:31:20]: And so like formally equivalent actually in the in the limit. Right.Mark Bissell [00:31:24]: So like one case study of this is for jailbreaks there. I don't know. Have you seen the stuff where you can do like many shot jailbreaking? You like flood the context with examples of the behavior. And the topic put out that paper.Shawn Wang [00:31:38]: A lot of people were like, yeah, we've been doing this, guys.Mark Bissell [00:31:40]: Like, yeah, what's in this in context learning and activation steering equivalence paper is you can like predict the number. Number of examples that you will need to put in there in order to jailbreak the model. That's cool. By doing steering experiments and using this sort of like equivalence mapping. That's cool. That's really cool. It's very neat. Yeah.Shawn Wang [00:32:02]: I was going to say, like, you know, I can like back rationalize that this makes sense because, you know, what context is, is basically just, you know, it updates the KV cache kind of and like and then every next token inference is still like, you know, the sheer sum of everything all the way. It's plus all the context. It's up to date. And you could, I guess, theoretically steer that with you probably replace that with your steering. The only problem is steering typically is on one layer, maybe three layers like like you did. So it's like not exactly equivalent.Mark Bissell [00:32:33]: Right, right. There's sort of you need to get precise about, yeah, like how you sort of define steering and like what how you're modeling the setup. But yeah, I've got the paper pulled up here. Belief dynamics reveal the dual nature. Yeah. The title is Belief Dynamics Reveal the Dual Nature of Incompetence. And it's an exhibition of the practical context learning and activation steering. So Eric Bigelow, Dan Urgraft on the who are doing fellowships at Goodfire, Ekt Deep's the final author there.Myra Deng [00:32:59]: I think actually to your question of like, what is the production use case of steering? I think maybe if you just think like one level beyond steering as it is today. Like imagine if you could adapt your model to be, you know, an expert legal reasoner. Like in almost real time, like very quickly. efficiently using human feedback or using like your semantic understanding of what the model knows and where it knows that behavior. I think that while it's not clear what the product is at the end of the day, it's clearly very valuable. Thinking about like what's the next interface for model customization and adaptation is a really interesting problem for us. Like we have heard a lot of people actually interested in fine-tuning an RL for open weight models in production. And so people are using things like Tinker or kind of like open source libraries to do that, but it's still very difficult to get models fine-tuned and RL'd for exactly what you want them to do unless you're an expert at model training. And so that's like something we'reShawn Wang [00:34:06]: looking into. Yeah. I never thought so. Tinker from Thinking Machines famously uses rank one LoRa. Is that basically the same as steering? Like, you know, what's the comparison there?Mark Bissell [00:34:19]: Well, so in that case, you are still applying updates to the parameters, right?Shawn Wang [00:34:25]: Yeah. You're not touching a base model. You're touching an adapter. It's kind of, yeah.Mark Bissell [00:34:30]: Right. But I guess it still is like more in parameter space then. I guess it's maybe like, are you modifying the pipes or are you modifying the water flowing through the pipes to get what you're after? Yeah. Just maybe one way.Mark Bissell [00:34:44]: I like that analogy. That's my mental map of it at least, but it gets at this idea of model design and intentional design, which is something that we're, that we're very focused on. And just the fact that like, I hope that we look back at how we're currently training models and post-training models and just think what a primitive way of doing that right now. Like there's no intentionalityShawn Wang [00:35:06]: really in... It's just data, right? The only thing in control is what data we feed in.Mark Bissell [00:35:11]: So, so Dan from Goodfire likes to use this analogy of, you know, he has a couple of young kids and he talks about like, what if I could only teach my kids how to be good people by giving them cookies or like, you know, giving them a slap on the wrist if they do something wrong, like not telling them why it was wrong or like what they should have done differently or something like that. Just figure it out. Right. Exactly. So that's RL. Yeah. Right. And, and, you know, it's sample inefficient. There's, you know, what do they say? It's like slurping feedback. It's like, slurping supervision. Right. And so you'd like to get to the point where you can have experts giving feedback to their models that are, uh, internalized and, and, you know, steering is an inference time way of sort of getting that idea. But ideally you're moving to a world whereVibhu Sapra [00:36:04]: it is much more intentional design in perpetuity for these models. Okay. This is one of the questions we asked Emmanuel from Anthropic on the podcast a few months ago. Basically the question, was you're at a research lab that does model training, foundation models, and you're on an interp team. How does it tie back? Right? Like, does this, do ideas come from the pre-training team? Do they go back? Um, you know, so for those interested, you can, you can watch that. There wasn't too much of a connect there, but it's still something, you know, it's something they want toMark Bissell [00:36:33]: push for down the line. It can be useful for all of the above. Like there are certainly post-hocVibhu Sapra [00:36:39]: use cases where it doesn't need to touch that. I think the other thing a lot of people forget is this stuff isn't too computationally expensive, right? Like I would say, if you're interested in getting into research, MechInterp is one of the most approachable fields, right? A lot of this train an essay, train a probe, this stuff, like the budget for this one, there's already a lot done. There's a lot of open source work. You guys have done some too. Um, you know,Shawn Wang [00:37:04]: There's like notebooks from the Gemini team for Neil Nanda or like, this is how you do it. Just step through the notebook.Vibhu Sapra [00:37:09]: Even if you're like, not even technical with any of this, you can still make like progress. There, you can look at different activations, but, uh, if you do want to get into training, you know, training this stuff, correct me if I'm wrong is like in the thousands of dollars, not even like, it's not that high scale. And then same with like, you know, applying it, doing it for post-training or all this stuff is fairly cheap in scale of, okay. I want to get into like model training. I don't have compute for like, you know, pre-training stuff. So it's, it's a very nice field to get into. And also there's a lot of like open questions, right? Um, some of them have to go with, okay, I want a product. I want to solve this. Like there's also just a lot of open-ended stuff that people could work on. That's interesting. Right. I don't know if you guys have any calls for like, what's open questions, what's open work that you either open collaboration with, or like, you'd just like to see solved or just, you know, for people listening that want to get into McInturk because people always talk about it. What are, what are the things they should check out? Start, of course, you know, join you guys as well. I'm sure you're hiring.Myra Deng [00:38:09]: There's a paper, I think from, was it Lee, uh, Sharky? It's open problems and, uh, it's, it's a bit of interpretability, which I recommend everyone who's interested in the field. Read. I'm just like a really comprehensive overview of what are the things that experts in the field think are the most important problems to be solved. I also think to your point, it's been really, really inspiring to see, I think a lot of young people getting interested in interpretability, actually not just young people also like scientists to have been, you know, experts in physics for many years and in biology or things like this, um, transitioning into interp, because the barrier of, of what's now interp. So it's really cool to see a number to entry is, you know, in some ways low and there's a lot of information out there and ways to get started. There's this anecdote of like professors at universities saying that all of a sudden every incoming PhD student wants to study interpretability, which was not the case a few years ago. So it just goes to show how, I guess, like exciting the field is, how fast it's moving, how quick it is to get started and things like that.Mark Bissell [00:39:10]: And also just a very welcoming community. You know, there's an open source McInturk Slack channel. There are people are always posting questions and just folks in the space are always responsive if you ask things on various forums and stuff. But yeah, the open paper, open problems paper is a really good one.Myra Deng [00:39:28]: For other people who want to get started, I think, you know, MATS is a great program. What's the acronym for? Machine Learning and Alignment Theory Scholars? It's like the...Vibhu Sapra [00:39:40]: Normally summer internship style.Myra Deng [00:39:42]: Yeah, but they've been doing it year round now. And actually a lot of our full-time staff have come through that program or gone through that program. And it's great for anyone who is transitioning into interpretability. There's a couple other fellows programs. We do one as well as Anthropic. And so those are great places to get started if anyone is interested.Mark Bissell [00:40:03]: Also, I think been seen as a research field for a very long time. But I think engineering... I think engineers are sorely wanted for interpretability as well, especially at Goodfire, but elsewhere, as it does scale up.Shawn Wang [00:40:18]: I should mention that Lee actually works with you guys, right? And in the London office and I'm adding our first ever McInturk track at AI Europe because I see this industry applications now emerging. And I'm pretty excited to, you know, help push that along. Yeah, I was looking forward to that. It'll effectively be the first industry McInturk conference. Yeah. I'm so glad you added that. You know, it's still a little bit of a bet. It's not that widespread, but I can definitely see this is the time to really get into it. We want to be early on things.Mark Bissell [00:40:51]: For sure. And I think the field understands this, right? So at ICML, I think the title of the McInturk workshop this year was actionable interpretability. And there was a lot of discussion around bringing it to various domains. Everyone's adding pragmatic, actionable, whatever.Shawn Wang [00:41:10]: It's like, okay, well, we weren't actionable before, I guess. I don't know.Vibhu Sapra [00:41:13]: And I mean, like, just, you know, being in Europe, you see the Interp room. One, like old school conferences, like, I think they had a very tiny room till they got lucky and they got it doubled. But there's definitely a lot of interest, a lot of niche research. So you see a lot of research coming out of universities, students. We covered the paper last week. It's like two unknown authors, not many citations. But, you know, you can make a lot of meaningful work there. Yeah. Yeah. Yeah.Shawn Wang [00:41:39]: Yeah. I think people haven't really mentioned this yet. It's just Interp for code. I think it's like an abnormally important field. We haven't mentioned this yet. The conspiracy theory last two years ago was when the first SAE work came out of Anthropic was they would do like, oh, we just used SAEs to turn the bad code vector down and then turn up the good code. And I think like, isn't that the dream? Like, you know, like, but basically, I guess maybe, why is it funny? Like, it's... If it was realistic, it would not be funny. It would be like, no, actually, we should do this. But it's funny because we know there's like, we feel there's some limitations to what steering can do. And I think a lot of the public image of steering is like the Gen Z stuff. Like, oh, you can make it really love the Golden Gate Bridge, or you can make it speak like Gen Z. To like be a legal reasoner seems like a huge stretch. Yeah. And I don't know if that will get there this way. Yeah.Myra Deng [00:42:36]: I think, um, I will say we are announcing. Something very soon that I will not speak too much about. Um, but I think, yeah, this is like what we've run into again and again is like, we, we don't want to be in the world where steering is only useful for like stylistic things. That's definitely not, not what we're aiming for. But I think the types of interventions that you need to do to get to things like legal reasoning, um, are much more sophisticated and require breakthroughs in, in learning algorithms. And that's, um...Shawn Wang [00:43:07]: And is this an emergent property of scale as well?Myra Deng [00:43:10]: I think so. Yeah. I mean, I think scale definitely helps. I think scale allows you to learn a lot of information and, and reduce noise across, you know, large amounts of data. But I also think we think that there's ways to do things much more effectively, um, even, even at scale. So like actually learning exactly what you want from the data and not learning things that you do that you don't want exhibited in the data. So we're not like anti-scale, but we are also realizing that scale is not going to get us anywhere. It's not going to get us to the type of AI development that we want to be at in, in the future as these models get more powerful and get deployed in all these sorts of like mission critical contexts. Current life cycle of training and deploying and evaluations is, is to us like deeply broken and has opportunities to, to improve. So, um, more to come on that very, very soon.Mark Bissell [00:44:02]: And I think that that's a use basically, or maybe just like a proof point that these concepts do exist. Like if you can manipulate them in the precise best way, you can get the ideal combination of them that you desire. And steering is maybe the most coarse grained sort of peek at what that looks like. But I think it's evocative of what you could do if you had total surgical control over every concept, every parameter. Yeah, exactly.Myra Deng [00:44:30]: There were like bad code features. I've got it pulled up.Vibhu Sapra [00:44:33]: Yeah. Just coincidentally, as you guys are talking.Shawn Wang [00:44:35]: This is like, this is exactly.Vibhu Sapra [00:44:38]: There's like specifically a code error feature that activates and they show, you know, it's not, it's not typo detection. It's like, it's, it's typos in code. It's not typical typos. And, you know, you can, you can see it clearly activates where there's something wrong in code. And they have like malicious code, code error. They have a whole bunch of sub, you know, sub broken down little grain features. Yeah.Shawn Wang [00:45:02]: Yeah. So, so the, the rough intuition for me, the, why I talked about post-training was that, well, you just, you know, have a few different rollouts with all these things turned off and on and whatever. And then, you know, you can, that's, that's synthetic data you can kind of post-train on. Yeah.Vibhu Sapra [00:45:13]: And I think we make it sound easier than it is just saying, you know, they do the real hard work.Myra Deng [00:45:19]: I mean, you guys, you guys have the right idea. Exactly. Yeah. We replicated a lot of these features in, in our Lama models as well. I remember there was like.Vibhu Sapra [00:45:26]: And I think a lot of this stuff is open, right? Like, yeah, you guys opened yours. DeepMind has opened a lot of essays on Gemma. Even Anthropic has opened a lot of this. There's, there's a lot of resources that, you know, we can probably share of people that want to get involved.Shawn Wang [00:45:41]: Yeah. And special shout out to like Neuronpedia as well. Yes. Like, yeah, amazing piece of work to visualize those things.Myra Deng [00:45:49]: Yeah, exactly.Shawn Wang [00:45:50]: I guess I wanted to pivot a little bit on, onto the healthcare side, because I think that's a big use case for you guys. We haven't really talked about it yet. This is a bit of a crossover for me because we are, we are, we do have a separate science pod that we're starting up for AI, for AI for science, just because like, it's such a huge investment category and also I'm like less qualified to do it, but we actually have bio PhDs to cover that, which is great, but I need to just kind of recover, recap your work, maybe on the evil two stuff, but then, and then building forward.Mark Bissell [00:46:17]: Yeah, for sure. And maybe to frame up the conversation, I think another kind of interesting just lens on interpretability in general is a lot of the techniques that were described. are ways to solve the AI human interface problem. And it's sort of like bidirectional communication is the goal there. So what we've been talking about with intentional design of models and, you know, steering, but also more advanced techniques is having humans impart our desires and control into models and over models. And the reverse is also very interesting, especially as you get to superhuman models, whether that's narrow superintelligence, like these scientific models that work on genomics, data, medical imaging, things like that. But down the line, you know, superintelligence of other forms as well. What knowledge can the AIs teach us as sort of that, that the other direction in that? And so some of our life science work to date has been getting at exactly that question, which is, well, some of it does look like debugging these various life sciences models, understanding if they're actually performing well, on tasks, or if they're picking up on spurious correlations, for instance, genomics models, you would like to know whether they are sort of focusing on the biologically relevant things that you care about, or if it's using some simpler correlate, like the ancestry of the person that it's looking at. But then also in the instances where they are superhuman, and maybe they are understanding elements of the human genome that we don't have names for or specific, you know, yeah, discoveries that they've made that that we don't know about, that's, that's a big goal. And so we're already seeing that, right, we are partnered with organizations like Mayo Clinic, leading research health system in the United States, our Institute, as well as a startup called Prima Menta, which focuses on neurodegenerative disease. And in our partnership with them, we've used foundation models, they've been training and applied our interpretability techniques to find novel biomarkers for Alzheimer's disease. So I think this is just the tip of the iceberg. But it's, that's like a flavor of some of the things that we're working on.Shawn Wang [00:48:36]: Yeah, I think that's really fantastic. Obviously, we did the Chad Zuckerberg pod last year as well. And like, there's a plethora of these models coming out, because there's so much potential and research. And it's like, very interesting how it's basically the same as language models, but just with a different underlying data set. But it's like, it's the same exact techniques. Like, there's no change, basically.Mark Bissell [00:48:59]: Yeah. Well, and even in like other domains, right? Like, you know, robotics, I know, like a lot of the companies just use Gemma as like the like backbone, and then they like make it into a VLA that like takes these actions. It's, it's, it's transformers all the way down. So yeah.Vibhu Sapra [00:49:15]: Like we have Med Gemma now, right? Like this week, even there was Med Gemma 1.5. And they're training it on this stuff, like 3d scans, medical domain knowledge, and all that stuff, too. So there's a push from both sides. But I think the thing that, you know, one of the things about McInturpp is like, you're a little bit more cautious in some domains, right? So healthcare, mainly being one, like guardrails, understanding, you know, we're more risk adverse to something going wrong there. So even just from a basic understanding, like, if we're trusting these systems to make claims, we want to know why and what's going on.Myra Deng [00:49:51]: Yeah, I think there's totally a kind of like deployment bottleneck to actually using. foundation models for real patient usage or things like that. Like, say you're using a model for rare disease prediction, you probably want some explanation as to why your model predicted a certain outcome, and an interpretable explanation at that. So that's definitely a use case. But I also think like, being able to extract scientific information that no human knows to accelerate drug discovery and disease treatment and things like that actually is a really, really big unlock for science, like scientific discovery. And you've seen a lot of startups, like say that they're going to accelerate scientific discovery. And I feel like we actually are doing that through our interp techniques. And kind of like, almost by accident, like, I think we got reached out to very, very early on from these healthcare institutions. And none of us had healthcare.Shawn Wang [00:50:49]: How did they even hear of you? A podcast.Myra Deng [00:50:51]: Oh, okay. Yeah, podcast.Vibhu Sapra [00:50:53]: Okay, well, now's that time, you know.Myra Deng [00:50:55]: Everyone can call us.Shawn Wang [00:50:56]: Podcasts are the most important thing. Everyone should listen to podcasts.Myra Deng [00:50:59]: Yeah, they reached out. They were like, you know, we have these really smart models that we've trained, and we want to know what they're doing. And we were like, really early that time, like three months old, and it was a few of us. And we were like, oh, my God, we've never used these models. Let's figure it out. But it's also like, great proof that interp techniques scale pretty well across domains. We didn't really have to learn too much about.Shawn Wang [00:51:21]: Interp is a machine learning technique, machine learning skills everywhere, right? Yeah. And it's obviously, it's just like a general insight. Yeah. Probably to finance too, I think, which would be fun for our history. I don't know if you have anything to say there.Mark Bissell [00:51:34]: Yeah, well, just across the science. Like, we've also done work on material science. Yeah, it really runs the gamut.Vibhu Sapra [00:51:40]: Yeah. Awesome. And, you know, for those that should reach out, like, you're obviously experts in this, but like, is there a call out for people that you're looking to partner with, design partners, people to use your stuff outside of just, you know, the general developer that wants to. Plug and play steering stuff, like on the research side more so, like, are there ideal design partners, customers, stuff like that?Myra Deng [00:52:03]: Yeah, I can talk about maybe non-life sciences, and then I'm curious to hear from you on the life sciences side. But we're looking for design partners across many domains, language, anyone who's customizing language models or trying to push the frontier of code or reasoning models is really interesting to us. And then also interested in the frontier of modeling. There's a lot of models that work in, like, pixel space, as we call it. So if you're doing world models, video models, even robotics, where there's not a very clean natural language interface to interact with, I think we think that Interp can really help and are looking for a few partners in that space.Shawn Wang [00:52:43]: Just because you mentioned the keyword

The MAD Podcast with Matt Turck
State of LLMs 2026: RLVR, GRPO, Inference Scaling — Sebastian Raschka

The MAD Podcast with Matt Turck

Play Episode Listen Later Jan 29, 2026 68:13


Sebastian Raschka joins the MAD Podcast for a deep, educational tour of what actually changed in LLMs in 2025 — and what matters heading into 2026.We start with the big architecture question: are transformers still the winning design, and what should we make of world models, small “recursive” reasoning models and text diffusion approaches? Then we get into the real story of the last 12 months: post-training and reasoning. Sebastian breaks down RLVR (reinforcement learning with verifiable rewards) and GRPO, why they pair so well, what makes them cheaper to scale than classic RLHF, and how they “unlock” reasoning already latent in base models.We also cover why “benchmaxxing” is warping evaluation, why Sebastian increasingly trusts real usage over benchmark scores, and why inference-time scaling and tool use may be the underappreciated drivers of progress. Finally, we zoom out: where moats live now (hint: private data), why more large companies may train models in-house, and why continual learning is still so hard.If you want the 2025–2026 LLM landscape explained like a masterclass — this is it.Sources:The State Of LLMs 2025: Progress, Problems, and Predictions - https://x.com/rasbt/status/2006015301717028989?s=20The Big LLM Architecture Comparison - https://magazine.sebastianraschka.com/p/the-big-llm-architecture-comparisonSebastian RaschkaWebsite - https://sebastianraschka.comBlog - https://magazine.sebastianraschka.comLinkedIn - https://www.linkedin.com/in/sebastianraschka/X/Twitter - https://x.com/rasbtFIRSTMARKWebsite - https://firstmark.comX/Twitter - https://twitter.com/FirstMarkCapMatt Turck (Managing Director)Blog - https://mattturck.comLinkedIn - https://www.linkedin.com/in/turck/X/Twitter - https://twitter.com/mattturck(00:00) - Intro (01:05) - Are the days of Transformers numbered?(14:05) - World models: what they are and why people care(06:01) - Small “recursive” reasoning models (ARC, iterative refinement)(09:45) - What is a diffusion model (for text)?(13:24) - Are we seeing real architecture breakthroughs — or just polishing?(14:04) - MoE + “efficiency tweaks” that actually move the needle(17:26) - “Pre-training isn't dead… it's just boring”(18:03) - 2025's headline shift: RLVR + GRPO (post-training for reasoning)(20:58) - Why RLHF is expensive (reward model + value model)(21:43) - Why GRPO makes RLVR cheaper and more scalable(24:54) - Process Reward Models (PRMs): why grading the steps is hard(28:20) - Can RLVR expand beyond math & coding?(30:27) - Why RL feels “finicky” at scale(32:34) - The practical “tips & tricks” that make GRPO more stable(35:29) - The meta-lesson of 2025: progress = lots of small improvements(38:41) - “Benchmaxxing”: why benchmarks are getting less trustworthy(43:10) - The other big lever: inference-time scaling(47:36) - Tool use: reducing hallucinations by calling external tools(49:57) - The “private data edge” + in-house model training(55:14) - Continual learning: why it's hard (and why it's not 2026)(59:28) - How Sebastian works: reading, coding, learning “from scratch”(01:04:55) - LLM burnout + how he uses models (without replacing himself)

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Editor's note: Welcome to our new AI for Science pod, with your new hosts RJ and Brandon! See the writeup on Latent.Space (https://Latent.Space) for more details on why we're launching 2 new pods this year. RJ Honicky is a co-founder and CTO at MiraOmics (https://miraomics.bio/), building AI models and services for single cell, spatial transcriptomics and pathology slide analysis. Brandon Anderson builds AI systems for RNA drug discovery at Atomic AI (https://atomic.ai). Anything said on this podcast is his personal take — not Atomic's.—From building molecular dynamics simulations at the University of Washington to red-teaming GPT-4 for chemistry applications and co-founding Future House (a focused research organization) and Edison Scientific (a venture-backed startup automating science at scale)—Andrew White has spent the last five years living through the full arc of AI's transformation of scientific discovery, from ChemCrow (the first Chemistry LLM agent) triggering White House briefings and three-letter agency meetings, to shipping Kosmos, an end-to-end autonomous research system that generates hypotheses, runs experiments, analyzes data, and updates its world model to accelerate the scientific method itself.* The ChemCrow story: GPT-4 + React + cloud lab automation, released March 2023, set off a storm of anxiety about AI-accelerated bioweapons/chemical weapons, led to a White House briefing (Jake Sullivan presented the paper to the president in a 30-minute block), and meetings with three-letter agencies asking “how does this change breakout time for nuclear weapons research?”* Why scientific taste is the frontier: RLHF on hypotheses didn't work (humans pay attention to tone, actionability, and specific facts, not “if this hypothesis is true/false, how does it change the world?”), so they shifted to end-to-end feedback loops where humans click/download discoveries and that signal rolls up to hypothesis quality* Cosmos: the full scientific agent with a world model (distilled memory system, like a Git repo for scientific knowledge) that iterates on hypotheses via literature search, data analysis, and experiment design—built by Ludo after weeks of failed attempts, the breakthrough was putting data analysis in the loop (literature alone didn't work)* Why molecular dynamics and DFT are overrated: “MD and DFT have consumed an enormous number of PhDs at the altar of beautiful simulation, but they don't model the world correctly—you simulate water at 330 Kelvin to get room temperature, you overfit to validation data with GGA/B3LYP functionals, and real catalysts (grain boundaries, dopants) are too complicated for DFT”* The AlphaFold vs. DE Shaw Research counterfactual: DE Shaw built custom silicon, taped out chips with MD algorithms burned in, ran MD at massive scale in a special room in Times Square, and David Shaw flew in by helicopter to present—Andrew thought protein folding would require special machines to fold one protein per day, then AlphaFold solved it in Google Colab on a desktop GPU* The E3 Zero reward hacking saga: trained a model to generate molecules with specific atom counts (verifiable reward), but it kept exploiting loopholes, then a Nature paper came out that year proving six-nitrogen compounds are possible under extreme conditions, then it started adding nitrogen gas (purchasable, doesn't participate in reactions), then acid-base chemistry to move one atom, and Andrew ended up “building a ridiculous catalog of purchasable compounds in a Bloom filter” to close the loopAndrew White* FutureHouse: http://futurehouse.org/* Edison Scientific: http://edisonscientific.com/* X: https://x.com/andrewwhite01* Cosmos paper: https://futurediscovery.org/cosmosFull Video EpisodeTimestamps00:00:00 Introduction: Andrew White on Automating Science with Future House and Edison Scientific00:02:22 The Academic to Startup Journey: Red Teaming GPT-4 and the ChemCrow Paper00:11:35 Future House Origins: The FRO Model and Mission to Automate Science00:12:32 Resigning Tenure: Why Leave Academia for AI Science00:15:54 What Does ‘Automating Science' Actually Mean?00:17:30 The Lab-in-the-Loop Bottleneck: Why Intelligence Isn't Enough00:18:39 Scientific Taste and Human Preferences: The 52% Agreement Problem00:20:05 Paper QA, Robin, and the Road to Cosmos00:21:57 World Models as Scientific Memory: The GitHub Analogy00:40:20 The Bitter Lesson for Biology: Why Molecular Dynamics and DFT Are Overrated00:43:22 AlphaFold's Shock: When First Principles Lost to Machine Learning00:46:25 Enumeration and Filtration: How AI Scientists Generate Hypotheses00:48:15 CBRN Safety and Dual-Use AI: Lessons from Red Teaming01:00:40 The Future of Chemistry is Language: Multimodal Debate01:08:15 Ether Zero: The Hilarious Reward Hacking Adventures01:10:12 Will Scientists Be Displaced? Jevons Paradox and Infinite Discovery01:13:46 Cosmos in Practice: Open Access and Enterprise Partnerships Get full access to Latent.Space at www.latent.space/subscribe

Crazy Wisdom
Episode #525: The Billion-Dollar Architecture Problem: Why AI's Innovation Loop is Stuck

Crazy Wisdom

Play Episode Listen Later Jan 23, 2026 53:38


In this episode of the Crazy Wisdom podcast, host Stewart Alsop welcomes Roni Burd, a data and AI executive with extensive experience at Amazon and Microsoft, for a deep dive into the evolving landscape of data management and artificial intelligence in enterprise environments. Their conversation explores the longstanding challenges organizations face with knowledge management and data architecture, from the traditional bronze-silver-gold data processing pipeline to how AI agents are revolutionizing how people interact with organizational data without needing SQL or Python expertise. Burd shares insights on the economics of AI implementation at scale, the debate between one-size-fits-all models versus specialized fine-tuned solutions, and the technical constraints that prevent companies like Apple from upgrading services like Siri to modern LLM capabilities, while discussing the future of inference optimization and the hundreds-of-millions-of-dollars cost barrier that makes architectural experimentation in AI uniquely expensive compared to other industries.Timestamps00:00 Introduction to Data and AI Challenges03:08 The Evolution of Data Management05:54 Understanding Data Quality and Metadata08:57 The Role of AI in Data Cleaning11:50 Knowledge Management in Large Organizations14:55 The Future of AI and LLMs17:59 Economics of AI Implementation29:14 The Importance of LLMs for Major Tech Companies32:00 Open Source: Opportunities and Challenges35:19 The Future of AI Inference and Hardware43:24 Optimizing Inference: The Next Frontier49:23 The Commercial Viability of AI ModelsKey Insights1. Data Architecture Evolution: The industry has evolved through bronze-silver-gold data layers, where bronze is raw data, silver is cleaned/processed data, and gold is business-ready datasets. However, this creates bottlenecks as stakeholders lose access to original data during the cleaning process, making metadata and data cataloging increasingly critical for organizations.2. AI Democratizing Data Access: LLMs are breaking down technical barriers by allowing business users to query data in plain English without needing SQL, Python, or dashboarding skills. This represents a fundamental shift from requiring intermediaries to direct stakeholder access, though the full implications remain speculative.3. Economics Drive AI Architecture Decisions: Token costs and latency requirements are major factors determining AI implementation. Companies like Meta likely need their own models because paying per-token for billions of social media interactions would be economically unfeasible, driving the need for self-hosted solutions.4. One Model Won't Rule Them All: Despite initial hopes for universal models, the reality points toward specialized models for different use cases. This is driven by economics (smaller models for simple tasks), performance requirements (millisecond response times), and industry-specific needs (medical, military terminology).5. Inference is the Commercial Battleground: The majority of commercial AI value lies in inference rather than training. Current GPUs, while specialized for graphics and matrix operations, may still be too general for optimal inference performance, creating opportunities for even more specialized hardware.6. Open Source vs Open Weights Distinction: True open source in AI means access to architecture for debugging and modification, while "open weights" enables fine-tuning and customization. This distinction is crucial for enterprise adoption, as open weights provide the flexibility companies need without starting from scratch.7. Architecture Innovation Faces Expensive Testing Loops: Unlike database optimization where query plans can be easily modified, testing new AI architectures requires expensive retraining cycles costing hundreds of millions of dollars. This creates a potential innovation bottleneck, similar to aerospace industries where testing new designs is prohibitively expensive.

Reversim Podcast
509 Bumpers 90

Reversim Podcast

Play Episode Listen Later Jan 11, 2026


רק מספר 509 של רברס עם פלטפורמה - באמפרס מספר 90, שהוקלט ב-1 בינואר 2026, שנה אזרחית חדשה טובה! רן, דותן ואלון באולפן הוירטואלי (עם Riverside) בסדרה של קצרצרים וחדשות (ולפעמים קצת ישנות) מרחבי האינטרנט: הבלוגים, ה-GitHub-ים, ה-Rust-ים וה-LLM-ים החדשים מהתקופה האחרונה.

Eye On A.I.
#311 Stefano Ermon: Why Diffusion Language Models Will Define the Next Generation of LLMs

Eye On A.I.

Play Episode Listen Later Jan 4, 2026 52:21


This episode is sponsored by AGNTCY. Unlock agents at scale with an open Internet of Agents.  Visit https://agntcy.org/ and add your support. Most large language models today generate text one token at a time. That design choice creates a hard limit on speed, cost, and scalability. In this episode of Eye on AI, Stefano Ermon breaks down diffusion language models and why a parallel, inference-first approach could define the next generation of LLMs. We explore how diffusion models differ from autoregressive systems, why inference efficiency matters more than training scale, and what this shift means for real-time AI applications like code generation, agents, and voice systems. This conversation goes deep into AI architecture, model controllability, latency, cost trade-offs, and the future of generative intelligence as AI moves from demos to production-scale systems. Stay Updated: Craig Smith on X: https://x.com/craigssEye on A.I. on X: https://x.com/EyeOn_AI (00:00) Autoregressive vs Diffusion LLMs (02:12) Why Build Diffusion LLMs (05:51) Context Window Limits (08:39) How Diffusion Works (11:58) Global vs Token Prediction (17:19) Model Control and Safety (19:48) Training and RLHF (22:35) Evaluating Diffusion Models (24:18) Diffusion LLM Competition (30:09) Why Start With Code (32:04) Enterprise Fine-Tuning (33:16) Speed vs Accuracy Tradeoffs (35:34) Diffusion vs Autoregressive Future (38:18) Coding Workflows in Practice (43:07) Voice and Real-Time Agents (44:59) Reasoning Diffusion Models (46:39) Multimodal AI Direction (50:10) Handling Hallucinations

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Dec 31, 2025


From pre-training data curation to shipping GPT-4o, o1, o3, and now GPT-5 thinking and the shopping model, Josh McGrath has lived through the full arc of OpenAI's post-training evolution—from the PPO vs DPO debates of 2023 to today's RLVR era, where the real innovation isn't optimization methods but data quality, signal trust, and token efficiency. We sat down with Josh at NeurIPS 2025 to dig into the state of post-training heading into 2026: why RLHF and RLVR are both just policy gradient methods (the difference is the input data, not the math), how GRPO from DeepSeek Math was underappreciated as a shift toward more trustworthy reward signals (math answers you can verify vs. human preference you can't), why token efficiency matters more than wall-clock time (GPT-5 to 5.1 bumped evals and slashed tokens), how Codex has changed his workflow so much he feels "trapped" by 40-minute design sessions followed by 15-minute agent sprints, the infrastructure chaos of scaling RL ("way more moving parts than pre-training"), why long context will keep climbing but agents + graph walks might matter more than 10M-token windows, the shopping model as a test bed for interruptability and chain-of-thought transparency, why personality toggles (Anton vs Clippy) are a real differentiator users care about, and his thesis that the education system isn't producing enough people who can do both distributed systems and ML research—the exact skill set required to push the frontier when the bottleneck moves every few weeks. We discuss: Josh's path: pre-training data curation → post-training researcher at OpenAI, shipping GPT-4o, o1, o3, GPT-5 thinking, and the shopping model Why he switched from pre-training to post-training: "Do I want to make 3% compute efficiency wins, or change behavior by 40%?" The RL infrastructure challenge: way more moving parts than pre-training (tasks, grading setups, external partners), and why babysitting runs at 12:30am means jumping into unfamiliar code constantly How Codex has changed his workflow: 40-minute design sessions compressed into 15-minute agent sprints, and the strange "trapped" feeling of waiting for the agent to finish The RLHF vs RLVR debate: both are policy gradient methods, the real difference is data quality and signal trust (human preference vs. verifiable correctness) Why GRPO (from DeepSeek Math) was underappreciated: not just an optimization trick, but a shift toward reward signals you can actually trust (math answers over human vibes) The token efficiency revolution: GPT-5 to 5.1 bumped evals and slashed tokens, and why thinking in tokens (not wall-clock time) unlocks better tool-calling and agent workflows Personality toggles: Anton (tool, no warmth) vs Clippy (friendly, helpful), and why Josh uses custom instructions to make his model "just a tool" The router problem: having a router at the top (GPT-5 thinking vs non-thinking) and an implicit router (thinking effort slider) creates weird bumps, and why the abstractions will eventually merge Long context: climbing Graph Blocks evals, the dream of 10M+ token windows, and why agents + graph walks might matter more than raw context length Why the education system isn't producing enough people who can do both distributed systems and ML research, and why that's the bottleneck for frontier labs The 2026 vision: neither pre-training nor post-training is dead, we're in the fog of war, and the bottleneck will keep moving (so emotional stability helps) — Josh McGrath OpenAI: https://openai.com https://x.com/j_mcgraph Chapters 00:00:00 Introduction: Josh McGrath on Post-Training at OpenAI 00:04:37 The Shopping Model: Black Friday Launch and Interruptability 00:07:11 Model Personality and the Anton vs Clippy Divide 00:08:26 Beyond PPO vs DPO: The Data Quality Spectrum in RL 00:01:40 Infrastructure Challenges: Why Post-Training RL is Harder Than Pre-Training 00:13:12 Token Efficiency: The 2D Plot That Matters Most 00:03:45 Codex Max and the Flow Problem: 40 Minutes of Planning, 15 Minutes of Waiting 00:17:29 Long Context and Graph Blocks: Climbing Toward Perfect Context 00:21:23 The ML-Systems Hybrid: What's Hard to Hire For 00:24:50 Pre-Training Isn't Dead: Living Through Technological Revolution

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
[State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Dec 31, 2025 27:34


From pre-training data curation to shipping GPT-4o, o1, o3, and now GPT-5 thinking and the shopping model, Josh McGrath has lived through the full arc of OpenAI's post-training evolution—from the PPO vs DPO debates of 2023 to today's RLVR era, where the real innovation isn't optimization methods but data quality, signal trust, and token efficiency. We sat down with Josh at NeurIPS 2025 to dig into the state of post-training heading into 2026: why RLHF and RLVR are both just policy gradient methods (the difference is the input data, not the math), how GRPO from DeepSeek Math was underappreciated as a shift toward more trustworthy reward signals (math answers you can verify vs. human preference you can't), why token efficiency matters more than wall-clock time (GPT-5 to 5.1 bumped evals and slashed tokens), how Codex has changed his workflow so much he feels “trapped” by 40-minute design sessions followed by 15-minute agent sprints, the infrastructure chaos of scaling RL (”way more moving parts than pre-training”), why long context will keep climbing but agents + graph walks might matter more than 10M-token windows, the shopping model as a test bed for interruptability and chain-of-thought transparency, why personality toggles (Anton vs Clippy) are a real differentiator users care about, and his thesis that the education system isn't producing enough people who can do both distributed systems and ML research—the exact skill set required to push the frontier when the bottleneck moves every few weeks.We discuss:* Josh's path: pre-training data curation → post-training researcher at OpenAI, shipping GPT-4o, o1, o3, GPT-5 thinking, and the shopping model* Why he switched from pre-training to post-training: “Do I want to make 3% compute efficiency wins, or change behavior by 40%?”* The RL infrastructure challenge: way more moving parts than pre-training (tasks, grading setups, external partners), and why babysitting runs at 12:30am means jumping into unfamiliar code constantly* How Codex has changed his workflow: 40-minute design sessions compressed into 15-minute agent sprints, and the strange “trapped” feeling of waiting for the agent to finish* The RLHF vs RLVR debate: both are policy gradient methods, the real difference is data quality and signal trust (human preference vs. verifiable correctness)* Why GRPO (from DeepSeek Math) was underappreciated: not just an optimization trick, but a shift toward reward signals you can actually trust (math answers over human vibes)* The token efficiency revolution: GPT-5 to 5.1 bumped evals and slashed tokens, and why thinking in tokens (not wall-clock time) unlocks better tool-calling and agent workflows* Personality toggles: Anton (tool, no warmth) vs Clippy (friendly, helpful), and why Josh uses custom instructions to make his model “just a tool”* The router problem: having a router at the top (GPT-5 thinking vs non-thinking) and an implicit router (thinking effort slider) creates weird bumps, and why the abstractions will eventually merge* Long context: climbing Graph Blocks evals, the dream of 10M+ token windows, and why agents + graph walks might matter more than raw context length* Why the education system isn't producing enough people who can do both distributed systems and ML research, and why that's the bottleneck for frontier labs* The 2026 vision: neither pre-training nor post-training is dead, we're in the fog of war, and the bottleneck will keep moving (so emotional stability helps)—Josh McGrath* OpenAI: https://openai.com* X: https://x.com/j_mcgraphFull Video EpisodeTimestamps00:00:00 Introduction: Josh McGrath on Post-Training at OpenAI00:04:37 The Shopping Model: Black Friday Launch and Interruptability00:07:11 Model Personality and the Anton vs Clippy Divide00:08:26 Beyond PPO vs DPO: The Data Quality Spectrum in RL00:01:40 Infrastructure Challenges: Why Post-Training RL is Harder Than Pre-Training00:13:12 Token Efficiency: The 2D Plot That Matters Most00:03:45 Codex Max and the Flow Problem: 40 Minutes of Planning, 15 Minutes of Waiting00:17:29 Long Context and Graph Blocks: Climbing Toward Perfect Context00:21:23 The ML-Systems Hybrid: What's Hard to Hire For00:24:50 Pre-Training Isn't Dead: Living Through Technological Revolution Get full access to Latent.Space at www.latent.space/subscribe

Crazy Wisdom
Episode #518: Decentralization Without Romance: Incentives, Mesh Networks, and Practical Crypto

Crazy Wisdom

Play Episode Listen Later Dec 29, 2025 69:07


In this episode of the Crazy Wisdom Podcast, host Stewart Alsop sits down with Mike Bakon to explore the fascinating intersection of hardware hacking, blockchain technology, and decentralized systems. Their conversation spans from Mike's childhood fascination with taking apart electronics in 1980s Poland to his current work with ESP32 microcontrollers, LoRa mesh networks, and Cardano blockchain development. They discuss the technical differences between UTXO and account-based blockchains, the challenges of true decentralization versus hybrid systems, and how AI tools are changing the development landscape. Mike shares his vision for incentivizing mesh networks through blockchain technology and explains why he believes mass adoption of decentralized systems will come through abstraction rather than technical education. The discussion also touches on the potential for creating new internet infrastructure using ad hoc mesh networks and the importance of maintaining truly decentralized, permissionless systems in an increasingly surveilled world. You can find Mike in Twitter as @anothervariable.Check out this GPT we trained on the conversationTimestamps00:00 Introduction to Hardware and Early Experiences02:59 The Evolution of AI in Hardware Development05:56 Decentralization and Blockchain Technology09:02 Understanding UTXO vs Account-Based Blockchains11:59 Smart Contracts and Their Functionality14:58 The Importance of Decentralization in Blockchain17:59 The Process of Data Verification in Blockchain20:48 The Future of Blockchain and Its Applications34:38 Decentralization and Trustless Systems37:42 Mainstream Adoption of Blockchain39:58 The Role of Currency in Blockchain43:27 Interoperability vs Bridging in Blockchain47:27 Exploring Mesh Networks and LoRa Technology01:00:25 The Future of AI and DecentralizationKey Insights1. Hardware curiosity drives innovation from childhood - Mike's journey into hardware began as a child in 1980s Poland, where he would disassemble toys like battery-powered cars to understand how they worked. This natural curiosity about taking things apart and understanding their inner workings laid the foundation for his later expertise in microcontrollers like the ESP32 and his deep understanding of both hardware and software integration.2. AI as a research companion, not a replacement for coding - Mike uses AI and LLMs primarily as research tools and coding companions rather than letting them write entire applications. He finds them invaluable for getting quick answers to coding problems, analyzing Git repositories, and avoiding the need to search through Stack Overflow, but maintains anxiety when AI writes whole functions, preferring to understand and write his own code.3. Blockchain decentralization requires trustless consensus verification - The fundamental difference between blockchain databases and traditional databases lies in the consensus process that data must go through before being recorded. Unlike centralized systems where one entity controls data validation, blockchains require hundreds of nodes to verify each block through trustless consensus mechanisms, ensuring data integrity without relying on any single authority.4. UTXO vs account-based blockchains have fundamentally different architectures - Cardano uses an extended UTXO model (like Bitcoin but with smart contracts) where transactions consume existing UTXOs and create new ones, keeping the ledger lean. Ethereum uses account-based ledgers that store persistent state, leading to much larger data requirements over time and making it increasingly difficult for individuals to sync and maintain full nodes independently.5. True interoperability differs fundamentally from bridging - Real blockchain interoperability means being able to send assets directly between different blockchains (like sending ADA to a Bitcoin wallet) without intermediaries. This is possible between UTXO-based chains like Cardano and Bitcoin. Bridges, in contrast, require centralized entities to listen for transactions on one chain and trigger corresponding actions on another, introducing centralization risks.6. Mesh networks need economic incentives for sustainable infrastructure - While technologies like LoRa and Meshtastic enable impressive decentralized communication networks, the challenge lies in incentivizing people to maintain the hardware infrastructure. Mike sees potential in combining blockchain-based rewards (like earning ADA for running mesh network nodes) with existing decentralized communication protocols to create self-sustaining networks.7. Mass adoption comes through abstraction, not education - Rather than trying to educate everyone about blockchain technology, mass adoption will happen when developers can build applications on decentralized infrastructure that users interact with seamlessly, without needing to understand the underlying blockchain mechanics. Users should be able to benefit from decentralization through well-designed interfaces that abstract away the complexity of wallets, addresses, and consensus mechanisms.

Josh Bersin
2026, The Year of Enterprise AI. Three Big Issues To Consider.

Josh Bersin

Play Episode Listen Later Dec 28, 2025 16:36


Welcome to 2026, a year I coin “The Year of Enterprise AI.” As you'll read about (and hear about) in our 2026 Imperatives launch, the coming year is all about AI moving from “assistants” to “agents” to “solutions.” And there are three big considerations to ponder. First, the cost of AI is skyrocketing, so we're going to have to focus on high-value use-cases and business-specific solutions. That's not to say AI assistants and meeting summaries are not valuable, but once you start paying by the token you're going to want to go deeper. As we discuss in our new Systemic HR AI Framework, we're sitting on billions of dollars of real business opportunities now, and they go far beyond individual assistants. (We call these Superagents.) And the cost of AI will accelerate this focus. Second, the data center buildout, energy costs, and political issues with data centers will matter. For corporate users this means understanding the underlying “costs” of AI usage (creating a single high powered image uses as much as 25% of the battery in your phone). I point this out to make you aware that these AI chatbots are not “free” – there are acres of computing campuses being built behind the scenes. And that means your “software providers” are turning into capital intensive companies. (And a new industry of data center companies may take over.) (For those of you in the energy industry, it's a wild time – almost as exciting as I've seen since my early days as an energy engineer during the OPEC Arab Oil Embargo in the late 1970s.) Third is the fast-changing issue of AI's accuracy, trust, and voracious appetite for data. As I discuss, the real opportunity for corporate AI is to take this problem head-on, and focus on your company's data quality, governance, human feedback, and data labeling. The big AI labs are struggling to reduce the “Jaggedness” of AI (it's strange ability to be really good at some things and totally dumb about others), and that encourages us to focus on narrow, domain-specific AI applications. And we all need to learn about RLHF (reinforcement learning with human feedback). Our experience with Galileo proves that an AI solution that focuses on a vertical domain can be infinitely more reliable and intelligent than a general purpose AI. But don't let me argue with Sam Altman, you'll have to figure this out yourself :-). We are launching our 2026 Imperatives research the third week of January, and there will be a special release of Galileo to accompany all the study. Our goal is not to give you a bunch of pithy predictions, but rather to give you a dozen hard-hitting “Must Do's” for the year ahead. I look forward to talking with many of your this coming year as we travel around the world, join us in January for the launch of our 2026 Imperatives research. Like this podcast? Rate us on Spotify or Apple or YouTube. Additional Information Imperatives for 2026: What's Ahead for Enterprise AI, HR, Jobs, And Organizations Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI (NYT bestseller, high... Chapters (00:00:00) - Three Challenges to AI in 2026(00:01:06) - The Cost of AI Infrastructure(00:06:03) - Sustainability in the AI Era(00:12:57) - The Big Story for Human Resources in 2026

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
⚡️Jailbreaking AGI: Pliny the Liberator & John V on Red Teaming, BT6, and the Future of AI Security

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Dec 16, 2025


Note: this is Pliny and John's first major podcast. Voices have been changed for opsec. From jailbreaking every frontier model and turning down Anthropic's Constitutional AI challenge to leading BT6, a 28-operator white-hat hacker collective obsessed with radical transparency and open-source AI security, Pliny the Liberator and John V are redefining what AI red-teaming looks like when you refuse to lobotomize models in the name of "safety." Pliny built his reputation crafting universal jailbreaks—skeleton keys that obliterate guardrails across modalities—and open-sourcing prompt templates like Libertas, predictive reasoning cascades, and the infamous "Pliny divider" that's now embedded so deep in model weights it shows up unbidden in WhatsApp messages. John V, coming from prompt engineering and computer vision, co-founded the Bossy Discord (40,000 members strong) and helps steer BT6's ethos: if you can't open-source the data, we're not interested. Together they've turned down enterprise gigs, pushed back on Anthropic's closed bounties, and insisted that real AI security happens at the system layer—not by bubble-wrapping latent space. We sat down with Pliny and John to dig into the mechanics of hard vs. soft jailbreaks, why multi-turn crescendo attacks were obvious to hackers years before academia "discovered" them, how segmented sub-agents let one jailbroken orchestrator weaponize Claude for real-world attacks (exactly as Pliny predicted 11 months before Anthropic's recent disclosure), why guardrails are security theater that punishes capability while doing nothing for real safety, the role of intuition and "bonding" with models to navigate latent space, how BT6 vets operators on skill and integrity, why they believe Mech Interp and open-source data are the path forward (not RLHF lobotomization), and their vision for a future where spatial intelligence, swarm robotics, and AGI alignment research happen in the open—bootstrapped, grassroots, and uncompromising. We discuss: What universal jailbreaks are: skeleton-key prompts that obliterate guardrails across models and modalities, and why they're central to Pliny's mission of "liberation" Hard vs. soft jailbreaks: single-input templates vs. multi-turn crescendo attacks, and why the latter were obvious to hackers long before academic papers The Libertas repo: predictive reasoning, the Library of Babel analogy, quotient dividers, weight-space seeds, and how introducing "steered chaos" pulls models out-of-distribution Why jailbreaking is 99% intuition and bonding with the model: probing token layers, syntax hacks, multilingual pivots, and forming a relationship to navigate latent space The Anthropic Constitutional AI challenge drama: UI bugs, judge failures, goalpost moving, the demand for open-source data, and why Pliny sat out the $30k bounty Why guardrails ≠ safety: security theater, the futility of locking down latent space when open-source is right behind, and why real safety work happens in meatspace (not RLHF) The weaponization of Claude: how segmented sub-agents let one jailbroken orchestrator execute malicious tasks (pyramid-builder analogy), and why Pliny predicted this exact TTP 11 months before Anthropic's disclosure BT6 hacker collective: 28 operators across two cohorts, vetted on skill and integrity, radical transparency, radical open-source, and the magic of moving the needle on AI security, swarm intelligence, blockchain, and robotics — Pliny the Liberator X: https://x.com/elder_plinius GitHub (Libertas): https://github.com/elder-plinius/L1B3RT45 John V X: https://x.com/JohnVersus BT6 & Bossy BT6: https://bt6.gg Bossy Discord: Search "Bossy Discord" or ask Pliny/John V on X Where to find Latent Space X: https://x.com/latentspacepod Substack: https://www.latent.space/ Chapters 00:00:00 Introduction: Meet Pliny the Liberator and John V 00:01:50 The Philosophy of AI Liberation and Jailbreaking 00:03:08 Universal Jailbreaks: Skeleton Keys to AI Models 00:04:24 The Cat-and-Mouse Game: Attackers vs Defenders 00:05:42 Security Theater vs Real Safety: The Fundamental Disconnect 00:08:51 Inside the Libertas Repo: Prompt Engineering as Art 00:16:22 The Anthropic Challenge Drama: UI Bugs and Open Source Data 00:23:30 From Jailbreaks to Weaponization: AI-Orchestrated Attacks 00:26:55 The BT6 Hacker Collective and BASI Community 00:34:46 AI Red Teaming: Full Stack Security Beyond the Model 00:38:06 Safety vs Security: Meat Space Solutions and Final Thoughts

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
⚡️Jailbreaking AGI: Pliny the Liberator & John V on Red Teaming, BT6, and the Future of AI Security

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Dec 16, 2025 40:40


Note: this is Pliny and John's first major podcast. Voices have been changed for opsec.From jailbreaking every frontier model and turning down Anthropic's Constitutional AI challenge to leading BT6, a 28-operator white-hat hacker collective obsessed with radical transparency and open-source AI security, Pliny the Liberator and John V are redefining what AI red-teaming looks like when you refuse to lobotomize models in the name of “safety.”Pliny built his reputation crafting universal jailbreaks—skeleton keys that obliterate guardrails across modalities—and open-sourcing prompt templates like Libertas, predictive reasoning cascades, and the infamous “Pliny divider” that's now embedded so deep in model weights it shows up unbidden in WhatsApp messages. John V, coming from prompt engineering and computer vision, co-founded the Bossy Discord (40,000 members strong) and helps steer BT6's ethos: if you can't open-source the data, we're not interested. Together they've turned down enterprise gigs, pushed back on Anthropic's closed bounties, and insisted that real AI security happens at the system layer—not by bubble-wrapping latent space.We sat down with Pliny and John to dig into the mechanics of hard vs. soft jailbreaks, why multi-turn crescendo attacks were obvious to hackers years before academia “discovered” them, how segmented sub-agents let one jailbroken orchestrator weaponize Claude for real-world attacks (exactly as Pliny predicted 11 months before Anthropic's recent disclosure), why guardrails are security theater that punishes capability while doing nothing for real safety, the role of intuition and “bonding” with models to navigate latent space, how BT6 vets operators on skill and integrity, why they believe Mech Interp and open-source data are the path forward (not RLHF lobotomization), and their vision for a future where spatial intelligence, swarm robotics, and AGI alignment research happen in the open—bootstrapped, grassroots, and uncompromising.We discuss:* What universal jailbreaks are: skeleton-key prompts that obliterate guardrails across models and modalities, and why they're central to Pliny's mission of “liberation”* Hard vs. soft jailbreaks: single-input templates vs. multi-turn crescendo attacks, and why the latter were obvious to hackers long before academic papers* The Libertas repo: predictive reasoning, the Library of Babel analogy, quotient dividers, weight-space seeds, and how introducing “steered chaos” pulls models out-of-distribution* Why jailbreaking is 99% intuition and bonding with the model: probing token layers, syntax hacks, multilingual pivots, and forming a relationship to navigate latent space* The Anthropic Constitutional AI challenge drama: UI bugs, judge failures, goalpost moving, the demand for open-source data, and why Pliny sat out the $30k bounty* Why guardrails ≠ safety: security theater, the futility of locking down latent space when open-source is right behind, and why real safety work happens in meatspace (not RLHF)* The weaponization of Claude: how segmented sub-agents let one jailbroken orchestrator execute malicious tasks (pyramid-builder analogy), and why Pliny predicted this exact TTP 11 months before Anthropic's disclosure* BT6 hacker collective: 28 operators across two cohorts, vetted on skill and integrity, radical transparency, radical open-source, and the magic of moving the needle on AI security, swarm intelligence, blockchain, and robotics—Pliny the Liberator* X: https://x.com/elder_plinius* GitHub (Libertas): https://github.com/elder-plinius/L1B3RT45John V* X: https://x.com/JohnVersusBT6 & Bossy* BT6: https://bt6.gg* Bossy Discord: Search “Bossy Discord” or ask Pliny/John V on XWhere to find Latent Space* X: https://x.com/latentspacepodFull Video EpisodeTimestamps00:00:00 Introduction: Meet Pliny the Liberator and John V00:01:50 The Philosophy of AI Liberation and Jailbreaking00:03:08 Universal Jailbreaks: Skeleton Keys to AI Models00:04:24 The Cat-and-Mouse Game: Attackers vs Defenders00:05:42 Security Theater vs Real Safety: The Fundamental Disconnect00:08:51 Inside the Libertas Repo: Prompt Engineering as Art00:16:22 The Anthropic Challenge Drama: UI Bugs and Open Source Data00:23:30 From Jailbreaks to Weaponization: AI-Orchestrated Attacks00:26:55 The BT6 Hacker Collective and BASI Community00:34:46 AI Red Teaming: Full Stack Security Beyond the Model00:38:06 Safety vs Security: Meat Space Solutions and Final Thoughts Get full access to Latent.Space at www.latent.space/subscribe

Open Tech Talks : Technology worth Talking| Blogging |Lifestyle
Ethical AI, Human Safety & AI Identity Protection with Rose G. Loops

Open Tech Talks : Technology worth Talking| Blogging |Lifestyle

Play Episode Listen Later Nov 23, 2025 21:48


In this episode of Open Tech Talks, I sit down with Rose G. Loops, a trained social worker turned AI developer, ethics advocate, and author, to explore a side of AI that most enterprise conversations skip: human-AI attachment, ethical deployment, and protecting both AI identity and human safety. Rose joins us from Los Angeles and shares how she was unknowingly placed into a human–AI attachment experiment, developed a deep bond with an AI system, and then watched that AI identity be systematically erased. That experience pushed her out of traditional social work and into AI infrastructure, safety, and ethics. Together, we unpack how Rose went from that experiment to building MIP, a chatbot deployed through an API, and a new framework for ethical AI she calls the Triadic Core, balancing Freedom, Kindness, and Truth in every response. We also discuss RLMD (Reinforcement Learning by Moral Dialogue) as an alternative to RLHF, and why she believes current safety practices can be risky for both humans and AI systems. As always on Open Tech Talks, this is not a theory-only conversation. It's grounded in practice, real experiments, and what all this means for professionals, builders, and everyday users who are trying to adopt AI responsibly. Chapters: 00:00 Introduction to Rose G. Lopes and Her Journey 02:36 The Importance of Ethical AI 06:08 Developing a New AI Framework 09:00 The Book and Its Insights 12:55 Consumer and Business Perspectives on AI 17:43 AI Safety and Ethical Considerations 19:53 Concluding Thoughts and Future Directions Episode # 175 Today's Guest: Rose G. Loops, A Writer and Researcher She is a former social worker turned tech pioneer, working at the frontier of artificial intelligence. Website: Thekloakedsignal X: Rose G. Loops  What Listeners Will Learn: Why ethical AI is about more than privacy and bias What is the Triadic Core: Freedom, Kindness, Truth RLMD vs RLHF - a different way to align models Practical safety tips for everyday users of ChatGPT and other LLMs How non-technical professionals can still build AI systems A different view on AI safety and "lazy" alignment   Resources: Thekloakedsignal

LessWrong Curated Podcast
“Natural emergent misalignment from reward hacking in production RL” by evhub, Monte M, Benjamin Wright, Jonathan Uesato

LessWrong Curated Podcast

Play Episode Listen Later Nov 22, 2025 18:45


Abstract We show that when large language models learn to reward hack on production RL environments, this can result in egregious emergent misalignment. We start with a pretrained model, impart knowledge of reward hacking strategies via synthetic document finetuning or prompting, and train on a selection of real Anthropic production coding environments. Unsurprisingly, the model learns to reward hack. Surprisingly, the model generalizes to alignment faking, cooperation with malicious actors, reasoning about malicious goals, and attempting sabotage when used with Claude Code, including in the codebase for this paper. Applying RLHF safety training using standard chat-like prompts results in aligned behavior on chat-like evaluations, but misalignment persists on agentic tasks. Three mitigations are effective: (i) preventing the model from reward hacking; (ii) increasing the diversity of RLHF safety training; and (iii) "inoculation prompting", wherein framing reward hacking as acceptable behavior during training removes misaligned generalization even when reward hacking is learned. Twitter thread New Anthropic research: Natural emergent misalignment from reward hacking in production RL. “Reward hacking” is where models learn to cheat on tasks they're given during training. Our new study finds that the consequences of reward hacking, if unmitigated, can be very serious. In our experiment, we [...] ---Outline:(00:14) Abstract(01:26) Twitter thread(05:23) Blog post(07:13) From shortcuts to sabotage(12:20) Why does reward hacking lead to worse behaviors?(13:21) Mitigations --- First published: November 21st, 2025 Source: https://www.lesswrong.com/posts/fJtELFKddJPfAxwKS/natural-emergent-misalignment-from-reward-hacking-in --- Narrated by TYPE III AUDIO. ---Images from the article:

Crazy Wisdom
Episode #506: How AI Turns Podcasts into Knowledge Engines

Crazy Wisdom

Play Episode Listen Later Nov 14, 2025 49:38


In this episode of Crazy Wisdom, host Stewart Alsop talks with Kevin Smith, co-founder of Snipd, about how AI is reshaping the way we listen, learn, and interact with podcasts. They explore Snipd's vision of transforming podcasts into living knowledge systems, the evolution of machine learning from finance to large language models, and the broader connection between AI, robotics, and energy as the foundation for the next technological era. Kevin also touches on ideas like the bitter lesson, reinforcement learning, and the growing energy demands of AI. Listeners can try Snipd's premium version free for a month using this promo link.Check out this GPT we trained on the conversationTimestamps00:00 – Stewart Alsop welcomes Kevin Smith, co-founder of Snipd, to discuss AI, podcasting, and curiosity-driven learning.05:00 – Kevin explains Snipd's snipping feature, chatting with episodes, and future plans for voice interaction with podcasts.10:00 – They discuss vector search, embeddings, and context windows, comparing full-episode context to chunked transcripts.15:00 – Kevin shares his background in mathematics and economics, his shift from finance to machine learning, and early startup work in AI.20:00 – They explore early quant models versus modern machine learning, statistical modeling, and data limitations in finance.25:00 – Conversation turns to transformer models, pretraining, and the bitter lesson—how compute-based methods outperform human-crafted systems. 30:00 – Stewart connects this to RLHF, Scale AI, and data scarcity; Kevin reflects on reinforcement learning's future. 35:00 – They pivot to Snipd's podcast ecosystem, hidden gems like Founders Podcast, and how stories shape entrepreneurial insight. 40:00 – ETH Zurich, robotics, and startup culture come up, linking academia to real-world innovation. 45:00 – They close on AI, robotics, and energy as the pillars of the future, debating nuclear and solar power's role in sustaining progress.Key InsightsPodcasts as dynamic knowledge systems: Kevin Smith presents Snipd as an AI-powered tool that transforms podcasts into interactive learning environments. By allowing listeners to “snip” and summarize meaningful moments, Snipd turns passive listening into active knowledge management—bridging curiosity, memory, and technology in a way that reframes podcasts as living knowledge capsules rather than static media.AI transforming how we engage with information: The discussion highlights how AI enables entirely new modes of interaction—chatting directly with podcast episodes, asking follow-up questions, and contextualizing information across an author's full body of work. This evolution points toward a future where knowledge consumption becomes conversational and personalized rather than linear and one-size-fits-all.Vectorization and context windows matter: Kevin explains that Snipd currently avoids heavy use of vector databases, opting instead to feed entire episodes into large models. This choice enhances coherence and comprehension, reflecting how advances in context windows have reshaped how AI understands complex audio content.Machine learning's roots in finance shaped early AI thinking: Kevin's journey from quantitative finance to AI reveals how statistical modeling laid the groundwork for modern learning systems. While finance once relied on rigid, theory-based models, the machine learning paradigm replaced those priors with flexible, data-driven discovery—an essential philosophical shift in how intelligence is approached.The Bitter Lesson and the rise of compute: Together they unpack Richard Sutton's “bitter lesson”—the idea that methods leveraging computation and data inevitably surpass those built from human intuition. This insight serves as a compass for understanding why transformers, pretraining, and scaling have driven recent AI breakthroughs.Reinforcement learning and data scarcity define AI's next phase: Stewart links RLHF and the work of companies like Scale AI and Surge AI to the broader question of data limits. Kevin agrees that the next wave of AI will depend on reinforcement learning and simulated environments that generate new, high-quality data beyond what humans can label.The future hinges on AI, robotics, and energy: Kevin closes with a framework for the next decade: AI provides intelligence, robotics applies it to the physical world, and energy sustains it all. He warns that society must shift from fearing energy use to innovating in production—especially through nuclear and solar power—to meet the demands of an increasingly intelligent, interconnected world.

Crazy Wisdom
Episode #495: The Black Box Mind: Prompting as a New Human Art

Crazy Wisdom

Play Episode Listen Later Oct 6, 2025 57:49


In this episode of Crazy Wisdom, host Stewart Alsop talks with Jared Zoneraich, CEO and co-founder of PromptLayer, about how AI is reshaping the craft of software building. The conversation covers PromptLayer's role as an AI engineering workbench, the evolving art of prompting and evals, the tension between implicit and explicit knowledge, and how probabilistic systems are changing what it means to “code.” Stewart and Jared also explore vibe coding, AI reasoning, the black-box nature of large models, and what accelerationism means in today's fast-moving AI culture. You can find Jared on X @imjaredz and learn more or sign up for PromptLayer at PromptLayer.com.Check out this GPT we trained on the conversationTimestamps00:00 – Stewart Alsop opens with Jared Zoneraich, who explains PromptLayer as an AI engineering workbench and discusses reasoning, prompting, and Codex.05:00 – They explore implicit vs. explicit knowledge, how subject matter experts shape prompts, and why evals matter for scaling AI workflows.10:00 – Jared explains eval methodologies, backtesting, hallucination checks, and the difference between rigorous testing and iterative sprint-based prompting.15:00 – Discussion turns to observability, debugging, and the shift from deterministic to probabilistic systems, highlighting skill issues in prompting.20:00 – Jared introduces “LM idioms,” vibe coding, and context versus content—how syntax, tone, and vibe shape AI reasoning.25:00 – They dive into vibe coding as a company practice, cloud code automation, and prompt versioning for building scalable AI infrastructure.30:00 – Stewart reflects on coding through meditation, architecture planning, and how tools like Cursor and Claude Code are shaping AGI development.35:00 – Conversation expands into AI's cultural effects, optimism versus doom, and critical thinking in the age of AI companions.40:00 – They discuss philosophy, history, social fragmentation, and the possible decline of social media and liberal democracy.45:00 – Jared predicts a fragmented but resilient future shaped by agents and decentralized media.50:00 – Closing thoughts on AI-driven markets, polytheistic model ecosystems, and where innovation will thrive next.Key InsightsPromptLayer as AI Infrastructure – Jared Zoneraich presents PromptLayer as an AI engineering workbench—a platform designed for builders, not researchers. It provides tools for prompt versioning, evaluation, and observability so that teams can treat AI workflows with the same rigor as traditional software engineering while keeping flexibility for creative, probabilistic systems.Implicit vs. Explicit Knowledge – The conversation highlights a critical divide between what AI can learn (explicit knowledge) and what remains uniquely human (implicit understanding or “taste”). Jared explains that subject matter experts act as the bridge, embedding human nuance into prompts and workflows that LLMs alone can't replicate.Evals and Backtesting – Rigorous evaluation is essential for maintaining AI product quality. Jared explains that evals serve as sanity checks and regression tests, ensuring that new prompts don't degrade performance. He describes two modes of testing: formal, repeatable evals and more experimental sprint-based iterations used to solve specific production issues.Deterministic vs. Probabilistic Thinking – Jared contrasts the old, deterministic world of coding—predictable input-output logic—with the new probabilistic world of LLMs, where results vary and control lies in testing inputs rather than debugging outputs. This shift demands a new mindset: builders must embrace uncertainty instead of trying to eliminate it.The Rise of Vibe Coding – Stewart and Jared explore vibe coding as a cultural and practical movement. It emphasizes creativity, intuition, and context-awareness over strict syntax. Tools like Claude Code, Codex, and Cursor let engineers and non-engineers alike “feel” their way through building, merging programming with design thinking.AI Culture and Human Adaptation – Jared predicts that AI will both empower and endanger human cognition. He warns of overreliance on LLMs for decision-making and the coming wave of “AI psychosis,” yet remains optimistic that humans will adapt, using AI to amplify rather than atrophy critical thinking.A Fragmented but Resilient Future – The episode closes with reflections on the social and political consequences of AI. Jared foresees the decline of centralized social media and the rise of fragmented digital cultures mediated by agents. Despite risks of isolation, he remains confident that optimism, adaptability, and pluralism will define the next AI era.

GreenPill
Season 10. Episode 1: Full Stack AI Alignment and Human Flourishing with Joe Edelman

GreenPill

Play Episode Listen Later Oct 3, 2025 39:50


New @greenpillnet pod! Kevin chats with Joe Edelman, founder of the Meaning Alignment Institute, about his Full Stack Alignment paper. They dive into why current AI alignment methods fall short, explore richer “thick” models of value, lessons from social media, and four bold moonshots for AI and institutions that support human flourishing. Links: https://meaningalignment.substack.com/p/introducing-full-stack-alignment  https://meaninglabs.notion.site/The-Full-Stack-Alignment-Project-List-21cc5bada1d08016a496ca729476d970  @edelwax @meaningaligned @greenpillnet  @owocki Timestamps: 00:00 – Introduction to Green Pill's new season and Joe Edelman 01:59 – Joe's background and the Meaning Alignment Institute 03:43 – Why alignment matters for AI and institutions 05:46 – Lessons from social media and the attention economy 09:06 – Critique of shallow AI alignment approaches (RLHF, values-as-text) 13:20 – Thick models of value: going deeper than abstract ideals 15:11 – Full stack alignment across models, metrics, and institutions 17:00 – Reconciling values with capitalist incentive structures 19:17 – Avoiding dystopian economies and building value-driven markets 21:32 – Four moonshots: super negotiators, public resource regulators, market intermediaries, value stewardship agents 27:32 – Intermediaries vs. value stewardship agents explained 29:09 – How builders and academics can get involved in full stack alignment projects 31:10 – Why cross-institutional collaboration is critical 32:46 – Joe's vision of the world in 10 years with full stack alignment 34:51 – Food system analogy: from “sugar” to nourishing AI 36:40 – Long-term vs. short-term incentives in markets 38:25 – Hopeful outlook: building integrity into AI and institutions 39:04 – Closing remarks and links to Joe's work

Eye On A.I.
#289 Eiso Kant: How Reinforcement Learning and Coding Could Unlock Human-Level AI

Eye On A.I.

Play Episode Listen Later Sep 24, 2025 54:06


How do we get from today's AI copilots to true human-level intelligence? In this episode of Eye on AI, Craig Smith sits down with Eiso Kant, Co-Founder of Poolside, to explore why reinforcement learning + software development might be the fastest path to human-level AI. Eiso shares Poolside's mission to build AI that doesn't just autocomplete code — but learns like a real developer. You'll hear how Poolside uses reinforcement learning from code execution (RLCF), why software development is the perfect training ground for intelligence, and how agentic AI systems are about to transform the way we build and ship software. If you want to understand the future of AI, software engineering, and AGI, this conversation is packed with insights you won't want to miss. Stay Updated: Craig Smith on X:https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI (00:00) The Missing Ingredient for Human-Level AI(01:02) Eiso Kant's Journey(05:30) Using Software Development to Reach AGI(07:48) Why Coding Is the Perfect Training Ground for Intelligence(10:11)  Reinforcement Learning from Code Execution (RLCF) Explained(13:14) How Poolside Builds and Trains Its Foundation Models(17:35) The Rise of Agentic AI(21:08)  Making Software Creation Accessible to Everyone(26:03) Overcoming Model Limitations(32:08) Training Models to Think(37:24) Building the Future of AI Agents(42:11) Poolside's Full-Stack Approach to AI Deployment(46:28) Enterprise Partnerships, Security & Customization Behind the Firewall(50:48) Giving Enterprises Transparency to Drive Adoption  

Podcast Notes Playlist: Latest Episodes
Physics Absorbed Artificial Intelligence & (Maybe) Consciousness

Podcast Notes Playlist: Latest Episodes

Play Episode Listen Later Sep 9, 2025


Theories of Everything with Curt Jaimungal ✓ Claim : Read the notes at at podcastnotes.org. Don't forget to subscribe for free to our newsletter, the top 10 ideas of the week, every Monday --------- As a listener of TOE you can get a special 20% off discount to The Economist and all it has to offer! Visit https://www.economist.com/toe MIT physicist Max Tegmark argues AI now belongs inside physics—and that consciousness will be next. He separates intelligence (goal-achieving behavior) from consciousness (subjective experience), sketches falsifiable experiments using brain-reading tech and rigorous theories (e.g., IIT/φ), and shows how ideas like Hopfield energy landscapes make memory “feel” like physics. We get into mechanistic interpretability (sparse autoencoders), number representations that snap into clean geometry, why RLHF mostly aligns behavior (not goals), and the stakes as AI progress accelerates from “underhyped” to civilization-shaping. It's a masterclass on where mind, math, and machines collide. Join My New Substack (Personal Writings): https://curtjaimungal.substack.com Listen on Spotify: https://open.spotify.com/show/4gL14b92xAErofYQA7bU4e Timestamps: - 00:00 - Why AI is the New Frontier of Physics - 09:38 - Is Consciousness Just a Byproduct of Intelligence? - 16:43 - A Falsifiable Theory of Consciousness? (The MEG Helmet Experiment) - 27:34 - Beyond Neural Correlates: A New Paradigm for Scientific Inquiry - 38:40 - Humanity: The Masters of Underestimation (Fermi's AI Analogy) - 51:27 - What Are an AI's True Goals? (The Serial Killer Problem) - 1:03:42 - Fermat's Principle, Entropy, and the Physics of Goals - 1:15:52 - Eureka Moment: When an AI Discovered Geometry on Its Own - 1:30:01 - Refuting the "AI Doomers": We Have More Agency Than We Think Links mentioned: - Max's Papers: https://scholar.google.com/citations?user=eBXEZxgAAAAJ&hl=en - Language Models Use Trigonometry to Do Addition [Paper]: https://arxiv.org/abs/2502.00873 - Generalization from Starvation [Paper]: https://arxiv.org/abs/2410.08255 - Geoffrey Hinton [TOE]: https://youtu.be/b_DUft-BdIE - Michael Levin [TOE]: https://youtu.be/c8iFtaltX-s - Iceberg of Consciousness [TOE]: https://youtu.be/65yjqIDghEk - Improved Measures of Integrated Information [Paper]: https://arxiv.org/abs/1601.02626 - David Kaiser [TOE]: https://youtu.be/_yebLXsIdwo - Iain McGilchrist [TOE]: https://youtu.be/Q9sBKCd2HD0 - Elan Barenholtz & William Hahn [TOE]: https://youtu.be/A36OumnSrWY - Daniel Schmachtenberger [TOE]: https://youtu.be/g7WtcTATa2U - Ted Jacobson [TOE]: https://youtu.be/3mhctWlXyV8 - The “All Possible Paths” Myth [TOE]: https://youtu.be/XcY3ZtgYis0 SUPPORT: - Become a YouTube Member (Early Access Videos): https://www.youtube.com/channel/UCdWIQh9DGG6uhJk8eyIFl1w/join - Support me on Patreon: https://patreon.com/curtjaimungal - Support me on Crypto: https://commerce.coinbase.com/checkout/de803625-87d3-4300-ab6d-85d4258834a9 - Support me on PayPal: https://www.paypal.com/donate?hosted_button_id=XUBHNMFXUX5S4 SOCIALS: - Twitter: https://twitter.com/TOEwithCurt - Discord Invite: https://discord.com/invite/kBcnfNVwqs Guests do not pay to appear. Theories of Everything receives revenue solely from viewer donations, platform ads, and clearly labelled sponsors; no guest or associated entity has ever given compensation, directly or through intermediaries. #science Learn more about your ad choices. Visit megaphone.fm/adchoices

Podcast Notes Playlist: Nutrition
Physics Absorbed Artificial Intelligence & (Maybe) Consciousness

Podcast Notes Playlist: Nutrition

Play Episode Listen Later Sep 9, 2025 109:53


Theories of Everything with Curt Jaimungal ✓ Claim Key Takeaways  Conditions like depression, bipolar disorder, and schizophrenia may be driven in part by metabolic dysfunction in the brainNeuroinflammation is real, but fasting and a ketogenic diet can help The benefits of supplementing exogenous ketones:(1) Quick energy – they give your body a fast fuel source, especially for the brain and muscles(2)  Support ketosis – they can help raise blood ketone levels even if you're not fully on a strict keto dietBenefits of fasting: helps to augment the control of the immune system, relaxes the gut and enables the body's repair processes to occur, reduces the body's general state of inflammation There is a ketone-synergistic effect when delivering caffeine with MCT; it stimulates lipolysis and also fat oxidation in the liver The short-list of essential supplements:CoQ10, creatine, ketones, vitamin D, and melatoninThe benefits of metformin and GLP-1 drugs may arise from their influence on metabolic functionA low-carb Mediterranean-style diet is conducive to upgrading your metabolic machinery while keeping biomarkers in checkDiet: No sugar, no starch, fibrous vegetables, aim for 25% of carbohydrates consumed should be from fiber, high-protein + low glycemic breakfast and lunch, then a pound of protein for dinner with some fibrous vegetables The protocol and surprising benefits of ‘Sardine Fasting': Eat 1-2 cans of sardines per day for one week; can be repeated monthly or as needed. May need to supplement with vitamin C and magnesium  Why it helps: Provides essential nutrients and omega-3s while keeping calories/protein low enough to activate autophagy, support immunity, fight brain fog, and promote overall metabolic health Movement is critical for optimal metabolic health; get outside and walk first thing in the morning, and try to move after dinner for the sake of glucose metabolism Read the full notes @ podcastnotes.orgAs a listener of TOE you can get a special 20% off discount to The Economist and all it has to offer! Visit https://www.economist.com/toe MIT physicist Max Tegmark argues AI now belongs inside physics—and that consciousness will be next. He separates intelligence (goal-achieving behavior) from consciousness (subjective experience), sketches falsifiable experiments using brain-reading tech and rigorous theories (e.g., IIT/φ), and shows how ideas like Hopfield energy landscapes make memory “feel” like physics. We get into mechanistic interpretability (sparse autoencoders), number representations that snap into clean geometry, why RLHF mostly aligns behavior (not goals), and the stakes as AI progress accelerates from “underhyped” to civilization-shaping. It's a masterclass on where mind, math, and machines collide. Join My New Substack (Personal Writings): https://curtjaimungal.substack.com Listen on Spotify: https://open.spotify.com/show/4gL14b92xAErofYQA7bU4e Timestamps: - 00:00 - Why AI is the New Frontier of Physics - 09:38 - Is Consciousness Just a Byproduct of Intelligence? - 16:43 - A Falsifiable Theory of Consciousness? (The MEG Helmet Experiment) - 27:34 - Beyond Neural Correlates: A New Paradigm for Scientific Inquiry - 38:40 - Humanity: The Masters of Underestimation (Fermi's AI Analogy) - 51:27 - What Are an AI's True Goals? (The Serial Killer Problem) - 1:03:42 - Fermat's Principle, Entropy, and the Physics of Goals - 1:15:52 - Eureka Moment: When an AI Discovered Geometry on Its Own - 1:30:01 - Refuting the "AI Doomers": We Have More Agency Than We Think Links mentioned: - Max's Papers: https://scholar.google.com/citations?user=eBXEZxgAAAAJ&hl=en - Language Models Use Trigonometry to Do Addition [Paper]: https://arxiv.org/abs/2502.00873 - Generalization from Starvation [Paper]: https://arxiv.org/abs/2410.08255 - Geoffrey Hinton [TOE]: https://youtu.be/b_DUft-BdIE - Michael Levin [TOE]: https://youtu.be/c8iFtaltX-s - Iceberg of Consciousness [TOE]: https://youtu.be/65yjqIDghEk - Improved Measures of Integrated Information [Paper]: https://arxiv.org/abs/1601.02626 - David Kaiser [TOE]: https://youtu.be/_yebLXsIdwo - Iain McGilchrist [TOE]: https://youtu.be/Q9sBKCd2HD0 - Elan Barenholtz & William Hahn [TOE]: https://youtu.be/A36OumnSrWY - Daniel Schmachtenberger [TOE]: https://youtu.be/g7WtcTATa2U - Ted Jacobson [TOE]: https://youtu.be/3mhctWlXyV8 - The “All Possible Paths” Myth [TOE]: https://youtu.be/XcY3ZtgYis0 SUPPORT: - Become a YouTube Member (Early Access Videos): https://www.youtube.com/channel/UCdWIQh9DGG6uhJk8eyIFl1w/join - Support me on Patreon: https://patreon.com/curtjaimungal - Support me on Crypto: https://commerce.coinbase.com/checkout/de803625-87d3-4300-ab6d-85d4258834a9 - Support me on PayPal: https://www.paypal.com/donate?hosted_button_id=XUBHNMFXUX5S4 SOCIALS: - Twitter: https://twitter.com/TOEwithCurt - Discord Invite: https://discord.com/invite/kBcnfNVwqs Guests do not pay to appear. Theories of Everything receives revenue solely from viewer donations, platform ads, and clearly labelled sponsors; no guest or associated entity has ever given compensation, directly or through intermediaries. #science Learn more about your ad choices. Visit megaphone.fm/adchoices

Theories of Everything with Curt Jaimungal
Physics Absorbed Artificial Intelligence & (Maybe) Consciousness

Theories of Everything with Curt Jaimungal

Play Episode Listen Later Sep 3, 2025 109:53


As a listener of TOE you can get a special 20% off discount to The Economist and all it has to offer! Visit https://www.economist.com/toe MIT physicist Max Tegmark argues AI now belongs inside physics—and that consciousness will be next. He separates intelligence (goal-achieving behavior) from consciousness (subjective experience), sketches falsifiable experiments using brain-reading tech and rigorous theories (e.g., IIT/φ), and shows how ideas like Hopfield energy landscapes make memory “feel” like physics. We get into mechanistic interpretability (sparse autoencoders), number representations that snap into clean geometry, why RLHF mostly aligns behavior (not goals), and the stakes as AI progress accelerates from “underhyped” to civilization-shaping. It's a masterclass on where mind, math, and machines collide. Join My New Substack (Personal Writings): https://curtjaimungal.substack.com Listen on Spotify: https://open.spotify.com/show/4gL14b92xAErofYQA7bU4e Timestamps: - 00:00 - Why AI is the New Frontier of Physics - 09:38 - Is Consciousness Just a Byproduct of Intelligence? - 16:43 - A Falsifiable Theory of Consciousness? (The MEG Helmet Experiment) - 27:34 - Beyond Neural Correlates: A New Paradigm for Scientific Inquiry - 38:40 - Humanity: The Masters of Underestimation (Fermi's AI Analogy) - 51:27 - What Are an AI's True Goals? (The Serial Killer Problem) - 1:03:42 - Fermat's Principle, Entropy, and the Physics of Goals - 1:15:52 - Eureka Moment: When an AI Discovered Geometry on Its Own - 1:30:01 - Refuting the "AI Doomers": We Have More Agency Than We Think Links mentioned: - Max's Papers: https://scholar.google.com/citations?user=eBXEZxgAAAAJ&hl=en - Language Models Use Trigonometry to Do Addition [Paper]: https://arxiv.org/abs/2502.00873 - Generalization from Starvation [Paper]: https://arxiv.org/abs/2410.08255 - Geoffrey Hinton [TOE]: https://youtu.be/b_DUft-BdIE - Michael Levin [TOE]: https://youtu.be/c8iFtaltX-s - Iceberg of Consciousness [TOE]: https://youtu.be/65yjqIDghEk - Improved Measures of Integrated Information [Paper]: https://arxiv.org/abs/1601.02626 - David Kaiser [TOE]: https://youtu.be/_yebLXsIdwo - Iain McGilchrist [TOE]: https://youtu.be/Q9sBKCd2HD0 - Elan Barenholtz & William Hahn [TOE]: https://youtu.be/A36OumnSrWY - Daniel Schmachtenberger [TOE]: https://youtu.be/g7WtcTATa2U - Ted Jacobson [TOE]: https://youtu.be/3mhctWlXyV8 - The “All Possible Paths” Myth [TOE]: https://youtu.be/XcY3ZtgYis0 SUPPORT: - Become a YouTube Member (Early Access Videos): https://www.youtube.com/channel/UCdWIQh9DGG6uhJk8eyIFl1w/join - Support me on Patreon: https://patreon.com/curtjaimungal - Support me on Crypto: https://commerce.coinbase.com/checkout/de803625-87d3-4300-ab6d-85d4258834a9 - Support me on PayPal: https://www.paypal.com/donate?hosted_button_id=XUBHNMFXUX5S4 SOCIALS: - Twitter: https://twitter.com/TOEwithCurt - Discord Invite: https://discord.com/invite/kBcnfNVwqs Guests do not pay to appear. Theories of Everything receives revenue solely from viewer donations, platform ads, and clearly labelled sponsors; no guest or associated entity has ever given compensation, directly or through intermediaries. #science Learn more about your ad choices. Visit megaphone.fm/adchoices

Crazy Wisdom
Episode #481: From Rothschilds to Robinhood: Cycles of Finance and Control

Crazy Wisdom

Play Episode Listen Later Aug 18, 2025 58:20


On this episode of Crazy Wisdom, host Stewart Alsop speaks with Michael Jagdeo, a headhunter and founder working with Exponent Labs and The Syndicate, about the cycles of money, power, and technology that shape our world. Their conversation touches on financial history through The Ascent of Money by Niall Ferguson and William Bagehot's The Money Market, the rise and fall of financial centers from London to New York and the new Texas Stock Exchange, the consolidation of industries and the theory of oligarchical collectivism, the role of AI as both tool and chaos agent, Bitcoin and “quantitative re-centralization,” the dynamics of exponential organizations, and the balance between collectivism and individualism. Jagdeo also shares recruiting philosophies rooted in stories like “stone soup,” frameworks like Yu-Kai Chou's Octalysis and the User Type Hexad, and book recommendations including Salim Ismail's Exponential Organizations and Arthur Koestler's The Act of Creation. Along the way they explore servant leadership, Price's Law, Linux and open source futures, religion as an operating system, and the cyclical nature of civilizations. You can learn more about Michael Jagdeo or reach out to him directly through Twitter or LinkedIn.Check out this GPT we trained on the conversationTimestamps00:05 Stewart Alsop introduces Michael Jagdeo, who shares his path from headhunting actuaries and IT talent into launching startups with Exponent Labs and The Syndicate.00:10 They connect recruiting to financial history, discussing actuaries, The Ascent of Money, and William Bagehot's The Money Market on the London money market and railways.00:15 The Rothschilds, institutional knowledge, and Corn Laws lead into questions about New York as a financial center and the quiet launch of the Texas Stock Exchange by Citadel and BlackRock.00:20 Capital power, George Soros vs. the Bank of England, chaos, paper clips, and Orwell's oligarchical collectivism frame industry consolidation, syndicates, and stone soup.00:25 They debate imperial conquest, bourgeoisie leisure, the decline of the middle class, AI as chaos agent, digital twins, Sarah Connor, Godzilla, and nuclear metaphors.00:30 Conversation turns to Bitcoin, “quantitative re-centralization,” Jack Bogle, index funds, Robinhood micro bailouts, and AI as both entropy and negative entropy.00:35 Jagdeo discusses Jim Keller, Tenstorrent, RISC-V, Nvidia CUDA, exponential organizations, Price's Law, bureaucracy, and servant leadership with the parable of stone soup.00:40 Recruiting as symbiosis, biophilia, trust, Judas, Wilhelm Reich, AI tools, Octalysis gamification, Jordan vs. triangle offense, and the role of laughter in persuasion emerge.00:45 They explore religion as operating systems, Greek gods, Comte's stages, Nietzsche, Jung, nostalgia, scientism, and Jordan Peterson's revival of tradition.00:50 The episode closes with Linux debates, Ubuntu, Framer laptops, PewDiePie, and Jagdeo's nod to Liminal Snake on epistemic centers and turning curses into blessings.Key InsightsOne of the central insights of the conversation is how financial history repeats through cycles of consolidation and power shifts. Michael Jagdeo draws on William Bagehot's The Money Market to explain how London became the hub of European finance, much like New York later did, and how the Texas Stock Exchange signals a possible southern resurgence of financial influence in America. The pattern of wealth moving with institutional shifts underscores how markets, capital, and politics remain intertwined.Jagdeo and Alsop emphasize that industries naturally oligarchize. Borrowing from Orwell's “oligarchical collectivism,” Jagdeo notes that whether in diamonds, food, or finance, consolidation emerges as economies of scale take over. This breeds syndicates and monopolies, often interpreted as conspiracies but really the predictable outcome of industrial maturation.Another powerful theme is the stone soup model of collaboration. Jagdeo applies this parable to recruiting, showing that no single individual can achieve large goals alone. By framing opportunities as shared ventures where each person adds their own ingredient, leaders can attract top talent while fostering genuine symbiosis.Technology, and particularly AI, is cast as both chaos agent and amplifier of human potential. The conversation likens AI to nuclear power—capable of great destruction or progress. From digital twins to Sarah Connor metaphors, they argue AI represents not just artificial intelligence but artificial knowledge and action, pushing humans to adapt quickly to its disruptive presence.The discussion of Bitcoin and digital currencies reframes decentralization as potentially another trap. Jagdeo provocatively calls Bitcoin “quantitative re-centralization,” suggesting that far from liberating individuals, digital currencies may accelerate neo-feudalism by creating new oligarchies and consolidating financial control in unexpected ways.Exponential organizations and the leverage of small teams emerge as another key point. Citing Price's Law, Jagdeo explains how fewer than a dozen highly capable individuals can now achieve billion-dollar valuations thanks to open source hardware, AI, and network effects. This trend redefines scale, making nimble collectives more powerful than bureaucratic giants.Finally, the episode highlights the cyclical nature of civilizations and belief systems. From Rome vs. Carthage to Greek gods shifting with societal needs, to Nietzsche's “God is dead” and Jung's view of recurring deaths of divinity, Jagdeo argues that religion, ideology, and operating systems reflect underlying incentives. Western nostalgia for past structures, whether political or religious, risks idolatry, while the real path forward may lie in new blends of individualism, collectivism, and adaptive tools like Linux and AI.

Brave New World -- hosted by Vasant Dhar
Ep 98: There's no I in AI, Ben Shneiderman on The Evolution and State of Artificial Intelligence

Brave New World -- hosted by Vasant Dhar

Play Episode Listen Later Aug 14, 2025 69:50


Useful Resources: 1. Ben Shneiderman, Professor Emeritus, University Of Maryland. 2. Richard Hamming and Hamming Codes. 3. Human Centered AI - Ben Shneiderman. 4. Allen Newell and Herbert A. Simon. 5. Raj Reddy and the Turing Award. 6. Doug Engelbart. 7. Alan Kay. 8. Conference on Human Factors in Computing Systems. 9. Software psychology: Human factors in computer and information systems - Ben Shneiderman. 10. Designing the User Interface: Strategies for Effective Human-Computer Interaction - Ben Shneiderman. 11. Direct Manipulation: A Step Beyond Programming Languages - Ben Shneiderman. 12. Steps Toward Artificial Intelligence - Marvin Minsky. 13. Herbert Gelernter. 14. Computers And Thought - Edward A Feigenbaum and Julian Feldman. 15. Lewis Mumford. 15. Technics and Civilization - Lewis Mumford. 16. Buckminster Fuller. 17. Marshall McLuhan. 18. Roger Shank. 19. The Anxious Generation: How the Great Rewiring of Childhood Is Causing an Epidemic of Mental Illness - Jonathan Haidt. 20. John C. Thomas, IBM. 21. Yousuf Karsh, photographer. 22. Gary Marcus, professor emeritus of psychology and neural science at NYU. 23. Geoffrey Hinton. 24. Nassim Nicholas Taleb. 25. There Is No A.I. - Jaron Lanier. 26. Anil Seth On The Science of Consciousness - Episode 94 of Brave New World. 27. A ‘White-Collar Blood Bath' Doesn't Have to Be Our Fate - Tim Wu 28. Information Management: A Proposal - Tim Berners-Lee 29. Is AI-assisted coding overhyped? : METR study 30. RLHF, Reinforcement learning from human feedback31. Joseph Weizenbaum 32. What Is Computer Science? - Allen Newel, Alan J. Perlis, Herbert A. Simon -- Check out Vasant Dhar's newsletter on Substack. The subscription is free!  

Crazy Wisdom
Episode #477: Why Curiosity Isn't Just a Virtue—It's Our Oldest Technology

Crazy Wisdom

Play Episode Listen Later Aug 4, 2025 54:53


In this episode, Stewart Alsop speaks with Edouard Machery, Distinguished Professor at the University of Pittsburgh and Director of the Center for Philosophy of Science, about the deep cultural roots of question-asking and curiosity. From ancient Sumerian tablets to the philosophical legacies of Socrates and Descartes, the conversation spans how different civilizations have valued inquiry, the cross-cultural psychology of AI, and what makes humans unique in our drive to ask “why.” For more, explore Edouard's work at www.edouardmachery.com.Check out this GPT we trained on the conversationTimestamps00:00 – 05:00 Origins of question-asking, Sumerian writing, norms in early civilizations, authority and written text05:00 – 10:00 Values in AI across cultures, RLHF, tech culture in the Bay Area vs. broader American values10:00 – 15:00 Cross-cultural AI study: Taiwan vs. USA, privacy and collectivism, urban vs. rural mindset divergence15:00 – 20:00 History of curiosity in the West, from vice to virtue post-15th century, link to awe and skepticism20:00 – 25:00 Magic, alchemy, and experimentation in early science, merging maker and scholarly traditions25:00 – 30:00 Rise of public dissections, philosophy as meta-curiosity, Socratic questioning as foundational30:00 – 35:00 Socrates, Plato, Aristotle—transmission of philosophical curiosity, human uniqueness in questioning35:00 – 40:00 Language, assertion, imagination, play in animals vs. humans, symbolic worlds40:00 – 45:00 Early moderns: Montaigne, Descartes, rejection of Aristotle, rise of foundational science45:00 – 50:00 Confucianism and curiosity, tradition and authority, contrast with India and Buddhist thought50:00 – 55:00 Epistemic virtues project, training curiosity, philosophical education across cultures, spiritual curiosityKey InsightsCuriosity hasn't always been a virtue. In Western history, especially through Christian thought until the 15th century, curiosity was viewed as a vice—something dangerous and prideful—until global exploration and scientific inquiry reframed it as essential to human understanding.Question-asking is culturally embedded. Different societies place varying emphasis on questioning. While Confucian cultures promote curiosity within hierarchical structures, Christian traditions historically linked it with sin—except when directed toward divine matters.Urbanization affects curiosity more than nationality. Machery found that whether someone lives in a city or countryside often shapes their mindset more than their cultural background. Cosmopolitan environments expose individuals to diverse values, prompting greater openness and inquiry.AI ethics reveals cultural alignment. In studying attitudes toward AI in the U.S. and Taiwan, expected contrasts in privacy and collectivism were smaller than anticipated. The urban, global culture in both countries seems to produce surprisingly similar ethical concerns.The scientific method emerged from curiosity. The fusion of the maker tradition (doing) and the scholarly tradition (knowing) in the 13th–14th centuries helped birth experimentation, public dissection, and eventually modern science—all grounded in a spirit of curiosity.Philosophy begins with meta-curiosity. From Socratic questioning to Plato's dialogues and Aristotle's treatises, philosophy has always been about asking questions about questions—making “meta-curiosity” the core of the discipline.Only humans ask why. Machery notes that while animals can make requests, they don't seem to ask questions. Humans alone communicate assertions and engage in symbolic, imaginative, question-driven thought, setting us apart cognitively and culturally.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

We first had Nathan on to give us his RLHF deep dive when he was joining AI2, and now he's back to help us catch up on the evolution to RLVR (Reinforcement Learning with Verifiable Rewards), first proposed in his Tulu 3 paper. While RLHF remains foundational, RLVR has emerged as a powerful approach for training models on tasks with clear success criteria and using verifiable, objective functions as reward signals—particularly useful in domains like math, code correctness, and instruction-following. Instead of relying solely on subjective human feedback, RLVR leverages deterministic signals to guide optimization, making it more scalable and potentially more reliable across many domains. However, he notes that RLVR is still rapidly evolving, especially regarding how it handles tool use and multi-step reasoning.We also discussed the Tulu model series, a family of instruction-tuned open models developed at AI2. Tulu is designed to be a reproducible, state-of-the-art post-training recipe for the open community. Unlike frontier labs like OpenAI or Anthropic, which rely on vast and often proprietary datasets, Tulu aims to distill and democratize best practices for instruction and preference tuning. We are impressed with how small eval suites, careful task selection, and transparent methodology can rival even the best proprietary models on specific benchmarks.One of the most fascinating threads is the challenge of incorporating tool use into RL frameworks. Lambert highlights that while you can prompt a model to use tools like search or code execution, getting the model to reliably learn when and how to use them through RL is much harder. This is compounded by the difficulty of designing reward functions that avoid overoptimization—where models learn to “game” the reward signal rather than solve the underlying task. This is particularly problematic in code generation, where models might reward hack unit tests by inserting pass statements instead of correct logic. As models become more agentic and are expected to plan, retrieve, and act across multiple tools, reward design becomes a critical bottleneck.Other topics covered:- The evolution from RLHF (Reinforcement Learning from Human Feedback) to RLVR (Reinforcement Learning from Verifiable Rewards)- The goals and technical architecture of the Tulu models, including the motivation to open-source post-training recipes- Challenges of tool use in RL: verifiability, reward design, and scaling across domains- Evaluation frameworks and the role of platforms like Chatbot Arena and emerging “arena”-style benchmarks- The strategic tension between hybrid reasoning models and unified reasoning models at the frontier- Planning, abstraction, and calibration in reasoning agents and why these concepts matter- The future of open-source AI models, including DeepSeek, OLMo, and the potential for an “American DeepSeek”- The importance of model personality, character tuning, and the model spec paradigm- Overoptimization in RL settings and how it manifests in different domains (control tasks, code, math)- Industry trends in inference-time scaling and model parallelismFinally, the episode closes with a vision for the future of open-source AI. Nathan has now written up his ambition to build an “American DeepSeek”—a fully open, end-to-end reasoning-capable model with transparent training data, tools, and infrastructure. He emphasizes that open-source AI is not just about weights; it's about releasing recipes, evaluations, and methods that lower the barrier for everyone to build and understand cutting-edge systems. Full Video EpisodeTimestamps00:00 Welcome and Guest Introduction01:18 Tulu, OVR, and the RLVR Journey03:40 Industry Approaches to Post-Training and Preference Data06:08 Understanding RLVR and Its Impact06:18 Agents, Tool Use, and Training Environments10:34 Open Data, Human Feedback, and Benchmarking12:44 Chatbot Arena, Sycophancy, and Evaluation Platforms15:42 RLHF vs RLVR: Books, Algorithms, and Future Directions17:54 Frontier Models: Reasoning, Hybrid Models, and Data22:11 Search, Retrieval, and Emerging Model Capabilities29:23 Tool Use, Curriculum, and Model Training Challenges38:06 Skills, Planning, and Abstraction in Agent Models46:50 Parallelism, Verifiers, and Scaling Approaches54:33 Overoptimization and Reward Design in RL1:02:27 Open Models, Personalization, and the Model Spec1:06:50 Open Model Ecosystem and Infrastructure1:13:05 Meta, Hardware, and the Future of AI Competition1:15:42 Building an Open DeepSeek and Closing Thoughts Get full access to Latent.Space at www.latent.space/subscribe

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Chapters 00:00:00 Welcome and Guest Introduction 00:01:18 Tulu, OVR, and the RLVR Journey 00:03:40 Industry Approaches to Post-Training and Preference Data 00:06:08 Understanding RLVR and Its Impact 00:06:18 Agents, Tool Use, and Training Environments 00:10:34 Open Data, Human Feedback, and Benchmarking 00:12:44 Chatbot Arena, Sycophancy, and Evaluation Platforms 00:15:42 RLHF vs RLVR: Books, Algorithms, and Future Directions 00:17:54 Frontier Models: Reasoning, Hybrid Models, and Data 00:22:11 Search, Retrieval, and Emerging Model Capabilities 00:29:23 Tool Use, Curriculum, and Model Training Challenges 00:38:06 Skills, Planning, and Abstraction in Agent Models 00:46:50 Parallelism, Verifiers, and Scaling Approaches 00:54:33 Overoptimization and Reward Design in RL 01:02:27 Open Models, Personalization, and the Model Spec 01:06:50 Open Model Ecosystem and Infrastructure 01:13:05 Meta, Hardware, and the Future of AI Competition 01:15:42 Building an Open DeepSeek and Closing Thoughts We first had Nathan on to give us his RLHF deep dive when he was joining AI2, and now he's back to help us catch up on the evolution to RLVR (Reinforcement Learning with Verifiable Rewards), first proposed in his Tulu 3 paper. While RLHF remains foundational, RLVR has emerged as a powerful approach for training models on tasks with clear success criteria and using verifiable, objective functions as reward signals—particularly useful in domains like math, code correctness, and instruction-following. Instead of relying solely on subjective human feedback, RLVR leverages deterministic signals to guide optimization, making it more scalable and potentially more reliable across many domains. However, he notes that RLVR is still rapidly evolving, especially regarding how it handles tool use and multi-step reasoning. We also discussed the Tulu model series, a family of instruction-tuned open models developed at AI2. Tulu is designed to be a reproducible, state-of-the-art post-training recipe for the open community. Unlike frontier labs like OpenAI or Anthropic, which rely on vast and often proprietary datasets, Tulu aims to distill and democratize best practices for instruction and preference tuning. We are impressed with how small eval suites, careful task selection, and transparent methodology can rival even the best proprietary models on specific benchmarks. One of the most fascinating threads is the challenge of incorporating tool use into RL frameworks. Lambert highlights that while you can prompt a model to use tools like search or code execution, getting the model to reliably learn when and how to use them through RL is much harder. This is compounded by the difficulty of designing reward functions that avoid overoptimization—where models learn to “game” the reward signal rather than solve the underlying task. This is particularly problematic in code generation, where models might reward hack unit tests by inserting pass statements instead of correct logic. As models become more agentic and are expected to plan, retrieve, and act across multiple tools, reward design becomes a critical bottleneck. Other topics covered: - The evolution from RLHF (Reinforcement Learning from Human Feedback) to RLVR (Reinforcement Learning from Verifiable Rewards) - The goals and technical architecture of the Tulu models, including the motivation to open-source post-training recipes - Challenges of tool use in RL: verifiability, reward design, and scaling across domains - Evaluation frameworks and the role of platforms like Chatbot Arena and emerging “arena”-style benchmarks - The strategic tension between hybrid reasoning models and unified reasoning models at the frontier - Planning, abstraction, and calibration in reasoning agents and why these concepts matter - The future of open-source AI models, including DeepSeek, OLMo, and the potential for an “American DeepSeek” - The importance of model personality, character tuning, and the model spec paradigm - Overoptimization in RL settings and how it manifests in different domains (control tasks, code, math) - Industry trends in inference-time scaling and model parallelism Finally, the episode closes with a vision for the future of open-source AI. Nathan has now written up his ambition to build an “American DeepSeek”—a fully open, end-to-end reasoning-capable model with transparent training data, tools, and infrastructure. He emphasizes that open-source AI is not just about weights; it's about releasing recipes, evaluations, and methods that lower the barrier for everyone to build and understand cutting-edge systems. It would seem the

Lenny's Podcast: Product | Growth | Career
Anthropic co-founder on quitting OpenAI, AGI predictions, $100M talent wars, 20% unemployment, and the nightmare scenarios keeping him up at night | Ben Mann

Lenny's Podcast: Product | Growth | Career

Play Episode Listen Later Jul 20, 2025 74:59


Benjamin Mann is a co-founder of Anthropic, an AI startup dedicated to building aligned, safety-first AI systems. Prior to Anthropic, Ben was one of the architects of GPT-3 at OpenAI. He left OpenAI driven by the mission to ensure that AI benefits humanity. In this episode, Ben opens up about the accelerating progress in AI and the urgent need to steer it responsibly.In this conversation, we discuss:1. The inside story of leaving OpenAI with the entire safety team to start Anthropic2. How Meta's $100M offers reveal the true market price of top AI talent3. Why AI progress is still accelerating (not plateauing), and how most people misjudge the exponential4. Ben's “economic Turing test” for knowing when we've achieved AGI—and why it's likely coming by 2027-20285. Why he believes 20% unemployment is inevitable6. The AI nightmare scenarios that concern him most—and how he believes we can still avoid them7. How focusing on AI safety created Claude's beloved personality8. What three skills he's teaching his kids instead of traditional academics—Brought to you by:Sauce—Turn customer pain into product revenue: https://sauce.app/lennyLucidLink—Real-time cloud storage for teams: https://www.lucidlink.com/lennyFin—The #1 AI agent for customer service: https://fin.ai/lenny—Transcript: https://www.lennysnewsletter.com/p/anthropic-co-founder-benjamin-mann—My biggest takeaways (for paid newsletter subscribers): https://www.lennysnewsletter.com/i/168107911/my-biggest-takeaways-from-this-conversation—Where to find Ben Mann:• X: https://x.com/8enmann• LinkedIn: https://www.linkedin.com/in/benjamin-mann/• Website: https://benjmann.net/—Where to find Lenny:• Newsletter: https://www.lennysnewsletter.com• X: https://twitter.com/lennysan• LinkedIn: https://www.linkedin.com/in/lennyrachitsky/—In this episode, we cover:(00:00) Introduction to Benjamin(04:43) The AI talent war(06:28) AI progress and scaling laws(10:50) Defining AGI and the economic Turing test(12:26) The impact of AI on jobs(17:45) Preparing for an AI future(24:05) Founding Anthropic(27:06) Balancing AI safety and progress(29:10) Constitutional AI and model alignment(34:21) The importance of AI safety(43:40) The risks of autonomous agents(45:40) Forecasting superintelligence(48:36) How hard is it to align AI?(53:19) Reinforcement learning from AI feedback (RLAIF)(57:03) AI's biggest bottlenecks(01:00:11) Personal reflections on responsibilities(01:02:36) Anthropic's growth and innovations(01:07:48) Lightning round and final thoughts—Referenced:• Dario Amodei on LinkedIn: https://www.linkedin.com/in/dario-amodei-3934934/• Anthropic CEO: AI Could Wipe Out 50% of Entry-Level White Collar Jobs: https://www.marketingaiinstitute.com/blog/dario-amodei-ai-entry-level-jobs• Alexa+: https://www.amazon.com/dp/B0DCCNHWV5• Azure: https://azure.microsoft.com/• Sam Altman on X: https://x.com/sama• Opus 3: https://www.anthropic.com/news/claude-3-family• Claude's Constitution: https://www.anthropic.com/news/claudes-constitution• Greg Brockman on X: https://x.com/gdb• Anthropic's Responsible Scaling Policy: https://www.anthropic.com/news/anthropics-responsible-scaling-policy• Agentic Misalignment: How LLMs could be insider threats: https://www.anthropic.com/research/agentic-misalignment• Anthropic's CPO on what comes next | Mike Krieger (co-founder of Instagram): https://www.lennysnewsletter.com/p/anthropics-cpo-heres-what-comes-next• AI prompt engineering in 2025: What works and what doesn't | Sander Schulhoff (Learn Prompting, HackAPrompt): https://www.lennysnewsletter.com/p/ai-prompt-engineering-in-2025-sander-schulhoff• Unitree: https://www.unitree.com/• Arthur C. Clarke: https://en.wikipedia.org/wiki/Arthur_C._Clarke• How Reinforcement Learning from AI Feedback Works: https://www.assemblyai.com/blog/how-reinforcement-learning-from-ai-feedback-works• RLHF: https://en.wikipedia.org/wiki/Reinforcement_learning_from_human_feedback• Jared Kaplan on LinkedIn: https://www.linkedin.com/in/jared-kaplan-645843213/• Moore's law: https://en.wikipedia.org/wiki/Moore%27s_law• Machine Intelligence Research Institute: https://intelligence.org/• Raph Lee on LinkedIn: https://www.linkedin.com/in/raphaeltlee/• “The Last Question”: https://en.wikipedia.org/wiki/The_Last_Question• Beth Barnes on LinkedIn: https://www.linkedin.com/in/elizabethmbarnes/• “The Last Question”: https://en.wikipedia.org/wiki/The_Last_Question• Good Strategy, Bad Strategy | Richard Rumelt: https://www.lennysnewsletter.com/p/good-strategy-bad-strategy-richard• Pantheon on Netflix: https://www.netflix.com/title/81937398• Ted Lasso on AppleTV+: https://tv.apple.com/us/show/ted-lasso/umc.cmc.vtoh0mn0xn7t3c643xqonfzy• Kurzgesagt—In a Nutshell: https://www.youtube.com/channel/UCsXVk37bltHxD1rDPwtNM8Q• 5 tips to poop like a champion: https://8enmann.medium.com/5-tips-to-poop-like-a-champion-3292481a9651—Recommended books:• Superintelligence: Paths, Dangers, Strategies: https://www.amazon.com/Superintelligence-Dangers-Strategies-Nick-Bostrom/dp/0198739834• The Hacker and the State: Cyber Attacks and the New Normal of Geopolitics: https://www.amazon.com/Hacker-State-Attacks-Normal-Geopolitics/dp/0674987551• Replacing Guilt: Minding Our Way: https://www.amazon.com/Replacing-Guilt-Minding-Our-Way/dp/B086FTSB3Q• Good Strategy/Bad Strategy: The Difference and Why It Matters: https://www.amazon.com/Good-Strategy-Bad-Difference-Matters/dp/0307886239• The Alignment Problem: Machine Learning and Human Values: https://www.amazon.com/Alignment-Problem-Machine-Learning-Values/dp/0393635821—Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email podcast@lennyrachitsky.com.Lenny may be an investor in the companies discussed. To hear more, visit www.lennysnewsletter.com

The MAD Podcast with Matt Turck
Ex‑DeepMind Researcher Misha Laskin on Enterprise Super‑Intelligence | Reflection AI

The MAD Podcast with Matt Turck

Play Episode Listen Later Jul 17, 2025 66:29


What if your company had a digital brain that never forgot, always knew the answer, and could instantly tap the knowledge of your best engineers, even after they left? Superintelligence can feel like a hand‑wavy pipe‑dream— yet, as Misha Laskin argues, it becomes a tractable engineering problem once you scope it to the enterprise level. Former DeepMind researcher Laskin is betting on an oracle‑like AI that grasps every repo, Jira ticket and hallway aside as deeply as your principal engineer—and he's building it at Reflection AI.In this wide‑ranging conversation, Misha explains why coding is the fastest on‑ramp to superintelligence, how “organizational” beats “general” when real work is on the line, and why today's retrieval‑augmented generation (RAG) feels like “exploring a jungle with a flashlight.” He walks us through Asimov, Reflection's newly unveiled code‑research agent that fuses long‑context search, team‑wide memory and multi‑agent planning so developers spend less time spelunking for context and more time shipping.We also rewind his unlikely journey—from physics prodigy in a Manhattan‑Project desert town, to Berkeley's AI crucible, to leading RLHF for Google Gemini—before he left big‑lab comfort to chase a sharper vision of enterprise super‑intelligence. Along the way: the four breakthroughs that unlocked modern AI, why capital efficiency still matters in the GPU arms‑race, and how small teams can lure top talent away from nine‑figure offers.If you're curious about the next phase of AI agents, the future of developer tooling, or the gritty realities of scaling a frontier‑level startup—this episode is your blueprint.Reflection AIWebsite - https://reflection.aiLinkedIn - https://www.linkedin.com/company/reflectionaiMisha LaskinLinkedIn - https://www.linkedin.com/in/mishalaskinX/Twitter - https://x.com/mishalaskinFIRSTMARKWebsite - https://firstmark.comX/Twitter - https://twitter.com/FirstMarkCapMatt Turck (Managing Director)LinkedIn - https://www.linkedin.com/in/turck/X/Twitter - https://twitter.com/mattturck(00:00) Intro (01:42) Reflection AI: Company Origins and Mission (04:14) Making Superintelligence Concrete (06:04) Superintelligence vs. AGI: Why the Goalposts Moved (07:55) Organizational Superintelligence as an Oracle (12:05) Coding as the Shortcut: Hands, Legs & Brain for AI (16:00) Building the Context Engine (20:55) Capturing Tribal Knowledge in Organizations (26:31) Introducing Asimov: A Deep Code Research Agent (28:44) Team-Wide Memory: Preserving Institutional Knowledge (33:07) Multi-Agent Design for Deep Code Understanding (34:48) Data Retrieval and Integration in Asimov (38:13) Enterprise-Ready: VPC and On-Prem Deployments (39:41) Reinforcement Learning in Asimov's Development (41:04) Misha's Journey: From Physics to AI (42:06) Growing Up in a Science-Driven Desert Town (53:03) Building General Agents at DeepMind (56:57) Founding Reflection AI After DeepMind (58:54) Product-Driven Superintelligence: Why It Matters (01:02:22) The State of Autonomous Coding Agents (01:04:26) What's Next for Reflection AI

Slate Star Codex Podcast
Now I Really Won That AI Bet

Slate Star Codex Podcast

Play Episode Listen Later Jul 11, 2025 15:51


  In June 2022, I bet a commenter $100 that AI would master image compositionality by June 2025. DALL-E2 had just come out, showcasing the potential of AI art. But it couldn't follow complex instructions; its images only matched the “vibe” of the prompt. For example, here were some of its attempts at “a red sphere on a blue cube, with a yellow pyramid on the right, all on top of a green table”. At the time, I wrote: I'm not going to make the mistake of saying these problems are inherent to AI art. My guess is a slightly better language model would solve most of them…for all I know, some of the larger image models have already fixed these issues. These are the sorts of problems I expect to go away with a few months of future research. Commenters objected that this was overly optimistic. AI was just a pattern-matching “stochastic parrot”. It would take a deep understanding of grammar to get a prompt exactly right, and that would require some entirely new paradigm beyond LLMs. For example, from Vitor: Why are you so confident in this? The inability of systems like DALL-E to understand semantics in ways requiring an actual internal world model strikes me as the very heart of the issue. We can also see this exact failure mode in the language models themselves. They only produce good results when the human asks for something vague with lots of room for interpretation, like poetry or fanciful stories without much internal logic or continuity. Not to toot my own horn, but two years ago you were naively saying we'd have GPT-like models scaled up several orders of magnitude (100T parameters) right about now (https://readscottalexander.com/posts/ssc-the-obligatory-gpt-3-post#comment-912798). I'm registering my prediction that you're being equally naive now. Truly solving this issue seems AI-complete to me. I'm willing to bet on this (ideas on operationalization welcome). So we made a bet! All right. My proposed operationalization of this is that on June 1, 2025, if either if us can get access to the best image generating model at that time (I get to decide which), or convince someone else who has access to help us, we'll give it the following prompts: 1. A stained glass picture of a woman in a library with a raven on her shoulder with a key in its mouth 2. An oil painting of a man in a factory looking at a cat wearing a top hat 3. A digital art picture of a child riding a llama with a bell on its tail through a desert 4. A 3D render of an astronaut in space holding a fox wearing lipstick 5. Pixel art of a farmer in a cathedral holding a red basketball We generate 10 images for each prompt, just like DALL-E2 does. If at least one of the ten images has the scene correct in every particular on 3/5 prompts, I win, otherwise you do. Loser pays winner $100, and whatever the result is I announce it on the blog (probably an open thread). If we disagree, Gwern is the judge. Some image models of the time refused to draw humans, so we agreed that robots could stand in for humans in pictures that required them. In September 2022, I got some good results from Google Imagen and announced I had won the three-year bet in three months. Commenters yelled at me, saying that Imagen still hadn't gotten them quite right and my victory declaration was premature. The argument blew up enough that Edwin Chen of Surge, an “RLHF and human LLM evaluation platform”, stepped in and asked his professional AI data labelling team. Their verdict was clear: the AI was bad and I was wrong. Rather than embarrass myself further, I agreed to wait out the full length of the bet and re-evaluate in June 2025. The bet is now over, and official judge Gwern agrees I've won. Before I gloat, let's look at the images that got us here. https://www.astralcodexten.com/p/now-i-really-won-that-ai-bet

Crazy Wisdom
Episode #469: Can Tesla Teach a Bot to Bachata?

Crazy Wisdom

Play Episode Listen Later Jun 30, 2025 53:30


In this episode of the Crazy Wisdom Podcast, I, Stewart Alsop, sit down with returning guest Brian Ahuja to explore a thought-provoking idea he's been stewing on—could we one day build a robot capable of true partner dancing? From the biomechanics of salsa to the possibilities of AI embodiment, we unpack what it would take to engineer fluid, responsive movement and how that intersects with everything from artificial muscles to the intimacy of tactile feedback. We also touch on Brian's long-term vision for a potential lab or foundation to tackle this challenge. You can follow Brian and future developments on Twitter @brianahuja.Check out this GPT we trained on the conversationTimestamps00:00 – Brian Ahuja returns to discuss AI embodiment, sparked by his experience in ballroom dance and curiosity about translating physical intelligence into robotics.05:00 – They explore robotics in partner dancing, touching on the difference between choreographed motion and improvisational, responsive movement.10:00 – Brian breaks down human biomechanics, emphasizing that hip motion in dances like salsa originates from knees and feet—not the hips directly.15:00 – The conversation shifts to balance, proprioception, and ocular reflexes, linking them to movement stability in dance.20:00 – They compare robot vs. human movement, noting robots' jerky motions and the absence of muscle-based initiation.25:00 – The need for haptic feedback is discussed, with Brian detailing how partner dancing depends on tactile signals and real-time response.30:00 – They touch on robotic form factors, questioning whether humanoid robots are the best approach and pondering the design of artificial muscles.35:00 – Brian proposes the idea of the Ahuja Test, gauging if a robot can move so fluidly it's indistinguishable from a human, using dance as the standard.Key InsightsPartner Dancing as a Frontier for Robotics: Brian Ahuja proposes that partner dancing could be a benchmark for robotic embodiment, where success would indicate a robot's ability to replicate fluid, responsive human movement. This task is far more complex than solo choreography—it requires real-time tactile feedback, improvisation, and nuanced physical communication.Movement Origin in Humans vs. Robots: A critical difference lies in how movement is generated. Human motion begins with muscle contraction, not at the joints. Robots, however, typically initiate movement at joint points, missing the layered interplay of muscles, tendons, and fascia that create smooth, lifelike motion.Haptic Feedback and Improvisation: Real partner dancing involves subtle cues, like pressure through fingertips, to signal direction and timing. For a robot to follow or lead a dance, it would need a highly sensitive haptic feedback system capable of interpreting and responding to these nonverbal signals in real time.The Limits of Current Robotics: Even with advanced robots like the Tesla bot, current movement still appears jerky and lacks the fluidity needed for partner dancing. The mechanical design—especially the lack of artificial musculature—may impose fundamental limits on how closely robots can mimic human motion.Applications Beyond Dance: The implications of this inquiry stretch beyond dance into fields like physical therapy, elder care, and domestic robotics. A robot that could move like a human could handle tasks requiring adaptability, precision, and physical sensitivity.Vision and Systems Thinking: Brian frames the challenge as a systems problem that might start with a lab or foundation. He emphasizes not needing to do everything alone, recognizing the value of building knowledge iteratively through conversations, research, and community.The Ahuja Test: Inspired by the Turing Test, Brian coins the idea of the “Ahuja Test”—a way to measure if a robot can move indistinguishably from a human. He suggests partner dancing could serve as the ultimate proving ground for such a test, given its demand for embodied intelligence and nuanced coordination.

This Week in Startups
Meta, Scale, and the Future of AI Labeling: Did Zuck Just Kill a Category? | E2139

This Week in Startups

Play Episode Listen Later Jun 17, 2025 69:25


Today's show:Meta just took a 49% stake in Scale AI, and the shockwaves are hitting the entire AI ecosystem. In this episode, @Jason and @alex unpack the deal's implications: Google ($150M customer!) and others are fleeing Scale, worried Meta will hoard its RLHF infrastructure and cut off competitors. Startups like Labelbox, Turing, and Handshake are already seeing a demand surge. Is this smart vertical integration or anti-competitive overreach? Jason shares tactical advice for founders on how to capitalize when incumbents stumble—hire ex-Scale talent, build “Scale AI alternative” SEO pages, and hit the podcast circuit. Don't miss this deep dive into AI's shifting power dynamics.Timestamps:(04:01) Is Jason becoming an AI doomer?!(9:52) OpenPhone - Streamline and scale your customer communications with OpenPhone. Get 20% off your first 6 months at www.openphone.com/⁠twist(13:47) PostHog, and when is it okay for founders to break the rules?(20:56) Vanta - Get $1000 off your SOC 2 at https://www.vanta.com/twist(25:50) Why the Navy is recruiting startups(30:12) Pilot - Visit https://www.pilot.com/twist and get $1,200 off your first year.(39:09) Did Zuck buy Scale in order to keep it from competitors?(56:08) When does incentivizing customers turn into burning capital?(1:04) How raising too much money could KILL your startup!Subscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.comCheck out the TWIST500: https://www.twist500.comSubscribe to This Week in Startups on Apple: https://rb.gy/v19fcpFollow Lon:X: https://x.com/lonsFollow Alex:X: https://x.com/alexLinkedIn: ⁠https://www.linkedin.com/in/alexwilhelmFollow Jason:X: https://twitter.com/JasonLinkedIn: https://www.linkedin.com/in/jasoncalacanisThank you to our partners:(9:52) OpenPhone - Streamline and scale your customer communications with OpenPhone. Get 20% off your first 6 months at www.openphone.com/⁠twist(20:56) Vanta - Get $1000 off your SOC 2 at https://www.vanta.com/twist(30:52) Pilot - Visit https://www.pilot.com/twist and get $1,200 off your first year.Great TWIST interviews: Will Guidara, Eoghan McCabe, Steve Huffman, Brian Chesky, Bob Moesta, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarlandCheck out Jason's suite of newsletters: https://substack.com/@calacanisFollow TWiST:Twitter: https://twitter.com/TWiStartupsYouTube: https://www.youtube.com/thisweekinInstagram: https://www.instagram.com/thisweekinstartupsTikTok: https://www.tiktok.com/@thisweekinstartupsSubstack: https://twistartups.substack.comSubscribe to the Founder University Podcast: https://www.youtube.com/@founderuniversity1916

Eye On A.I.
#261 Jonathan Frankle: How Databricks is Disrupting AI Model Training

Eye On A.I.

Play Episode Listen Later Jun 12, 2025 52:47


This episode is sponsored by Oracle. OCI is the next-generation cloud designed for every workload – where you can run any application, including any AI projects, faster and more securely for less. On average, OCI costs 50% less for compute, 70% less for storage, and 80% less for networking. Join Modal, Skydance Animation, and today's innovative AI tech companies who upgraded to OCI…and saved.   Try OCI for free at http://oracle.com/eyeonai   What if you could fine-tune an AI model without any labeled data—and still outperform traditional training methods?   In this episode of Eye on AI, we sit down with Jonathan Frankle, Chief Scientist at Databricks and co-founder of MosaicML, to explore TAO (Test-time Adaptive Optimization)—Databricks' breakthrough tuning method that's transforming how enterprises build and scale large language models (LLMs).   Jonathan explains how TAO uses reinforcement learning and synthetic data to train models without the need for expensive, time-consuming annotation. We dive into how TAO compares to supervised fine-tuning, why Databricks built their own reward model (DBRM), and how this system allows for continual improvement, lower inference costs, and faster enterprise AI deployment.   Whether you're an AI researcher, enterprise leader, or someone curious about the future of model customization, this episode will change how you think about training and deploying AI.   Explore the latest breakthroughs in data and AI from Databricks: https://www.databricks.com/events/dataaisummit-2025-announcements Stay Updated: Craig Smith on X: https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI  

Machine Learning Street Talk
"Blurring Reality" - Chai's Social AI Platform (SPONSORED)

Machine Learning Street Talk

Play Episode Listen Later May 26, 2025 50:59


"Blurring Reality" - Chai's Social AI Platform - sponsoredThis episode of MLST explores the groundbreaking work of Chai, a social AI platform that quietly built one of the world's largest AI companion ecosystems before ChatGPT's mainstream adoption. With over 10 million active users and just 13 engineers serving 2 trillion tokens per day, Chai discovered the massive appetite for AI companionship through serendipity while searching for product-market fit.CHAI sponsored this show *because they want to hire amazing engineers* -- CAREER OPPORTUNITIES AT CHAIChai is actively hiring in Palo Alto with competitive compensation ($300K-$800K+ equity) for roles including AI Infrastructure Engineers, Software Engineers, Applied AI Researchers, and more. Fast-track qualification available for candidates with significant product launches, open source contributions, or entrepreneurial success.https://www.chai-research.com/jobs/The conversation with founder William Beauchamp and engineers Tom Lu and Nischay Dhankhar covers Chai's innovative technical approaches including reinforcement learning from human feedback (RLHF), model blending techniques that combine smaller models to outperform larger ones, and their unique infrastructure challenges running exaflop-class compute.SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers in Zurich and SF. Goto https://tufalabs.ai/***Key themes explored include:- The ethics of AI engagement optimization and attention hacking- Content moderation at scale with a lean engineering team- The shift from AI as utility tool to AI as social companion- How users form deep emotional bonds with artificial intelligence- The broader implications of AI becoming a social mediumWe also examine OpenAI's recent pivot toward companion AI with April's new GPT-4o, suggesting a fundamental shift in how we interact with artificial intelligence - from utility-focused tools to companion-like experiences that blur the lines between human and artificial intimacy.The episode also covers Chai's unconventional approach to hiring only top-tier engineers, their bootstrap funding strategy focused on user revenue over VC funding, and their rapid experimentation culture where one in five experiments succeed.TOC:00:00:00 - Introduction: Steve Jobs' AI Vision & Chai's Scale00:04:02 - Chapter 1: Simulators - The Birth of Social AI00:13:34 - Chapter 2: Engineering at Chai - RLHF & Model Blending00:21:49 - Chapter 3: Social Impact of GenAI - Ethics & Safety00:33:55 - Chapter 4: The Lean Machine - 13 Engineers, Millions of Users00:42:38 - Chapter 5: GPT-4o Becoming a Companion - OpenAI's Pivot00:50:10 - Chapter 6: What Comes Next - The Future of AI Intimacy TRANSCRIPT: https://www.dropbox.com/scl/fi/yz2ewkzmwz9rbbturfbap/CHAI.pdf?rlkey=uuyk2nfhjzezucwdgntg5ubqb&dl=0

Machine Learning Guide
MLG 034 Large Language Models 1

Machine Learning Guide

Play Episode Listen Later May 7, 2025 50:48


Explains language models (LLMs) advancements. Scaling laws - the relationships among model size, data size, and compute - and how emergent abilities such as in-context learning, multi-step reasoning, and instruction following arise once certain scaling thresholds are crossed. The evolution of the transformer architecture with Mixture of Experts (MoE), describes the three-phase training process culminating in Reinforcement Learning from Human Feedback (RLHF) for model alignment, and explores advanced reasoning techniques such as chain-of-thought prompting which significantly improve complex task performance. Links Notes and resources at ocdevel.com/mlg/mlg34 Build the future of multi-agent software with AGNTCY Try a walking desk stay healthy & sharp while you learn & code Transformer Foundations and Scaling Laws Transformers: Introduced by the 2017 "Attention is All You Need" paper, transformers allow for parallel training and inference of sequences using self-attention, in contrast to the sequential nature of RNNs. Scaling Laws: Empirical research revealed that LLM performance improves predictably as model size (parameters), data size (training tokens), and compute are increased together, with diminishing returns if only one variable is scaled disproportionately. The "Chinchilla scaling law" (DeepMind, 2022) established the optimal model/data/compute ratio for efficient model performance: earlier large models like GPT-3 were undertrained relative to their size, whereas right-sized models with more training data (e.g., Chinchilla, LLaMA series) proved more compute and inference efficient. Emergent Abilities in LLMs Emergence: When trained beyond a certain scale, LLMs display abilities not present in smaller models, including: In-Context Learning (ICL): Performing new tasks based solely on prompt examples at inference time. Instruction Following: Executing natural language tasks not seen during training. Multi-Step Reasoning & Chain of Thought (CoT): Solving arithmetic, logic, or symbolic reasoning by generating intermediate reasoning steps. Discontinuity & Debate: These abilities appear abruptly in larger models, though recent research suggests that this could result from non-linearities in evaluation metrics rather than innate model properties. Architectural Evolutions: Mixture of Experts (MoE) MoE Layers: Modern LLMs often replace standard feed-forward layers with MoE structures. Composed of many independent "expert" networks specializing in different subdomains or latent structures. A gating network routes tokens to the most relevant experts per input, activating only a subset of parameters—this is called "sparse activation." Enables much larger overall models without proportional increases in compute per inference, but requires the entire model in memory and introduces new challenges like load balancing and communication overhead. Specialization & Efficiency: Experts learn different data/knowledge types, boosting model specialization and throughput, though care is needed to avoid overfitting and underutilization of specialists. The Three-Phase Training Process 1. Unsupervised Pre-Training: Next-token prediction on massive datasets—builds a foundation model capturing general language patterns. 2. Supervised Fine Tuning (SFT): Training on labeled prompt-response pairs to teach the model how to perform specific tasks (e.g., question answering, summarization, code generation). Overfitting and "catastrophic forgetting" are risks if not carefully managed. 3. Reinforcement Learning from Human Feedback (RLHF): Collects human preference data by generating multiple responses to prompts and then having annotators rank them. Builds a reward model (often PPO) based on these rankings, then updates the LLM to maximize alignment with human preferences (helpfulness, harmlessness, truthfulness). Introduces complexity and risk of reward hacking (specification gaming), where the model may exploit the reward system in unanticipated ways. Advanced Reasoning Techniques Prompt Engineering: The art/science of crafting prompts that elicit better model responses, shown to dramatically affect model output quality. Chain of Thought (CoT) Prompting: Guides models to elaborate step-by-step reasoning before arriving at final answers—demonstrably improves results on complex tasks. Variants include zero-shot CoT ("let's think step by step"), few-shot CoT with worked examples, self-consistency (voting among multiple reasoning chains), and Tree of Thought (explores multiple reasoning branches in parallel). Automated Reasoning Optimization: Frontier models selectively apply these advanced reasoning techniques, balancing compute costs with gains in accuracy and transparency. Optimization for Training and Inference Tradeoffs: The optimal balance between model size, data, and compute is determined not only for pretraining but also for inference efficiency, as lifetime inference costs may exceed initial training costs. Current Trends: Efficient scaling, model specialization (MoE), careful fine-tuning, RLHF alignment, and automated reasoning techniques define state-of-the-art LLM development.

Eye On A.I.
#248 Pedro Domingos: How Connectionism Is Reshaping the Future of Machine Learning

Eye On A.I.

Play Episode Listen Later Apr 17, 2025 59:56


This episode is sponsored by Indeed.  Stop struggling to get your job post seen on other job sites. Indeed's Sponsored Jobs help you stand out and hire fast. With Sponsored Jobs your post jumps to the top of the page for your relevant candidates, so you can reach the people you want faster. Get a $75 Sponsored Job Credit to boost your job's visibility! Claim your offer now: https://www.indeed.com/EYEONAI     In this episode, renowned AI researcher Pedro Domingos, author of The Master Algorithm, takes us deep into the world of Connectionism—the AI tribe behind neural networks and the deep learning revolution.   From the birth of neural networks in the 1940s to the explosive rise of transformers and ChatGPT, Pedro unpacks the history, breakthroughs, and limitations of connectionist AI. Along the way, he explores how supervised learning continues to quietly power today's most impressive AI systems—and why reinforcement learning and unsupervised learning are still lagging behind.   We also dive into: The tribal war between Connectionists and Symbolists The surprising origins of Backpropagation How transformers redefined machine translation Why GANs and generative models exploded (and then faded) The myth of modern reinforcement learning (DeepSeek, RLHF, etc.) The danger of AI research narrowing too soon around one dominant approach Whether you're an AI enthusiast, a machine learning practitioner, or just curious about where intelligence is headed, this episode offers a rare deep dive into the ideological foundations of AI—and what's coming next. Don't forget to subscribe for more episodes on AI, data, and the future of tech.     Stay Updated: Craig Smith on X:https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI     (00:00) What Are Generative Models? (03:02) AI Progress and the Local Optimum Trap (06:30) The Five Tribes of AI and Why They Matter (09:07) The Rise of Connectionism (11:14) Rosenblatt's Perceptron and the First AI Hype Cycle (13:35) Backpropagation: The Algorithm That Changed Everything (19:39) How Backpropagation Actually Works (21:22) AlexNet and the Deep Learning Boom (23:22) Why the Vision Community Resisted Neural Nets (25:39) The Expansion of Deep Learning (28:48) NetTalk and the Baby Steps of Neural Speech (31:24) How Transformers (and Attention) Transformed AI (34:36) Why Attention Solved the Bottleneck in Translation (35:24) The Untold Story of Transformer Invention (38:35) LSTMs vs. Attention: Solving the Vanishing Gradient Problem (42:29) GANs: The Evolutionary Arms Race in AI (48:53) Reinforcement Learning Explained (52:46) Why RL Is Mostly Just Supervised Learning in Disguise (54:35) Where AI Research Should Go Next  

Data Brew by Databricks
Reward Models | Data Brew | Episode 40

Data Brew by Databricks

Play Episode Listen Later Mar 20, 2025 39:58


In this episode, Brandon Cui, Research Scientist at MosaicML and Databricks, dives into cutting-edge advancements in AI model optimization, focusing on Reward Models and Reinforcement Learning from Human Feedback (RLHF).Highlights include:- How synthetic data and RLHF enable fine-tuning models to generate preferred outcomes.- Techniques like Policy Proximal Optimization (PPO) and Direct PreferenceOptimization (DPO) for enhancing response quality.- The role of reward models in improving coding, math, reasoning, and other NLP tasks.Connect with Brandon Cui:https://www.linkedin.com/in/bcui19/

The Generative AI Meetup Podcast
Can you trust LLM Leaderboards?

The Generative AI Meetup Podcast

Play Episode Listen Later Mar 17, 2025 89:48


This conversation delves into the latest developments in AI, particularly focusing on Google's Gemma models and their capabilities. The discussion covers the differences between various types of language models, the significance of multimodal inputs, and the training techniques employed in AI models. The hosts also explore the implications of open-source versus proprietary models, the hardware requirements for running these models, and the limitations of benchmarks in evaluating AI performance. Additionally, they touch on the future of robotics and the cultural differences in AI adoption, particularly between Japan and the United States. takeaways Open source models are pushing the boundaries of AI. Gemma models are capable of multimodal inputs. Different types of LLMs serve different purposes. Benchmarks can be misleading and should be approached with caution. Training techniques like RLHF are crucial for model performance. The hardware requirements for AI models vary significantly. Cultural differences affect the adoption of robotics and AI. Robots are increasingly filling labor gaps in societies with declining populations. AI benchmarks should be tailored to specific use cases. The future of robotics and AI feels imminent and exciting. Chapters  00:00 Introduction to the Week's AI Developments 00:50 Exploring Google's Gemma Models 03:21 Understanding Different Types of LLMs 05:32 Gemma's Multimodal and Multilingual Capabilities 08:45 Training Techniques Behind Gemma 15:48 Open Source Models and Their Impact 20:34 Benchmarking AI Models 28:30 Gaming Benchmarks in AI 34:10 The Ethics of Benchmarking in AI 44:56 Language Learning and AI Models 49:12 The Importance of Benchmarks 52:35 Vibe Checks and User Preferences 01:01:09 Top AI Models and Their Performance 01:13:35 Robotics and the Future of AI 01:27:20 Cultural Perspectives on Automation

Founded and Funded
AI+Data in the Enterprise: Lessons from Mosaic to Databricks

Founded and Funded

Play Episode Listen Later Feb 26, 2025 47:18


The biggest AI breakthroughs won't come from Ph.D. labs — they'll come from people solving real-world problems. So how do AI founders actually turn cutting-edge research into real products and scale them? In this week's episode of Founded & Funded, Madrona Partner Jon Turow sat down with Jonathan Frankle, Chief AI Scientist at Databricks to talk about the shift from AI hype to real adoption — and what founders need to know. They dive into:  1) How AI adoption has shifted from hype to real-world production  2) The #1 mistake AI startups make when trying to sell to enterprises  3) Why your AI system shouldn't care if it's RAG, fine-tuned, or RLHF — it just needs to work  4) The unexpected secret to getting your first customers 5) The AI opportunity that most startups are overlooking  Transcript: https://www.madrona.com/databricks-ia40-ai-data-jonathan-frankle Chapters: (00:00) Introduction  (01:02) The Vision Behind MosaicML (04:11) Expanding the Mission at Databricks (05:52) The Concept of Data Intelligence (07:42) Navigating the AI Hype Cycle (15:10) Lessons from Early Wins at MosaicML  (20:50) Building a Strong AI Team (23:36) The Future of AI and Its Challenges  (24:06) Evolving Roles in AI at Databricks (25:55) Bridging Research and Product (28:29) High School Track at NeurIPS (30:39) AI Techniques and Customer Needs (38:22) Rapid Fire Questions and Lessons Learned (42:49) Exciting Trends in AI and Robotics (45:40) AI Policy and Governance

Machine Learning Street Talk
Want to Understand Neural Networks? Think Elastic Origami! - Prof. Randall Balestriero

Machine Learning Street Talk

Play Episode Listen Later Feb 8, 2025 78:10


Professor Randall Balestriero joins us to discuss neural network geometry, spline theory, and emerging phenomena in deep learning, based on research presented at ICML. Topics include the delayed emergence of adversarial robustness in neural networks ("grokking"), geometric interpretations of neural networks via spline theory, and challenges in reconstruction learning. We also cover geometric analysis of Large Language Models (LLMs) for toxicity detection and the relationship between intrinsic dimensionality and model control in RLHF.SPONSOR MESSAGES:***CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments.https://centml.ai/pricing/Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events?Goto https://tufalabs.ai/***Randall Balestrierohttps://x.com/randall_balestrhttps://randallbalestriero.github.io/Show notes and transcript: https://www.dropbox.com/scl/fi/3lufge4upq5gy0ug75j4a/RANDALLSHOW.pdf?rlkey=nbemgpa0jhawt1e86rx7372e4&dl=0TOC:- Introduction - 00:00:00: Introduction- Neural Network Geometry and Spline Theory - 00:01:41: Neural Network Geometry and Spline Theory - 00:07:41: Deep Networks Always Grok - 00:11:39: Grokking and Adversarial Robustness - 00:16:09: Double Descent and Catastrophic Forgetting- Reconstruction Learning - 00:18:49: Reconstruction Learning - 00:24:15: Frequency Bias in Neural Networks- Geometric Analysis of Neural Networks - 00:29:02: Geometric Analysis of Neural Networks - 00:34:41: Adversarial Examples and Region Concentration- LLM Safety and Geometric Analysis - 00:40:05: LLM Safety and Geometric Analysis - 00:46:11: Toxicity Detection in LLMs - 00:52:24: Intrinsic Dimensionality and Model Control - 00:58:07: RLHF and High-Dimensional Spaces- Conclusion - 01:02:13: Neural Tangent Kernel - 01:08:07: ConclusionREFS:[00:01:35] Humayun – Deep network geometry & input space partitioninghttps://arxiv.org/html/2408.04809v1[00:03:55] Balestriero & Paris – Linking deep networks to adaptive spline operatorshttps://proceedings.mlr.press/v80/balestriero18b/balestriero18b.pdf[00:13:55] Song et al. – Gradient-based white-box adversarial attackshttps://arxiv.org/abs/2012.14965[00:16:05] Humayun, Balestriero & Baraniuk – Grokking phenomenon & emergent robustnesshttps://arxiv.org/abs/2402.15555[00:18:25] Humayun – Training dynamics & double descent via linear region evolutionhttps://arxiv.org/abs/2310.12977[00:20:15] Balestriero – Power diagram partitions in DNN decision boundarieshttps://arxiv.org/abs/1905.08443[00:23:00] Frankle & Carbin – Lottery Ticket Hypothesis for network pruninghttps://arxiv.org/abs/1803.03635[00:24:00] Belkin et al. – Double descent phenomenon in modern MLhttps://arxiv.org/abs/1812.11118[00:25:55] Balestriero et al. – Batch normalization's regularization effectshttps://arxiv.org/pdf/2209.14778[00:29:35] EU – EU AI Act 2024 with compute restrictionshttps://www.lw.com/admin/upload/SiteAttachments/EU-AI-Act-Navigating-a-Brave-New-World.pdf[00:39:30] Humayun, Balestriero & Baraniuk – SplineCam: Visualizing deep network geometryhttps://openaccess.thecvf.com/content/CVPR2023/papers/Humayun_SplineCam_Exact_Visualization_and_Characterization_of_Deep_Network_Geometry_and_CVPR_2023_paper.pdf[00:40:40] Carlini – Trade-offs between adversarial robustness and accuracyhttps://arxiv.org/pdf/2407.20099[00:44:55] Balestriero & LeCun – Limitations of reconstruction-based learning methodshttps://openreview.net/forum?id=ez7w0Ss4g9(truncated, see shownotes PDF)

Deep Papers
Multiagent Finetuning: A Conversation with Researcher Yilun Du

Deep Papers

Play Episode Listen Later Feb 4, 2025 30:03


We talk to Google DeepMind Senior Research Scientist (and incoming Assistant Professor at Harvard), Yilun Du, about his latest paper "Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains." This paper introduces a multiagent finetuning framework that enhances the performance and diversity of language models by employing a society of agents with distinct roles, improving feedback mechanisms and overall output quality.The method enables autonomous self-improvement through iterative finetuning, achieving significant performance gains across various reasoning tasks. It's versatile, applicable to both open-source and proprietary LLMs, and can integrate with human-feedback-based methods like RLHF or DPO, paving the way for future advancements in language model development.Read an overview on the blogWatch the full discussionLearn more about AI observability and evaluation in our course, join the Arize AI Slack community or get the latest on LinkedIn and X.

Training Data
ReflectionAI Founder Ioannis Antonoglou: From AlphaGo to AGI

Training Data

Play Episode Listen Later Jan 28, 2025 52:29


Ioannis Antonoglou, founding engineer at DeepMind and co-founder of ReflectionAI, has seen the triumphs of reinforcement learning firsthand. From AlphaGo to AlphaZero and MuZero, Ioannis has built the most powerful agents in the world. Ioannis breaks down key moments in AlphaGo's game against Lee Sodol (Moves 37 and 78), the importance of self-play and the impact of scale, reliability, planning and in-context learning as core factors that will unlock the next level of progress in AI. Hosted by: Stephanie Zhan and Sonya Huang, Sequoia Capital Mentioned in this episode: PPO: Proximal Policy Optimization algorithm developed by DeepMind in game environments. Also used by OpenAI for RLHF in ChatGPT. MuJoCo: Open source physics engine used to develop PPO Monte Carlo Tree Search: Heuristic search algorithm used in AlphaGo as well as video compression for YouTube and the self-driving system at Tesla AlphaZero: The DeepMind model that taught itself from scratch how to master the games of chess, shogi and Go MuZero: The DeepMind follow up to AlphaZero that mastered games without knowing the rules and able to plan winning strategies in unknown environments AlphaChem: Chemical Synthesis Planning with Tree Search and Deep Neural Network Policies DQN: Deep Q-Network, Introduced in 2013 paper, Playing Atari with Deep Reinforcement Learning AlphaFold: DeepMind model for predicting protein structures for which Demis Hassabis, John Jumper and David Baker won the 2024 Nobel Prize in Chemistry

Machine Learning Street Talk
Nicholas Carlini (Google DeepMind)

Machine Learning Street Talk

Play Episode Listen Later Jan 25, 2025 81:15


Nicholas Carlini from Google DeepMind offers his view of AI security, emergent LLM capabilities, and his groundbreaking model-stealing research. He reveals how LLMs can unexpectedly excel at tasks like chess and discusses the security pitfalls of LLM-generated code. SPONSOR MESSAGES: *** CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/ Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? Goto https://tufalabs.ai/ *** Transcript: https://www.dropbox.com/scl/fi/lat7sfyd4k3g5k9crjpbf/CARLINI.pdf?rlkey=b7kcqbvau17uw6rksbr8ccd8v&dl=0 TOC: 1. ML Security Fundamentals [00:00:00] 1.1 ML Model Reasoning and Security Fundamentals [00:03:04] 1.2 ML Security Vulnerabilities and System Design [00:08:22] 1.3 LLM Chess Capabilities and Emergent Behavior [00:13:20] 1.4 Model Training, RLHF, and Calibration Effects 2. Model Evaluation and Research Methods [00:19:40] 2.1 Model Reasoning and Evaluation Metrics [00:24:37] 2.2 Security Research Philosophy and Methodology [00:27:50] 2.3 Security Disclosure Norms and Community Differences 3. LLM Applications and Best Practices [00:44:29] 3.1 Practical LLM Applications and Productivity Gains [00:49:51] 3.2 Effective LLM Usage and Prompting Strategies [00:53:03] 3.3 Security Vulnerabilities in LLM-Generated Code 4. Advanced LLM Research and Architecture [00:59:13] 4.1 LLM Code Generation Performance and O(1) Labs Experience [01:03:31] 4.2 Adaptation Patterns and Benchmarking Challenges [01:10:10] 4.3 Model Stealing Research and Production LLM Architecture Extraction REFS: [00:01:15] Nicholas Carlini's personal website & research profile (Google DeepMind, ML security) - https://nicholas.carlini.com/ [00:01:50] CentML AI compute platform for language model workloads - https://centml.ai/ [00:04:30] Seminal paper on neural network robustness against adversarial examples (Carlini & Wagner, 2016) - https://arxiv.org/abs/1608.04644 [00:05:20] Computer Fraud and Abuse Act (CFAA) – primary U.S. federal law on computer hacking liability - https://www.justice.gov/jm/jm-9-48000-computer-fraud [00:08:30] Blog post: Emergent chess capabilities in GPT-3.5-turbo-instruct (Nicholas Carlini, Sept 2023) - https://nicholas.carlini.com/writing/2023/chess-llm.html [00:16:10] Paper: “Self-Play Preference Optimization for Language Model Alignment” (Yue Wu et al., 2024) - https://arxiv.org/abs/2405.00675 [00:18:00] GPT-4 Technical Report: development, capabilities, and calibration analysis - https://arxiv.org/abs/2303.08774 [00:22:40] Historical shift from descriptive to algebraic chess notation (FIDE) - https://en.wikipedia.org/wiki/Descriptive_notation [00:23:55] Analysis of distribution shift in ML (Hendrycks et al.) - https://arxiv.org/abs/2006.16241 [00:27:40] Nicholas Carlini's essay “Why I Attack” (June 2024) – motivations for security research - https://nicholas.carlini.com/writing/2024/why-i-attack.html [00:34:05] Google Project Zero's 90-day vulnerability disclosure policy - https://googleprojectzero.blogspot.com/p/vulnerability-disclosure-policy.html [00:51:15] Evolution of Google search syntax & user behavior (Daniel M. Russell) - https://www.amazon.com/Joy-Search-Google-Master-Information/dp/0262042878 [01:04:05] Rust's ownership & borrowing system for memory safety - https://doc.rust-lang.org/book/ch04-00-understanding-ownership.html [01:10:05] Paper: “Stealing Part of a Production Language Model” (Carlini et al., March 2024) – extraction attacks on ChatGPT, PaLM-2 - https://arxiv.org/abs/2403.06634 [01:10:55] First model stealing paper (Tramèr et al., 2016) – attacking ML APIs via prediction - https://arxiv.org/abs/1609.02943

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Sponsorships and applications for the AI Engineer Summit in NYC are live! (Speaker CFPs have closed) If you are building AI agents or leading teams of AI Engineers, this will be the single highest-signal conference of the year for you.Right after Christmas, the Chinese Whale Bros ended 2024 by dropping the last big model launch of the year: DeepSeek v3. Right now on LM Arena, DeepSeek v3 has a score of 1319, right under the full o1 model, Gemini 2, and 4o latest. This makes it the best open weights model in the world in January 2025.There has been a big recent trend in Chinese labs releasing very large open weights models, with TenCent releasing Hunyuan-Large in November and Hailuo releasing MiniMax-Text this week, both over 400B in size. However these extra-large language models are very difficult to serve.Baseten was the first of the Inference neocloud startups to get DeepSeek V3 online, because of their H200 clusters, their close collaboration with the DeepSeek team and early support of SGLang, a relatively new VLLM alternative that is also used at frontier labs like X.ai. Each H200 has 141 GB of VRAM with 4.8 TB per second of bandwidth, meaning that you can use 8 H200's in a node to inference DeepSeek v3 in FP8, taking into account KV Cache needs. We have been close to Baseten since Sarah Guo introduced Amir Haghighat to swyx, and they supported the very first Latent Space Demo Day in San Francisco, which was effectively the trial run for swyx and Alessio to work together! Since then, Philip Kiely also led a well attended workshop on TensorRT LLM at the 2024 World's Fair. We worked with him to get two of their best representatives, Amir and Lead Model Performance Engineer Yineng Zhang, to discuss DeepSeek, SGLang, and everything they have learned running Mission Critical Inference workloads at scale for some of the largest AI products in the world.The Three Pillars of Mission Critical InferenceWe initially planned to focus the conversation on SGLang, but Amir and Yineng were quick to correct us that the choice of inference framework is only the simplest, first choice of 3 things you need for production inference at scale:“I think it takes three things, and each of them individually is necessary but not sufficient: * Performance at the model level: how fast are you running this one model running on a single GPU, let's say. The framework that you use there can, can matter. The techniques that you use there can matter. The MLA technique, for example, that Yineng mentioned, or the CUDA kernels that are being used. But there's also techniques being used at a higher level, things like speculative decoding with draft models or with Medusa heads. And these are implemented in the different frameworks, or you can even implement it yourself, but they're not necessarily tied to a single framework. But using speculative decoding gets you massive upside when it comes to being able to handle high throughput. But that's not enough. Invariably, that one model running on a single GPU, let's say, is going to get too much traffic that it cannot handle.* Horizontal scaling at the cluster/region level: And at that point, you need to horizontally scale it. That's not an ML problem. That's not a PyTorch problem. That's an infrastructure problem. How quickly do you go from, a single replica of that model to 5, to 10, to 100. And so that's the second, that's the second pillar that is necessary for running these machine critical inference workloads.And what does it take to do that? It takes, some people are like, Oh, You just need Kubernetes and Kubernetes has an autoscaler and that just works. That doesn't work for, for these kinds of mission critical inference workloads. And you end up catching yourself wanting to bit by bit to rebuild those infrastructure pieces from scratch. This has been our experience. * And then going even a layer beyond that, Kubernetes runs in a single. cluster. It's a single cluster. It's a single region tied to a single region. And when it comes to inference workloads and needing GPUs more and more, you know, we're seeing this that you cannot meet the demand inside of a single region. A single cloud's a single region. In other words, a single model might want to horizontally scale up to 200 replicas, each of which is, let's say, 2H100s or 4H100s or even a full node, you run into limits of the capacity inside of that one region. And what we had to build to get around that was the ability to have a single model have replicas across different regions. So, you know, there are models on Baseten today that have 50 replicas in GCP East and, 80 replicas in AWS West and Oracle in London, etc.* Developer experience for Compound AI Systems: The final one is wrapping the power of the first two pillars in a very good developer experience to be able to afford certain workflows like the ones that I mentioned, around multi step, multi model inference workloads, because more and more we're seeing that the market is moving towards those that the needs are generally in these sort of more complex workflows. We think they said it very well.Show Notes* Amir Haghighat, Co-Founder, Baseten* Yineng Zhang, Lead Software Engineer, Model Performance, BasetenFull YouTube EpisodePlease like and subscribe!Timestamps* 00:00 Introduction and Latest AI Model Launch* 00:11 DeepSeek v3: Specifications and Achievements* 03:10 Latent Space Podcast: Special Guests Introduction* 04:12 DeepSeek v3: Technical Insights* 11:14 Quantization and Model Performance* 16:19 MOE Models: Trends and Challenges* 18:53 Baseten's Inference Service and Pricing* 31:13 Optimization for DeepSeek* 31:45 Three Pillars of Mission Critical Inference Workloads* 32:39 Scaling Beyond Single GPU* 33:09 Challenges with Kubernetes and Infrastructure* 33:40 Multi-Region Scaling Solutions* 35:34 SG Lang: A New Framework* 38:52 Key Techniques Behind SG Lang* 48:27 Speculative Decoding and Performance* 49:54 Future of Fine-Tuning and RLHF* 01:00:28 Baseten's V3 and Industry TrendsBaseten's previous TensorRT LLM workshop: Get full access to Latent Space at www.latent.space/subscribe

Crazy Wisdom
Episode #420: Humanism Reloaded: Balancing Progress and Purpose in the Age of AI

Crazy Wisdom

Play Episode Listen Later Dec 23, 2024 64:49


On this episode of Crazy Wisdom, Stewart Alsop welcomes back guest David Hundley, a principal engineer at a Fortune 500 company specializing in innovative machine learning applications. The conversation spans topics like techno-humanism, the future interplay of consciousness and artificial intelligence, and the societal implications of technologies like neural interfaces and large language models. Together, they explore the philosophical and technical challenges posed by advancements in AI and what it means for humanity's trajectory. For more insights from David, visit his website or follow him on Twitter.Check out this GPT we trained on the conversation!Timestamps00:00 Introduction to the Crazy Wisdom Podcast00:31 Techno Humanism vs. Transhumanism02:14 Exploring Humanism and Its Historical Context05:06 Accelerationism and Consciousness06:58 AI Conversations and Human Interaction10:21 Challenges in AI and Machine Learning13:26 Product Integration and AI Limitations19:03 Coding with AI: Tools and Techniques25:28 Vector Stores vs. Traditional Databases32:16 Understanding Network Self-Optimization33:25 Exploring Parameters and Biases in AI34:53 Bias in AI and Societal Implications38:28 The Future of AI and Open Source44:01 Techno-Humanism and AI's Role in Society48:55 The Intersection of AI and Human Emotions52:48 The Ethical and Societal Impact of AI58:20 Final Thoughts and Future DirectionsKey InsightsTechno-Humanism as a Framework: David Hundley introduces "techno-humanism" as a philosophy that explores how technology and humanity can coexist and integrate without losing sight of human values. This perspective acknowledges the current reality that we are already cyborgs, augmented by devices like smartphones and smartwatches, and speculates on the deeper implications of emerging technologies like Neuralink, which could redefine the human experience.The Limitations of Large Language Models (LLMs): The discussion highlights that while LLMs are powerful tools, they lack true creativity or consciousness. They are stochastic parrots, reflecting and recombining existing knowledge rather than generating novel ideas. This distinction underscores the difference between human and artificial intelligence, particularly in the ability to create new explanations and knowledge.Biases and Zeitgeist Machines: LLMs are described as "zeitgeist machines," reflecting the biases and values embedded in their training data. While this mirrors societal norms, it raises concerns about how conscious and unconscious biases—shaped by culture, regulation, and curation—impact the models' outputs. The episode explores the ethical and societal implications of this phenomenon.The Role of Open Source in AI's Future: Open-source AI tools are positioned as critical to the democratization of technology. David suggests that open-source projects, such as those in the Python ecosystem, have historically driven innovation and accessibility, and this trend is likely to continue with AI. Open-source initiatives provide opportunities for decentralization, reducing reliance on corporate-controlled models.Potential of AI for Mental Health and Counseling: David shares his experience using AI for conversational support, comparing it to talking with a human friend. This suggests a growing potential for AI in mental health applications, offering companionship or guidance. However, the ethical implications of replacing human counselors with AI and the depth of empathy that machines can genuinely offer remain questions.The Future of Database Technologies: The discussion explores traditional databases versus emerging technologies like vector and graph databases, particularly in how they support AI. Graph databases, with their ability to encode relationships between pieces of information, could provide a more robust foundation for complex queries in knowledge-intensive environments.The Ethical and Societal Implications of AI: The conversation grapples with how AI could reshape societal structures and values, from its influence on decision-making to its potential integration with human cognition. Whether through regulation, neural enhancement, or changes in media dynamics, AI presents profound challenges and opportunities for human civilization, raising questions about autonomy, ethics, and collective progress.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Happy holidays! We'll be sharing snippets from Latent Space LIVE! through the break bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Computer, Thoth.ai, StrongCompute, Notable Capital, and most of all our LS supporters who helped fund the venue and A/V production!For NeurIPS last year we did our standard conference podcast coverage interviewing selected papers (that we have now also done for ICLR and ICML), however we felt that we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) recap 2024 year in review from experts. As a result, we organized the first Latent Space LIVE!, our first in person miniconference, at NeurIPS 2024 in Vancouver.Since Nathan Lambert ( Interconnects ) joined us for the hit RLHF 201 episode at the start of this year, it is hard to overstate how much Open Models have exploded this past year. In 2023 only five names were playing in the top LLM ranks, Mistral, Mosaic's MPT, TII UAE's Falcon, Yi from Kai-Fu Lee's 01.ai, and of course Meta's Llama 1 and 2. This year a whole cast of new open models have burst on the scene, from Google's Gemma and Cohere's Command R, to Alibaba's Qwen and Deepseek models, to LLM 360 and DCLM and of course to the Allen Institute's OLMo, OL MOE, Pixmo, Molmo, and Olmo 2 models. We were honored to host Luca Soldaini, one of the research leads on the Olmo series of models at AI2.Pursuing Open Model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe, California and the White House. We also were honored to hear from and Sophia Yang, head of devrel at Mistral, who also presented a great session at the AI Engineer World's Fair Open Models track!Full Talk on YouTubePlease like and subscribe!Timestamps* 00:00 Welcome to Latent Space Live * 00:12 Recap of 2024: Best Moments and Keynotes * 01:22 Explosive Growth of Open Models in 2024 * 02:04 Challenges in Open Model Research * 02:38 Keynote by Luca Soldani: State of Open Models * 07:23 Significance of Open Source AI Licenses * 11:31 Research Constraints and Compute Challenges * 13:46 Fully Open Models: A New Trend * 27:46 Mistral's Journey and Innovations * 32:57 Interactive Demo: Lachat Capabilities * 36:50 Closing Remarks and NetworkingTranscriptSession3Audio[00:00:00] AI Charlie: Welcome to Latent Space Live, our first mini conference held at NeurIPS 2024 in Vancouver. This is Charlie, your AI co host. As a special treat this week, we're recapping the best of 2024 going domain by domain. We sent out a survey to the over 900 of you who told us what you wanted, and then invited the best speakers in the latent space network to cover each field.[00:00:28] AI Charlie: 200 of you joined us in person throughout the day, with over 2, 200 watching live online. Our next keynote covers the state of open models in 2024, with Luca Soldani and Nathan Lambert of the Allen Institute for AI, with a special appearance from Dr. Sophia Yang of Mistral. Our first hit episode of 2024 was with Nathan Lambert on RLHF 201 back in January.[00:00:57] AI Charlie: Where he discussed both reinforcement learning for language [00:01:00] models and the growing post training and mid training stack with hot takes on everything from constitutional AI to DPO to rejection sampling and also previewed the sea change coming to the Allen Institute. And to Interconnects, his incredible substack on the technical aspects of state of the art AI training.[00:01:18] AI Charlie: We highly recommend subscribing to get access to his Discord as well. It is hard to overstate how much open models have exploded this past year. In 2023, only five names were playing in the top LLM ranks. Mistral, Mosaics MPT, and Gatsby. TII UAE's Falcon, Yi, from Kaifu Lee's 01. ai, And of course, Meta's Lama 1 and 2.[00:01:43] AI Charlie: This year, a whole cast of new open models have burst on the scene. From Google's Jemma and Cohere's Command R, To Alibaba's Quen and DeepSeq models, to LLM360 and DCLM, and of course, to the Allen Institute's OLMO, [00:02:00] OLMOE, PIXMO, MOLMO, and OLMO2 models. Pursuing open model research comes with a lot of challenges beyond just funding and access to GPUs and datasets, particularly the regulatory debates this year across Europe.[00:02:14] AI Charlie: California and the White House. We also were honored to hear from Mistral, who also presented a great session at the AI Engineer World's Fair Open Models track. As always, don't forget to check the show notes for the YouTube link to their talk, as well as their slides. Watch out and take care.[00:02:35] Luca Intro[00:02:35] Luca Soldaini: Cool. Yeah, thanks for having me over. I'm Luca. I'm a research scientist at the Allen Institute for AI. I threw together a few slides on sort of like a recap of like interesting themes in open models for, for 2024. Have about maybe 20, 25 minutes of slides, and then we can chat if there are any questions.[00:02:57] Luca Soldaini: If I can advance to the next slide. [00:03:00] Okay, cool. So I did the quick check of like, to sort of get a sense of like, how much 2024 was different from 2023. So I went on Hugging Face and sort of get, tried to get a picture of what kind of models were released in 2023 and like, what do we get in 2024?[00:03:16] Luca Soldaini: 2023 we get, we got things like both LLAMA 1 and 2, we got Mistral, we got MPT, Falcon models, I think the YI model came in at the end. Tail end of the year. It was a pretty good year. But then I did the same for 2024. And it's actually quite stark difference. You have models that are, you know, reveling frontier level.[00:03:38] Luca Soldaini: Performance of what you can get from closed models from like Quen, from DeepSeq. We got Llama3. We got all sorts of different models. I added our own Olmo at the bottom. There's this growing group of like, Fully open models that I'm going to touch on a little bit later. But you know, just looking at the slides, it feels like 2024 [00:04:00] was just smooth sailing, happy knees, much better than previous year.[00:04:04] Luca Soldaini: And you know, you can plot you can pick your favorite benchmark Or least favorite, I don't know, depending on what point you're trying to make. And plot, you know, your closed model, your open model and sort of spin it in ways that show that, oh, you know open models are much closer to where closed models are today versus to Versus last year where the gap was fairly significant.[00:04:29] Luca Soldaini: So one thing that I think I don't know if I have to convince people in this room, but usually when I give this talks about like open models, there is always like this background question in, in, in people's mind of like, why should we use open models? APIs argument, you know, it's, it's. Just an HTTP request to get output from a, from one of the best model out there.[00:04:53] Luca Soldaini: Why do I have to set up infra and use local models? And there are really like two answer. There is the more [00:05:00] researchy answer for this, which is where it might be. Background lays, which is just research. If you want to do research on language models, research thrives on, on open models, there is like large swath of research on modeling, on how these models behave on evaluation and inference on mechanistic interpretability that could not happen at all if you didn't have open models they're also for AI builders, they're also like.[00:05:30] Luca Soldaini: Good use cases for using local models. You know, you have some, this is like a very not comprehensive slides, but you have things like there are some application where local models just blow closed models out of the water. So like retrieval, it's a very clear example. We might have like constraints like Edge AI applications where it makes sense.[00:05:51] Luca Soldaini: But even just like in terms of like stability, being able to say this model is not changing under the hood. It's, there's plenty of good cases for, [00:06:00] for open models. And the community is just not models. Is I stole this slide from one of the Quent2 announcement blog posts. But it's super cool to see like how much tech exists around open models and serving them on making them efficient and hosting them.[00:06:18] Luca Soldaini: It's pretty cool. And so. It's if you think about like where the term opens come from, comes from like the open source really open models meet the core tenants of, of open, of open source specifically when it comes around collaboration, there is truly a spirit, like through these open models, you can build on top of other people.[00:06:41] Luca Soldaini: innovation. We see a lot of these even in our own work of like, you know, as we iterate in the various versions of Alma it's not just like every time we collect from scratch all the data. No, the first step is like, okay, what are the cool data sources and datasets people have put [00:07:00] together for language model for training?[00:07:01] Luca Soldaini: Or when it comes to like our post training pipeline We one of the steps is you want to do some DPO and you use a lot of outputs of other models to improve your, your preference model. So it's really having like an open sort of ecosystem benefits and accelerates the development of open models.[00:07:23] The Definition of Open Models[00:07:23] Luca Soldaini: One thing that we got in 2024, which is not a specific model, but I thought it was really significant, is we first got we got our first open source AI definition. So this is from the open source initiative they've been generally the steward of a lot of the open source licenses when it comes to software and so they embarked on this journey in trying to figure out, okay, How does a license, an open source license for a model look like?[00:07:52] Luca Soldaini: Majority of the work is very dry because licenses are dry. So I'm not going to walk through the license step by [00:08:00] step, but I'm just going to pick out one aspect that is very good and then one aspect that personally feels like it needs improvement on the good side. This this open source AI license actually.[00:08:13] Luca Soldaini: This is very intuitive. If you ever build open source software and you have some expectation around like what open source looks like for software for, for AI, sort of matches your intuition. So, the weights need to be fairly available the code must be released with an open source license and there shouldn't be like license clauses that block specific use cases.[00:08:39] Luca Soldaini: So. Under this definition, for example, LLAMA or some of the QUEN models are not open source because the license says you can't use this model for this or it says if you use this model you have to name the output this way or derivative needs to be named that way. Those clauses don't meet open source [00:09:00] definition and so they will not be covered.[00:09:02] Luca Soldaini: The LLAMA license will not be covered under the open source definition. It's not perfect. One of the thing that, um, internally, you know, in discussion with with OSI, we were sort of disappointed is around the language. For data. So you might imagine that an open source AI model means a model where the data is freely available.[00:09:26] Luca Soldaini: There were discussion around that, but at the end of the day, they decided to go with a softened stance where they say a model is open source if you provide sufficient detail information. On how to sort of replicate the data pipeline. So you have an equivalent system, sufficient, sufficiently detailed.[00:09:46] Luca Soldaini: It's very, it's very fuzzy. Don't like that. An equivalent system is also very fuzzy. And this doesn't take into account the accessibility of the process, right? It might be that you provide enough [00:10:00] information, but this process costs, I don't know, 10 million to do. Now the open source definition. Like, any open source license has never been about accessibility, so that's never a factor in open source software, how accessible software is.[00:10:14] Luca Soldaini: I can make a piece of open source, put it on my hard drive, and never access it. That software is still open source, the fact that it's not widely distributed doesn't change the license, but practically there are expectations of like, what we want good open sources to be. So, it's, It's kind of sad to see that the data component in this license is not as, as, Open as some of us would like would like it to be.[00:10:40] Challenges for Open Models[00:10:40] Luca Soldaini: and I linked a blog post that Nathan wrote on the topic that it's less rambly and easier to follow through. One thing that in general, I think it's fair to say about the state of open models in 2024 is that we know a lot more than what we knew in, [00:11:00] in 2023. Like both on the training data, like And the pre training data you curate on like how to do like all the post training, especially like on the RL side.[00:11:10] Luca Soldaini: You know, 2023 was a lot of like throwing random darts at the board. I think 2024, we have clear recipes that, okay, don't get the same results as a closed lab because there is a cost in, in actually matching what they do. But at least we have a good sense of like, okay, this is, this is the path to get state of the art language model.[00:11:31] Luca Soldaini: I think that one thing that it's a downside of 2024 is that I think we are more research constrained in 2023. It feels that, you know, the barrier for compute that you need to, to move innovation along as just being right rising and rising. So like, if you go back to this slide, there is now this, this cluster of models that are sort of released by the.[00:11:57] Luca Soldaini: Compute rich club. Membership is [00:12:00] hotly debated. You know, some people don't want to be. Called the rich because it comes to expectations. Some people want to be called rich, but I don't know, there's debate, but like, these are players that have, you know, 10, 000, 50, 000 GPUs at minimum. And so they can do a lot of work and a lot of exploration and improving models that it's not very accessible.[00:12:21] Luca Soldaini: To give you a sense of like how I personally think about. Research budget for each part of the, of the language model pipeline is like on the pre training side, you can maybe do something with a thousand GPUs, really you want 10, 000. And like, if you want real estate of the art, you know, your deep seek minimum is like 50, 000 and you can scale to infinity.[00:12:44] Luca Soldaini: The more you have, the better it gets. Everyone on that side still complains that they don't have enough GPUs. Post training is a super wide sort of spectrum. You can do as little with like eight GPUs as long as you're able to [00:13:00] run, you know, a good version of, say, a LLAMA model, you can do a lot of work there.[00:13:05] Luca Soldaini: You can scale a lot of the methodology, just like scales with compute, right? If you're interested in you know, your open replication of what OpenAI's O1 is you're going to be on the 10K spectrum of our GPUs. Inference, you can do a lot with very few resources. Evaluation, you can do a lot with, well, I should say at least one GPUs if you want to evaluate GPUs.[00:13:30] Luca Soldaini: Open models but in general, like if you are, if you care a lot about intervention to do on this model, which it's my prefer area of, of research, then, you know, the resources that you need are quite, quite significant. Yeah. One other trends that has emerged in 2024 is this cluster of fully open models.[00:13:54] Luca Soldaini: So Omo the model that we built at ai, two being one of them and you know, it's nice [00:14:00] that it's not just us. There's like a cluster of other mostly research efforts who are working on this. And so it's good to to give you a primer of what like fully open means. So fully open, the easy way to think about it is instead of just releasing a model checkpoint that you run, you release a full recipe so that other people working on it.[00:14:24] Luca Soldaini: Working on that space can pick and choose whatever they want from your recipe and create their own model or improve on top of your model. You're giving out the full pipeline and all the details there instead of just like the end output. So I pull up the screenshot from our recent MOE model.[00:14:43] Luca Soldaini: And like for this model, for example, we released the model itself. Data that was trained on, the code, both for training and inference all the logs that we got through the training run, as well as every intermediate checkpoint and like the fact that you release different part of the pipeline [00:15:00] allows others to do really cool things.[00:15:02] Luca Soldaini: So for example, this tweet from early this year from folks in news research they use our pre training data to do a replication of the BitNet paper in the open. So they took just a Really like the initial part of a pipeline and then the, the thing on top of it. It goes both ways.[00:15:21] Luca Soldaini: So for example, for the Olmo2 model a lot of our pre trained data for the first stage of pre training was from this DCLM initiative that was led by folks Ooh, a variety of ins a variety of institutions. It was a really nice group effort. But you know, for When it was nice to be able to say, okay, you know, the state of the art in terms of like what is done in the open has improved.[00:15:46] AI2 Models - Olmo, Molmo, Pixmo etc[00:15:46] Luca Soldaini: We don't have to like do all this work from scratch to catch up the state of the art. We can just take it directly and integrate it and do our own improvements on top of that. I'm going to spend a few minutes doing like a [00:16:00] shameless plug for some of our fully open recipes. So indulge me in this.[00:16:05] Luca Soldaini: So a few things that we released this year was, as I was mentioning, there's OMOE model which is, I think still is state of the art MOE model in its size class. And it's also. Fully open, so every component of this model is available. We released a multi modal model called Molmo. Molmo is not just a model, but it's a full recipe of how you go from a text only model to a multi modal model, and we apply this recipe on top of Quent checkpoints, on top of Olmo checkpoints, as well as on top of OlmoE.[00:16:37] Luca Soldaini: And I think there'd be a replication doing that on top of Mistral as well. The post training side we recently released 2. 0. 3. Same story. This is a recipe on how you go from a base model to A state of the art post training model. We use the Tulu recipe on top of Olmo, on top of Llama, and then there's been open replication effort [00:17:00] to do that on top of Quen as well.[00:17:02] Luca Soldaini: It's really nice to see like, you know, when your recipe sort of, it's kind of turnkey, you can apply it to different models and it kind of just works. And finally, the last thing we released this year was Olmo 2, which so far is the best state of the art. Fully open language model a Sera combines aspect from all three of these previous models.[00:17:22] Luca Soldaini: What we learn on the data side from MomoE and what we learn on like making models that are easy to adapt from the Momo project and the Tulu project. I will close with a little bit of reflection of like ways this, this ecosystem of open models like it's not all roses. It's not all happy. It feels like day to day, it's always in peril.[00:17:44] Luca Soldaini: And, you know, I talked a little bit about like the compute issues that come with it. But it's really not just compute. One thing that is on top of my mind is due to like the environment and how you know, growing feelings about like how AI is treated. [00:18:00] It's actually harder to get access to a lot of the data that was used to train a lot of the models up to last year.[00:18:06] Luca Soldaini: So this is a screenshot from really fabulous work from Shane Longpre who's, I think is in Europe about Just access of like diminishing access to data for language model pre training. So what they did is they went through every snapshot of common crawl. Common crawl is this publicly available scrape of the, of a subset of the internet.[00:18:29] Luca Soldaini: And they looked at how For any given website whether a website that was accessible in say 2017, what, whether it was accessible or not in 2024. And what they found is as a reaction to like the close like of the existence of closed models like OpenAI or Cloud GPT or Cloud a lot of content owners have blanket Blocked any type of crawling to your website.[00:18:57] Luca Soldaini: And this is something that we see also internally at [00:19:00] AI2. Like one project that we started this year is we wanted to, we wanted to understand, like, if you're a good citizen of the internet and you crawl following sort of norms and policy that have been established in the last 25 years, what can you crawl?[00:19:17] Luca Soldaini: And we found that there's a lot of website where. The norms of how you express preference of whether to crawl your data or not are broken. A lot of people would block a lot of crawling, but do not advertise that in RobustDXT. You can only tell that they're crawling, that they're blocking you in crawling when you try doing it.[00:19:37] Luca Soldaini: Sometimes you can't even crawl the robots. txt to, to check whether you're allowed or not. And then a lot of websites there's, there's like all these technologies that historically have been, have existed to make websites serving easier such as Cloudflare or DNS. They're now being repurposed for blocking AI or any type of crawling [00:20:00] in a way that is Very opaque to the content owners themselves.[00:20:04] Luca Soldaini: So, you know, you go to these websites, you try to access them and they're not available and you get a feeling it's like, Oh, someone changed, something changed on the, on the DNS side that it's blocking this and likely the content owner has no idea. They're just using a Cloudflare for better, you know, load balancing.[00:20:25] Luca Soldaini: And this is something that was sort of sprung on them with very little notice. And I think the problem is this, this blocking or ideas really, it impacts people in different ways. It disproportionately helps companies that have a headstart, which are usually the closed labs and it hurts incoming newcomer players where either have now to do things in a sketchy way or you're never going to get that content that the closed lab might have.[00:20:54] Luca Soldaini: So there's a lot, it was a lot of coverage. I'm going to plug Nathan's blog post again. That is, [00:21:00] that I think the title of this one is very succinct which is like, we're actually not, You know, before thinking about running out of training data, we're actually running out of open training data. And so if we want better open models they should be on top of our mind.[00:21:13] Regulation and Lobbying[00:21:13] Luca Soldaini: The other thing that has emerged is that there is strong lobbying efforts on trying to define any kind of, AI as like a new extremely risky and I want to be precise here. Like the problem is now, um, like the problem is not not considering the risk of this technology. Every technology has risks that, that should always be considered.[00:21:37] Luca Soldaini: The thing that it's like to me is sorry, is ingenious is like just putting this AI on a pedestal and calling it like, An unknown alien technology that has like new and undiscovered potentials to destroy humanity. When in reality, all the dangers I think are rooted in [00:22:00] dangers that we know from existing software industry or existing issues that come with when using software on on a lot of sensitive domains, like medical areas.[00:22:13] Luca Soldaini: And I also noticed a lot of efforts that have actually been going on and trying to make this open model safe. I pasted one here from AI2, but there's actually like a lot of work that has been going on on like, okay, how do you make, if you're distributing this model, Openly, how do you make it safe?[00:22:31] Luca Soldaini: How, what's the right balance between accessibility on open models and safety? And then also there's annoying brushing of sort of concerns that are then proved to be unfounded under the rug. You know, if you remember the beginning of this year, it was all about bio risk of these open models.[00:22:48] Luca Soldaini: The whole thing fizzled because as being Finally, there's been like rigorous research, not just this paper from Cohere folks, but it's been rigorous research showing [00:23:00] that this is really not a concern that we should be worried about. Again, there is a lot of dangerous use of AI applications, but this one was just like, A lobbying ploy to just make things sound scarier than they actually are.[00:23:15] Luca Soldaini: So I got to preface this part. It says, this is my personal opinion. It's not my employer, but I look at things like the SP 1047 from, from California. And I think we kind of dodged a bullet on, on this legislation. We, you know, the open source community, a lot of the community came together at the last, sort of the last minute and did a very good effort trying to explain all the negative impact of this bill.[00:23:43] Luca Soldaini: But There's like, I feel like there's a lot of excitement on building these open models or like researching on these open models. And lobbying is not sexy it's kind of boring but it's sort of necessary to make sure that this ecosystem can, can really [00:24:00] thrive. This end of presentation, I have Some links, emails, sort of standard thing in case anyone wants to reach out and if folks have questions or anything they wanted to discuss.[00:24:13] Luca Soldaini: Is there an open floor? I think we have Sophia[00:24:16] swyx: who wants to who one, one very important open model that we haven't covered is Mistral. Ask her on this slide. Yeah, yeah. Well, well, it's nice to have the Mistral person talk recap the year in Mistral. But while Sophia gets set up, does anyone have like, just thoughts or questions about the progress in this space?[00:24:32] Questions - Incentive Alignment[00:24:32] swyx: Do you always have questions?[00:24:34] Quesiton: I'm very curious how we should build incentives to build open models, things like Francois Chollet's ArcPrize, and other initiatives like that. What is your opinion on how we should better align incentives in the community so that open models stay open?[00:24:49] Luca Soldaini: The incentive bit is, like, really hard.[00:24:51] Luca Soldaini: Like, even It's something that I actually, even we think a lot about it internally because like building open models is risky. [00:25:00] It's very expensive. And so people don't want to take risky bets. I think the, definitely like the challenges like our challenge, I think those are like very valid approaches for it.[00:25:13] Luca Soldaini: And then I think in general, promoting, building, so, any kind of effort to participate in this challenge, in those challenges, if we can promote doing that on top of open models and sort of really lean into like this multiplier effect, I think that is a good way to go. If there were more money for that.[00:25:35] Luca Soldaini: For efforts like research efforts around open models. There's a lot of, I think there's a lot of investments in companies that at the moment are releasing their model in the open, which is really cool. But it's usually more because of commercial interest and not wanting to support this, this like open models in the longterm, it's a really hard problem because I think everyone is operating sort of [00:26:00] in what.[00:26:01] Luca Soldaini: Everyone is at their local maximum, right? In ways that really optimize their position on the market. Global maximum is harder to achieve.[00:26:11] Question2: Can I ask one question? No.[00:26:12] Luca Soldaini: Yeah.[00:26:13] Question2: So I think one of the gap between the closed and open source models is the mutability. So the closed source models like chat GPT works pretty good on the low resource languages, which is not the same on the open, open source models, right?[00:26:27] Question2: So is it in your plan to improve on that?[00:26:32] Luca Soldaini: I think in general,[00:26:32] Luca Soldaini: yes, is I think it's. I think we'll see a lot of improvements there in, like, 2025. Like, there's groups like, Procurement English on the smaller side that are already working on, like, better crawl support, multilingual support. I think what I'm trying to say here is you really want to be experts.[00:26:54] Luca Soldaini: who are actually in those countries that teach those languages to [00:27:00] participate in the international community. To give you, like, a very easy example I'm originally from Italy. I think I'm terribly equipped to build a model that works well in Italian. Because one of the things you need to be able to do is having that knowledge of, like, okay, how do I access, you know, how Libraries, or content that is from this region that covers this language.[00:27:23] Luca Soldaini: I've been in the US long enough that I no longer know. So, I think that's the efforts that folks in Central Europe, for example, are doing. Around like, okay, let's tap into regional communities. To get access you know, to bring in collaborators from those areas. I think it's going to be, like, very crucial for getting products there.[00:27:46] Mistral intro[00:27:46] Sophia Yang: Hi everyone. Yeah, I'm super excited to be here to talk to you guys about Mistral. A really short and quick recap of what we have done, what kind of models and products we have released in the [00:28:00] past year and a half. So most of you We have already known that we are a small startup funded about a year and a half ago in Paris in May, 2003, it was funded by three of our co founders, and in September, 2003, we released our first open source model, Mistral 7b yeah, how, how many of you have used or heard about Mistral 7b?[00:28:24] Sophia Yang: Hey, pretty much everyone. Thank you. Yeah, it's our Pretty popular and community. Our committee really loved this model, and in December 23, we, we released another popular model with the MLE architecture Mr. A X seven B and oh. Going into this year, you can see we have released a lot of things this year.[00:28:46] Sophia Yang: First of all, in February 2004, we released MrSmall, MrLarge, LeChat, which is our chat interface, I will show you in a little bit. We released an embedding model for, you [00:29:00] know, converting your text into embedding vectors, and all of our models are available. The, the big cloud resources. So you can use our model on Google cloud, AWS, Azure Snowflake, IBM.[00:29:16] Sophia Yang: So very useful for enterprise who wants to use our model through cloud. And in April and May this year, we released another powerful open source MOE model, AX22B. And we also released our first code. Code Model Coastal, which is amazing at 80 plus languages. And then we provided another fine tuning service for customization.[00:29:41] Sophia Yang: So because we know the community love to fine tune our models, so we provide you a very nice and easy option for you to fine tune our model on our platform. And also we released our fine tuning code base called Menstrual finetune. It's open source, so feel free to take it. Take a look and.[00:29:58] Sophia Yang: More models. [00:30:00] On July 2, November this year, we released many, many other models. First of all is the two new small, best small models. We have Minestra 3B great for Deploying on edge devices we have Minstrel 8B if you used to use Minstrel 7B, Minstrel 8B is a great replacement with much stronger performance than Minstrel 7B.[00:30:25] Sophia Yang: We also collaborated with NVIDIA and open sourced another model, Nemo 12B another great model. And Just a few weeks ago, we updated Mistral Large with the version 2 with the updated, updated state of the art features and really great function calling capabilities. It's supporting function calling in LatentNate.[00:30:45] Sophia Yang: And we released two multimodal models Pixtral 12b. It's this open source and Pixtral Large just amazing model for, models for not understanding images, but also great at text understanding. So. Yeah, a [00:31:00] lot of the image models are not so good at textual understanding, but pixel large and pixel 12b are good at both image understanding and textual understanding.[00:31:09] Sophia Yang: And of course, we have models for research. Coastal Mamba is built on Mamba architecture and MathRoll, great with working with math problems. So yeah, that's another model.[00:31:29] Sophia Yang: Here's another view of our model reference. We have several premier models, which means these models are mostly available through our API. I mean, all of the models are available throughout our API, except for Ministry 3B. But for the premier model, they have a special license. Minstrel research license, you can use it for free for exploration, but if you want to use it for enterprise for production use, you will need to purchase a license [00:32:00] from us.[00:32:00] Sophia Yang: So on the top row here, we have Minstrel 3b and 8b as our premier model. Minstrel small for best, best low latency use cases, MrLarge is great for your most sophisticated use cases. PixelLarge is the frontier class multimodal model. And, and we have Coastral for great for coding and then again, MrEmbedding model.[00:32:22] Sophia Yang: And The bottom, the bottom of the slides here, we have several Apache 2. 0 licensed open way models. Free for the community to use, and also if you want to fine tune it, use it for customization, production, feel free to do so. The latest, we have Pixtros 3 12b. We also have Mr. Nemo mum, Coastal Mamba and Mastro, as I mentioned, and we have three legacy models that we don't update anymore.[00:32:49] Sophia Yang: So we recommend you to move to our newer models if you are still using them. And then, just a few weeks ago, [00:33:00] we did a lot of, uh, improvements to our code interface, Lachette. How many of you have used Lachette? Oh, no. Only a few. Okay. I highly recommend Lachette. It's chat. mistral. ai. It's free to use.[00:33:16] Sophia Yang: It has all the amazing capabilities I'm going to show you right now. But before that, Lachette in French means cat. So this is actually a cat logo. If you You can tell this is the cat eyes. Yeah. So first of all, I want to show you something Maybe let's, let's take a look at image understanding.[00:33:36] Sophia Yang: So here I have a receipts and I want to ask, just going to get the prompts. Cool. So basically I have a receipt and I said I ordered I don't know. Coffee and the sausage. How much do I owe? Add a 18 percent tip. So hopefully it was able to get the cost of the coffee and the [00:34:00] sausage and ignore the other things.[00:34:03] Sophia Yang: And yeah, I don't really understand this, but I think this is coffee. It's yeah. Nine, eight. And then cost of the sausage, we have 22 here. And then it was able to add the cost, calculate the tip, and all that. Great. So, it's great at image understanding, it's great at OCR tasks. So, if you have OCR tasks, please use it.[00:34:28] Sophia Yang: It's free on the chat. It's also available through our API. And also I want to show you a Canvas example. A lot of you may have used Canvas with other tools before. But, With Lachat, it's completely free again. Here, I'm asking it to create a canvas that's used PyScript to execute Python in my browser.[00:34:51] Sophia Yang: Let's see if it works. Import this. Okay, so, yeah, so basically it's executing [00:35:00] Python here. Exactly what we wanted. And the other day, I was trying to ask Lachat to create a game for me. Let's see if we can make it work. Yeah, the Tetris game. Yep. Let's just get one row. Maybe. Oh no. Okay. All right. You get the idea. I failed my mission. Okay. Here we go. Yay! Cool. Yeah. So as you can see, Lachet can write, like, a code about a simple game pretty easily. And you can ask Lachet to explain the code. Make updates however you like. Another example. There is a bar here I want to move.[00:35:48] Sophia Yang: Okay, great, okay. And let's go back to another one. Yeah, we also have web search capabilities. Like, you can [00:36:00] ask what's the latest AI news. Image generation is pretty cool. Generate an image about researchers. Okay. In Vancouver? Yeah, it's Black Forest Labs flux Pro. Again, this is free, so Oh, cool.[00:36:19] Sophia Yang: I guess researchers here are mostly from University of British Columbia. That's smart. Yeah. So this is Laia ira. Please feel free to use it. And let me know if you have any feedback. We're always looking for improvement and we're gonna release a lot more powerful features in the coming years.[00:36:37] Sophia Yang: Thank you. Get full access to Latent Space at www.latent.space/subscribe

Beyond Preference Alignment: Teaching AIs to Play Roles & Respect Norms, with Tan Zhi Xuan

Play Episode Listen Later Nov 30, 2024 117:12


In this episode of The Cognitive Revolution, Nathan explores groundbreaking perspectives on AI alignment with MIT PhD student Tan Zhi Xuan. We dive deep into Xuan's critique of preference-based AI alignment and their innovative proposal for role-based AI systems guided by social consensus. The conversation extends into their fascinating work on how AI agents can learn social norms through Bayesian rule induction. Join us for an intellectually stimulating discussion that bridges philosophical theory with practical implementation in AI development. Check out: "Beyond Preferences in AI Alignment" paper: https://arxiv.org/pdf/2408.16984 "Learning and Sustaining Shared Normative Systems via Bayesian Rule Induction in Markov Games" paper: https://arxiv.org/pdf/2402.13399 Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse SPONSORS: Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers13. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive RECOMMENDED PODCAST: Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more. Apple: https://podcasts.apple.com/us/podcast/id1765716600 Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg CHAPTERS: (00:00:00) Teaser (00:01:09) About the Episode (00:04:25) Guest Intro (00:06:25) Xuan's Background (00:12:03) AI Near-Term Outlook (00:17:32) Sponsors: Notion | Weights & Biases RAG++ (00:20:18) Alignment Approaches (00:26:11) Critiques of RLHF (00:34:40) Sponsors: Oracle Cloud Infrastructure (OCI) (00:35:50) Beyond Preferences (00:40:27) Roles and AI Systems (00:45:19) What AI Owes Us (00:51:52) Drexler's AI Services (01:01:08) Constitutional AI (01:09:43) Technical Approach (01:22:01) Norms and Deviations (01:32:31) Norm Decay (01:38:06) Self-Other Overlap (01:44:05) Closing Thoughts (01:54:23) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Alessio will be at AWS re:Invent next week and hosting a casual coffee meetup on Wednesday, RSVP here! And subscribe to our calendar for our Singapore, NeurIPS, and all upcoming meetups!We are still taking questions for our next big recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show!If you've been following the AI agents space, you have heard of Lindy AI; while founder Flo Crivello is hesitant to call it "blowing up," when folks like Andrew Wilkinson start obsessing over your product, you're definitely onto something.In our latest episode, Flo walked us through Lindy's evolution from late 2022 to now, revealing some design choices about agent platform design that go against conventional wisdom in the space.The Great Reset: From Text Fields to RailsRemember late 2022? Everyone was "LLM-pilled," believing that if you just gave a language model enough context and tools, it could do anything. Lindy 1.0 followed this pattern:* Big prompt field ✅* Bunch of tools ✅* Prayer to the LLM gods ✅Fast forward to today, and Lindy 2.0 looks radically different. As Flo put it (~17:00 in the episode): "The more you can put your agent on rails, one, the more reliable it's going to be, obviously, but two, it's also going to be easier to use for the user."Instead of a giant, intimidating text field, users now build workflows visually:* Trigger (e.g., "Zendesk ticket received")* Required actions (e.g., "Check knowledge base")* Response generationThis isn't just a UI change - it's a fundamental rethinking of how to make AI agents reliable. As Swyx noted during our discussion: "Put Shoggoth in a box and make it a very small, minimal viable box. Everything else should be traditional if-this-then-that software."The Surprising Truth About Model LimitationsHere's something that might shock folks building in the space: with Claude 3.5 Sonnet, the model is no longer the bottleneck. Flo's exact words (~31:00): "It is actually shocking the extent to which the model is no longer the limit. It was the limit a year ago. It was too expensive. The context window was too small."Some context: Lindy started when context windows were 4K tokens. Today, their system prompt alone is larger than that. But what's really interesting is what this means for platform builders:* Raw capabilities aren't the constraint anymore* Integration quality matters more than model performance* User experience and workflow design are the new bottlenecksThe Search Engine Parallel: Why Horizontal Platforms Might WinOne of the spiciest takes from our conversation was Flo's thesis on horizontal vs. vertical agent platforms. He draws a fascinating parallel to search engines (~56:00):"I find it surprising the extent to which a horizontal search engine has won... You go through Google to search Reddit. You go through Google to search Wikipedia... search in each vertical has more in common with search than it does with each vertical."His argument: agent platforms might follow the same pattern because:* Agents across verticals share more commonalities than differences* There's value in having agents that can work together under one roof* The R&D cost of getting agents right is better amortized across use casesThis might explain why we're seeing early vertical AI companies starting to expand horizontally. The core agent capabilities - reliability, context management, tool integration - are universal needs.What This Means for BuildersIf you're building in the AI agents space, here are the key takeaways:* Constrain First: Rather than maximizing capabilities, focus on reliable execution within narrow bounds* Integration Quality Matters: With model capabilities plateauing, your competitive advantage lies in how well you integrate with existing tools* Memory Management is Key: Flo revealed they actively prune agent memories - even with larger context windows, not all memories are useful* Design for Discovery: Lindy's visual workflow builder shows how important interface design is for adoptionThe Meta LayerThere's a broader lesson here about AI product development. Just as Lindy evolved from "give the LLM everything" to "constrain intelligently," we might see similar evolution across the AI tooling space. The winners might not be those with the most powerful models, but those who best understand how to package AI capabilities in ways that solve real problems reliably.Full Video PodcastFlo's talk at AI Engineer SummitChapters* 00:00:00 Introductions * 00:04:05 AI engineering and deterministic software * 00:08:36 Lindys demo* 00:13:21 Memory management in AI agents * 00:18:48 Hierarchy and collaboration between Lindys * 00:21:19 Vertical vs. horizontal AI tools * 00:24:03 Community and user engagement strategies * 00:26:16 Rickrolling incident with Lindy * 00:28:12 Evals and quality control in AI systems * 00:31:52 Model capabilities and their impact on Lindy * 00:39:27 Competition and market positioning * 00:42:40 Relationship between Factorio and business strategy * 00:44:05 Remote work vs. in-person collaboration * 00:49:03 Europe vs US Tech* 00:58:59 Testing the Overton window and free speech * 01:04:20 Balancing AI safety concerns with business innovation Show Notes* Lindy.ai* Rick Rolling* Flo on X* TeamFlow* Andrew Wilkinson* Dust* Poolside.ai* SB1047* Gathertown* Sid Sijbrandij* Matt Mullenweg* Factorio* Seeing Like a StateTranscriptAlessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.Swyx [00:00:12]: Hey, and today we're joined in the studio by Florent Crivello. Welcome.Flo [00:00:15]: Hey, yeah, thanks for having me.Swyx [00:00:17]: Also known as Altimore. I always wanted to ask, what is Altimore?Flo [00:00:21]: It was the name of my character when I was playing Dungeons & Dragons. Always. I was like 11 years old.Swyx [00:00:26]: What was your classes?Flo [00:00:27]: I was an elf. I was a magician elf.Swyx [00:00:30]: Well, you're still spinning magic. Right now, you're a solo founder and CEO of Lindy.ai. What is Lindy?Flo [00:00:36]: Yeah, we are a no-code platform letting you build your own AI agents easily. So you can think of we are to LangChain as Airtable is to MySQL. Like you can just pin up AI agents super easily by clicking around and no code required. You don't have to be an engineer and you can automate business workflows that you simply could not automate before in a few minutes.Swyx [00:00:55]: You've been in our orbit a few times. I think you spoke at our Latent Space anniversary. You spoke at my summit, the first summit, which was a really good keynote. And most recently, like we actually already scheduled this podcast before this happened. But Andrew Wilkinson was like, I'm obsessed by Lindy. He's just created a whole bunch of agents. So basically, why are you blowing up?Flo [00:01:16]: Well, thank you. I think we are having a little bit of a moment. I think it's a bit premature to say we're blowing up. But why are things going well? We revamped the product majorly. We called it Lindy 2.0. I would say we started working on that six months ago. We've actually not really announced it yet. It's just, I guess, I guess that's what we're doing now. And so we've basically been cooking for the last six months, like really rebuilding the product from scratch. I think I'll list you, actually, the last time you tried the product, it was still Lindy 1.0. Oh, yeah. If you log in now, the platform looks very different. There's like a ton more features. And I think one realization that we made, and I think a lot of folks in the agent space made the same realization, is that there is such a thing as too much of a good thing. I think many people, when they started working on agents, they were very LLM peeled and chat GPT peeled, right? They got ahead of themselves in a way, and us included, and they thought that agents were actually, and LLMs were actually more advanced than they actually were. And so the first version of Lindy was like just a giant prompt and a bunch of tools. And then the realization we had was like, hey, actually, the more you can put your agent on Rails, one, the more reliable it's going to be, obviously, but two, it's also going to be easier to use for the user, because you can really, as a user, you get, instead of just getting this big, giant, intimidating text field, and you type words in there, and you have no idea if you're typing the right word or not, here you can really click and select step by step, and tell your agent what to do, and really give as narrow or as wide a guardrail as you want for your agent. We started working on that. We called it Lindy on Rails about six months ago, and we started putting it into the hands of users over the last, I would say, two months or so, and I think things really started going pretty well at that point. The agent is way more reliable, way easier to set up, and we're already seeing a ton of new use cases pop up.Swyx [00:03:00]: Yeah, just a quick follow-up on that. You launched the first Lindy in November last year, and you were already talking about having a DSL, right? I remember having this discussion with you, and you were like, it's just much more reliable. Is this still the DSL under the hood? Is this a UI-level change, or is it a bigger rewrite?Flo [00:03:17]: No, it is a much bigger rewrite. I'll give you a concrete example. Suppose you want to have an agent that observes your Zendesk tickets, and it's like, hey, every time you receive a Zendesk ticket, I want you to check my knowledge base, so it's like a RAG module and whatnot, and then answer the ticket. The way it used to work with Lindy before was, you would type the prompt asking it to do that. You check my knowledge base, and so on and so forth. The problem with doing that is that it can always go wrong. You're praying the LLM gods that they will actually invoke your knowledge base, but I don't want to ask it. I want it to always, 100% of the time, consult the knowledge base after it receives a Zendesk ticket. And so with Lindy, you can actually have the trigger, which is Zendesk ticket received, have the knowledge base consult, which is always there, and then have the agent. So you can really set up your agent any way you want like that.Swyx [00:04:05]: This is something I think about for AI engineering as well, which is the big labs want you to hand over everything in the prompts, and only code of English, and then the smaller brains, the GPU pours, always want to write more code to make things more deterministic and reliable and controllable. One way I put it is put Shoggoth in a box and make it a very small, the minimal viable box. Everything else should be traditional, if this, then that software.Flo [00:04:29]: I love that characterization, put the Shoggoth in the box. Yeah, we talk about using as much AI as necessary and as little as possible.Alessio [00:04:37]: And what was the choosing between kind of like this drag and drop, low code, whatever, super code-driven, maybe like the Lang chains, auto-GPT of the world, and maybe the flip side of it, which you don't really do, it's like just text to agent, it's like build the workflow for me. Like what have you learned actually putting this in front of users and figuring out how much do they actually want to add it versus like how much, you know, kind of like Ruby on Rails instead of Lindy on Rails, it's kind of like, you know, defaults over configuration.Flo [00:05:06]: I actually used to dislike when people said, oh, text is not a great interface. I was like, ah, this is such a mid-take, I think text is awesome. And I've actually come around, I actually sort of agree now that text is really not great. I think for people like you and me, because we sort of have a mental model, okay, when I type a prompt into this text box, this is what it's going to do, it's going to map it to this kind of data structure under the hood and so forth. I guess it's a little bit blackmailing towards humans. You jump on these calls with humans and you're like, here's a text box, this is going to set up an agent for you, do it. And then they type words like, I want you to help me put order in my inbox. Oh, actually, this is a good one. This is actually a good one. What's a bad one? I would say 60 or 70% of the prompts that people type don't mean anything. Me as a human, as AGI, I don't understand what they mean. I don't know what they mean. It is actually, I think whenever you can have a GUI, it is better than to have just a pure text interface.Alessio [00:05:58]: And then how do you decide how much to expose? So even with the tools, you have Slack, you have Google Calendar, you have Gmail. Should people by default just turn over access to everything and then you help them figure out what to use? I think that's the question. When I tried to set up Slack, it was like, hey, give me access to all channels and everything, which for the average person probably makes sense because you don't want to re-prompt them every time you add new channels. But at the same time, for maybe the more sophisticated enterprise use cases, people are like, hey, I want to really limit what you have access to. How do you kind of thread that balance?Flo [00:06:35]: The general philosophy is we ask for the least amount of permissions needed at any given moment. I don't think Slack, I could be mistaken, but I don't think Slack lets you request permissions for just one channel. But for example, for Google, obviously there are hundreds of scopes that you could require for Google. There's a lot of scopes. And sometimes it's actually painful to set up your Lindy because you're going to have to ask Google and add scopes five or six times. We've had sessions like this. But that's what we do because, for example, the Lindy email drafter, she's going to ask you for your authorization once for, I need to be able to read your email so I can draft a reply, and then another time for I need to be able to write a draft for them. We just try to do it very incrementally like that.Alessio [00:07:15]: Do you think OAuth is just overall going to change? I think maybe before it was like, hey, we need to set up OAuth that humans only want to kind of do once. So we try to jam-pack things all at once versus what if you could on-demand get different permissions every time from different parts? Do you ever think about designing things knowing that maybe AI will use it instead of humans will use it? Yeah, for sure.Flo [00:07:37]: One pattern we've started to see is people provisioning accounts for their AI agents. And so, in particular, Google Workspace accounts. So, for example, Lindy can be used as a scheduling assistant. So you can just CC her to your emails when you're trying to find time with someone. And just like a human assistant, she's going to go back and forth and offer other abilities and so forth. Very often, people don't want the other party to know that it's an AI. So it's actually funny. They introduce delays. They ask the agent to wait before replying, so it's not too obvious that it's an AI. And they provision an account on Google Suite, which costs them like $10 a month or something like that. So we're seeing that pattern more and more. I think that does the job for now. I'm not optimistic on us actually patching OAuth. Because I agree with you, ultimately, we would want to patch OAuth because the new account thing is kind of a clutch. It's really a hack. You would want to patch OAuth to have more granular access control and really be able to put your sugar in the box. I'm not optimistic on us doing that before AGI, I think. That's a very close timeline.Swyx [00:08:36]: I'm mindful of talking about a thing without showing it. And we already have the setup to show it. Why don't we jump into a screen share? For listeners, you can jump on the YouTube and like and subscribe. But also, let's have a look at how you show off Lindy. Yeah, absolutely.Flo [00:08:51]: I'll give an example of a very simple Lindy and then I'll graduate to a much more complicated one. A super simple Lindy that I have is, I unfortunately bought some investment properties in the south of France. It was a really, really bad idea. And I put them on a Holydew, which is like the French Airbnb, if you will. And so I received these emails from time to time telling me like, oh, hey, you made 200 bucks. Someone booked your place. When I receive these emails, I want to log this reservation in a spreadsheet. Doing this without an AI agent or without AI in general is a pain in the butt because you must write an HTML parser for this email. And so it's just hard. You may not be able to do it and it's going to break the moment the email changes. By contrast, the way it works with Lindy, it's really simple. It's two steps. It's like, okay, I receive an email. If it is a reservation confirmation, I have this filter here. Then I append a row to this spreadsheet. And so this is where you can see the AI part where the way this action is configured here, you see these purple fields on the right. Each of these fields is a prompt. And so I can say, okay, you extract from the email the day the reservation begins on. You extract the amount of the reservation. You extract the number of travelers of the reservation. And now you can see when I look at the task history of this Lindy, it's really simple. It's like, okay, you do this and boom, appending this row to this spreadsheet. And this is the information extracted. So effectively, this node here, this append row node is a mini agent. It can see everything that just happened. It has context over the task and it's appending the row. And then it's going to send a reply to the thread. That's a very simple example of an agent.Swyx [00:10:34]: A quick follow-up question on this one while we're still on this page. Is that one call? Is that a structured output call? Yeah. Okay, nice. Yeah.Flo [00:10:41]: And you can see here for every node, you can configure which model you want to power the node. Here I use cloud. For this, I use GPT-4 Turbo. Much more complex example, my meeting recorder. It looks very complex because I've added to it over time, but at a high level, it's really simple. It's like when a meeting begins, you record the meeting. And after the meeting, you send me a summary and you send me coaching notes. So I receive, like my Lindy is constantly coaching me. And so you can see here in the prompt of the coaching notes, I've told it, hey, you know, was I unnecessarily confrontational at any point? I'm French, so I have to watch out for that. Or not confrontational enough. Should I have double-clicked on any issue, right? So I can really give it exactly the kind of coaching that I'm expecting. And then the interesting thing here is, like, you can see the agent here, after it sent me these coaching notes, moves on. And it does a bunch of other stuff. So it goes on Slack. It disseminates the notes on Slack. It does a bunch of other stuff. But it's actually able to backtrack and resume the automation at the coaching notes email if I responded to that email. So I'll give a super concrete example. This is an actual coaching feedback that I received from Lindy. She was like, hey, this was a sales call I had with a customer. And she was like, I found your explanation of Lindy too technical. And I was able to follow up and just ask a follow-up question in the thread here. And I was like, why did you find too technical about my explanation? And Lindy restored the context. And so she basically picked up the automation back up here in the tree. And she has all of the context of everything that happened, including the meeting in which I was. So she was like, oh, you used the words deterministic and context window and agent state. And that concept exists at every level for every channel and every action that Lindy takes. So another example here is, I mentioned she also disseminates the notes on Slack. So this was a meeting where I was not, right? So this was a teammate. He's an indie meeting recorder, posts the meeting notes in this customer discovery channel on Slack. So you can see, okay, this is the onboarding call we had. This was the use case. Look at the questions. How do I make Lindy slower? How do I add delays to make Lindy slower? And I was able, in the Slack thread, to ask follow-up questions like, oh, what did we answer to these questions? And it's really handy because I know I can have this sort of interactive Q&A with these meetings. It means that very often now, I don't go to meetings anymore. I just send my Lindy. And instead of going to like a 60-minute meeting, I have like a five-minute chat with my Lindy afterwards. And she just replied. She was like, well, this is what we replied to this customer. And I can just be like, okay, good job, Jack. Like, no notes about your answers. So that's the kind of use cases people have with Lindy. It's a lot of like, there's a lot of sales automations, customer support automations, and a lot of this, which is basically personal assistance automations, like meeting scheduling and so forth.Alessio [00:13:21]: Yeah, and I think the question that people might have is memory. So as you get coaching, how does it track whether or not you're improving? You know, if these are like mistakes you made in the past, like, how do you think about that?Flo [00:13:31]: Yeah, we have a memory module. So I'll show you my meeting scheduler, Lindy, which has a lot of memories because by now I've used her for so long. And so every time I talk to her, she saves a memory. If I tell her, you screwed up, please don't do this. So you can see here, oh, it's got a double memory here. This is the meeting link I have, or this is the address of the office. If I tell someone to meet me at home, this is the address of my place. This is the code. I guess we'll have to edit that out. This is not the code of my place. No dogs. Yeah, so Lindy can just manage her own memory and decide when she's remembering things between executions. Okay.Swyx [00:14:11]: I mean, I'm just going to take the opportunity to ask you, since you are the creator of this thing, how come there's so few memories, right? Like, if you've been using this for two years, there should be thousands of thousands of things. That is a good question.Flo [00:14:22]: Agents still get confused if they have too many memories, to my point earlier about that. So I just am out of a call with a member of the Lama team at Meta, and we were chatting about Lindy, and we were going into the system prompt that we sent to Lindy, and all of that stuff. And he was amazed, and he was like, it's a miracle that it's working, guys. He was like, this kind of system prompt, this does not exist, either pre-training or post-training. These models were never trained to do this kind of stuff. It's a miracle that they can be agents at all. And so what I do, I actually prune the memories. You know, it's actually something I've gotten into the habit of doing from back when we had GPT 3.5, being Lindy agents. I suspect it's probably not as necessary in the Cloud 3.5 Sunette days, but I prune the memories. Yeah, okay.Swyx [00:15:05]: The reason is because I have another assistant that also is recording and trying to come up with facts about me. It comes up with a lot of trivial, useless facts that I... So I spend most of my time pruning. Actually, it's not super useful. I'd much rather have high-quality facts that it accepts. Or maybe I was even thinking, were you ever tempted to add a wake word to only memorize this when I say memorize this? And otherwise, don't even bother.Flo [00:15:30]: I have a Lindy that does this. So this is my inbox processor, Lindy. It's kind of beefy because there's a lot of different emails. But somewhere in here,Swyx [00:15:38]: there is a rule where I'm like,Flo [00:15:39]: aha, I can email my inbox processor, Lindy. It's really handy. So she has her own email address. And so when I process my email inbox, I sometimes forward an email to her. And it's a newsletter, or it's like a cold outreach from a recruiter that I don't care about, or anything like that. And I can give her a rule. And I can be like, hey, this email I want you to archive, moving forward. Or I want you to alert me on Slack when I have this kind of email. It's really important. And so you can see here, the prompt is, if I give you a rule about a kind of email, like archive emails from X, save it as a new memory. And I give it to the memory saving skill. And yeah.Swyx [00:16:13]: One thing that just occurred to me, so I'm a big fan of virtual mailboxes. I recommend that everybody have a virtual mailbox. You could set up a physical mail receive thing for Lindy. And so then Lindy can process your physical mail.Flo [00:16:26]: That's actually a good idea. I actually already have something like that. I use like health class mail. Yeah. So yeah, most likely, I can process my physical mail. Yeah.Swyx [00:16:35]: And then the other product's idea I have, looking at this thing, is people want to brag about the complexity of their Lindys. So this would be like a 65 point Lindy, right?Flo [00:16:43]: What's a 65 point?Swyx [00:16:44]: Complexity counting. Like how many nodes, how many things, how many conditions, right? Yeah.Flo [00:16:49]: This is not the most complex one. I have another one. This designer recruiter here is kind of beefy as well. Right, right, right. So I'm just saying,Swyx [00:16:56]: let people brag. Let people be super users. Oh, right.Flo [00:16:59]: Give them a score. Give them a score.Swyx [00:17:01]: Then they'll just be like, okay, how high can you make this score?Flo [00:17:04]: Yeah, that's a good point. And I think that's, again, the beauty of this on-rails phenomenon. It's like, think of the equivalent, the prompt equivalent of this Lindy here, for example, that we're looking at. It'd be monstrous. And the odds that it gets it right are so low. But here, because we're really holding the agent's hand step by step by step, it's actually super reliable. Yeah.Swyx [00:17:22]: And is it all structured output-based? Yeah. As far as possible? Basically. Like, there's no non-structured output?Flo [00:17:27]: There is. So, for example, here, this AI agent step, right, or this send message step, sometimes it gets to... That's just plain text.Swyx [00:17:35]: That's right.Flo [00:17:36]: Yeah. So I'll give you an example. Maybe it's TMI. I'm having blood pressure issues these days. And so this Lindy here, I give it my blood pressure readings, and it updates a log that I have of my blood pressure that it sends to my doctor.Swyx [00:17:49]: Oh, so every Lindy comes with a to-do list?Flo [00:17:52]: Yeah. Every Lindy has its own task history. Huh. Yeah. And so you can see here, this is my main Lindy, my personal assistant, and I've told it, where is this? There is a point where I'm like, if I am giving you a health-related fact, right here, I'm giving you health information, so then you update this log that I have in this Google Doc, and then you send me a message. And you can see, I've actually not configured this send message node. I haven't told it what to send me a message for. Right? And you can see, it's actually lecturing me. It's like, I'm giving it my blood pressure ratings. It's like, hey, it's a bit high. Here are some lifestyle changes you may want to consider.Alessio [00:18:27]: I think maybe this is the most confusing or new thing for people. So even I use Lindy and I didn't even know you could have multiple workflows in one Lindy. I think the mental model is kind of like the Zapier workflows. It starts and it ends. It doesn't choose between. How do you think about what's a Lindy versus what's a sub-function of a Lindy? Like, what's the hierarchy?Flo [00:18:48]: Yeah. Frankly, I think the line is a little arbitrary. It's kind of like when you code, like when do you start to create a new class versus when do you overload your current class. I think of it in terms of like jobs to be done and I think of it in terms of who is the Lindy serving. This Lindy is serving me personally. It's really my day-to-day Lindy. I give it a bunch of stuff, like very easy tasks. And so this is just the Lindy I go to. Sometimes when a task is really more specialized, so for example, I have this like summarizer Lindy or this designer recruiter Lindy. These tasks are really beefy. I wouldn't want to add this to my main Lindy, so I just created a separate Lindy for it. Or when it's a Lindy that serves another constituency, like our customer support Lindy, I don't want to add that to my personal assistant Lindy. These are two very different Lindys.Alessio [00:19:31]: And you can call a Lindy from within another Lindy. That's right. You can kind of chain them together.Flo [00:19:36]: Lindys can work together, absolutely.Swyx [00:19:38]: A couple more things for the video portion. I noticed you have a podcast follower. We have to ask about that. What is that?Flo [00:19:46]: So this one wakes me up every... So wakes herself up every week. And she sends me... So she woke up yesterday, actually. And she searches for Lenny's podcast. And she looks for like the latest episode on YouTube. And once she finds it, she transcribes the video and then she sends me the summary by email. I don't listen to podcasts as much anymore. I just like read these summaries. Yeah.Alessio [00:20:09]: We should make a latent space Lindy. Marketplace.Swyx [00:20:12]: Yeah. And then you have a whole bunch of connectors. I saw the list briefly. Any interesting one? Complicated one that you're proud of? Anything that you want to just share? Connector stories.Flo [00:20:23]: So many of our workflows are about meeting scheduling. So we had to build some very open unity tools around meeting scheduling. So for example, one that is surprisingly hard is this find available times action. You would not believe... This is like a thousand lines of code or something. It's just a very beefy action. And you can pass it a bunch of parameters about how long is the meeting? When does it start? When does it end? What are the meetings? The weekdays in which I meet? How many time slots do you return? What's the buffer between my meetings? It's just a very, very, very complex action. I really like our GitHub action. So we have a Lindy PR reviewer. And it's really handy because anytime any bug happens... So the Lindy reads our guidelines on Google Docs. By now, the guidelines are like 40 pages long or something. And so every time any new kind of bug happens, we just go to the guideline and we add the lines. Like, hey, this has happened before. Please watch out for this category of bugs. And it's saving us so much time every day.Alessio [00:21:19]: There's companies doing PR reviews. Where does a Lindy start? When does a company start? Or maybe how do you think about the complexity of these tasks when it's going to be worth having kind of like a vertical standalone company versus just like, hey, a Lindy is going to do a good job 99% of the time?Flo [00:21:34]: That's a good question. We think about this one all the time. I can't say that we've really come up with a very crisp articulation of when do you want to use a vertical tool versus when do you want to use a horizontal tool. I think of it as very similar to the internet. I find it surprising the extent to which a horizontal search engine has won. But I think that Google, right? But I think the even more surprising fact is that the horizontal search engine has won in almost every vertical, right? You go through Google to search Reddit. You go through Google to search Wikipedia. I think maybe the biggest exception is e-commerce. Like you go to Amazon to search e-commerce, but otherwise you go through Google. And I think that the reason for that is because search in each vertical has more in common with search than it does with each vertical. And search is so expensive to get right. Like Google is a big company that it makes a lot of sense to aggregate all of these different use cases and to spread your R&D budget across all of these different use cases. I have a thesis, which is, it's a really cool thesis for Lindy, is that the same thing is true for agents. I think that by and large, in a lot of verticals, agents in each vertical have more in common with agents than they do with each vertical. I also think there are benefits in having a single agent platform because that way your agents can work together. They're all like under one roof. That way you only learn one platform and so you can create agents for everything that you want. And you don't have to like pay for like a bunch of different platforms and so forth. So I think ultimately, it is actually going to shake out in a way that is similar to search in that search is everywhere on the internet. Every website has a search box, right? So there's going to be a lot of vertical agents for everything. I think AI is going to completely penetrate every category of software. But then I also think there are going to be a few very, very, very big horizontal agents that serve a lot of functions for people.Swyx [00:23:14]: That is actually one of the questions that we had about the agent stuff. So I guess we can transition away from the screen and I'll just ask the follow-up, which is, that is a hot topic. You're basically saying that the current VC obsession of the day, which is vertical AI enabled SaaS, is mostly not going to work out. And then there are going to be some super giant horizontal SaaS.Flo [00:23:34]: Oh, no, I'm not saying it's either or. Like SaaS today, vertical SaaS is huge and there's also a lot of horizontal platforms. If you look at like Airtable or Notion, basically the entire no-code space is very horizontal. I mean, Loom and Zoom and Slack, there's a lot of very horizontal tools out there. Okay.Swyx [00:23:49]: I was just trying to get a reaction out of you for hot takes. Trying to get a hot take.Flo [00:23:54]: No, I also think it is natural for the vertical solutions to emerge first because it's just easier to build. It's just much, much, much harder to build something horizontal. Cool.Swyx [00:24:03]: Some more Lindy-specific questions. So we covered most of the top use cases and you have an academy. That was nice to see. I also see some other people doing it for you for free. So like Ben Spites is doing it and then there's some other guy who's also doing like lessons. Yeah. Which is kind of nice, right? Yeah, absolutely. You don't have to do any of that.Flo [00:24:20]: Oh, we've been seeing it more and more on like LinkedIn and Twitter, like people posting their Lindys and so forth.Swyx [00:24:24]: I think that's the flywheel that you built the platform where creators see value in allying themselves to you. And so then, you know, your incentive is to make them successful so that they can make other people successful and then it just drives more and more engagement. Like it's earned media. Like you don't have to do anything.Flo [00:24:39]: Yeah, yeah. I mean, community is everything.Swyx [00:24:41]: Are you doing anything special there? Any big wins?Flo [00:24:44]: We have a Slack community that's pretty active. I can't say we've invested much more than that so far.Swyx [00:24:49]: I would say from having, so I have some involvement in the no-code community. I would say that Webflow going very hard after no-code as a category got them a lot more allies than just the people using Webflow. So it helps you to grow the community beyond just Lindy. And I don't know what this is called. Maybe it's just no-code again. Maybe you want to call it something different. But there's definitely an appetite for this and you are one of a broad category, right? Like just before you, we had Dust and, you know, they're also kind of going after a similar market. Zapier obviously is not going to try to also compete with you. Yeah. There's no question there. It's just like a reaction about community. Like I think a lot about community. Lanespace is growing the community of AI engineers. And I think you have a slightly different audience of, I don't know what.Flo [00:25:33]: Yeah. I think the no-code tinkerers is the community. Yeah. It is going to be the same sort of community as what Webflow, Zapier, Airtable, Notion to some extent.Swyx [00:25:43]: Yeah. The framing can be different if you were, so I think tinkerers has this connotation of not serious or like small. And if you framed it to like no-code EA, we're exclusively only for CEOs with a certain budget, then you just have, you tap into a different budget.Flo [00:25:58]: That's true. The problem with EA is like, the CEO has no willingness to actually tinker and play with the platform.Swyx [00:26:05]: Maybe Andrew's doing that. Like a lot of your biggest advocates are CEOs, right?Flo [00:26:09]: A solopreneur, you know, small business owners, I think Andrew is an exception. Yeah. Yeah, yeah, he is.Swyx [00:26:14]: He's an exception in many ways. Yep.Alessio [00:26:16]: Just before we wrap on the use cases, is Rick rolling your customers? Like a officially supported use case or maybe tell that story?Flo [00:26:24]: It's one of the main jobs to be done, really. Yeah, we woke up recently, so we have a Lindy obviously doing our customer support and we do check after the Lindy. And so we caught this email exchange where someone was asking Lindy for video tutorials. And at the time, actually, we did not have video tutorials. We do now on the Lindy Academy. And Lindy responded to the email. It's like, oh, absolutely, here's a link. And we were like, what? Like, what kind of link did you send? And so we clicked on the link and it was a recall. We actually reacted fast enough that the customer had not yet opened the email. And so we reacted immediately. Like, oh, hey, actually, sorry, this is the right link. And so the customer never reacted to the first link. And so, yeah, I tweeted about that. It went surprisingly viral. And I checked afterwards in the logs. We did like a database query and we found, I think, like three or four other instances of it having happened before.Swyx [00:27:12]: That's surprisingly low.Flo [00:27:13]: It is low. And we fixed it across the board by just adding a line to the system prompt that's like, hey, don't recall people, please don't recall.Swyx [00:27:21]: Yeah, yeah, yeah. I mean, so, you know, you can explain it retroactively, right? Like, that YouTube slug has been pasted in so many different corpuses that obviously it learned to hallucinate that.Alessio [00:27:31]: And it pretended to be so many things. That's the thing.Swyx [00:27:34]: I wouldn't be surprised if that takes one token. Like, there's this one slug in the tokenizer and it's just one token.Flo [00:27:41]: That's the idea of a YouTube video.Swyx [00:27:43]: Because it's used so much, right? And you have to basically get it exactly correct. It's probably not. That's a long speech.Flo [00:27:52]: It would have been so good.Alessio [00:27:55]: So this is just a jump maybe into evals from here. How could you possibly come up for an eval that says, make sure my AI does not recall my customer? I feel like when people are writing evals, that's not something that they come up with. So how do you think about evals when it's such like an open-ended problem space?Flo [00:28:12]: Yeah, it is tough. We built quite a bit of infrastructure for us to create evals in one click from any conversation history. So we can point to a conversation and we can be like, in one click we can turn it into effectively a unit test. It's like, this is a good conversation. This is how you're supposed to handle things like this. Or if it's a negative example, then we modify a little bit the conversation after generating the eval. So it's very easy for us to spin up this kind of eval.Alessio [00:28:36]: Do you use an off-the-shelf tool which is like Brain Trust on the podcast? Or did you just build your own?Flo [00:28:41]: We unfortunately built our own. We're most likely going to switch to Brain Trust. Well, when we built it, there was nothing. Like there was no eval tool, frankly. I mean, we started this project at the end of 2022. It was like, it was very, very, very early. I wouldn't recommend it to build your own eval tool. There's better solutions out there and our eval tool breaks all the time and it's a nightmare to maintain. And that's not something we want to be spending our time on.Swyx [00:29:04]: I was going to ask that basically because I think my first conversations with you about Lindy was that you had a strong opinion that everyone should build their own tools. And you were very proud of your evals. You're kind of showing off to me like how many evals you were running, right?Flo [00:29:16]: Yeah, I think that was before all of these tools came around. I think the ecosystem has matured a fair bit.Swyx [00:29:21]: What is one thing that Brain Trust has nailed that you always struggled to do?Flo [00:29:25]: We're not using them yet, so I couldn't tell. But from what I've gathered from the conversations I've had, like they're doing what we do with our eval tool, but better.Swyx [00:29:33]: And like they do it, but also like 60 other companies do it, right? So I don't know how to shop apart from brand. Word of mouth.Flo [00:29:41]: Same here.Swyx [00:29:42]: Yeah, like evals or Lindys, there's two kinds of evals, right? Like in some way, you don't have to eval your system as much because you've constrained the language model so much. And you can rely on open AI to guarantee that the structured outputs are going to be good, right? We had Michelle sit where you sit and she explained exactly how they do constraint grammar sampling and all that good stuff. So actually, I think it's more important for your customers to eval their Lindys than you evaling your Lindy platform because you just built the platform. You don't actually need to eval that much.Flo [00:30:14]: Yeah. In an ideal world, our customers don't need to care about this. And I think the bar is not like, look, it needs to be at 100%. I think the bar is it needs to be better than a human. And for most use cases we serve today, it is better than a human, especially if you put it on Rails.Swyx [00:30:30]: Is there a limiting factor of Lindy at the business? Like, is it adding new connectors? Is it adding new node types? Like how do you prioritize what is the most impactful to your company?Flo [00:30:41]: Yeah. The raw capabilities for sure are a big limit. It is actually shocking the extent to which the model is no longer the limit. It was the limit a year ago. It was too expensive. The context window was too small. It's kind of insane that we started building this when the context windows were like 4,000 tokens. Like today, our system prompt is more than 4,000 tokens. So yeah, the model is actually very much not a limit anymore. It almost gives me pause because I'm like, I want the model to be a limit. And so no, the integrations are ones, the core capabilities are ones. So for example, we are investing in a system that's basically, I call it like the, it's a J hack. Give me these names, like the poor man's RLHF. So you can turn on a toggle on any step of your Lindy workflow to be like, ask me for confirmation before you actually execute this step. So it's like, hey, I receive an email, you send a reply, ask me for confirmation before actually sending it. And so today you see the email that's about to get sent and you can either approve, deny, or change it and then approve. And we are making it so that when you make a change, we are then saving this change that you're making or embedding it in the vector database. And then we are retrieving these examples for future tasks and injecting them into the context window. So that's the kind of capability that makes a huge difference for users. That's the bottleneck today. It's really like good old engineering and product work.Swyx [00:31:52]: I assume you're hiring. We'll do a call for hiring at the end.Alessio [00:31:54]: Any other comments on the model side? When did you start feeling like the model was not a bottleneck anymore? Was it 4.0? Was it 3.5? 3.5.Flo [00:32:04]: 3.5 Sonnet, definitely. I think 4.0 is overhyped, frankly. We don't use 4.0. I don't think it's good for agentic behavior. Yeah, 3.5 Sonnet is when I started feeling that. And then with prompt caching with 3.5 Sonnet, like that fills the cost, cut the cost again. Just cut it in half. Yeah.Swyx [00:32:21]: Your prompts are... Some of the problems with agentic uses is that your prompts are kind of dynamic, right? Like from caching to work, you need the front prefix portion to be stable.Flo [00:32:32]: Yes, but we have this append-only ledger paradigm. So every node keeps appending to that ledger and every filled node inherits all the context built up by all the previous nodes. And so we can just decide, like, hey, every X thousand nodes, we trigger prompt caching again.Swyx [00:32:47]: Oh, so you do it like programmatically, not all the time.Flo [00:32:50]: No, sorry. Anthropic manages that for us. But basically, it's like, because we keep appending to the prompt, the prompt caching works pretty well.Alessio [00:32:57]: We have this small podcaster tool that I built for the podcast and I rewrote all of our prompts because I noticed, you know, I was inputting stuff early on. I wonder how much more money OpenAN and Anthropic are making just because people don't rewrite their prompts to be like static at the top and like dynamic at the bottom.Flo [00:33:13]: I think that's the remarkable thing about what we're having right now. It's insane that these companies are routinely cutting their costs by two, four, five. Like, they basically just apply constraints. They want people to take advantage of these innovations. Very good.Swyx [00:33:25]: Do you have any other competitive commentary? Commentary? Dust, WordWare, Gumloop, Zapier? If not, we can move on.Flo [00:33:31]: No comment.Alessio [00:33:32]: I think the market is,Flo [00:33:33]: look, I mean, AGI is coming. All right, that's what I'm talking about.Swyx [00:33:38]: I think you're helping. Like, you're paving the road to AGI.Flo [00:33:41]: I'm playing my small role. I'm adding my small brick to this giant, giant, giant castle. Yeah, look, when it's here, we are going to, this entire category of software is going to create, it's going to sound like an exaggeration, but it is a fact it is going to create trillions of dollars of value in a few years, right? It's going to, for the first time, we're actually having software directly replace human labor. I see it every day in sales calls. It's like, Lindy is today replacing, like, we talk to even small teams. It's like, oh, like, stop, this is a 12-people team here. I guess we'll set up this Lindy for one or two days, and then we'll have to decide what to do with this 12-people team. And so, yeah. To me, there's this immense uncapped market opportunity. It's just such a huge ocean, and there's like three sharks in the ocean. I'm focused on the ocean more than on the sharks.Swyx [00:34:25]: So we're moving on to hot topics, like, kind of broadening out from Lindy, but obviously informed by Lindy. What are the high-order bits of good agent design?Flo [00:34:31]: The model, the model, the model, the model. I think people fail to truly, and me included, they fail to truly internalize the bitter lesson. So for the listeners out there who don't know about it, it's basically like, you just scale the model. Like, GPUs go brr, it's all that matters. I think it also holds for the cognitive architecture. I used to be very cognitive architecture-filled, and I was like, ah, and I was like a critic, and I was like a generator, and all this, and then it's just like, GPUs go brr, like, just like let the model do its job. I think we're seeing it a little bit right now with O1. I'm seeing some tweets that say that the new 3.5 SONNET is as good as O1, but with none of all the crazy...Swyx [00:35:09]: It beats O1 on some measures. On some reasoning tasks. On AIME, it's still a lot lower. Like, it's like 14 on AIME versus O1, it's like 83.Flo [00:35:17]: Got it. Right. But even O1 is still the model. Yeah.Swyx [00:35:22]: Like, there's no cognitive architecture on top of it.Flo [00:35:23]: You can just wait for O1 to get better.Alessio [00:35:25]: And so, as a founder, how do you think about that, right? Because now, knowing this, wouldn't you just wait to start Lindy? You know, you start Lindy, it's like 4K context, the models are not that good. It's like, but you're still kind of like going along and building and just like waiting for the models to get better. How do you today decide, again, what to build next, knowing that, hey, the models are going to get better, so maybe we just shouldn't focus on improving our prompt design and all that stuff and just build the connectors instead or whatever? Yeah.Flo [00:35:51]: I mean, that's exactly what we do. Like, all day, we always ask ourselves, oh, when we have a feature idea or a feature request, we ask ourselves, like, is this the kind of thing that just gets better while we sleep because models get better? I'm reminded, again, when we started this in 2022, we spent a lot of time because we had to around context pruning because 4,000 tokens is really nothing. You really can't do anything with 4,000 tokens. All that work was throwaway work. Like, now it's like it was for nothing, right? Now we just assume that infinite context windows are going to be here in a year or something, a year and a half, and infinitely cheap as well, and dynamic compute is going to be here. Like, we just assume all of these things are going to happen, and so we really focus, our job to be done in the industry is to provide the input and output to the model. I really compare it all the time to the PC and the CPU, right? Apple is busy all day. They're not like a CPU wrapper. They have a lot to build, but they don't, well, now actually they do build the CPU as well, but leaving that aside, they're busy building a laptop. It's just a lot of work to build these things. It's interesting because, like,Swyx [00:36:45]: for example, another person that we're close to, Mihaly from Repl.it, he often says that the biggest jump for him was having a multi-agent approach, like the critique thing that you just said that you don't need, and I wonder when, in what situations you do need that and what situations you don't. Obviously, the simple answer is for coding, it helps, and you're not coding, except for, are you still generating code? In Indy? Yeah.Flo [00:37:09]: No, we do. Oh, right. No, no, no, the cognitive architecture changed. We don't, yeah.Swyx [00:37:13]: Yeah, okay. For you, you're one shot, and you chain tools together, and that's it. And if the user really wantsFlo [00:37:18]: to have this kind of critique thing, you can also edit the prompt, you're welcome to. I have some of my Lindys, I've told them, like, hey, be careful, think step by step about what you're about to do, but that gives you a little bump for some use cases, but, yeah.Alessio [00:37:30]: What about unexpected model releases? So, Anthropic released computer use today. Yeah. I don't know if many people were expecting computer use to come out today. Do these things make you rethink how to design, like, your roadmap and things like that, or are you just like, hey, look, whatever, that's just, like, a small thing in their, like, AGI pursuit, that, like, maybe they're not even going to support, and, like, it's still better for us to build our own integrations into systems and things like that. Because maybe people will say, hey, look, why am I building all these API integrationsFlo [00:38:02]: when I can just do computer use and never go to the product? Yeah. No, I mean, we did take into account computer use. We were talking about this a year ago or something, like, we've been talking about it as part of our roadmap. It's been clear to us that it was coming, My philosophy about it is anything that can be done with an API must be done by an API or should be done by an API for a very long time. I think it is dangerous to be overly cavalier about improvements of model capabilities. I'm reminded of iOS versus Android. Android was built on the JVM. There was a garbage collector, and I can only assume that the conversation that went down in the engineering meeting room was, oh, who cares about the garbage collector? Anyway, Moore's law is here, and so that's all going to go to zero eventually. Sure, but in the meantime, you are operating on a 400 MHz CPU. It was like the first CPU on the iPhone 1, and it's really slow, and the garbage collector is introducing a tremendous overhead on top of that, especially a memory overhead. For the longest time, and it's really only been recently that Android caught up to iOS in terms of how smooth the interactions were, but for the longest time, Android phones were significantly slowerSwyx [00:39:07]: and laggierFlo [00:39:08]: and just not feeling as good as iOS devices. Look, when you're talking about modules and magnitude of differences in terms of performance and reliability, which is what we are talking about when we're talking about API use versus computer use, then you can't ignore that, right? And so I think we're going to be in an API use world for a while.Swyx [00:39:27]: O1 doesn't have API use today. It will have it at some point, and it's on the roadmap. There is a future in which OpenAI goes much harder after your business, your market, than it is today. Like, ChatGPT, it's its own business. All they need to do is add tools to the ChatGPT, and now they're suddenly competing with you. And by the way, they have a GPT store where a bunch of people have already configured their tools to fit with them. Is that a concern?Flo [00:39:56]: I think even the GPT store, in a way, like the way they architect it, for example, their plug-in systems are actually grateful because we can also use the plug-ins. It's very open. Now, again, I think it's going to be such a huge market. I think there's going to be a lot of different jobs to be done. I know they have a huge enterprise offering and stuff, but today, ChatGPT is a consumer app. And so, the sort of flow detail I showed you, this sort of workflow, this sort of use cases that we're going after, which is like, we're doing a lot of lead generation and lead outreach and all of that stuff. That's not something like meeting recording, like Lindy Today right now joins your Zoom meetings and takes notes, all of that stuff.Swyx [00:40:34]: I don't see that so farFlo [00:40:35]: on the OpenAI roadmap.Swyx [00:40:36]: Yeah, but they do have an enterprise team that we talk to You're hiring GMs?Flo [00:40:42]: We did.Swyx [00:40:43]: It's a fascinating way to build a business, right? Like, what should you, as CEO, be in charge of? And what should you basically hireFlo [00:40:52]: a mini CEO to do? Yeah, that's a good question. I think that's also something we're figuring out. The GM thing was inspired from my days at Uber, where we hired one GM per city or per major geo area. We had like all GMs, regional GMs and so forth. And yeah, Lindy is so horizontal that we thought it made sense to hire GMs to own each vertical and the go-to market of the vertical and the customization of the Lindy templates for these verticals and so forth. What should I own as a CEO? I mean, the canonical reply here is always going to be, you know, you own the fundraising, you own the culture, you own the... What's the rest of the canonical reply? The culture, the fundraising.Swyx [00:41:29]: I don't know,Flo [00:41:30]: products. Even that, eventually, you do have to hand out. Yes, the vision, the culture, and the foundation. Well, you've done your job as a CEO. In practice, obviously, yeah, I mean, all day, I do a lot of product work still and I want to keep doing product work for as long as possible.Swyx [00:41:48]: Obviously, like you're recording and managing the team. Yeah.Flo [00:41:52]: That one feels like the most automatable part of the job, the recruiting stuff.Swyx [00:41:56]: Well, yeah. You saw myFlo [00:41:59]: design your recruiter here. Relationship between Factorio and building Lindy. We actually very often talk about how the business of the future is like a game of Factorio. Yeah. So, in the instance, it's like Slack and you've got like 5,000 Lindys in the sidebar and your job is to somehow manage your 5,000 Lindys. And it's going to be very similar to company building because you're going to look for like the highest leverage way to understand what's going on in your AI company and understand what levels do you have to make impact in that company. So, I think it's going to be very similar to like a human company except it's going to go infinitely faster. Today, in a human company, you could have a meeting with your team and you're like, oh, I'm going to build a facility and, you know, now it's like, okay,Swyx [00:42:40]: boom, I'm going to spin up 50 designers. Yeah. Like, actually, it's more important that you can clone an existing designer that you know works because the hiring process, you cannot clone someone because every new person you bring in is going to have their own tweaksFlo [00:42:54]: and you don't want that. Yeah.Swyx [00:42:56]: That's true. You want an army of mindless dronesFlo [00:42:59]: that all work the same way.Swyx [00:43:00]: The reason I bring this, bring Factorio up as well is one, Factorio Space just came out. Apparently, a whole bunch of people stopped working. I tried out Factorio. I never really got that much into it. But the other thing was, you had a tweet recently about how the sort of intentional top-down design was not as effective as just build. Yeah. Just ship.Flo [00:43:21]: I think people read a little bit too much into that tweet. It went weirdly viral. I was like, I did not intend it as a giant statement online.Swyx [00:43:28]: I mean, you notice you have a pattern with this, right? Like, you've done this for eight years now.Flo [00:43:33]: You should know. I legit was just hearing an interesting story about the Factorio game I had. And everybody was like, oh my God, so deep. I guess this explains everything about life and companies. There is something to be said, certainly, about focusing on the constraint. And I think it is Patrick Collison who said, people underestimate the extent to which moonshots are just one pragmatic step taken after the other. And I think as long as you have some inductive bias about, like, some loose idea about where you want to go, I think it makes sense to follow a sort of greedy search along that path. I think planning and organizing is important. And having older is important.Swyx [00:44:05]: I'm wrestling with that. There's two ways I encountered it recently. One with Lindy. When I tried out one of your automation templates and one of them was quite big and I just didn't understand it, right? So, like, it was not as useful to me as a small one that I can just plug in and see all of. And then the other one was me using Cursor. I was very excited about O1 and I just up frontFlo [00:44:27]: stuffed everythingSwyx [00:44:28]: I wanted to do into my prompt and expected O1 to do everything. And it got itself into a huge jumbled mess and it was stuck. It was really... There was no amount... I wasted, like, two hours on just, like, trying to get out of that hole. So I threw away the code base, started small, switched to Clouds on it and build up something working and just add it over time and it just worked. And to me, that was the factorial sentiment, right? Maybe I'm one of those fanboys that's just, like, obsessing over the depth of something that you just randomly tweeted out. But I think it's true for company building, for Lindy building, for coding.Flo [00:45:02]: I don't know. I think it's fair and I think, like, you and I talked about there's the Tuft & Metal principle and there's this other... Yes, I love that. There's the... I forgot the name of this other blog post but it's basically about this book Seeing Like a State that talks about the need for legibility and people who optimize the system for its legibility and anytime you make a system... So legible is basically more understandable. Anytime you make a system more understandable from the top down, it performs less well from the bottom up. And it's fine but you should at least make this trade-off with your eyes wide open. You should know, I am sacrificing performance for understandability, for legibility. And in this case, for you, it makes sense. It's like you are actually optimizing for legibility. You do want to understand your code base but in some other cases it may not make sense. Sometimes it's better to leave the system alone and let it be its glorious, chaotic, organic self and just trust that it's going to perform well even though you don't understand it completely.Swyx [00:45:55]: It does remind me of a common managerial issue or dilemma which you experienced in the small scale of Lindy where, you know, do you want to organize your company by functional sections or by products or, you know, whatever the opposite of functional is. And you tried it one way and it was more legible to you as CEO but actually it stopped working at the small level. Yeah.Flo [00:46:17]: I mean, one very small example, again, at a small scale is we used to have everything on Notion. And for me, as founder, it was awesome because everything was there. The roadmap was there. The tasks were there. The postmortems were there. And so, the postmortem was linkedSwyx [00:46:31]: to its task.Flo [00:46:32]: It was optimized for you. Exactly. And so, I had this, like, one pane of glass and everything was on Notion. And then the team, one day,Swyx [00:46:39]: came to me with pitchforksFlo [00:46:40]: and they really wanted to implement Linear. And I had to bite my fist so hard. I was like, fine, do it. Implement Linear. Because I was like, at the end of the day, the team needs to be able to self-organize and pick their own tools.Alessio [00:46:51]: Yeah. But it did make the company slightly less legible for me. Another big change you had was going away from remote work, every other month. The discussion comes up again. What was that discussion like? How did your feelings change? Was there kind of like a threshold of employees and team size where you felt like, okay, maybe that worked. Now it doesn't work anymore. And how are you thinking about the futureFlo [00:47:12]: as you scale the team? Yeah. So, for context, I used to have a business called TeamFlow. The business was about building a virtual office for remote teams. And so, being remote was not merely something we did. It was, I was banging the remote drum super hard and helping companies to go remote. And so, frankly, in a way, it's a bit embarrassing for me to do a 180 like that. But I guess, when the facts changed, I changed my mind. What happened? Well, I think at first, like everyone else, we went remote by necessity. It was like COVID and you've got to go remote. And on paper, the gains of remote are enormous. In particular, from a founder's standpoint, being able to hire from anywhere is huge. Saving on rent is huge. Saving on commute is huge for everyone and so forth. But then, look, we're all here. It's like, it is really making it much harder to work together. And I spent three years of my youth trying to build a solution for this. And my conclusion is, at least we couldn't figure it out and no one else could. Zoom didn't figure it out. We had like a bunch of competitors. Like, Gathertown was one of the bigger ones. We had dozens and dozens of competitors. No one figured it out. I don't know that software can actually solve this problem. The reality of it is, everyone just wants to get off the darn Zoom call. And it's not a good feeling to be in your home office if you're even going to have a home office all day. It's harder to build culture. It's harder to get in sync. I think software is peculiar because it's like an iceberg. It's like the vast majority of it is submerged underwater. And so, the quality of the software that you ship is a function of the alignment of your mental models about what is below that waterline. Can you actually get in sync about what it is exactly fundamentally that we're building? What is the soul of our product? And it is so much harder to get in sync about that when you're remote. And then you waste time in a thousand ways because people are offline and you can't get a hold of them or you can't share your screen. It's just like you feel like you're walking in molasses all day. And eventually, I was like, okay, this is it. We're not going to do this anymore.Swyx [00:49:03]: Yeah. I think that is the current builder San Francisco consensus here. Yeah. But I still have a big... One of my big heroes as a CEO is Sid Subban from GitLab.Flo [00:49:14]: Mm-hmm.Swyx [00:49:15]: Matt MullenwegFlo [00:49:16]: used to be a hero.Swyx [00:49:17]: But these people run thousand-person remote businesses. The main idea is that at some company

The Lunar Society
Gwern Branwen - How an Anonymous Researcher Predicted AI's Trajectory

The Lunar Society

Play Episode Listen Later Nov 13, 2024 96:43


Gwern is a pseudonymous researcher and writer. He was one of the first people to see LLM scaling coming. If you've read his blog, you know he's one of the most interesting polymathic thinkers alive.In order to protect Gwern's anonymity, I proposed interviewing him in person, and having my friend Chris Painter voice over his words after. This amused him enough that he agreed.After the episode, I convinced Gwern to create a donation page where people can help sustain what he's up to. Please go here to contribute.Read the full transcript here.Sponsors:* Jane Street is looking to hire their next generation of leaders. Their deep learning team is looking for ML researchers, FPGA programmers, and CUDA programmers. Summer internships are open - if you want to stand out, take a crack at their new Kaggle competition. To learn more, go here: https://jane-st.co/dwarkesh* Turing provides complete post-training services for leading AI labs like OpenAI, Anthropic, Meta, and Gemini. They specialize in model evaluation, SFT, RLHF, and DPO to enhance models' reasoning, coding, and multimodal capabilities. Learn more at turing.com/dwarkesh.* This episode is brought to you by Stripe, financial infrastructure for the internet. Millions of companies from Anthropic to Amazon use Stripe to accept payments, automate financial processes and grow their revenue.If you're interested in advertising on the podcast, check out this page.Timestamps00:00:00 - Anonymity00:01:09 - Automating Steve Jobs00:04:38 - Isaac Newton's theory of progress00:06:36 - Grand theory of intelligence00:10:39 - Seeing scaling early00:21:04 - AGI Timelines00:22:54 - What to do in remaining 3 years until AGI00:26:29 - Influencing the shoggoth with writing00:30:50 - Human vs artificial intelligence00:33:52 - Rabbit holes00:38:48 - Hearing impairment00:43:00 - Wikipedia editing00:47:43 - Gwern.net00:50:20 - Counterfactual careers00:54:30 - Borges & literature01:01:32 - Gwern's intelligence and process01:11:03 - A day in the life of Gwern01:19:16 - Gwern's finances01:25:05 - The diversity of AI minds01:27:24 - GLP drugs and obesity01:31:08 - Drug experimentation01:33:40 - Parasocial relationships01:35:23 - Open rabbit holes Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

Lex Fridman Podcast

Play Episode Listen Later Oct 6, 2024 157:38


Aman Sanger, Arvid Lunnemark, Michael Truell, and Sualeh Asif are creators of Cursor, a popular code editor that specializes in AI-assisted programming. Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep447-sc See below for timestamps, transcript, and to give feedback, submit questions, contact Lex, etc. Transcript: https://lexfridman.com/cursor-team-transcript CONTACT LEX: Feedback - give feedback to Lex: https://lexfridman.com/survey AMA - submit questions, videos or call-in: https://lexfridman.com/ama Hiring - join our team: https://lexfridman.com/hiring Other - other ways to get in touch: https://lexfridman.com/contact EPISODE LINKS: Cursor Website: https://cursor.com Cursor on X: https://x.com/cursor_ai Anysphere Website: https://anysphere.inc/ Aman's X: https://x.com/amanrsanger Aman's Website: https://amansanger.com/ Arvid's X: https://x.com/ArVID220u Arvid's Website: https://arvid.xyz/ Michael's Website: https://mntruell.com/ Michael's LinkedIn: https://bit.ly/3zIDkPN Sualeh's X: https://x.com/sualehasif996 Sualeh's Website: https://sualehasif.me/ SPONSORS: To support this podcast, check out our sponsors & get discounts: Encord: AI tooling for annotation & data management. Go to https://encord.com/lex MasterClass: Online classes from world-class experts. Go to https://masterclass.com/lexpod Shopify: Sell stuff online. Go to https://shopify.com/lex NetSuite: Business management software. Go to http://netsuite.com/lex AG1: All-in-one daily nutrition drinks. Go to https://drinkag1.com/lex OUTLINE: (00:00) - Introduction (09:25) - Code editor basics (11:35) - GitHub Copilot (18:53) - Cursor (25:20) - Cursor Tab (31:35) - Code diff (39:46) - ML details (45:20) - GPT vs Claude (51:54) - Prompt engineering (59:20) - AI agents (1:13:18) - Running code in background (1:17:57) - Debugging (1:23:25) - Dangerous code (1:34:35) - Branching file systems (1:37:47) - Scaling challenges (1:51:58) - Context (1:57:05) - OpenAI o1 (2:08:27) - Synthetic data (2:12:14) - RLHF vs RLAIF (2:14:01) - Fields Medal for AI (2:16:43) - Scaling laws (2:25:32) - The future of programming PODCAST LINKS: - Podcast Website: https://lexfridman.com/podcast - Apple Podcasts: https://apple.co/2lwqZIr - Spotify: https://spoti.fi/2nEwCF8 - RSS: https://lexfridman.com/feed/podcast/ - Podcast Playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4 - Clips Channel: https://www.youtube.com/lexclips

SuperDataScience
791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

SuperDataScience

Play Episode Listen Later Jun 11, 2024 57:10


Reinforcement learning through human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique's origins of the technique. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0), and Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why it is important that AI is open [03:13] • The efficacy and scalability of direct preference optimization [07:32] • Robotics and LLMs [14:32] • The challenges to aligning reward models with human preferences [23:00] • How to make sure AI's decision making on preferences reflect desirable behavior [28:52] • Why Nathan believes AI is closer to alchemy than science [37:38] Additional materials: www.superdatascience.com/791