From Palantir and Two Sigma to building Goodfire into the poster child for actionable mechanistic interpretability, Mark Bissell (Member of Technical Staff) and Myra Deng (Head of Product) are trying to turn “peeking inside the model” into a repeatable production workflow by shipping APIs, landing real enterprise deployments, and now scaling the bet with a recent $150M Series B funding round at a $1.25B valuation.

In this episode, we go far beyond the usual “SAEs are cool” take. We talk about Goodfire's core bet: that the AI lifecycle is still fundamentally broken because the only reliable control we have is data, and we post-train, RLHF, and fine-tune by “slurping supervision through a straw,” hoping the model picks up the right behaviors while quietly absorbing the wrong ones. Goodfire's answer is to build a bi-directional interface between humans and models: read what's happening inside, edit it surgically, and eventually use interpretability during training so customization isn't just brute-force guesswork.

Mark and Myra walk through what that looks like when you stop treating interpretability like a lab demo and start treating it like infrastructure: lightweight probes that add near-zero latency, token-level safety filters that can run at inference time, and interpretability workflows that survive messy constraints (multilingual inputs, synthetic→real transfer, regulated domains, no access to sensitive data). We also get a live window into what “frontier-scale interp” means operationally (i.e. 
steering a trillion-parameter model in real time by targeting internal features) plus why the same tooling generalizes cleanly from language models to genomics, medical imaging, and “pixel-space” world models.

We discuss:

* Mark + Myra's path: Palantir (health systems, forward-deployed engineering) → Goodfire early team; Two Sigma → Head of Product, translating frontier interpretability research into a platform and real-world deployments
* What “interpretability” actually means in practice: not just post-hoc poking, but a broader “science of deep learning” approach across the full AI lifecycle (data curation → post-training → internal representations → model design)
* Why post-training is the first big wedge: “surgical edits” for unintended behaviors like reward hacking, sycophancy, and noise learned during customization, plus the dream of targeted unlearning and bias removal without wrecking capabilities
* SAEs vs probes in the real world: why SAE feature spaces sometimes underperform classifiers trained on raw activations for downstream detection tasks (hallucination, harmful intent, PII), and what that implies about “clean concept spaces”
* Rakuten in production: deploying interpretability-based token-level PII detection at inference time to prevent routing private data to downstream providers, plus the gnarly constraints: no training on real customer PII, synthetic→real transfer, English + Japanese, and tokenization quirks
* Why interp can be operationally cheaper than LLM-judge guardrails: probes are lightweight, low-latency, and don't require hosting a second large model in the loop
* Real-time steering at frontier scale: a demo of steering Kimi K2 (~1T params) live and finding features via SAE pipelines, auto-labeling via LLMs, and toggling a “Gen-Z slang” feature across multiple layers without breaking tool use
* Hallucinations as an internal signal: the case that models have latent uncertainty / “user-pleasing” circuitry you can detect and potentially mitigate more 
directly than black-box methods
* Steering vs prompting: the emerging view that activation steering and in-context learning are more closely connected than people think, including work mapping between the two (even for jailbreak-style behaviors)
* Interpretability for science: using the same tooling across domains (genomics, medical imaging, materials) to debug spurious correlations and extract new knowledge up to and including early biomarker discovery work with major partners
* World models + “pixel-space” interpretability: why vision/video models make concepts easier to see, how that accelerates the feedback loop, and why robotics/world-model partners are especially interesting design partners
* The north star: moving from “data in, weights out” to intentional model design where experts can impart goals and constraints directly, not just via reward signals and brute-force post-training

Goodfire AI

* Website: https://goodfire.ai
* LinkedIn: https://www.linkedin.com/company/goodfire-ai/
* X: https://x.com/GoodfireAI

Myra Deng

* Website: https://myradeng.com/
* LinkedIn: https://www.linkedin.com/in/myra-deng/
* X: https://x.com/myra_deng

Mark Bissell

* LinkedIn: https://www.linkedin.com/in/mark-bissell/
* X: https://x.com/MarkMBissell

Timestamps

00:00:00 Introduction
00:00:05 Introduction to the Latent Space Podcast and Guests from Goodfire
00:00:29 What is Goodfire? Mission and Focus on Interpretability
00:01:01 Goodfire's Practical Approach to Interpretability
00:01:37 Goodfire's Series B Fundraise Announcement
00:02:04 Backgrounds of Mark and Myra from Goodfire
00:02:51 Team Structure and Roles at Goodfire
00:05:13 What is Interpretability? Definitions and Techniques
00:07:29 Post-training vs. 
Pre-training Interpretability Applications
00:08:51 Using Interpretability to Remove Unwanted Behaviors
00:10:09 Grokking, Double Descent, and Generalization in Models
00:12:06 Subliminal Learning and Hidden Biases in Models
00:14:07 How Goodfire Chooses Research Directions and Projects
00:16:04 Limitations of SAEs and Probes in Interpretability
00:18:14 Rakuten Case Study: Production Deployment of Interpretability
00:21:12 Efficiency Benefits of Interpretability Techniques
00:21:26 Live Demo: Real-Time Steering in a Trillion Parameter Model
00:25:15 How Steering Features are Identified and Labeled
00:26:51 Detecting and Mitigating Hallucinations Using Interpretability
00:31:20 Equivalence of Activation Steering and Prompting
00:34:06 Comparing Steering with Fine-Tuning and LoRA Techniques
00:36:04 Model Design and the Future of Intentional AI Development
00:38:09 Getting Started in Mechinterp: Resources, Programs, and Open Problems
00:40:51 Industry Applications and the Rise of Mechinterp in Practice
00:41:39 Interpretability for Code Models and Real-World Usage
00:43:07 Making Steering Useful for More Than Stylistic Edits
00:46:17 Applying Interpretability to Healthcare and Scientific Discovery
00:49:15 Why Interpretability is Crucial in High-Stakes Domains like Healthcare
00:52:03 Call for Design Partners Across Domains
00:54:18 Interest in World Models and Visual Interpretability
00:57:22 Sci-Fi Inspiration: Ted Chiang and Interpretability
01:00:14 Interpretability, Safety, and Alignment Perspectives
01:04:27 Weak-to-Strong Generalization and Future Alignment Challenges
01:05:38 Final Thoughts and Hiring/Collaboration Opportunities at Goodfire

Transcript

Shawn Wang [00:00:05]: So welcome to the Latent Space pod. We're back in the studio with our special MechInterp co-host, Vibhu. Welcome. Mochi, Mochi's special co-host. And Mochi, the mechanistic interpretability doggo. 
We have with us Mark and Myra from Goodfire. Welcome. Thanks for having us on. Maybe we can sort of introduce Goodfire and then introduce you guys. How do you introduce Goodfire today?

Myra Deng [00:00:29]: Yeah, it's a great question. So Goodfire, we like to say, is an AI research lab that focuses on using interpretability to understand, learn from, and design AI models. And we really believe that interpretability will unlock the new generation, next frontier of safe and powerful AI models. That's our description right now, and I'm excited to dive more into the work we're doing to make that happen.

Shawn Wang [00:00:55]: Yeah. And there's always like the official description. Is there an understatement? Is there an unofficial one that sort of resonates more with a different audience?

Mark Bissell [00:01:01]: Well, being an AI research lab that's focused on interpretability, obviously a lot of people have a lot that they think about when they think of interpretability. And I think we have a pretty broad definition of what that means and the types of places that can be applied. And in particular, applying it in production scenarios, in high stakes industries, and really taking it sort of from the research world into the real world. Which, you know, it's a new field, so that hasn't been done all that much. And we're excited about actually seeing that sort of put into practice.

Shawn Wang [00:01:37]: Yeah, I would say it wasn't too long ago that Anthropic was like still putting out like Toy Models of Superposition and that kind of stuff. And I wouldn't have pegged it to be this far along. When you and I talked at NeurIPS, you were talking a little bit about your production use cases and your customers. And then not to bury the lede, today we're also announcing the fundraise, your Series B. $150 million. $150 million at a $1.25B valuation. Congrats, Unicorn.

Mark Bissell [00:02:02]: Thank you. 
Yeah, no, things move fast.

Shawn Wang [00:02:04]: We were talking to you in December and already some big updates since then. Let's dive, I guess, into a bit of your backgrounds as well. Mark, you were at Palantir working on health stuff, which is really interesting because Goodfire has some interesting like health use cases. I don't know how related they are in practice.

Mark Bissell [00:02:22]: Yeah, not super related, but I don't know. It was helpful context to know what it's like just to work with health systems and generally in that domain. Yeah.

Shawn Wang [00:02:32]: And Myra, you were at Two Sigma, which actually I was also at Two Sigma back in the day. Wow, nice.

Myra Deng [00:02:37]: Did we overlap at all?

Shawn Wang [00:02:38]: No, this is when I was briefly a software engineer before I became a sort of developer relations person. And now you're head of product. What are your sort of respective roles, just to introduce people to like what all gets done in Goodfire?

Mark Bissell [00:02:51]: Yeah, prior to Goodfire, I was at Palantir for about three years as a forward deployed engineer, now a hot term. Wasn't always that way. And as a technical lead on the health care team. And at Goodfire, I'm a member of the technical staff. And honestly, that I think is about as specific as I could describe myself because I've worked on a range of things. And, you know, it's a fun time to be at a team that's still reasonably small. I think when I joined I was one of the first like ten employees, now we're above 40, but still, it looks like there's always a mix of research and engineering and product and all of the above that needs to get done. And I think everyone across the team is, you know, pretty much a switch hitter in the roles they do. So I think you've seen some of the stuff that I worked on related to image models, which was sort of like a research demo. 
More recently, I've been working on our scientific discovery team with some of our life sciences partners, but then also building out our core platform for more of like flexing some of the kind of MLE and developer skills as well.

Shawn Wang [00:03:53]: Very generalist. And you also had like a very like a founding engineer type role.

Myra Deng [00:03:58]: Yeah, yeah.

Shawn Wang [00:03:59]: So I also started as, I still am, a member of technical staff, did a wide range of things from the very beginning, including like finding our office space and all of this, which is, we both visited when you had that open house thing. It was really nice.

Myra Deng [00:04:13]: Thank you. Thank you. Yeah. Plug to come visit our office.

Shawn Wang [00:04:15]: It looked like it was like 200 people. It has room for 200 people. But you guys are like 10.

Myra Deng [00:04:22]: For a while, it was very empty. But yeah, like Mark, I spend a lot of my time as head of product. I think product is a bit of a weird role these days, but a lot of it is thinking about how do we take our frontier research and really apply it to the most important real world problems and how does that then translate into a platform that's repeatable or a product and working across, you know, the engineering and research teams to make that happen and also communicating to the world? Like, what is interpretability? What is it used for? What is it good for? Why is it so important? All of these things are part of my day-to-day as well.

Shawn Wang [00:05:01]: I love like "what is" things because that's a very crisp like starting point for people like coming to a field. They all do a fun thing. Vibhu, why don't you try tackling what is interpretability and then they can correct us.

Vibhu Sapra [00:05:13]: Okay, great. So I think like one, just to kick off, it's a very interesting role to be head of product, right? Because you guys, at least as a lab, you're more of an applied interp lab, right? 
Which is pretty different than just normal interp, like a lot of background research. But yeah. You guys actually ship an API to try these things. You have Ember, you have products around it, which not many do. Okay. What is interp? So basically you're trying to have an understanding of what's going on in the model, in the internals. So different approaches to do that. You can do probing, SAEs, transcoders, all this stuff. But basically you have a hypothesis. You have something that you want to learn about what's happening in a model's internals. And then you're trying to solve that from there. You can do stuff like, you know, you can do activation mapping. You can try to do steering. There's a lot of stuff that you can do, but the key question is, you know, from input to output, we want to have a better understanding of what's happening and, you know, how can we adjust what's happening on the model internals? How'd I do?

Mark Bissell [00:06:12]: That was really good. I think that was great. I think it's also kind of a minefield: if you ask 50 people who quote unquote work in interp, like what is interpretability, you'll probably get 50 different answers. And, yeah, to some extent also like where Goodfire sits in the space. I think that we're an AI research company above all else. And interpretability is a set of methods that we think are really useful and worth kind of specializing in, in order to accomplish the goals we want to accomplish. But I think we also sort of see some of the goals as even broader, as almost like the science of deep learning and just taking a not black box approach to kind of any part of the like AI development life cycle, whether that. 
That means using interp for like data curation while you're training your model or for understanding what happened during post-training or for the, you know, understanding activations and sort of internal representations, what is in there semantically. And then a lot of sort of exciting updates that, you know, are sort of also part of the fundraise around bringing interpretability to training, which I don't think has been done all that much before. A lot of this stuff is sort of post-hoc poking at models as opposed to actually using this to intentionally design them.

Shawn Wang [00:07:29]: Is this post-training or pre-training or is that not a useful.

Myra Deng [00:07:33]: Currently focused on post-training, but there's no reason the techniques wouldn't also work in pre-training.

Shawn Wang [00:07:38]: Yeah. It seems like it would be more applicable post-training because basically I'm thinking like rollouts or like, you know, having different variations of a model that you can tweak with your steering. Yeah.

Myra Deng [00:07:50]: And I think in a lot of the news that you've seen on like Twitter or whatever, you've seen a lot of unintended side effects come out of post-training processes, you know, overly sycophantic models or models that exhibit strange reward hacking behavior. I think these are like extreme examples. There's also, you know, more mundane, like enterprise use cases where, you know, they try to customize or post-train a model to do something and it learns some noise or it doesn't appropriately learn the target task. And a big question that we've always had is like, how do you use your understanding of what the model knows and what it's doing to actually guide the learning process?

Shawn Wang [00:08:26]: Yeah, I mean, uh, you know, just to anchor this for people, uh, one of the biggest controversies of last year was 4o GlazeGate. I've never heard of GlazeGate. 
I didn't know that was what it was called. The other one, they called it that on the blog post and I was like, well, how did OpenAI call it? Like officially use that term. And I'm like, that's funny, but like, yeah, I guess it's the pitch that if they had worked with Goodfire, they would have avoided it. Like, you know what I'm saying?

Myra Deng [00:08:51]: I think so. Yeah. Yeah.

Mark Bissell [00:08:53]: I think that's certainly one of the use cases. I think. Yeah. Yeah. I think the reason why post-training is a place where this makes a lot of sense is a lot of what we're talking about is surgical edits. You know, you want to be able to have expert feedback, very surgically change how your model is doing, whether that is, you know, removing a certain behavior that it has. So, you know, one of the things that we've been looking at, another like common area where you would want to make a somewhat surgical edit, is some of the models that have say political bias. Like you look at Qwen or, um, R1 and they have sort of like this CCP bias.

Shawn Wang [00:09:27]: Is there a CCP vector?

Mark Bissell [00:09:29]: Well, there are certainly internal, yeah, parts of the representation space where you can sort of see where that lives. Yeah. Um, and you want to kind of, you know, extract that piece out.

Shawn Wang [00:09:40]: Well, I always say, you know, whenever you find a vector, a fun exercise is just like, make it very negative to see what the opposite of CCP is.

Mark Bissell [00:09:47]: The super America, bald eagles flying everywhere. But yeah. So in general, like lots of post-training tasks where you'd want to be able to do that. Whether it's unlearning a certain behavior or, you know, some of the other kind of cases where this comes up is, are you familiar with like the grokking behavior? 
I mean, I know the machine learning term of grokking.

Shawn Wang [00:10:09]: Yeah.

Mark Bissell [00:10:09]: Sort of this like double descent idea of having a model that is able to learn a generalizing solution, as opposed to even if memorization of some task would suffice, you want it to learn the more general way of doing a thing. And so, you know, another way that you can think about having surgical access to a model's internals would be learn from this data, but learn in the right way, if there are many possible, you know, ways to do that. Can interp solve the double descent problem?

Shawn Wang [00:10:41]: Depends, I guess, on how you. Okay. So I viewed that double descent as a problem because then you're like, well, if the loss curves level out, then you're done, but maybe you're not done. Right. Right. But like, if you actually can interpret what is generalizing, or what is still changing even though the loss is not changing, then maybe you can actually not view it as a double descent problem. And actually you're just sort of translating the space in which you view loss and like, and then you have a smooth curve. Yeah.

Mark Bissell [00:11:11]: I think that's certainly like the domain of problems that we're looking to get.

Shawn Wang [00:11:15]: Yeah. To me, like double descent is like the biggest thing to like ML research where like, if you believe in scaling, then you need to know where to scale. But if you believe in double descent, then you don't believe in anything where like anything levels off, like.

Vibhu Sapra [00:11:30]: I mean, also tangentially there's like, okay, when you talk about the China vector, right. There's the subliminal learning work. It was from the Anthropic Fellows program where basically you can have hidden biases in a model. 
And as you distill down or, you know, as you train on distilled data, those biases always show up, even if like you explicitly try to not train on them. So, you know, it's just like another use case of, okay, if we can interpret what's happening in post-training, you know, can we clear some of this? Can we even determine what's there? Because yeah, it's just like some worrying research that's out there that shows, you know, we really don't know what's going on.

Mark Bissell [00:12:06]: That is. Yeah. I think that's the biggest sentiment that we're sort of hoping to tackle. Nobody knows what's going on. Right. Like subliminal learning is just an insane concept when you think about it. Right. Train a model on not even the logits, literally the output text of a bunch of random numbers. And now your model loves owls. And you see behaviors like that, that are just, they defy intuition. And there are mathematical explanations that you can get into, but. I mean.

Shawn Wang [00:12:34]: It feels so early days. Objectively, there are a sequence of numbers that are more owl-like than others. There should be.

Mark Bissell [00:12:40]: According to certain models. Right. It's interesting. I think it only applies to models that were initialized from the same starting seed. Usually, yes.

Shawn Wang [00:12:49]: But I mean, I think that's a cheat code because there's not enough compute. But like if you believe in like platonic representation, like probably it will transfer across different models as well. Oh, you think so?

Mark Bissell [00:13:00]: I think of it more as a statistical artifact of models initialized from the same seed sort of. There's something that is like path dependent from that seed that might cause certain overlaps in the latent space and then sort of doing this distillation. Yeah. Like it pushes it towards having certain other tendencies.

Vibhu Sapra [00:13:24]: Got it. 
I think there's like a bunch of these open-ended questions, right? Like you can't train in new stuff during the RL phase, right? RL only reorganizes weights and you can only do stuff that's somewhat there in your base model. You're not learning new stuff. You're just reordering chains and stuff. But okay. My broader question is when you guys work at an interp lab, how do you decide what to work on and what's kind of the thought process? Right. Because we can ramble for hours. Okay. I want to know this. I want to know that. But like, how do you concretely like, you know, what's the workflow? Okay. There's like approaches towards solving a problem, right? I can try prompting. I can look at chain of thought. I can train probes, SAEs. But how do you determine, you know, like, okay, is this going anywhere? Like, do we have set stuff? Just, you know, if you can help me with all that. Yeah.

Myra Deng [00:14:07]: It's a really good question. I feel like we've always at the very beginning of the company thought about like, let's go and try to learn what isn't working in machine learning today. Whether that's talking to customers or talking to researchers at other labs, trying to understand both where the frontier is going and where things are really falling apart today. And then developing a perspective on how we can push the frontier using interpretability methods. And so, you know, even our chief scientist, Tom, spends a lot of time talking to customers and trying to understand what real world problems are and then taking that back and trying to apply the current state of the art to those problems and then seeing where they fall down basically. And then using those failures or those shortcomings to understand what hills to climb when it comes to interpretability research. So like on the fundamental side, for instance, when we have done some work applying SAEs and probes, we've encountered, you know, some shortcomings in SAEs that we found a little bit surprising. 
And so have gone back to the drawing board and done work on that. And then, you know, we've done some work on better foundational interpreter models. And a lot of our team's research is focused on what is the next evolution beyond SAEs, for instance. And then when it comes to like control and design of models, you know, we tried steering with our first API and realized that it still fell short of black box techniques like prompting or fine tuning. And so went back to the drawing board and we're like, how do we make that not the case and how do we improve it beyond that? And one of our researchers, Ekdeep, who just joined, is actually, Ekdeep and Atticus are like steering experts and have spent a lot of time trying to figure out like, what is the research that enables us to actually do this in a much more powerful, robust way? So yeah, the answer is like, look at real world problems, try to translate that into a research agenda and then like hill climb on both of those at the same time.

Shawn Wang [00:16:04]: Yeah. Mark has the steering CLI demo queued up, which we're going to go into in a sec. But I always want to double click on when you drop hints, like we found some problems with SAEs. Okay. What are they? You know, and then we can go into the demo. Yeah.

Myra Deng [00:16:19]: I mean, I'm curious if you have more thoughts here as well, because you've done it in the healthcare domain. But I think like, for instance, when we do things like trying to detect behaviors within models that are harmful or like behaviors that a user might not want to have in their model. So hallucinations, for instance, harmful intent, PII, all of these things. We first tried using SAE probes for a lot of these tasks. So taking the feature activation space from SAEs and then training classifiers on top of that, and then seeing how well we can detect the properties that we might want to detect in model behavior. 
And we've seen in many cases that probes just trained on raw activations seem to perform better than SAE probes, which is a bit surprising if you think that SAEs are actually also capturing the concepts that you would want to capture cleanly and more surgically. And so that is an interesting observation. I'm not down on SAEs at all. I think there are many, many things they're useful for, but we have definitely run into cases where I think the concept space described by SAEs is not as clean and accurate as we would expect it to be for actual like real world downstream performance metrics.

Mark Bissell [00:17:34]: Fair enough. Yeah. It's the blessing and the curse of unsupervised methods where you get to peek into the AI's mind. But sometimes you wish that you saw other things when you walked inside there. Although in the PII instance, I think an SAE-based approach actually did prove to be the most generalizable?

Myra Deng [00:17:53]: It did work well in the case that we published with Rakuten. And I think a lot of the reason it worked well was because we had a noisier data set. And so actually the blessing of unsupervised learning is that we actually got to get more meaningful, generalizable signal from SAEs when the data was noisy. But in other cases where we've had like good data sets, it hasn't been the case.

Shawn Wang [00:18:14]: And just because you named Rakuten and I don't know if we'll get another chance, like what is the overall, like what is Rakuten's usage or production usage? Yeah.

Myra Deng [00:18:25]: So they are using us to essentially guardrail and inference time monitor their language model usage and their agent usage to detect things like PII so that they don't route private user information.

Myra Deng [00:18:41]: And so that's, you know, going through all of their user queries every day. And that's something that we deployed with them a few months ago. 
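The probing idea in this exchange (train a small classifier on a model's activations, then use it as a monitor) can be sketched in a few lines. This is a minimal illustration under loud assumptions, not Goodfire's or Rakuten's implementation: the "activations" are synthetic 8-dimensional vectors in which PII tokens are shifted along one coordinate, the probe is plain logistic regression fit by gradient descent, and the names `train_linear_probe`, `probe_score`, and `redact` are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def probe_score(weights, bias, activation):
    # Probability that one token's activation vector carries the concept.
    return sigmoid(sum(w * a for w, a in zip(weights, activation)) + bias)


def train_linear_probe(activations, labels, epochs=200, lr=0.5):
    # Logistic regression fit with full-batch gradient descent.
    dim, n = len(activations[0]), len(activations)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(activations, labels):
            err = probe_score(weights, bias, x) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def synthetic_activation(is_pii, rng, dim=8):
    # ASSUMPTION: toy stand-in for real model activations. Gaussian noise,
    # shifted by +2 (PII) or -2 (not PII) along coordinate 0.
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    vec[0] += 2.0 if is_pii else -2.0
    return vec


def redact(tokens, token_activations, weights, bias, threshold=0.5):
    # Replace every token whose probe score exceeds the threshold.
    return [
        "[REDACTED]" if probe_score(weights, bias, act) > threshold else tok
        for tok, act in zip(tokens, token_activations)
    ]


rng = random.Random(0)
train_labels = [i % 2 for i in range(200)]
train_acts = [synthetic_activation(y, rng) for y in train_labels]
weights, bias = train_linear_probe(train_acts, train_labels)

test_labels = [i % 2 for i in range(100)]
test_acts = [synthetic_activation(y, rng) for y in test_labels]
predictions = [int(probe_score(weights, bias, x) > 0.5) for x in test_acts]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)

# Noise-free toy sentence: only the last token sits on the "PII" side.
tokens = ["call", "me", "at", "555-0100"]
token_acts = [[-3.0] + [0.0] * 7] * 3 + [[3.0] + [0.0] * 7]
scrubbed = redact(tokens, token_acts, weights, bias)
```

An "SAE probe" in the sense described above would train the same kind of classifier on SAE feature activations instead of the raw vectors; the sketch covers only the raw-activation case. At inference time the probe costs one dot product per token, so no second model has to run.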
And now we are actually exploring very early partnerships, not just with Rakuten, but with other people around how we can help with potentially training and customization use cases as well. Yeah.

Shawn Wang [00:19:03]: And for those who don't know, Rakuten is like, I think number one or number two e-commerce store in Japan. Yes. Yeah.

Mark Bissell [00:19:10]: And I think that use case actually highlights a lot of like what it looks like to deploy things in practice that you don't always think about when you're doing sort of research tasks. So when you think about some of the stuff that came up there that's more complex than your idealized version of a problem, they were encountering things like synthetic to real transfer of methods. So they couldn't train probes, classifiers, things like that on actual customer data of PII. So what they had to do is use synthetic data sets. And then hope that that transfers out of domain to real data sets. And so we can evaluate performance on the real data sets, but not train on customer PII. So that right off the bat is like a big challenge. You have multilingual requirements. So this needed to work for both English and Japanese text. Japanese text has all sorts of quirks, including tokenization behaviors that caused lots of bugs that caused us to be pulling our hair out. And then also a lot of tasks you'll see, you might make simplifying assumptions if you're sort of treating it as like the easiest version of the problem to just sort of get like general results, where maybe you say you're classifying a sentence to say, does this contain PII? But the need that Rakuten had was token level classification so that you could precisely scrub out the PII. So as we learned more about the problem, you're sort of speaking about what that looks like in practice. Yeah. A lot of assumptions end up breaking. And that was just one instance where you. 
A problem that seems simple right off the bat ends up being more complex as you keep diving into it.

Vibhu Sapra [00:20:41]: Excellent. One of the things that's also interesting with interp is a lot of these methods are very efficient, right? So where you're just looking at a model's internals itself compared to a separate like guardrail, LLM as a judge, a separate model. One, you have to host it. Two, there's like a whole latency. So if you use like a big model, you have a second call. Some of the work around like self detection of hallucination, it's also deployed for efficiency, right? So if you have someone like Rakuten doing it in production live, you know, that's just another thing people should consider.

Mark Bissell [00:21:12]: Yeah. And something like a probe is super lightweight. Yeah. It's no extra latency really. Excellent.

Shawn Wang [00:21:17]: You have the steering demos lined up. So we'll just kind of see what you got. I don't actually know if this is like the latest, latest or like alpha thing.

Mark Bissell [00:21:26]: No, this is a pretty hacky demo from a presentation that someone else on the team recently gave. So this will give a sense for the technology. So you can see the steering in action. Honestly, I think the biggest thing that this highlights is that as we've been growing as a company and taking on kind of more and more ambitious versions of interpretability related problems, a lot of that comes to scaling up in various different forms. And so here you're going to see steering on a 1 trillion parameter model. This is Kimi K2. And so it's sort of fun that in addition to the research challenges, there are engineering challenges that we're now tackling. Cause for any of this to be sort of useful in production, you need to be thinking about what it looks like when you're using these methods on frontier models as opposed to sort of like toy kind of model organisms. 
So yeah, this was thrown together hastily, pretty fragile behind the scenes, but I think it's quite a fun demo. So screen sharing is on. So I've got two terminal sessions pulled up here. On the left is a forked version that we have of the Kimi CLI that we've got running to point at our custom-hosted Kimi model. And then on the right is a setup that will allow us to steer on certain concepts. So I should be able to chat with Kimi over here. Tell it hello. This is running locally. So the CLI is running locally, but the Kimi server is running back at the office. Well, hopefully should be. That's too much to run on that Mac. Yeah. I think it takes a full, like, H100 node. I think you can run it on eight GPUs, eight H100s. So yeah, Kimi's running. We can ask it a prompt. It's got a forked version of the SGLang code base that we've been working on. So I'm going to tell it, hey, this SGLang code base is slow. I think there's a bug. Can you try to figure it out? It's a big code base, so it'll spend some time doing this. And then on the right here, I'm going to initialize, in real time, some steering. Let's see here.
Mark Bissell [00:23:33]: Searching for any bugs. Feature ID 43205.
Shawn Wang [00:23:38]: Yeah.
Mark Bissell [00:23:38]: 20, 30, 40. So this is basically a feature that we found inside Kimi that seems to cause it to speak in Gen Z slang. And so on the left, it's still sort of thinking normally. It might take, I don't know, 15 seconds for this to kick in, but then we're going to start hopefully seeing it do "this code base is massive, for real." So we're going to start seeing Kimi transition, as the steering kicks in, from normal Kimi to Gen Z Kimi, both in its chain of thought and its actual outputs.
Mark Bissell [00:24:19]: And interestingly, you can see, you know, it's still able to call tools and stuff. It's purely sort of its demeanor.
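As a rough picture of what steering on a feature means mechanically, the sketch below adds a scaled feature direction to the hidden state after chosen layers of a toy residual network. It is only an illustration under stated assumptions: the "model" is a small random stack rather than Kimi K2, and `run_model`, `slang_direction`, and the strength of 4.0 are hypothetical choices, not values from the demo.

```python
import numpy as np

def run_model(x, layers, steering=None):
    """Run a toy residual stack, optionally steering after chosen layers.

    steering maps a layer index to (direction, strength); the unit-normalised
    direction is added to every token's hidden state after that layer.
    """
    steering = steering or {}
    h = x
    for i, weight in enumerate(layers):
        h = h + np.tanh(h @ weight)  # toy residual block standing in for a transformer layer
        if i in steering:
            direction, strength = steering[i]
            h = h + strength * direction / np.linalg.norm(direction)
    return h

rng = np.random.default_rng(0)
d_model, n_layers, n_tokens = 8, 3, 4
layers = [0.1 * rng.normal(size=(d_model, d_model)) for _ in range(n_layers)]
x = rng.normal(size=(n_tokens, d_model))
slang_direction = rng.normal(size=d_model)  # hypothetical "Gen Z slang" feature direction
unit = slang_direction / np.linalg.norm(slang_direction)

baseline = run_model(x, layers)
# Steer at several layers; the strength is arbitrary.
steered = run_model(x, layers, {i: (slang_direction, 4.0) for i in range(n_layers)})
# Steering only after the final layer shifts the output by exactly 4.0 * unit.
last_only = run_model(x, layers, {n_layers - 1: (slang_direction, 4.0)})

print("mean projection before:", (baseline @ unit).mean())
print("mean projection after: ", (steered @ unit).mean())
```

Because the intervention happens on activations during the forward pass, the steering dictionary can be changed between requests without touching any weights, which is what makes toggling it live plausible.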
And there are other features that we found for interesting things like concision. So that's more of a practical one. You can make it more concise. Um, the types of programming languages that it uses. But yeah, as we're seeing it come in. Pretty good outputs.
Shawn Wang [00:24:43]: Scheduler code is actually wild.
Vibhu Sapra [00:24:46]: Yo, this code is actually insane, bro.
Vibhu Sapra [00:24:53]: What's the process of training an SAE on this, or, you know, how do you label features? I know you guys put out a pretty cool blog post about finding this, like, autonomous interp something. Something about how agents for interp are different than coding agents. I don't know, while this is spewing, how do we find feature 43205? Yeah.
Mark Bissell [00:25:15]: So in this case, our platform that we've been building out for a long time now supports all the sort of classic out-of-the-box interp techniques that you might want to have, like SAE training, probing, things of that kind. I'd say the techniques for vanilla SAEs are pretty well established now, where you take the model that you're interpreting, run a whole bunch of data through it, gather activations, and then, yeah, it's a pretty straightforward pipeline to train an SAE. There are a lot of different varieties. There's top-k SAEs, batch top-k SAEs, normal ReLU SAEs. And then once you have your sparse features, to your point, assigning labels to them to actually understand that this is a Gen Z feature, that's actually where a lot of the kind of magic happens. Yeah. And the most basic standard technique is to look at all of your input data set examples that cause this feature to fire most highly. And then you can usually pick out a pattern. So for this feature, if I've run a diverse enough data set through my model, feature 43205 probably tends to fire on all the tokens that sound like Gen Z slang.
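The two steps described here, a sparse encoding of activations and labeling a feature by its top-activating examples, can be sketched in a few lines. This is a minimal illustration only: the encoder weights are hand-picked rather than trained, the activations are made up, and `topk_sae_encode` and `top_activating_tokens` are hypothetical helper names.

```python
import numpy as np

def topk_sae_encode(acts, W_enc, b_enc, k):
    """Top-k SAE encoder: ReLU pre-activations, keep only the k largest per token."""
    pre = np.maximum(acts @ W_enc + b_enc, 0.0)
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]  # indices of the k largest features
    sparse = np.zeros_like(pre)
    np.put_along_axis(sparse, idx, np.take_along_axis(pre, idx, axis=1), axis=1)
    return sparse

def top_activating_tokens(tokens, feature_acts, feature_id, n=3):
    """Return the n tokens on which one feature fires most strongly."""
    order = np.argsort(-feature_acts[:, feature_id])[:n]
    return [tokens[i] for i in order]

tokens = ["the", "function", "fr", "returns", "no cap", "lowkey"]
# Made-up 3-dimensional activations; the third dimension is large on slang tokens.
acts = np.array([
    [1.0, 0.1, 0.0],
    [0.2, 1.0, 0.0],
    [0.1, 0.0, 2.0],
    [0.3, 0.9, 0.1],
    [0.0, 0.1, 3.0],
    [0.2, 0.0, 2.5],
])
# Hand-picked encoder with 4 features (a trained SAE would learn these weights).
W_enc = np.array([
    [1.0, 0.0, 0.0, 0.5],
    [0.0, 1.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
b_enc = np.zeros(4)

features = topk_sae_encode(acts, W_enc, b_enc, k=2)
examples = top_activating_tokens(tokens, features, feature_id=2)
print(examples)  # ['no cap', 'lowkey', 'fr']
```

A human or an LLM auto-labeler would then look at examples like these and propose a name for the feature; in practice the ranking runs over a large, diverse dataset rather than six tokens.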
You know, that's the time of year to be like, oh, I'm in this, I'm in this. And so, you know, you could have a human go through all 43,000 concepts and...
Vibhu Sapra [00:26:34]: And I've got to ask the basic question, you know, can we get examples where it hallucinates, pass them through, see what feature activates for hallucinations? Can I just, you know, turn hallucination down?
Myra Deng [00:26:51]: Oh, wow. You really predicted a project we're already working on right now, which is detecting hallucinations using interpretability techniques. And this is interesting because hallucination is something that's very hard to detect. It's kind of a hairy problem and something that black-box methods really struggle with. Whereas with Gen Z, you could always train a simple classifier to detect that. Hallucinations are harder. But we've seen that models internally have some awareness of, like, uncertainty, or some sort of user-pleasing behavior that leads to hallucinatory behavior. And so, yeah, we have a project that's trying to detect that accurately. And then also working on mitigating the hallucinatory behavior in the model itself as well.
Shawn Wang [00:27:39]: Yeah, I would say most people are still at the level of like, oh, I would just turn temperature to zero and that turns off hallucination. And I'm like, well, that's a fundamental misunderstanding of how this works. Yeah.
Mark Bissell [00:27:51]: Although, so part of what I like about that question is, there are SAE-based approaches that might help you get at that. But oftentimes the beauty of SAEs, and like we said, the curse, is that they're unsupervised.
So when you have a behavior that you deliberately would like to remove, and that's more of like a supervised task, often it is better to use something like probes and specifically target the thing that you're interested in reducing, as opposed to sort of hoping that when you fragment the latent space, one of the vectors that pops out.
Vibhu Sapra [00:28:20]: And as much as we're training an autoencoder to be sparse, we're not for sure certain that, you know, we will get something that just correlates to hallucination. You'll probably split that up into 20 other things and who knows what they'll be.
Mark Bissell [00:28:36]: Of course. Right. Yeah. So there's known sort of problems with, like, feature splitting and feature absorption. And then there's the off-target effects, right? Ideally, you would want to be very precise, whereas if you reduce the hallucination feature, suddenly maybe your model can't write creatively anymore. And maybe you don't like that, but you want to still stop it from hallucinating facts and figures.
Shawn Wang [00:28:55]: Good. So Vibhu has a paper to recommend there that we'll put in the show notes. But yeah, I mean, I guess just because your demo is done, any other things that you want to highlight or any other interesting features you want to show?
Mark Bissell [00:29:07]: I don't think so. Yeah. Like I said, this is a pretty small snippet. I think the main point here that I think is exciting is that there's not a whole lot of interp being applied to models quite at this scale. You know, Anthropic certainly has some research and, yeah, other teams as well. But it's nice to see these techniques, you know, being put into practice. I think not that long ago, the idea of real-time steering of a trillion parameter model would have sounded.
Shawn Wang [00:29:33]: Yeah.
The fact that it's real time, like you started the thing and then you edited the steering vector.
Vibhu Sapra [00:29:38]: I think it's an interesting one. TBD what the actual production use case would be on that, like the real-time editing. That's the fun part of the demo, right? You can kind of see how this could be served behind an API, right? Like, yes, you only have so many knobs and you can just tweak it a bit more. And I don't know how it plays in. Like people haven't done that much with, how does this work with or without prompting? Right. How does this work with fine-tuning? Like, there's a whole hype of continual learning, right? So there's just so much to see. Like, is this another parameter? Is it a parameter we just kind of leave as a default and don't use? So I don't know. Maybe someone here wants to put out a guide on how to use this with prompting, when to do what?
Mark Bissell [00:30:18]: Oh, well, I have a paper recommendation I think you would love, from Ekdeep on our team, who is an amazing researcher, just can't say enough amazing things about Ekdeep. He actually has a paper, along with some others from the team and elsewhere, that goes into the essential equivalence of activation steering and in-context learning. He thinks of everything in a cognitive neuroscience Bayesian framework, but basically it shows precisely how prompting, in-context learning, and steering exhibit similar behaviors, and even gets quantitative about the magnitude of steering you would need to do to induce a certain amount of behavior similar to certain prompting, even for things like jailbreaks and stuff. It's a really cool paper. Are you saying steering is less powerful than prompting? More like you can almost write a formula that tells you how to convert between the two of them.
Myra Deng [00:31:20]: And so like formally equivalent, actually, in the limit.
Right.
Mark Bissell [00:31:24]: So like one case study of this is for jailbreaks. I don't know, have you seen the stuff where you can do like many-shot jailbreaking? You flood the context with examples of the behavior. Anthropic put out that paper.
Shawn Wang [00:31:38]: A lot of people were like, yeah, we've been doing this, guys.
Mark Bissell [00:31:40]: Like, yeah, what's in this in-context learning and activation steering equivalence paper is you can predict the number of examples that you will need to put in there in order to jailbreak the model. That's cool. By doing steering experiments and using this sort of equivalence mapping. That's cool. That's really cool. It's very neat. Yeah.
Shawn Wang [00:32:02]: I was going to say, like, you know, I can back-rationalize that this makes sense because, you know, what context is, is basically just, you know, it updates the KV cache, kind of, and then every next-token inference is still, like, you know, the sheer sum of everything all the way. It's plus all the context. It's up to date. And you could, I guess, theoretically steer that, you could probably replace that with your steering. The only problem is steering typically is on one layer, maybe three layers like you did. So it's not exactly equivalent.
Mark Bissell [00:32:33]: Right, right. You need to get precise about, yeah, how you define steering and how you're modeling the setup. But yeah, I've got the paper pulled up here. The title is Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering. So Eric Bigelow and Dan Wurgaft, who are doing fellowships at Goodfire, and Ekdeep's the final author there.
Myra Deng [00:32:59]: I think actually to your question of like, what is the production use case of steering?
I think maybe if you just think like one level beyond steering as it is today. Like imagine if you could adapt your model to be, you know, an expert legal reasoner, in almost real time, very quickly and efficiently, using human feedback or using your semantic understanding of what the model knows and where it knows that behavior. I think that while it's not clear what the product is at the end of the day, it's clearly very valuable. Thinking about what's the next interface for model customization and adaptation is a really interesting problem for us. Like we have heard a lot of people actually interested in fine-tuning and RL for open-weight models in production. And so people are using things like Tinker or kind of like open source libraries to do that, but it's still very difficult to get models fine-tuned and RL'd for exactly what you want them to do unless you're an expert at model training. And so that's like something we're looking into.
Shawn Wang [00:34:06]: Yeah. I never thought so. Tinker from Thinking Machines famously uses rank-one LoRA. Is that basically the same as steering? Like, you know, what's the comparison there?
Mark Bissell [00:34:19]: Well, so in that case, you are still applying updates to the parameters, right?
Shawn Wang [00:34:25]: Yeah. You're not touching a base model. You're touching an adapter. It's kind of, yeah.
Mark Bissell [00:34:30]: Right. But I guess it still is more in parameter space then. I guess it's maybe like, are you modifying the pipes or are you modifying the water flowing through the pipes to get what you're after? Yeah. Just maybe one way.
Mark Bissell [00:34:44]: I like that analogy. That's my mental map of it at least, but it gets at this idea of model design and intentional design, which is something that we're very focused on.
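The pipes-versus-water distinction can be made concrete on a single toy linear layer. The sketch below is an illustration under simple assumptions, not a description of how Tinker or Goodfire implement anything: a rank-one update changes the weights, so its effect depends on the input, while a steering vector is added to the activations and is the same for every input. All names (`base_layer`, `lora_layer`, `steered_layer`, `u`, `v`, `s`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
W = rng.normal(size=(d, d))  # frozen base weights (the "pipes")
u = rng.normal(size=d)       # rank-one adapter factors
v = rng.normal(size=d)
s = rng.normal(size=d)       # steering vector added to activations (the "water")

def base_layer(x):
    return W @ x

def lora_layer(x):
    # Parameter-space edit: W is replaced by W + u v^T.
    return (W + np.outer(u, v)) @ x

def steered_layer(x):
    # Activation-space edit: weights untouched, output shifted by s.
    return W @ x + s

x1, x2 = rng.normal(size=d), rng.normal(size=d)

# The rank-one update shifts the output by u * (v . x), which varies with x.
lora_shift_1 = lora_layer(x1) - base_layer(x1)
lora_shift_2 = lora_layer(x2) - base_layer(x2)

# Steering shifts the output by the same vector s regardless of x.
steer_shift_1 = steered_layer(x1) - base_layer(x1)
steer_shift_2 = steered_layer(x2) - base_layer(x2)

print("LoRA shift depends on input:    ", not np.allclose(lora_shift_1, lora_shift_2))
print("Steering shift is input-agnostic:", np.allclose(steer_shift_1, steer_shift_2))
```

In this simplified picture both edits are cheap, but only the adapter changes how the layer responds to different inputs; real steering setups can also make the added vector depend on context, which this sketch leaves out.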
And just the fact that, like, I hope that we look back at how we're currently training models and post-training models and just think what a primitive way of doing that right now. Like there's no intentionality really in...
Shawn Wang [00:35:06]: It's just data, right? The only thing in control is what data we feed in.
Mark Bissell [00:35:11]: So Dan from Goodfire likes to use this analogy of, you know, he has a couple of young kids and he talks about, like, what if I could only teach my kids how to be good people by giving them cookies or, you know, giving them a slap on the wrist if they do something wrong, not telling them why it was wrong or what they should have done differently or something like that. Just figure it out. Right. Exactly. So that's RL. Yeah. Right. And, you know, it's sample inefficient. There's, you know, what do they say? It's like slurping feedback. It's like slurping supervision. Right. And so you'd like to get to the point where you can have experts giving feedback to their models that is internalized, and, you know, steering is an inference-time way of sort of getting at that idea. But ideally you're moving to a world where it is much more intentional design in perpetuity for these models.
Vibhu Sapra [00:36:04]: Okay. This is one of the questions we asked Emmanuel from Anthropic on the podcast a few months ago. Basically the question was, you're at a research lab that does model training, foundation models, and you're on an interp team. How does it tie back? Right? Like, do ideas come from the pre-training team? Do they go back? You know, so for those interested, you can watch that. There wasn't too much of a connect there, but it's still something, you know, it's something they want to push for down the line.
Mark Bissell [00:36:33]: It can be useful for all of the above. Like there are certainly post-hoc use cases where it doesn't need to touch that.
Vibhu Sapra [00:36:39]:
I think the other thing a lot of people forget is this stuff isn't too computationally expensive, right? Like I would say, if you're interested in getting into research, mech interp is one of the most approachable fields, right? A lot of this, train an SAE, train a probe, this stuff, like the budget for this, there's already a lot done. There's a lot of open source work. You guys have done some too. Um, you know,
Shawn Wang [00:37:04]: There's like notebooks from the Gemini team, from Neel Nanda, that are like, this is how you do it. Just step through the notebook.
Vibhu Sapra [00:37:09]: Even if you're not even technical with any of this, you can still make progress. There, you can look at different activations, but if you do want to get into training, you know, training this stuff, correct me if I'm wrong, is like in the thousands of dollars. It's not that high scale. And then same with, you know, applying it, doing it for post-training, all this stuff is fairly cheap on the scale of, okay, I want to get into model training, I don't have compute for, you know, pre-training stuff. So it's a very nice field to get into. And also there's a lot of open questions, right? Some of them have to do with, okay, I want a product, I want to solve this. But there's also just a lot of open-ended stuff that people could work on. That's interesting. Right. I don't know if you guys have any calls for, like, what's open questions, what's open work that you'd either want open collaboration on, or you'd just like to see solved, or just, you know, for people listening that want to get into mech interp, because people always talk about it. What are the things they should check out? Start, of course, you know, join you guys as well. I'm sure you're hiring.
Myra Deng [00:38:09]: There's a paper, I think from, was it Lee Sharkey?
It's Open Problems in Mechanistic Interpretability, which I recommend everyone who's interested in the field read. It's just a really comprehensive overview of what experts in the field think are the most important problems to be solved. I also think, to your point, it's been really, really inspiring to see a lot of young people getting interested in interpretability, and actually not just young people, also scientists who have been, you know, experts in physics for many years, and in biology or things like this, transitioning into interp, because the barrier to entry is, you know, in some ways low, and there's a lot of information out there and ways to get started. So it's really cool to see. There's this anecdote of professors at universities saying that all of a sudden every incoming PhD student wants to study interpretability, which was not the case a few years ago. So it just goes to show how, I guess, exciting the field is, how fast it's moving, how quick it is to get started and things like that.
Mark Bissell [00:39:10]: And also just a very welcoming community. You know, there's an open source mech interp Slack channel. People are always posting questions, and folks in the space are always responsive if you ask things on various forums and stuff. But yeah, the open problems paper is a really good one.
Myra Deng [00:39:28]: For other people who want to get started, I think, you know, MATS is a great program. What's the acronym for? Machine Learning and Alignment Theory Scholars? It's like the...
Vibhu Sapra [00:39:40]: Normally summer internship style.
Myra Deng [00:39:42]: Yeah, but they've been doing it year round now. And actually a lot of our full-time staff have come through that program or gone through that program. And it's great for anyone who is transitioning into interpretability. There's a couple other fellows programs.
We do one, as well as Anthropic. And so those are great places to get started if anyone is interested.
Mark Bissell [00:40:03]: Also, I think it's been seen as a research field for a very long time. But I think engineers are sorely wanted for interpretability as well, especially at Goodfire, but elsewhere too, as it does scale up.
Shawn Wang [00:40:18]: I should mention that Lee actually works with you guys, right? In the London office. And I'm adding our first ever mech interp track at AI Europe because I see these industry applications now emerging. And I'm pretty excited to, you know, help push that along. Yeah, I was looking forward to that. It'll effectively be the first industry mech interp conference. Yeah. I'm so glad you added that. You know, it's still a little bit of a bet. It's not that widespread, but I can definitely see this is the time to really get into it. We want to be early on things.
Mark Bissell [00:40:51]: For sure. And I think the field understands this, right? So at ICML, I think the title of the mech interp workshop this year was actionable interpretability. And there was a lot of discussion around bringing it to various domains. Everyone's adding pragmatic, actionable, whatever.
Shawn Wang [00:41:10]: It's like, okay, well, we weren't actionable before, I guess. I don't know.
Vibhu Sapra [00:41:13]: And I mean, like, just, you know, being in Europe, you see the interp room. One, like old school conferences, I think they had a very tiny room till they got lucky and they got it doubled. But there's definitely a lot of interest, a lot of niche research. So you see a lot of research coming out of universities, students. We covered a paper last week. It's like two unknown authors, not many citations. But, you know, you can do a lot of meaningful work there. Yeah. Yeah. Yeah.
Shawn Wang [00:41:39]: Yeah. I think people haven't really mentioned this yet. It's just interp for code. I think it's like an abnormally important field.
We haven't mentioned this yet. The conspiracy theory last two years ago was when the first SAE work came out of Anthropic was they would do like, oh, we just used SAEs to turn the bad code vector down and then turn up the good code. And I think like, isn't that the dream? Like, you know, like, but basically, I guess maybe, why is it funny? Like, it's... If it was realistic, it would not be funny. It would be like, no, actually, we should do this. But it's funny because we know there's like, we feel there's some limitations to what steering can do. And I think a lot of the public image of steering is like the Gen Z stuff. Like, oh, you can make it really love the Golden Gate Bridge, or you can make it speak like Gen Z. To like be a legal reasoner seems like a huge stretch. Yeah. And I don't know if that will get there this way. Yeah.Myra Deng [00:42:36]: I think, um, I will say we are announcing. Something very soon that I will not speak too much about. Um, but I think, yeah, this is like what we've run into again and again is like, we, we don't want to be in the world where steering is only useful for like stylistic things. That's definitely not, not what we're aiming for. But I think the types of interventions that you need to do to get to things like legal reasoning, um, are much more sophisticated and require breakthroughs in, in learning algorithms. And that's, um...Shawn Wang [00:43:07]: And is this an emergent property of scale as well?Myra Deng [00:43:10]: I think so. Yeah. I mean, I think scale definitely helps. I think scale allows you to learn a lot of information and, and reduce noise across, you know, large amounts of data. But I also think we think that there's ways to do things much more effectively, um, even, even at scale. So like actually learning exactly what you want from the data and not learning things that you do that you don't want exhibited in the data. 
So we're not like anti-scale, but we are also realizing that scale is not going to get us anywhere. It's not going to get us to the type of AI development that we want to be at in, in the future as these models get more powerful and get deployed in all these sorts of like mission critical contexts. Current life cycle of training and deploying and evaluations is, is to us like deeply broken and has opportunities to, to improve. So, um, more to come on that very, very soon.Mark Bissell [00:44:02]: And I think that that's a use basically, or maybe just like a proof point that these concepts do exist. Like if you can manipulate them in the precise best way, you can get the ideal combination of them that you desire. And steering is maybe the most coarse grained sort of peek at what that looks like. But I think it's evocative of what you could do if you had total surgical control over every concept, every parameter. Yeah, exactly.Myra Deng [00:44:30]: There were like bad code features. I've got it pulled up.Vibhu Sapra [00:44:33]: Yeah. Just coincidentally, as you guys are talking.Shawn Wang [00:44:35]: This is like, this is exactly.Vibhu Sapra [00:44:38]: There's like specifically a code error feature that activates and they show, you know, it's not, it's not typo detection. It's like, it's, it's typos in code. It's not typical typos. And, you know, you can, you can see it clearly activates where there's something wrong in code. And they have like malicious code, code error. They have a whole bunch of sub, you know, sub broken down little grain features. Yeah.Shawn Wang [00:45:02]: Yeah. So, so the, the rough intuition for me, the, why I talked about post-training was that, well, you just, you know, have a few different rollouts with all these things turned off and on and whatever. And then, you know, you can, that's, that's synthetic data you can kind of post-train on. 
Yeah.
Vibhu Sapra [00:45:13]: And I think we make it sound easier than it is just saying, you know, they do the real hard work.
Myra Deng [00:45:19]: I mean, you guys have the right idea. Exactly. Yeah. We replicated a lot of these features in our Llama models as well. I remember there was like.
Vibhu Sapra [00:45:26]: And I think a lot of this stuff is open, right? Like, yeah, you guys opened yours. DeepMind has opened a lot of SAEs on Gemma. Even Anthropic has opened a lot of this. There's a lot of resources that, you know, we can probably share for people that want to get involved.
Shawn Wang [00:45:41]: Yeah. And special shout out to Neuronpedia as well. Yes. Like, yeah, amazing piece of work to visualize those things.
Myra Deng [00:45:49]: Yeah, exactly.
Shawn Wang [00:45:50]: I guess I wanted to pivot a little bit onto the healthcare side, because I think that's a big use case for you guys. We haven't really talked about it yet. This is a bit of a crossover for me because we do have a separate science pod that we're starting up for AI for science, just because it's such a huge investment category and also I'm less qualified to do it, but we actually have bio PhDs to cover that, which is great. But I need to just kind of recap your work, maybe on the Evo 2 stuff, and then build forward.
Mark Bissell [00:46:17]: Yeah, for sure. And maybe to frame up the conversation, I think another kind of interesting lens on interpretability in general is that a lot of the techniques that were described are ways to solve the AI-human interface problem. And bidirectional communication is the goal there. So what we've been talking about with intentional design of models and, you know, steering, but also more advanced techniques, is having humans impart our desires and control into models and over models.
And the reverse is also very interesting, especially as you get to superhuman models, whether that's narrow superintelligence, like these scientific models that work on genomics data, medical imaging, things like that, or, down the line, you know, superintelligence of other forms as well. What knowledge can the AIs teach us, as sort of the other direction in that? And so some of our life science work to date has been getting at exactly that question. Well, some of it does look like debugging these various life sciences models, understanding if they're actually performing well on tasks, or if they're picking up on spurious correlations. For instance, with genomics models, you would like to know whether they are focusing on the biologically relevant things that you care about, or if they're using some simpler correlate, like the ancestry of the person they're looking at. But then also, in the instances where they are superhuman, and maybe they are understanding elements of the human genome that we don't have names for, or specific, you know, discoveries that they've made that we don't know about, that's a big goal. And so we're already seeing that, right? We are partnered with organizations like Mayo Clinic, a leading research health system in the United States, and Arc Institute, as well as a startup called Prima Mente, which focuses on neurodegenerative disease. And in our partnership with them, we've used foundation models they've been training and applied our interpretability techniques to find novel biomarkers for Alzheimer's disease. So I think this is just the tip of the iceberg. But that's like a flavor of some of the things that we're working on.
Shawn Wang [00:48:36]: Yeah, I think that's really fantastic. Obviously, we did the Chan Zuckerberg pod last year as well. And like, there's a plethora of these models coming out, because there's so much potential and research.
And it's, like, very interesting how it's basically the same as language models, but just with a different underlying data set. It's the same exact techniques. Like, there's no change, basically.
Mark Bissell [00:48:59]: Yeah. Well, and even in other domains, right? Like, you know, robotics, I know a lot of the companies just use Gemma as the backbone, and then they make it into a VLA that takes these actions. It's transformers all the way down. So yeah.
Vibhu Sapra [00:49:15]: Like we have MedGemma now, right? Like this week, even, there was MedGemma 1.5. And they're training it on this stuff, like 3D scans, medical domain knowledge, and all that stuff, too. So there's a push from both sides. But I think one of the things about mech interp is, you're a little bit more cautious in some domains, right? Healthcare mainly being one, like guardrails, understanding, you know, we're more risk averse to something going wrong there. So even just from a basic understanding, like, if we're trusting these systems to make claims, we want to know why and what's going on.
Myra Deng [00:49:51]: Yeah, I think there's totally a kind of deployment bottleneck to actually using foundation models for real patient usage or things like that. Like, say you're using a model for rare disease prediction, you probably want some explanation as to why your model predicted a certain outcome, and an interpretable explanation at that. So that's definitely a use case. But I also think being able to extract scientific information that no human knows, to accelerate drug discovery and disease treatment and things like that, actually is a really, really big unlock for scientific discovery. And you've seen a lot of startups say that they're going to accelerate scientific discovery. And I feel like we actually are doing that through our interp techniques.
And kind of like, almost by accident, like, I think we got reached out to very, very early on from these healthcare institutions. And none of us had healthcare.Shawn Wang [00:50:49]: How did they even hear of you? A podcast.Myra Deng [00:50:51]: Oh, okay. Yeah, podcast.Vibhu Sapra [00:50:53]: Okay, well, now's that time, you know.Myra Deng [00:50:55]: Everyone can call us.Shawn Wang [00:50:56]: Podcasts are the most important thing. Everyone should listen to podcasts.Myra Deng [00:50:59]: Yeah, they reached out. They were like, you know, we have these really smart models that we've trained, and we want to know what they're doing. And we were like, really early that time, like three months old, and it was a few of us. And we were like, oh, my God, we've never used these models. Let's figure it out. But it's also like, great proof that interp techniques scale pretty well across domains. We didn't really have to learn too much about.Shawn Wang [00:51:21]: Interp is a machine learning technique, machine learning skills everywhere, right? Yeah. And it's obviously, it's just like a general insight. Yeah. Probably to finance too, I think, which would be fun for our history. I don't know if you have anything to say there.Mark Bissell [00:51:34]: Yeah, well, just across the science. Like, we've also done work on material science. Yeah, it really runs the gamut.Vibhu Sapra [00:51:40]: Yeah. Awesome. And, you know, for those that should reach out, like, you're obviously experts in this, but like, is there a call out for people that you're looking to partner with, design partners, people to use your stuff outside of just, you know, the general developer that wants to. Plug and play steering stuff, like on the research side more so, like, are there ideal design partners, customers, stuff like that?Myra Deng [00:52:03]: Yeah, I can talk about maybe non-life sciences, and then I'm curious to hear from you on the life sciences side. 
But we're looking for design partners across many domains, language, anyone who's customizing language models or trying to push the frontier of code or reasoning models is really interesting to us. And then also interested in the frontier of modeling. There's a lot of models that work in, like, pixel space, as we call it. So if you're doing world models, video models, even robotics, where there's not a very clean natural language interface to interact with, I think we think that Interp can really help and are looking for a few partners in that space.Shawn Wang [00:52:43]: Just because you mentioned the keyword
This is a special episode, highlighting a session from ELC Annual 2025! OpenAI evolved from a pure research lab into the fastest-growing product in history, scaling from 100 million to 700 million weekly users in record time. In this episode, we deconstruct the organizational design choices and cultural bets that enabled this unprecedented velocity. We explore what it means to hire "extreme generalists," how AI-native interns are redefining productivity, and the real-time trade-offs made during the world's largest product launches. Featuring Sulman Choudhry (Head of ChatGPT Engineering) and Samir Ahmed (Technical Lead), moderated by Lawrence Bruhmuller (Eng Management @ Sigma).

ABOUT SULMAN CHOUDHRY
Sulman leads ChatGPT Engineering at OpenAI, driving the development and scaling of one of the world's most impactful AI products. He pushes the boundaries of innovation by turning cutting‑edge research into practical, accessible tools that transform how people interact with technology. Previously at Meta, Sulman founded and scaled Instagram Reels, IGTV, and Instagram Labs, and helped lead the early development of Instagram Stories. He also brought Meta AI to Instagram and Messenger, integrating generative AI into experiences used by billions. Earlier in his career, Sulman was on the founding team that built and launched UberEATS from the ground up, helping turn it into a global food delivery platform. With a track record of marrying technical vision, product strategy, and large‑scale execution, Sulman focuses on building products that meaningfully change how people live, work, and connect.

ABOUT SAMIR AHMED
Samir is the Technical Lead for ChatGPT at OpenAI, where he currently leads the Personalization and Memory efforts to scale adaptive, useful, and human-centered product experiences to over 700 million users. 
He works broadly across the OpenAI stack, including mobile, web, services, systems, inference, and product research infrastructure. Previously, Samir spent nine years at Snap, working across Ads, AR, Content, and Growth. He led some of the company's most critical technical initiatives, including founding and scaling the machine learning platform that powered nearly all Ads, Content, and AR workloads, handling tens of billions of requests and trillions of inferences daily.

ABOUT LAWRENCE BRUHMULLER
Lawrence Bruhmuller has over 20 years of experience in engineering management, much of it as an overall head of engineering. Previous roles include CTO/VPE roles at Great Expectations, Pave, Optimizely, and WeWork. He is currently leading the core query compiler and serving teams at Sigma Computing, the industry-leading business analytics company. Lawrence is passionate about the intersection of engineering management and the growth stage of startups. He has written extensively on engineering leadership (https://lbruhmuller.medium.com/), including how to best evolve and mature engineering organizations before, during and after these growth phases. He enjoys advising and mentoring other engineering leaders in his spare time. Lawrence holds a Bachelor's and Master's in Mathematics and Engineering from Harvey Mudd College. He lives in Oakland, California, with his wife and their three daughters.

This episode is brought to you by Span! Span is the AI-native developer intelligence platform bringing clarity to engineering organizations with a holistic, human-centered approach to developer productivity. If you want a complete picture of your engineering impact and health, drive high performance, and make smarter business decisions… Go to Span.app to learn more! 
SHOW NOTES:From research lab to record-breaking product: Navigating the fastest growth in history (4:03)Unpredictable scaling: Handling growth spurts of one million users every hour (5:20)Cross-stack collaboration: How Android, systems, and GPU engineers solve crises together (7:06)The magic of trade-offs: Aligning the team on outcomes like service uptime vs. broad availability (7:57)Why throwing models "over the wall" failed and how OpenAI structures virtual teams (11:17)Lessons from OpenAI's first intern class: Why AI-native new grads are crushing expectations (13:41)Non-hierarchical culture: Using the "Member of Technical Staff" title to blur the lines of expertise (15:37)AI-native engineering: When massive code generation starts breaking traditional CI/CD systems (16:21)Asynchronous workflows: Using coding agents to reduce two-hour investigations to 15 minutes (17:35)The mindset shift: How rapid model improvements changed how leaders audit and trust code (19:00)Predicting success: "Vibes-based" decision making and iterative low-key research previews (20:43)Hiring for high variance: Why unconventional backgrounds lead to high-potential engineering hires (22:09) LINKS AND RESOURCESLink to the video for this sessionLink to all ELC Annual 2025 sessions This episode wouldn't have been possible without the help of our incredible production team:Patrick Gallagher - Producer & Co-HostJerry Li - Co-HostNoah Olberding - Associate Producer, Audio & Video Editor https://www.linkedin.com/in/noah-olberding/Dan Overheim - Audio Engineer, Dan's also an avid 3D printer - https://www.bnd3d.com/Ellie Coggins Angus - Copywriter, Check out her other work at https://elliecoggins.com/about/ Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Anthropic's Tobias Harrison-Noonan shares the enterprise AI playbook: why coding leads to broader AI adoption, practical tips for getting started, and why you shouldn't wait for perfection.

Topics Include:
Tobias from Anthropic's Applied AI team discusses enterprise AI adoption trends and insights.
Anthropic was founded four years ago, balancing its AI safety mission with building the world's most intelligent models.
Remarkable velocity: Claude 3.7 and Claude Code both shipped in 2025 alone.
Three-layer partnership: foundation models, enterprise capabilities, and end-user platforms like Claude Code.
Anthropic has led in agentic coding for eighteen months and is now number one in enterprise AI market share.
Claude Opus 4.5 launched last week and again tops the software engineering benchmark for complex tasks.
Claude Code enables thirty-hour autonomous coding sessions and ships features five times faster than before.
The next frontier expands beyond coding into data-heavy knowledge work like financial and legal analysis.
AI adoption maturity curve: employee workflows, internal processes, core products, then AI-native products.
Thomson Reuters started with Claude Code for its development team doing code modernization and refactoring.
They expanded to Claude.ai for sales, marketing, and finance teams after seeing tangible ROI.
They built Claude into core products, including the CoCounsel legal platform and fraud prevention systems.
Today Thomson Reuters has eight different product lines powered by Claude across their portfolio.
The AWS partnership offers safe, secure, scalable deployment from POC to production in existing environments.
Don't wait for perfection: AI today is the dumbest it'll ever be, so start prototyping now.

Participants:
Tobias Harrison-Noonan: Member of Technical Staff, Anthropic

See how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon.com/isv/
No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
Pundits are screaming about the so-called “AI bubble.” But historically slow-to-adopt industries like medicine and law are actually embracing AI at an unprecedented speed. Sarah Guo and Elad Gil look ahead to 2026, breaking down the major trends that will define the next era of AI technologies. They explore the future of AI foundational models, predicting breakthroughs in solving complex scientific problems. They share competing views on the timeline for robotics and self-driving cars, debating whether startups have a chance for survival or if incumbents will dominate. Elad and Sarah also discuss the return of tech IPOs and M&As, forecast a new wave of AI consumer agent software, and explore why consumer product innovation has been slower than expected. Finally, the two offer bold non-AI predictions for the new year, including the acceleration of defense tech startups and the second-order underrated impacts of GLP-1 drugs on biohacking. Plus, stick around to hear predictions on what's next for AI in 2026 from some of tech's biggest names and industry leaders. We hear from Jensen Huang (Founder/CEO NVIDIA), Arvind Jain (Founder/CEO, Glean), Winston Weinberg (Founder/CEO, Harvey), Scott Wu (Founder/CEO, Cognition), Raiza Martin (Founder/CEO Huxe), Zach Ziegler (Founder/CTO, Open Evidence), Aaron Levie (Founder/CEO, Box), Misha Laskin (Founder/CEO, ReflectionAI), Noam Brown (Research Scientist, OpenAI), Joshua Meier (Founder/CEO Chai Discovery), Bryan Johnson (Living Man, Don't Die), Sholto Douglas (Member of the Technical Staff, Anthropic), Ben & Asher Spector (Stanford PhDs) and Dylan Patel (Founder/CEO SemiAnalysis). Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil Chapters: 00:00 – Introduction 02:43 – AI Predictions for 2026 04:40 – Adoption of AI in Professional Fields 07:17 – Robotics and Self-Driving Cars 08:25 – Robotics: Incumbents vs. 
Startups 13:59 – Future of IPOs and M&A in AI 16:42 – Challenges in Consumer AI Innovation 21:08 – Funding of Neo Labs, RL Research 26:28 – Predictions for 2026 Beyond AI 26:44 – The Future of Defense and Technology 28:23 – Biohacking and Peptide Therapies 30:37 – 2026 Prediction from AI Industry Leaders 40:46 – Conclusion
** AWS re:Invent 2025 Dec 1-5, Las Vegas - Register Here! **Learn how Anyscale's Ray platform enables companies like Instacart to supercharge their model training while Amazon saves heavily by shifting to Ray's multimodal capabilities.Topics Include:Ray originated at UC Berkeley when PhD students spent more time building clusters than ML modelsAnyscale now launches 1 million clusters monthly with contributions from OpenAI, Uber, Google, CoinbaseInstacart achieved 10-100x increase in model training data using Ray's scaling capabilitiesML evolved from single-node Pandas/NumPy to distributed Spark, now Ray for multimodal dataRay Core transforms simple Python functions into distributed tasks across massive compute clustersHigher-level Ray libraries simplify data processing, model training, hyperparameter tuning, and model servingAnyscale platform adds production features: auto-restart, logging, observability, and zone-aware schedulingUnlike Spark's CPU-only approach, Ray handles both CPUs and GPUs for multimodal workloadsRay enables LLM post-training and fine-tuning using reinforcement learning on enterprise dataMulti-agent systems can scale automatically with Ray Serve handling thousands of requests per secondAnyscale leverages AWS infrastructure while keeping customer data within their own VPCsRay supports EC2, EKS, and HyperPod with features like fractional GPU usage and auto-scalingParticipants:Sharath Cholleti – Member of Technical Staff, AnyscaleSee how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon.com/isv/
Three Buddy Problem - Episode 70: Dave Aitel from OpenAI's technical staff joins the buddies to discuss the just-launched Aardvark, OpenAI's agentic “security researcher” that claims to read code, find bugs, validate exploits, and ship patches. We press him on where LLMs beat fuzzers, privacy boundaries, human-in-the-loop realities, SDLC budgets, pen-test cadence, and the zero-day economy. Plus, an L3Harris/Trenchant exec pleads guilty to selling exploits to Russian brokers, Kaspersky catches the return of HackingTeam using a Chrome zero-day exploit chain, and news of a proposed law in Russia to force researchers to report vulnerabilities first to government agencies. Cast: Dave Aitel (https://www.linkedin.com/in/daveaitel/) (Technical Staff, OpenAI), Juan Andres Guerrero-Saade (https://twitter.com/juanandres_gs), Ryan Naraine (https://twitter.com/ryanaraine) and Costin Raiu (https://twitter.com/craiu).
Building Claude Code: Origin, Story, Product Iterations, & What's Next // MLOps Podcast #342 with Siddharth Bidasaria, Member of Technical Staff at Anthropic.

Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter

// Abstract
Demetrios Brinkmann talks with Siddharth Bidasaria about Anthropic's Claude Code: how it was built, key features like file tools and Spotify control, and the team's lean, user-focused approach. They explore testing, subagents, and the future of agentic coding, plus how users are pushing its limits.

// Bio
Software engineer. Founding team of Claude Code. Ex-Robinhood and Rubrik.

// Related Links
Bio: https://sidb.io/
Sid's Blog: https://sidb.io/posts/
I Let An AI Play Pokémon! - Claude plays Pokémon Creator: https://youtu.be/nRHeGJwVP18
How Data Platforms Affect ML & AI // Jake Watson // MLOps Podcast #207: https://youtu.be/xWApMuyct_4
The Agent Landscape - Lessons Learned Putting Agents Into Production: https://youtu.be/lRGldru7ohU

~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community [https://go.mlops.community/slack]
Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)
Sign up for the next meetup: [https://go.mlops.community/register]
MLOps Swag/Merch: [https://shop.mlops.community/]
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Siddharth on LinkedIn: /siddharthbidasaria/

Timestamps:
[00:00] MCP servers usage creativity
[00:34] Claude's code origin story
[05:17] R&D freedom and tools
[09:08] Model potential discovery
[12:06] Model adaptation strategies
[19:13] Steerability vs pattern alignment
[22:09] Features to delete
[24:12] Moore's law in LLMs
[32:42] Power user surprises
[35:56] Sub-agent evolution insights
[39:54] Agent communication governance
[45:26] At-scale agent coordination
[49:56] Wrap up
Francis Yap is OIC Director at USeP KTTD Office. Jhong Corbeta is Technical Staff at AGILab TBI. The USeP KTTD Office (Knowledge and Technology Transfer Division) drives the university's innovation agenda by offering IP consultations, fabrication-lab access, and incubation services. AGILab TBI, USeP's technology business incubator, recently resumed operations to guide ideas and MVPs through validation and commercialization, supporting both university spinoffs and student startups. This episode is recorded live at the USeP KTTD Office, in partnership with AGILab TBI - knowledge and technology transfer division and technology business incubator of University of Southeastern Philippines in Davao City.In this episode | 00:50 Ano ang USeP KTTD Office & AGILab TBI? | 04:18 What services are provided by the incubator? | 11:07 What type of startups are supported by the incubator? | 13:12 How can interested startups join? | 15:45 What is the story behind the incubator? | 23:54 How is the startup ecosystem in Davao? | 27:44 What are future plans for the incubator? | 31:20 How can listeners find more information?USEP KTTD OFFICE | Facebook: https://facebook.com/usepkttdAGILAB TBIFacebook | https://facebook.com/usepagilabTHIS EPISODE IS CO-PRODUCED BY:YSPACES: https://knowyourspaceph.comAPEIRON: https://apeirongrp.comTWALA: https://twala.ioSYMPH: https://symph.coSECUNA: https://secuna.ioRED CIRCLE GLOBAL: https://redcircleglobal.comMAROON STUDIOS: https://maroonstudios.comAIMHI: https://aimhi.aiCHECK OUT OUR PARTNERS:Ask Lex PH Academy: https://asklexph.com (5% discount on e-learning courses! 
Code: ALPHAXSUP)Argum AI: http://argum.aiPIXEL by Eplayment: https://pixel.eplayment.co/auth/sign-up?r=PIXELXSUP1 (Sign up using Code: PIXELXSUP1)School of Profits: https://schoolofprofits.academyFounders Launchpad: https://founderslaunchpad.vcHier Business Solutions: https://hierpayroll.comAgile Data Solutions (Hustle PH): https://agiledatasolutions.techSmile Checks: https://getsmilechecks.comCloudCFO: https://cloudcfo.ph (Free financial assessment, process onboarding, and 6-month QuickBooks subscription! Mention: Start Up Podcast PH)Cloverly: https://cloverly.techBuddyBetes: https://buddybetes.comHKB Digital Services: https://contakt-ph.com (10% discount on RFID Business Cards! Code: CONTAKTXSUP)Hyperstacks: https://hyperstacksinc.comOneCFO: https://onecfoph.co (10% discount on CFO services! Code: ONECFOXSUP)UNAWA: https://unawa.asiaSkoolTek: https://skooltek.coBetter Support: https://bettersupport.io (Referral fee for anyone who can bring in new BPO clients!)Britana: https://britanaerp.comWunderbrand: https://wunderbrand.comEastPoint Business Outsourcing Services: https://facebook.com/eastpointoutsourcingDVCode Technologies Inc: https://dvcode.techNutriCoach: https://nutricoach.comUplift Code Camp: https://upliftcodecamp.com (5% discount on bootcamps and courses! Code: UPLIFTSTARTUPPH)START UP PODCAST PHYouTube: https://youtube.com/startuppodcastphSpotify: https://open.spotify.com/show/6BObuPvMfoZzdlJeb1XXVaApple Podcasts: https://podcasts.apple.com/us/podcast/start-up-podcast/id1576462394Facebook: https://facebook.com/startuppodcastphPatreon: https://patreon.com/StartUpPodcastPHPIXEL: https://pixel.eplayment.co/dl/startuppodcastphWebsite: https://phstartup.onlineThis episode is edited by the team at: https://tasharivera.com
Has society reached ‘peak progress'? Can we sustain the level of economic growth that technology has enabled over the last century? Have researchers plucked the last of science's "low-hanging fruit?" Why did early science innovators have outsized impact per capita? As fields mature, why does per-researcher output fall? Can a swarm of AI systems materially accelerate research? What does exponential growth hide about the risk of collapse? Will specialized AI outcompete human polymaths? Is quality of life still improving - and how confident are we in those measures? Is it too late to steer away from the attention economy? Can our control over intelligent systems scale as we develop their power? Will AI ever be capable of truly understanding human values? And if we reach that point, will it choose to align itself?Holden Karnofsky is a Member of Technical Staff at Anthropic, where he focuses on the design of the company's Responsible Scaling Policy and other aspects of preparing for the possibility of highly advanced AI systems in the future. Prior to his work with Anthropic, Holden led several high-impact organizations as the co-founder and co-executive director of charity evaluator GiveWell, and one of three Managing Directors of grantmaking organization Open Philanthropy. You can read more about ideas that matter to Holden at his blog Cold Takes.Further reading:Holden's "most important century" seriesResponsible scaling policiesHolden's thoughts on sustained growthStaffSpencer Greenberg — Host / DirectorJosh Castle — ProducerRyan Kessler — Audio EngineerUri Bram — FactotumWeAmplify — TranscriptionistsIgor Scaldini — Marketing ConsultantMusicBroke for FreeJosh WoodwardLee RosevereQuiet Music for Tiny Robotswowamusiczapsplat.comAffiliatesClearer ThinkingGuidedTrackMind EasePositlyUpLift[Read more]
In this episode, we welcome Neil Thompson, founder of Teach the Geek, to discuss the critical need for developing speaker training programs for technical staff within an organization. Neil shares his personal story of struggling with public speaking as an engineer and how it inspired him to help other technical professionals improve their communication skills. He breaks down the common challenges technical experts face when presenting to non-technical audiences and offers practical strategies HR departments can implement to foster better communication across the organization. [0:00] Introduction Welcome, Neil! Today's Topic: Developing Speaker Training Programs for Technical Staff [6:59] Why is it so important for technical staff to be strong public speakers? How a lack of communication skills can lead to being overlooked for promotions and raises. The benefit of having the person with the expertise communicate it directly, rather than risk information getting lost in translation. [13:33] What are the biggest challenges for technical people presenting to non-technical audiences? The importance of remembering what it was like before becoming a technical expert and tailoring the presentation accordingly. The challenge of making assumptions about what the audience already knows. Strategies for understanding the audience before delivering a presentation. [20:32] What role can HR play in developing presentation and communication skills for technical staff? Involving technical staff in the creation of the training or presentation to ensure it meets their needs. Using feedback questionnaires to measure the effectiveness and improvement of the training over time. [28:52] Closing Thanks for listening! Quick Quote “A lot of times, technical people, we think that us being excellent at our jobs is good enough, and unfortunately, that's not the case. 
If you're not good at advocating for yourself, you're not good at communicating your worth to an organization, you get overlooked.”
Sholto Douglas, a Member of Technical Staff at Anthropic, joined Unsupervised Learning to break down why coding is the clearest early signal of model progress, how AI agents are already accelerating research, and what it'll take to unlock real-world breakthroughs in fields like biology and robotics. (0:00) Intro(0:48) Claude 4(1:30) Capabilities and Improvements(2:29) Practical Applications and Advice(3:04) Future of AI in Coding(4:38) Managing Multiple AI Models(11:20) The Barrier to Agents is Reliability(16:35) Agents Conducting Research(19:54) Impact of Models on World GDP(25:14) Most Important Metrics in Model Improvement(29:53) Stories of Model Creativity(32:45) How Often Will New Models Be Shipped in the Future?(39:51) Day-to-Day Work of AI Researchers(46:46) The Future of AI and Society(51:26) Quickfire With your co-hosts: @jacobeffron - Partner at Redpoint, Former PM Flatiron Health @patrickachase - Partner at Redpoint, Former ML Engineer LinkedIn @ericabrescia - Former COO Github, Founder Bitnami (acq'd by VMWare) @jordan_segall - Partner at Redpoint
The Value of Speaker Training for Technical StaffIn this episode of the Teach the Geek Podcast, host Neil Thompson explores why organizations should invest in speaker training for their technical staff. He shares a real-life conference experience that highlights the gap in communication between technical professionals and business development teams. Neil breaks down the key elements of an effective speaker training program, including: ✅ Identifying presentation challenges specific to technical professionals ✅ Structuring presentations with a clear call to action ✅ Mastering essential speaking skills like time management and audience adaptation ✅ Providing opportunities for practice through lunch and learns ✅ Measuring success with feedback and gamification By equipping technical staff with strong presentation skills, organizations can create better industry representatives, improve internal communication, and help their employees advance in their careers. Tune in to learn how your organization can benefit from empowering its technical professionals to speak with confidence! __TEACH THE GEEK (http://teachthegeek.com)Follow @teachthegeek (Twitter) and @_teachthegeek_ (IG)Get Public Speaking Tips for STEM professionals at http://teachthegeek.com/tips.
I Let An AI Play Pokémon! - Claude plays Pokémon Creator // MLOps Podcast #295 with David Hershey, Member of Technical Staff at Anthropic.Join the Community: https://go.mlops.community/YTJoinIn Get the newsletter: https://go.mlops.community/YTNewsletter // AbstractDemetrios chats with David Hershey from Anthropic's Applied AI team about his agent-powered Pokémon project using Claude. They explore agent frameworks, prompt optimization vs. fine-tuning, and AI's growing role in software, legal, and accounting fields. David highlights how managed AI platforms simplify deployment, making advanced AI more accessible.// BioDavid Hershey devoted most of his career to machine learning infrastructure and trying to abstract away the hairy systems complexity that gets in the way of people building amazing ML applications.// Related LinksWebsite: https://www.davidhershey.com/~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExploreJoin our slack community [https://go.mlops.community/slack]Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)] Sign up for the next meetup: [https://go.mlops.community/register]MLOps Swag/Merch: [https://shop.mlops.community/]Connect with Demetrios on LinkedIn: /dpbrinkmConnect with David on LinkedIn: /david-hershey-458ab081
Holy moly, AI enthusiasts! Alex Volkov here, reporting live from the AI Engineer Summit in the heart of (touristy) Times Square, New York! This week has been an absolute whirlwind of announcements, from XAI's Grok 3 dropping like a bomb, to Figure robots learning to hand each other things, and even a little eval smack-talk between OpenAI and XAI. It's enough to make your head spin, but that's what ThursdAI is here for. We sift through the chaos and bring you the need-to-know, so you can stay on the cutting edge without having to, well, spend your entire life glued to X and Reddit.

This week we had a very special live show with the Haize Labs folks, the ones I previously interviewed about their bijection attacks, discussing their open source judge evaluation library called Verdict. So grab your favorite caffeinated beverage, maybe do some stretches because your mind will be blown, and let's dive into the TL;DR of ThursdAI, February 20th, 2025!

Participants
* Alex Volkov: AI Evangelist with Weights & Biases
* Nisten: AI Engineer and cohost
* Akshay: AI Community Member
* Nuo: Dev Advocate at 01AI
* Nimit: Member of Technical Staff at Haize Labs
* Leonard: Co-founder at Haize Labs

Open Source LLMs

Perplexity's R1 1776: Censorship-Free DeepSeek

Perplexity made a bold move this week, releasing R1 1776, a fine-tuned version of DeepSeek R1 specifically designed to remove what they (and many others) perceive as Chinese government censorship. The name itself, 1776, is a nod to American independence, a pretty clear statement! The core idea? Give users access to information on topics the CCP typically restricts, like Tiananmen Square and Taiwanese independence.

Perplexity used human experts to identify around 300 sensitive topics and built a "censorship classifier" to train the bias out of the model. The impressive part? They claim to have done this without significantly impacting the model's performance on standard evals. 
As Nuo from 01AI pointed out on the show, though, he'd "actually prefer that they can actually disclose more of their details in terms of post training... Running the R1 model by itself, it's already very difficult and very expensive." He raises a good point: more transparency is always welcome! Still, it's a fascinating attempt to tackle a tricky problem, the problem which I always say we simply cannot avoid. You can check it out yourself on Hugging Face and read their blog post.

Arc Institute & NVIDIA Unveil Evo 2: Genomics Powerhouse

Get ready for some serious science, folks! Arc Institute and NVIDIA dropped Evo 2, a massive genomics model (40 billion parameters!) trained on a mind-boggling 9.3 trillion nucleotides. And it's fully open: two papers, weights, data, training, and inference codebases. We love to see it!

Evo 2 uses the StripedHyena architecture to process huge genetic sequences (up to 1 million nucleotides!), allowing for analysis of complex genomic patterns. The practical applications? Predicting the effects of genetic mutations (super important for healthcare) and even designing entire genomes. I've been super excited about genomics models, and seeing these alternative architectures like StripedHyena getting used here is just icing on the cake. Check it out on X.

ZeroBench: The "Impossible" Benchmark for VLLMs

Need more benchmarks? Always! A new benchmark called ZeroBench arrived, claiming to be the "impossible benchmark" for Vision Language Models (VLLMs). And guess what? All current top-of-the-line VLLMs get a big fat zero on it.

One example they gave was a bunch of scattered letters, asking the model to "answer the question that is written in the shape of the star among the mess of letters." Honestly, even I struggled to see the star they were talking about. It highlights just how much further VLLMs need to go in terms of true visual understanding. 
(X, Page, Paper, HF)

Hugging Face's Ultra Scale Playbook: Scaling Up

For those of you building massive models, Hugging Face released the Ultra Scale Playbook, a guide to building and scaling AI models on huge GPU clusters.

They ran 4,000 scaling experiments on up to 512 GPUs (nothing close to Grok's 100,000, but still impressive!). If you're working in a lab and dreaming big, this is definitely a resource to check out. (HF).

Big CO LLMs + APIs

Grok 3: XAI's Big Swing, a new SOTA LLM! (and Maybe a Bug?)

Monday evening, BOOM! While some of us were enjoying Presidents' Day, the XAI team dropped Grok 3. They announced it in a setting very similar to OpenAI's announcements. They're claiming state-of-the-art performance on some benchmarks (more on that drama later!), and a whopping 1 million token context window, finally confirmed after some initial confusion. They talked a lot about agents and a future of reasoners as well.

The launch was a bit… messy. First, there was a bug where some users were getting Grok 2 even when the dropdown said Grok 3. That led to a lot of mixed reviews. Even when I finally thought I was using Grok 3, it still flubbed my go-to logic test, the "Beth's Ice Cubes" question. (The answer is zero, folks: ice cubes melt!). But Akshay, who joined us on the show, chimed in with some love: "...with just the base model of Grok 3, it's, in my opinion, it's the best coding model out there." So, mixed vibes, to say the least! It's also FREE for now, "until their GPUs melt," according to XAI, which is great.

UPDATE: The vibes are shifting, more and more of my colleagues and mutuals are LOVING Grok 3 for one-shot coding, for talking to it. I'm getting convinced as well, though I did use and will continue to use Grok for real time data and access to X. 
DeepSearch

In an attempt to show off some agentic features, XAI also launched a deep search (not research like OpenAI, but effectively the same). Now, XAI of course has access to X, which gives their deep search a leg up, specifically for real time information! I found out it can even “use” the X search!

OpenAI's Open Source Tease

In what felt like a very conveniently timed move, Sam Altman dropped a poll on X the same day as the Grok announcement: if OpenAI were to open-source something, should it be a small, mobile-optimized model, or a model on par with o3-mini? Most of us chose o3-mini, just to have access to that model and play with it. No indication of when this might happen, but it's a clear signal that OpenAI is feeling the pressure from the open-source community.

The Eval Wars: OpenAI vs. XAI

Things got spicy! There was a whole debate about the eval numbers XAI posted, specifically the "best of N" scores (like best of 64 runs). Boris from OpenAI, and Aiden mcLau called out some of the graphs. Folks on X were quick to point out that OpenAI also used "best of N" in the past, and the discussion devolved from there.

XAI is claiming SOTA. OpenAI (or some folks from within OpenAI) aren't so sure. The core issue? We can't independently verify Grok's performance because there's no API yet! As I said, "…we're not actually able to use this model to independently evaluate this model and to tell you guys whether or not they actually told us the truth." Transparency matters, folks!

DeepSearch - How Deep?

Grok also touted a new "Deep Search" feature, kind of like Perplexity or OpenAI's "Deep Research" in their more expensive plan. My initial tests were… underwhelming. I nicknamed it "Shallow Search" because it spent all of 34 seconds on a complex query where OpenAI's Deep Research took 11 minutes and cited 17 sources. We're going to need to do some more digging (pun intended) on this one.

This Week's Buzz

We're leaning hard into agents at Weights & Biases! 
We just released an agents whitepaper (check it out on our socials!), and we're launching an agents course in collaboration with OpenAI's Ilan Biggio. Sign up at wandb.me/agents! We're hearing so much about agent evaluation and observability, and we're working hard to provide the tools the community needs.

Also, sadly, our Toronto workshops are completely sold out. But if you're at AI Engineer in New York, come say hi at our booth! And catch my talk on LLM Reasoner Judges tomorrow (Friday) at 11 am EST – it'll be live on the AI Engineer YouTube channel (HERE)!

Vision & Video

Microsoft MUSE: Playable Worlds from a Single Image

This one is wild. Microsoft's MUSE can generate minutes of playable gameplay from just a single second of video frames and controller actions. It's based on the World and Human Action Model (WHAM) architecture, trained on a billion gameplay images from Xbox. So if you've been playing Xbox lately, you might be in the model! I found it particularly cool: "…you give it like a single second of a gameplay of any type of game with all the screen elements, with percentages, with health bars, with all of these things and their model generates a game that you can control." (X, HF, Blog)

StepFun's Step-Video-T2V: State-of-the-Art (and Open Source!)

We got two awesome open-source video breakthroughs this week. First, StepFun's Step-Video-T2V (and T2V Turbo), a 30 billion parameter text-to-video model. The results look really good, especially the text integration. Imagine a Chinese girl opening a scroll, and the words "We will open source" appearing as she unfurls it. That's the kind of detail we're talking about.

And it's MIT licensed! As Nisten noted "This is pretty cool. It came out. Right before Sora came out, people would have lost their minds."
(X, Paper, HF, Try It)

HAO AI's FastVideo: Speeding Up HY-Video

The second video highlight: HAO AI released FastVideo, a way to make HY-Video (already a strong open-source contender) three times faster with no additional training! They call the trick "Sliding Tile Attention"; apparently that alone provides an enormous boost, even compared to Flash Attention.

This is huge because faster inference means these models become more practical for real-world use. And, bonus: it supports HY-Video's LoRAs, meaning you can fine-tune it for, ahem, all kinds of creative applications. I will not go as far as to mention civit ai. (Github)

Figure's Helix: Robot Collaboration!

Breaking news from the AI Engineer conference floor: Figure, the humanoid robot company, announced Helix, a Vision-Language-Action (VLA) model built into their robots! It has full upper body control!

What blew my mind: they showed two robots working together, handing objects to each other, based on natural language commands! As I watched, I exclaimed, "I haven't seen a humanoid robot, hand off stuff to the other one... I found it like super futuristically cool." The model runs on the robot, using a 7 billion parameter VLM for understanding and an 80 million parameter transformer for control. This is the future, folks!

Tools & Others

Microsoft's New Quantum Chip (and State of Matter!)

Microsoft announced a new quantum chip and a new state of matter (called "topological superconductivity"). "I found it like absolutely mind blowing that they announced something like this," I gushed on the show. While I'm no quantum physicist, this sounds like a big deal for the future of computing.

Verdict: Haize Labs' Framework for LLM Judges

And of course, the highlight of our show: Verdict, a new open-source framework from Haize Labs (the folks behind those "bijection" jailbreaks!) for composing LLM judges. This is a huge deal for anyone working on evaluation.
Leonard and Nimit from Haize Labs joined us to explain how Verdict addresses some of the core problems with LLM-as-a-judge: biases (like preferring their own responses!), sensitivity to prompts, and the challenge of "meta-evaluation" (how do you know your judge is actually good?).

Verdict lets you combine different judging techniques ("primitives") to create more robust and efficient evaluators. Think of it as "judge-time compute scaling," as Leonard called it. They're achieving near state-of-the-art results on benchmarks like ExpertQA, and it's designed to be fast enough to use as a guardrail in real-time applications!

One key insight: you don't always need a full-blown reasoning model for judging. As Nimit explained, Verdict can combine simpler LLM calls to achieve similar results at a fraction of the cost. And, it's open source! (Paper, Github, X)

Conclusion

Another week, another explosion of AI breakthroughs! Here are my key takeaways:
* Open Source is THRIVING: From censorship-free LLMs to cutting-edge video models, the open-source community is delivering incredible innovation.
* The Need for Speed (and Efficiency): Whether it's faster video generation or more efficient LLM judging, performance is key.
* Robots are Getting Smarter (and More Collaborative): Figure's Helix is a glimpse into a future where robots work together.
* Evaluation is (Finally) Getting Attention: Tools like Verdict are essential for building reliable and trustworthy AI systems.
* The Big Players are Feeling the Heat: OpenAI's open-source tease and XAI's rapid progress show that the competition is fierce.

I'll be back in my usual setup next week, ready to break down all the latest AI news. Stay tuned to ThursdAI – and don't forget to give the pod five stars and subscribe to the newsletter for all the links and deeper dives.
There's potentially an Anthropic announcement coming, so we'll see you all next week.

TLDR
* Open Source LLMs
  * Perplexity R1 1776 - finetune of china-less R1 (Blog, Model)
  * Arc institute + Nvidia - introduce EVO 2 - genomics model (X)
  * ZeroBench - impossible benchmark for VLMs (X, Page, Paper, HF)
  * HuggingFace ultra scale playbook (HF)
* Big CO LLMs + APIs
  * Grok 3 SOTA LLM + reasoning and Deep Search (blog, try it)
  * OpenAI is about to open source something? Sam posts a poll
* This Week's Buzz
  * We are about to launch an agents course! Pre-sign up wandb.me/agents
  * Workshops are SOLD OUT
  * Watch my talk LIVE from AI Engineer - 11am EST Friday (HERE)
  * Keep watching AI Eng conference after the show on AIE YT
* Vision & Video
  * Microsoft MUSE - playable worlds from one image (X, HF, Blog)
  * Microsoft OmniParser - Better, faster screen parsing for GUI agents with OmniParser v2 (Gradio Demo)
  * HAO AI - fastVIDEO - making HY-Video 3x as fast (Github)
  * StepFun - Step-Video-T2V (+Turbo), a SotA 30B text-to-video model (Paper, Github, HF, Try It)
  * Figure announces HELIX - vision action model built into FIGURE Robot (Paper)
* Tools & Others
  * Microsoft announces a new quantum chip and a new state of matter (Blog, X)
  * Verdict - Framework to compose SOTA LLM judges with JudgeTime Scaling (Paper, Github, X)

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit sub.thursdai.news/subscribe
Chris Chandler is a Senior Member of the Technical Staff for Developer Productivity at T-Mobile. Chris has led several major initiatives to improve developer experience including their internal developer portal, Starter Kits (a patented developer platform that predates Backstage), and Workforce Transformation Bootcamps for onboarding developers faster.

Mentions and links:
Follow Chris on LinkedIn
Measuring developer productivity with the DX Core 4
Listen to Decoder with Nilay Patel.

Discussion points:
(0:47) From developer experience to developer productivity
(7:03) Getting executive buy-in for developer productivity initiatives
(13:54) What Chris's team is responsible for
(17:02) How they've built relationships with other teams
(20:57) How they built and got funding for Dev Console and Starter Kits
(27:23) Homegrown solution vs Backstage
The Hoover Institution Program on the US, China, and the World held Critical Issues in the US-China Science and Technology Relationship on Thursday, November 7th, 2024 from 4:00 pm - 5:30 pm PT at the Annenberg Conference Room, George P. Shultz Building. Both the United States and the People's Republic of China see sustaining leadership in science and technology (S+T) as foundational to national and economic security. Policymakers on both sides of the Pacific have taken action to promote indigenous innovation, and to protect S+T ecosystems from misappropriation of research and malign technology transfer. In the US, some of these steps, including the China Initiative, have led to pain, mistrust, and a climate of fear, particularly for students and scholars of and from China. Newer efforts, including research security programs and policies, seek to learn from these mistakes. A distinguished panel of scientists and China scholars discusses these dynamics and their implications. What are the issues facing US-China science and technology collaboration? What are the current challenges confronting Chinese American scientists? How should we foster scientific ecosystems that are inclusive, resilient to security challenges, and aligned with democratic values?

Featuring

Zhenan Bao is the K.K. Lee Professor of Chemical Engineering, and by courtesy, a Professor of Chemistry and a Professor of Materials Science and Engineering at Stanford University. Bao directs the Stanford Wearable Electronics Initiative (eWEAR). Prior to joining Stanford in 2004, she was a Distinguished Member of Technical Staff in Bell Labs, Lucent Technologies from 1995-2004. She received her Ph.D. in Chemistry from the University of Chicago in 1995. Bao is a member of the National Academy of Sciences, the National Academy of Engineering, the American Academy of Arts and Sciences and the National Academy of Inventors. She is a foreign member of the Chinese Academy of Sciences.
Bao is known for her work on artificial electronic skin, which is enabling a new-generation of skin-like electronics for regaining sense of touch for neuro prosthetics, human-friendly robots, human-machine interface and seamless health monitoring devices. Bao has been named by Nature Magazine as a “Master of Materials”. She is a recipient of the VinFuture Prize Female Innovator 2022, ACS Chemistry of Materials Award 2022, Gibbs Medal 2020, Wilhelm Exner Medal 2018, L'Oréal-UNESCO For Women in Science Award 2017. Bao co-founded C3 Nano and PyrAmes, which produced materials used in commercial smartphones and FDA-approved blood pressure monitors. Research inventions from her group have also been licensed as foundational technologies for multiple start-ups founded by her students. Yasheng Huang (黄亚生) is the Epoch Foundation Professor of Global Economics and Management at the MIT Sloan School of Management. He also serves as the president of the Asian American Scholar Forum, a non-governmental organization dedicated to promoting open science and protecting the civil rights of Asian American scientists. Professor Huang is a co-author of MIT's comprehensive report on university engagement with China and has recently contributed an insightful article to Nature on the US-China science and technology agreement. For more information, you can read his recent article in Nature here. Peter F. Michelson is the Luke Blossom Professor in the School of Humanities & Sciences and Professor of Physics at Stanford University. He has also served as the Chair of the Physics Department and as Senior Associate Dean for the Natural Sciences. His research career began with studies of superconductivity and followed a path that led to working on gravitational wave detection. For the past 15 years his research has been focused on observations of the Universe with the Fermi Gamma-ray Space Telescope, launched by NASA in 2008. 
He leads the international collaboration that designed, built, and operates the Large Area Telescope (LAT), the primary instrument on Fermi. The collaboration has grown from having members from 5 nations (U.S., Japan, France, Italy, Sweden) to more than 20 today, including members in the United States, Europe, China, Japan, Thailand, South America, and South Africa. Professor Michelson has received several awards for the development of the Fermi Observatory, including the Bruno Rossi Prize of the American Astronomical Society. He is an elected member of the American Academy of Arts and Sciences and a Fellow of the American Physical Society. He has served on a number of advisory committees, including for NASA and various U.S. National Academy of Sciences Decadal Surveys. In 2020-21, he co-directed an American Academy of Arts and Sciences study, Challenges for International Scientific Partnerships, that identified the benefits of international scientific collaboration and recommended actions to be taken to address the most pressing challenges facing international scientific collaborations. Glenn Tiffert is a distinguished research fellow at the Hoover Institution and a historian of modern China. He co-chairs Hoover's program on the US, China, and the World, and also leads Stanford's participation in the National Science Foundation's SECURE program, a $67 million effort authorized by the CHIPS and Science Act of 2022 to enhance the security and integrity of the US research enterprise. He works extensively on the security and integrity of ecosystems of knowledge, particularly academic, corporate, and government research; science and technology policy; and malign foreign interference. Moderator Frances Hisgen is the senior research program manager for the program on the US, China, and the World at the Hoover Institution. 
As key personnel for the National Science Foundation's SECURE program, a joint $67 million effort authorized by the CHIPS and Science Act of 2022, Hisgen focuses on ensuring efforts to enhance the security and integrity of the US research enterprise align with democratic values, promote civil rights, and respect civil liberties. Her AB from Harvard and MPhil from the University of Cambridge are both in Chinese history.
Neil Thompson is an engineer who worked in the medical device industry and used to fail miserably at public speaking. As a result, he founded Teach the Geek. He now works with technical professionals like himself to improve their presentation skills. He also hosts the Teach the Geek podcast. And he's the author of the book, Teach the Geek to Speak: a no-fluff public speaking guide for STEM Professionals. In this episode, Neil talks about “Public Speaking as a game changer for technical staff”. Host: Marie-Line Germain, Ph.D. Mixing: Kelly Minnis
Shruti Kapoor, Lead Member of Technical Staff at Slack, explores the new features and updates in React 19. From enhanced form handling to the introduction of React Actions and the React Compiler, this episode provides valuable insights for developers eager to leverage the latest advancements in React.

Links
https://www.linkedin.com/in/shrutikapoor08
https://shrutikapoor.dev
https://x.com/shrutikapoor08
https://www.youtube.com/@shrutikapoor08
bit.ly/shruti-newsletter
bit.ly/shruti-discord
https://github.com/shrutikapoor08
https://github.com/reactwg/react-compiler
https://www.youtube.com/watch?v=ExZUdkfu-KE&t=810s

We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com, or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod).

Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers!

What does LogRocket do? LogRocket provides AI-first session replay and analytics that surfaces the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com. Try LogRocket for free today (https://logrocket.com/signup/?pdr).

Special Guest: Shruti Kapoor.
In this episode of the Human Capital Lab Podcast, host Rich Douglas interviews Neil Thompson, founder of Teach the Geek, a service aimed at improving public speaking and presentation skills among technical professionals. Neil shares his personal journey from struggling with presentations as a product development engineer to creating Teach the Geek to help other technical experts enhance their communication skills. He discusses the process and benefits of his coaching, including an online course and in-person training, and emphasizes the importance of clear communication for technical staff within organizations. The episode highlights how talent development departments can collaborate with Teach the Geek to empower their technical employees to become more effective communicators.

00:25 Meet Neil Thompson
01:06 Neil's Journey: From Research Associate to Product Development Engineer
02:56 The Birth of Teach the Geek
03:51 Teach the Geek: Services and Offerings
04:25 Improving Technical Communication Skills
06:09 The Teach the Geek Process
08:48 Engaging Talent Development Departments
11:16 The Importance of Technical Staff in Presentations
22:02 Conclusion and Final Thoughts

Thank you for joining us on the Human Capital Lab podcast journey. We hope you found inspiration and valuable insights from today's discussions. Be sure to share this episode with your colleagues and friends, and stay tuned for our exciting new season. Remember, continuous learning is the key to unlocking the long-term potential of human capital.

Connect with the Guest: Neil Thompson
LinkedIn: https://www.linkedin.com/in/neilithompson/
Website: https://teachthegeek.com/

Connect with Human Capital Lab
Host: Rich Douglas LinkedIn: https://www.linkedin.com/in/rich-douglas-92b71b52/
Human Capital Lab Links
Website: https://humancapitallab.org/
Interested in Being a Guest? https://humancapitallab.org/podcast/

This is a Growth Network Podcasts production.
Dr. Sailesh Rao has over three decades of professional experience and is the Founder and Executive Director of Climate Healers, a non-profit dedicated towards healing the Earth's climate. A systems specialist with a Ph. D. in Electrical Engineering from Stanford University, Dr. Rao worked on the internet communications infrastructure for twenty years after graduation. During this period, he blazed the trail for high speed signal processing chips and technologies for High Definition Television, real-time video communications and the transformation of early analog internet connections to more robust digital connections, while accelerating their speeds ten-fold. Today, over a billion internet connections deploy the communications protocol that he designed. He received five Exceptional Contribution Awards from AT&T Bell Laboratories between 1985 and 1991, a Distinguished Member of the Technical Staff award in 1990, the Intel Principal Engineer Award in 2003, and the IIT Madras Distinguished Alumnus Award in 2013 for his technical contributions. He is the author of 22 peer-reviewed technical papers, 50 standards contributions, 10 US patents and 3 Canadian patents. He was the co-founder of Silicon Design Experts in 1991 which was acquired by Level One Communications in 1996 and which was later acquired by Intel Corporation in 1999 for $2.2 billion. In 2006, he switched careers and became deeply immersed, full time, in solving the environmental crises affecting humanity. Dr. Rao is the author of four books, Carbon Dharma: The Occupation of Butterflies, Carbon Yoga: The Vegan Metamorphosis, Animal Agriculture is Immoral and The Pinky Promise, and an Executive Producer of several documentaries, The Human Experiment (2013), Cowspiracy: The Sustainability Secret (2014), What The Health (2017), A Prayer for Compassion (2019), They're Trying to Kill Us (2021), The End of Medicine (2022), The Land of Ahimsa (2022), Animals – A Parallel History (est. 
2024), Milked (2022), Christspiracy (2024) and I Could Never Go Vegan (2024). His work is featured in the award winning film, Countdown to Year Zero produced by Jane Velez-Mitchell and Unchained TV. Dr. Rao is a Human, Earth and Animal Liberation (HEAL) activist, husband, dad and since 2010, a star-struck grandfather. He has promised his granddaughter, Kimaya Rainy Rao, that the world will be largely Vegan before she turns 16 in 2026, so that people will stop eating her relatives, the animals. He has faith that humanity will transform to keep his pinky promise to Kimaya, not just for ethical reasons, but also out of sheer ecological necessity. Along with Kimaya, Dr. Rao was the co-recipient of the inaugural Homo Ahimsa award from the Interfaith Vegan Coalition in 2021. He has formally taken the Ubuntu pledge to become Homo Ahimsa. Dr. Rao was honored with the Karmaveer Puraskaar Global Indian award by the Indian Confederation of NGOs (ICONGO) in 2008, the Shining World Award for Earth Protection from the Supreme Master Ching Hai International Association in 2020 and the Winsome Constance Kindness Medal by the Winsome Constance Kindness Trust in 2022. He was designated a Climate Hero by The Guardian Newspaper in 2023, which recognized him as “a foremost voice on green transition and on the true scale of societal change required to save the planet.” He serves on the Universal Meals Advisory Council of the Physicians Committee for Responsible Medicine and he served on the Board of Directors of the T. Colin Campbell Center for Nutrition Studies in 2023.
Join us at our first in-person conference on June 25 all about AI Quality: https://www.aiqualityconference.com/ Accelerating Multimodal AI // MLOps podcast #241 with Ethan Rosenthal, Member of Technical Staff of Runway. Huge thank you to AWS for sponsoring this episode. AWS - https://aws.amazon.com/ // Abstract We're still trying to figure out systems and processes for training and serving “regular” machine learning models, and now we have multimodal AI to contend with! These new systems present unique challenges across the spectrum, from data management to efficient inference. I'll talk about the similarities, differences, and challenges that I've seen by moving from tabular machine learning, to large language models, to generative video systems. I'll also talk about the setups and tools that I have seen work best for supporting and accelerating both the research and productionization process. // Bio Ethan works at Runway building systems for media generation. Ethan's work generally straddles the boundary between research and engineering without falling too hard on either side. Prior to Runway, Ethan spent 4 years at Square. There, he led a small team of AI Engineers training large language models for Conversational AI. Before Square, Ethan freelance consulted and worked at a couple ecommerce startups. Ethan found his way into tech by way of a Physics PhD. 
// MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Website: https://www.ethanrosenthal.com Ethan's magnum opus: https://www.ethanrosenthal.com/2020/08/25/optimal-peanut-butter-and-banana-sandwiches/ Real-time Model Inference in a Video Streaming Environment // Brannon Dorsey // Coffee Sessions #98: https://youtu.be/TNO6rYwP3yg Feature Stores for Self-Service Machine Learning: https://www.ethanrosenthal.com/2021/02/03/feature-stores-self-service/ Gen-1: The Next Step Forward for Generative AI: https://research.runwayml.com/gen1 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Ethan on LinkedIn: https://bsky.app/profile/ethanrosenthal.com
“One of the great pitfalls is that we just throw them in the deep end and assume they're going to swim. And when they fail, we say, oh well, they just weren't up to it.” Chief, in today's minisode and YouTube video, I'm going to dive into some reasons for why so many people struggle with the shift from a technical, hands-on role into a leadership role, and how you as their leader can ensure a more smooth transition. The reality is that a lot of leaders are promoted based on their technical abilities and not so much on their leadership capabilities, the misconception being that you ‘earn your stripes' through being hands-on and experienced. This approach so often leads to poor outcomes and staff turnover, and for the promoted person feeling hopelessly out of their depth.
Nicholas Marwell, Member of Technical Staff of Anthropic, shares predictions for Generative AI models as well as guidance on safely training and scaling them for the near future.

Topics Include:
Where are Generative AI Models heading
About Anthropic
Claude Opus, Claude 3 Sonnet & Claude 3 Haiku
Scaling laws impact on models
Training models for text, images and multimedia
AI Safety, high stakes and stewarding through potential dangers
Constitutional AI and interpretability research
Benefits of Features – combinations of neurons
Final thoughts – high growth and pace for coming months and years
Valkey, a Redis fork supported by the Linux Foundation, challenges Redis' new license. In this episode, recorded at Open Source Summit 2024 in Seattle, Madelyn Olson, a lead contributor to the Valkey project and former Redis core contributor, along with Ping Xie, Staff Software Engineer at Google, and Dmitry Polyakovsky, Consulting Member of Technical Staff at Oracle, highlights concerns about the shift to a more restrictive license. Despite Redis' free license for end users, many contributors may not support it. Valkey, with significant industry backing, prioritizes continuity and a smooth transition for Redis users. AWS, along with Google and Oracle maintainers, emphasizes the importance of open, permissive licenses for large tech companies. Valkey plans incremental updates and module development in Rust to enhance functionality and attract more engineers. The focus remains on compatibility, continuity, and consolidating client behaviors for a robust ecosystem.

Learn more from The New Stack about the Valkey Project and changes to Open Source licensing:
Linux Foundation Backs 'Valkey' Open Source Fork of Redis
Redis Pulls Back on Open Source Licensing, Citing Stingy Cloud Services
HashiCorp's Licensing Change is only the Latest Challenge to Open Source

Join our community of newsletter subscribers to stay on top of the news and at the top of your game.
Artur, a Belo Horizonte native, was impressed early in childhood by the work of his uncle, a computer scientist who worked remotely from Florianópolis for an American company. When he decided to study Computer Science himself, he took advantage of several opportunities to study abroad, spending time in Wisconsin in the US and in Leeds in England. Back in Brazil with his degree in hand, Artur worked at several companies, including Meta, which sent him to Dublin and later to London. In this episode, Artur tells the story of his path from leaving Meta to deciding to join Anthropic to work on the exciting LLM Claude, and also shares what it has been like to live in the land where rock and roll is the national popular music.

Fabrício Carraro, your polyglot traveler
Artur Rodrigues, Member of Technical Staff at Anthropic in London, England

Links:
Work & Travel
Minas Mundi
Learn X in Y Minutes
Chatbot Arena
Anthropic Console

Get to know Alura's Escola de Inteligência Artificial and dive deep into the world of Artificial Intelligence (AI) applied to different fields of work.

TechGuide.sh, a map of the main technologies the market demands for different careers, with our suggestions and opinions.

#7DaysOfCode: Put your programming knowledge into practice with free daily challenges. Visit https://7daysofcode.io/

Listeners of the Dev Sem Fronteiras podcast get 10% off all Alura Língua plans. Just go to https://www.aluralingua.com.br/promocao/devsemfronteiras/ and start learning English and Spanish today!

Production and content:
Alura Língua, online language courses – https://www.aluralingua.com.br/
Alura, online technology courses – https://www.alura.com.br/

Editing and sound design: Rede Gigahertz de Podcasts
Mike Hanley, Chief Security Officer and SVP of Engineering @ GitHub, joins us to discuss how GitHub has successfully combined its engineering & security orgs and shares recommendations for how other orgs can pivot to this model. We cover why it's so important for eng orgs to collaborate with security early on in the product development cycle and tips for educating your engineers on security best practices. We also discuss how the rise of AI tools / usage is changing how companies need to think about & practice security, why AI is providing opportunities for increased safety & security within product development, and strategies for encouraging your org to adopt AI tooling within engineering, security, and beyond.ABOUT MIKE HANLEYMike Hanley is the Chief Security Officer and SVP of Engineering at GitHub. Prior to GitHub, Mike was the Vice President of Security at Duo Security, where he built and led the security research, development, and operations functions. After Duo's acquisition by Cisco for $2.35 billion in 2018, Mike led the transformation of Cisco's cloud security framework and later served as CISO for the company. Mike also spent several years at CERT/CC as a Senior Member of the Technical Staff and security researcher focused on applied R&D programs for the US Department of Defense and the Intelligence Community.When he's not talking about security at GitHub, Mike can be found enjoying Ann Arbor, MI with his wife and eight kids."The idea that the security team is walled off or separate or not really connected, not just to engineering but the entirety of the business, you really can't have that. If you think about the pace of modern development, things are moving so quickly. It's so driven by software. The idea that you're like, ‘Hey, I got to walk down the hall and check in with somebody from security who has no idea what's going on in my roadmap, who has no idea what my day to day experience is living in engineering...' 
That just doesn't work!”- Mike Hanley We now have 10 local communities of engineering leaders hosting in-person meetups all over the world!Local communities are led by eng leaders just like you, who wanted to create a place to connect, share insights & tackle critical challenges in the job.New York City, Boston, Chicago, Seattle, Los Angeles, San Diego, San Francisco, London, Amsterdam, and Toronto in-person events are happening now!We're launching local events all the time - get involved at elc.community!SHOW NOTES:GitHub's convergence of the eng & security orgs (2:33)Benefits of combining engineering & security org mandates (4:46)How the security team is involved with the internal product dev lifecycle (8:05)The downsides of engaging your security team as an afterthought (10:46)What an early-stage yes/and product conversation looks like (12:48)Examples of educating your eng team on security best practices (17:17)Expanding two-factor authentication externally (19:29)Stewarding security as a responsibility & value (21:59)Security & safety implications for orgs using / building AI tools (23:44)Why the rise of AI is a great time for eng / security collaboration (27:09)How to leverage security best practices using AI tools (29:53)Mike's view that AI will create more opportunities & improve structural tech (32:14)Frameworks for getting to “yes” when it comes to adopting AI tooling (35:15)AI-powered tools GitHub is using to change workflows outside of eng & security (39:06)Considerations pivoting toward combining eng & security functions (40:35)Rapid fire questions (42:25)LINKS AND RESOURCESWhy Johnny Can't Encrypt - Alma Whitten And J. D. Tygar's argument that effective security requires a different usability standard that is not achievable through the user interface techniques commonly found in consumer software.The Space Trilogy - C.S. 
Lewis believed that popular science was the new mythology of his age, and in The Space Trilogy he ransacks the uncharted territory of space and makes that mythology the medium of his spiritual imagination. The Works of Peter Drucker. This episode wouldn't have been possible without the help of our incredible production team: Patrick Gallagher - Producer & Co-Host; Jerry Li - Co-Host; Noah Olberding - Associate Producer, Audio & Video Editor https://www.linkedin.com/in/noah-olberding/; Dan Overheim - Audio Engineer, Dan's also an avid 3D printer - https://www.bnd3d.com/; Ellie Coggins Angus - Copywriter, check out her other work at https://elliecoggins.com/about/
Aarna's News | Inspiring and Uplifting Stories of Women In STEM
In Episode 83 of Aarna's News, join host Aarna Sahu as she delves into the world of AI with special guest Lucia Mocz, Principal Member of the Technical Staff at Oracle. Lucia, an AI research scientist, engineer, and PhD mathematician, shares her expertise in LLMs and transformers, offering insights into their applications across various domains including number theory, computational biology, and computer vision. Through her diverse experiences, Lucia's journey exemplifies the power of perseverance and innovation in pushing the boundaries of STEM. Tune in to gain valuable knowledge and inspiration from one of the leading voices in the field of artificial intelligence. --- Support this podcast: https://podcasters.spotify.com/pod/show/aarna-sahu/support
Description: Today, a fascinating panel discussion from the AWS for Software Companies Executive Forum at re:Invent November 2023, featuring software leaders from Anthropic, Freshworks, Genesys, Snaplogic and AWS discussing the disruption, impact and opportunities of generative AI for software companies. Panelists: Matt Bell, Member of Technical Staff, Anthropic; Siddhartha Agarwal, Senior Vice President, Product Strategy & Operations, Freshworks; Glenn Nethercutt, Chief Technology Officer, Genesys; Jeremiah Stone, Chief Technology Officer, Snaplogic; Sherry Marcus Ph.D, Director of Applied Science, Generative AI, AWS; Andy Perkins, Director ISV Sales, AWS. Topics Include: Observations of the current state of generative AI; How will AI continue to disrupt industries?; Agents and bots to help employers scale; Impact of generative AI on retail and support; Leveraging generative AI and automation to become lean and reduce waste; Risk management for generative AI; Ethical concerns – Security, data accuracy and privacy; Managing/reducing “jailbreaks” – users attempting to break the models; Developing leadership clarity and in-house expertise for large language models; Increased power of Product Managers, evolution of engineering departments; How SaaS companies work with generative AI; Leveraging AWS Bedrock for acceleration; Advice for Software Executives going into generative AI; Partnering with AWS for POCs
In this episode, Françoise von Trapp hands over the mike to imec's Katrien Marent, who hosted imec's ITF Towards NetZero at SEMICON Europa. She introduces a panel discussion on Collaborative Strategies and Practical Solutions Toward a More Sustainable Semiconductors Future. The panel kicks off by polling the audience on what they think are the most pressing issues facing the semiconductor industry as it endeavors to reduce its carbon footprint while simultaneously growing to meet the demand for semiconductor devices, many of which will help other industries on their paths to sustainability. The panel tackles some grave and difficult questions and offers some useful advice on how to collaborate as an industry and the importance of individual efforts made by companies. What is the role of innovation in achieving these goals? Do we need to have standardization around data? Do we need to report with more transparency? In some places, you'll hear instances of the audience polling, with the results informing the questions asked by panel moderator Jan-Hinnerk Mohr, Managing Director & Partner, Boston Consulting Group. Panelists: Emily Gallagher, Principal Member of Technical Staff, imec; Jean-Marc Girard, CTO and SVP of Manufacturing Technologies, Air Liquide Advanced Materials; Benjamin Sokolowski, Managing Director & VP Government Affairs EMEA, Qualcomm; Bill Lussier, Senior Vice President Regional Sales & Deputy GM, Tokyo Electron Europe Ltd. SEMI: A global association, SEMI represents the entire electronics manufacturing and design supply chain. Disclaimer: This post contains affiliate links. If you make a purchase, I may receive a commission at no extra cost to you. Support the show. Become a sustaining member! Like what you hear? Follow us on LinkedIn and Twitter. Interested in reaching a qualified audience of microelectronics industry decision-makers? Invest in host-read advertisements, and promote your company in upcoming episodes. Contact Françoise von Trapp to learn more.
Interested in becoming a sponsor of the 3D InCites Podcast? Check out our 2023 Media Kit. Learn more about the 3D InCites Community and how you can become more involved.
Co-hosts Pierce Gorman, Distinguished Member of the Technical Staff, and Keith Buell, General Counsel & Head of Global Public Policy at Numeracle, team up for a comprehensive recap of SIPNOC 2023: "Focus on the Call and Text Authentication Ecosystem." Together, they take you on a journey, encapsulating all the pivotal discussions, groundbreaking innovations, and key advancements that are set to transform the future of secure telephony. Pierce and Keith dissect the most insightful sessions, share their personal highlights, and discuss the implications of these developments for businesses and consumers alike.
Numeracle's Distinguished Member of the Technical Staff, Pierce Gorman, is back with VP of Engineering - Voice, Brett Nemeroff, and Chief Product Officer, Anis Jaffer, to continue their discussion on 3rd-party call signing and the 6th Report and Order and Further Notice of Proposed Rulemaking. To read the 6th Report and Order and Further Notice of Proposed Rulemaking, visit docs.fcc.gov/public/attachments/DOC-391238A1.pdf Mentioned Content: Sixth Report and Order and Further Notice of Proposed Rulemaking; Hosted SHAKEN; Carrier SHAKEN; SHAKEN software; STIR/SHAKEN; Software as a Service (SaaS); SIP; Originating Service Provider (OSP); Know Your Customer (KYC); Pitfalls of third-party authentication; Attestation levels; Spam labels; Robocall Mitigation Database; ATIS
Numeracle's Distinguished Member of the Technical Staff, Pierce Gorman, and VP of Engineering - Voice, Brett Nemeroff, chat about the 6th Report and Order and Further Notice of Proposed Rulemaking (FNPRM) related to call authentication and the intricacies of 3rd-party call signing. To read the 6th Report and Order and Further Notice of Proposed Rulemaking, visit docs.fcc.gov/public/attachments/DOC-391238A1.pdf Mentioned Content: - 6th Report and Order and Further Notice of Proposed Rulemaking - Call spoofing - STIR/SHAKEN - Attestation - Call Authentication - Entity Identity - 3rd-Party Call Signing - ATIS - KYC - TRACED Act
PHOTO: NO KNOWN RESTRICTIONS ON PUBLICATION. @BATCHELORSHOW #Ukraine: Zaporizhzhia NPP said to be emptying of technical staff, Henry Sokolski, NPEC https://www.msn.com/en-us/news/world/ukraines-occupied-nuke-plant-faces-possible-staffing-crunch/ar-AA1aZneX
In this week's episode of CSS on Converge, Bel and Charles go over: the Kraken entering the All-Star break atop the Pacific Division, the OL Reign announcing their preseason roster, the Storm listed as one of two teams Breanna Stewart is considering, the Mariners adding to their technical staff, Seahawks players being nominated for awards, and so much more! Tune in for the most COMPLETE coverage of all 8 professional sports teams in Seattle! Check out Circling Seattle Sports! Check out the website: https://www.circlingseattlesports.com/ Spotify: https://open.spotify.com/show/3IDad4tGmXJMRUJE27hPhd?si=AmKrgu4gQFO-ELMjJMv2vQ Anchor: https://anchor.fm/chamaker23 Apple podcasts: https://podcasts.apple.com/us/podcast/circling-seattle-sports/id1495589940 Follow us on Instagram: @Circlingseattlesports Follow us on Twitter: @CirclingSports Look us up on Facebook: Circling Seattle Sports --- Send in a voice message: https://podcasters.spotify.com/pod/show/chamaker23/message
Cliff Young is a software engineer in Google Research, where he works on codesign for deep learning accelerators. He is one of the designers of Google's Tensor Processing Unit (TPU) and one of the founders of the MLPerf benchmark. Previously, Cliff built special-purpose supercomputers for molecular dynamics at D. E. Shaw Research and was a Member of Technical Staff at Bell Labs. Cliff holds AB, MS, and PhD degrees in computer science from Harvard University. Cliff is a member of ACM and IEEE. Series: "Institute for Energy Efficiency" [Science] [Show ID: 38473]
Robert G. McGwier is the founder and Technical Advisor at Hawkeye 360. He serves as Technical Director of Federated Wireless, Inc. Dr. McGwier is the Director of Research for the Ted and Karyn Hume Center for National Security and Technology, and Research Professor in the Bradley Department of Electrical and Computer Engineering at Virginia Tech. At Virginia Tech, he leads the overall execution of the Center's research mission, and leads the university's program development efforts in national security applications of wireless and space systems. His area of expertise is in radio frequency communications and digital signal processing. Before joining Virginia Tech, Dr. McGwier spent 26 years as a Member of the Technical Staff at the Institute for Defense Analyses' Center for Communications Research in Princeton, NJ, where he worked on advanced research topics in mathematics and communications supporting the federal government. His work on behalf of the federal government has earned him many awards, including one of the intelligence community's highest honors in 2002. Dr. McGwier is an avid amateur radio operator (call sign N4HY) and has previously served as the Vice President of Engineering for the Amateur Radio Satellite Corporation as well as a member of its Board of Directors. He is a member and former Director of the Tucson Amateur Packet Radio. He won the Dayton Amateur Radio Association Technical Award in 1990 and the Central States VHF Society Chambers Award in 2007 for his work in software defined radio and its application to amateur radio. Dr. McGwier was born in Lebanon, TN and grew up in Grove Hill, AL where he graduated from Clarke County High in 1972. He received his Ph.D. in applied mathematics from Brown University in 1988. Bob Twitter: https://twitter.com/BobMcGwier_N4HY !! SUPPORT DISCLOSURE TEAM !! Patreon: https://www.patreon.com/disclosureteam Buy me a coffee: https://www.buymeacoffee.com/disclosu...
Disclosure Team Merch: https://disclosureteam.bigcartel.com/ Disclosure Team Instagram: https://www.instagram.com/disclosure_... Disclosure Team Twitter: https://twitter.com/disclosureteam_ Disclosure Team is part of the Anomalous Podcast Network: https://audioboom.com/channels/5069292 Vinnie Adams is an ambassador for UAP Society: https://uapsociety.com/
End-users are the best source of feedback on a product you are launching. Giving value to the voice of the customer can help you realize opportunities and provide the best solution. In this episode, Shruti Kapoor, a Lead Member of Technical Staff at Slack, talks about how validation and differentiation play into making your solution stand out and ensuring that your platform doesn't only analyze data but ultimately effects change or improvement in an organization. Get tips on how you can build a level of trust and leverage your customers' feedback in getting your business to scale. Also, get insights on when and how to connect to Y Combinator and set yourself up for success. Love the show? Subscribe, rate, review, and share! http://stratyve.com/
International Involvement: The Intersection Between Domestic and Foreign Voice Service Providers. Part III of our series on Call Authentication Domination has a new twist: a third expert guest speaker! Jim McEachern, Robocall and SHAKEN expert and consultant and Member of the Alliance for Telecommunications Industry Solutions (ATIS), joins host Pierce Gorman of Numeracle's Technical Staff and returning guest Eric Priezkalns, Chief Executive of the Risk and Assurance Group and Editor of Commsrisk.com, to continue the conversation on international call authentication with a focus on the intersection between domestic and foreign voice service providers. Mentioned Content: STIR/SHAKEN Call Authentication; Governance Authority; Certification Authority; Policy Administrator; Know Your Customer (KYC) Vetting; Rich Call Data (RCD) for Call Signing; Illegal Number Spoofing; The TRACED Act; Delegate Certificates. Mentioned Organizations: Alliance for Telecommunications Industry Solutions (ATIS); The North American Numbering Council (NANC) Call Authentication Trust Anchor (CATA) Working Group; International Telecommunication Union (ITU)
Jason Collier (@bocanuts), Principal Member of Technical Staff at AMD and a current GreyBeardsOnStorage co-host, and I both attended VMware Explore 2022 this past week, and we recorded a podcast discussing VMware's announcements on the show floor. It turns out that Keith Townsend, TheCTOAdvisor (@thectoadvisor), had brought his Airstream & studio and was exhibiting on the show floor.
Call Signing with Rich Call Data (RCD) and Exposing Potential Gaps in Delegate Certificates. Eric Priezkalns, Chief Executive Officer of the Risk & Assurance Group and Editor of Commsrisk.com, returns to the podcast, this time taking over as the host with guest Pierce Gorman, Distinguished Member of Numeracle's Technical Staff, for a follow-up conversation on using rich call data for call signing and exposing the potential gaps in delegate certificates. Mentioned Content: Rich Call Data (RCD) for Call Authentication; STIR/SHAKEN; RCD & STIR/SHAKEN PASSporTs / Signatures; Identity Verification; Attestation Levels; Delegate Certificates; Certificate Authorities (CA), Governance Authority (GA), and Policy
This week we welcome Dr. Charles Weschler for a show about a topic of great interest to IEQ and restoration professionals. Restoration contractors are inundated with claims about equipment to help on projects when fires, wildfires and other odor events affect indoor environments. What should practitioners know about ozone, hydroxyls, TiO2 and other technologies when investigating or remediating indoor environments? We talk to a world-renowned professor about this issue. Charles J. Weschler: After completing his Ph.D. in Chemistry at University of Chicago (1974), Dr. Weschler did postdoctoral studies with Fred Basolo at Northwestern University. In 1975 he joined Bell Laboratories (Physical Chemistry Division) and was made a Distinguished Member of Technical Staff in 1986. He worked at Bell Labs and its successor institutions for twenty-five years. In 2001 he accepted positions at the Environmental & Occupational Health Science Institute, Rutgers University, and the International Centre for Indoor Environment and Energy, Technical University of Denmark, and in 2010 joined the Building Science department at Tsinghua University as an ongoing Visiting Professor. He continues in those positions. His research interests include chemicals in indoor environments, their sources, their chemistry, and their interactions with building occupants. From 1999-2005 Weschler served on the US EPA's Science Advisory Board. He has also served on four committees for the National Academy of Sciences, Engineering, and Medicine. From 2012 to the present, he has been an advisor to the Sloan Foundation's program on Chemistry in Indoor Environments. He was elected to the International Academy of Indoor Air Sciences in 1999 and received the Pettenkofer Award, its highest honor, in 2014.
Weschler has also received the 2017 Haagen-Smit Prize from Atmospheric Environment; been made “Distinguished Visiting Professor” at Tsinghua University (2018); awarded “Doctor Technices Honoris Causa” from the Technical University of Denmark (2018); and was elected a Fellow of the American Association for the Advancement of Science (2020). His h-index is 69 (Web of Science) and 79 (Google Scholar). http://eohsi.rutgers.edu/eohsi-directory/name/charles-weschler/ LEARN MORE at IAQ Radio!
Tyler Lebrun is a Principal Member of Technical Staff and Additive Manufacturing Lead at Sandia National Labs where he is focused on all aspects of AM technology and research. He has an extensive background in aerospace having spent time at Aerojet Rocketdyne as well as Blue Origin. Tyler is also heavily involved in the development of standards for the 3D Printing industry and shares some of the behind-the-scenes work that goes on to help industrialize the technology. Before we get started, head over to www.3degreescompany.com and subscribe to the podcast. Remember you can listen to the show anywhere you download your podcasts including Spotify, Apple, Amazon, or Stitcher.
The SoundGirls Living History project is a collection of interviews with audio industry veterans. The project seeks to highlight the careers and achievements of women and underrepresented groups in audio. Interviews are conducted by SoundGirls members, with guidance from experienced interviewers in the audio industry. Interviews will be available publicly in our Living History Project and for educational use and research and through our social media, YouTube channel, and The SoundGirls Podcast. The oral history interviews are typically unedited and will be archived in their original form. Dr. Rebecca Mercuri's life and career have been an eclectic mix of music and technology, with Bachelor's degrees in Classical Guitar (University of the Arts), and Computer Science (Penn State), Master's degrees in Science (Drexel) and Engineering (UPenn), and a Ph.D. in Computer and Information Systems (UPenn). She also holds honorary alumna status from Harvard in recognition of her fellowship year at Radcliffe Institute. In the 1980s, following her undergraduate education, she became an Associate Member of Technical Staff at RCA's David Sarnoff Research Center. Projects there included the development of music software for a personal computer project and the design of the first computer-controlled interactive software and hardware for the RCA VideoDisc player. While in graduate school, during the 1990s, Rebecca was employed at AT&T Bell Laboratories where she pioneered the integration of a holophonic audio system into a collaborative and interactive virtual videoconferencing system. Later, at the University of Pennsylvania, her dissertation, "Electronic Vote Tabulation: Checks & Balances" received immediate attention, as it had been defended 11 days before the 2000 Bush v. Gore election. Dr.
Mercuri continues to provide expert testimony on election controversies, and her recommendations about better ways to ensure accuracy, integrity, and believability in election results have been sought and adopted internationally. Notable Software, Inc., the company Rebecca founded (in 1985) to develop and market educational and music-related software under the Apple Certified Developer Program, now (since 2005) provides a wide range of digital investigative and expert witness services for civil and criminal matters. M.O.R.E project below the video: http://n2re.org/m-o-r-e-project Interview By: Christina Milinusic, an audio professional who has provided live sound reinforcement since founding her company Unity Sound in 2005. She is the current chapter head for SoundGirls Alberta and has experience working as an electronics technician, technical director, and recordist. Full episode - picking up where we left off here: https://youtu.be/VAOEL-VR6yc?t=1863 https://soundgirls.org/soundgirls-living-history-project/
Chester Weiss discusses the latest research from The Leading Edge to successfully use geophysical tools at well sites. Chester shares the impact of well infrastructure on geophysical assessment, how to use EM successfully, the challenges of using near-surface, and the applicability of this research in other cluttered environments. Along with our conversation in episode 141 on the life cycle of a well (https://seg.org/podcast/Post/13689), this episode will help provide the full geophysical picture of working at a well. Chester Weiss is a Distinguished Member of the Technical Staff at Sandia National Laboratories. Visit https://seg.org/podcast to read the full show notes and find the full archive for Seismic Soundoff. RELATED LINKS * Chester J. Weiss, Michael J. Wilt, and Tom Daley, (2022), "Introduction to this special section: Life of the well," The Leading Edge 41: 82–82. (https://library.seg.org/doi/10.1190/tle41020082.1) * Read the special section: Life of the well (https://library.seg.org/toc/leedff/41/2) SPONSOR This episode is sponsored by Geospace Technologies. As the leading innovator and manufacturer of wireless seismic data acquisition systems, Geospace Technologies offers a series of seabed, wireless seismic data acquisition systems designed for extended-duration seabed seismic data acquisition. Geospace is committed to setting new standards for quality, performance, reliability and cost savings to E&P companies and marine geophysical contractors. CREDITS SEG produces Seismic Soundoff to benefit its members, the scientific community, and inform the public on the value of geophysics. To show your support for the show, please leave a 5-star rating on Apple Podcasts and Spotify. It takes less than five seconds to leave a 5-star rating and is the number one action you can take to show your appreciation for this free resource. You can follow the podcast to hear the latest episodes on Apple Podcasts, Google Podcasts, and Spotify. 
Original music created by Zach Bridges. This episode was hosted, edited, and produced by Andrew Geary at 51 features, LLC. Thank you to the SEG podcast team: Jennifer Cobb, Kathy Gamble, and Ally McGinnis.
David remotely sat down with Julia Boes, Senior Member of Technical Staff in Dublin, to discuss the Simple Web Server (SWS). The SWS, introduced in JDK 18, is a minimal web server that serves static files. It comes with a command-line tool and an API. In this episode, Julia explains why another web server might be useful. She explains its goals, its features, and who it is for, but also what it is not! She then goes over the command-line tool, its API, etc.
Why is a systems engineering mindset essential for a scaling startup? In this episode of the Sourcegraph Podcast, Nelson Elhage, creator of the open source code search engine Livegrep, co-creator of the Ruby type checker Sorbet, and Member of Technical Staff at Anthropic, joins Beyang Liu, co-founder and CTO of Sourcegraph, to discuss how Rust is changing the security landscape, explain why Patrick McKenzie, better known as patio11, called his live code search tool “miraculous,” and dive deep into the weeds on the differences between trigram- and suffix-array-based search systems. Along the way, Elhage explains why developer productivity is nonlinear and why investing in developer experience should be axiomatic. Show notes & transcript: https://about.sourcegraph.com/podcast/nelson-elhage/ Sourcegraph: about.sourcegraph.com