POPULARITY
Do you ever feel like you're just going through the motions? Do you ever feel lonely even when you're around people? Today, Jay welcomes back the legendary Deepak Chopra after six years, to discuss the unexpected intersection of spirituality and artificial intelligence. Together they unpack the beautiful blend of ancient wisdom and cutting-edge innovation. Jay reflects on his first meeting with Deepak and how their bond has evolved over the years. The conversation quickly flows into one of the most fascinating questions of our time: Can AI actually support our spiritual growth, rather than distract from it? Deepak shares his insights on the mysteries of the universe, the unseen forces that shape our reality, and why he believes consciousness, not matter, is the foundation of everything. Jay and Deepak break down the fears surrounding AI, from misinformation to war, and why it’s so important for us to grow spiritually alongside our technology. They explore how storytelling, creativity, and love are deeply human qualities that no machine can replicate, and how we can use these gifts to shape a better, more compassionate world. In this interview, you'll learn: How to Ask AI Better Questions to Unlock Wisdom How to Use Technology to Support Emotional and Physical Healing How to Find Purpose Through the Art of Storytelling How to Train AI to Reflect Inclusivity and Diversity How to Embrace Consciousness as the Foundation of Reality How to Create a Sacred Relationship with Technology Deepak reminds us that the answers we’re looking for often begin with a better question. And with the right mindset, even the most advanced technology can become a tool for inner transformation and collective healing. With Love and Gratitude, Jay Shetty Join over 750,000 people to receive my most transformative wisdom directly in your inbox every single week with my free newsletter. Subscribe here. Join Jay for his first ever, On Purpose Live Tour! Tickets are on sale now. Hope to see you there! What We Discuss: 00:00 Intro 01:41 What If the Universe Is Just a Giant Digital Simulation? 12:36 How to Train AI to Unlock Ancient and Hidden Knowledge 13:39 Blending AI and Spirituality to Understand Consciousness 18:23 Could AI Really Lead to Human Extinction? 22:34 What’s Actually Holding Humanity Back From Progress? 23:37 How the Human Brain Transformed Over Time 26:11 The 2 Things That Set Humans Apart From All Other Species 27:16 Can Technology Lead Us to True Peace and Prosperity? 28:10 Will AI Replace Our Jobs or Unlock Human Creativity? 30:46 Do You Think AI Can Ever Have a Soul? 31:45 The Gender and Racial Bias Hidden in AI Systems 32:33 How to Build More Inclusive and Equitable AI Models 33:32 Why a Shared Vision Can Solve Any Problem We Face 36:13 Would You Trust AI to Know You Personally? 36:57 How You can Use AI to Get Better Sleep 37:29 Can AI Actually Give You Good Relationship Advice? 38:07 How AI Can Help You Find and Nurture Love 38:29 Why Personal Growth Solutions Should Never Be Generic 39:41 Your DNA Holds the Footprints of Human History 42:33 Rethinking the Big Bang: What Science Still Can’t Explain 44:31 Is Everything You See Just a Projection? 47:48 Why Fear of the Unknown Limits Our Growth 48:52 Want Better Answers? Ask Better Questions 51:41 The True Secret to Longevity Isn’t What You Think 54:30 How Your Brain Turns Experience Into Reality 55:08 Why Consciousness Is Still Life’s Greatest Mystery 56:28 The First Question You Should Always Ask AI 58:38 How ChatGPT Can Spark Deeper, More Intelligent Questions Episode Resources: Deepak Chopra | Website Deepak Chopra | Instagram Deepak Chopra | TikTok Deepak Chopra | Facebook Deepak Chopra | YouTube Deepak Chopra | X Digital Dharma: How AI Can Elevate Spiritual Intelligence and Personal Well-BeingSee omnystudio.com/listener for privacy information.
Audrey Magee, Booker prize longlisted author and Conor Kostick, author and advocacy officer at the Irish Writers Union explain why they will be protesting outside the Department of Trade over the use of their copyrighted work to train AI models.
The EXACT Objection Handling Framework I Used To Earn $2.4MM/yr In Straight Commissions As A W-2 Sales Rep Link ➡️: https://nepqtraining.com/smv-yt-splt-opt-org Text me if you have any sales, persuasion, or influence questions: +1-480-637-2944 In this episode, Jeremy Miner dives deep with Douglas James, a traffic and sales expert, to discuss how AI is revolutionizing lead generation. Discover how cutting-edge technology helps businesses make smarter decisions faster and why your sales strategy might be leaving money on the table. If you're tired of wasting time with unqualified leads and want to scale your business, this episode is a must-watch! ✅ Resources: JOIN the Sales Revolution: https://www.facebook.com/groups/salesrevolutiongroup Book a "Clarity CALL": https://7thlevelhq.com/book-demo/ ✅ Connect with Me: Follow Jeremy Miner on Facebook: https://www.facebook.com/jeremy.miner.52 Follow Jeremy Miner on Instagram: https://www.instagram.com/jeremyleeminer/ Follow Jeremy Miner on LinkedIn: https://www.linkedin.com/in/jeremyleeminer/ ✅ SUBSCRIBE to My Podcast CLOSERS ARE LOSERS with Jeremy Miner: Subscribe on iTunes: https://podcasts.apple.com/us/podcast/closers-are-losers-with-jeremyminer/id1534365100 Subscribe and Review on Spotify: https://open.spotify.com/show/2kNDyUR7fz9SqBr9iGwfwV?si=uMhsOBP4S_SBaHqAFp4EGg Subscribe and Review on Stitcher: https://www.stitcher.com/show/closers-are-losers-with-jeremy-miner Subscribe and Review on Google Podcasts: https://podcasts.google.com/u/1/feed/aHR0cHM6Ly9jbG9zZXJzYXJlbG9zZXJzLmxpYnN5bi5jb20vcnNz
This episode is sponsored by Netsuite by Oracle, the number one cloud financial system, streamlining accounting, financial management, inventory, HR, and more. NetSuite is offering a one-of-a-kind flexible financing program. Head to https://netsuite.com/EYEONAI to know more. In this episode of Eye on AI, Craig Smith sits down with Barr Moses, Co-Founder & CEO of Monte Carlo, the pioneer of data and AI observability. Together, they explore the hidden force behind every great AI system: reliable, trustworthy data. With AI adoption soaring across industries, companies now face a critical question: Can we trust the data feeding our models? Barr unpacks why data quality is more important than ever, how observability helps detect and resolve data issues, and why clean data—not access to GPT or Claude—is the real competitive moat in AI today. What You'll Learn in This Episode: Why access to AI models is no longer a competitive advantage How Monte Carlo helps teams monitor complex data estates in real-time The dangers of “data hallucinations” and how to prevent them Real-world examples of data failures and their impact on AI outputs The difference between data observability and explainability Why legacy methods of data review no longer work in an AI-first world Stay Updated: Craig Smith on X:https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI (00:00) Intro (01:08) How Monte Carlo Fixed Broken Data (03:08) What Is Data & AI Observability? (05:00) Structured vs Unstructured Data Monitoring (08:48) How Monte Carlo Integrates Across Data Stacks (13:35) Why Clean Data Is the New Competitive Advantage (16:57) How Monte Carlo Uses AI Internally (19:20) 4 Failure Points: Data, Systems, Code, Models (23:08) Can Observability Detect Bias in Data? (26:15) Why Data Quality Needs a Modern Definition (29:22) Explosion of Data Tools & Monte Carlo's 50+ Integrations (33:18) Data Observability vs Explainability (36:18) Human Evaluation vs Automated Monitoring (39:23) What Monte Carlo Looks Like for Users (46:03) How Fast Can You Deploy Monte Carlo? (51:56) Why Manual Data Checks No Longer Work (53:26) The Future of AI Depends on Trustworthy Data
Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis
As enterprise systems shift from deterministic software to probabilistic AI, forward-thinking organizations leverage proactive quality assessment to maximize value, minimize risk and ensure regulatory compliance In today's rapidly evolving technological landscape, ensuring the quality of both traditional software and AI systems has become more critical than ever. Organizations are increasingly relying on complex digital systems to drive innovation and maintain competitive advantage, yet many struggle to effectively evaluate these systems before, during, and after deployment. Yiannis Kanellopoulos is on the forefront of software and AI system quality assessment. He is the founder and CEO of code4thought, a startup specializing in assessing large-scale software systems and AI applications. We connected to explore the challenges and opportunities of quality assessment and share insights both for developers and for people responsible for technology decisions within their organizations. Read the article published on Orchestrate all the Things here: https://linkeddataorchestration.com/2025/04/09/the-quality-imperative-why-leading-organizations-proactively-evaluate-software-and-ai-systems-and-how-you-can-too/
What if AI could think like you—and free up your time in the process? In this episode, Dr. Lauryn welcomes back Callan Faulkner, founder of REI Optimize and one of the most sought-after AI coaches for entrepreneurs, for a mind-expanding conversation on using AI to transform not just your business, but your entire life. From saving time and scaling smarter to aligning tech with your intuition, Callan breaks down exactly how to move from AI curiosity to awakened, empowered implementation.Together, they explore the difference between dabbling in AI and building custom systems that truly serve your vision. You'll learn how to create your own virtual sales assistant, brand copywriter, business coach, meal planner, and even spiritual guide using tools like ChatGPT and Claude. If you've been overwhelmed by the AI conversation—or just want to make it actually useful—this is your roadmap to building tools that think, sound, and act like you.Key Takeaways:AI isn't just for automation—it's a tool for alignment: Callan shares how to create AI systems that reflect your values, voice, and vision so you can scale your business and still feel like you.The difference between a prompt and a project is everything: Learn how to move beyond one-off prompts and instead build custom, reusable systems that save time and increase consistency.Your business deserves a custom AI dream team: From a sales assistant to a brand copywriter, Callan reveals the 3–5 AI projects every entrepreneur should build now.AI can support your personal growth too: Callan discusses how she uses AI as a spiritual coach, a journaling prompt generator, and even a relationship support tool.Guest Bio:Callan Faulkner is the founder of REI Optimize and one of the most sought-after AI coaches for entrepreneurs. She's helped over 500 businesses—from solo practitioners to 9-figure enterprises—use AI to scale smarter, reduce stress, and reclaim their time. Known for making complex technology feel human and actionable, Callan is passionate about building systems that support both business success and personal alignment. Her unique approach blends strategy with soul, helping leaders automate without losing their voice.Enroll in Callan's course, Foundations of AI, and transform how you work in just 10 days (or less!)Join The Effortless Business Bootcamp to automate your systems, multiply your output, and reclaim your freedom!Follow Callan: Instagram | LinkedIn Resources:For those interested in building a profitable personal brand in just two hours a week, check out Dr. Lauryn's new membership group Beyond Brick & Mortar!Sign up for the Weekly Slay newsletter!Follow She Slays and Dr. Lauryn: Website | Instagram | X | LinkedIn |
Epistemic status: This post aims at an ambitious target: improving intuitive understanding directly. The model for why this is worth trying is that I believe we are more bottlenecked by people having good intuitions guiding their research than, for example, by the ability of people to code and run evals. Quite a few ideas in AI safety implicitly use assumptions about individuality that ultimately derive from human experience. When we talk about AIs scheming, alignment faking or goal preservation, we imply there is something scheming or alignment faking or wanting to preserve its goals or escape the datacentre. If the system in question were human, it would be quite clear what that individual system is. When you read about Reinhold Messner reaching the summit of Everest, you would be curious about the climb, but you would not ask if it was his body there, or his [...] ---Outline:(01:38) Individuality in Biology(03:53) Individuality in AI Systems(10:19) Risks and Limitations of Anthropomorphic Individuality Assumptions(11:25) Coordinating Selves(16:19) Whats at Stake: Stories(17:25) Exporting Myself(21:43) The Alignment Whisperers(23:27) Echoes in the Dataset(25:18) Implications for Alignment Research and Policy--- First published: March 28th, 2025 Source: https://www.lesswrong.com/posts/wQKskToGofs4osdJ3/the-pando-problem-rethinking-ai-individuality --- Narrated by TYPE III AUDIO. ---Images from the article:Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Eiso Kant, CTO of poolside AI, discusses the company's approach to building frontier AI foundation models, particularly focused on software development. Their unique strategy is reinforcement learning from code execution feedback which is an important axis for scaling AI capabilities beyond just increasing model size or data volume. Kant predicts human-level AI in knowledge work could be achieved within 18-36 months, outlining poolside's vision to dramatically increase software development productivity and accessibility. SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***Eiso Kant:https://x.com/eisokanthttps://poolside.ai/TRANSCRIPT:https://www.dropbox.com/scl/fi/szepl6taqziyqie9wgmk9/poolside.pdf?rlkey=iqar7dcwshyrpeoz0xa76k422&dl=0TOC:1. Foundation Models and AI Strategy [00:00:00] 1.1 Foundation Models and Timeline Predictions for AI Development [00:02:55] 1.2 Poolside AI's Corporate History and Strategic Vision [00:06:48] 1.3 Foundation Models vs Enterprise Customization Trade-offs2. Reinforcement Learning and Model Economics [00:15:42] 2.1 Reinforcement Learning and Code Execution Feedback Approaches [00:22:06] 2.2 Model Economics and Experimental Optimization3. Enterprise AI Implementation [00:25:20] 3.1 Poolside's Enterprise Deployment Strategy and Infrastructure [00:26:00] 3.2 Enterprise-First Business Model and Market Focus [00:27:05] 3.3 Foundation Models and AGI Development Approach [00:29:24] 3.4 DeepSeek Case Study and Infrastructure Requirements4. LLM Architecture and Performance [00:30:15] 4.1 Distributed Training and Hardware Architecture Optimization [00:33:01] 4.2 Model Scaling Strategies and Chinchilla Optimality Trade-offs [00:36:04] 4.3 Emergent Reasoning and Model Architecture Comparisons [00:43:26] 4.4 Balancing Creativity and Determinism in AI Models [00:50:01] 4.5 AI-Assisted Software Development Evolution5. AI Systems Engineering and Scalability [00:58:31] 5.1 Enterprise AI Productivity and Implementation Challenges [00:58:40] 5.2 Low-Code Solutions and Enterprise Hiring Trends [01:01:25] 5.3 Distributed Systems and Engineering Complexity [01:01:50] 5.4 GenAI Architecture and Scalability Patterns [01:01:55] 5.5 Scaling Limitations and Architectural Patterns in AI Code Generation6. AI Safety and Future Capabilities [01:06:23] 6.1 Semantic Understanding and Language Model Reasoning Approaches [01:12:42] 6.2 Model Interpretability and Safety Considerations in AI Systems [01:16:27] 6.3 AI vs Human Capabilities in Software Development [01:33:45] 6.4 Enterprise Deployment and Security ArchitectureCORE REFS (see shownotes for URLs/more refs):[00:15:45] Research demonstrating how training on model-generated content leads to distribution collapse in AI models, Ilia Shumailov et al. (Key finding on synthetic data risk)[00:20:05] Foundational paper introducing Word2Vec for computing word vector representations, Tomas Mikolov et al. (Seminal NLP technique)[00:22:15] OpenAI O3 model's breakthrough performance on ARC Prize Challenge, OpenAI (Significant AI reasoning benchmark achievement)[00:22:40] Seminal paper proposing a formal definition of intelligence as skill-acquisition efficiency, François Chollet (Influential AI definition/philosophy)[00:30:30] Technical documentation of DeepSeek's V3 model architecture and capabilities, DeepSeek AI (Details on a major new model)[00:34:30] Foundational paper establishing optimal scaling laws for LLM training, Jordan Hoffmann et al. (Key paper on LLM scaling)[00:45:45] Seminal essay arguing that scaling computation consistently trumps human-engineered solutions in AI, Richard S. Sutton (Influential "Bitter Lesson" perspective)
If you're in SF: Join us for the Claude Plays Pokemon hackathon this Sunday!If you're not: Fill out the 2025 State of AI Eng survey for $250 in Amazon cards!We are SO excited to share our conversation with Dharmesh Shah, co-founder of HubSpot and creator of Agent.ai.A particularly compelling concept we discussed is the idea of "hybrid teams" - the next evolution in workplace organization where human workers collaborate with AI agents as team members. Just as we previously saw hybrid teams emerge in terms of full-time vs. contract workers, or in-office vs. remote workers, Dharmesh predicts that the next frontier will be teams composed of both human and AI members. This raises interesting questions about team dynamics, trust, and how to effectively delegate tasks between human and AI team members.The discussion of business models in AI reveals an important distinction between Work as a Service (WaaS) and Results as a Service (RaaS), something Dharmesh has written extensively about. While RaaS has gained popularity, particularly in customer support applications where outcomes are easily measurable, Dharmesh argues that this model may be over-indexed. Not all AI applications have clearly definable outcomes or consistent economic value per transaction, making WaaS more appropriate in many cases. This insight is particularly relevant for businesses considering how to monetize AI capabilities.The technical challenges of implementing effective agent systems are also explored, particularly around memory and authentication. Shah emphasizes the importance of cross-agent memory sharing and the need for more granular control over data access. He envisions a future where users can selectively share parts of their data with different agents, similar to how OAuth works but with much finer control. This points to significant opportunities in developing infrastructure for secure and efficient agent-to-agent communication and data sharing.Other highlights from our conversation* The Evolution of AI-Powered Agents – Exploring how AI agents have evolved from simple chatbots to sophisticated multi-agent systems, and the role of MCPs in enabling that.* Hybrid Digital Teams and the Future of Work – How AI agents are becoming teammates rather than just tools, and what this means for business operations and knowledge work.* Memory in AI Agents – The importance of persistent memory in AI systems and how shared memory across agents could enhance collaboration and efficiency.* Business Models for AI Agents – Exploring the shift from software as a service (SaaS) to work as a service (WaaS) and results as a service (RaaS), and what this means for monetization.* The Role of Standards Like MCP – Why MCP has been widely adopted and how it enables agent collaboration, tool use, and discovery.* The Future of AI Code Generation and Software Engineering – How AI-assisted coding is changing the role of software engineers and what skills will matter most in the future.* Domain Investing and Efficient Markets – Dharmesh's approach to domain investing and how inefficiencies in digital asset markets create business opportunities.* The Philosophy of Saying No – Lessons from "Sorry, You Must Pass" and how prioritization leads to greater productivity and focus.Timestamps* 00:00 Introduction and Guest Welcome* 02:29 Dharmesh Shah's Journey into AI* 05:22 Defining AI Agents* 06:45 The Evolution and Future of AI Agents* 13:53 Graph Theory and Knowledge Representation* 20:02 Engineering Practices and Overengineering* 25:57 The Role of Junior Engineers in the AI Era* 28:20 Multi-Agent Systems and MCP Standards* 35:55 LinkedIn's Legal Battles and Data Scraping* 37:32 The Future of AI and Hybrid Teams* 39:19 Building Agent AI: A Professional Network for Agents* 40:43 Challenges and Innovations in Agent AI* 45:02 The Evolution of UI in AI Systems* 01:00:25 Business Models: Work as a Service vs. Results as a Service* 01:09:17 The Future Value of Engineers* 01:09:51 Exploring the Role of Agents* 01:10:28 The Importance of Memory in AI* 01:11:02 Challenges and Opportunities in AI Memory* 01:12:41 Selective Memory and Privacy Concerns* 01:13:27 The Evolution of AI Tools and Platforms* 01:18:23 Domain Names and AI Projects* 01:32:08 Balancing Work and Personal Life* 01:35:52 Final Thoughts and ReflectionsTranscriptAlessio [00:00:04]: Hey everyone, welcome back to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.swyx [00:00:12]: Hello, and today we're super excited to have Dharmesh Shah to join us. I guess your relevant title here is founder of Agent AI.Dharmesh [00:00:20]: Yeah, that's true for this. Yeah, creator of Agent.ai and co-founder of HubSpot.swyx [00:00:25]: Co-founder of HubSpot, which I followed for many years, I think 18 years now, gonna be 19 soon. And you caught, you know, people can catch up on your HubSpot story elsewhere. I should also thank Sean Puri, who I've chatted with back and forth, who's been, I guess, getting me in touch with your people. But also, I think like, just giving us a lot of context, because obviously, My First Million joined you guys, and they've been chatting with you guys a lot. So for the business side, we can talk about that, but I kind of wanted to engage your CTO, agent, engineer side of things. So how did you get agent religion?Dharmesh [00:01:00]: Let's see. So I've been working, I'll take like a half step back, a decade or so ago, even though actually more than that. So even before HubSpot, the company I was contemplating that I had named for was called Ingenisoft. And the idea behind Ingenisoft was a natural language interface to business software. Now realize this is 20 years ago, so that was a hard thing to do. But the actual use case that I had in mind was, you know, we had data sitting in business systems like a CRM or something like that. And my kind of what I thought clever at the time. Oh, what if we used email as the kind of interface to get to business software? And the motivation for using email is that it automatically works when you're offline. So imagine I'm getting on a plane or I'm on a plane. There was no internet on planes back then. It's like, oh, I'm going through business cards from an event I went to. I can just type things into an email just to have them all in the backlog. When it reconnects, it sends those emails to a processor that basically kind of parses effectively the commands and updates the software, sends you the file, whatever it is. And there was a handful of commands. I was a little bit ahead of the times in terms of what was actually possible. And I reattempted this natural language thing with a product called ChatSpot that I did back 20...swyx [00:02:12]: Yeah, this is your first post-ChatGPT project.Dharmesh [00:02:14]: I saw it come out. Yeah. And so I've always been kind of fascinated by this natural language interface to software. Because, you know, as software developers, myself included, we've always said, oh, we build intuitive, easy-to-use applications. And it's not intuitive at all, right? Because what we're doing is... We're taking the mental model that's in our head of what we're trying to accomplish with said piece of software and translating that into a series of touches and swipes and clicks and things like that. And there's nothing natural or intuitive about it. And so natural language interfaces, for the first time, you know, whatever the thought is you have in your head and expressed in whatever language that you normally use to talk to yourself in your head, you can just sort of emit that and have software do something. And I thought that was kind of a breakthrough, which it has been. And it's gone. So that's where I first started getting into the journey. I started because now it actually works, right? So once we got ChatGPT and you can take, even with a few-shot example, convert something into structured, even back in the ChatGP 3.5 days, it did a decent job in a few-shot example, convert something to structured text if you knew what kinds of intents you were going to have. And so that happened. And that ultimately became a HubSpot project. But then agents intrigued me because I'm like, okay, well, that's the next step here. So chat's great. Love Chat UX. But if we want to do something even more meaningful, it felt like the next kind of advancement is not this kind of, I'm chatting with some software in a kind of a synchronous back and forth model, is that software is going to do things for me in kind of a multi-step way to try and accomplish some goals. So, yeah, that's when I first got started. It's like, okay, what would that look like? Yeah. And I've been obsessed ever since, by the way.Alessio [00:03:55]: Which goes back to your first experience with it, which is like you're offline. Yeah. And you want to do a task. You don't need to do it right now. You just want to queue it up for somebody to do it for you. Yes. As you think about agents, like, let's start at the easy question, which is like, how do you define an agent? Maybe. You mean the hardest question in the universe? Is that what you mean?Dharmesh [00:04:12]: You said you have an irritating take. I do have an irritating take. I think, well, some number of people have been irritated, including within my own team. So I have a very broad definition for agents, which is it's AI-powered software that accomplishes a goal. Period. That's it. And what irritates people about it is like, well, that's so broad as to be completely non-useful. And I understand that. I understand the criticism. But in my mind, if you kind of fast forward months, I guess, in AI years, the implementation of it, and we're already starting to see this, and we'll talk about this, different kinds of agents, right? So I think in addition to having a usable definition, and I like yours, by the way, and we should talk more about that, that you just came out with, the classification of agents actually is also useful, which is, is it autonomous or non-autonomous? Does it have a deterministic workflow? Does it have a non-deterministic workflow? Is it working synchronously? Is it working asynchronously? Then you have the different kind of interaction modes. Is it a chat agent, kind of like a customer support agent would be? You're having this kind of back and forth. Is it a workflow agent that just does a discrete number of steps? So there's all these different flavors of agents. So if I were to draw it in a Venn diagram, I would draw a big circle that says, this is agents, and then I have a bunch of circles, some overlapping, because they're not mutually exclusive. And so I think that's what's interesting, and we're seeing development along a bunch of different paths, right? So if you look at the first implementation of agent frameworks, you look at Baby AGI and AutoGBT, I think it was, not Autogen, that's the Microsoft one. They were way ahead of their time because they assumed this level of reasoning and execution and planning capability that just did not exist, right? So it was an interesting thought experiment, which is what it was. Even the guy that, I'm an investor in Yohei's fund that did Baby AGI. It wasn't ready, but it was a sign of what was to come. And so the question then is, when is it ready? And so lots of people talk about the state of the art when it comes to agents. I'm a pragmatist, so I think of the state of the practical. It's like, okay, well, what can I actually build that has commercial value or solves actually some discrete problem with some baseline of repeatability or verifiability?swyx [00:06:22]: There was a lot, and very, very interesting. I'm not irritated by it at all. Okay. As you know, I take a... There's a lot of anthropological view or linguistics view. And in linguistics, you don't want to be prescriptive. You want to be descriptive. Yeah. So you're a goals guy. That's the key word in your thing. And other people have other definitions that might involve like delegated trust or non-deterministic work, LLM in the loop, all that stuff. The other thing I was thinking about, just the comment on Baby AGI, LGBT. Yeah. In that piece that you just read, I was able to go through our backlog and just kind of track the winter of agents and then the summer now. Yeah. And it's... We can tell the whole story as an oral history, just following that thread. And it's really just like, I think, I tried to explain the why now, right? Like I had, there's better models, of course. There's better tool use with like, they're just more reliable. Yep. Better tools with MCP and all that stuff. And I'm sure you have opinions on that too. Business model shift, which you like a lot. I just heard you talk about RAS with MFM guys. Yep. Cost is dropping a lot. Yep. Inference is getting faster. There's more model diversity. Yep. Yep. I think it's a subtle point. It means that like, you have different models with different perspectives. You don't get stuck in the basin of performance of a single model. Sure. You can just get out of it by just switching models. Yep. Multi-agent research and RL fine tuning. So I just wanted to let you respond to like any of that.Dharmesh [00:07:44]: Yeah. A couple of things. Connecting the dots on the kind of the definition side of it. So we'll get the irritation out of the way completely. I have one more, even more irritating leap on the agent definition thing. So here's the way I think about it. By the way, the kind of word agent, I looked it up, like the English dictionary definition. The old school agent, yeah. Is when you have someone or something that does something on your behalf, like a travel agent or a real estate agent acts on your behalf. It's like proxy, which is a nice kind of general definition. So the other direction I'm sort of headed, and it's going to tie back to tool calling and MCP and things like that, is if you, and I'm not a biologist by any stretch of the imagination, but we have these single-celled organisms, right? Like the simplest possible form of what one would call life. But it's still life. It just happens to be single-celled. And then you can combine cells and then cells become specialized over time. And you have much more sophisticated organisms, you know, kind of further down the spectrum. In my mind, at the most fundamental level, you can almost think of having atomic agents. What is the simplest possible thing that's an agent that can still be called an agent? What is the equivalent of a kind of single-celled organism? And the reason I think that's useful is right now we're headed down the road, which I think is very exciting around tool use, right? That says, okay, the LLMs now can be provided a set of tools that it calls to accomplish whatever it needs to accomplish in the kind of furtherance of whatever goal it's trying to get done. And I'm not overly bothered by it, but if you think about it, if you just squint a little bit and say, well, what if everything was an agent? And what if tools were actually just atomic agents? Because then it's turtles all the way down, right? Then it's like, oh, well, all that's really happening with tool use is that we have a network of agents that know about each other through something like an MMCP and can kind of decompose a particular problem and say, oh, I'm going to delegate this to this set of agents. And why do we need to draw this distinction between tools, which are functions most of the time? And an actual agent. And so I'm going to write this irritating LinkedIn post, you know, proposing this. It's like, okay. And I'm not suggesting we should call even functions, you know, call them agents. But there is a certain amount of elegance that happens when you say, oh, we can just reduce it down to one primitive, which is an agent that you can combine in complicated ways to kind of raise the level of abstraction and accomplish higher order goals. Anyway, that's my answer. I'd say that's a success. Thank you for coming to my TED Talk on agent definitions.Alessio [00:09:54]: How do you define the minimum viable agent? Do you already have a definition for, like, where you draw the line between a cell and an atom? Yeah.Dharmesh [00:10:02]: So in my mind, it has to, at some level, use AI in order for it to—otherwise, it's just software. It's like, you know, we don't need another word for that. And so that's probably where I draw the line. So then the question, you know, the counterargument would be, well, if that's true, then lots of tools themselves are actually not agents because they're just doing a database call or a REST API call or whatever it is they're doing. And that does not necessarily qualify them, which is a fair counterargument. And I accept that. It's like a good argument. I still like to think about—because we'll talk about multi-agent systems, because I think—so we've accepted, which I think is true, lots of people have said it, and you've hopefully combined some of those clips of really smart people saying this is the year of agents, and I completely agree, it is the year of agents. But then shortly after that, it's going to be the year of multi-agent systems or multi-agent networks. I think that's where it's going to be headed next year. Yeah.swyx [00:10:54]: Opening eyes already on that. Yeah. My quick philosophical engagement with you on this. I often think about kind of the other spectrum, the other end of the cell spectrum. So single cell is life, multi-cell is life, and you clump a bunch of cells together in a more complex organism, they become organs, like an eye and a liver or whatever. And then obviously we consider ourselves one life form. There's not like a lot of lives within me. I'm just one life. And now, obviously, I don't think people don't really like to anthropomorphize agents and AI. Yeah. But we are extending our consciousness and our brain and our functionality out into machines. I just saw you were a Bee. Yeah. Which is, you know, it's nice. I have a limitless pendant in my pocket.Dharmesh [00:11:37]: I got one of these boys. Yeah.swyx [00:11:39]: I'm testing it all out. You know, got to be early adopters. But like, we want to extend our personal memory into these things so that we can be good at the things that we're good at. And, you know, machines are good at it. Machines are there. So like, my definition of life is kind of like going outside of my own body now. I don't know if you've ever had like reflections on that. Like how yours. How our self is like actually being distributed outside of you. Yeah.Dharmesh [00:12:01]: I don't fancy myself a philosopher. But you went there. So yeah, I did go there. I'm fascinated by kind of graphs and graph theory and networks and have been for a long, long time. And to me, we're sort of all nodes in this kind of larger thing. It just so happens that we're looking at individual kind of life forms as they exist right now. But so the idea is when you put a podcast out there, there's these little kind of nodes you're putting out there of like, you know, conceptual ideas. Once again, you have varying kind of forms of those little nodes that are up there and are connected in varying and sundry ways. And so I just think of myself as being a node in a massive, massive network. And I'm producing more nodes as I put content or ideas. And, you know, you spend some portion of your life collecting dots, experiences, people, and some portion of your life then connecting dots from the ones that you've collected over time. And I found that really interesting things happen and you really can't know in advance how those dots are necessarily going to connect in the future. And that's, yeah. So that's my philosophical take. That's the, yes, exactly. Coming back.Alessio [00:13:04]: Yep. Do you like graph as an agent? Abstraction? That's been one of the hot topics with LandGraph and Pydantic and all that.Dharmesh [00:13:11]: I do. The thing I'm more interested in terms of use of graphs, and there's lots of work happening on that now, is graph data stores as an alternative in terms of knowledge stores and knowledge graphs. Yeah. Because, you know, so I've been in software now 30 plus years, right? So it's not 10,000 hours. It's like 100,000 hours that I've spent doing this stuff. And so I've grew up with, so back in the day, you know, I started on mainframes. There was a product called IMS from IBM, which is basically an index database, what we'd call like a key value store today. Then we've had relational databases, right? We have tables and columns and foreign key relationships. We all know that. We have document databases like MongoDB, which is sort of a nested structure keyed by a specific index. We have vector stores, vector embedding database. And graphs are interesting for a couple of reasons. One is, so it's not classically structured in a relational way. When you say structured database, to most people, they're thinking tables and columns and in relational database and set theory and all that. Graphs still have structure, but it's not the tables and columns structure. And you could wonder, and people have made this case, that they are a better representation of knowledge for LLMs and for AI generally than other things. So that's kind of thing number one conceptually, and that might be true, I think is possibly true. And the other thing that I really like about that in the context of, you know, I've been in the context of data stores for RAG is, you know, RAG, you say, oh, I have a million documents, I'm going to build the vector embeddings, I'm going to come back with the top X based on the semantic match, and that's fine. All that's very, very useful. But the reality is something gets lost in the chunking process and the, okay, well, those tend, you know, like, you don't really get the whole picture, so to speak, and maybe not even the right set of dimensions on the kind of broader picture. And it makes intuitive sense to me that if we did capture it properly in a graph form, that maybe that feeding into a RAG pipeline will actually yield better results for some use cases, I don't know, but yeah.Alessio [00:15:03]: And do you feel like at the core of it, there's this difference between imperative and declarative programs? Because if you think about HubSpot, it's like, you know, people and graph kind of goes hand in hand, you know, but I think maybe the software before was more like primary foreign key based relationship, versus now the models can traverse through the graph more easily.Dharmesh [00:15:22]: Yes. So I like that representation. There's something. It's just conceptually elegant about graphs and just from the representation of it, they're much more discoverable, you can kind of see it, there's observability to it, versus kind of embeddings, which you can't really do much with as a human. You know, once they're in there, you can't pull stuff back out. But yeah, I like that kind of idea of it. And the other thing that's kind of, because I love graphs, I've been long obsessed with PageRank from back in the early days. And, you know, one of the kind of simplest algorithms in terms of coming up, you know, with a phone, everyone's been exposed to PageRank. And the idea is that, and so I had this other idea for a project, not a company, and I have hundreds of these, called NodeRank, is to be able to take the idea of PageRank and apply it to an arbitrary graph that says, okay, I'm going to define what authority looks like and say, okay, well, that's interesting to me, because then if you say, I'm going to take my knowledge store, and maybe this person that contributed some number of chunks to the graph data store has more authority on this particular use case or prompt that's being submitted than this other one that may, or maybe this one was more. popular, or maybe this one has, whatever it is, there should be a way for us to kind of rank nodes in a graph and sort them in some, some useful way. Yeah.swyx [00:16:34]: So I think that's generally useful for, for anything. I think the, the problem, like, so even though at my conferences, GraphRag is super popular and people are getting knowledge, graph religion, and I will say like, it's getting space, getting traction in two areas, conversation memory, and then also just rag in general, like the, the, the document data. Yeah. It's like a source. Most ML practitioners would say that knowledge graph is kind of like a dirty word. The graph database, people get graph religion, everything's a graph, and then they, they go really hard into it and then they get a, they get a graph that is too complex to navigate. Yes. And so like the, the, the simple way to put it is like you at running HubSpot, you know, the power of graphs, the way that Google has pitched them for many years, but I don't suspect that HubSpot itself uses a knowledge graph. No. Yeah.Dharmesh [00:17:26]: So when is it over engineering? Basically? It's a great question. I don't know. So the question now, like in AI land, right, is the, do we necessarily need to understand? So right now, LLMs for, for the most part are somewhat black boxes, right? We sort of understand how the, you know, the algorithm itself works, but we really don't know what's going on in there and, and how things come out. So if a graph data store is able to produce the outcomes we want, it's like, here's a set of queries I want to be able to submit and then it comes out with useful content. Maybe the underlying data store is as opaque as a vector embeddings or something like that, but maybe it's fine. Maybe we don't necessarily need to understand it to get utility out of it. And so maybe if it's messy, that's okay. Um, that's, it's just another form of lossy compression. Uh, it's just lossy in a way that we just don't completely understand in terms of, because it's going to grow organically. Uh, and it's not structured. It's like, ah, we're just gonna throw a bunch of stuff in there. Let the, the equivalent of the embedding algorithm, whatever they called in graph land. Um, so the one with the best results wins. I think so. Yeah.swyx [00:18:26]: Or is this the practical side of me is like, yeah, it's, if it's useful, we don't necessarilyDharmesh [00:18:30]: need to understand it.swyx [00:18:30]: I have, I mean, I'm happy to push back as long as you want. Uh, it's not practical to evaluate like the 10 different options out there because it takes time. It takes people, it takes, you know, resources, right? Set. That's the first thing. Second thing is your evals are typically on small things and some things only work at scale. Yup. Like graphs. Yup.Dharmesh [00:18:46]: Yup. That's, yeah, no, that's fair. And I think this is one of the challenges in terms of implementation of graph databases is that the most common approach that I've seen developers do, I've done it myself, is that, oh, I've got a Postgres database or a MySQL or whatever. I can represent a graph with a very set of tables with a parent child thing or whatever. And that sort of gives me the ability, uh, why would I need anything more than that? And the answer is, well, if you don't need anything more than that, you don't need anything more than that. But there's a high chance that you're sort of missing out on the actual value that, uh, the graph representation gives you. Which is the ability to traverse the graph, uh, efficiently in ways that kind of going through the, uh, traversal in a relational database form, even though structurally you have the data, practically you're not gonna be able to pull it out in, in useful ways. Uh, so you wouldn't like represent a social graph, uh, in, in using that kind of relational table model. It just wouldn't scale. It wouldn't work.swyx [00:19:36]: Uh, yeah. Uh, I think we want to move on to MCP. Yeah. But I just want to, like, just engineering advice. Yeah. Uh, obviously you've, you've, you've run, uh, you've, you've had to do a lot of projects and run a lot of teams. Do you have a general rule for over-engineering or, you know, engineering ahead of time? You know, like, because people, we know premature engineering is the root of all evil. Yep. But also sometimes you just have to. Yep. When do you do it? Yes.Dharmesh [00:19:59]: It's a great question. This is, uh, a question as old as time almost, which is what's the right and wrong levels of abstraction. That's effectively what, uh, we're answering when we're trying to do engineering. I tend to be a pragmatist, right? So here's the thing. Um, lots of times doing something the right way. Yeah. It's like a marginal increased cost in those cases. Just do it the right way. And this is what makes a, uh, a great engineer or a good engineer better than, uh, a not so great one. It's like, okay, all things being equal. If it's going to take you, you know, roughly close to constant time anyway, might as well do it the right way. Like, so do things well, then the question is, okay, well, am I building a framework as the reusable library? To what degree, uh, what am I anticipating in terms of what's going to need to change in this thing? Uh, you know, along what dimension? And then I think like a business person in some ways, like what's the return on calories, right? So, uh, and you look at, um, energy, the expected value of it's like, okay, here are the five possible things that could happen, uh, try to assign probabilities like, okay, well, if there's a 50% chance that we're going to go down this particular path at some day, like, or one of these five things is going to happen and it costs you 10% more to engineer for that. It's basically, it's something that yields a kind of interest compounding value. Um, as you get closer to the time of, of needing that versus having to take on debt, which is when you under engineer it, you're taking on debt. You're going to have to pay off when you do get to that eventuality where something happens. One thing as a pragmatist, uh, so I would rather under engineer something than over engineer it. If I were going to err on the side of something, and here's the reason is that when you under engineer it, uh, yes, you take on tech debt, uh, but the interest rate is relatively known and payoff is very, very possible, right? Which is, oh, I took a shortcut here as a result of which now this thing that should have taken me a week is now going to take me four weeks. Fine. But if that particular thing that you thought might happen, never actually, you never have that use case transpire or just doesn't, it's like, well, you just save yourself time, right? And that has value because you were able to do other things instead of, uh, kind of slightly over-engineering it away, over-engineering it. But there's no perfect answers in art form in terms of, uh, and yeah, we'll, we'll bring kind of this layers of abstraction back on the code generation conversation, which we'll, uh, I think I have later on, butAlessio [00:22:05]: I was going to ask, we can just jump ahead quickly. Yeah. Like, as you think about vibe coding and all that, how does the. Yeah. Percentage of potential usefulness change when I feel like we over-engineering a lot of times it's like the investment in syntax, it's less about the investment in like arc exacting. Yep. Yeah. How does that change your calculus?Dharmesh [00:22:22]: A couple of things, right? One is, um, so, you know, going back to that kind of ROI or a return on calories, kind of calculus or heuristic you think through, it's like, okay, well, what is it going to cost me to put this layer of abstraction above the code that I'm writing now, uh, in anticipating kind of future needs. If the cost of fixing, uh, or doing under engineering right now. Uh, we'll trend towards zero that says, okay, well, I don't have to get it right right now because even if I get it wrong, I'll run the thing for six hours instead of 60 minutes or whatever. It doesn't really matter, right? Like, because that's going to trend towards zero to be able, the ability to refactor a code. Um, and because we're going to not that long from now, we're going to have, you know, large code bases be able to exist, uh, you know, as, as context, uh, for a code generation or a code refactoring, uh, model. So I think it's going to make it, uh, make the case for under engineering, uh, even stronger. Which is why I take on that cost. You just pay the interest when you get there, it's not, um, just go on with your life vibe coded and, uh, come back when you need to. Yeah.Alessio [00:23:18]: Sometimes I feel like there's no decision-making in some things like, uh, today I built a autosave for like our internal notes platform and I literally just ask them cursor. Can you add autosave? Yeah. I don't know if it's over under engineer. Yep. I just vibe coded it. Yep. And I feel like at some point we're going to get to the point where the models kindDharmesh [00:23:36]: of decide where the right line is, but this is where the, like the, in my mind, the danger is, right? So there's two sides to this. One is the cost of kind of development and coding and things like that stuff that, you know, we talk about. But then like in your example, you know, one of the risks that we have is that because adding a feature, uh, like a save or whatever the feature might be to a product as that price tends towards zero, are we going to be less discriminant about what features we add as a result of making more product products more complicated, which has a negative impact on the user and navigate negative impact on the business. Um, and so that's the thing I worry about if it starts to become too easy, are we going to be. Too promiscuous in our, uh, kind of extension, adding product extensions and things like that. It's like, ah, why not add X, Y, Z or whatever back then it was like, oh, we only have so many engineering hours or story points or however you measure things. Uh, that least kept us in check a little bit. Yeah.Alessio [00:24:22]: And then over engineering, you're like, yeah, it's kind of like you're putting that on yourself. Yeah. Like now it's like the models don't understand that if they add too much complexity, it's going to come back to bite them later. Yep. So they just do whatever they want to do. Yeah. And I'm curious where in the workflow that's going to be, where it's like, Hey, this is like the amount of complexity and over-engineering you can do before you got to ask me if we should actually do it versus like do something else.Dharmesh [00:24:45]: So you know, we've already, let's like, we're leaving this, uh, in the code generation world, this kind of compressed, um, cycle time. Right. It's like, okay, we went from auto-complete, uh, in the GitHub co-pilot to like, oh, finish this particular thing and hit tab to a, oh, I sort of know your file or whatever. I can write out a full function to you to now I can like hold a bunch of the context in my head. Uh, so we can do app generation, which we have now with lovable and bolt and repletage. Yeah. Association and other things. So then the question is, okay, well, where does it naturally go from here? So we're going to generate products. Make sense. We might be able to generate platforms as though I want a platform for ERP that does this, whatever. And that includes the API's includes the product and the UI, and all the things that make for a platform. There's no nothing that says we would stop like, okay, can you generate an entire software company someday? Right. Uh, with the platform and the monetization and the go-to-market and the whatever. And you know, that that's interesting to me in terms of, uh, you know, what, when you take it to almost ludicrous levels. of abstract.swyx [00:25:39]: It's like, okay, turn it to 11. You mentioned vibe coding, so I have to, this is a blog post I haven't written, but I'm kind of exploring it. Is the junior engineer dead?Dharmesh [00:25:49]: I don't think so. I think what will happen is that the junior engineer will be able to, if all they're bringing to the table is the fact that they are a junior engineer, then yes, they're likely dead. But hopefully if they can communicate with carbon-based life forms, they can interact with product, if they're willing to talk to customers, they can take their kind of basic understanding of engineering and how kind of software works. I think that has value. So I have a 14-year-old right now who's taking Python programming class, and some people ask me, it's like, why is he learning coding? And my answer is, is because it's not about the syntax, it's not about the coding. What he's learning is like the fundamental thing of like how things work. And there's value in that. I think there's going to be timeless value in systems thinking and abstractions and what that means. And whether functions manifested as math, which he's going to get exposed to regardless, or there are some core primitives to the universe, I think, that the more you understand them, those are what I would kind of think of as like really large dots in your life that will have a higher gravitational pull and value to them that you'll then be able to. So I want him to collect those dots, and he's not resisting. So it's like, okay, while he's still listening to me, I'm going to have him do things that I think will be useful.swyx [00:26:59]: You know, part of one of the pitches that I evaluated for AI engineer is a term. And the term is that maybe the traditional interview path or career path of software engineer goes away, which is because what's the point of lead code? Yeah. And, you know, it actually matters more that you know how to work with AI and to implement the things that you want. Yep.Dharmesh [00:27:16]: That's one of the like interesting things that's happened with generative AI. You know, you go from machine learning and the models and just that underlying form, which is like true engineering, right? Like the actual, what I call real engineering. I don't think of myself as a real engineer, actually. I'm a developer. But now with generative AI. We call it AI and it's obviously got its roots in machine learning, but it just feels like fundamentally different to me. Like you have the vibe. It's like, okay, well, this is just a whole different approach to software development to so many different things. And so I'm wondering now, it's like an AI engineer is like, if you were like to draw the Venn diagram, it's interesting because the cross between like AI things, generative AI and what the tools are capable of, what the models do, and this whole new kind of body of knowledge that we're still building out, it's still very young, intersected with kind of classic engineering, software engineering. Yeah.swyx [00:28:04]: I just described the overlap as it separates out eventually until it's its own thing, but it's starting out as a software. Yeah.Alessio [00:28:11]: That makes sense. So to close the vibe coding loop, the other big hype now is MCPs. Obviously, I would say Cloud Desktop and Cursor are like the two main drivers of MCP usage. I would say my favorite is the Sentry MCP. I can pull in errors and then you can just put the context in Cursor. How do you think about that abstraction layer? Does it feel... Does it feel almost too magical in a way? Do you think it's like you get enough? Because you don't really see how the server itself is then kind of like repackaging theDharmesh [00:28:41]: information for you? I think MCP as a standard is one of the better things that's happened in the world of AI because a standard needed to exist and absent a standard, there was a set of things that just weren't possible. Now, we can argue whether it's the best possible manifestation of a standard or not. Does it do too much? Does it do too little? I get that, but it's just simple enough to both be useful and unobtrusive. It's understandable and adoptable by mere mortals, right? It's not overly complicated. You know, a reasonable engineer can put a stand up an MCP server relatively easily. The thing that has me excited about it is like, so I'm a big believer in multi-agent systems. And so that's going back to our kind of this idea of an atomic agent. So imagine the MCP server, like obviously it calls tools, but the way I think about it, so I'm working on my current passion project is agent.ai. And we'll talk more about that in a little bit. More about the, I think we should, because I think it's interesting not to promote the project at all, but there's some interesting ideas in there. One of which is around, we're going to need a mechanism for, if agents are going to collaborate and be able to delegate, there's going to need to be some form of discovery and we're going to need some standard way. It's like, okay, well, I just need to know what this thing over here is capable of. We're going to need a registry, which Anthropic's working on. I'm sure others will and have been doing directories of, and there's going to be a standard around that too. How do you build out a directory of MCP servers? I think that's going to unlock so many things just because, and we're already starting to see it. So I think MCP or something like it is going to be the next major unlock because it allows systems that don't know about each other, don't need to, it's that kind of decoupling of like Sentry and whatever tools someone else was building. And it's not just about, you know, Cloud Desktop or things like, even on the client side, I think we're going to see very interesting consumers of MCP, MCP clients versus just the chat body kind of things. Like, you know, Cloud Desktop and Cursor and things like that. But yeah, I'm very excited about MCP in that general direction.swyx [00:30:39]: I think the typical cynical developer take, it's like, we have OpenAPI. Yeah. What's the new thing? I don't know if you have a, do you have a quick MCP versus everything else? Yeah.Dharmesh [00:30:49]: So it's, so I like OpenAPI, right? So just a descriptive thing. It's OpenAPI. OpenAPI. Yes, that's what I meant. So it's basically a self-documenting thing. We can do machine-generated, lots of things from that output. It's a structured definition of an API. I get that, love it. But MCPs sort of are kind of use case specific. They're perfect for exactly what we're trying to use them for around LLMs in terms of discovery. It's like, okay, I don't necessarily need to know kind of all this detail. And so right now we have, we'll talk more about like MCP server implementations, but We will? I think, I don't know. Maybe we won't. At least it's in my head. It's like a back processor. But I do think MCP adds value above OpenAPI. It's, yeah, just because it solves this particular thing. And if we had come to the world, which we have, like, it's like, hey, we already have OpenAPI. It's like, if that were good enough for the universe, the universe would have adopted it already. There's a reason why MCP is taking office because marginally adds something that was missing before and doesn't go too far. And so that's why the kind of rate of adoption, you folks have written about this and talked about it. Yeah, why MCP won. Yeah. And it won because the universe decided that this was useful and maybe it gets supplanted by something else. Yeah. And maybe we discover, oh, maybe OpenAPI was good enough the whole time. I doubt that.swyx [00:32:09]: The meta lesson, this is, I mean, he's an investor in DevTools companies. I work in developer experience at DevRel in DevTools companies. Yep. Everyone wants to own the standard. Yeah. I'm sure you guys have tried to launch your own standards. Actually, it's Houseplant known for a standard, you know, obviously inbound marketing. But is there a standard or protocol that you ever tried to push? No.Dharmesh [00:32:30]: And there's a reason for this. Yeah. Is that? And I don't mean, need to mean, speak for the people of HubSpot, but I personally. You kind of do. I'm not smart enough. That's not the, like, I think I have a. You're smart. Not enough for that. I'm much better off understanding the standards that are out there. And I'm more on the composability side. Let's, like, take the pieces of technology that exist out there, combine them in creative, unique ways. And I like to consume standards. I don't like to, and that's not that I don't like to create them. I just don't think I have the, both the raw wattage or the credibility. It's like, okay, well, who the heck is Dharmesh, and why should we adopt a standard he created?swyx [00:33:07]: Yeah, I mean, there are people who don't monetize standards, like OpenTelemetry is a big standard, and LightStep never capitalized on that.Dharmesh [00:33:15]: So, okay, so if I were to do a standard, there's two things that have been in my head in the past. I was one around, a very, very basic one around, I don't even have the domain, I have a domain for everything, for open marketing. Because the issue we had in HubSpot grew up in the marketing space. There we go. There was no standard around data formats and things like that. It doesn't go anywhere. But the other one, and I did not mean to go here, but I'm going to go here. It's called OpenGraph. I know the term was already taken, but it hasn't been used for like 15 years now for its original purpose. But what I think should exist in the world is right now, our information, all of us, nodes are in the social graph at Meta or the professional graph at LinkedIn. Both of which are actually relatively closed in actually very annoying ways. Like very, very closed, right? Especially LinkedIn. Especially LinkedIn. I personally believe that if it's my data, and if I would get utility out of it being open, I should be able to make my data open or publish it in whatever forms that I choose, as long as I have control over it as opt-in. So the idea is around OpenGraph that says, here's a standard, here's a way to publish it. I should be able to go to OpenGraph.org slash Dharmesh dot JSON and get it back. And it's like, here's your stuff, right? And I can choose along the way and people can write to it and I can prove. And there can be an entire system. And if I were to do that, I would do it as a... Like a public benefit, non-profit-y kind of thing, as this is a contribution to society. I wouldn't try to commercialize that. Have you looked at AdProto? What's that? AdProto.swyx [00:34:43]: It's the protocol behind Blue Sky. Okay. My good friend, Dan Abramov, who was the face of React for many, many years, now works there. And he actually did a talk that I can send you, which basically kind of tries to articulate what you just said. But he does, he loves doing these like really great analogies, which I think you'll like. Like, you know, a lot of our data is behind a handle, behind a domain. Yep. So he's like, all right, what if we flip that? What if it was like our handle and then the domain? Yep. So, and that's really like your data should belong to you. Yep. And I should not have to wait 30 days for my Twitter data to export. Yep.Dharmesh [00:35:19]: you should be able to at least be able to automate it or do like, yes, I should be able to plug it into an agentic thing. Yeah. Yes. I think we're... Because so much of our data is... Locked up. I think the trick here isn't that standard. It is getting the normies to care.swyx [00:35:37]: Yeah. Because normies don't care.Dharmesh [00:35:38]: That's true. But building on that, normies don't care. So, you know, privacy is a really hot topic and an easy word to use, but it's not a binary thing. Like there are use cases where, and we make these choices all the time, that I will trade, not all privacy, but I will trade some privacy for some productivity gain or some benefit to me that says, oh, I don't care about that particular data being online if it gives me this in return, or I don't mind sharing this information with this company.Alessio [00:36:02]: If I'm getting, you know, this in return, but that sort of should be my option. I think now with computer use, you can actually automate some of the exports. Yes. Like something we've been doing internally is like everybody exports their LinkedIn connections. Yep. And then internally, we kind of merge them together to see how we can connect our companies to customers or things like that.Dharmesh [00:36:21]: And not to pick on LinkedIn, but since we're talking about it, but they feel strongly enough on the, you know, do not take LinkedIn data that they will block even browser use kind of things or whatever. They go to great, great lengths, even to see patterns of usage. And it says, oh, there's no way you could have, you know, gotten that particular thing or whatever without, and it's, so it's, there's...swyx [00:36:42]: Wasn't there a Supreme Court case that they lost? Yeah.Dharmesh [00:36:45]: So the one they lost was around someone that was scraping public data that was on the public internet. And that particular company had not signed any terms of service or whatever. It's like, oh, I'm just taking data that's on, there was no, and so that's why they won. But now, you know, the question is around, can LinkedIn... I think they can. Like, when you use, as a user, you use LinkedIn, you are signing up for their terms of service. And if they say, well, this kind of use of your LinkedIn account that violates our terms of service, they can shut your account down, right? They can. And they, yeah, so, you know, we don't need to make this a discussion. By the way, I love the company, don't get me wrong. I'm an avid user of the product. You know, I've got... Yeah, I mean, you've got over a million followers on LinkedIn, I think. Yeah, I do. And I've known people there for a long, long time, right? And I have lots of respect. And I understand even where the mindset originally came from of this kind of members-first approach to, you know, a privacy-first. I sort of get that. But sometimes you sort of have to wonder, it's like, okay, well, that was 15, 20 years ago. There's likely some controlled ways to expose some data on some member's behalf and not just completely be a binary. It's like, no, thou shalt not have the data.swyx [00:37:54]: Well, just pay for sales navigator.Alessio [00:37:57]: Before we move to the next layer of instruction, anything else on MCP you mentioned? Let's move back and then I'll tie it back to MCPs.Dharmesh [00:38:05]: So I think the... Open this with agent. Okay, so I'll start with... Here's my kind of running thesis, is that as AI and agents evolve, which they're doing very, very quickly, we're going to look at them more and more. I don't like to anthropomorphize. We'll talk about why this is not that. Less as just like raw tools and more like teammates. They'll still be software. They should self-disclose as being software. I'm totally cool with that. But I think what's going to happen is that in the same way you might collaborate with a team member on Slack or Teams or whatever you use, you can imagine a series of agents that do specific things just like a team member might do, that you can delegate things to. You can collaborate. You can say, hey, can you take a look at this? Can you proofread that? Can you try this? You can... Whatever it happens to be. So I think it is... I will go so far as to say it's inevitable that we're going to have hybrid teams someday. And what I mean by hybrid teams... So back in the day, hybrid teams were, oh, well, you have some full-time employees and some contractors. Then it was like hybrid teams are some people that are in the office and some that are remote. That's the kind of form of hybrid. The next form of hybrid is like the carbon-based life forms and agents and AI and some form of software. So let's say we temporarily stipulate that I'm right about that over some time horizon that eventually we're going to have these kind of digitally hybrid teams. So if that's true, then the question you sort of ask yourself is that then what needs to exist in order for us to get the full value of that new model? It's like, okay, well... You sort of need to... It's like, okay, well, how do I... If I'm building a digital team, like, how do I... Just in the same way, if I'm interviewing for an engineer or a designer or a PM, whatever, it's like, well, that's why we have professional networks, right? It's like, oh, they have a presence on likely LinkedIn. I can go through that semi-structured, structured form, and I can see the experience of whatever, you know, self-disclosed. But, okay, well, agents are going to need that someday. And so I'm like, okay, well, this seems like a thread that's worth pulling on. That says, okay. So I... So agent.ai is out there. And it's LinkedIn for agents. It's LinkedIn for agents. It's a professional network for agents. And the more I pull on that thread, it's like, okay, well, if that's true, like, what happens, right? It's like, oh, well, they have a profile just like anyone else, just like a human would. It's going to be a graph underneath, just like a professional network would be. It's just that... And you can have its, you know, connections and follows, and agents should be able to post. That's maybe how they do release notes. Like, oh, I have this new version. Whatever they decide to post, it should just be able to... Behave as a node on the network of a professional network. As it turns out, the more I think about that and pull on that thread, the more and more things, like, start to make sense to me. So it may be more than just a pure professional network. So my original thought was, okay, well, it's a professional network and agents as they exist out there, which I think there's going to be more and more of, will kind of exist on this network and have the profile. But then, and this is always dangerous, I'm like, okay, I want to see a world where thousands of agents are out there in order for the... Because those digital employees, the digital workers don't exist yet in any meaningful way. And so then I'm like, oh, can I make that easier for, like... And so I have, as one does, it's like, oh, I'll build a low-code platform for building agents. How hard could that be, right? Like, very hard, as it turns out. But it's been fun. So now, agent.ai has 1.3 million users. 3,000 people have actually, you know, built some variation of an agent, sometimes just for their own personal productivity. About 1,000 of which have been published. And the reason this comes back to MCP for me, so imagine that and other networks, since I know agent.ai. So right now, we have an MCP server for agent.ai that exposes all the internally built agents that we have that do, like, super useful things. Like, you know, I have access to a Twitter API that I can subsidize the cost. And I can say, you know, if you're looking to build something for social media, these kinds of things, with a single API key, and it's all completely free right now, I'm funding it. That's a useful way for it to work. And then we have a developer to say, oh, I have this idea. I don't have to worry about open AI. I don't have to worry about, now, you know, this particular model is better. It has access to all the models with one key. And we proxy it kind of behind the scenes. And then expose it. So then we get this kind of community effect, right? That says, oh, well, someone else may have built an agent to do X. Like, I have an agent right now that I built for myself to do domain valuation for website domains because I'm obsessed with domains, right? And, like, there's no efficient market for domains. There's no Zillow for domains right now that tells you, oh, here are what houses in your neighborhood sold for. It's like, well, why doesn't that exist? We should be able to solve that problem. And, yes, you're still guessing. Fine. There should be some simple heuristic. So I built that. It's like, okay, well, let me go look for past transactions. You say, okay, I'm going to type in agent.ai, agent.com, whatever domain. What's it actually worth? I'm looking at buying it. It can go and say, oh, which is what it does. It's like, I'm going to go look at are there any published domain transactions recently that are similar, either use the same word, same top-level domain, whatever it is. And it comes back with an approximate value, and it comes back with its kind of rationale for why it picked the value and comparable transactions. Oh, by the way, this domain sold for published. Okay. So that agent now, let's say, existed on the web, on agent.ai. Then imagine someone else says, oh, you know, I want to build a brand-building agent for startups and entrepreneurs to come up with names for their startup. Like a common problem, every startup is like, ah, I don't know what to call it. And so they type in five random words that kind of define whatever their startup is. And you can do all manner of things, one of which is like, oh, well, I need to find the domain for it. What are possible choices? Now it's like, okay, well, it would be nice to know if there's an aftermarket price for it, if it's listed for sale. Awesome. Then imagine calling this valuation agent. It's like, okay, well, I want to find where the arbitrage is, where the agent valuation tool says this thing is worth $25,000. It's listed on GoDaddy for $5,000. It's close enough. Let's go do that. Right? And that's a kind of composition use case that in my future state. Thousands of agents on the network, all discoverable through something like MCP. And then you as a developer of agents have access to all these kind of Lego building blocks based on what you're trying to solve. Then you blend in orchestration, which is getting better and better with the reasoning models now. Just describe the problem that you have. Now, the next layer that we're all contending with is that how many tools can you actually give an LLM before the LLM breaks? That number used to be like 15 or 20 before you kind of started to vary dramatically. And so that's the thing I'm thinking about now. It's like, okay, if I want to... If I want to expose 1,000 of these agents to a given LLM, obviously I can't give it all 1,000. Is there some intermediate layer that says, based on your prompt, I'm going to make a best guess at which agents might be able to be helpful for this particular thing? Yeah.Alessio [00:44:37]: Yeah, like RAG for tools. Yep. I did build the Latent Space Researcher on agent.ai. Okay. Nice. Yeah, that seems like, you know, then there's going to be a Latent Space Scheduler. And then once I schedule a research, you know, and you build all of these things. By the way, my apologies for the user experience. You realize I'm an engineer. It's pretty good.swyx [00:44:56]: I think it's a normie-friendly thing. Yeah. That's your magic. HubSpot does the same thing.Alessio [00:45:01]: Yeah, just to like quickly run through it. You can basically create all these different steps. And these steps are like, you know, static versus like variable-driven things. How did you decide between this kind of like low-code-ish versus doing, you know, low-code with code backend versus like not exposing that at all? Any fun design decisions? Yeah. And this is, I think...Dharmesh [00:45:22]: I think lots of people are likely sitting in exactly my position right now, coming through the choosing between deterministic. Like if you're like in a business or building, you know, some sort of agentic thing, do you decide to do a deterministic thing? Or do you go non-deterministic and just let the alum handle it, right, with the reasoning models? The original idea and the reason I took the low-code stepwise, a very deterministic approach. A, the reasoning models did not exist at that time. That's thing number one. Thing number two is if you can get... If you know in your head... If you know in your head what the actual steps are to accomplish whatever goal, why would you leave that to chance? There's no upside. There's literally no upside. Just tell me, like, what steps do you need executed? So right now what I'm playing with... So one thing we haven't talked about yet, and people don't talk about UI and agents. Right now, the primary interaction model... Or they don't talk enough about it. I know some people have. But it's like, okay, so we're used to the chatbot back and forth. Fine. I get that. But I think we're going to move to a blend of... Some of those things are going to be synchronous as they are now. But some are going to be... Some are going to be async. It's just going to put it in a queue, just like... And this goes back to my... Man, I talk fast. But I have this... I only have one other speed. It's even faster. So imagine it's like if you're working... So back to my, oh, we're going to have these hybrid digital teams. Like, you would not go to a co-worker and say, I'm going to ask you to do this thing, and then sit there and wait for them to go do it. Like, that's not how the world works. So it's nice to be able to just, like, hand something off to someone. It's like, okay, well, maybe I expect a response in an hour or a day or something like that.Dharmesh [00:46:52]: In terms of when things need to happen. So the UI around agents. So if you look at the output of agent.ai agents right now, they are the simplest possible manifestation of a UI, right? That says, oh, we have inputs of, like, four different types. Like, we've got a dropdown, we've got multi-select, all the things. It's like back in HTML, the original HTML 1.0 days, right? Like, you're the smallest possible set of primitives for a UI. And it just says, okay, because we need to collect some information from the user, and then we go do steps and do things. And generate some output in HTML or markup are the two primary examples. So the thing I've been asking myself, if I keep going down that path. So people ask me, I get requests all the time. It's like, oh, can you make the UI sort of boring? I need to be able to do this, right? And if I keep pulling on that, it's like, okay, well, now I've built an entire UI builder thing. Where does this end? And so I think the right answer, and this is what I'm going to be backcoding once I get done here, is around injecting a code generation UI generation into, the agent.ai flow, right? As a builder, you're like, okay, I'm going to describe the thing that I want, much like you would do in a vibe coding world. But instead of generating the entire app, it's going to generate the UI that exists at some point in either that deterministic flow or something like that. It says, oh, here's the thing I'm trying to do. Go generate the UI for me. And I can go through some iterations. And what I think of it as a, so it's like, I'm going to generate the code, generate the code, tweak it, go through this kind of prompt style, like we do with vibe coding now. And at some point, I'm going to be happy with it. And I'm going to hit save. And that's going to become the action in that particular step. It's like a caching of the generated code that I can then, like incur any inference time costs. It's just the actual code at that point.Alessio [00:48:29]: Yeah, I invested in a company called E2B, which does code sandbox. And they powered the LM arena web arena. So it's basically the, just like you do LMS, like text to text, they do the same for like UI generation. So if you're asking a model, how do you do it? But yeah, I think that's kind of where.Dharmesh [00:48:45]: That's the thing I'm really fascinated by. So the early LLM, you know, we're understandably, but laughably bad at simple arithmetic, right? That's the thing like my wife, Normies would ask us, like, you call this AI, like it can't, my son would be like, it's just stupid. It can't even do like simple arithmetic. And then like we've discovered over time that, and there's a reason for this, right? It's like, it's a large, there's, you know, the word language is in there for a reason in terms of what it's been trained on. It's not meant to do math, but now it's like, okay, well, the fact that it has access to a Python interpreter that I can actually call at runtime, that solves an entire body of problems that it wasn't trained to do. And it's basically a form of delegation. And so the thought that's kind of rattling around in my head is that that's great. So it's, it's like took the arithmetic problem and took it first. Now, like anything that's solvable through a relatively concrete Python program, it's able to do a bunch of things that I couldn't do before. Can we get to the same place with UI? I don't know what the future of UI looks like in a agentic AI world, but maybe let the LLM handle it, but not in the classic sense. Maybe it generates it on the fly, or maybe we go through some iterations and hit cache or something like that. So it's a little bit more predictable. Uh, I don't know, but yeah.Alessio [00:49:48]: And especially when is the human supposed to intervene? So, especially if you're composing them, most of them should not have a UI because then they're just web hooking to somewhere else. I just want to touch back. I don't know if you have more comments on this.swyx [00:50:01]: I was just going to ask when you, you said you got, you're going to go back to code. What
This episode is sponsored by the DFINITY Foundation. DFINITY Foundation's mission is to develop and contribute technology that enables the Internet Computer (ICP) blockchain and its ecosystem, aiming to shift cloud computing into a fully decentralized state. Find out more at https://internetcomputer.org/ In this episode of Eye on AI, Yoav Shoham, co-founder of AI21 Labs, shares his insights on the evolution of AI, touching on key advancements such as Jamba and Maestro. From the early days of his career to the latest developments in AI systems, Yoav offers a comprehensive look into the future of artificial intelligence. Yoav opens up about his journey in AI, beginning with his academic roots in game theory and logic, followed by his entrepreneurial ventures that led to the creation of AI21 Labs. He explains the founding of AI21 Labs and the company's mission to combine traditional AI approaches with modern deep learning methods, leading to innovations like Jamba—a highly efficient hybrid AI model that's disrupting the traditional transformer architecture. He also introduces Maestro, AI21's orchestrator that works with multiple large language models (LLMs) and AI tools to create more reliable, predictable, and efficient systems for enterprises. Yoav discusses how Maestro is tackling real-world challenges in enterprise AI, moving beyond flashy demos to practical, scalable solutions. Throughout the conversation, Yoav emphasizes the limitations of current large language models (LLMs), even those with reasoning capabilities, and explains how AI systems, rather than just pure language models, are becoming the future of AI. He also delves into the philosophical side of AI, discussing whether models truly "understand" and what that means for the future of artificial intelligence. Whether you're deeply invested in AI research or curious about its applications in business, this episode is filled with valuable insights into the current and future landscape of artificial intelligence. Stay Updated: Craig Smith Twitter: https://twitter.com/craigss Eye on A.I. Twitter: https://twitter.com/EyeOn_AI (00:00) Introduction: The Future of AI Systems (02:33) Yoav's Journey: From Academia to AI21 Labs (05:57) The Evolution of AI: Symbolic AI and Deep Learning (07:38) Jurassic One: AI21 Labs' First Language Model (10:39) Jamba: Revolutionizing AI Model Architecture (16:11) Benchmarking AI Models: Challenges and Criticisms (22:18) Reinforcement Learning in AI Models (24:33) The Future of AI: Is Jamba the End of Larger Models? (27:31) Applications of Jamba: Real-World Use Cases in Enterprise (29:56) The Transition to Mass AI Deployment in Enterprises (33:47) Maestro: The Orchestrator of AI Tools and Language Models (36:03) GPT-4.5 and Reasoning Models: Are They the Future of AI? (38:09) Yoav's Pet Project: The Philosophical Side of AI Understanding (41:27) The Philosophy of AI Understanding (45:32) Explanations and Competence in AI (48:59) Where to Access Jamba and Maestro
Iman Mirzadeh from Apple, who recently published the GSM-Symbolic paper discusses the crucial distinction between intelligence and achievement in AI systems. He critiques current AI research methodologies, highlighting the limitations of Large Language Models (LLMs) in reasoning and knowledge representation. SPONSOR MESSAGES:***Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich. Goto https://tufalabs.ai/***TRANSCRIPT + RESEARCH:https://www.dropbox.com/scl/fi/mlcjl9cd5p1kem4l0vqd3/IMAN.pdf?rlkey=dqfqb74zr81a5gqr8r6c8isg3&dl=0TOC:1. Intelligence vs Achievement in AI Systems [00:00:00] 1.1 Intelligence vs Achievement Metrics in AI Systems [00:03:27] 1.2 AlphaZero and Abstract Understanding in Chess [00:10:10] 1.3 Language Models and Distribution Learning Limitations [00:14:47] 1.4 Research Methodology and Theoretical Frameworks2. Intelligence Measurement and Learning [00:24:24] 2.1 LLM Capabilities: Interpolation vs True Reasoning [00:29:00] 2.2 Intelligence Definition and Measurement Approaches [00:34:35] 2.3 Learning Capabilities and Agency in AI Systems [00:39:26] 2.4 Abstract Reasoning and Symbol Understanding3. LLM Performance and Evaluation [00:47:15] 3.1 Scaling Laws and Fundamental Limitations [00:54:33] 3.2 Connectionism vs Symbolism Debate in Neural Networks [00:58:09] 3.3 GSM-Symbolic: Testing Mathematical Reasoning in LLMs [01:08:38] 3.4 Benchmark Evaluation and Model Performance AssessmentREFS:[00:01:00] AlphaZero chess AI system, Silver et al.https://arxiv.org/abs/1712.01815[00:07:10] Game Changer: AlphaZero's Groundbreaking Chess Strategies, Sadler & Reganhttps://www.amazon.com/Game-Changer-AlphaZeros-Groundbreaking-Strategies/dp/9056918184[00:11:35] Cross-entropy loss in language modeling, Voitahttp://lena-voita.github.io/nlp_course/language_modeling.html[00:17:20] GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in LLMs, Mirzadeh et al.https://arxiv.org/abs/2410.05229[00:21:25] Connectionism and Cognitive Architecture: A Critical Analysis, Fodor & Pylyshynhttps://www.sciencedirect.com/science/article/pii/001002779090014B[00:28:55] Brain-to-body mass ratio scaling laws, Sutskeverhttps://www.theverge.com/2024/12/13/24320811/what-ilya-sutskever-sees-openai-model-data-training[00:29:40] On the Measure of Intelligence, Chollethttps://arxiv.org/abs/1911.01547[00:33:30] On definition of intelligence, Gignac et al.https://www.sciencedirect.com/science/article/pii/S0160289624000266[00:35:30] Defining intelligence, Wanghttps://cis.temple.edu/~wangp/papers.html[00:37:40] How We Learn: Why Brains Learn Better Than Any Machine... for Now, Dehaenehttps://www.amazon.com/How-We-Learn-Brains-Machine/dp/0525559884[00:39:35] Surfaces and Essences: Analogy as the Fuel and Fire of Thinking, Hofstadter and Sanderhttps://www.amazon.com/Surfaces-Essences-Analogy-Fuel-Thinking/dp/0465018475[00:43:15] Chain-of-thought prompting, Wei et al.https://arxiv.org/abs/2201.11903[00:47:20] Test-time scaling laws in machine learning, Brownhttps://podcasts.apple.com/mv/podcast/openais-noam-brown-ilge-akkaya-and-hunter-lightman-on/id1750736528?i=1000671532058[00:47:50] Scaling Laws for Neural Language Models, Kaplan et al.https://arxiv.org/abs/2001.08361[00:55:15] Tensor product variable binding, Smolenskyhttps://www.sciencedirect.com/science/article/abs/pii/000437029090007M[01:08:45] GSM-8K dataset, OpenAIhttps://huggingface.co/datasets/openai/gsm8k
2025 will be a key year for organisations preparing for the phased requirements of the EU AI Act. With provisions relating to "high risk" AI systems effective from 1 August 2026, this podcast episode guides you through the actions you should be taking over the course of 2025 to prepare. DACB partner, Chris Air, and BLD partner, Dr. Alexander Beyer discuss what high risk AI systems are, what the key obligations are for deploying a high risk AI system, and what timelines need to be considered for deployment.
The rise of AI is fundamentally changing and challenging the classic laws and principles of software development and entrepreneurship. Drawing from my experience building Podscan.fm with AI assistance, I dive into how laws like Conway's Law, Brooks' Law, and Postel's Law are being transformed in this new era of AI-assisted development, while sharing practical insights for founders and developers navigating this shifting landscape.The blog post: https://thebootstrappedfounder.com/how-ai-changes-famous-laws-in-software-and-entrepreneurship/The podcast episode: https://tbf.fm/episodes/381-how-ai-changes-famous-laws-in-software-and-entrepreneurshipCheck out Podscan to get alerts when you're mentioned on podcasts: https://podscan.fmSend me a voicemail on Podline: https://podline.fm/arvidYou'll find my weekly article on my blog: https://thebootstrappedfounder.comPodcast: https://thebootstrappedfounder.com/podcastNewsletter: https://thebootstrappedfounder.com/newsletterMy book Zero to Sold: https://zerotosold.com/My book The Embedded Entrepreneur: https://embeddedentrepreneur.com/My course Find Your Following: https://findyourfollowing.comHere are a few tools I use. Using my affiliate links will support my work at no additional cost to you.- Notion (which I use to organize, write, coordinate, and archive my podcast + newsletter): https://affiliate.notion.so/465mv1536drx- Riverside.fm (that's what I recorded this episode with): https://riverside.fm/?via=arvid- TweetHunter (for speedy scheduling and writing Tweets): http://tweethunter.io/?via=arvid- HypeFury (for massive Twitter analytics and scheduling): https://hypefury.com/?via=arvid60- AudioPen (for taking voice notes and getting amazing summaries): https://audiopen.ai/?aff=PXErZ- Descript (for word-based video editing, subtitles, and clips): https://www.descript.com/?lmref=3cf39Q- ConvertKit (for email lists, newsletters, even finding sponsors): https://convertkit.com?lmref=bN9CZw
Our 202nd episode with a summary and discussion of last week's big AI news! Recorded on 03/07/2025 Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai Read out our text newsletter and comment on the podcast at https://lastweekin.ai/. Join our Discord here! https://discord.gg/nTyezGSKwP In this episode: Alibaba released Qwen-32B, their latest reasoning model, on par with leading models like DeepMind's R1. Anthropic raised $3.5 billion in a funding round, valuing the company at $61.5 billion, solidifying its position as a key competitor to OpenAI. DeepMind introduced BigBench Extra Hard, a more challenging benchmark to evaluate the reasoning capabilities of large language models. Reinforcement Learning pioneers Andrew Bartow and Rich Sutton were awarded the prestigious Turing Award for their contributions to the field. Timestamps + Links: cle picks: (00:00:00) Intro / Banter (00:01:41) Episode Preview (00:02:50) GPT-4.5 Discussion (00:14:13) Alibaba's New QwQ 32B Model is as Good as DeepSeek-R1 ; Outperforms OpenAI's o1-mini (00:21:29) With Alexa Plus, Amazon finally reinvents its best product (00:26:08) Another DeepSeek moment? General AI agent Manus shows ability to handle complex tasks (00:29:14) Microsoft's new Dragon Copilot is an AI assistant for healthcare (00:32:24) Mistral's new OCR API turns any PDF document into an AI-ready Markdown file (00:33:19) A.I. Start-Up Anthropic Closes Deal That Values It at $61.5 Billion (00:35:49) Nvidia-Backed CoreWeave Files for IPO, Shows Growing Revenue (00:38:05) Waymo and Uber's Austin robotaxi expansion begins today (00:38:54) UK competition watchdog drops Microsoft-OpenAI probe (00:41:17) Scale AI announces multimillion-dollar defense deal, a major step in U.S. military automation (00:44:43) DeepSeek Open Source Week: A Complete Summary (00:45:25) DeepSeek AI Releases DualPipe: A Bidirectional Pipeline Parallelism Algorithm for Computation-Communication Overlap in V3/R1 Training (00:53:00) Physical Intelligence open-sources Pi0 robotics foundation model (00:54:23) BIG-Bench Extra Hard (00:56:10) Cognitive Behaviors that Enable Self-Improving Reasoners (01:01:49) The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems (01:05:32) Pioneers of Reinforcement Learning Win the Turing Award (01:06:56) OpenAI launches $50M grant program to help fund academic research (01:07:25) The Nuclear-Level Risk of Superintelligent AI (01:13:34) METR's GPT-4.5 pre-deployment evaluations (01:17:16) Chinese buyers are getting Nvidia Blackwell chips despite US export controls
Complex problems cannot be solved if examined only through a narrow lens. Enter interdisciplinarity. It is now widely accepted that drawing on varied expertise and perspectives is the only way we can understand and tackle many of the most challenging issues we face, as individuals and as a species. So, there is a growing movement towards more cross disciplinary working in higher education but it faces challenges. Interdisciplinarity requires a shift of mindset in an academy built upon clear disciplinary distinctions and must compete for space in already overcrowded curricula. We speak to two leadings scholars in interdisciplinary research and teaching to find out why it is so important and how they are encouraging more academics and students to break out of traditional academic silos. Gabriele Bammer is a professor of integration and implementation sciences (i2S) at the Australian National University. She is author of several books including ‘Disciplining Interdisciplinarity' and is inaugural president of the Global Alliance for Inter- and Transdisciplinarity. To support progress in interdisciplinarity around the world, she runs the Integration and Implementation Insights blog and repository of theory, methods and tools underpinning i2S. Gabriele has held visiting appointments at Harvard University's John F. Kennedy School of Government, the National Socio-Environmental Synthesis Center at the University of Maryland and the Institute for Advanced Sustainability Studies in Potsdam, Germany. Kate Crawford is an international scholar of the social implications of artificial intelligence who has advised policymakers in the United Nations, the White House, and the European Parliament on AI, and currently leads the Knowing Machines Project, an international research collaboration that investigates the foundations of machine learning. She is a research professor at USC Annenberg in Los Angeles, a senior principal researcher at MSR in New York, an honorary professor at the University of Sydney, and the inaugural visiting chair for AI and Justice at the École Normale Supérieure in Paris. Her award-winning book, Atlas of AI, reveals the extractive nature of this technology while her creative collaborations such as Anatomy of an AI System with Vladan Joler and Excavating AI with Trevor Paglen explore the complex processes behind each human-AI interaction, showing the material and human costs. Her latest exhibition, Calculating Empires: A Genealogy of Technology and Power 1500-2025, opened in Milan, November 2023 and won the Grand Prize of the European Commission for art and technology. More advice and insight can be found in our latest Campus spotlight guide: A focus on interdisciplinarity in teaching.
Dan is joined by Letizia Giuliano, Vice President of Product Marketing and Management at Alphawave Semi. She specializes in architecting cutting-edge solutions for high-speed connectivity and chiplet design architecture. Prior to her role at Alphawave Semi, Letizia held the position of Product Line Manager at Intel, where… Read More
In this episode of the Shift AI Podcast, Kieran Snyder, co-founder of Textio, joins Boaz Ashkenazy to explore the evolving landscape of AI in workplace communications. From her early days combining linguistics and engineering to pioneering bias detection in HR technology, Snyder shares unique insights into the responsible development of AI technologies. The conversation delves into the challenges of bias in large language models, the future of enterprise AI adoption, and observations about how different generations approach AI tools. Drawing from her experience early pre-ChatGPT development and her recent survey of students' AI usage, Snyder offers a thoughtful perspective on the intersection of AI and human labor. Whether you're interested in the future of workplace communication, the challenges of bias in AI, or how the next generation will reshape our relationship with technology, this episode provides valuable insights from someone who has been at the forefront of AI innovation for over a decade.Chapters:[00:00] The Current State of AI Adoption[02:04] Origins and Background: From Engineering to Entrepreneurship[06:52] The Evolution of Textio: From Word Processing to HR Tech[09:36] Generative AI and Early Industry Insights[14:26] Addressing Bias in AI Systems[17:41] The AI Rush: Challenges in Enterprise Adoption[22:12] The Next Generation's Perspective on AI[26:17] AI Policy and Regulation[28:50] Mentorship and the Future of Work[34:33] Looking Ahead: The Actively Emerging FutureConnect with Kieran SnyderLinkedIn: https://www.linkedin.com/in/kieran-snyder/ Connect with Boaz AshkenazyLinkedIn: https://www.linkedin.com/in/boazashkenazy X: boazashkenazyEmail: shift@augmentedailabs.com
AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store
A Daily Chronicle of AI Innovations on February 19th 2025
Big thanks to Cisco for sponsoring this video and sponsoring my trip to Cisco Live Amsterdam. // DJ Sampath's SOCIAL // LinkedIn: / djsampath X: https://x.com/djsampath // YouTube Videos REFERENCE // Are you using a Hacked AI System?: • Are you using a Hacked AI system? Cisco AI Defense!: • Cisco AI Defense: Groundbreaking secu... // Blogs REFERENCE // https://blogs.cisco.com/security/eval... https://www.cisco.com/c/m/en_us/solut... https://blogs.cisco.com/security/eval... // David's SOCIAL // Discord: discord.com/invite/usKSyzb Twitter: www.twitter.com/davidbombal Instagram: www.instagram.com/davidbombal LinkedIn: www.linkedin.com/in/davidbombal Facebook: www.facebook.com/davidbombal.co TikTok: tiktok.com/@davidbombal YouTube: / @davidbombal // MY STUFF // https://www.amazon.com/shop/davidbombal // SPONSORS // Interested in sponsoring my videos? Reach out to my team here: sponsors@davidbombal.com // MENU // 0:00 - Coming up 0:31 - Intro 01:30 - Can You Block AI? 03:10 - DJ's Demo (Cisco Cloud Security) 06:16 - Jailbreaking AI 09:58 - Deepseek's Open Source 11:41 - AI Defence 14:40 - Should We Avoid AI? 15:24 - Outro Please note that links listed may be affiliate links and provide me with a small percentage/kickback should you use them to purchase any of the items listed or recommended. Thank you for supporting me and this channel! Disclaimer: This video is for educational purposes only. #deepseek #chatgpt #ai
IT'S LAST CALL, FIRST SHOT WEEK.Steve Noviello is joined by Steve Eager, Paige Ellenberger and Brian Demaris to discuss envelope etiquette, a cry for help on social media, and the debate of appropriate tipping or over tipping.
Sepp Hochreiter, the inventor of LSTM (Long Short-Term Memory) networks – a foundational technology in AI. Sepp discusses his journey, the origins of LSTM, and why he believes his latest work, XLSTM, could be the next big thing in AI, particularly for applications like robotics and industrial simulation. He also shares his controversial perspective on Large Language Models (LLMs) and why reasoning is a critical missing piece in current AI systems.SPONSOR MESSAGES:***CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Check out their super fast DeepSeek R1 hosting!https://centml.ai/pricing/Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. They are hiring a Chief Engineer and ML engineers. Events in Zurich.Goto https://tufalabs.ai/***TRANSCRIPT AND BACKGROUND READING:https://www.dropbox.com/scl/fi/n1vzm79t3uuss8xyinxzo/SEPPH.pdf?rlkey=fp7gwaopjk17uyvgjxekxrh5v&dl=0Prof. Sepp Hochreiterhttps://www.nx-ai.com/https://x.com/hochreitersepphttps://scholar.google.at/citations?user=tvUH3WMAAAAJ&hl=enTOC:1. LLM Evolution and Reasoning Capabilities[00:00:00] 1.1 LLM Capabilities and Limitations Debate[00:03:16] 1.2 Program Generation and Reasoning in AI Systems[00:06:30] 1.3 Human vs AI Reasoning Comparison[00:09:59] 1.4 New Research Initiatives and Hybrid Approaches2. LSTM Technical Architecture[00:13:18] 2.1 LSTM Development History and Technical Background[00:20:38] 2.2 LSTM vs RNN Architecture and Computational Complexity[00:25:10] 2.3 xLSTM Architecture and Flash Attention Comparison[00:30:51] 2.4 Evolution of Gating Mechanisms from Sigmoid to Exponential3. Industrial Applications and Neuro-Symbolic AI[00:40:35] 3.1 Industrial Applications and Fixed Memory Advantages[00:42:31] 3.2 Neuro-Symbolic Integration and Pi AI Project[00:46:00] 3.3 Integration of Symbolic and Neural AI Approaches[00:51:29] 3.4 Evolution of AI Paradigms and System Thinking[00:54:55] 3.5 AI Reasoning and Human Intelligence Comparison[00:58:12] 3.6 NXAI Company and Industrial AI ApplicationsREFS:[00:00:15] Seminal LSTM paper establishing Hochreiter's expertise (Hochreiter & Schmidhuber)https://direct.mit.edu/neco/article-abstract/9/8/1735/6109/Long-Short-Term-Memory[00:04:20] Kolmogorov complexity and program composition limitations (Kolmogorov)https://link.springer.com/article/10.1007/BF02478259[00:07:10] Limitations of LLM mathematical reasoning and symbolic integration (Various Authors)https://www.arxiv.org/pdf/2502.03671[00:09:05] AlphaGo's Move 37 demonstrating creative AI (Google DeepMind)https://deepmind.google/research/breakthroughs/alphago/[00:10:15] New AI research lab in Zurich for fundamental LLM research (Benjamin Crouzier)https://tufalabs.ai[00:19:40] Introduction of xLSTM with exponential gating (Beck, Hochreiter, et al.)https://arxiv.org/abs/2405.04517[00:22:55] FlashAttention: fast & memory-efficient attention (Tri Dao et al.)https://arxiv.org/abs/2205.14135[00:31:00] Historical use of sigmoid/tanh activation in 1990s (James A. McCaffrey)https://visualstudiomagazine.com/articles/2015/06/01/alternative-activation-functions.aspx[00:36:10] Mamba 2 state space model architecture (Albert Gu et al.)https://arxiv.org/abs/2312.00752[00:46:00] Austria's Pi AI project integrating symbolic & neural AI (Hochreiter et al.)https://www.jku.at/en/institute-of-machine-learning/research/projects/[00:48:10] Neuro-symbolic integration challenges in language models (Diego Calanzone et al.)https://openreview.net/forum?id=7PGluppo4k[00:49:30] JKU Linz's historical and neuro-symbolic research (Sepp Hochreiter)https://www.jku.at/en/news-events/news/detail/news/bilaterale-ki-projekt-unter-leitung-der-jku-erhaelt-fwf-cluster-of-excellence/YT: https://www.youtube.com/watch?v=8u2pW2zZLCs
Azfar Aslam, VP & Chief Technology Officer, Europe at Nokia discusses the evolving landscape of telecommunications, focusing on the integration of AI and quantum computing. He highlights the challenges of implementing AI in network management, the importance of quantum safe cryptography, and the need for reliability in technology. The discussion also touches on the competitive nature of the industry and the ethical considerations of AI, particularly in ensuring fairness and avoiding biases in decision-making.Key TakeawaysAI is transforming telecommunications but comes with challenges.Focus on solving big problems rather than getting lost in technology.AI-powered maintenance can prevent outages and improve reliability.Quantum computing has the potential to revolutionize network security.Organizations must prepare for the transition to quantum-safe cryptography.Reliability and trust are critical in adopting new technologies.Competition in telecommunications is fierce, requiring constant innovation.AI systems must be designed to avoid biases and ensure fairness.Traceability in AI decision-making is essential for accountability.The balance between technology and economics will drive future innovations.Chapters00:00 Introduction to AI in Telecommunications02:48 Challenges of AI Implementation06:11 The Role of Quantum Computing12:01 Quantum Safe Cryptography15:10 The Future of Quantum in Telecommunications17:58 Competition and Reliability in Tech20:54 Ensuring Fairness in AI Systems
Bio Bala has rich experience in retail technology and process transformation. Most recently, he worked as a Principal Architect for Intelligent Automation, Innovation & Supply Chain in a global Fortune 100 retail corporation. Currently he works for a luxury brand as Principal Architect for Intelligent Automation providing technology advice for the responsible use of technology (Low Code, RPA, Chatbots, and AI). He is passionate about technology and spends his free time reading, writing technical blogs and co-chairing a special interest group with The OR Society. Interview Highlights 02:00 Mentors and peers 04:00 Community bus 07:10 Defining AI 08:20 Contextual awareness 11:45 GenAI 14:30 The human loop 17:30 Natural Language Processing 20:45 Sentiment analysis 24:00 Implementing AI solutions 26:30 Ethics and AI 27:30 Biased algorithms 32:00 EU AI Act 33:00 Responsible use of technology Connect Bala Madhusoodhanan on LinkedIn Books and references · https://nymag.com/intelligencer/article/ai-artificial-intelligence-chatbots-emily-m-bender.html - NLP · https://www.theregister.com/2021/05/27/clearview_europe/ - Facial Technology Issue · https://www.designnews.com/electronics-test/apple-card-most-high-profile-case-ai-bias-yet - Apple Card story · https://www.ft.com/content/2d6fc319-2165-42fb-8de1-0edf1d765be3 - Data Centre growth · https://www.technologyreview.com/2024/02/06/1087793/what-babies-can-teach-ai/ · Independent Audit of AI Systems - · Home | The Alan Turing Institute · Competing in the Age of AI: Strategy and Leadership When Algorithms and Networks Run the World, Marco Iansiti & Karim R. Lakhani · AI Superpowers: China, Silicon Valley, and the New World, Kai-Fu Lee · The Algorithmic Leader: How to Be Smart When Machines Are Smarter Than You, Mike Walsh · Human+Machine: Reimagining Work in the Age of AI, Paul R Daugherty, H. James Wilson · Superintelligence: Paths, Dangers, Strategies, Nick Bostrom · The Alignment Problem: How Can Artificial Intelligence Learn Human Values, Brian Christian · Ethical Machines: Your Concise Guide to Totally Unbiased, Transparent, and Respectful AI, Reid Blackman · Wanted: Human-AI Translators: Artificial Intelligence Demystified, Geertrui Mieke De Ketelaere · The Future of Humanity: Terraforming Mars, Interstellar Travel, Immortality, and Our Destiny Beyond, Michio Kaku, Feodor Chin et al Episode Transcript Intro: Hello and welcome to the Agile Innovation Leaders podcast. I'm Ula Ojiaku. On this podcast I speak with world-class leaders and doers about themselves and a variety of topics spanning Agile, Lean Innovation, Business, Leadership and much more – with actionable takeaways for you the listener. Ula Ojiaku So I have with me here, Bala Madhusoodhanan, who is a principal architect with a global luxury brand, and he looks after their RPA and AI transformation. So it's a pleasure to have you on the Agile Innovation Leaders podcast, Bala, thank you for making the time. Bala Madhusoodhanan It's a pleasure to have a conversation with the podcast and the podcast audience, Ula. I follow the podcast and there have been fantastic speakers in the past. So I feel privileged to join you on this conversation. Ula Ojiaku Well, the privilege is mine. So could you start off with telling us about yourself Bala, what have been the key points or the highlights of your life that have led to you being the Bala we know now? Bala Madhusoodhanan It's putting self into uncharted territory. So my background is mechanical engineering, and when I got the job, it was either you go into the mechanical engineering manufacturing side or the software side, which was slightly booming at that point of time, and obviously it was paying more then decided to take the software route, but eventually somewhere the path kind of overlapped. So from a mainframe background, started working on supply chain, and then came back to optimisation, tied back to manufacturing industry. Somewhere there is an overlap, but yeah, that was the first decision that probably got me here. The second decision was to work in a UK geography, rather than a US geography, which is again very strange in a lot of my peers. They generally go to Silicon Valley or East Coast, but I just took a choice to stay here for personal reasons. And then the third was like the mindset. I mean, I had over the last 15, 20 years, I had really good mentors, really good peers, so I always had their help to soundboard my crazy ideas, and I always try to keep a relationship ongoing. Ula Ojiaku What I'm hearing is, based on what you said, lots of relationships have been key to getting you to where you are today, both from mentors, peers. Could you expand on that? In what way? Bala Madhusoodhanan The technology is changing quite a lot, at least in the last 10 years. So if you look into pre-2010, there was no machine learning or it was statistics. People were just saying everything is statistics and accessibility to information was not that much, but post 2010, 2011, people started getting accessibility. Then there was a data buzz, big data came in, so there were a lot of opportunities where I could have taken a different career path, but every time I was in a dilemma which route to take, I had someone with whom either I have worked or who was my team lead or manager to guide me to tell me, like, take emotion out of the decision making and think in a calm mind, because you might jump into something and you might like it, you might not like it, you should not regret it. So again, over the course of so many such decisions, my cognitive mind has also started thinking about it. So those conversations really help. And again, collective experience. If you look into the decision making, it's not just my decision, I'm going through conversations that I had with people where they have applied their experience, so it's not just me or just not one situation, and to understand the why behind that, and that actually helps. In short, it's like a collection of conversations that I had with peers. A few of them are visionary leaders, they are good readers. So they always had a good insight on where I should focus, where I shouldn't focus, and of late recently, there has been a community bus. So a lot of things are moving to open source, there is a lot of community exchange of conversation, the blogging has picked up a lot. So, connecting to those parts also gives you a different dimension to think about. Ula Ojiaku So you said community bus, some of the listeners or people who are watching the video might not understand what you mean by the community bus. Are you talking about like meetups or communities that come around to discuss shared interests? Bala Madhusoodhanan If you are very much specifically interested in AI, or you are specifically interested in, power platform or a low code platform, there are a lot of content creators on those topics. You can go to YouTube, LinkedIn, and you get a lot of information about what's happening. They do a lot of hackathons, again, you need to invest time in all these things. If you don't, then you are basically missing the boat, but there are various channels like hackathon or meetup groups, or, I mean, it could be us like a virtual conversation like you and me, we both have some passionate topics, that's why we resonate and we are talking about it. So it's all about you taking an initiative, you finding time for it, and then you have tons and tons of information available through community or through conferences or through meetup groups. Ula Ojiaku Thanks for clarifying. So, you said as well, you had a collection of conversations that helped you whenever you were at a crossroad, some new technology or something emerges or there's a decision you had to make and checking in with your mentors, your peers and your personal Board of Directors almost, that they give you guidance. Now, looking back, would you say there were some turns you took that knowing what you know now, you would have done differently? Bala Madhusoodhanan I would have liked to study more. That is the only thing, because sometimes the educational degree, even though without a practical knowledge has a bigger advantage in certain conversation, otherwise your experience and your content should speak for you and it takes a little bit of effort and time to get that trust among leaders or peers just to, even them to trust saying like, okay, this person knows what he's talking about. I should probably trust rather than, someone has done a PhD and it's just finding the right balance of when I should have invested time in continuing my education, if I had time, I would have gone back two years and did everything that I had done, like minus two years off-set it by two years earlier. It would have given me different pathways. That is what I would think, but again, it's all constraints. I did the best at that point in time with whatever constraints I had. So I don't have any regret per se, but yeah, if there is a magic wand, I would do that. Ula Ojiaku So you are a LinkedIn top voice from AI. How would you define AI, artificial intelligence? Bala Madhusoodhanan I am a bit reluctant to give a term Artificial Intelligence. It's in my mind, it is Artificial Narrow Intelligence, it's slightly different. So let me start with a building block, which is machine learning. So machine learning is like a data labeller. You go to a Tesco store, you read the label, you know it is a can of soup because you have read the label, your brain is not only processing that image, it understands the surrounding. It does a lot of things when you pick that can of soup. You can't expect that by just feeding one model to a robot. So that's why I'm saying like it's AI is a bit over glorified in my mind. It is artificial narrow intelligence. What you do to automate certain specific tasks using a data set which is legal, ethical, and drives business value is what I would call machine learning, but yeah, it's just overhyped and heavily utilised term AI. Ula Ojiaku You said, there's a hype around artificial intelligence. So what do you mean by that? And where do you see it going? Bala Madhusoodhanan Going back to the machine learning definition that I said, it's basically predicting an output based on some input. That's as simple as what we would say machine learning. The word algorithm is basically something like a pattern finder. What you're doing is you are giving a lot of data, which is properly labelled, which has proper diversity of information, and there are multiple algorithms that can find patterns. The cleverness or engineering mind that you bring in is to select which pattern or which algorithm you would like to do for your use case. Now you're channelling the whole machine learning into one use case. That's why I'm going with the term narrow intelligence. Computers can do brilliant jobs. So you ask computers to do like a Rubik's cubes solving. It will do it very quickly because the task is very simple and it is just doing a lot of calculation. You give a Rubik's cube to a kid. It has to apply it. The brain is not trained enough, so it has to cognitively learn. Maybe it will be faster. So anything which is just pure calculation, pure computing, if the data is labelled properly, you want to predict an outcome, yes, you can use computers. One of the interesting videos that I showed in one of my previous talks was a robot trying to walk across the street. This is in 2018 or 19. The first video was basically talking about a robot crossing a street and there were vehicles coming across and the robot just had a headbutt and it just fell off. Now a four year old kid was asked to walk and it knew that I have to press a red signal. So it went to the signal stop. It knew, or the baby knew that I can only walk when it is green. And then it looks around and then walks so you can see the difference – a four year old kid has a contextual awareness of what is happening, whereas the robot, which is supposed to be called as artificial intelligence couldn't see that. So again, if you look, our human brains have been evolved over millions of years. There are like 10 billion neurons or something, and it is highly optimised. So when I sleep, there are different set of neurons which are running. When I speak to you, my eyes and ears are running, my motion sensor neurons are running, but these are all highly optimised. So the mother control knows how much energy should be sent on which neuron, right, whereas all these large language models, there is only one task. You ask it, it's just going to do that. It doesn't have that intelligence to optimise. When I sleep, maybe 90 percent of my neurons are sleeping. It's getting recharged. Only the dream neurons are working. Whereas once you put a model live, it doesn't matter, all the hundred thousand neurons would run. So, yeah, it's in very infancy state, maybe with quantum computing, maybe with more power and better chips things might change, but I don't see that happening in the next five to 10 years. Ula Ojiaku Now, what do you say about Gen AI? Would you also classify generative AI as purely artificial neural intelligence? Bala Madhusoodhanan The thing with generative AI is you're trying to generalise a lot of use cases, say ChatGPT, you can throw in a PDF, you can ask something, or you can say, hey, can you create a content for my blog or things like that, right? Again, all it is trying to do is it has some historical content with which it is trying to come up with a response. So the thing that I would say is humans are really good with creativity. If a problem is thrown at a person, he will find creative ways to solve it. The tool with which we are going to solve might be a GenAI tool, I don't know, because I don't know the problem, but because GenAI is in a hype cycle, every problem doesn't need GenAI, that's my view. So there was an interesting research which was done by someone in Montreal University. It talks about 10 of the basic tasks like converting text to text or text to speech and with a generative AI model or multiple models, because you have a lot of vendors providing different GenAI models, and then they went with task specific models and the thing that they found was the task specific models were cheap to run, very, very scalable and robust and highly accurate, right. Whereas GenAI, if, when you try to use it and when it goes into a production ready or enterprise ready and if it is used by customers or third party, which are not part of your ecosystem, you are putting yourself in some kind of risk category. There could be a risk of copyright issues. There could be a risk of IP issues. There could be risk of not getting the right consent from someone. I can say, can you create an image of a podcaster named Ula? You never know because you don't remember that one of your photos on Google or Twitter or somewhere is not set as private. No one has come and asked you saying, I'm using this image. And yeah, it's finding the right balance. So even before taking the technology, I think people should think about what problem are they trying to solve? In my mind, AI or artificial intelligence, or narrow intelligence can have two buckets, right. The first bucket is to do with how can I optimise the existing process? Like there are a lot of things that I'm doing, is there a better way to do it? Is there an efficient way to do it? Can I save time? Can I save money? Stuff like that. So that is an optimisation or driving efficiency lever. Other one could be, I know what to do. I have a lot of data, but I don't have infrastructure or people to do it, like workforce augmentation. Say, I have 10 data entry persons who are graduate level. Their only job is to review the receipts or invoices. I work in FCA. I have to manually look at it, approve it, and file it, right? Now it is a very tedious job. So all you are doing is you are augmenting the whole process with an OCR engine. So OCR is Optical Character Recognition. So there are models, which again, it's a beautiful term for what our eyes do. When we travel somewhere, we get an invoice, we exactly know where to look, right? What is the total amount? What is the currency I have paid? Have they taken the correct credit card? Is my address right? All those things, unconsciously, your brain does it. Whereas our models given by different software vendors, which have trained to capture these specific entities which are universal language, to just pass, on data set, you just pass the image on it. It just picks and maps that information. Someone else will do that job. But as part of your process design, what you would do is I will do the heavy lifting of identifying the points. And I'll give it to someone because I want someone to validate it. It's human at the end. Someone is approving it. So they basically put a human in loop and, human centric design to a problem solving situation. That's your efficiency lever, right? Then you have something called innovation level - I need to do something radical, I have not done this product or service. Yeah, that's a space where you can use AI, again, to do small proof of concepts. One example could be, I'm opening a new store, it's in a new country, I don't know how the store layout should look like. These are my products. This is the store square footage. Can you recommend me the best way so that I can sell through a lot? Now, a visual merchandising team will have some ideas on where the things should be, they might give that prompt. Those texts can be converted into image. Once you get the base image, then it's human. It's us. So it will be a starting point rather than someone implementing everything. It could be a starting point. But can you trust it? I don't know. Ula Ojiaku And that's why you said the importance of having a human in the loop. Bala Madhusoodhanan Yeah. So the human loop again, it's because we humans bring contextual awareness to the situation, which machine doesn't know. So I'll tie back this to the NLP. So Natural Language Processing, it has two components, so you have natural language understanding and then you have natural language generation. When you create a machine learning model, all it is doing is, it is understanding the structure of language. It's called form. I'm giving you 10,000 PDFs, or you're reading a Harry Potter book. There is a difference between you reading a Harry Potter book and the machine interpreting that Harry Potter book. You would have imagination. You will have context of, oh, in the last chapter, we were in the hilly region or in a valley, I think it will be like this, the words like mist, cold, wood. You started already forming images and visualising stuff. The machine doesn't do that. Machine works on this is the word, this is a pronoun, this is the noun, this is the structure of language, so the next one should be this, right? So, coming back to the natural language understanding, that is where the context and the form comes into play. Just think of some alphabets put in front of you. You have no idea, but these are the alphabet. You recognise A, you recognise B, you recognise the word, but you don't understand the context. One example is I'm swimming against the current. Now, current here is the motion of water, right? My current code base is version 01. I'm using the same current, right? The context is different. So interpreting the structure of language is one thing. So, in natural language understanding, what we try to do is we try to understand the context. NLG, Natural Language Generation, is basically how can I respond in a way where I'm giving you an answer to your query. And this combined is NLP. It's a big field, there was a research done, the professor is Emily Bender, and she one of the leading professors in the NLP space. So the experiment was very funny. It was about a parrot in an island talking to someone, and there was a shark in between, or some sea creature, which basically broke the connection and was listening to what this person was saying and mimicking. Again, this is the problem with NLP, right? You don't have understanding of the context. You don't put empathy to it. You don't understand the voice modulation. Like when I'm talking to you, you can judge what my emotion cues are, you can put empathy, you can tailor the conversation. If I'm feeling sad, you can put a different spin, whereas if I'm chatting to a robot, it's just going to give a standard response. So again, you have to be very careful in which situation you're going to use it, whether it is for a small team, whether it is going to be in public, stuff like that. Ula Ojiaku So that's interesting because sometimes I join the Masters of Scale strategy sessions and at the last one there was someone whose organisational startup was featured and apparently what their startup is doing is to build AI solutions that are able to do sentiment analysis. And I think some of these, again, in their early stages, but some of these things are already available to try to understand the tone of voice, the words they say, and match it with maybe the expression and actually can transcribe virtual meetings and say, okay, this person said this, they looked perplexed or they looked slightly happy. So what do you think about that? I understand you're saying that machines can't do that, but it seems like there are already organisations trying to push the envelope towards that direction. Bala Madhusoodhanan So the example that you gave, sentiment of the conversation, again, it is going by the structure or the words that I'm using. I am feeling good. So good, here is positive sentiment. Again, for me the capability is slightly overhyped, the reason being is it might do 20 percent or 30 percent of what a human might do, but the human is any day better than that particular use case, right? So the sentiment analysis typically works on the sentiment data set, which would say, these are the certain proverbs, these are the certain types of words, this generally referred to positive sentiment or a good sentiment or feel good factor, but the model is only good as good as the data is, right? So no one is going and constantly updating that dictionary. No one is thinking about it, like Gen Z have a different lingo, millennials had a different lingo. So, again, you have to treat it use case by use case, Ula. Ula Ojiaku At the end of the day, the way things currently are is that machines aren't at the place where they are as good as humans. Humans are still good at doing what humans do, and that's the key thing. Bala Madhusoodhanan Interesting use case that I recently read probably after COVID was immersive reading. So people with dyslexia. So again, AI is used for good as well, I'm not saying it is completely bad. So AI is used for good, like, teaching kids who are dyslexic, right? Speech to text can talk, or can translate a paragraph, the kid can hear it, and on the screen, I think one note has an immersive reader, it actually highlights which word it is, uttering into the ears and research study showed that kids who were part of the study group with this immersive reading audio textbook, they had a better grasp of the context and they performed well and they were able to manage dyslexia better. Now, again, we are using the technology, but again, kudos to the research team, they identified a real problem, they formulated how the problem could be solved, they were successful. So, again, technology is being used again. Cancer research, they invest heavily, in image clustering, brain tumours, I mean, there are a lot of use cases where it's used for good, but then again, when you're using it, you just need to think about biases. You need to understand the risk, I mean, everything is risk and reward. If your reward is out-paying the minimum risk that you're taking, then it's acceptable. Ula Ojiaku What would you advise leaders of organisations who are considering implementing AI solutions? What are the things we need to consider? Bala Madhusoodhanan Okay. So going back to the business strategy and growth. So that is something that the enterprises or big organisations would have in mind. Always have your AI goals aligned to what they want. So as I said, there are two buckets. One is your efficiency driver, operational efficiency bucket. The other one is your innovation bucket. Just have a sense check of where the business wants to invest in. Just because AI is there doesn't mean you have to use it right. Look into opportunities where you can drive more values. So that would be my first line of thought. The second would be more to do with educating leaders about AI literacy, like what each models are, what do they do? What are the pitfalls, the ethical awareness about use of AI, data privacy is big. So again, that education is just like high level, with some examples on the same business domain where it has been successful, where it has been not so successful, what are the challenges that they face? That's something that I would urge everyone to invest time in. I think I did mention about security again, over the years, the practice has been security is always kept as last. So again, I was fortunate enough to work in organisations where security first mindset was put in place, because once you have a proof of value, once you show that to people, people get excited, and it's about messaging it and making sure it is very secured, protecting the end users. So the third one would be talking about having secure first design policies or principles. Machine learning or AI is of no good if your data quality is not there. So have a data strategy is something that I would definitely recommend. Start small. I mean, just like agile, you take a value, you start small, you realise whether your hypothesis was correct or not, you monitor how you performed and then you think about scale just by hello world doesn't mean that you have mastered that. So have that mindset, start small, monitor, have constant feedback, and then you think about scaling. Ula Ojiaku What are the key things about ethics and AI, do you think leaders should be aware of at this point in time? Bala Madhusoodhanan So again, ethical is very subjective. So it's about having different stakeholders to give their honest opinion of whether your solution is the right thing to do against the value of the enterprise. And it's not your view or my view, it's a consent view and certain things where people are involved, you might need to get HR, you might need to get legal, you might need to get brand reputation team to come and assist you because you don't understand the why behind certain policies were put in place. So one is, is the solution or is the AI ethical to the core value of the enterprise? So that's the first sense check that you need to do. If you pass that sense check, then comes about a lot of other threats, I would say like, is the model that I'm using, did it have a fair representation of all data set? There's a classic case study on one of a big cloud computing giant using an AI algorithm to filter resumes and they had to stop it immediately because the data set was all Ivy League, male, white, dominant, it didn't have the right representation. Over the 10 years, if I'm just hiring certain type of people, my data is inherently biased, no matter how good my algorithm is, if I don't have that data set. The other example is clarify AI. They got into trouble on using very biased data to give an outcome on some decision making to immigration, which has a bigger ramification. Then you talk about fairness, whether the AI system is fair to give you an output. So there was a funny story about a man and a woman in California living together, and I think the woman wasn't provided a credit card, even though everything, the postcode is the same, both of them work in the same company, and it was, I think it has to do with Apple Pay. Apple Pay wanted to bring in a silver credit card, Apple card or whatever it is, but then it is so unfair that the women who was equally qualified was not given the right credit limit, and the bank clearly said the algorithm said so. Then you have privacy concern, right? So all these generic models that you have that is available, even ChatGPT for that matter. Now you can chat with ChatGPT multiple times. You can talk about someone like Trevor Noah and you can say hey, can you create a joke? Now it has been trained with the jokes that he has done, it might be available publicly. But has the creator of model got a consent saying, hey Trevor, I'm going to use your content so that I can give better, and how many such consent, even Wikipedia, if you look into Wikipedia, about 80 percent of the information is public, but it is not diversified. What I mean by that is you can search for a lot of information. If the person is from America or from UK or from Europe, maybe from India to some extent, but what is the quality of data, if you think about countries in Africa, what do you think about South America? I mean, it is not representing the total diversity of data, and we have this large language model, which has been just trained on that data, right? So there is a bias and because of that bias, your outcome might not be fair. So these two are the main things, and of course the privacy concern. So if someone goes and says, hey, you have used my data, you didn't even ask me, then you're into lawsuit. Without getting a proper consent, again, it's a bad world, it's very fast moving and people don't even, including me, I don't even read every terms and condition, I just scroll down, tick, confirm, but those things are the things where I think education should come into play. Think about it, because people don't understand what could go wrong, not to them, but someone like them. Then there is a big fear of job displacement, like if I put this AI system, what will I do with my workforce? Say I had ten people, you need to think about, you need to reimagine your workplace. These are the ten jobs my ten people are doing. If I augment six of those jobs, how can I use my ten resources effectively to do something different or that piece of puzzle is always, again, it goes back to the core values of the company, what they think about their people, how everything is back, but it's just that needs a lot of inputs from multiple stakeholders. Ula Ojiaku It ties back to the enterprise strategy, there is the values, but with technology as it has evolved over the years, things will be made obsolete, but there are new opportunities that are created, so moving from when people travelled with horses and buggies and then the automotive came up. Yes, there wasn't as much demand for horseshoes and horses and buggies, but there was a new industry, the people who would mechanics or garages and things like that. So I think it's really about that. Like, going back to what you're saying, how can you redeploy people? And that might involve, again, training, reskilling, and investing in education of the workforce so that they're able to harness AI and to do those creative things that you've emphasised over this conversation about human beings, that creative aspect, that ability to understand context and nuance and apply it to the situation. Bala Madhusoodhanan So I was fortunate to work with ForHumanity, an NGO which basically is trying to certify people to look into auditing AI systems. So EU AI Act is now in place, it will be enforced soon. So you need people to have controls on all these AI systems to protect - it's done to protect people, it's done to protect the enterprise. So I was fortunate enough to be part of that community. I'm still working closely with the Operation Research Society. Again, you should be passionate enough, you should find time to do it, and if you do it, then the universe will find a way to give you something interesting to work with. And our society, The Alan Turing Institute, the ForHumanity Society, I had a few ICO workshops, which was quite interesting because when you hear perspectives from people from different facets of life, like lawyers and solicitors, you would think, ah, this statement, I wouldn't interpret in this way. It was a good learning experience and I'm sure if I have time, I would still continue to do that and invest time in ethical AI. As technology, it's not only AI, it's ethical use of technology, so sustainability is also part of ethical bucket if you look into it. So there was an interesting paper it talks about how many data centres have been opened between 2018 to 2024, which is like six years and the power consumption has gone from X to three times X or two times X, so we have opened a lot. We have already caused damage to the environment with all these technology, and just because the technology is there, it doesn't mean you have to use it, but again, it's that educational bit, what is the right thing to do? And even the ESG awareness, people are not aware. Like now, if you go to the current TikTok trenders, they know I need to look into certified B Corp when I am buying something. The reason is because they know, and they're more passionate about saving the world. Maybe we are not, I don't know, but again, once you start educating and, telling those stories, humans are really good, so you will have a change of heart. Ula Ojiaku What I'm hearing you say is that education is key to help us to make informed choices. There is a time and place where you would need to use AI, but not everything requires it, and if we're more thoughtful in how we approach, these, because these are tools at the end of the day, then we can at least try to be more balanced in the risks and taking advantage of opportunities versus the risks around it and the impact these decisions and the tools that we choose to use make on the environment. Now, what books have you found yourself recommending most to people, and why? Bala Madhusoodhanan Because we have been talking on AI, AI Superpower is one book which was written by Kai-Fu Lee. There is this book by Brian Christian, The Alignment Problem: Machine Learning and Human Values alignment of human values and machine it was basically talking about what are the human values? Where do you want to use machine learning? How do you basically come up with a decision making, that's a really interesting read. Then there is a book called Ethical Machines by Reid Blackman. So it talks about all the ethical facets of AI, like biases, fairnesses, like data privacy, transparency, explainability, and he gives quite a detail, example and walkthrough of what that means. Another interesting book was Wanted: Human-AI Translators: Artificial Intelligence Demystified by a Dutch professor, again, really, really lovely narration of what algorithms are, what AI is, where, and all you should think about, what controls and stuff like that. So that is an interesting book. Harvard Professor Kahrim Lakhani, he wrote something called, Competing in the Age of AI, that's a good book. The Algorithmic Leader: How to Be Smart When Machines Are Smarter Than You by Mike Walsh is another good book, which I finished a couple of months back. Ula Ojiaku And if the audience wants to find you, how can they reach out to you? Bala Madhusoodhanan They can always reach out to me at LinkedIn, I would be happy to touch base through LinkedIn. Ula Ojiaku Awesome. And do you have any final words and or ask of the audience? Bala Madhusoodhanan The final word is, again, responsible use of technology. Think about not just the use case, think about the environmental impact, think about the future generation, because I think the damage is already done. So, at least not in this lifetime, maybe three or four lifetimes down the line, it might not be the beautiful earth that we have. Ula Ojiaku It's been a pleasure, as always, speaking with you, Bala, and thank you so much for sharing your insights and wisdom, and thank you for being a guest on the Agile Innovation Leaders Podcast. Bala Madhusoodhanan Thank you, lovely conversation, and yeah, looking forward to connecting with more like minded LinkedIn colleagues. Ula Ojiaku That's all we have for now. Thanks for listening. If you liked this show, do subscribe at www.agileinnovationleaders.com or your favourite podcast provider. Also share with friends and do leave a review on iTunes. This would help others find this show. I'd also love to hear from you, so please drop me an email at ula@agileinnovationleaders.com Take care and God bless!
We're experimenting and would love to hear from you!In this episode of 'Discover Daily', we explore Sam Altman's recent acknowledgment that OpenAI may need to reconsider its open-source strategy, suggesting they might be "on the wrong side of history." This significant shift comes as competitive pressure from open-source models like DeepSeek R1 continues to mount, with Altman praising DeepSeek as "a very good model" that has narrowed OpenAI's traditional lead in the field.The European Union has taken a historic step in AI regulation with the implementation of the EU AI Act's first phase on February 2, 2025. The legislation prohibits AI systems deemed to pose "unacceptable risks," including manipulative systems, social scoring, and untargeted facial recognition databases. Violations can result in substantial penalties of up to €35 million or 7% of a company's total worldwide annual turnover, demonstrating the EU's commitment to establishing itself as a global leader in trustworthy AI development.Our main story focuses on two remarkable super-Earth discoveries within their stars' habitable zones. TOI-715 b, located 137 light-years away, is approximately 1.5 times wider than Earth and orbits its red dwarf star every 19 days. The second discovery, HD 20794 d, orbits a Sun-like star just 20 light-years from Earth and is roughly six times more massive than Earth, with an elliptical orbit that moves in and out of the habitable zone. These discoveries represent significant milestones in our search for potentially habitable worlds and provide promising targets for future research with advanced instruments like the James Webb Space Telescope.From Perplexity's Discover Feed: https://www.perplexity.ai/page/altman-reconsiders-open-source-fT0uV12jTna0XkxW8xkEDQ https://www.perplexity.ai/page/eu-bans-risky-ai-systems-.iTygUNvS2mKll.lL9xFdAhttps://www.perplexity.ai/page/super-earth-discovered-WR42RfwCSQWU1ebaQaeQxw Perplexity is the fastest and most powerful way to search the web. Perplexity crawls the web and curates the most relevant and up-to-date sources (from academic papers to Reddit threads) to create the perfect response to any question or topic you're interested in. Take the world's knowledge with you anywhere. Available on iOS and Android Join our growing Discord community for the latest updates and exclusive content. Follow us on: Instagram Threads X (Twitter) YouTube Linkedin
Tune into the Legacy Leaders Show for an enlightening episode featuring Rachel Dzieran, a trailblazer in AI and healthcare and the founder and CEO of the Navy SEALs Fund. With a distinguished educational journey from the United States Air Force Academy to a PhD in Systems Engineering, Rachel leads the charge in pioneering medical technologies. This episode explores her revolutionary AI-driven methods for kidney transplant decision-making, shares powerful leadership lessons from her military and biotech experiences, and unveils her vision for the future of healthcare technology. By listening, you'll gain insights into the practical applications of AI in medicine, learn how disciplined leadership can foster technological innovation, and consider the ethical dimensions of integrating AI into healthcare practices. Additionally, we'll delve into her role with the Navy SEALs Fund, emphasizing her commitment to supporting veterans and their families. Join us for a conversation that promises to inspire and provide actionable knowledge for navigating the complexities of modern medical advancements.
Meta says it may stop development of AI systems it deems too risky Ferret Malware Added to 'Contagious Interview' Campaign Credential Theft Becomes Cybercriminals' Favorite Target Huge thanks to our episode sponsor, ThreatLocker ThreatLocker® is a global leader in Zero Trust endpoint security, offering cybersecurity controls to protect businesses from zero-day attacks and ransomware. ThreatLocker operates with a default deny approach to reduce the attack surface and mitigate potential cyber vulnerabilities. To learn more and start your free trial, visit ThreatLocker.com. Find the stories behind the headlines at CISOseries.com.
Steven Pu, Co-Founder of Taraxa, a purpose-built, fast, scalable, and device-friendly Layer-1 public ledger designed to help democratize reputation by making informal data trustworthy. Prior to Taraxa, Steven launched multiple ventures and products in IoT and mobile healthcare. He was also a Partner at Monitor Deloitte's strategy practice, spearheaded their digital strategy line of business serving Fortune 500 companies with hundreds of millions in upside impact. Steven also had the honor of co-authoring the book “Next Blockchain” with Makoto Yano, vice-minister of Japan's Ministry of Economics, Trade, and Industry. Steven holds undergraduate and master's degrees in Electrical Engineering from Stanford University.
Steven Pu, Co-Founder of Taraxa, a purpose-built, fast, scalable, and device-friendly Layer-1 public ledger designed to help democratize reputation by making informal data trustworthy. Prior to Taraxa, Steven launched multiple ventures and products in IoT and mobile healthcare. He was also a Partner at Monitor Deloitte's strategy practice, spearheaded their digital strategy line of business serving Fortune 500 companies with hundreds of millions in upside impact. Steven also had the honor of co-authoring the book “Next Blockchain” with Makoto Yano, vice-minister of Japan's Ministry of Economics, Trade, and Industry. Steven holds undergraduate and master's degrees in Electrical Engineering from Stanford University.
As of Sunday in the European Union, the bloc's regulators can ban the use of AI systems they deem to pose “unacceptable risk” or harm. February 2 is the first compliance deadline for the EU's AI Act, the comprehensive AI regulatory framework that the European Parliament finally approved last March after years of development. Learn more about your ad choices. Visit podcastchoices.com/adchoices
If a business has spent $100 million developing a product, it's a fair bet that they don't want it stolen in two seconds and uploaded to the web where anyone can use it for free.This problem exists in extreme form for AI companies. These days, the electricity and equipment required to train cutting-edge machine learning models that generate uncanny human text and images can cost tens or hundreds of millions of dollars. But once trained, such models may be only a few gigabytes in size and run just fine on ordinary laptops.Today's guest, the computer scientist and polymath Nova DasSarma, works on computer and information security for the AI company Anthropic with the security team. One of her jobs is to stop hackers exfiltrating Anthropic's incredibly expensive intellectual property, as recently happened to Nvidia. Rebroadcast: this episode was originally released in June 2022.Links to learn more, highlights, and full transcript.As she explains, given models' small size, the need to store such models on internet-connected servers, and the poor state of computer security in general, this is a serious challenge.The worries aren't purely commercial though. This problem looms especially large for the growing number of people who expect that in coming decades we'll develop so-called artificial ‘general' intelligence systems that can learn and apply a wide range of skills all at once, and thereby have a transformative effect on society.If aligned with the goals of their owners, such general AI models could operate like a team of super-skilled assistants, going out and doing whatever wonderful (or malicious) things are asked of them. This might represent a huge leap forward for humanity, though the transition to a very different new economy and power structure would have to be handled delicately.If unaligned with the goals of their owners or humanity as a whole, such broadly capable models would naturally ‘go rogue,' breaking their way into additional computer systems to grab more computing power — all the better to pursue their goals and make sure they can't be shut off.As Nova explains, in either case, we don't want such models disseminated all over the world before we've confirmed they are deeply safe and law-abiding, and have figured out how to integrate them peacefully into society. In the first scenario, premature mass deployment would be risky and destabilising. In the second scenario, it could be catastrophic — perhaps even leading to human extinction if such general AI systems turn out to be able to self-improve rapidly rather than slowly, something we can only speculate on at this point.If highly capable general AI systems are coming in the next 10 or 20 years, Nova may be flying below the radar with one of the most important jobs in the world.We'll soon need the ability to ‘sandbox' (i.e. contain) models with a wide range of superhuman capabilities, including the ability to learn new skills, for a period of careful testing and limited deployment — preventing the model from breaking out, and criminals from breaking in. Nova and her colleagues are trying to figure out how to do this, but as this episode reveals, even the state of the art is nowhere near good enough.Chapters:Cold open (00:00:00)Rob's intro (00:00:52)The interview begins (00:02:44)Why computer security matters for AI safety (00:07:39)State of the art in information security (00:17:21)The hack of Nvidia (00:26:50)The most secure systems that exist (00:36:27)Formal verification (00:48:03)How organisations can protect against hacks (00:54:18)Is ML making security better or worse? (00:58:11)Motivated 14-year-old hackers (01:01:08)Disincentivising actors from attacking in the first place (01:05:48)Hofvarpnir Studios (01:12:40)Capabilities vs safety (01:19:47)Interesting design choices with big ML models (01:28:44)Nova's work and how she got into it (01:45:21)Anthropic and career advice (02:05:52)$600M Ethereum hack (02:18:37)Personal computer security advice (02:23:06)LastPass (02:31:04)Stuxnet (02:38:07)Rob's outro (02:40:18)Producer: Keiran HarrisAudio mastering: Ben Cordell and Beppe RådvikTranscriptions: Katy Moore
Jo Davaris, Global Chief Privacy Officer at Booking Holdings, discusses how to build trust in AI, emphasizing privacy and ethical governance as the pillars of AI innovation. We explore the importance of feedback mechanisms and transparency in AI systems, along with how to maintain user control without compromising innovation. Key Takeaways: Building trust in AI systems and the role of privacy in AI governance The need for public and private collaboration on AI regulations Concerns with AI's rapid advancement The importance of integrating privacy and ethical considerations into AI development from the start Guest Bio: Jo Davaris is the Global Chief Privacy Officer at Booking Holdings, where she established a unified privacy framework across brands like Booking.com and OpenTable. She leads AI Governance, Digital Trust, and Cybersecurity efforts, integrating privacy and innovation globally. Previously, Jo was Mercer's first Global Chief Privacy Officer, developing a privacy program that protected data while enabling growth. At American Express, she strengthened privacy governance for institutional & network businesses. Earlier, as an attorney for New York City's Administration for Children's Services, she prosecuted child abuse cases. Jo serves on advisory boards, speaks at conferences on privacy and governance, and advocates for social services in NYC. ---------------------------------------------------------------------------------------- About this Show: The Brave Technologist is here to shed light on the opportunities and challenges of emerging tech. To make it digestible, less scary, and more approachable for all! Join us as we embark on a mission to demystify artificial intelligence, challenge the status quo, and empower everyday people to embrace the digital revolution. Whether you're a tech enthusiast, a curious mind, or an industry professional, this podcast invites you to join the conversation and explore the future of AI together. The Brave Technologist Podcast is hosted by Luke Mulks, VP Business Operations at Brave Software—makers of the privacy-respecting Brave browser and Search engine, and now powering AI everywhere with the Brave Search API. Music by: Ari Dvorin Produced by: Sam Laliberte
We've spent a lot of time on this show talking about AI: how it's changing war, how your doctor might be using it, and whether or not chatbots are curing, or exacerbating, loneliness.But what we haven't done on this show is try to explain how AI actually works. So this seemed like as good a time as any to ask our listeners if they had any burning questions about AI. And it turns out you did.Where do our queries go once they've been fed into ChatGPT? What are the justifications for using a chatbot that may have been trained on plagiarized material? And why do we even need AI in the first place?To help answer your questions, we are joined by Derek Ruths, a Professor of Computer Science at McGill University, and the best person I know at helping people (including myself) understand artificial intelligence.Further Reading:“Yoshua Bengio Doesn't Think We're Ready for Superhuman AI. We're Building It Anyway,” Machines Like Us podcast“ChatGPT is blurring the lines between what it means to communicate with a machine and a human,” by Derek Ruths“A Brief History of Artificial Intelligence: What It Is, Where We Are, and Where We Are Going,” by Michael Wooldridge“Artificial Intelligence: A Guide for Thinking Humans,” by Melanie Mitchell“Anatomy of an AI System,” by Kate Crawford and Vladan Joler“Two years after the launch of ChatGPT, how has generative AI helped businesses?,” by Joe Castaldo
In this second installment of a three-part series, hosts JC Bonilla and Ardis Kadiu examine how multiple AI agents can work together as coordinated teams. Using Element's Campaign Manager as a real-world example, they show how AI agents can mirror human marketing team structures and workflows. The hosts break down ROI measurement frameworks for these systems and discuss why higher education is still in early adoption. Through practical examples and clear explanations, this episode helps listeners understand the shift from single AI assistants to collaborative AI systems that can handle complex, multi-step tasks.Series Context and Episode 1 Recap (00:00:06)Continuation of 3-episode series on AI agentsEpisode 1 became required listening at ElementFoundation concepts: reason, adapt, and actionVertical vs horizontal AI solutions approachMulti-Agent Systems Explained (00:08:12)Breakdown of agent vs assistant definitionsHow specialized agents work togetherTask distribution and hierarchy systemsComparison to human organizational structuresCampaign Manager Case Study (00:10:09)Element's real-world implementation exampleMultiple specialized roles working together:Campaign strategist agentContent writer agentArt director agentHTML builder agentHow agents pass work between each otherIntegration with existing systemsROI Measurement Frameworks (00:27:00)Three approaches to measuring success:Performance comparison with human workFinancial outcomes and cost savingsOrganization-wide productivity gainsHow AI changes work output benchmarksNew metrics for productivity measurementIndustry Adoption Analysis (00:35:41)Current state of multi-agent implementationExamples from marketing, engineering, customer supportHigher education adoption challengesNeed for institutional mindset changesHuman Role Evolution (00:40:25)Changes in day-to-day work with AI supportShift from individual work to AI orchestrationImpact on job roles and responsibilitiesFuture of human-AI collaborationEpisode 3 Preview (00:45:46)Coming discussion on AI workforce managementOrganizational change considerationsIT departments' evolving roleFuture workplace predictionsResearch-based insights on AI integration - - - -Connect With Our Co-Hosts:Ardis Kadiuhttps://www.linkedin.com/in/ardis/https://twitter.com/ardisDr. JC Bonillahttps://www.linkedin.com/in/jcbonilla/https://twitter.com/jbonillxAbout The Enrollify Podcast Network:Generation AI is a part of the Enrollify Podcast Network. If you like this podcast, chances are you'll like other Enrollify shows too! Enrollify is made possible by Element451 — the next-generation AI student engagement platform helping institutions create meaningful and personalized interactions with students. Learn more at element451.com. Attend the 2025 Engage Summit! The Engage Summit is the premier conference for forward-thinking leaders and practitioners dedicated to exploring the transformative power of AI in education. Explore the strategies and tools to step into the next generation of student engagement, supercharged by AI. You'll leave ready to deliver the most personalized digital engagement experience every step of the way.Register now to secure your spot in Charlotte, NC, on June 24-25, 2025! Early bird registration ends February 1st -- https://engage.element451.com/register
In this special episode of the Eye on AI podcast, Sepp Hochreiter, the inventor of Long Short-Term Memory (LSTM) networks, joins Craig Smith to discuss the profound impact of LSTMs on artificial intelligence, from language models to real-time robotics. Sepp reflects on the early days of LSTM development, sharing insights into his collaboration with Jürgen Schmidhuber and the challenges they faced in gaining recognition for their groundbreaking work. He explains how LSTMs became the foundation for technologies used by giants like Amazon, Apple, and Google, and how they paved the way for modern advancements like transformers. Topics include: - The origin story of LSTMs and their unique architecture. - Why LSTMs were crucial for sequence data like speech and text. - The rise of transformers and how they compare to LSTMs. - Real-time robotics: using LSTMs to build energy-efficient, autonomous systems. The next big challenges for AI and robotics in the era of generative AI. Sepp also shares his optimistic vision for the future of AI, emphasizing the importance of efficient, scalable models and their potential to revolutionize industries from healthcare to autonomous vehicles. Don't miss this deep dive into the history and future of AI, featuring one of its most influential pioneers. (00:00) Introduction: Meet Sepp Hochreiter (01:10) The Origins of LSTMs (02:26) Understanding the Vanishing Gradient Problem (05:12) Memory Cells and LSTM Architecture (06:35) Early Applications of LSTMs in Technology (09:38) How Transformers Differ from LSTMs (13:38) Exploring XLSTM for Industrial Applications (15:17) AI for Robotics and Real-Time Systems (18:55) Expanding LSTM Memory with Hopfield Networks (21:18) The Road to XLSTM Development (23:17) Industrial Use Cases of XLSTM (27:49) AI in Simulation: A New Frontier (32:26) The Future of LSTMs and Scalability (35:48) Inference Efficiency and Potential Applications (39:53) Continuous Learning and Adaptability in AI (42:59) Training Robots with XLSTM Technology (44:47) NXAI: Advancing AI in Industry
Laura Ruis, a PhD student at University College London and researcher at Cohere, explains her groundbreaking research into how large language models (LLMs) perform reasoning tasks, the fundamental mechanisms underlying LLM reasoning capabilities, and whether these models primarily rely on retrieval or develop procedural knowledge. SPONSOR MESSAGES: *** CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/ Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? Goto https://tufalabs.ai/ *** TOC 1. LLM Foundations and Learning 1.1 Scale and Learning in Language Models [00:00:00] 1.2 Procedural Knowledge vs Fact Retrieval [00:03:40] 1.3 Influence Functions and Model Analysis [00:07:40] 1.4 Role of Code in LLM Reasoning [00:11:10] 1.5 Semantic Understanding and Physical Grounding [00:19:30] 2. Reasoning Architectures and Measurement 2.1 Measuring Understanding and Reasoning in Language Models [00:23:10] 2.2 Formal vs Approximate Reasoning and Model Creativity [00:26:40] 2.3 Symbolic vs Subsymbolic Computation Debate [00:34:10] 2.4 Neural Network Architectures and Tensor Product Representations [00:40:50] 3. AI Agency and Risk Assessment 3.1 Agency and Goal-Directed Behavior in Language Models [00:45:10] 3.2 Defining and Measuring Agency in AI Systems [00:49:50] 3.3 Core Knowledge Systems and Agency Detection [00:54:40] 3.4 Language Models as Agent Models and Simulator Theory [01:03:20] 3.5 AI Safety and Societal Control Mechanisms [01:07:10] 3.6 Evolution of AI Capabilities and Emergent Risks [01:14:20] REFS: [00:01:10] Procedural Knowledge in Pretraining & LLM Reasoning Ruis et al., 2024 https://arxiv.org/abs/2411.12580 [00:03:50] EK-FAC Influence Functions in Large LMs Grosse et al., 2023 https://arxiv.org/abs/2308.03296 [00:13:05] Surfaces and Essences: Analogy as the Core of Cognition Hofstadter & Sander https://www.amazon.com/Surfaces-Essences-Analogy-Fuel-Thinking/dp/0465018475 [00:13:45] Wittgenstein on Language Games https://plato.stanford.edu/entries/wittgenstein/ [00:14:30] Montague Semantics for Natural Language https://plato.stanford.edu/entries/montague-semantics/ [00:19:35] The Chinese Room Argument David Cole https://plato.stanford.edu/entries/chinese-room/ [00:19:55] ARC: Abstraction and Reasoning Corpus François Chollet https://arxiv.org/abs/1911.01547 [00:24:20] Systematic Generalization in Neural Nets Lake & Baroni, 2023 https://www.nature.com/articles/s41586-023-06668-3 [00:27:40] Open-Endedness & Creativity in AI Tim Rocktäschel https://arxiv.org/html/2406.04268v1 [00:30:50] Fodor & Pylyshyn on Connectionism https://www.sciencedirect.com/science/article/abs/pii/0010027788900315 [00:31:30] Tensor Product Representations Smolensky, 1990 https://www.sciencedirect.com/science/article/abs/pii/000437029090007M [00:35:50] DreamCoder: Wake-Sleep Program Synthesis Kevin Ellis et al. https://courses.cs.washington.edu/courses/cse599j1/22sp/papers/dreamcoder.pdf [00:36:30] Compositional Generalization Benchmarks Ruis, Lake et al., 2022 https://arxiv.org/pdf/2202.10745 [00:40:30] RNNs & Tensor Products McCoy et al., 2018 https://arxiv.org/abs/1812.08718 [00:46:10] Formal Causal Definition of Agency Kenton et al. https://arxiv.org/pdf/2208.08345v2 [00:48:40] Agency in Language Models Sumers et al. https://arxiv.org/abs/2309.02427 [00:55:20] Heider & Simmel's Moving Shapes Experiment https://www.nature.com/articles/s41598-024-65532-0 [01:00:40] Language Models as Agent Models Jacob Andreas, 2022 https://arxiv.org/abs/2212.01681 [01:13:35] Pragmatic Understanding in LLMs Ruis et al. https://arxiv.org/abs/2210.14986
Professor Yoshua Bengio is a pioneer in deep learning and Turing Award winner. Bengio talks about AI safety, why goal-seeking “agentic” AIs might be dangerous, and his vision for building powerful AI tools without giving them agency. Topics include reward tampering risks, instrumental convergence, global AI governance, and how non-agent AIs could revolutionize science and medicine while reducing existential threats. Perfect for anyone curious about advanced AI risks and how to manage them responsibly. SPONSOR MESSAGES: *** CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/ Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on o-series style reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? They are hosting an event in Zurich on January 9th with the ARChitects, join if you can. Goto https://tufalabs.ai/ *** Interviewer: Tim Scarfe Yoshua Bengio: https://x.com/Yoshua_Bengio https://scholar.google.com/citations?user=kukA0LcAAAAJ&hl=en https://yoshuabengio.org/ https://en.wikipedia.org/wiki/Yoshua_Bengio TOC: 1. AI Safety Fundamentals [00:00:00] 1.1 AI Safety Risks and International Cooperation [00:03:20] 1.2 Fundamental Principles vs Scaling in AI Development [00:11:25] 1.3 System 1/2 Thinking and AI Reasoning Capabilities [00:15:15] 1.4 Reward Tampering and AI Agency Risks [00:25:17] 1.5 Alignment Challenges and Instrumental Convergence 2. AI Architecture and Safety Design [00:33:10] 2.1 Instrumental Goals and AI Safety Fundamentals [00:35:02] 2.2 Separating Intelligence from Goals in AI Systems [00:40:40] 2.3 Non-Agent AI as Scientific Tools [00:44:25] 2.4 Oracle AI Systems and Mathematical Safety Frameworks 3. Global Governance and Security [00:49:50] 3.1 International AI Competition and Hardware Governance [00:51:58] 3.2 Military and Security Implications of AI Development [00:56:07] 3.3 Personal Evolution of AI Safety Perspectives [01:00:25] 3.4 AI Development Scaling and Global Governance Challenges [01:12:10] 3.5 AI Regulation and Corporate Oversight 4. Technical Innovations [01:23:00] 4.1 Evolution of Neural Architectures: From RNNs to Transformers [01:26:02] 4.2 GFlowNets and Symbolic Computation [01:30:47] 4.3 Neural Dynamics and Consciousness [01:34:38] 4.4 AI Creativity and Scientific Discovery SHOWNOTES (Transcript, references, best clips etc): https://www.dropbox.com/scl/fi/ajucigli8n90fbxv9h94x/BENGIO_SHOW.pdf?rlkey=38hi2m19sylnr8orb76b85wkw&dl=0 CORE REFS (full list in shownotes and pinned comment): [00:00:15] Bengio et al.: "AI Risk" Statement https://www.safe.ai/work/statement-on-ai-risk [00:23:10] Bengio on reward tampering & AI safety (Harvard Data Science Review) https://hdsr.mitpress.mit.edu/pub/w974bwb0 [00:40:45] Munk Debate on AI existential risk, featuring Bengio https://munkdebates.com/debates/artificial-intelligence [00:44:30] "Can a Bayesian Oracle Prevent Harm from an Agent?" (Bengio et al.) on oracle-to-agent safety https://arxiv.org/abs/2408.05284 [00:51:20] Bengio (2024) memo on hardware-based AI governance verification https://yoshuabengio.org/wp-content/uploads/2024/08/FlexHEG-Memo_August-2024.pdf [01:12:55] Bengio's involvement in EU AI Act code of practice https://digital-strategy.ec.europa.eu/en/news/meet-chairs-leading-development-first-general-purpose-ai-code-practice [01:27:05] Complexity-based compositionality theory (Elmoznino, Jiralerspong, Bengio, Lajoie) https://arxiv.org/abs/2410.14817 [01:29:00] GFlowNet Foundations (Bengio et al.) for probabilistic inference https://arxiv.org/pdf/2111.09266 [01:32:10] Discrete attractor states in neural systems (Nam, Elmoznino, Bengio, Lajoie) https://arxiv.org/pdf/2302.06403
AI professor Jeff Clune ruminates on open-ended evolutionary algorithms—systems designed to generate novel and interesting outcomes forever. Drawing inspiration from nature's boundless creativity, Clune and his collaborators aim to build “Darwin Complete” search spaces, where any computable environment can be simulated. By harnessing the power of large language models and reinforcement learning, these AI agents continuously develop new skills, explore uncharted domains, and even cooperate with one another in complex tasks. SPONSOR MESSAGES: *** CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/ Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on reasoning and AGI. Are you interested in working on reasoning, or getting involved in their events? They are hosting an event in Zurich on January 9th with the ARChitects, join if you can. Goto https://tufalabs.ai/ *** A central theme throughout Clune's work is “interestingness”: an elusive quality that nudges AI agents toward genuinely original discoveries. Rather than rely on narrowly defined metrics—which often fail due to Goodhart's Law—Clune employs language models to serve as proxies for human judgment. In doing so, he ensures that “interesting” always reflects authentic novelty, opening the door to unending innovation. Yet with these extraordinary possibilities come equally significant risks. Clune says we need AI safety measures—particularly as the technology matures into powerful, open-ended forms. Potential pitfalls include agents inadvertently causing harm or malicious actors subverting AI's capabilities for destructive ends. To mitigate this, Clune advocates for prudent governance involving democratic coalitions, regulation of cutting-edge models, and global alignment protocols. Jeff Clune: https://x.com/jeffclune http://jeffclune.com/ (Interviewer: Tim Scarfe) TOC: 1. Introduction [00:00:00] 1.1 Overview and Opening Thoughts 2. Sponsorship [00:03:00] 2.1 TufaAI Labs and CentML 3. Evolutionary AI Foundations [00:04:12] 3.1 Open-Ended Algorithm Development and Abstraction Approaches [00:07:56] 3.2 Novel Intelligence Forms and Serendipitous Discovery [00:11:46] 3.3 Frontier Models and the 'Interestingness' Problem [00:30:36] 3.4 Darwin Complete Systems and Evolutionary Search Spaces 4. System Architecture and Learning [00:37:35] 4.1 Code Generation vs Neural Networks Comparison [00:41:04] 4.2 Thought Cloning and Behavioral Learning Systems [00:47:00] 4.3 Language Emergence in AI Systems [00:50:23] 4.4 AI Interpretability and Safety Monitoring Techniques 5. AI Safety and Governance [00:53:56] 5.1 Language Model Consistency and Belief Systems [00:57:00] 5.2 AI Safety Challenges and Alignment Limitations [01:02:07] 5.3 Open Source AI Development and Value Alignment [01:08:19] 5.4 Global AI Governance and Development Control 6. Advanced AI Systems and Evolution [01:16:55] 6.1 Agent Systems and Performance Evaluation [01:22:45] 6.2 Continuous Learning Challenges and In-Context Solutions [01:26:46] 6.3 Evolution Algorithms and Environment Generation [01:35:36] 6.4 Evolutionary Biology Insights and Experiments [01:48:08] 6.5 Personal Journey from Philosophy to AI Research Shownotes: We craft detailed show notes for each episode with high quality transcript and references and best parts bolded. https://www.dropbox.com/scl/fi/fz43pdoc5wq5jh7vsnujl/JEFFCLUNE.pdf?rlkey=uu0e70ix9zo6g5xn6amykffpm&st=k2scxteu&dl=0
In this episode of the Eye on AI podcast, we dive into the critical issue of data quality for AI systems with Sedarius Perrotta, co-founder of Shelf. Sedarius takes us on a journey through his experience in knowledge management and how Shelf was built to solve one of AI's most pressing challenges—unstructured data chaos. He shares how Shelf's innovative solutions enhance retrieval-augmented generation (RAG) and ensure tools like Microsoft Copilot can perform at their best by tackling inaccuracies, duplications, and outdated information in real-time. Throughout the episode, we explore how unstructured data acts as the "fuel" for AI systems and why its quality determines success. Sedarius explains Shelf's approach to data observability, transparency, and proactive monitoring to help organizations fix "garbage in, garbage out" issues, ensuring scalable and trusted AI initiatives. We also discuss the accelerating adoption of generative AI, the future of data management, and why building a strategy for clean and trusted data is vital for 2025 and beyond. Learn how Shelf enables businesses to unlock the full potential of their unstructured data for AI-driven productivity and innovation. Don't forget to like, subscribe, and hit the notification bell to stay updated on the latest advancements in AI, data management, and next-gen automation! Stay Updated: Craig Smith Twitter: https://twitter.com/craigss Eye on A.I. Twitter: https://twitter.com/EyeOn_AI (00:00) Introduction and Shelf's Mission (03:01) Understanding SharePoint and Data Challenges (05:29) Tackling Data Entropy in AI Systems (08:13) Using AI to Solve Data Quality Issues (12:30) Fixing AI Hallucinations with Trusted Data (21:01) Gen AI Adoption Insights and Trends (28:44) Benefits of Curated Data for AI Training (37:38) Future of Unstructured Data Management
✨ Subscribe to the Green Pill Podcast ✨ https://pod.link/1609313639
Navigating Organizational Transitions with Stan Deetz In this episode, we cap off a brilliant year of growth with the insightful Stan Deetz, author of 'Leading Organizations Through Transitions'. Stan shares his expertise on managing change within organizations, focusing on technological disruptions, mergers and acquisitions, and the intricate dynamics of power shifts. We dive deep into the effects of AI on organizational structures, the concept of tacit knowledge, and the adjustments required for a healthy and resilient workforce. Stan also discusses the importance of humility and measurement in driving successful change, with practical advice on maintaining the delicate balance between efficiency and adaptability. Join us for an engaging conversation that offers valuable lessons for navigating complex organizational transitions. 00:00 Introduction and Recap 00:43 New Books and Projects 01:50 Technological Disruption and Organizational Change 03:09 The Sophomoric Effect and AI Challenges 04:44 AI's Impact on Knowledge Workers 06:17 Bias and Vigilance in AI Systems 09:05 Tacit Knowledge and Organizational Expertise 29:45 Forms, Data, and Organizational Decisions 39:14 Understanding the Impact of Our Products 39:53 Leadership and Institutional Knowledge 40:31 Navigating Organizational Transitions 41:56 The Myth of Stable Environments 43:56 The Importance of Diversity and Systems 47:43 Challenges of Short-Term Measurements 50:10 The Value of Long-Term Organizational Health 01:04:47 Cultural Sensitivity in Multinational Operations 01:11:02 The Need for Customization in Management 01:13:43 Starting Organizational Change with Humility 01:17:03 Personal and Organizational Growth Find Stan here:
Jineet Doshi is a Staff Data Scientist at Intuit and an AI Lead with a strong background in Computer Science. With over 6 years of relevant experience, he has a proven track record of building end-to-end machine learning models that significantly improve business metrics, from reducing fraud to saving millions of dollars. Holistic Evaluation of Generative AI Systems // MLOps Podcast #280 with Jineet Doshi, Staff AI Scientist or AI Lead at Intuit. // Abstract Evaluating LLMs is essential in establishing trust before deploying them to production. Even post deployment, evaluation is essential to ensure LLM outputs meet expectations, making it a foundational part of LLMOps. However, evaluating LLMs remains an open problem. Unlike traditional machine learning models, LLMs can perform a wide variety of tasks such as writing poems, Q&A, summarization etc. This leads to the question how do you evaluate a system with such broad intelligence capabilities? This talk covers the various approaches for evaluating LLMs such as classic NLP techniques, red teaming and newer ones like using LLMs as a judge, along with the pros and cons of each. The talk includes evaluation of complex GenAI systems like RAG and Agents. It also covers evaluating LLMs for safety and security and the need to have a holistic approach for evaluating these very capable models. // Bio Jineet Doshi is an award winning AI Lead and Engineer with over 7 years of experience. He has a proven track record of leading successful AI projects and building machine learning models from design to production across various domains, which have impacted millions of customers and have significantly improved business metrics, leading to millions of dollars of impact. He is currently an AI Lead at Intuit where he is one of the architects and developers of their Generative AI platform, which is serving Generative AI experiences for more than 100 million customers around the world. Jineet is also a guest lecturer at Stanford University as part of their building LLM Applications class. He is on the Advisory Board of University of San Francisco's AI Program. He holds multiple patents in the field, is on the steering committee of MLOps World Conference and has also co chaired workshops at top AI conferences like KDD. He holds a Masters degree from Carnegie Mellon university. // MLOps Swag/Merch https://shop.mlops.community/ // Related Links Website: https://www.intuit.com/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Jineet on LinkedIn: https://www.linkedin.com/in/jineetdoshi/
Let us know your thoughts. Send us a Text Message. Follow me to see #HeadsTalk Podcast Audiograms every Monday on LinkedInEpisode Title:
In this episode of Data Science at Home, we're diving deep into the powerful strategies that top AI companies, like OpenAI, use to scale their systems to handle millions of requests every minute! From stateless services and caching to the secrets of async processing, discover 8 essential strategies to make your AI and machine learning systems unstoppable. Whether you're working with traditional ML models or large LLMs, these techniques will transform your infrastructure. Hit play to learn how the pros do it and apply it to your own projects! LISTEN / SUBSCRIBE TO THE PODCAST YouTube: https://www.youtube.com/@DataScienceatHome Apple Podcasts: https://podcasts.apple.com/us/podcast/data-science-at-home/id1069871378 Podbean Podcasts: https://datascienceathome.podbean.com/ Player Fm: https://player.fm/series/data-science-at-home-2600992 Chapters 00:00 Intro 00:34 Scalability Strategies 01:08 Stateless Services 02:47 Horizontal Scaling 04:51 Load Balancing 06:14 Auto Scaling 07:41 Caching 09:27 Database Replication 11:07 Database Sharding 12:54 Async Processing 14:50 Infographics RESOURCES & LINKS Data Science at home: https://datascienceathome.com Amethix Technologies: https://amethix.com CONNECT WITH US! Instagram: https://www.instagram.com/datascienceathome/ Twitter: @datascienceathome Facebook: https://www.facebook.com/datascienceAH LinkedIn: https://www.linkedin.com/company/data-science-at-home-podcast Discord Channel: https://discord.gg/4UNKGf3 NEW TO DATA SCIENCE AT HOME? Welcome! Data Science at Home explores the latest in AI, data science, and machine learning. Whether you're a data professional, tech enthusiast, or just curious about the field, our podcast delivers insights, interviews, and discussions. Learn more at https://datascienceathome.com SEND US MAIL! We love hearing from you! Send us mail at: hello@datascienceathome.com
In this episode of The Cognitive Revolution, Nathan explores groundbreaking perspectives on AI alignment with MIT PhD student Tan Zhi Xuan. We dive deep into Xuan's critique of preference-based AI alignment and their innovative proposal for role-based AI systems guided by social consensus. The conversation extends into their fascinating work on how AI agents can learn social norms through Bayesian rule induction. Join us for an intellectually stimulating discussion that bridges philosophical theory with practical implementation in AI development. Check out: "Beyond Preferences in AI Alignment" paper: https://arxiv.org/pdf/2408.16984 "Learning and Sustaining Shared Normative Systems via Bayesian Rule Induction in Markov Games" paper: https://arxiv.org/pdf/2402.13399 Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse SPONSORS: Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today. Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers13. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive RECOMMENDED PODCAST: Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more. Apple: https://podcasts.apple.com/us/podcast/id1765716600 Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg CHAPTERS: (00:00:00) Teaser (00:01:09) About the Episode (00:04:25) Guest Intro (00:06:25) Xuan's Background (00:12:03) AI Near-Term Outlook (00:17:32) Sponsors: Notion | Weights & Biases RAG++ (00:20:18) Alignment Approaches (00:26:11) Critiques of RLHF (00:34:40) Sponsors: Oracle Cloud Infrastructure (OCI) (00:35:50) Beyond Preferences (00:40:27) Roles and AI Systems (00:45:19) What AI Owes Us (00:51:52) Drexler's AI Services (01:01:08) Constitutional AI (01:09:43) Technical Approach (01:22:01) Norms and Deviations (01:32:31) Norm Decay (01:38:06) Self-Other Overlap (01:44:05) Closing Thoughts (01:54:23) Outro SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://www.linkedin.com/in/nathanlabenz/ Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk
Nora Belrose, Head of Interpretability Research at EleutherAI, discusses critical challenges in AI safety and development. The conversation begins with her technical work on concept erasure in neural networks through LEACE (LEAst-squares Concept Erasure), while highlighting how neural networks' progression from simple to complex learning patterns could have important implications for AI safety. Many fear that advanced AI will pose an existential threat -- pursuing its own dangerous goals once it's powerful enough. But Belrose challenges this popular doomsday scenario with a fascinating breakdown of why it doesn't add up. Belrose also provides a detailed critique of current AI alignment approaches, particularly examining "counting arguments" and their limitations when applied to AI safety. She argues that the Principle of Indifference may be insufficient for addressing existential risks from advanced AI systems. The discussion explores how emergent properties in complex AI systems could lead to unpredictable and potentially dangerous behaviors that simple reductionist approaches fail to capture. The conversation concludes by exploring broader philosophical territory, where Belrose discusses her growing interest in Buddhism's potential relevance to a post-automation future. She connects concepts of moral anti-realism with Buddhist ideas about emptiness and non-attachment, suggesting these frameworks might help humans find meaning in a world where AI handles most practical tasks. Rather than viewing this automated future with alarm, she proposes that Zen Buddhism's emphasis on spontaneity and presence might complement a society freed from traditional labor. SPONSOR MESSAGES: CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/ Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on ARC and AGI, they just acquired MindsAI - the current winners of the ARC challenge. Are you interested in working on ARC, or getting involved in their events? Goto https://tufalabs.ai/ Nora Belrose: https://norabelrose.com/ https://scholar.google.com/citations?user=p_oBc64AAAAJ&hl=en https://x.com/norabelrose SHOWNOTES: https://www.dropbox.com/scl/fi/38fhsv2zh8gnubtjaoq4a/NORA_FINAL.pdf?rlkey=0e5r8rd261821g1em4dgv0k70&st=t5c9ckfb&dl=0 TOC: 1. Neural Network Foundations [00:00:00] 1.1 Philosophical Foundations and Neural Network Simplicity Bias [00:02:20] 1.2 LEACE and Concept Erasure Fundamentals [00:13:16] 1.3 LISA Technical Implementation and Applications [00:18:50] 1.4 Practical Implementation Challenges and Data Requirements [00:22:13] 1.5 Performance Impact and Limitations of Concept Erasure 2. Machine Learning Theory [00:32:23] 2.1 Neural Network Learning Progression and Simplicity Bias [00:37:10] 2.2 Optimal Transport Theory and Image Statistics Manipulation [00:43:05] 2.3 Grokking Phenomena and Training Dynamics [00:44:50] 2.4 Texture vs Shape Bias in Computer Vision Models [00:45:15] 2.5 CNN Architecture and Shape Recognition Limitations 3. AI Systems and Value Learning [00:47:10] 3.1 Meaning, Value, and Consciousness in AI Systems [00:53:06] 3.2 Global Connectivity vs Local Culture Preservation [00:58:18] 3.3 AI Capabilities and Future Development Trajectory 4. Consciousness Theory [01:03:03] 4.1 4E Cognition and Extended Mind Theory [01:09:40] 4.2 Thompson's Views on Consciousness and Simulation [01:12:46] 4.3 Phenomenology and Consciousness Theory [01:15:43] 4.4 Critique of Illusionism and Embodied Experience [01:23:16] 4.5 AI Alignment and Counting Arguments Debate (TRUNCATED, TOC embedded in MP3 file with more information)
The talented author, Graham Brown, continuing the #1 NY Times Clive Cussler series, dives down deep with BOOKSTORM! The NUMA crew face swarms of deadly bio-hacked sea locusts, a runaway AI System, and a sinister cult! We talk about the benefits and dangers of AI and the point in which computers and human minds, and AI merge! Can we eliminate fundamental human needs like freedom, hope, autonomy and safety? Humankind's delicate balance with our ecosystem. The possibility of a new and different kind of virus targeting sea life. The real-life development of functional cognitive implants! DO NOT MISS THIS entertaining, fascinating and terrifying real novel! You can find more of your favorite bestselling authors at BOOKSTORM Podcast! We're also on Instagram, TikTok, Facebook, and YouTube!
Prof. Gennady Pekhimenko (CEO of CentML, UofT) joins us in this *sponsored episode* to dive deep into AI system optimization and enterprise implementation. From NVIDIA's technical leadership model to the rise of open-source AI, Pekhimenko shares insights on bridging the gap between academic research and industrial applications. Learn about "dark silicon," GPU utilization challenges in ML workloads, and how modern enterprises can optimize their AI infrastructure. The conversation explores why some companies achieve only 10% GPU efficiency and practical solutions for improving AI system performance. A must-watch for anyone interested in the technical foundations of enterprise AI and hardware optimization. CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. Cheaper, faster, no commitments, pay as you go, scale massively, simple to setup. Check it out! https://centml.ai/pricing/ SPONSOR MESSAGES: MLST is also sponsored by Tufa AI Labs - https://tufalabs.ai/ They are hiring cracked ML engineers/researchers to work on ARC and build AGI! SHOWNOTES (diarised transcript, TOC, references, summary, best quotes etc) https://www.dropbox.com/scl/fi/w9kbpso7fawtm286kkp6j/Gennady.pdf?rlkey=aqjqmncx3kjnatk2il1gbgknk&st=2a9mccj8&dl=0 TOC: 1. AI Strategy and Leadership [00:00:00] 1.1 Technical Leadership and Corporate Structure [00:09:55] 1.2 Open Source vs Proprietary AI Models [00:16:04] 1.3 Hardware and System Architecture Challenges [00:23:37] 1.4 Enterprise AI Implementation and Optimization [00:35:30] 1.5 AI Reasoning Capabilities and Limitations 2. AI System Development [00:38:45] 2.1 Computational and Cognitive Limitations of AI Systems [00:42:40] 2.2 Human-LLM Communication Adaptation and Patterns [00:46:18] 2.3 AI-Assisted Software Development Challenges [00:47:55] 2.4 Future of Software Engineering Careers in AI Era [00:49:49] 2.5 Enterprise AI Adoption Challenges and Implementation 3. ML Infrastructure Optimization [00:54:41] 3.1 MLOps Evolution and Platform Centralization [00:55:43] 3.2 Hardware Optimization and Performance Constraints [01:05:24] 3.3 ML Compiler Optimization and Python Performance [01:15:57] 3.4 Enterprise ML Deployment and Cloud Provider Partnerships 4. Distributed AI Architecture [01:27:05] 4.1 Multi-Cloud ML Infrastructure and Optimization [01:29:45] 4.2 AI Agent Systems and Production Readiness [01:32:00] 4.3 RAG Implementation and Fine-Tuning Considerations [01:33:45] 4.4 Distributed AI Systems Architecture and Ray Framework 5. AI Industry Standards and Research [01:37:55] 5.1 Origins and Evolution of MLPerf Benchmarking [01:43:15] 5.2 MLPerf Methodology and Industry Impact [01:50:17] 5.3 Academic Research vs Industry Implementation in AI [01:58:59] 5.4 AI Research History and Safety Concerns
Eliezer Yudkowsky and Stephen Wolfram discuss artificial intelligence and its potential existen‑ tial risks. They traversed fundamental questions about AI safety, consciousness, computational irreducibility, and the nature of intelligence. The discourse centered on Yudkowsky's argument that advanced AI systems pose an existential threat to humanity, primarily due to the challenge of alignment and the potential for emergent goals that diverge from human values. Wolfram, while acknowledging potential risks, approached the topic from a his signature measured perspective, emphasizing the importance of understanding computational systems' fundamental nature and questioning whether AI systems would necessarily develop the kind of goal‑directed behavior Yudkowsky fears. *** MLST IS SPONSORED BY TUFA AI LABS! The current winners of the ARC challenge, MindsAI are part of Tufa AI Labs. They are hiring ML engineers. Are you interested?! Please goto https://tufalabs.ai/ *** TOC: 1. Foundational AI Concepts and Risks [00:00:01] 1.1 AI Optimization and System Capabilities Debate [00:06:46] 1.2 Computational Irreducibility and Intelligence Limitations [00:20:09] 1.3 Existential Risk and Species Succession [00:23:28] 1.4 Consciousness and Value Preservation in AI Systems 2. Ethics and Philosophy in AI [00:33:24] 2.1 Moral Value of Human Consciousness vs. Computation [00:36:30] 2.2 Ethics and Moral Philosophy Debate [00:39:58] 2.3 Existential Risks and Digital Immortality [00:43:30] 2.4 Consciousness and Personal Identity in Brain Emulation 3. Truth and Logic in AI Systems [00:54:39] 3.1 AI Persuasion Ethics and Truth [01:01:48] 3.2 Mathematical Truth and Logic in AI Systems [01:11:29] 3.3 Universal Truth vs Personal Interpretation in Ethics and Mathematics [01:14:43] 3.4 Quantum Mechanics and Fundamental Reality Debate 4. AI Capabilities and Constraints [01:21:21] 4.1 AI Perception and Physical Laws [01:28:33] 4.2 AI Capabilities and Computational Constraints [01:34:59] 4.3 AI Motivation and Anthropomorphization Debate [01:38:09] 4.4 Prediction vs Agency in AI Systems 5. AI System Architecture and Behavior [01:44:47] 5.1 Computational Irreducibility and Probabilistic Prediction [01:48:10] 5.2 Teleological vs Mechanistic Explanations of AI Behavior [02:09:41] 5.3 Machine Learning as Assembly of Computational Components [02:29:52] 5.4 AI Safety and Predictability in Complex Systems 6. Goal Optimization and Alignment [02:50:30] 6.1 Goal Specification and Optimization Challenges in AI Systems [02:58:31] 6.2 Intelligence, Computation, and Goal-Directed Behavior [03:02:18] 6.3 Optimization Goals and Human Existential Risk [03:08:49] 6.4 Emergent Goals and AI Alignment Challenges 7. AI Evolution and Risk Assessment [03:19:44] 7.1 Inner Optimization and Mesa-Optimization Theory [03:34:00] 7.2 Dynamic AI Goals and Extinction Risk Debate [03:56:05] 7.3 AI Risk and Biological System Analogies [04:09:37] 7.4 Expert Risk Assessments and Optimism vs Reality 8. Future Implications and Economics [04:13:01] 8.1 Economic and Proliferation Considerations SHOWNOTES (transcription, references, summary, best quotes etc): https://www.dropbox.com/scl/fi/3st8dts2ba7yob161dchd/EliezerWolfram.pdf?rlkey=b6va5j8upgqwl9s2muc924vtt&st=vemwqx7a&dl=0
AI is a frequently used term that is only sometimes fully understood in the workplace or legal context. Since no single type of AI performs all functions, employers must identify which type or subset of AI—such as generative AI, machine learning, predictive analytics, or natural language processing—their organization is using or considering. Additionally, understanding how various laws and regulations may impact AI's use is crucial.
Will AI take my job? How does AI funding affect the output? Will AI have a role to play in the church? How are you personally leveraging AI to make your life more effective? These are just a handful of the AI questions you wanted answered. Alongside AI expert and entrepreneur, Chad Reynolds, Brian talks about the impact of AI on his life, his faith, and the future. If you are fearful about AI, overjoyed at the possibilities, or somewhere in between, this aggressive Q&A episode will push you to keep moving forward.