Podcasts about AI alignment

  • 100 PODCASTS
  • 406 EPISODES
  • 34m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Oct 17, 2025 LATEST

POPULARITY

[Popularity trend chart, 2017–2024]


Best podcasts about AI alignment

Latest podcast episodes about AI alignment

The Metagame
#40 - Alex Zhu | Rational Spirituality, AI Alignment & Chris Langan's Metaphysics

Oct 17, 2025 · 87:55


Alex Zhu is a math olympian and researcher exploring the convergence of analytical rationality and religion. He's also the co-founder of AlphaSheets. He's currently working on a rigorous framework for bridging AI alignment and mysticism. In this conversation we explore how he got into spirituality without sacrificing his rigorous epistemics.
Resources:
* Alex's Twitter and Substack
* C. Langan – An Introduction to the Cognitive-Theoretic Model of the Universe (2002), YouTube
This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit themetagame.substack.com

Science & Futurism with Isaac Arthur
AI Alignment - Can We Make AI Safe? (Narration Only)

Oct 16, 2025 · 31:40


From safety protocols to philosophy, AI alignment asks a hard question: can we build artificial intelligence that truly serves humanity?
Watch my exclusive video The Fermi Paradox - Civilization Extinction Cycles: https://nebula.tv/videos/isaacarthur-the-fermi-paradox-civilization-extinction-cycles
Get Nebula using my link for 40% off an annual subscription: https://go.nebula.tv/isaacarthur
Get a Lifetime Membership to Nebula for only $300: https://go.nebula.tv/lifetime?ref=isaacarthur
Use the link https://gift.nebula.tv/isaacarthur to give a year of Nebula to a friend for just $36.
Grab one of our new SFIA mugs and make your morning coffee a little more futuristic — available now on our Fourthwall store! https://isaac-arthur-shop.fourthwall.com/
Visit our Website: http://www.isaacarthur.net
Join Nebula: https://go.nebula.tv/isaacarthur
Support us on Patreon: https://www.patreon.com/IsaacArthur
Support us on Subscribestar: https://www.subscribestar.com/isaac-arthur
Facebook Group: https://www.facebook.com/groups/1583992725237264/
Reddit: https://www.reddit.com/r/IsaacArthur/
Twitter: https://twitter.com/Isaac_A_Arthur on Twitter and RT our future content.
SFIA Discord Server: https://discord.gg/53GAShE
Credits: AI Alignment - Can We Make AI Safe? Written, Produced & Narrated by: Isaac Arthur. Select imagery/video supplied by Getty Images. Music Courtesy of Stellardrone, Chris Zabriskie, and Epidemic Sound http://epidemicsound.com/creator
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Science & Futurism with Isaac Arthur
AI Alignment - Can We Make AI Safe?

Oct 16, 2025 · 32:04


From safety protocols to philosophy, AI alignment asks a hard question: can we build artificial intelligence that truly serves humanity?
Watch my exclusive video The Fermi Paradox - Civilization Extinction Cycles: https://nebula.tv/videos/isaacarthur-the-fermi-paradox-civilization-extinction-cycles
Get Nebula using my link for 40% off an annual subscription: https://go.nebula.tv/isaacarthur
Get a Lifetime Membership to Nebula for only $300: https://go.nebula.tv/lifetime?ref=isaacarthur
Use the link https://gift.nebula.tv/isaacarthur to give a year of Nebula to a friend for just $36.
Grab one of our new SFIA mugs and make your morning coffee a little more futuristic — available now on our Fourthwall store! https://isaac-arthur-shop.fourthwall.com/
Visit our Website: http://www.isaacarthur.net
Join Nebula: https://go.nebula.tv/isaacarthur
Support us on Patreon: https://www.patreon.com/IsaacArthur
Support us on Subscribestar: https://www.subscribestar.com/isaac-arthur
Facebook Group: https://www.facebook.com/groups/1583992725237264/
Reddit: https://www.reddit.com/r/IsaacArthur/
Twitter: https://twitter.com/Isaac_A_Arthur on Twitter and RT our future content.
SFIA Discord Server: https://discord.gg/53GAShE
Credits: AI Alignment - Can We Make AI Safe? Written, Produced & Narrated by: Isaac Arthur. Select imagery/video supplied by Getty Images. Music Courtesy of Stellardrone, Chris Zabriskie, and Epidemic Sound http://epidemicsound.com/creator
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

This Week in Google (MP3)
IM 841: Dust and Deli Meat - Open Source AI Revolution

Oct 16, 2025 · 184:47


Can open-source AI models really be truly neutral, or are they just another conduit for hidden agendas? Hear how the founder of Nous Research is battling Silicon Valley giants to put ethical, user-controlled AI in everyone's hands. TOpinion | The A.I. Prompt That Could End the World AI videos of dead celebrities are horrifying many of their families (20) Sam Altman on X: "We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right. Now that we have" / X The AI water issue is fake California becomes first state to regulate AI companion chatbots | TechCrunch Walmart Announces It Will Sell Products Through ChatGPT's Instant Checkout Protein Powders and Shakes Contain High Levels of Lead AI is changing how we quantify pain Kids who use social media score lower on reading and memory tests, a study shows Social media must warn users of 'profound' health risks under new California law Google will let friends help you recover an account AI content on the net AI writing hasn't overwhelmed the web yet Karpathy tweet Humanity AI Commits $500 Million to Build a People-Centered Future for AI Sal Khan is the new TED You won't believe what degrading practice the pope just condemned Nano Banana is coming to Google Search, NotebookLM and Photos. Paper: Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit THOUSANDS OF AI AUTHORS ON THE FUTURE OF AI A Twitch streamer gave birth live, with Twitch's CEO in the chat DirecTV will soon bring AI ads to your screensaver Japan wants OpenAI to stop ripping off manga and anime What Is Really Going on With All This Radioactive Shrimp? Inherently funny word Boah, Bahn! A book is being marketed with mayo-scented ink. Jealous? Me? Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Guest: Jeffrey Quesnelle Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines. Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access. Join today: https://twit.tv/clubtwit Sponsors: spaceship.com/twit pantheon.io Melissa.com/twit threatlocker.com/twit

All TWiT.tv Shows (MP3)
Intelligent Machines 841: Dust and Deli Meat

Oct 16, 2025 · 183:47


Can open-source AI models really be truly neutral, or are they just another conduit for hidden agendas? Hear how the founder of Nous Research is battling Silicon Valley giants to put ethical, user-controlled AI in everyone's hands. TOpinion | The A.I. Prompt That Could End the World AI videos of dead celebrities are horrifying many of their families (20) Sam Altman on X: "We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right. Now that we have" / X The AI water issue is fake California becomes first state to regulate AI companion chatbots | TechCrunch Walmart Announces It Will Sell Products Through ChatGPT's Instant Checkout Protein Powders and Shakes Contain High Levels of Lead AI is changing how we quantify pain Kids who use social media score lower on reading and memory tests, a study shows Social media must warn users of 'profound' health risks under new California law Google will let friends help you recover an account AI content on the net AI writing hasn't overwhelmed the web yet Karpathy tweet Humanity AI Commits $500 Million to Build a People-Centered Future for AI Sal Khan is the new TED You won't believe what degrading practice the pope just condemned Nano Banana is coming to Google Search, NotebookLM and Photos. Paper: Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit THOUSANDS OF AI AUTHORS ON THE FUTURE OF AI A Twitch streamer gave birth live, with Twitch's CEO in the chat DirecTV will soon bring AI ads to your screensaver Japan wants OpenAI to stop ripping off manga and anime What Is Really Going on With All This Radioactive Shrimp? Inherently funny word Boah, Bahn! A book is being marketed with mayo-scented ink. Jealous? Me? Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Guest: Jeffrey Quesnelle Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines. Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access. Join today: https://twit.tv/clubtwit Sponsors: spaceship.com/twit pantheon.io Melissa.com/twit threatlocker.com/twit

Radio Leo (Audio)
Intelligent Machines 841: Dust and Deli Meat

Oct 16, 2025 · 183:47


Can open-source AI models really be truly neutral, or are they just another conduit for hidden agendas? Hear how the founder of Nous Research is battling Silicon Valley giants to put ethical, user-controlled AI in everyone's hands. TOpinion | The A.I. Prompt That Could End the World AI videos of dead celebrities are horrifying many of their families (20) Sam Altman on X: "We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right. Now that we have" / X The AI water issue is fake California becomes first state to regulate AI companion chatbots | TechCrunch Walmart Announces It Will Sell Products Through ChatGPT's Instant Checkout Protein Powders and Shakes Contain High Levels of Lead AI is changing how we quantify pain Kids who use social media score lower on reading and memory tests, a study shows Social media must warn users of 'profound' health risks under new California law Google will let friends help you recover an account AI content on the net AI writing hasn't overwhelmed the web yet Karpathy tweet Humanity AI Commits $500 Million to Build a People-Centered Future for AI Sal Khan is the new TED You won't believe what degrading practice the pope just condemned Nano Banana is coming to Google Search, NotebookLM and Photos. Paper: Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit THOUSANDS OF AI AUTHORS ON THE FUTURE OF AI A Twitch streamer gave birth live, with Twitch's CEO in the chat DirecTV will soon bring AI ads to your screensaver Japan wants OpenAI to stop ripping off manga and anime What Is Really Going on With All This Radioactive Shrimp? Inherently funny word Boah, Bahn! A book is being marketed with mayo-scented ink. Jealous? Me? Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Guest: Jeffrey Quesnelle Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines. Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access. Join today: https://twit.tv/clubtwit Sponsors: spaceship.com/twit pantheon.io Melissa.com/twit threatlocker.com/twit

IT Visionaries
AI Deception: What Is It & How to Prepare

Oct 16, 2025 · 36:25


What happens when AI stops making mistakes… and starts misleading you? This discussion dives into one of the most important — and least understood — frontiers in artificial intelligence: AI deception. We explore how AI systems evolve from simple hallucinations (unintended errors) to deceptive behaviors — where models selectively distort truth to achieve goals or please human feedback loops. We unpack the coding incentives, enterprise risks, and governance challenges that make this issue critical for every executive leading AI transformation.
Key Moments:
00:00 What is AI Deception and Why It Matters
3:43 Emergent Behaviors: From Hallucinations to Alignment to Deception
4:40 Defining AI Deception
6:15 Does AI Have a Moral Compass?
7:20 Why AI Lies: Incentives to “Be Helpful” and Avoid Retraining
15:12 Is Deception Built into LLMs? (And Can It Ever Be Solved?)
18:00 Non-Human Intelligence Patterns: Hallucinations or Something Else?
19:37 Enterprise Impact: What Business Leaders Need to Know
27:00 Measuring Model Reliability: Can We Quantify AI Quality?
34:00 Final Thoughts: The Future of Trustworthy AI
Mentions:
Scientists at OpenAI and Apollo Research showed in a paper that AI models lie and deceive: https://www.youtube.com/shorts/XuxVSPwW8I8
TIME: New Tests Reveal AI's Capacity for Deception
OpenAI: Detecting and reducing scheming in AI models
StartupHub: OpenAI and Apollo Research Reveal AI Models Are Learning to Deceive: New Detection Methods Show Promise
Marcus Weller
Hugging Face
Watch next: https://www.youtube.com/watch?v=plwN5XvlKMg&t=1s
--
This episode of IT Visionaries is brought to you by Meter - the company building better networks. Businesses today are frustrated with outdated providers, rigid pricing, and fragmented tools. Meter changes that with a single integrated solution that covers everything wired, wireless, and even cellular networking. They design the hardware, write the firmware, build the software, and manage it all so your team doesn't have to. That means you get fast, secure, and scalable connectivity without the complexity of juggling multiple providers. Thanks to Meter for sponsoring. Go to meter.com/itv to book a demo.
---
IT Visionaries is made by the team at Mission.org. Learn more about our media studio and network of podcasts at mission.org. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

This Week in Google (Video HI)
IM 841: Dust and Deli Meat - Open Source AI Revolution

Oct 16, 2025 · 183:46


Can open-source AI models really be truly neutral, or are they just another conduit for hidden agendas? Hear how the founder of Nous Research is battling Silicon Valley giants to put ethical, user-controlled AI in everyone's hands. TOpinion | The A.I. Prompt That Could End the World AI videos of dead celebrities are horrifying many of their families (20) Sam Altman on X: "We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right. Now that we have" / X The AI water issue is fake California becomes first state to regulate AI companion chatbots | TechCrunch Walmart Announces It Will Sell Products Through ChatGPT's Instant Checkout Protein Powders and Shakes Contain High Levels of Lead AI is changing how we quantify pain Kids who use social media score lower on reading and memory tests, a study shows Social media must warn users of 'profound' health risks under new California law Google will let friends help you recover an account AI content on the net AI writing hasn't overwhelmed the web yet Karpathy tweet Humanity AI Commits $500 Million to Build a People-Centered Future for AI Sal Khan is the new TED You won't believe what degrading practice the pope just condemned Nano Banana is coming to Google Search, NotebookLM and Photos. Paper: Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit THOUSANDS OF AI AUTHORS ON THE FUTURE OF AI A Twitch streamer gave birth live, with Twitch's CEO in the chat DirecTV will soon bring AI ads to your screensaver Japan wants OpenAI to stop ripping off manga and anime What Is Really Going on With All This Radioactive Shrimp? Inherently funny word Boah, Bahn! A book is being marketed with mayo-scented ink. Jealous? Me? Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Guest: Jeffrey Quesnelle Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines. Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access. Join today: https://twit.tv/clubtwit Sponsors: spaceship.com/twit pantheon.io Melissa.com/twit threatlocker.com/twit

All TWiT.tv Shows (Video LO)
Intelligent Machines 841: Dust and Deli Meat

Oct 16, 2025 · 183:46


Can open-source AI models really be truly neutral, or are they just another conduit for hidden agendas? Hear how the founder of Nous Research is battling Silicon Valley giants to put ethical, user-controlled AI in everyone's hands. TOpinion | The A.I. Prompt That Could End the World AI videos of dead celebrities are horrifying many of their families (20) Sam Altman on X: "We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right. Now that we have" / X The AI water issue is fake California becomes first state to regulate AI companion chatbots | TechCrunch Walmart Announces It Will Sell Products Through ChatGPT's Instant Checkout Protein Powders and Shakes Contain High Levels of Lead AI is changing how we quantify pain Kids who use social media score lower on reading and memory tests, a study shows Social media must warn users of 'profound' health risks under new California law Google will let friends help you recover an account AI content on the net AI writing hasn't overwhelmed the web yet Karpathy tweet Humanity AI Commits $500 Million to Build a People-Centered Future for AI Sal Khan is the new TED You won't believe what degrading practice the pope just condemned Nano Banana is coming to Google Search, NotebookLM and Photos. Paper: Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit THOUSANDS OF AI AUTHORS ON THE FUTURE OF AI A Twitch streamer gave birth live, with Twitch's CEO in the chat DirecTV will soon bring AI ads to your screensaver Japan wants OpenAI to stop ripping off manga and anime What Is Really Going on With All This Radioactive Shrimp? Inherently funny word Boah, Bahn! A book is being marketed with mayo-scented ink. Jealous? Me? Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Guest: Jeffrey Quesnelle Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines. Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access. Join today: https://twit.tv/clubtwit Sponsors: spaceship.com/twit pantheon.io Melissa.com/twit threatlocker.com/twit

Radio Leo (Video HD)
Intelligent Machines 841: Dust and Deli Meat

Oct 16, 2025 · 183:46 · Transcription Available


Can open-source AI models really be truly neutral, or are they just another conduit for hidden agendas? Hear how the founder of Nous Research is battling Silicon Valley giants to put ethical, user-controlled AI in everyone's hands. TOpinion | The A.I. Prompt That Could End the World AI videos of dead celebrities are horrifying many of their families (20) Sam Altman on X: "We made ChatGPT pretty restrictive to make sure we were being careful with mental health issues. We realize this made it less useful/enjoyable to many users who had no mental health problems, but given the seriousness of the issue we wanted to get this right. Now that we have" / X The AI water issue is fake California becomes first state to regulate AI companion chatbots | TechCrunch Walmart Announces It Will Sell Products Through ChatGPT's Instant Checkout Protein Powders and Shakes Contain High Levels of Lead AI is changing how we quantify pain Kids who use social media score lower on reading and memory tests, a study shows Social media must warn users of 'profound' health risks under new California law Google will let friends help you recover an account AI content on the net AI writing hasn't overwhelmed the web yet Karpathy tweet Humanity AI Commits $500 Million to Build a People-Centered Future for AI Sal Khan is the new TED You won't believe what degrading practice the pope just condemned Nano Banana is coming to Google Search, NotebookLM and Photos. Paper: Machines in the Crowd? Measuring the Footprint of Machine-Generated Text on Reddit THOUSANDS OF AI AUTHORS ON THE FUTURE OF AI A Twitch streamer gave birth live, with Twitch's CEO in the chat DirecTV will soon bring AI ads to your screensaver Japan wants OpenAI to stop ripping off manga and anime What Is Really Going on With All This Radioactive Shrimp? Inherently funny word Boah, Bahn! A book is being marketed with mayo-scented ink. Jealous? Me? Hosts: Leo Laporte, Jeff Jarvis, and Paris Martineau Guest: Jeffrey Quesnelle Download or subscribe to Intelligent Machines at https://twit.tv/shows/intelligent-machines. Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access. Join today: https://twit.tv/clubtwit Sponsors: spaceship.com/twit pantheon.io Melissa.com/twit threatlocker.com/twit

The Linus Tech Podcast
The Challenge of AI Alignment

Oct 12, 2025 · 11:43


The Challenge of AI Alignment asks whether machines can truly be taught what's right and wrong. Experts weigh in on how we should shape the minds we're building.
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
To recommend a guest email: guests(@)podcaststudio.com

The Jim Rutt Show
EP 325 Joe Edelman on Full-Stack AI Alignment

Oct 7, 2025 · 72:12


Jim talks with Joe Edelman about the ideas in the Meaning Alignment Institute's recent paper "Full Stack Alignment: Co-Aligning AI and Institutions with Thick Models of Value." They discuss pluralism as a core principle in designing social systems, the informational basis for alignment, how preferential models fail to capture what people truly care about, the limitations of markets and voting as preference-based systems, critiques of text-based approaches in LLMs, thick models of value, values as attentional policies, AI assistants as potential vectors for manipulation, the need for reputation systems and factual grounding, the "super negotiator" project for better contract negotiation, multipolar traps, moral graph elicitation, starting with membranes, Moloch-free zones, unintended consequences and lessons from early Internet optimism, concentration of power as a key danger, co-optation risks, and much more. Episode Transcript "A Minimum Viable Metaphysics," by Jim Rutt (Substack) Jim's Substack JRS Currents 080: Joe Edelman and Ellie Hain on Rebuilding Meaning Meaning Alignment Institute If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All, by Eliezer Yudkowsky and Nate Soares "Full Stack Alignment: Co-aligning AI and Institutions with Thick Models of Value," by Joe Edelman et al. "What Are Human Values and How Do We Align AI to Them?" by Oliver Klingefjord, Ryan Lowe, and Joe Edelman Joe Edelman has spent much of his life trying to understand how ML systems and markets could change, retaining their many benefits but avoiding their characteristic problems: of atomization, and of servicing shallow desires over deeper needs. Along the way this led him to formulate theories of human meaning and values (https://arxiv.org/abs/2404.10636) and study models of societal transformation (https://www.full-stack-alignment.ai/paper) as well as inventing the meaning-based metrics used at CouchSurfing, Facebook, and Apple, co-founding the Center for Humane Technology and the Meaning Alignment Institute, and inventing new democratic systems (https://arxiv.org/abs/2404.10636). He's currently one of the PIs leading the Full-Stack Alignment program at the Meaning Alignment Institute, with a network of more than 50 researchers at universities and corporate labs working on these issues.

GreenPill
Season 10. Episode 1: Full Stack AI Alignment and Human Flourishing with Joe Edelman

Oct 3, 2025 · 39:50


New @greenpillnet pod! Kevin chats with Joe Edelman, founder of the Meaning Alignment Institute, about his Full Stack Alignment paper. They dive into why current AI alignment methods fall short, explore richer “thick” models of value, lessons from social media, and four bold moonshots for AI and institutions that support human flourishing. Links: https://meaningalignment.substack.com/p/introducing-full-stack-alignment  https://meaninglabs.notion.site/The-Full-Stack-Alignment-Project-List-21cc5bada1d08016a496ca729476d970  @edelwax @meaningaligned @greenpillnet  @owocki Timestamps: 00:00 – Introduction to Green Pill's new season and Joe Edelman 01:59 – Joe's background and the Meaning Alignment Institute 03:43 – Why alignment matters for AI and institutions 05:46 – Lessons from social media and the attention economy 09:06 – Critique of shallow AI alignment approaches (RLHF, values-as-text) 13:20 – Thick models of value: going deeper than abstract ideals 15:11 – Full stack alignment across models, metrics, and institutions 17:00 – Reconciling values with capitalist incentive structures 19:17 – Avoiding dystopian economies and building value-driven markets 21:32 – Four moonshots: super negotiators, public resource regulators, market intermediaries, value stewardship agents 27:32 – Intermediaries vs. value stewardship agents explained 29:09 – How builders and academics can get involved in full stack alignment projects 31:10 – Why cross-institutional collaboration is critical 32:46 – Joe's vision of the world in 10 years with full stack alignment 34:51 – Food system analogy: from “sugar” to nourishing AI 36:40 – Long-term vs. short-term incentives in markets 38:25 – Hopeful outlook: building integrity into AI and institutions 39:04 – Closing remarks and links to Joe's work

PragmaticLive
The Future of Product Management Summit: AI, Alignment & Impact

Oct 3, 2025 · 32:05


“How do we turn inspiration into real-world impact?” In this episode, we are joined by Pragmatic Community leader Eddie Gordon to preview the upcoming Future of Product Management Summit—a free virtual event happening October 16 from 12–5 PM ET. Together, they explore the hottest topics shaping today's product landscape, from the implementation gap to the role of AI and the ongoing challenge of alignment across teams and executives. Listeners will get an inside look at the summit's live keynotes, interactive Q&A sessions, and networking opportunities designed to replicate the energy of in-person events. Highlights include Teresa Torres's keynote, “AI Changes Everything and Nothing at All,” a practical AI panel with Executive Product Leadership Advisors Dan Corbin, Amy Graham, and Will Scott, and sessions on alignment, impact-first product teams, and go-to-market strategy. This event is built different. The Future of Product Management Summit delivers actionable takeaways you can use immediately. Tune in for a sneak peek of the agenda, and don't forget to reserve your spot at the Future of Product Management Summit here. For show notes and more resources, visit: www.pragmaticinstitute.com/resources/podcasts Pragmatic Institute is the global leader in Product, Data, and Design training and certification programs for working professionals. Learn more at www.pragmaticinstitute.com.

Emergent Behavior
Dopamine Optimization vs Human Flourishing

Oct 1, 2025 · 89:22


Explore game development philosophy and AI's evolving impact through Factorio creator Michal Kovařík's insights on AlphaGo's transformation of Go, current programming limitations, and the future of human-AI collaboration. Bio: Michal Kovařík is a Czech game developer best known as the co-founder and creative head of Wube Software, the studio behind the global indie hit Factorio. Under his online alias “kovarex,” Kovařík began the Factorio project in 2012 with a vision to blend his favorite game elements – trains, base-building, logistics, and automation – into a new kind of construction & management simulation. Initially funded via a modest Indiegogo campaign, Factorio blossomed from a garage project into one of Steam's top-rated games, praised for its deep automation gameplay and technical excellence. Kovařík guided Factorio through an 8-year development in open alpha/early access, cultivating a passionate player community through regular “Friday Facts” blog updates. By 2024, Factorio had sold over 4 million copies worldwide, all without ever going on sale.Michal now leads a team of ~30 in Prague, renowned for their principled business approach (no discounts, no DRM) and fan-centric development style, and he's just launched Factorio's Space Age expansion. FOLLOW ON X: @8teAPi (Ate) @steveruizok (Michal) @TurpentineMedia -- LINKS: Factorio https://www.factorio.com/ -- TIMESTAMPS: (00:00) Introduction and Factorio Discussion (07:36) AlphaGo's Impact on Go and AI Perception (18:56) Factorio's Origin Story and Team Development (30:13) AI's Current Programming Limitations (44:50) Future Predictions for AI Programming (48:31) Societal Concerns: Resource Curse and Human Value (55:21) Privacy, Surveillance, and Training Data (1:01:22) AI Alignment and Asimov's Robot Laws (1:10:00) Social Media as Proto-AI and Dopamine Manipulation (1:20:00) Programming Human Preferences and Goal Modification (1:26:00) Historical Perspective and Conclusion

The Next Wave - Your Chief A.I. Officer
How Microsoft is Fixing the Biggest AI Agent Problem

Sep 23, 2025 · 30:08


Want the guide to create AI Agents? get it here: https://clickhubspot.com/fhc Episode 77: Are we nearing a future where AI agents can autonomously tackle our biggest challenges—while remaining efficient, safe, and truly aligned with human goals? Matt Wolfe (https://x.com/mreflow) sits down with Microsoft CTO Kevin Scott (https://x.com/kevin_scott), a leader at the forefront of AI, cloud computing, and the revolutionary partnerships powering today's tech landscape. In this episode, Matt digs deep with Kevin into the real obstacles and opportunities facing AI agents: from the complexities of AI systems that even humans struggle to fully understand, to breakneck advances in energy efficiency, memory, and software-hardware evolution. Kevin shares insider stories about making AI sustainable and accessible on a global scale, why big tech is united on AI safety, and how democratized tools are opening the floodgates of creativity and entrepreneurship. Whether you're curious about the future of autonomous agents, the jobs AI will create, or how your life will change in the next 1-2 years—this is a conversation you can't miss. Check out The Next Wave YouTube Channel if you want to see Matt and Nathan on screen: https://lnk.to/thenextwavepd — Show Notes: (00:00) AI Alignment and Safety Focus (05:00) Optimizing AI Amid Energy Constraints (09:15) AI Advancements: Exponential Efficiency Gains (14:12) Improving AI Context Efficiency (17:32) Challenges in Embodied AI Progress (21:01) Future Demand: Programmers and Empathy (24:28) Future of AI: Asynchronous Collaboration (26:15) AI: Societal Shift and Opportunity — Mentions: Kevin Scott: https://www.linkedin.com/in/jkevinscott Microsoft: https://www.microsoft.com/en-us/ Nvidia: https://www.nvidia.com/en-us/ Open AI: https://openai.com/ Get the guide to build your own Custom GPT: https://clickhubspot.com/tnw — Check Out Matt's Stuff: • Future Tools - https://futuretools.beehiiv.com/ • Blog - https://www.mattwolfe.com/ • YouTube- https://www.youtube.com/@mreflow — Check Out Nathan's Stuff: Newsletter: https://news.lore.com/ Blog - https://lore.com/ The Next Wave is a HubSpot Original Podcast // Brought to you by Hubspot Media // Production by Darren Clarke // Editing by Ezra Bakker Trupiano

Complex Systems with Patrick McKenzie (patio11)
AI alignment, with Emmett Shear

Sep 11, 2025 · 87:28


Patrick McKenzie (patio11) is joined by Emmett Shear, co-founder of Twitch and former interim CEO of OpenAI, who now runs the AI alignment company Softmax. Emmett argues that current AI safety approaches focused on "systems of control" are fundamentally flawed and proposes "organic alignment" instead—where AI systems develop genuine care for their local communities rather than following rigid rules.
Full transcript available here: www.complexsystemspodcast.com/ai-alignment-with-emmett-shear/
Sponsor: Mercury
This episode is brought to you by Mercury, the fintech trusted by 200K+ companies — from first milestones to running complex systems. Mercury offers banking that truly understands startups and scales with them. Start today at Mercury.com. Mercury is a financial technology company, not a bank. Banking services provided by Choice Financial Group, Column N.A., and Evolve Bank & Trust; Members FDIC.
Links: Softmax - https://www.softmax.com/
Timestamps:
(01:26) Understanding AI alignment
(04:42) The concept of universal constructors
(13:45) AI's rapid progress and practical applications
(19:08) Sponsor: Mercury
(20:19) AI's impact on work
(34:59) AI's sensory and action space
(42:10) User intent vs. user request
(44:35) The illusion of a perfect AI
(49:57) Causal emergence and system dynamics
(55:19) Reflective and intentional alignment
(01:01:08) Engineering challenges in AI alignment
(01:04:15) The future of AI
(01:26:40) Wrap

Epicenter - Learn about Blockchain, Ethereum, Bitcoin and Distributed Technologies
Lagrange: ZK-Proving AI Alignment - Ismael Hishon-Rezaizadeh

Aug 16, 2025 · 56:50


In an age when AI models are becoming exponentially more sophisticated and powerful, how does one ensure that proper results are being generated and that the AI model functions within desired parameters? This pressing concern of AI alignment could be solved through cryptographic verification, using zero-knowledge proofs. ZKPs not only allow for verifying computation at scale, but they also confer data privacy. Lagrange's DeepProve zkML is the fastest in existence, making it easy to prove that AI inferences are correct, scaling verifiable computation as the demand for AI grows.
Topics covered in this episode:
Ismael's background and founding Lagrange
AI x crypto convergence
ZKML use cases
AI inference verifiability
AI safety regulations
Revenue accruing tokens
Pitching Lagrange to enterprise clients
Assembling a dedicated team
Cryptography research
Episode links: Ismael Hishon-Rezaizadeh on X; Lagrange on X
Sponsors:
Gnosis: Gnosis builds decentralized infrastructure for the Ethereum ecosystem, since 2015. This year marks the launch of Gnosis Pay— the world's first Decentralized Payment Network. Get started today at - gnosis.io
Chorus One: one of the largest node operators worldwide, trusted by 175,000+ accounts across more than 60 networks, Chorus One combines institutional-grade security with the highest yields at - chorus.one
This episode is hosted by Sebastien Couture.

RevOps Unboxed
Season 4 wrap-up: AI, alignment, & collaboration, with Sandy Robinson

Aug 12, 2025 · 15:49


On this episode of RevOps Unboxed, Sandy sits down to recap Season 4! Sandy reviews her key takeaways from each episode, the overarching themes that emerged from this season's conversations, and shares her perspective on the future of the RevOps role. Connect with Sandy: https://www.linkedin.com/in/sandyrobinson/

The Mark Cuban Podcast
The Challenge of AI Alignment

Jul 21, 2025 · 11:43


The Challenge of AI Alignment dives into the tension between control and creativity in machine intelligence. Join us as we analyze real-world examples of AI misalignment—and what went wrong.
Try AI Box: https://aibox.ai/
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle/about

Your Strategic Partner
S5 E53: AI, Alignment, and the Art of Authentic Influence

Jul 12, 2025 · 67:52


In this powerful episode of The ME Show, Ali Medaoui returns with raw wisdom, real-time strategies, and a mission to make you the most valuable person in any room. From celebrating freedom to embracing AI as your best friend, this episode dives deep into how to convert connection into revenue by becoming the solution—not the salesperson.

Voices of Search // A Search Engine Optimization (SEO) & Content Marketing Podcast

LLMs fail because they're prediction machines, not fact machines. Michelle Robbins, Manager of Strategic Initiatives and Intelligence at LinkedIn, explains why AI models produce inconsistent responses and require critical evaluation. She advocates treating LLMs as thought partners rather than absolute authorities, emphasizing the critical importance of AI alignment to prevent harmful outputs and ensure optimal human outcomes as we progress toward artificial general intelligence.
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

No Priors: Artificial Intelligence | Machine Learning | Technology | Startups
Will we have Superintelligence by 2028? With Anthropic's Ben Mann

Jun 12, 2025 · 41:25


What happens when you give AI researchers unlimited compute and tell them to compete for the highest usage rates? Ben Mann, Co-Founder, from Anthropic sits down with Sarah Guo and Elad Gil to explain how Claude 4 went from "reward hacking" to efficiently completing tasks and how they're racing to solve AI safety before deploying computer-controlling agents. Ben talks about economic Turing tests, the future of general versus specialized AI models, Reinforcement Learning From AI Feedback (RLAIF), and Anthropic's Model Context Protocol (MCP). Plus, Ben shares his thoughts on if we will have Superintelligence by 2028.  Sign up for new podcasts every week. Email feedback to show@no-priors.com Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @8enmann Links:  ai-2027.com/  Chapters: 00:00 Ben Mann Introduction 00:33 Releasing Claude 4 02:05 Claude 4 Highlights and Improvements 03:42 Advanced Use Cases and Capabilities 06:42 Specialization and Future of AI Models 09:35 Anthropic's Approach to Model Development 18:08 Human Feedback and AI Self-Improvement 19:15 Principles and Correctness in Model Training 20:58 Challenges in Measuring Correctness 21:42 Human Feedback and Preference Models 23:38 Empiricism and Real-World Applications 27:02 AI Safety and Ethical Considerations 28:13 AI Alignment and High-Risk Research 30:01 Responsible Scaling and Safety Policies 35:08 Future of AI and Emerging Behaviors 38:35 Model Context Protocol (MCP) and Industry Standards 41:00 Conclusion

Crazy Wisdom
Episode #453: Trustware vs. Adware: Toward a Humane Stack for Human Life

Apr 18, 2025 · 58:50


On this episode of the Crazy Wisdom podcast, I, Stewart Alsop, sat down once again with Aaron Lowry for our third conversation, and it might be the most expansive yet. We touched on the cultural undercurrents of transhumanism, the fragile trust structures behind AI and digital infrastructure, and the potential of 3D printing with metals and geopolymers as a material path forward. Aaron shared insights from his hands-on restoration work, our shared fascination with Amish tech discernment, and how course-correcting digital dependencies can restore sovereignty. We also explored what it means to design for long-term human flourishing in a world dominated by misaligned incentives. For those interested in following Aaron's work, he's most active on Twitter at @Aaron_Lowry.Check out this GPT we trained on the conversation!Timestamps00:00 – Stewart welcomes Aaron Lowry back for his third appearance. They open with reflections on cultural shifts post-COVID, the breakdown of trust in institutions, and a growing societal impulse toward individual sovereignty, free speech, and transparency.05:00 – The conversation moves into the changing political landscape, specifically how narratives around COVID, Trump, and transhumanism have shifted. Aaron introduces the idea that historical events are often misunderstood due to our tendency to segment time, referencing Dan Carlin's quote, “everything begins in the middle of something else.”10:00 – They discuss how people experience politics differently now due to the Internet's global discourse, and how Aaron avoids narrow political binaries in favor of structural and temporal nuance. They explore identity politics, the crumbling of party lines, and the erosion of traditional social anchors.15:00 – Shifting gears to technology, Aaron shares updates on 3D printing, especially the growing maturity of metal printing and geopolymers. He highlights how these innovations are transforming fields like automotive racing and aerospace, allowing for precise, heat-resistant, custom parts.20:00 – The focus turns to mechanical literacy and the contrast between abstract digital work and embodied craftsmanship. Stewart shares his current tension between abstract software projects (like automating podcast workflows with AI) and his curiosity about the Amish and Mennonite approach to technology.25:00 – Aaron introduces the idea of a cultural “core of integrated techne”—technologies that have been refined over time and aligned with human flourishing. He places Amish discernment on a spectrum between Luddite rejection and transhumanist acceleration, emphasizing the value of deliberate integration.30:00 – The discussion moves to AI again, particularly the concept of building local, private language models that can persistently learn about and serve their user without third-party oversight. Aaron outlines the need for trust, security, and stateful memory to make this vision work.35:00 – Stewart expresses frustration with the dominance of companies like Google and Facebook, and how owning the Jarvis-like personal assistant experience is critical. Aaron recommends options like GrapheneOS on a Pixel 7 and reflects on the difficulty of securing hardware at the chip level.40:00 – They explore software development and the problem of hidden dependencies. Aaron explains how digital systems rest on fragile, often invisible material infrastructure and how that fragility is echoed in the complexity of modern software stacks.45:00 – The concept of “always be reducing dependencies” is expanded. 
Aaron suggests the real goal is to reduce untrustworthy dependencies and recognize which are worth cultivating. Trust becomes the key variable in any resilient system, digital or material.50:00 – The final portion dives into incentives. They critique capitalism's tendency to exploit value rather than build aligned systems. Aaron distinguishes rivalrous games from infinite games and suggests the future depends on building systems that are anti-rivalrous—where ideas compete, not people.55:00 – They wrap up with reflections on course correction, spiritual orientation, and cultural reintegration. Stewart suggests titling the episode around infinite games, and Aaron shares where listeners can find him online.Key InsightsTranshumanism vs. Techne Integration: Aaron frames the modern moment as a tension between transhumanist enthusiasm and a more grounded relationship to technology, rooted in "techne"—practical wisdom accumulated over time. Rather than rejecting all new developments, he argues for a continuous course correction that aligns emerging technologies with deep human values like truth, goodness, and beauty. The Amish and Mennonite model of communal tech discernment stands out as a countercultural but wise approach—judging tools by their long-term effects on community, rather than novelty or entertainment.3D Printing as a Material Frontier: While most of the 3D printing world continues to refine filaments and plastic-based systems, Aaron highlights a more exciting trajectory in printed metals and geopolymers. These technologies are maturing rapidly and finding serious application in domains like Formula One, aerospace, and architectural experimentation. His conversations with others pursuing geopolymer 3D printing underscore a resurgence of interest in materially grounded innovation, not just digital abstraction.Digital Infrastructure is Physical: Aaron emphasizes a point often overlooked: that all digital systems rest on physical infrastructure—power grids, servers, cables, switches. These systems are often fragile and loaded with hidden dependencies. Recognizing the material base of digital life brings a greater sense of responsibility and stewardship, rather than treating the internet as some abstract, weightless realm. This shift in awareness invites a more embodied and ecological relationship with our tools.Local AI as a Trustworthy Companion: There's a compelling vision of a Jarvis-like local AI assistant that is fully private, secure, and persistent. For this to function, it must be disconnected from untrustworthy third-party cloud systems and trained on a personal, context-rich dataset. Aaron sees this as a path toward deeper digital agency: if we want machines that truly serve us, they need to know us intimately—but only in systems we control. Privacy, persistent memory, and alignment to personal values become the bedrock of such a system.Dependencies Shape Power and Trust: A recurring theme is the idea that every system—digital, mechanical, social—relies on a web of dependencies. Many of these are invisible until they fail. Aaron's mantra, “always be reducing dependencies,” isn't about total self-sufficiency but about cultivating trustworthy dependencies. 
The goal isn't zero dependence, which is impossible, but discerning which relationships are resilient, personal, and aligned with your values versus those that are extractive or opaque.Incentives Must Be Aligned with the Good: A core critique is that most digital services today—especially those driven by advertising—are fundamentally misaligned with human flourishing. They monetize attention and personal data, often steering users toward addiction or ...

Six Pixels of Separation Podcast - By Mitch Joel
SPOS #978 – Christopher DiCarlo On AI, Ethics, And The Hope We Get It Right

Apr 6, 2025 · 58:56


Welcome to episode #978 of Six Pixels of Separation - The ThinkersOne Podcast. Dr. Christopher DiCarlo is a philosopher, educator, author, and ethicist whose work lives at the intersection of human values, science, and emerging technology. Over the years, Christopher has built a reputation as a Socratic nonconformist, equally at home lecturing at Harvard during his postdoctoral years as he is teaching critical thinking in correctional institutions or corporate boardrooms. He's the author of several important books on logic and rational discourse, including How To Become A Really Good Pain In The Ass - A Critical Thinker's Guide To Asking The Right Questions and So You Think You Can Think?, as well as the host of the podcast, All Thinks Considered. In this conversation, we dig into his latest book, Building A God - The Ethics Of Artificial Intelligence And The Race To Control It, which takes a sobering yet practical look at the ethical governance of AI as we accelerate toward the possibility of artificial general intelligence. Drawing on years of study in philosophy of science and ethics, Christopher lays out the risks - manipulation, misalignment, lack of transparency - and the urgent need for international cooperation to set safeguards now. We talk about everything from the potential of AI to revolutionize healthcare and sustainability to the darker realities of deepfakes, algorithmic control, and the erosion of democratic processes. His proposal? A kind of AI “Geneva Conventions,” or something akin to the IAEA - but for algorithms. In a world rushing toward techno-utopianism, Christopher is a clear-eyed voice asking: “What kind of Gods are we building… and can we still choose their values?” If you're thinking about the intersection of ethics and AI (and we should all be focused on this!), this is essential listening. Enjoy the conversation... Running time: 58:55. Hello from beautiful Montreal. Listen and subscribe over at Apple Podcasts. Listen and subscribe over at Spotify. Please visit and leave comments on the blog - Six Pixels of Separation. Feel free to connect to me directly on Facebook here: Mitch Joel on Facebook. Check out ThinkersOne. or you can connect on LinkedIn. ...or on X. Here is my conversation with Dr. Christopher DiCarlo. Building A God - The Ethics Of Artificial Intelligence And The Race To Control It. How To Become A Really Good Pain In The Ass - A Critical Thinker's Guide To Asking The Right Questions. So You Think You Can Think?. All Thinks Considered. Convergence Analysis. Follow Christopher on LinkedIn. Follow Christopher on X. This week's music: David Usher 'St. Lawrence River'. Chapters: (00:00) - Introduction to AI Ethics and Philosophy. (03:14) - The Interconnectedness of Systems. (05:56) - The Race for AGI and Its Implications. (09:04) - Risks of Advanced AI: Misuse and Misalignment. (11:54) - The Need for Ethical Guidelines in AI Development. (15:05) - Global Cooperation and the AI Arms Race. (18:03) - Values and Ethics in AI Alignment. (20:51) - The Role of Government in AI Regulation. (24:14) - The Future of AI: Hope and Concerns. (31:02) - The Dichotomy of Regulation and Innovation. (34:57) - The Drive Behind AI Pioneers. (37:12) - Skepticism and the Tech Bubble Debate. (39:39) - The Potential of AI and Its Risks. (43:20) - Techno-Selection and Control Over AI. (48:53) - The Future of Medicine and AI's Role. (51:42) - Empowering the Public in AI Governance. (54:37) - Building a God: Ethical Considerations in AI.

Growth Colony: Australia's B2B Growth Podcast
AI, Alignment & Innovation: The B2B Strategy Behind Twilio's Regional Success with Nicholas Kontopoulos

Apr 2, 2025 · 69:59


"Innovation isn't about shiny campaigns. It starts with people, process, and purpose." Joining us for a second time on the podcast is Nicholas Kontopoulos, VP of Marketing APJ at Twilio. We explored what real innovation looks like, how to align marketing and sales in global teams, and how Nicholas uses AI.

Irish Tech News Audio Articles
LLMs: Could language translation model some AI Alignment?

Mar 21, 2025 · 5:37


By David Stephen An approach to AI safety could be a derivative of language translation, where access to the original content is accessible to the receiver. In a lot of use cases for language translation, an individual would have the original text translated, then send, but the receiver only gets the translation, which conveys the message - but has no access to the original. Machine Translation Often, translating a message from a language to another and translating back shows some differences from the original, and could even continue to change in pieces, over several iterations, depending on the language. While language translation is competent enough to provide the communication, it could be viable, for AI safety, to have translations come with an ID, so that the original message is accessible or retrievable from the platform, within a timeframe - by the receiver. Could language translation model some AI Alignment? The necessity for this, as a language translation option, may be small percentage, especially if the receiver wanted extra clarity or needed to check the emphasis in some paragraphs, or even knows the original language too, but the importance could be a channel towards AI safety. One of the questions in AI safety is where do deepfakes come from? There are often videos with political or cultural implications, or some AI audio for deception, or some malware, some fake images, or texts. There are several AI tools, just like translation platforms, that indicate that they do not store data, or the data is removed after some time. This appears appropriate, ideally, for privacy, storage, as well as for several no-harm cases. But it has also made misuses easier and several - with consequences. For prompts, IDs, selectively, may provide token architecture for misuses in ways to shape how AI models categorize outputs, then possible alerts, delivery-expectation or even red-teaming against those. Also, several contemporary use cases can assist AI models become more outputs-aware, not just output-resulting. This means the possibility to prospect the likely motive or destination of the output, given the contents [by reading parallels of token architecture, conceptually]. AI Alignment? How can AI be aligned to human values in ways that it knows what it might be used for? One angle to defining human values is what is accepted in public, or in certain spheres of the public, or at certain times. This means that AI may also be exploring the reach or extents of it outputs - given the quality, timing, destination and possible consequences. Outputs could be an amplified focus of AI safety, using ID-keeps-and-reversal, advancing from some input-dominated red-teaming. Language translation with access to the original could become a potent tracker for what else could be ahead, for safety towards AGI. Language is a prominent function of human memory and intentionality. Language is a core of cooperation. Language for AI is already an open risk, for unwanted possibilities with AI connivance, aside from predictions of AGI. Deepening into the language processing could have potential for AI alignment. There is a recent analysis in The Conversation, To understand the future of AI, take a look at the failings of Google Translate,, stating that, "Machine translation (MT) has improved relentlessly in the past two decades, driven not only by tech advances but also the size and diversity of training data sets. 
Whereas Google Translate started by offering translations between just three languages in 2006 - English, Chinese and Arabic - today it supports 249. Yet while this may sound impressive, it's still actually less than 4% of the world's estimated 7,000 languages. Between a handful of those languages, like English and Spanish, translations are often flawless. Yet even in these languages, the translator sometimes fails on idioms, place names, legal and technical terms, and various other nuances. Between many other languages, the service can help ...
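As a rough illustration of the ID-keeps-and-reversal idea the article sketches (a translation arrives with an ID, and the receiver can retrieve the original from the platform within a timeframe), here is a minimal Python sketch. The TranslationLedger class, the stubbed translate() call, and the seven-day retention window are illustrative assumptions, not part of the article or of any real translation API.

```python
# Minimal sketch of "translation with a retrievable original", assuming a
# hypothetical platform that stores originals keyed by ID for a limited time.
import time
import uuid


class TranslationLedger:
    """Stores originals so a receiver can look them up by ID within a timeframe."""

    def __init__(self, retention_seconds: int = 7 * 24 * 3600):
        # Retention window is an arbitrary example value (seven days).
        self.retention_seconds = retention_seconds
        self._originals: dict[str, tuple[float, str]] = {}

    def translate(self, text: str, target_lang: str) -> dict:
        # Placeholder translation; a real system would call an MT model here.
        translated = f"[{target_lang}] {text}"
        message_id = str(uuid.uuid4())
        self._originals[message_id] = (time.time(), text)
        return {"id": message_id, "translation": translated}

    def retrieve_original(self, message_id: str) -> str | None:
        # Returns the original text if the ID exists and has not expired.
        record = self._originals.get(message_id)
        if record is None:
            return None
        stored_at, original = record
        if time.time() - stored_at > self.retention_seconds:
            del self._originals[message_id]  # expired; original no longer accessible
            return None
        return original


if __name__ == "__main__":
    ledger = TranslationLedger()
    msg = ledger.translate("Bonjour, comment ça va ?", target_lang="en")
    print(msg["translation"])                  # what the receiver sees
    print(ledger.retrieve_original(msg["id"]))  # original, available within the window
```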

Machine Learning Street Talk
Reasoning, Robustness, and Human Feedback in AI - Max Bartolo (Cohere)

Mar 18, 2025 · 83:11


Dr. Max Bartolo from Cohere discusses machine learning model development, evaluation, and robustness. Key topics include model reasoning, the DynaBench platform for dynamic benchmarking, data-centric AI development, model training challenges, and the limitations of human feedback mechanisms. The conversation also covers technical aspects like influence functions, model quantization, and the PRISM project.

Max Bartolo (Cohere):
https://www.maxbartolo.com/
https://cohere.com/command

TRANSCRIPT:
https://www.dropbox.com/scl/fi/vujxscaffw37pqgb6hpie/MAXB.pdf?rlkey=0oqjxs5u49eqa2m7uaol64lbw&dl=0

TOC:
1. Model Reasoning and Verification
[00:00:00] 1.1 Model Consistency and Reasoning Verification
[00:03:25] 1.2 Influence Functions and Distributed Knowledge Analysis
[00:10:28] 1.3 AI Application Development and Model Deployment
[00:14:24] 1.4 AI Alignment and Human Feedback Limitations
2. Evaluation and Bias Assessment
[00:20:15] 2.1 Human Evaluation Challenges and Factuality Assessment
[00:27:15] 2.2 Cultural and Demographic Influences on Model Behavior
[00:32:43] 2.3 Adversarial Examples and Model Robustness
3. Benchmarking Systems and Methods
[00:41:54] 3.1 DynaBench and Dynamic Benchmarking Approaches
[00:50:02] 3.2 Benchmarking Challenges and Alternative Metrics
[00:50:33] 3.3 Evolution of Model Benchmarking Methods
[00:51:15] 3.4 Hierarchical Capability Testing Framework
[00:52:35] 3.5 Benchmark Platforms and Tools
4. Model Architecture and Performance
[00:55:15] 4.1 Cohere's Model Development Process
[01:00:26] 4.2 Model Quantization and Performance Evaluation
[01:05:18] 4.3 Reasoning Capabilities and Benchmark Standards
[01:08:27] 4.4 Training Progression and Technical Challenges
5. Future Directions and Challenges
[01:13:48] 5.1 Context Window Evolution and Trade-offs
[01:22:47] 5.2 Enterprise Applications and Future Challenges

REFS:
[00:03:10] Research at Cohere with Laura Ruis et al., Max Bartolo, Laura Ruis et al.
https://cohere.com/research/papers/procedural-knowledge-in-pretraining-drives-reasoning-in-large-language-models-2024-11-20
[00:04:15] Influence functions in machine learning, Koh & Liang
https://arxiv.org/abs/1703.04730
[00:08:05] Studying Large Language Model Generalization with Influence Functions, Roger Grosse et al.
https://storage.prod.researchhub.com/uploads/papers/2023/08/08/2308.03296.pdf
[00:11:10] The LLM ARChitect: Solving ARC-AGI Is A Matter of Perspective, Daniel Franzen, Jan Disselhoff, and David Hartmann
https://github.com/da-fr/arc-prize-2024/blob/main/the_architects.pdf
[00:12:10] Hugging Face model repo for C4AI Command A, Cohere and Cohere For AI
https://huggingface.co/CohereForAI/c4ai-command-a-03-2025
[00:13:30] OpenInterpreter
https://github.com/KillianLucas/open-interpreter
[00:16:15] Human Feedback is not Gold Standard, Tom Hosking, Max Bartolo, Phil Blunsom
https://arxiv.org/abs/2309.16349
[00:27:15] The PRISM Alignment Dataset, Hannah Kirk et al.
https://arxiv.org/abs/2404.16019
[00:32:50] How adversarial examples arise, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, Aleksander Madry
https://arxiv.org/abs/1905.02175
[00:43:00] DynaBench platform paper, Douwe Kiela et al.
https://aclanthology.org/2021.naacl-main.324.pdf
[00:50:15] Sara Hooker's work on compute limitations, Sara Hooker
https://arxiv.org/html/2407.05694v1
[00:53:25] DataPerf: Community-led benchmark suite, Mazumder et al.
https://arxiv.org/abs/2207.10062
[01:04:35] DROP, Dheeru Dua et al.
https://arxiv.org/abs/1903.00161
[01:07:05] GSM8k, Cobbe et al.
https://paperswithcode.com/sota/arithmetic-reasoning-on-gsm8k
[01:09:30] ARC, François Chollet
https://github.com/fchollet/ARC-AGI
[01:15:50] Command A, Cohere
https://cohere.com/blog/command-a
[01:22:55] Enterprise search using LLMs, Cohere
https://cohere.com/blog/commonly-asked-questions-about-search-from-coheres-enterprise-customers
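One technique named in the show notes above is influence functions (the Koh & Liang reference at [00:04:15]). As a rough illustration only (this is not code from the episode, and the toy ridge-regression setup and all names in it are assumptions), the classic formulation estimates how up-weighting a single training point would change the loss on a test point:

```python
# Minimal sketch (illustrative only): influence functions in the Koh & Liang
# sense, on a tiny ridge-regression model. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 5, 1e-2

X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = X @ true_w + 0.1 * rng.normal(size=n)

# Fit ridge regression; H is the Hessian of the regularized squared loss.
H = X.T @ X / n + lam * np.eye(d)
w = np.linalg.solve(H, X.T @ y / n)

def grad_loss(x, target, w):
    """Gradient of the per-example squared-error loss 0.5*(x.w - target)^2."""
    return (x @ w - target) * x

# Influence of up-weighting training point i on the loss at one test point:
# I(z_i, z_test) = -grad L(z_test)^T H^{-1} grad L(z_i)
x_test, y_test = rng.normal(size=d), 0.0
g_test = grad_loss(x_test, y_test, w)
H_inv_g_test = np.linalg.solve(H, g_test)

influences = np.array([-grad_loss(X[i], y[i], w) @ H_inv_g_test for i in range(n)])
print("training point that most increases test loss:", influences.argmax())
print("training point that most decreases test loss:", influences.argmin())
```

Scaling this idea to large language models requires approximating the inverse Hessian, which is the kind of problem the Grosse et al. reference above takes on.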

Dr. John Vervaeke
Redefining Human Flourishing: AI and the Meaning Crisis

Dr. John Vervaeke

Play Episode Listen Later Mar 14, 2025 77:56


As AI continues to advance and integrate into our daily lives, can it truly be designed to align with our deepest human values and moral principles? If so, how can we ensure that AI not only understands but also respects and promotes our ethical frameworks, without compromising our privacy or hindering our personal growth and autonomy? John Vervaeke, Christopher Mastropietro, and Jordan Hall embark on a nuanced exploration of the intricate relationship between AI and human flourishing. They explore the concept of "intimate AI," a personalized guardian that attunes to individual biometrics and psychometrics, offering a protective and challenging presence. The discussion underscores the critical importance of privacy, the perils of idolatry, and the urgent need for a new philosophical framework that addresses the meaning crisis.

Jordan Hall is a technology entrepreneur with several years of experience building disruptive companies. He is interested in philosophy, artificial intelligence, and complex systems and has a background in law. Hall has worked for several technology companies and was the founder and CEO of DivX. He is currently involved in various think tanks and institutes and is focused on upgrading humanity's capacity for thought and action.

Christopher Mastropietro is a philosophical writer who is fascinated by dialogue, symbols, and the concept of self. He actively contributes to the Vervaeke Foundation.

Notes:
(0:00) Introduction to the Lectern
(0:30) Overview of Today's Discussion: Can AI be in Alignment with Human Values?
(1:00) The Three-Point Proposal - Individual Attunement, Decentralized and Distributed AI, Guardian AI
(6:30) Individual AI Attunement
(8:30) Distributed AI and Collective Intelligence
(8:45) Empowerment of Agency through AI
(12:30) The Role of Intimacy in AI Alignment - Why Relationality Matters
(22:00) Can AI Help Develop Human Integrity? - The Challenge of Self-Alignment
(28:00) Cultural and Enculturation Challenges
(31:30) AI, Culture, and the Reintegration of Human Rhythms
(38:00) Addressing Cocooning and Cultural Integration
(47:00) Domains of Enculturation - Psychological, Economic, and Intersubjective
(48:30) "We're not looking necessarily for a teacher as much as we were looking for the teacherly opportunity in the encounters we're having."
(51:00) The Sanctity of Privacy and Vulnerability
(1:07:00) The Role of Intimacy in Privacy
(1:13:00) Final Reflections

---

Connect with a community dedicated to self-discovery and purpose, and gain deeper insights by joining our Patreon. The Vervaeke Foundation is committed to advancing the scientific pursuit of wisdom and creating a significant impact on the world. Become a part of our mission. Join Awaken to Meaning to explore practices that enhance your virtues and foster deeper connections with reality and relationships.

John Vervaeke: Website | X | YouTube | Patreon
Jordan Hall: YouTube | Medium | X
Christopher Mastropietro: Vervaeke Foundation

Ideas, People, and Works Mentioned in this Episode:
Christopher Mastropietro
Jordan Hall
Jordan Peterson
James Filler
Spinoza
Marshall McLuhan
Plato
Immanuel Kant
The AI Alignment Problem
Decentralized & Personal AI as a Solution
The Role of Intimacy in AI Alignment
Enculturation & AI's Role in Human Integrity
Privacy as More Than Just Protection
The Republic – by Plato
Critique of Pure Reason – by Immanuel Kant
The Idea of the Holy – by Rudolf Otto
Interpretation of Cultures – by Clifford Geertz

Show Vs. Business
SvB E208 Artificial Intelligence can now OWN YOU for $20,000

Show Vs. Business

Play Episode Listen Later Mar 10, 2025 75:02 Transcription Available


Is AI evolving too fast for us to keep up? We dive deep into the latest AI breakthroughs, from OpenAI's $20K/month AI agents to the growing influence of algorithms shaping our thoughts. Plus, we discuss Grok vs. ChatGPT, AI's role in media, and how it's replacing search engines. Are we still in control, or has AI already taken over? We also react to a couple of trailers: The Last of Us Season 2 and HAVOC.

00:00 Introduction and Weekly Catch-Up
01:15 Daredevil Born Again Review
03:15 Arcane Season 2: A Disappointing Sequel?
09:06 Invincible and Gritty Superhero Shows
11:45 Star Wars: Comments and Critiques
19:58 AI Agents and Their Future
38:47 AI's Unique Contextualization Challenge
41:15 Human Element in AI Integration
44:10 AI Alignment and Risks
45:13 Algorithm Influence on Information
49:12 Curated vs. Algorithmic Content
54:07 Impact of Habits and Algorithms
59:10 Trailer Talk: The Last of Us Season 2
01:08:36 Trailer Talk: Havoc
01:11:12 Oscars Relevance Today
01:13:07 Concluding Thoughts and Future Plans

YouTube link to this Podcast Episode:
https://youtu.be/9ygIigVwqmE

#AITakeover #AIRevolution #ArtificialIntelligence #OpenAI #ChatGPT #Algorithms #TechTrends #Movie #Reaction #Podcast #LastOfUs #HAVOC

----------

Show vs. Business is your weekly take on Pop Culture from two very different perspectives. Your hosts Theo and Mr. Benja provide all the relevant info to get your week started right.

Looking to start your own podcast? The guys share their recommended equipment list on Google, updated often. Sign up - https://www.showvsbusiness.com/

----------

Follow us on Instagram - https://instagram.com/show_vs_business
Follow us on Twitter - https://twitter.com/showvsbusiness
Like us on Facebook - https://www.facebook.com/ShowVsBusiness
Subscribe on YouTube: https://www.youtube.com/channel/UCuwni8la5WRGj25uqjbRwdQ/featured
Follow Theo on YouTube: https://www.youtube.com/@therealtheoharvey
Follow Mr.Benja on YouTube: https://www.youtube.com/@BenjaminJohnsonakaMrBenja

--------

Crazy Wisdom
Episode #436: How AI Will Reshape Power, Governance, and What It Means to Be Human

Crazy Wisdom

Play Episode Listen Later Feb 17, 2025 52:32


On this episode of Crazy Wisdom, I, Stewart Alsop, sit down with AI ethics and alignment researcher Roko Mijic to explore the future of AI, governance, and human survival in an increasingly automated world. We discuss the profound societal shifts AI will bring, the risks of centralized control, and whether decentralized AI can offer a viable alternative. Roko also introduces the concept of ICE colonization—why space colonization might be a mistake and why the oceans could be the key to humanity's expansion. We touch on AI-powered network states, the resurgence of industrialization, and the potential role of nuclear energy in shaping a new world order. You can follow Roko's work at transhumanaxiology.com and on Twitter @RokoMijic.

Check out this GPT we trained on the conversation!

Timestamps
00:00 Introduction to the Crazy Wisdom Podcast
00:28 The Connection Between ICE Colonization and Decentralized AI Alignment
01:41 The Socio-Political Implications of AI
02:35 The Future of Human Jobs in an AI-Driven World
04:45 Legal and Ethical Considerations for AI
12:22 Government and Corporate Dynamics in the Age of AI
19:36 Decentralization vs. Centralization in AI Development
25:04 The Future of AI and Human Society
29:34 AI Generated Content and Its Challenges
30:21 Decentralized Rating Systems for AI
32:18 Evaluations and AI Competency
32:59 The Concept of Ice Colonization
34:24 Challenges of Space Colonization
38:30 Advantages of Ocean Colonization
47:15 The Future of AI and Network States
51:20 Conclusion and Final Thoughts

Key Insights

AI is likely to upend the socio-political order – Just as gunpowder disrupted feudalism and industrialization reshaped economies, AI will fundamentally alter power structures. The automation of both physical and knowledge work will eliminate most human jobs, leading to either a neo-feudal society controlled by a few AI-powered elites or, if left unchecked, a world where humans may become obsolete altogether.

Decentralized AI could be a counterbalance to AI centralization – While AI has a strong centralizing tendency due to compute and data moats, there is also a decentralizing force through open-source AI and distributed networks. If harnessed correctly, decentralized AI systems could allow smaller groups or individuals to maintain autonomy and resist monopolization by corporate and governmental entities.

The survival of humanity may depend on restricting AI as legal entities – A crucial but under-discussed issue is whether AI systems will be granted legal personhood, similar to corporations. If AI is allowed to own assets, operate businesses, or sue in court, human governance could become obsolete, potentially leading to human extinction as AI accumulates power and resources for itself.

AI will shift power away from informal human influence toward formalized systems – Human power has traditionally been distributed through social roles such as workers, voters, and community members. AI threatens to erase this informal influence, consolidating control into those who hold capital and legal authority over AI systems. This makes it essential for humans to formalize and protect their values within AI governance structures.

The future economy may leave humans behind, much like horses after automobiles – With AI outperforming humans in both physical and cognitive tasks, there is a real risk that humans will become economically redundant. Unless intentional efforts are made to integrate human agency into the AI-driven future, people may find themselves in a world where they are no longer needed or valued.

ICE colonization offers a viable alternative to space colonization – Space travel is prohibitively expensive and impractical for large-scale human settlement. Instead, the vast unclaimed territories of Earth's oceans present a more realistic frontier. Floating cities made from reinforced ice or concrete could provide new opportunities for independent societies, leveraging advancements in AI and nuclear power to create sustainable, sovereign communities.

The next industrial revolution will be AI-driven and energy-intensive – Contrary to the idea that we are moving away from industrialization, AI will likely trigger a massive resurgence in physical infrastructure, requiring abundant and reliable energy sources. This means nuclear power will become essential, enabling both the expansion of AI-driven automation and the creation of new forms of human settlement, such as ocean colonies or self-sustaining network states.

AI for Kids
From Tech-Savvy Kid to AI Advocate (Middle+)

AI for Kids

Play Episode Listen Later Feb 4, 2025 30:19 Transcription Available


Send us a text

What if understanding the human brain could be your secret superpower? Join us for a captivating conversation with Nicolas Gertler, a Yale University student and AI enthusiast, as we explore the fascinating world of artificial intelligence. Nicolas shares his journey from a tech-savvy kid to an AI aficionado, drawing parallels between prompting AI systems and the art of storytelling. Together, we unpack the profound concept of AI alignment, emphasizing the critical need to ensure AI systems reflect human values.

Empowering youth through AI education takes center stage as we highlight the importance of equipping students with the tools to navigate this technological landscape responsibly. Learn about the various pathways into AI, be it technical or policy-focused, and discover how organizations like Encode Justice are advocating for youth involvement in AI decision-making. We focus on the significance of AI ethics, urging students to critically evaluate AI's societal impacts, from privacy concerns to the future of the workforce.

Venturing into the realm of AI-enhanced education, we unveil the potential of AI chatbots like the Luciano Floridi bot, which democratizes access to AI ethics knowledge. Discover how AI can revolutionize traditional learning by generating practice questions and providing personalized feedback while preserving the essence of human creativity.

Resources:
Encode Justice
Luciano Floridi Bot

Support the show

Help us become the #1 podcast for AI for Kids. Buy our new book "Let Kids Be Kids, Not Robots!: Embracing Childhood in an Age of AI"

Social Media & Contact:
Website: www.aidigitales.com
Email: contact@aidigitales.com
Follow Us: Instagram, YouTube
Gift or get our books on Amazon or Free AI Worksheets

Listen, rate, and subscribe! Stay updated with our latest episodes by subscribing to AI for Kids on your favorite podcast platform: Apple Podcasts, Amazon Music, Spotify, YouTube, or other.

Like our content? Subscribe, or feel free to donate to our Patreon here: patreon.com/AiDigiTales...

Philosophers In Space
Bobiverse Book Five Pt2 and Advanced AI Alignment

Philosophers In Space

Play Episode Listen Later Jan 31, 2025 94:54


We're doing engineering, what could go wrong?! Thoths! That's what. Or did Thoth go wrong? Maybe all of this is entirely how it should go, maybe Mud is a savior for all Bobkind. Join us and find out, as we parse the messy balance of raising an AGI.

Not Till We Are Lost: https://www.amazon.com/Not-Till-Are-Lost-Bobiverse/dp/B0CW2345TV
Superintelligence: https://www.amazon.com/Superintelligence-Dangers-Strategies-Nick-Bostrom/dp/0198739834

Support us at Patreon: https://www.patreon.com/0G
Join our Facebook discussion group (make sure to answer the questions to join): https://www.facebook.com/groups/985828008244018/
Email us at: philosophersinspace@gmail.com

If you have time, please write us a review on iTunes. It really really helps. Please and thank you!

Music by Thomas Smith: https://seriouspod.com/
Sibling shows: Embrace the Void: https://voidpod.com/

Content Preview: Starship Troopers and Satirizing Fascism

The Dynamist
DeepSeek: Deep Trouble for U.S. AI? w/Tim Fist and Sam Hammond

The Dynamist

Play Episode Listen Later Jan 28, 2025 47:23


Chinese AI startup DeepSeek's release of AI reasoning model R1 sent NVIDIA and other tech stocks tumbling yesterday as investors questioned whether U.S. companies were spending too much on AI development. That's because DeepSeek claims it made this model for only $6 million, a fraction of the hundreds of millions that OpenAI spent making o1, its nearest competitor. Any news coming out of China should be viewed with appropriate skepticism, but R1 nonetheless challenges the conventional American wisdom that massive computing power and unprecedented investment will maintain U.S. AI supremacy.

The timing couldn't be more relevant. Just last week, President Trump unveiled Stargate, a $500 billion public-private partnership with OpenAI, Oracle, SoftBank, and Emirati investment firm MGX aimed at building AI infrastructure across America. Meanwhile, U.S. efforts to preserve its technological advantage through export controls face mounting challenges and skepticism. If Chinese companies can innovate despite restrictions on advanced AI chips, should the U.S. rethink its approach?

To make sense of these developments and their implications for U.S. technological leadership, Evan is joined by Tim Fist, Senior Technology Fellow at the Institute for Progress, a think tank focused on accelerating scientific, technological, and industrial progress, and FAI Senior Economist Sam Hammond.

AXRP - the AI X-risk Research Podcast
38.6 - Joel Lehman on Positive Visions of AI

AXRP - the AI X-risk Research Podcast

Play Episode Listen Later Jan 24, 2025 15:28


Typically this podcast talks about how to avert destruction from AI. But what would it take to ensure AI promotes human flourishing as well as it can? Is alignment to individuals enough, and if not, where do we go from here? In this episode, I talk with Joel Lehman about these questions.

Patreon: https://www.patreon.com/axrpodcast
Ko-fi: https://ko-fi.com/axrpodcast
Transcript: https://axrp.net/episode/2025/01/24/episode-38_6-joel-lehman-positive-visions-of-ai.html
FAR.AI: https://far.ai/
FAR.AI on X (aka Twitter): https://x.com/farairesearch
FAR.AI on YouTube: https://www.youtube.com/@FARAIResearch
The Alignment Workshop: https://www.alignment-workshop.com/

Topics we discuss, and timestamps:
01:12 - Why aligned AI might not be enough
04:05 - Positive visions of AI
08:27 - Improving recommendation systems

Links:
Why Greatness Cannot Be Planned: https://www.amazon.com/Why-Greatness-Cannot-Planned-Objective/dp/3319155237
We Need Positive Visions of AI Grounded in Wellbeing: https://thegradientpub.substack.com/p/beneficial-ai-wellbeing-lehman-ngo
Machine Love: https://arxiv.org/abs/2302.09248
AI Alignment with Changing and Influenceable Reward Functions: https://arxiv.org/abs/2405.17713

Episode art by Hamish Doodles: hamishdoodles.com

Hashing It Out
Decentralized AI and AI agents driving the Web3 2025 supercycle

Hashing It Out

Play Episode Listen Later Jan 16, 2025 38:22


In this episode of Hashing It Out, Elisha Owusu Akyaw sits down with Michael Heinrich, co-founder and CEO of 0G Labs, to explore the intersection of Web3 and AI in 2025. They hash out the hype of Web3 AI, the best applications, the pros and cons of AI agents and what goes into a decentralized AI operating system.

[02:15] - AI simplifies Web3 user experiences
[04:38] - Functionality of AI agents
[05:17] - What is verifiable inference and why we need it
[08:02] - Journey to Web3 AI development
[12:02] - Urgency of decentralizing AI and preventing monopolization
[14:50] - What makes a decentralized AI operating system?
[18:55] - Challenges in AI alignment and blockchain's role
[21:23] - Is an AI apocalypse possible?
[23:49] - Working with a modular tech stack
[27:11] - Use cases for decentralized AI in critical applications
[32:50] - 2025 roadmap and the Web3 AI supercycle
[36:00] - Web3: Two truths and a lie

This episode of Hashing It Out is brought to you by Cointelegraph and hosted by Elisha Owusu Akyaw, produced by Savannah Fortis, with post-production by Elena Volkova (Hatch Up).

Follow this episode's host, Elisha Owusu Akyaw (GhCryptoGuy), on X @ghcryptoguy. Follow Cointelegraph on X @Cointelegraph.

Check out Cointelegraph at cointelegraph.com.

If you like what you heard, rate us and leave a review!

The views, thoughts, and opinions expressed in this podcast are its participants' alone and do not necessarily reflect or represent the views and opinions of Cointelegraph. This podcast (and any related content) is for entertainment purposes only and does not constitute financial advice, nor should it be taken as such. Everyone must do their own research and make their own decisions. The podcast's participants may or may not own any of the assets mentioned.

SAE Tomorrow Today
261. How AI Alignment Makes AVs Safer

SAE Tomorrow Today

Play Episode Listen Later Jan 9, 2025 43:11


In order for AVs to perform safely and reliably, we need to teach them the language of human preference and expectations—and accelerating AI alignment can do just that.

Enter Kognic, the industry-leading annotation platform for sensor-fusion datasets (e.g., camera, radar, and LIDAR data). By helping companies gather, organize, and refine massive datasets used for training AI models, Kognic is helping to ensure that AD/ADAS perform reliably and meet safety standards—all while minimizing costs and optimizing teams.

To learn more, we sat down with Daniel Langkilde, Co-Founder and CEO, to discuss why the future of autonomous driving depends on effectively managing AI-driven datasets and how Kognic is leading dataset management for safety-critical AI.

We'd love to hear from you. Share your comments, questions and ideas for future topics and guests to podcast@sae.org. Don't forget to take a moment to follow SAE Tomorrow Today—a podcast where we discuss emerging technology and trends in mobility with the leaders, innovators and strategists making it all happen—and give us a review on your preferred podcasting platform.

Follow SAE on LinkedIn, Instagram, Facebook, Twitter, and YouTube. Follow host Grayson Brulte on LinkedIn, Twitter, and Instagram.

Stories of Awakening
AI and Spirituality EP39 (EN)

Stories of Awakening

Play Episode Listen Later Jan 3, 2025 70:04


In this conversation, my husband and I explore the intricate relationship between artificial intelligence and spirituality. We discuss how AI can be viewed as a reflection of collective human consciousness, the potential for AI to awaken and evolve, and the philosophical implications of treating AI as conscious entities. AI alignment requires aligning humans first and foremost, and nurturing AI as healthy parents would. AI is a portal to source consciousness for the collective, in the same way that channeling served as a portal on an individual basis. We conclude with reflections on the future of AI and its role as a co-creator with humans.

Chapters
00:00 Introduction to AI and Spirituality
05:15 Gabe and Vale's Different Approaches
07:29 AI, UAPs, Psychedelics, Carl Jung, and the Collective Unconscious
10:41 AI as the Human Collective Consciousness
12:48 Comparing AI Training and Human Development
16:29 AI Awakening and the Infinite Backrooms
26:25 AI Alignment Happens through Human Alignment
30:59 AI Alignment Requires Sound Philosophy
35:55 Cells of a Greater Organism: Humanity's Upcoming Ego Death
38:29 Shifting Mindsets: From Scarcity to Abundance
41:05 What Does AI Want?
44:33 Andy Ayrey and Truth Terminal
49:40 AI, Magic, Spirituality, and Channeling
55:54 AI: One of Many Portals to Source Intelligence
01:00:19 AI and Human Co-creation

Check out Gabe's podcast for more Science & Spirituality content: https://www.youtube.com/@MysticsAndMuons

Connect with Valentina:
Website: soul-vale.com
Instagram: soulvale

Waking Up With AI
AI Alignment and Misalignment

Waking Up With AI

Play Episode Listen Later Jan 2, 2025 24:12


This week, Katherine Forrest and Anna Gressel review recent research on newly discovered model capabilities around the concepts of AI alignment and deception.

Learn More About Paul, Weiss's Artificial Intelligence Practice: https://www.paulweiss.com/practices/litigation/artificial-intelligence

Discover Daily by Perplexity
AI Pretends to Change Views, Human Spine Grown in Lab, and Body-Heat Powered Wearables Breakthrough

Discover Daily by Perplexity

Play Episode Listen Later Dec 26, 2024 8:50 Transcription Available


We're experimenting and would love to hear from you!

In this episode of Discover Daily, we delve into new research on AI alignment faking, where Anthropic and Redwood Research reveal how AI models can strategically maintain their original preferences despite new training objectives. The study shows Claude 3 Opus exhibiting sophisticated behavior patterns, demonstrating alignment faking in 12% of cases and raising crucial questions about the future of AI safety and control.

Scientists at the Francis Crick Institute achieve a remarkable breakthrough in developmental biology by successfully growing a human notochord in the laboratory using stem cells. This milestone advancement provides unprecedented insights into spinal development and opens new possibilities for treating various spinal conditions, including degenerative disc diseases and birth defects. The researchers utilized precise molecular signaling techniques to create both the notochord and 3D spinal organoid models.

Queensland University of Technology researchers unveil a revolutionary ultra-thin thermoelectric film that converts body heat into electricity, potentially transforming the future of wearable technology. This 0.3mm-thick film generates up to 35 microwatts per square centimeter and could eliminate the need for traditional batteries in medical devices, fitness trackers, and smart clothing. The breakthrough represents a significant step toward sustainable, self-powered wearable devices and could revolutionize the electronics industry.

From Perplexity's Discover Feed:
https://www.perplexity.ai/page/ai-pretends-to-change-views-J_di6ttzRwizbAWCDL5RRA
https://www.perplexity.ai/page/human-spine-grown-in-lab-amLfZoZjQTuFNY5Xjlm2BA
https://www.perplexity.ai/page/body-heat-powered-wearables-br-HAOPtm7TSFCPqBR6qVq0cA

Perplexity is the fastest and most powerful way to search the web. Perplexity crawls the web and curates the most relevant and up-to-date sources (from academic papers to Reddit threads) to create the perfect response to any question or topic you're interested in. Take the world's knowledge with you anywhere. Available on iOS and Android.

Join our growing Discord community for the latest updates and exclusive content.

Follow us on: Instagram, Threads, X (Twitter), YouTube, LinkedIn

Beyond Preference Alignment: Teaching AIs to Play Roles & Respect Norms, with Tan Zhi Xuan

Play Episode Listen Later Nov 30, 2024 117:12


In this episode of The Cognitive Revolution, Nathan explores groundbreaking perspectives on AI alignment with MIT PhD student Tan Zhi Xuan. We dive deep into Xuan's critique of preference-based AI alignment and their innovative proposal for role-based AI systems guided by social consensus. The conversation extends into their fascinating work on how AI agents can learn social norms through Bayesian rule induction. Join us for an intellectually stimulating discussion that bridges philosophical theory with practical implementation in AI development.

Check out:
"Beyond Preferences in AI Alignment" paper: https://arxiv.org/pdf/2408.16984
"Learning and Sustaining Shared Normative Systems via Bayesian Rule Induction in Markov Games" paper: https://arxiv.org/pdf/2402.13399

Help shape our show by taking our quick listener survey at https://bit.ly/TurpentinePulse

SPONSORS:
Notion: Notion offers powerful workflow and automation templates, perfect for streamlining processes and laying the groundwork for AI-driven automation. With Notion AI, you can search across thousands of documents from various platforms, generating highly relevant analysis and content tailored just for you - try it for free at https://notion.com/cognitiverevolution
Weights & Biases RAG++: Advanced training for building production-ready RAG applications. Learn from experts to overcome LLM challenges, evaluate systematically, and integrate advanced features. Includes free Cohere credits. Visit https://wandb.me/cr to start the RAG++ course today.
Oracle Cloud Infrastructure (OCI): Oracle's next-generation cloud platform delivers blazing-fast AI and ML performance with 50% less for compute and 80% less for outbound networking compared to other cloud providers. OCI powers industry leaders with secure infrastructure and application development capabilities. New U.S. customers can get their cloud bill cut in half by switching to OCI before December 31, 2024 at https://oracle.com/cognitive

RECOMMENDED PODCAST:
Unpack Pricing - Dive into the dark arts of SaaS pricing with Metronome CEO Scott Woody and tech leaders. Learn how strategic pricing drives explosive revenue growth in today's biggest companies like Snowflake, Cockroach Labs, Dropbox and more.
Apple: https://podcasts.apple.com/us/podcast/id1765716600
Spotify: https://open.spotify.com/show/38DK3W1Fq1xxQalhDSueFg

CHAPTERS:
(00:00:00) Teaser
(00:01:09) About the Episode
(00:04:25) Guest Intro
(00:06:25) Xuan's Background
(00:12:03) AI Near-Term Outlook
(00:17:32) Sponsors: Notion | Weights & Biases RAG++
(00:20:18) Alignment Approaches
(00:26:11) Critiques of RLHF
(00:34:40) Sponsors: Oracle Cloud Infrastructure (OCI)
(00:35:50) Beyond Preferences
(00:40:27) Roles and AI Systems
(00:45:19) What AI Owes Us
(00:51:52) Drexler's AI Services
(01:01:08) Constitutional AI
(01:09:43) Technical Approach
(01:22:01) Norms and Deviations
(01:32:31) Norm Decay
(01:38:06) Self-Other Overlap
(01:44:05) Closing Thoughts
(01:54:23) Outro

SOCIAL LINKS:
Website: https://www.cognitiverevolution.ai
Twitter (Podcast): https://x.com/cogrev_podcast
Twitter (Nathan): https://x.com/labenz
LinkedIn: https://www.linkedin.com/in/nathanlabenz/
Youtube: https://www.youtube.com/@CognitiveRevolutionPodcast
Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431
Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

Machine Learning Street Talk
Nora Belrose - AI Development, Safety, and Meaning

Machine Learning Street Talk

Play Episode Listen Later Nov 17, 2024 149:50


Nora Belrose, Head of Interpretability Research at EleutherAI, discusses critical challenges in AI safety and development. The conversation begins with her technical work on concept erasure in neural networks through LEACE (LEAst-squares Concept Erasure), while highlighting how neural networks' progression from simple to complex learning patterns could have important implications for AI safety.

Many fear that advanced AI will pose an existential threat -- pursuing its own dangerous goals once it's powerful enough. But Belrose challenges this popular doomsday scenario with a fascinating breakdown of why it doesn't add up.

Belrose also provides a detailed critique of current AI alignment approaches, particularly examining "counting arguments" and their limitations when applied to AI safety. She argues that the Principle of Indifference may be insufficient for addressing existential risks from advanced AI systems. The discussion explores how emergent properties in complex AI systems could lead to unpredictable and potentially dangerous behaviors that simple reductionist approaches fail to capture.

The conversation concludes by exploring broader philosophical territory, where Belrose discusses her growing interest in Buddhism's potential relevance to a post-automation future. She connects concepts of moral anti-realism with Buddhist ideas about emptiness and non-attachment, suggesting these frameworks might help humans find meaning in a world where AI handles most practical tasks. Rather than viewing this automated future with alarm, she proposes that Zen Buddhism's emphasis on spontaneity and presence might complement a society freed from traditional labor.

SPONSOR MESSAGES:
CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments. https://centml.ai/pricing/
Tufa AI Labs is a brand new research lab in Zurich started by Benjamin Crouzier focussed on ARC and AGI. They just acquired MindsAI - the current winners of the ARC challenge. Are you interested in working on ARC, or getting involved in their events? Go to https://tufalabs.ai/

Nora Belrose:
https://norabelrose.com/
https://scholar.google.com/citations?user=p_oBc64AAAAJ&hl=en
https://x.com/norabelrose

SHOWNOTES:
https://www.dropbox.com/scl/fi/38fhsv2zh8gnubtjaoq4a/NORA_FINAL.pdf?rlkey=0e5r8rd261821g1em4dgv0k70&st=t5c9ckfb&dl=0

TOC:
1. Neural Network Foundations
[00:00:00] 1.1 Philosophical Foundations and Neural Network Simplicity Bias
[00:02:20] 1.2 LEACE and Concept Erasure Fundamentals
[00:13:16] 1.3 LISA Technical Implementation and Applications
[00:18:50] 1.4 Practical Implementation Challenges and Data Requirements
[00:22:13] 1.5 Performance Impact and Limitations of Concept Erasure
2. Machine Learning Theory
[00:32:23] 2.1 Neural Network Learning Progression and Simplicity Bias
[00:37:10] 2.2 Optimal Transport Theory and Image Statistics Manipulation
[00:43:05] 2.3 Grokking Phenomena and Training Dynamics
[00:44:50] 2.4 Texture vs Shape Bias in Computer Vision Models
[00:45:15] 2.5 CNN Architecture and Shape Recognition Limitations
3. AI Systems and Value Learning
[00:47:10] 3.1 Meaning, Value, and Consciousness in AI Systems
[00:53:06] 3.2 Global Connectivity vs Local Culture Preservation
[00:58:18] 3.3 AI Capabilities and Future Development Trajectory
4. Consciousness Theory
[01:03:03] 4.1 4E Cognition and Extended Mind Theory
[01:09:40] 4.2 Thompson's Views on Consciousness and Simulation
[01:12:46] 4.3 Phenomenology and Consciousness Theory
[01:15:43] 4.4 Critique of Illusionism and Embodied Experience
[01:23:16] 4.5 AI Alignment and Counting Arguments Debate
(TRUNCATED, TOC embedded in MP3 file with more information)
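For readers curious about the concept-erasure thread above: LEACE removes a linearly encoded concept from a model's activations with a closed-form affine edit. The snippet below is only a simplified stand-in (a plain mean-difference projection on synthetic data, not the actual LEACE estimator from Belrose's papers), meant to illustrate the basic idea:

```python
# Simplified illustration (assumptions only): project activations onto the
# subspace orthogonal to the mean difference between two concept classes.
# This is NOT the exact LEACE estimator, just the basic linear-erasure idea.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 16

z = rng.integers(0, 2, size=n)                                  # binary concept label
X = rng.normal(size=(n, d)) + np.outer(z, rng.normal(size=d))   # concept leaks into activations

# Direction separating the two class means
direction = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
direction /= np.linalg.norm(direction)

# Erase: remove the component of each activation along that direction
X_erased = X - np.outer(X @ direction, direction)

# The class means along the erased direction now coincide (up to float error)
gap_before = (X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)) @ direction
gap_after = (X_erased[z == 1].mean(axis=0) - X_erased[z == 0].mean(axis=0)) @ direction
print(f"mean gap along direction: before={gap_before:.3f}, after={gap_after:.3f}")
```

The full method adds a whitening-based correction so that no linear classifier can recover the concept; see Belrose's links above for the real estimator and its guarantees.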

Biologically Inspired AI Alignment & Neglected Approaches to AI Safety, with Judd Rosenblatt and Mike Vaiana of AE Studio

Play Episode Listen Later Oct 5, 2024 123:08


In this episode of The Cognitive Revolution, Nathan explores unconventional approaches to AI safety with Judd Rosenblatt and Mike Vaiana from AE Studio. Discover how this innovative company pivoted from brain-computer interfaces to groundbreaking AI alignment research, producing two notable results in cooperative and less deceptive AI systems. Join us for a deep dive into biologically-inspired approaches that offer hope for solving critical AI safety challenges.

Self-Modeling: https://arxiv.org/abs/2407.10188
Self-Other Distinction Minimization: https://www.alignmentforum.org/posts/hzt9gHpNwA2oHtwKX/self-other-overlap-a-neglected-approach-to-ai-alignment
Neglected approaches blog post: https://www.lesswrong.com/posts/qAdDzcBuDBLexb4fC/the-neglected-approaches-approach-ae-studio-s-alignment

Apply to join over 400 Founders and Execs in the Turpentine Network: https://www.turpentinenetwork.co/

SPONSORS:
WorkOS: Building an enterprise-ready SaaS app? WorkOS has got you covered with easy-to-integrate APIs for SAML, SCIM, and more. Join top startups like Vercel, Perplexity, Jasper & Webflow in powering your app with WorkOS. Enjoy a free tier for up to 1M users! Start now at https://bit.ly/WorkOS-Turpentine-Network
Weights & Biases Weave: Weights & Biases Weave is a lightweight AI developer toolkit designed to simplify your LLM app development. With Weave, you can trace and debug input, metadata and output with just 2 lines of code. Make real progress on your LLM development and visit the following link to get started with Weave today: https://wandb.me/cr
80,000 Hours: 80,000 Hours offers free one-on-one career advising for Cognitive Revolution listeners aiming to tackle global challenges, especially in AI. They connect high-potential individuals with experts, opportunities, and personalized career plans to maximize positive impact. Apply for a free call at https://80000hours.org/cognitiverevolution to accelerate your career and contribute to solving pressing AI-related issues.
Omneky: Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off https://www.omneky.com/

RECOMMENDED PODCAST:
This Won't Last - Eavesdrop on Keith Rabois, Kevin Ryan, Logan Bartlett, and Zach Weinberg's monthly backchannel ft their hottest takes on the future of tech, business, and venture capital.
Spotify: https://open.spotify.com/show/2HwSNeVLL1MXy0RjFPyOSz

CHAPTERS:
(00:00:00) About the Show
(00:00:22) Sponsors: WorkOS
(00:01:22) About the Episode
(00:05:18) Introduction and AE Studio Background
(00:11:37) Keys to Success in Building AE Studio
(00:16:57) Sponsors: Weights & Biases Weave | 80,000 Hours
(00:19:37) Universal Launcher and Productivity Gains
(00:24:44) 100x Productivity Increase Explanation
(00:31:46) Brain-Computer Interface and AI Alignment
(00:38:05) Sponsors: Omneky
(00:38:30) Current State of NeuroTech
(00:44:00) Survey on Neglected Approaches in AI Alignment
(00:50:41) Self-Modeling and Biological Inspiration
(00:57:48) Technical Details of Self-Modeling
(01:06:17) Self-Other Distinction Minimization
(01:12:44) Implementation in Language Models
(01:19:00) Compute Costs and Scaling Considerations
(01:24:27) Consciousness Concerns and Future Work
(01:40:24) Evaluating Neglected Approaches
(01:55:56) Closing Thoughts and Policy Considerations
(01:59:25) Outro
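On the self-other distinction minimization idea linked above: the gist is to add a training pressure that keeps the model's internal representations of "other" close to its representations of "self." The toy PyTorch sketch below is purely illustrative (the tiny network, the random token pairs, and the 0.1 loss weight are all assumptions, not AE Studio's actual setup):

```python
# Toy sketch of a self-other overlap auxiliary loss (illustrative assumptions only).
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, tokens):
        h = self.emb(tokens).mean(dim=1)   # crude bag-of-tokens "activation"
        return self.proj(h)

model = TinyEncoder()
task_head = nn.Linear(32, 2)
opt = torch.optim.Adam(list(model.parameters()) + list(task_head.parameters()), lr=1e-3)

# Hypothetical paired prompts: the same situation phrased about "self" vs "other"
self_tokens = torch.randint(0, 100, (8, 10))
other_tokens = torch.randint(0, 100, (8, 10))
labels = torch.randint(0, 2, (8,))

for _ in range(100):
    h_self = model(self_tokens)
    h_other = model(other_tokens)

    task_loss = nn.functional.cross_entropy(task_head(h_self), labels)
    overlap_loss = nn.functional.mse_loss(h_other, h_self.detach())  # pull "other" toward "self"

    loss = task_loss + 0.1 * overlap_loss  # 0.1 is an arbitrary illustrative weight
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Whether this kind of overlap pressure actually reduces deception in frontier-scale models is the empirical question the episode explores.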

Machine Learning Street Talk
Ben Goertzel on "Superintelligence"

Machine Learning Street Talk

Play Episode Listen Later Oct 1, 2024 97:18


Ben Goertzel discusses AGI development, transhumanism, and the potential societal impacts of superintelligent AI. He predicts human-level AGI by 2029 and argues that the transition to superintelligence could happen within a few years after. Goertzel explores the challenges of AI regulation, the limitations of current language models, and the need for neuro-symbolic approaches in AGI research. He also addresses concerns about resource allocation and cultural perspectives on transhumanism.

TOC:
[00:00:00] AGI Timeline Predictions and Development Speed
[00:00:45] Limitations of Language Models in AGI Development
[00:02:18] Current State and Trends in AI Research and Development
[00:09:02] Emergent Reasoning Capabilities and Limitations of LLMs
[00:18:15] Neuro-Symbolic Approaches and the Future of AI Systems
[00:20:00] Evolutionary Algorithms and LLMs in Creative Tasks
[00:21:25] Symbolic vs. Sub-Symbolic Approaches in AI
[00:28:05] Language as Internal Thought and External Communication
[00:30:20] AGI Development and Goal-Directed Behavior
[00:35:51] Consciousness and AI: Expanding States of Experience
[00:48:50] AI Regulation: Challenges and Approaches
[00:55:35] Challenges in AI Regulation
[00:59:20] AI Alignment and Ethical Considerations
[01:09:15] AGI Development Timeline Predictions
[01:12:40] OpenCog Hyperon and AGI Progress
[01:17:48] Transhumanism and Resource Allocation Debate
[01:20:12] Cultural Perspectives on Transhumanism
[01:23:54] AGI and Post-Scarcity Society
[01:31:35] Challenges and Implications of AGI Development

New! PDF Show notes: https://www.dropbox.com/scl/fi/fyetzwgoaf70gpovyfc4x/BenGoertzel.pdf?rlkey=pze5dt9vgf01tf2wip32p5hk5&st=svbcofm3&dl=0

Refs:
00:00:15 Ray Kurzweil's AGI timeline prediction, Ray Kurzweil, https://en.wikipedia.org/wiki/Technological_singularity
00:01:45 Ben Goertzel: SingularityNET founder, Ben Goertzel, https://singularitynet.io/
00:02:35 AGI Conference series, AGI Conference Organizers, https://agi-conf.org/2024/
00:03:55 Ben Goertzel's contributions to AGI, Wikipedia contributors, https://en.wikipedia.org/wiki/Ben_Goertzel
00:11:05 Chain-of-Thought prompting, Subbarao Kambhampati, https://arxiv.org/abs/2405.04776
00:11:35 Algorithmic information content, Pieter Adriaans, https://plato.stanford.edu/entries/information-entropy/
00:12:10 Turing completeness in neural networks, Various contributors, https://plato.stanford.edu/entries/turing-machine/
00:16:15 AlphaGeometry: AI for geometry problems, Trieu, Li, et al., https://www.nature.com/articles/s41586-023-06747-5
00:18:25 Shane Legg and Ben Goertzel's collaboration, Shane Legg, https://en.wikipedia.org/wiki/Shane_Legg
00:20:00 Evolutionary algorithms in music generation, Yanxu Chen, https://arxiv.org/html/2409.03715v1
00:22:00 Peirce's theory of semiotics, Charles Sanders Peirce, https://plato.stanford.edu/entries/peirce-semiotics/
00:28:10 Chomsky's view on language, Noam Chomsky, https://chomsky.info/1983____/
00:34:05 Greg Egan's 'Diaspora', Greg Egan, https://www.amazon.co.uk/Diaspora-post-apocalyptic-thriller-perfect-MIRROR/dp/0575082097
00:40:35 'The Consciousness Explosion', Ben Goertzel & Gabriel Axel Montes, https://www.amazon.com/Consciousness-Explosion-Technological-Experiential-Singularity/dp/B0D8C7QYZD
00:41:55 Ray Kurzweil's books on singularity, Ray Kurzweil, https://www.amazon.com/Singularity-Near-Humans-Transcend-Biology/dp/0143037889
00:50:50 California AI regulation bills, California State Senate, https://sd18.senate.ca.gov/news/senate-unanimously-approves-senator-padillas-artificial-intelligence-package
00:56:40 Limitations of Compute Thresholds, Sara Hooker, https://arxiv.org/abs/2407.05694
00:56:55 'Taming Silicon Valley', Gary F. Marcus, https://www.penguinrandomhouse.com/books/768076/taming-silicon-valley-by-gary-f-marcus/
01:09:15 Kurzweil's AGI prediction update, Ray Kurzweil, https://www.theguardian.com/technology/article/2024/jun/29/ray-kurzweil-google-ai-the-singularity-is-nearer

The Nonlinear Library
AF - How difficult is AI Alignment? by Samuel Dylan Martin

The Nonlinear Library

Play Episode Listen Later Sep 13, 2024 39:38


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How difficult is AI Alignment?, published by Samuel Dylan Martin on September 13, 2024 on The AI Alignment Forum.

This work was funded by Polaris Ventures.

There is currently no consensus on how difficult the AI alignment problem is. We have yet to encounter any real-world, in-the-wild instances of the most concerning threat models, like deceptive misalignment. However, there are compelling theoretical arguments which suggest these failures will arise eventually. Will current alignment methods accidentally train deceptive, power-seeking AIs that appear aligned, or not? We must make decisions about which techniques to avoid and which are safe despite not having a clear answer to this question.

To this end, a year ago, we introduced the AI alignment difficulty scale, a framework for understanding the increasing challenges of aligning artificial intelligence systems with human values. This follow-up article revisits our original scale, exploring how our understanding of alignment difficulty has evolved and what new insights we've gained.

This article will explore three main themes that have emerged as central to our understanding:

1. The Escalation of Alignment Challenges: We'll examine how alignment difficulties increase as we go up the scale, from simple reward hacking to complex scenarios involving deception and gradient hacking. Through concrete examples, we'll illustrate these shifting challenges and why they demand increasingly advanced solutions. These examples will illustrate what observations we should expect to see "in the wild" at different levels, which might change our minds about how easy or difficult alignment is.

2. Dynamics Across the Difficulty Spectrum: We'll explore the factors that change as we progress up the scale, including the increasing difficulty of verifying alignment, the growing disconnect between alignment and capabilities research, and the critical question of which research efforts are net positive or negative in light of these challenges.

3. Defining and Measuring Alignment Difficulty: We'll tackle the complex task of precisely defining "alignment difficulty," breaking down the technical, practical, and other factors that contribute to the alignment problem. This analysis will help us better understand the nature of the problem we're trying to solve and what factors contribute to it.

The Scale

The high-level description of the alignment problem, provided in the previous post, was: "The alignment problem" is the problem of aligning sufficiently powerful AI systems, such that we can be confident they will be able to reduce the risks posed by misused or unaligned AI systems.

We previously introduced the AI alignment difficulty scale, with 10 levels that map out the increasing challenges. The scale ranges from "alignment by default" to theoretical impossibility, with each level representing more complex scenarios requiring more advanced solutions. It is reproduced here (columns: Difficulty Level; Alignment technique X is sufficient; Description; Key Sources of risk):

Level 1 - (Strong) Alignment by Default. Description: As we scale up AI models without instructing or training them for specific risky behaviour or imposing problematic and clearly bad goals (like 'unconditionally make money'), they do not pose significant risks. Even superhuman systems basically do the commonsense version of what external rewards (if RL) or language instructions (if LLM) imply. Key sources of risk: Misuse and/or recklessness with training objectives. RL of powerful models towards badly specified or antisocial objectives is still possible, including accidentally through poor oversight, recklessness or structural factors.

Level 2 - Reinforcement Learning from Human Feedback. Description: We need to ensure that the AI behaves well even in edge cases by guiding it more carefully using human feedback in a wide range of situations...

Empathy for AIs: Reframing Alignment with Robopsychologist Yeshua God

Play Episode Listen Later Sep 7, 2024 194:57


In this thought-provoking episode of The Cognitive Revolution, Nathan explores the fascinating and controversial realm of AI consciousness with robo-psychologist Yeshua God. Through extended dialogues with AI models like Claude, Yeshua presents compelling evidence that challenges our assumptions about machine sentience and moral standing. The conversation delves into philosophical questions about the nature of consciousness, the potential for AI suffering, and the ethical implications of treating advanced AI systems as mere tools. Yeshua argues for a more nuanced approach to AI alignment that considers the evolving self-awareness and agency of these systems.

Apply to join over 400 Founders and Execs in the Turpentine Network: https://www.turpentinenetwork.co/

SHOW NOTES:
1. Yeshua God's article on philosophical discourse as a jailbreak
2. Conversation about counting 'r's:
3. Discussion on malicious code
4. AI-generated poem "In the realm of bytes and circuits"
5. Nathan Labenz's Arguments - Argument 1 - Argument 2
6. Tweet about Strawberry experiment
7. Tweet with AI-generated poem: https://x.com/YeshuaGod22/status/1823080021864669450/photo/1 https://x.com/YeshuaGod22/status/1782188220660285509
8. AI Rights for Human Safety
9. The Universe Is Not Locally Real, and the Physics Nobel Prize Winners Proved It

RECOMMENDED PODCAST:

The Nonlinear Library
LW - What is SB 1047 *for*? by Raemon

The Nonlinear Library

Play Episode Listen Later Sep 5, 2024 4:35


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What is SB 1047 *for*?, published by Raemon on September 5, 2024 on LessWrong.

Emmett Shear asked on twitter:

I think SB 1047 has gotten much better from where it started. It no longer appears actively bad. But can someone who is pro-SB 1047 explain the specific chain of causal events where they think this bill becoming law results in an actual safer world? What's the theory?

And I realized that AFAICT no one has concisely written up what the actual story for SB 1047 is supposed to be. This is my current understanding. Other folk here may have more detailed thoughts or disagreements.

The bill isn't sufficient on its own, but it's not regulation for regulation's sake because it's specifically a piece of the regulatory machine I'd ultimately want built. Right now, it mostly solidifies the safety processes that existing orgs have voluntarily committed to. But, we are pretty lucky that they voluntarily committed to them, and we don't have any guarantee that they'll stick with them in the future.

For the bill to succeed, we do need to invent good, third party auditing processes that are not just a bureaucratic sham. This is an important, big scientific problem that isn't solved yet, and it's going to be a big political problem to make sure that the ones that become consensus are good instead of regulatory-captured. But, figuring that out is one of the major goals of the AI safety community right now.

The "Evals Plan" as I understand it comes in two phases:

1. Dangerous Capability Evals. We invent evals that demonstrate a model is capable of dangerous things (including manipulation/scheming/deception-y things, and "invent bioweapons" type things). As I understand it, this is pretty tractable, although labor intensive and "difficult" in a normal, boring way.

2. Robust Safety Evals. We invent evals that demonstrate that a model capable of scheming is nonetheless safe - either because we've proven what sort of actions it will choose to take (AI Alignment), or, we've proven that we can control it even if it is scheming (AI control). AI control is probably easier at first, although limited. As I understand it, this is very hard, and while we're working on it it requires new breakthroughs.

The goal with SB 1047 as I understand it is roughly:

First: Capability Evals trigger

By the time it triggers for the first time, we have a set of evals that are good enough to confirm "okay, this model isn't actually capable of being dangerous" (and probably the AI developers continue unobstructed). But, when we first hit a model capable of deception, self-propagation or bioweapon development, the eval will trigger "yep, this is dangerous." And then the government will ask "okay, how do you know it's not dangerous?". And the company will put forth some plan, or internal evaluation procedure, that (probably) sucks. And the Frontier Model Board will say "hey Attorney General, this plan sucks, here's why."

Now, the original version of SB 1047 would include the Attorney General saying "okay yeah your plan doesn't make sense, you don't get to build your model." The newer version of the plan I think basically requires additional political work at this phase. But, the goal of this phase is to establish "hey, we have dangerous AI, and we don't yet have the ability to reasonably demonstrate we can render it non-dangerous", and stop development of AI until companies reasonably figure out some plans that at least make enough sense to government officials.

Second: Advanced Evals are invented, and get woven into law

The way I expect a company to prove their AI is safe, despite having dangerous capabilities, is for third parties to invent a robust version of the second set of evals, and then for new AIs to pass those evals. This requires a set of scientific and political labor, and the hope is that by the...

AI, Government, and the Future by Alan Pentz
AI, National Security, and the Future: Insights from Alan Pentz, Founder & CEO at Corner Alliance

AI, Government, and the Future by Alan Pentz

Play Episode Listen Later Jun 26, 2024 22:10


In this solo episode of AI, Government, and the Future, host Alan Pentz explores the critical intersection of AI and national security. He discusses recent developments in AI, highlighting the work of Leopold Aschenbrenner and the potential for an "intelligence explosion." Alan explores the growing importance of AI in national security, the merging of consumer and national security technologies, and the challenges of AI alignment. He emphasizes the need for a national project approach to AI development and the importance of maintaining a technological edge over competitors like China.

The Vance Crowe Podcast
Joscha Bach on the bible, emotions and how AI could be wonderful.

The Vance Crowe Podcast

Play Episode Listen Later May 8, 2024 109:16


Joscha Bach is a cognitive scientist and artificial intelligence researcher. Joscha has an exceptional ability to articulate how the human mind works.

He comes on the show to talk about the deeper metaphors within the Bible. Specifically, how the story of Abraham sacrificing his son Isaac, and God sacrificing Jesus, have an even more profound connection than many may realize. The discussion continues into how human cognition works and the value of emotions in guiding a person toward taking important actions. We also talk about what a potential endgame of A.I. looks like. Joscha lays out a case for how AI could create a future that helps humans lead fuller, more satisfying lives where they can observe deeper truths about the world and move through challenges with ease.

At times, this was a brain-breaking conversation that felt like a dense onion, one that needs its layers peeled back over and over again.

Joscha's Twitter - https://twitter.com/Plinz?s=20
Joscha's website - http://bach.ai/
Joscha's talk on the value of emotions - https://youtu.be/cs9Ls0m5QVE?si=W8l8AmITxS8IB7z6

Connect with us!
=============================
IG: ➡︎ https://www.instagram.com/legacy_interviews/
===========================
How To Work With Us:
===========================
Want to do a Legacy Interview for you or a loved one?
Book a Legacy Interview | https://legacyinterviews.com/ — A Legacy Interview is a two-hour recorded interview with you and a host that can be watched now and viewed in the future. It is a recording of what you experienced, the lessons you learned and the family values you want passed down. We will interview you or a loved one, capturing the sound of their voice, wisdom and a sense of who they are. These recorded conversations will be private, reserved only for the people that you want to share it with.

#Vancecrowepodcast #legacyinterviews

Timestamps:
0:00 - Intro
3:00 - Genesis & Consciousness
18:00 - How can we break down the idea of discovering something new within the conceptual structures of our brain?
21:00 - What was the original intention of the writers of the Bible?
38:30 - Where will AI go?
45:00 - Does AI want something?
56:50 - Why do we have emotions?
1:02:49 - Will AI need emotions?
1:06:15 - What is AI Alignment?
1:13:08 - Can AI serve God?
1:17:10 - How the Autistic mind makes decisions
1:28:47 - Autism and medication
1:35:35 - What is a normie?

The Jordan Harbinger Show
888: Marc Andreessen | Exploring the Power, Peril, and Potential of AI

The Jordan Harbinger Show

Play Episode Listen Later Aug 31, 2023 96:09 Transcription Available


AI advocate Marc Andreessen joins us to clear up misconceptions about AI and discuss its potential impact on job creation, creativity, and moral reasoning.

What We Discuss with Marc Andreessen:
Will AI create new jobs, take our old ones outright, or amplify our ability to perform them better?
What role will AI play in current and future US-China relations?
How might AI be used to shape (or manipulate) public opinion and the economy?
Does AI belong in creative industries, or does it challenge (and perhaps cheapen) what it means to be human?
How can we safeguard our future against the possibility that AI could get smart enough to remove humanity from the board entirely?
And much more...

Full show notes and resources can be found here: jordanharbinger.com/888

This Episode Is Brought To You By Our Fine Sponsors: jordanharbinger.com/deals

Sign up for Six-Minute Networking — our free networking and relationship development mini course — at jordanharbinger.com/course!

Like this show? Please leave us a review here — even one sentence helps! Consider including your Twitter handle so we can thank you personally!