Podcasts about Human Compatible

  • 22 podcasts
  • 34 episodes
  • 34m average episode duration
  • 1 new episode per month
  • Latest episode: Feb 29, 2024

POPULARITY

(popularity chart, 2017–2024)


Best podcasts about Human Compatible

Latest podcast episodes about Human Compatible

The Nonlinear Library: LessWrong
LW - Bengio's Alignment Proposal: "Towards a Cautious Scientist AI with Convergent Safety Bounds" by mattmacdermott

Feb 29, 2024 · 22:21


Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Bengio's Alignment Proposal: "Towards a Cautious Scientist AI with Convergent Safety Bounds", published by mattmacdermott on February 29, 2024 on LessWrong. Yoshua Bengio recently posted a high-level overview of his alignment research agenda on his blog. I'm pasting the full text below since it's fairly short. What can't we afford with a future superintelligent AI? Among others, confidently wrong predictions about the harm that some actions could yield. Especially catastrophic harm. Especially if these actions could spell the end of humanity. How can we design an AI that will be highly capable and will not harm humans? In my opinion, we need to figure out this question - of controlling AI so that it behaves in really safe ways - before we reach human-level AI, aka AGI; and to be successful, we need all hands on deck. Economic and military pressures to accelerate advances in AI capabilities will continue to push forward even if we have not figured out how to make superintelligent AI safe. And even if some regulations and treaties are put into place to reduce the risks, it is plausible that human greed for power and wealth and the forces propelling competition between humans, corporations and countries, will continue to speed up dangerous technological advances. Right now, science has no clear answer to this question of AI control and how to align its intentions and behavior with democratically chosen values. It is a bit like in the "Don't Look Up" movie. Some scientists have arguments about the plausibility of scenarios (e.g., see "Human Compatible") where a planet-killing asteroid is headed straight towards us and may come close to the atmosphere. In the case of AI there is more uncertainty, first about the probability of different scenarios (including about future public policies) and about the timeline, which could be years or decades according to leading AI researchers. And there are no convincing scientific arguments which contradict these scenarios and reassure us for certain, nor is there any known method to "deflect the asteroid", i.e., avoid catastrophic outcomes from future powerful AI systems. With the survival of humanity at stake, we should invest massively in this scientific problem, to understand this asteroid and discover ways to deflect it. Given the stakes, our responsibility to humanity, our children and grandchildren, and the enormity of the scientific problem, I believe this to be the most pressing challenge in computer science that will dictate our collective wellbeing as a species. Solving it could of course help us greatly with many other challenges, including disease, poverty and climate change, because AI clearly has beneficial uses. In addition to this scientific problem, there is also a political problem that needs attention: how do we make sure that no one triggers a catastrophe or takes over political power when AGI becomes widely available or even as we approach it. See this article of mine in the Journal of Democracy on this topic. In this blog post, I will focus on an approach to the scientific challenge of AI control and alignment. Given the stakes, I find it particularly important to focus on approaches which give us the strongest possible AI safety guarantees. 
Over the last year, I have been thinking about this and I started writing about it in this May 2023 blog post (also see my December 2023 Alignment Workshop keynote presentation). Here, I will spell out some key thoughts that came out of a maturation of my reflection on this topic and that are driving my current main research focus. I have received funding to explore this research program and I am looking for researchers motivated by existential risk and with expertise in the span of mathematics (especially about probabilistic methods), machine learning (especially about amorti...

The Nonlinear Library
LW - Bengio's Alignment Proposal: "Towards a Cautious Scientist AI with Convergent Safety Bounds" by mattmacdermott

Feb 29, 2024 · 22:21


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Bengio's Alignment Proposal: "Towards a Cautious Scientist AI with Convergent Safety Bounds", published by mattmacdermott on February 29, 2024 on LessWrong. Yoshua Bengio recently posted a high-level overview of his alignment research agenda on his blog. I'm pasting the full text below since it's fairly short. What can't we afford with a future superintelligent AI? Among others, confidently wrong predictions about the harm that some actions could yield. Especially catastrophic harm. Especially if these actions could spell the end of humanity. How can we design an AI that will be highly capable and will not harm humans? In my opinion, we need to figure out this question - of controlling AI so that it behaves in really safe ways - before we reach human-level AI, aka AGI; and to be successful, we need all hands on deck. Economic and military pressures to accelerate advances in AI capabilities will continue to push forward even if we have not figured out how to make superintelligent AI safe. And even if some regulations and treaties are put into place to reduce the risks, it is plausible that human greed for power and wealth and the forces propelling competition between humans, corporations and countries, will continue to speed up dangerous technological advances. Right now, science has no clear answer to this question of AI control and how to align its intentions and behavior with democratically chosen values. It is a bit like in the "Don't Look Up" movie. Some scientists have arguments about the plausibility of scenarios (e.g., see "Human Compatible") where a planet-killing asteroid is headed straight towards us and may come close to the atmosphere. In the case of AI there is more uncertainty, first about the probability of different scenarios (including about future public policies) and about the timeline, which could be years or decades according to leading AI researchers. And there are no convincing scientific arguments which contradict these scenarios and reassure us for certain, nor is there any known method to "deflect the asteroid", i.e., avoid catastrophic outcomes from future powerful AI systems. With the survival of humanity at stake, we should invest massively in this scientific problem, to understand this asteroid and discover ways to deflect it. Given the stakes, our responsibility to humanity, our children and grandchildren, and the enormity of the scientific problem, I believe this to be the most pressing challenge in computer science that will dictate our collective wellbeing as a species. Solving it could of course help us greatly with many other challenges, including disease, poverty and climate change, because AI clearly has beneficial uses. In addition to this scientific problem, there is also a political problem that needs attention: how do we make sure that no one triggers a catastrophe or takes over political power when AGI becomes widely available or even as we approach it. See this article of mine in the Journal of Democracy on this topic. In this blog post, I will focus on an approach to the scientific challenge of AI control and alignment. Given the stakes, I find it particularly important to focus on approaches which give us the strongest possible AI safety guarantees. 
Over the last year, I have been thinking about this and I started writing about it in this May 2023 blog post (also see my December 2023 Alignment Workshop keynote presentation). Here, I will spell out some key thoughts that came out of a maturation of my reflection on this topic and that are driving my current main research focus. I have received funding to explore this research program and I am looking for researchers motivated by existential risk and with expertise in the span of mathematics (especially about probabilistic methods), machine learning (especially about amorti...

20 Minute Books
Human Compatible - Book Summary

Dec 17, 2023 · 22:35


"Artificial Intelligence and the Problem of Control"

Philosophical Disquisitions
TITE 3 - Value Alignment and the Control Problem

Oct 10, 2023


In this episode, John and Sven discuss risk and technology ethics. They focus, in particular, on the perennially popular and widely discussed problems of value alignment (how to get technology to align with our values) and control (making sure technology doesn't do something terrible). They start the conversation with the famous case study of Stanislav Petrov and the prevention of nuclear war. You can listen below or download the episode here. You can also subscribe to the podcast on Apple, Spotify, Google, Amazon and a range of other podcasting services. Recommendations for further reading: Atoosa Kasirzadeh and Iason Gabriel, 'In Conversation with AI: Aligning Language Models with Human Values'; Nick Bostrom, relevant chapters from Superintelligence; Stuart Russell, Human Compatible; Langdon Winner, 'Do Artifacts Have Politics?'; Iason Gabriel, 'Artificial Intelligence, Values and Alignment'; Brian Christian, The Alignment Problem. Discount: You can purchase a 20% discounted copy of This is Technology Ethics by using the code TEC20 at the publisher's website.

The Next Big Idea
HUMAN COMPATIBLE: Can We Control Artificial Intelligence?

Sep 28, 2023 · 72:25


Stuart Russell wrote the book on artificial intelligence. Literally. Today, he sits down with Rufus to discuss the promise — and potential peril — of the technology he's been studying for the last 40 years. --- Book: “Human Compatible: Artificial Intelligence and the Problem of Control” Host: Rufus Griscom Guest: Stuart Russell

The Nonlinear Library: LessWrong
LW - I designed an AI safety course (for a philosophy department) by Eleni Angelou

Sep 24, 2023 · 5:35


Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: I designed an AI safety course (for a philosophy department), published by Eleni Angelou on September 24, 2023 on LessWrong. Background In the fall of 2023, I'm teaching a course called "Philosophy and The Challenge of the Future"[1] which is focused on AI risk and safety. I designed the syllabus keeping in mind that my students: will have no prior exposure to what AI is or how it works will not necessarily have a strong philosophy background (the course is offered by the Philosophy department, but is open to everyone) will not necessarily be familiar with Effective Altruism at all Goals My approach combines three perspectives: 1) philosophy, 2) AI safety, and 3) Science, Technology, Society (STS); this combination reflects my training in these fields and attempts to create an alternative introduction to AI safety (that doesn't just copy the AISF curriculum). That said, I plan to recommend the AISF course towards the end of the semester; since my students are majoring in all sorts of different things, from CS to psychology, it'd be great if some of them considered AI safety research as their career path. Course Overview INTRO TO AI Week 1 (8/28-9/1): The foundations of Artificial Intelligence (AI) Required Readings: Artificial Intelligence, A Modern Approach, pp. 1-27, Russell & Norvig. Superintelligence, pp. 1-16, Bostrom. Week 2 (9/5-8): AI, Machine Learning (ML), and Deep Learning (DL) Required Readings: You Look Like a Thing and I Love You, Chapters 1, 2, and 3, Shane. But what is a neural network? (video) ML Glossary (optional but helpful for terminological references) Week 3 (9/11-16): What can current AI models do? Required Readings: Artificial Intelligence, A Modern Approach, pp. 27-34, Russell & Norvig. ChatGPT Explained (video) What is Stable Diffusion? (video) AI AND THE FUTURE OF HUMANITY Week 4 (9/18-22): What are the stakes? Required Readings: The Precipice, pp. 15-21, Ord. Existential risk and human extinction: An intellectual history, Moynihan. Everything might change forever this century (video) Week 5 (9/25-29): What are the risks? Required Readings: Taxonomy of Risks posed by Language Models, Weidinger et al. Human Compatible, pp. 140-152, Russell. Loss of Control: "Normal Accidents and AI Systems", Chan. Week 6 (10/2-6): From Intelligence to Superintelligence Required Readings: A Collection of Definitions of Intelligence, Legg & Hutter. Artificial Intelligence as a positive and negative factor in global risk, Yudkowsky. Paths to Superintelligence, Bostrom. Week 7 (10/10-13): Human-Machine interaction and cooperation Required Readings: Cooperative AI: machines must learn to find common ground, Dafoe et. al. AI-written critiques help humans notice flaws AI Generates Hypotheses Human Scientists Have Not Thought Of THE BASICS OF AI SAFETY Week 8 (10/16-20): Value learning and goal-directed behavior Required Readings: Machines Learning Values, Petersen. The Basic AI Drives, Omuhundro. The Value Learning Problem, Soares. Week 9 (10/23-27): Instrumental rationality and the orthogonality thesis Required Readings: The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents, Bostrom. General Purpose Intelligence: Arguing The Orthogonality Thesis, Armstrong. 
METAPHYSICAL & EPISTEMOLOGICAL CONSIDERATIONS Week 10 (10/30-11/4): Thinking about the Singularity Required Readings: The Singularity: A Philosophical Analysis, Chalmers. Can Intelligence Explode?, Hutter. Week 11 (11/6-11): AI and Consciousness Required Readings: Could a Large Language Model be Conscious?, Chalmers. Will AI Achieve Consciousness? Wrong Question, Dennett. ETHICAL QUESTIONS Week 12 (11/13-17): What are the moral challenges of high-risk technologies? Required Readings: Human Compatible, "Misuses of AI", Russell. The Ethics of Invention, "Risk and Respon...

The Nonlinear Library
LW - I designed an AI safety course (for a philosophy department) by Eleni Angelou

Sep 24, 2023 · 5:35


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: I designed an AI safety course (for a philosophy department), published by Eleni Angelou on September 24, 2023 on LessWrong. Background In the fall of 2023, I'm teaching a course called "Philosophy and The Challenge of the Future"[1] which is focused on AI risk and safety. I designed the syllabus keeping in mind that my students: will have no prior exposure to what AI is or how it works will not necessarily have a strong philosophy background (the course is offered by the Philosophy department, but is open to everyone) will not necessarily be familiar with Effective Altruism at all Goals My approach combines three perspectives: 1) philosophy, 2) AI safety, and 3) Science, Technology, Society (STS); this combination reflects my training in these fields and attempts to create an alternative introduction to AI safety (that doesn't just copy the AISF curriculum). That said, I plan to recommend the AISF course towards the end of the semester; since my students are majoring in all sorts of different things, from CS to psychology, it'd be great if some of them considered AI safety research as their career path. Course Overview INTRO TO AI Week 1 (8/28-9/1): The foundations of Artificial Intelligence (AI) Required Readings: Artificial Intelligence, A Modern Approach, pp. 1-27, Russell & Norvig. Superintelligence, pp. 1-16, Bostrom. Week 2 (9/5-8): AI, Machine Learning (ML), and Deep Learning (DL) Required Readings: You Look Like a Thing and I Love You, Chapters 1, 2, and 3, Shane. But what is a neural network? (video) ML Glossary (optional but helpful for terminological references) Week 3 (9/11-16): What can current AI models do? Required Readings: Artificial Intelligence, A Modern Approach, pp. 27-34, Russell & Norvig. ChatGPT Explained (video) What is Stable Diffusion? (video) AI AND THE FUTURE OF HUMANITY Week 4 (9/18-22): What are the stakes? Required Readings: The Precipice, pp. 15-21, Ord. Existential risk and human extinction: An intellectual history, Moynihan. Everything might change forever this century (video) Week 5 (9/25-29): What are the risks? Required Readings: Taxonomy of Risks posed by Language Models, Weidinger et al. Human Compatible, pp. 140-152, Russell. Loss of Control: "Normal Accidents and AI Systems", Chan. Week 6 (10/2-6): From Intelligence to Superintelligence Required Readings: A Collection of Definitions of Intelligence, Legg & Hutter. Artificial Intelligence as a positive and negative factor in global risk, Yudkowsky. Paths to Superintelligence, Bostrom. Week 7 (10/10-13): Human-Machine interaction and cooperation Required Readings: Cooperative AI: machines must learn to find common ground, Dafoe et. al. AI-written critiques help humans notice flaws AI Generates Hypotheses Human Scientists Have Not Thought Of THE BASICS OF AI SAFETY Week 8 (10/16-20): Value learning and goal-directed behavior Required Readings: Machines Learning Values, Petersen. The Basic AI Drives, Omuhundro. The Value Learning Problem, Soares. Week 9 (10/23-27): Instrumental rationality and the orthogonality thesis Required Readings: The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents, Bostrom. General Purpose Intelligence: Arguing The Orthogonality Thesis, Armstrong. 
METAPHYSICAL & EPISTEMOLOGICAL CONSIDERATIONS Week 10 (10/30-11/4): Thinking about the Singularity Required Readings: The Singularity: A Philosophical Analysis, Chalmers. Can Intelligence Explode?, Hutter. Week 11 (11/6-11): AI and Consciousness Required Readings: Could a Large Language Model be Conscious?, Chalmers. Will AI Achieve Consciousness? Wrong Question, Dennett. ETHICAL QUESTIONS Week 12 (11/13-17): What are the moral challenges of high-risk technologies? Required Readings: Human Compatible, "Misuses of AI", Russell. The Ethics of Invention, "Risk and Respon...

The Foresight Institute Podcast
Stuart Russell | Human Compatible AI

Sep 15, 2023 · 72:30


Stuart Jonathan Russell is a British computer scientist known for his contributions to artificial intelligence. He is a professor of computer science at the University of California, Berkeley and was from 2008 to 2011 an adjunct professor of neurological surgery at the University of California, San Francisco. He holds the Smith-Zadeh Chair in Engineering at the University of California, Berkeley. He founded and leads the Center for Human-Compatible Artificial Intelligence (CHAI) at UC Berkeley. Russell is the co-author with Peter Norvig of the most popular textbook in the field of AI, Artificial Intelligence: A Modern Approach, used in more than 1,500 universities in 135 countries. Summary: This episode explores the development of artificial intelligence (AI) and its potential impact on humanity. Russell emphasizes the importance of aligning AI systems with human values to ensure their compatibility and avoid unintended consequences. He discusses the risks associated with AI development and proposes the use of provably beneficial AI, which prioritizes human values and addresses concerns such as safety and control. Russell argues for the need to reframe AI research and policymaking to prioritize human well-being and ethical considerations. Computation: Intelligent Cooperation Foresight Group – info and apply to join: https://foresight.org/technologies/computation-intelligent-cooperation/ The Foresight Institute is a research organization and non-profit that supports the beneficial development of high-impact technologies. Since our founding in 1987 on a vision of guiding powerful technologies, we have continued to evolve into a many-armed organization that focuses on several fields of science and technology that are too ambitious for legacy institutions to support. Allison Duettmann is the president and CEO of Foresight Institute. She directs the Intelligent Cooperation, Molecular Machines, Biotech & Health Extension, Neurotech, and Space Programs, Fellowships, Prizes, and Tech Trees, and shares this work with the public. She founded Existentialhope.com, co-edited Superintelligence: Coordination & Strategy, co-authored Gaming the Future, and co-initiated The Longevity Prize. Apply to Foresight's virtual salons and in-person workshops here! We are entirely funded by your donations. If you enjoy what we do please consider donating through our donation page. Visit our website for more content, or join us here: Twitter, Facebook, LinkedIn. Every word ever spoken on this podcast is now AI-searchable using Fathom.fm, a search engine for podcasts.

University of California Audio Podcasts (Audio)
How Not To Destroy The World With AI

Jun 16, 2023 · 58:11


This series on artificial intelligence explores recent breakthroughs in AI, its broader societal implications and its future potential. In this presentation, Stuart Russell, professor of computer science at UC Berkeley, discusses what AI is and how it could be beneficial to civilization. Russell is a leading researcher in artificial intelligence and the author, with Peter Norvig, of “Artificial Intelligence: A Modern Approach,” the standard text in the field. His latest book, “Human Compatible,” addresses the long-term impact of AI on humanity. He is also an honorary fellow of Wadham College at the University of Oxford. Series: "The Future of AI" [Science] [Show ID: 38856]

Science (Audio)
How Not To Destroy The World With AI

Jun 16, 2023 · 58:11


This series on artificial intelligence explores recent breakthroughs in AI, its broader societal implications and its future potential. In this presentation, Stuart Russell, professor of computer science at UC Berkeley, discusses what AI is and how it could be beneficial to civilization. Russell is a leading researcher in artificial intelligence and the author, with Peter Norvig, of “Artificial Intelligence: A Modern Approach,” the standard text in the field. His latest book, “Human Compatible,” addresses the long-term impact of AI on humanity. He is also an honorary fellow of Wadham College at the University of Oxford. Series: "The Future of AI" [Science] [Show ID: 38856]

Science (Video)
How Not To Destroy The World With AI

Jun 16, 2023 · 58:11


This series on artificial intelligence explores recent breakthroughs in AI, its broader societal implications and its future potential. In this presentation, Stuart Russell, professor of computer science at UC Berkeley, discusses what AI is and how it could be beneficial to civilization. Russell is a leading researcher in artificial intelligence and the author, with Peter Norvig, of “Artificial Intelligence: A Modern Approach,” the standard text in the field. His latest book, “Human Compatible,” addresses the long-term impact of AI on humanity. He is also an honorary fellow of Wadham College at the University of Oxford. Series: "The Future of AI" [Science] [Show ID: 38856]

UC Berkeley (Audio)
How Not To Destroy The World With AI

Jun 16, 2023 · 58:11


This series on artificial intelligence explores recent breakthroughs in AI, its broader societal implications and its future potential. In this presentation, Stuart Russell, professor of computer science at UC Berkeley, discusses what AI is and how it could be beneficial to civilization. Russell is a leading researcher in artificial intelligence and the author, with Peter Norvig, of “Artificial Intelligence: A Modern Approach,” the standard text in the field. His latest book, “Human Compatible,” addresses the long-term impact of AI on humanity. He is also an honorary fellow of Wadham College at the University of Oxford. Series: "The Future of AI" [Science] [Show ID: 38856]

The Nonlinear Library
EA - I have thousands of copies of HPMOR in Russian. How to use them with the most impact? by Samin

Dec 27, 2022 · 3:22


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: I have thousands of copies of HPMOR in Russian. How to use them with the most impact?, published by Samin on December 27, 2022 on The Effective Altruism Forum. As a result of a crowdfunding campaign a couple of years ago, I printed 21k copies of HPMOR. 11k of those went to the crowdfunding participants. I need ideas for how to use the ones that are left with the most impact. Over the last few weeks, I've sent 400 copies to winners of IMO, IOI, and other international and Russian olympiads in math, computer science, biology, etc. (and also bought and sent copies of Human Compatible in Russian to about 250 of them who requested it as well). Over the next month, we'll get the media to post about the opportunity for winners of certain olympiads to get free books. I estimate that we can get maybe twice as many (800-1000) impressive students to fill out a form for getting HPMOR and get maybe up to 30k people who follow the same media to read HPMOR online if everything goes well (uncertain estimates, from the previous experience). The theory of change behind sending the books to winners of olympiads is that people with high potential read HPMOR and share it with friends, get some EA-adjacent mindset and values from it, and then get introduced to EA in emails about 80k Hours (which is being translated into Russian) and other EA and cause-specific content, and start participating in the EA community and get more into specific cause areas that interest them. The anecdotal evidence is that most of the Russian EAs, many of whom now work full-time at EA orgs or as independent researchers, got into EA after reading HPMOR and then the LW sequences. It's mostly impossible to give the books for donations to EA charities because Russians can't transfer money to EA charities due to sanctions. Delivering the copies costs around $5-10 in Russia and $40-100 outside Russia. There are some other ideas, but nothing that lets us spend thousands of books effectively, so we need more. So, any ideas on how to spend 8-9k of HPMOR copies in Russian? HPMOR has endorsements from a bunch of Russians important to different audiences. That includes some of the most famous science communicators, computer science professors, literature critics, a person training the Moscow team for math competitions, etc.; currently, there are news media and popular science resources with millions of followers managed by people who've read HPMOR or heard of it in a positive way and who I can talk to. After the war started, some of them were banned, but that doesn't affect the millions of followers they have on, e.g., Telegram. Initially, we planned first to translate and print 80k and then give HPMOR to people together with copies of 80k, but there's now time pressure due to uncertainty over the future legal status of HPMOR in Russia. Recently, a law came into effect that prohibited all sorts of mentions of LGBTQ in books, and it's not obvious whether HPMOR is at risk, so it's better to spend a lot of books now. E.g., there's the Yandex School of Data Analysis, one of the top places for students in Russia to get into machine learning; we hope to be able to get them to give hundreds of copies of HPMOR to students who've completed a machine learning course and give us their emails. 
That might result in more people familiar with the alignment problem in positions where they can help prevent an AI arms race started by Russia. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

Martin Skadal podcast
Otto Barten: Existential Risk 101, AI, Biotechnology, Unknown Unknowns | Martin Skadal podcast #03

Dec 11, 2022 · 45:33


Thank you for listening to my new podcast project! - Timestamps - 00:00 - Start 02:40 - Existential Risk 101 04:05 - How Otto Works with Existential Risk 06:20 - The Biggest Existential Risk: Artificial Intelligence 10:17 - Why Biotechnology Might Cause Human Extinction 13:05 - Nanotechnology as an Existential Risk 15:07 - Unknown Unknowns: Risks of the Next Centuries 16:21 - Nuclear War On Human Extinction 18:13 - How Climate Change Impacts Existential Risk 21:22 - The Existential Risk of Natural Phenomena 23:40 - Ways to Reduce the Existential Risks 37:38 - How People Can Take Part 42:07 - Otto's Future plans and Last Thoughts 42:33 - Outro Specific links as I referred to in this episode: - https://www.existentialriskobservatory.org/ - https://twitter.com/XRobservatory - https://www.instagram.com/existentialriskobservatory/ - https://www.linkedin.com/company/71570894/ - https://www.facebook.com/Existential-Risk-Observatory-104387338403042 - The Precipice, Toby Ord (https://www.amazon.com/Precipice-Existential-Risk-Future-Humanity/dp/0316484911) - Superintelligence, Nick Bostrom (https://www.amazon.com/Superintelligence-Dangers-Strategies-Nick-Bostrom/dp/1501227742) - Human Compatible, Stuart Russell (https://www.amazon.com/Human-Compatible/dp/0141987502) - The case for reducing existential risks, 80000 hours (https://80000hours.org/articles/existential-risks/) - Wikipedia: https://en.wikipedia.org/wiki/Existential_risk_from_artificial_general_intelligence - Wait but Why: https://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1.html As I want to run this podcast ad-free, the best way to support me is through Patreon: https://www. patreon.com/martinskadal If you live in Norway, you can consider becoming a support member in the two organizations I run. It costs NOK 50 a year. • Become a support member of WSH: https://forms.gle/ogwYPF1c62a59TsRA • Become a support member of AY: https://forms.gle/LSa4P1gyyyUmDsuP7 If you want to become a volunteer for World Saving Hustle or Altruism for Youth, send me an email and I'll forward it to our team. It might take some time before you'll get an answer as we're currently run by volunteers, but you'll get an answer eventually! Do you have any feedback, questions, suggestions for either topics/guests, let me know in the comment section. If you want to get in touch, the best way is through email: martin@worldsavinghustle.com Thanks to everyone in World Saving Hustle backing up this project and thanks to my creative partner Candace for editing this podcast! Thanks everyone and have an amazing day as always!! • instagram https://www.instagram.com/skadal/ • linkedin https://www.linkedin.com/in/martinska.. . • facebook https://www.facebook.com/martinsskadal/ • twitter https://twitter.com/martinskadal • Norwegian YT https://www.youtube.com/@martinskadal353 • Patreon https://www.patreon.com/martinskadal

The Gradient Podcast
Stuart Russell: The Foundations of Artificial Intelligence

Oct 6, 2022 · 70:47


Have suggestions for future podcast guests (or other feedback)? Let us know here! In episode 44 of The Gradient Podcast, Daniel Bashir speaks to Professor Stuart Russell. Stuart Russell is a Professor of Computer Science and the Smith-Zadeh Professor in Engineering at UC Berkeley, as well as an Honorary Fellow at Wadham College, Oxford. Professor Russell is the co-author with Peter Norvig of Artificial Intelligence: A Modern Approach, probably the most popular AI textbook in history. He is the founder and head of Berkeley's Center for Human-Compatible Artificial Intelligence and recently authored the book Human Compatible: Artificial Intelligence and the Problem of Control. He has also served as co-chair on the World Economic Forum's Council on AI and Robotics. Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter. Outline: * (00:00) Intro * (02:45) Stuart's introduction to AI * (05:50) The two most important questions * (07:25) Historical perspectives during Stuart's PhD, agents and learning * (14:30) Rationality and Intelligence, Bounded Optimality * (20:30) Stuart's work on Metareasoning * (29:45) How does Metareasoning fit with Bounded Optimality? * (37:39) “Civilization advances by reducing complex operations to be trivial” * (39:20) Reactions to the rise of Deep Learning, connectionist/symbolic debates, probabilistic modeling * (51:00) The Deep Learning and traditional AI communities will adopt each other's ideas * (51:55) Why Stuart finds the self-driving car arena interesting, Waymo's old-fashioned AI approach * (57:30) Effective generalization without the full expressive power of first-order logic—deep learning is a “weird way to go about it” * (1:03:00) A very short shrift of Human Compatible and its ideas * (1:10:42) Outro. Links: * Stuart's webpage * Human Compatible page with reviews and interviews * Papers mentioned * Rationality and Intelligence * Principles of Metareasoning. Get full access to The Gradient at thegradientpub.substack.com/subscribe

The Nonlinear Library
AF - What Should AI Owe To Us? Accountable and Aligned AI Systems via Contractualist AI Alignment by Xuan (Tan Zhi Xuan)

Sep 8, 2022 · 42:45


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What Should AI Owe To Us? Accountable and Aligned AI Systems via Contractualist AI Alignment, published by Xuan (Tan Zhi Xuan) on September 8, 2022 on The AI Alignment Forum. This is an extended and edited transcript of the talk I recently gave at EAGxSingapore 2022. The title has been changed for easier searchability of "Contractualist AI Alignment". Abstract: Artificial intelligence seems poised to alter our civilization in transformative ways. How can we align this development with our collective interests? Dominant trends in AI alignment research adopt a preference utilitarian conception of alignment, but this faces practical challenges when extended to a multiplicity of humans, values, and AI systems. This talk develops contractualist AI alignment as an alternative framework, charting out a vision of societal-scale alignment where AI systems can serve a plurality of roles and values, governed by and accountable to collectively decided, role-specific norms, with technical work ensuring compliance with these overlapping social contracts in the face of normative ambiguity. Overview This talk is an attempt to condense a lot of my thinking about AI alignment over the past few years, and why I think we need to orient the field towards a different set of questions and directions than have typically been pursued so far. It builds upon many ideas in my previous talk on AI alignment and philosophical pluralism, as well as arguments in Comprehensive AI Services as General Intelligence, AI Research Considerations for Human Existential Safety (ARCHES), How AI Fails Us, and Gillian Hadfield's work on The Foundations of Cooperative Intelligence. This will cover a lot of ground, so below is a quick overview: The dominant "preference utilitarian" framework in AI alignment research. Challenges for extending this framework to a multiplicity of humans, values, and autonomous systems. Considerations and desiderata that a successful approach to society-scale AI alignment should address. Pluralist and contractualist AI alignment as an alternate framework, including implications for governance, technical research, and philosophical foundations. Alignment: A Preference Utilitarian Approach One way of describing the framework that most alignment researchers implicitly adopt is a "preference utilitarian" approach. Stuart Russell's 3 Principles for Beneficial AI are good summary of this approach. Recognizing that many dangers arise when machines optimize for proxy metrics that ultimately differ from human values, he instead advocates that: The machine's only objective is to maximize the realization of human preferences. The machine is initially uncertain about what those preferences are. The ultimate source of information about human preferences is human behavior. Stuart Russell, Human Compatible (2019) More broadly, many researchers frame the problem as one of utility matching. Under certain assumptions, a single human's preferences can be represented as a utility function over outcomes, and the goal is to build AI systems that optimize the same utility function. (This is implicit in talk about, e.g., objective functions in inner misalignment, and reward modeling, which suggested that human objectives and values can ultimately be represented as a mapping to a scalar quantity called "reward" or "utility".) Why is this hard? 
Because while it may be possible to ensure that the system does the right thing during development, it's much harder to ensure this during deployment, especially as systems become more capable of achieving new outcomes. For example, a self-driving car might safely avoid obstacles for all situations it was trained on. But at deployment, the objective it's effectively maximizing for might be much more positive than the true human objective, leading to car crashes. So the...
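To make the proxy-optimization worry summarized above concrete, here is a small illustrative sketch (the objectives, numbers, and function names are invented for illustration and are not from the talk): a proxy objective that agrees with a hypothetical true objective on the states reachable during development can diverge sharply once a more capable system can reach states far outside that range.

```python
# Toy sketch of the proxy-objective failure described above.
# All functions and numbers are illustrative, not taken from the talk.
import numpy as np

def true_utility(x):
    # Hypothetical human objective: best at x = 1, worse the further the agent strays.
    return -(x - 1.0) ** 2

def proxy_utility(x):
    # Proxy metric the system actually optimizes: "more x is always better".
    return x

# During development the agent only reaches states with x in [0, 1],
# where both objectives increase together, so the proxy looks aligned.
for x in np.linspace(0.0, 1.0, 5):
    print(f"dev   x={x:.2f}  proxy={proxy_utility(x):+.2f}  true={true_utility(x):+.2f}")

# At deployment a more capable agent can reach x up to 10 and pushes the proxy hard.
deploy_states = np.linspace(0.0, 10.0, 101)
best = max(deploy_states, key=proxy_utility)
print(f"proxy-optimal state: x={best:.1f}, true utility there = {true_utility(best):.1f}")
```

On the narrow development range the two objectives rank states the same way, which is why the mismatch only becomes visible once the deployed system can push further than it ever did during development.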

The Nonlinear Library
AF - I missed the crux of the alignment problem the whole time by zeshen

Aug 13, 2022 · 5:18


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: I missed the crux of the alignment problem the whole time, published by zeshen on August 13, 2022 on The AI Alignment Forum. This post has been written for the first Refine blog post day, at the end of the week of readings, discussions, and exercises about epistemology for doing good conceptual research. Thanks to Adam Shimi for helpful discussion and comments. I first got properly exposed to AI alignment ~1-2 years ago. I read the usual stuff like Superintelligence, The Alignment Problem, Human Compatible, a bunch of posts on LessWrong and Alignment Forum, watched all of Rob Miles' videos, and participated in the AGI Safety Fundamentals program. I recently joined Refine and had more conversations with people, and realized I didn't really get the crux of the problem all this while. I knew that superintelligent AI would be very powerful and would Goodhart whatever goals we give it, but I never really got how this relates to basically ‘killing us all'. It feels basically right that AIs will be misaligned by default and will do stuff that is not what we want it to do while pursuing instrumentally convergent goals all along. But the possible actions that such an AI could take seemed so numerous that ‘killing all of humanity' seemed like such a small point in the whole actionspace of the AI, that it would require extreme bad luck for us to be in that situation. First, this seems partially due to my background as a non-software engineer in oil and gas, an industry that takes safety very very seriously. In making a process safe, we quantify the risks of an activity, understand the bounds of the potential failure modes, and then take actions to mitigate against those risks and also implement steps to minimize damage should a failure mode be realized. How I think about safety is from the perspective of specific risk events and the associated probabilities, coupled with the exact failure modes of those risks. This thinking may have hindered my ability to think of the alignment problem in abstract terms, because I focused on looking for specific failure modes that I could picture in my head. Second, there are a few failure modes that seem more popular in the introductory reading materials that I was exposed to. None of them helped me internalize the crux of the problem. The first was the typical paperclip maximizer or ‘superintelligent AI will kill all of us' scenario. It feels like sci-fi that is not grounded in reality, leading to me failing to internalize the point about unboundedness. I do not dispute that a superintelligent AI will have the capabilities to destroy all of humanity, but it doesn't feel like it would actually do so. The other failure modes were from Paul Christiano's post which in my first reading boiled down to ‘powerful AIs will accelerate present-day societal failures but not pose any additional danger', as well as Andrew Critch's post which felt to me like ‘institutions have structurally perverse incentives that lead to the tragedy of the commons'. In my shallow understanding of both of these posts, current human societies have failure modes that will be accelerated by AIs because AIs basically speed things up, whether they are good or bad. So these scenarios were too close to normal scenarios to let me internalize the crux about unboundedness. My internal model of a superintelligent AI was a very powerful tool AI. 
I didn't really get why we are trying to ‘align it to human values' because I didn't really see human values as the crux of the problem, nor did I think having a superintelligent AI being fully aligned to a human's value would be particularly useful. Which human's values are we talking about anyway? Would it be any good for an AI to fully adopt human values only to end up like Hitler, who is no less a human than any of us are? The phrase ‘power...

EARadio
SERI 2022: Well Founded and Human Compatible AI | Stuart Russell

Aug 6, 2022 · 43:22


Stuart Russell is a Professor of Computer Science at the University of California at Berkeley, holder of the Smith-Zadeh Chair in Engineering, and Director of the Center for Human-Compatible AI. His book "Artificial Intelligence: A Modern Approach" (with Peter Norvig) is the standard text in AI, used in 1500 universities in 135 countries. His research covers a wide range of topics in artificial intelligence, with a current emphasis on the long-term future of artificial intelligence and its relation to humanity. This talk was first published by the Stanford Existential Risks Initiative. Click here to view it with the slideshow. 

The Nonlinear Library
AF - Updated Deference is not a strong argument against the utility uncertainty approach to alignment by Ivan Vendrov

Jun 24, 2022 · 6:48


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Updated Deference is not a strong argument against the utility uncertainty approach to alignment, published by Ivan Vendrov on June 24, 2022 on The AI Alignment Forum. Thesis: The problem of fully updated deference is not a strong argument against the viability of the assistance games / utility uncertainty approach to AI (outer) alignment. Background: A proposed high-level approach to AI alignment is to have the AI maintain a probability distribution over possible human utility functions instead of optimizing for any particular fixed utility function. Variants of this approach were advocated by Stuart Russell in Human Compatible and by Hadfield-Menell et al. in the CIRL paper. Adding utility uncertainty intuitively seems to provide a number of safety benefits relative to having a fixed objective, including: Utility uncertainty gives the AI an incentive to adjust in response to a human operator's corrective actions. Utility uncertainty weakens the AI's incentive to harm its human operators, since this might result in a permanent loss of utility-relevant information. Utility uncertainty incentivizes the AI to avoid irreversible changes to the state of the world, since those might lead to permanently low utility. Despite the high profile and intuitive appeal of utility uncertainty, almost none of the alignment researchers I know consider it a promising approach to AI alignment. The most common reason cited seems to be the problem of fully updated deference (e.g. Richard Ngo's alignment research exercises point to this as the reason why CIRL doesn't solve the alignment problem). In this post I will argue why fully updated deference should not be seen as a strong argument against utility uncertainty as an approach to AI alignment. This is not meant as an argument in favor of the uncertainty approach; it may have other irresolvable difficulties, which I discuss briefly in the conclusion. Outline: The Arbital post that seems to be the canonical reference for updated deference contains many heuristic arguments and one concrete, worked-out example in the section Moral uncertainty and its relation to corrigibility. I will mostly engage with the example, and argue that it conflates the problem of updated deference with the independent problem of prior mis-specification, and that if we remove prior mis-specification, there is no problem in the limit of increasing AI capability. The Problem of Updated Deference: The example in the post has an AI that is uncertain between three utility functions U1, U2, U3, whereas the human's true utility function is V. The AI believes that the utility that will be attained in each of the three possible worlds is ui with AI assistance, and vi if the human optimizes V without the AI's assistance (e.g. because the humans shut the AI down). If the AI is much more powerful than humans, the argument goes, then ui > vi in any of the three worlds, so the AI will not let itself be shut down. The uncertainty doesn't help because the AI can choose to keep gathering information until it has fully updated. Since it's more powerful than the humans, it can gather that information more efficiently when it's not shut down, and therefore ignores the shutdown signal. Factoring out prior mis-specification: The original example has the AI system assign probability 0 to the true human utility function V, presumably because its prior probability was 0.
I think any advocate of the utility uncertainty approach would agree that assigning a nonzero prior to the true human utility function is critical for the approach to work. Describing such a prior abstractly is easy (just take the Solomonoff prior over programs), implementing a real CIRL agent that reasons with such a prior could be intractably difficult, but this is clearly a separate problem from "fully updated deference". So from now on w...
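For readers who want the shutdown argument above in concrete form, here is a minimal numeric sketch of the setup (the prior and the ui, vi values are made up for illustration and are not taken from the post): because assistance is assumed to beat shutdown in every candidate world, averaging over the AI's uncertainty still favors ignoring the shutdown signal, which is exactly the "fully updated deference" worry.

```python
# Minimal numeric sketch of the "fully updated deference" setup described above.
# Prior, u_i, and v_i values are illustrative, not taken from the original post.

priors = [0.5, 0.3, 0.2]   # P(true utility is U1, U2, U3)
u = [10.0, 8.0, 6.0]       # utility in world i if the AI keeps assisting
v = [4.0, 3.0, 2.0]        # utility in world i if the AI shuts down and the
                           # human optimizes V alone

expected_assist = sum(p * ui for p, ui in zip(priors, u))
expected_shutdown = sum(p * vi for p, vi in zip(priors, v))

print(f"E[utility | keep assisting]    = {expected_assist:.2f}")
print(f"E[utility | defer to shutdown] = {expected_shutdown:.2f}")

# Since u_i > v_i in every candidate world (the AI is assumed far more capable
# than the human), no prior over U1..U3 can make shutdown look better here, so
# utility uncertainty by itself does not buy corrigibility in this setup.
decision = "keep assisting" if expected_assist > expected_shutdown else "defer to shutdown"
print("Decision:", decision)
```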

The Nonlinear Library: LessWrong
LW - Resources I send to AI researchers about AI safety by Vael Gates

Jun 14, 2022 · 15:30


Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Resources I send to AI researchers about AI safety, published by Vael Gates on June 14, 2022 on LessWrong. This is my masterlist of resources I send AI researchers who are mildly interested in learning more about AI safety. I pick and choose which resources to send based on the researcher's interests. The resources at the top of the email draft are the ones I usually send, and I add in later sections as seems useful. I'll also sometimes send The Alignment Problem, Human-Compatible, or The Precipice.I've also included a list of resources that I had students read through for the course Stanford first-year course "Preventing Human Extinction", though I'd most recommend sufficiently motivated students read AGISF Technical Agenda. These reading choices are drawn from the various other reading lists; this is not original in any way, just something to draw from if you're trying to send someone some of the more accessible resources. There's a decent chance that I'll continue updating this post as time goes on, since my current use case is copy-pasting sections of this email to interested parties. Note that "I" and "Vael" are mentioned a few times, so you'll need to edit a bit if you're copy-pasting. Happy to make any edits and take suggestions. [Crossposted to the EA Forum] List for AI researchers Hello X, Very nice to speak to you! As promised, some resources on AI alignment. I tried to include a bunch of stuff so you could look at whatever you found interesting. Happy to chat more about anything, and thanks again. Introduction to the ideas "The case for taking AI seriously as a threat to humanity" by Kelsey Piper (Vox) The Most Important Century and specifically "Forecasting Transformative AI" by Holden Karnofsky, blog series and podcast. Most recommended for timelines. A short interview from Prof. Stuart Russell (UC Berkeley) about his book, Human-Compatible (the other main book in the space is The Alignment Problem, by Brian Christian, which is written in a style I particularly enjoyed) Technical work on AI alignment Empirical work by DeepMind's Safety team on alignment Empirical work by Anthropic on alignment Talk (and transcript) by Paul Christiano describing the AI alignment landscape in 2020 Podcast (and transcript) by Rohin Shah, describing the state of AI value alignment in 2021 Alignment Newsletter and ML Safety Newsletter Unsolved Problems in ML Safety by Hendrycks et al. (2022) Alignment Research Center Interpretability work aimed at alignment: Elhage et al. (2021) and Olah et al. (2020) AI Safety Resources by Victoria Krakovna (DeepMind) and Technical Alignment Curriculum Introduction to large-scale risks from humanity, including "existential risks" that could lead to the extinction of humanity The first third of this book summary (copied below) of the book "The Precipice: Existential Risk and the Future of Humanity" by Toby Ord Chapter 3 is on natural risks, including risks of asteroid and comet impacts, supervolcanic eruptions, and stellar explosions. Ord argues that we can appeal to the fact that we have already survived for 2,000 centuries as evidence that the total existential risk posed by these threats from nature is relatively low (less than one in 2,000 per century). Chapter 4 is on anthropogenic risks, including risks from nuclear war, climate change, and environmental damage. 
Ord estimates these risks as significantly higher, each posing about a one in 1,000 chance of existential catastrophe within the next 100 years. However, the odds are much higher that climate change will result in non-existential catastrophes, which could in turn make us more vulnerable to other existential risks. Chapter 5 is on future risks, including engineered pandemics and artificial intelligence. Worryingly, Ord puts the risk of engineered pandemics causing an existential ...

The Nonlinear Library
LW - Resources I send to AI researchers about AI safety by Vael Gates

Jun 14, 2022 · 15:30


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Resources I send to AI researchers about AI safety, published by Vael Gates on June 14, 2022 on LessWrong. This is my masterlist of resources I send AI researchers who are mildly interested in learning more about AI safety. I pick and choose which resources to send based on the researcher's interests. The resources at the top of the email draft are the ones I usually send, and I add in later sections as seems useful. I'll also sometimes send The Alignment Problem, Human-Compatible, or The Precipice.I've also included a list of resources that I had students read through for the course Stanford first-year course "Preventing Human Extinction", though I'd most recommend sufficiently motivated students read AGISF Technical Agenda. These reading choices are drawn from the various other reading lists; this is not original in any way, just something to draw from if you're trying to send someone some of the more accessible resources. There's a decent chance that I'll continue updating this post as time goes on, since my current use case is copy-pasting sections of this email to interested parties. Note that "I" and "Vael" are mentioned a few times, so you'll need to edit a bit if you're copy-pasting. Happy to make any edits and take suggestions. [Crossposted to the EA Forum] List for AI researchers Hello X, Very nice to speak to you! As promised, some resources on AI alignment. I tried to include a bunch of stuff so you could look at whatever you found interesting. Happy to chat more about anything, and thanks again. Introduction to the ideas "The case for taking AI seriously as a threat to humanity" by Kelsey Piper (Vox) The Most Important Century and specifically "Forecasting Transformative AI" by Holden Karnofsky, blog series and podcast. Most recommended for timelines. A short interview from Prof. Stuart Russell (UC Berkeley) about his book, Human-Compatible (the other main book in the space is The Alignment Problem, by Brian Christian, which is written in a style I particularly enjoyed) Technical work on AI alignment Empirical work by DeepMind's Safety team on alignment Empirical work by Anthropic on alignment Talk (and transcript) by Paul Christiano describing the AI alignment landscape in 2020 Podcast (and transcript) by Rohin Shah, describing the state of AI value alignment in 2021 Alignment Newsletter and ML Safety Newsletter Unsolved Problems in ML Safety by Hendrycks et al. (2022) Alignment Research Center Interpretability work aimed at alignment: Elhage et al. (2021) and Olah et al. (2020) AI Safety Resources by Victoria Krakovna (DeepMind) and Technical Alignment Curriculum Introduction to large-scale risks from humanity, including "existential risks" that could lead to the extinction of humanity The first third of this book summary (copied below) of the book "The Precipice: Existential Risk and the Future of Humanity" by Toby Ord Chapter 3 is on natural risks, including risks of asteroid and comet impacts, supervolcanic eruptions, and stellar explosions. Ord argues that we can appeal to the fact that we have already survived for 2,000 centuries as evidence that the total existential risk posed by these threats from nature is relatively low (less than one in 2,000 per century). Chapter 4 is on anthropogenic risks, including risks from nuclear war, climate change, and environmental damage. 
Ord estimates these risks as significantly higher, each posing about a one in 1,000 chance of existential catastrophe within the next 100 years. However, the odds are much higher that climate change will result in non-existential catastrophes, which could in turn make us more vulnerable to other existential risks. Chapter 5 is on future risks, including engineered pandemics and artificial intelligence. Worryingly, Ord puts the risk of engineered pandemics causing an existential ...

The Nonlinear Library
EA - Resources I send to AI researchers about AI safety by Vael Gates

Jun 14, 2022 · 15:30


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Resources I send to AI researchers about AI safety, published by Vael Gates on June 14, 2022 on The Effective Altruism Forum. This is my masterlist of resources I send AI researchers who are mildly interested in learning more about AI safety. I pick and choose which resources to send based on the researcher's interests. The resources at the top of the email draft are the ones I usually send, and I add in later sections as seems useful. I'll also sometimes send The Alignment Problem, Human Compatible, or The Precipice. I've also included a list of resources that I had students read through for the Stanford first-year course "Preventing Human Extinction", though I'd most recommend sufficiently motivated students read the AGISF Technical Agenda. These reading choices are drawn from the various other reading lists; this is not original in any way, just something to draw from if you're trying to send someone some of the more accessible resources. There's a decent chance that I'll continue updating this post as time goes on, since my current use case is copy-pasting sections of this email to interested parties. Note that "I" and "Vael" are mentioned a few times, so you'll need to edit a bit if you're copy-pasting. Happy to make any edits and take suggestions. [Crossposted to LessWrong]

List for AI researchers

Hello X, Very nice to speak to you! As promised, some resources on AI alignment. I tried to include a bunch of stuff so you could look at whatever you found interesting. Happy to chat more about anything, and thanks again.

Introduction to the ideas
"The case for taking AI seriously as a threat to humanity" by Kelsey Piper (Vox)
The Most Important Century, and specifically "Forecasting Transformative AI", by Holden Karnofsky (blog series and podcast). Most recommended for timelines.
A short interview from Prof. Stuart Russell (UC Berkeley) about his book, Human Compatible (the other main book in the space is The Alignment Problem, by Brian Christian, which is written in a style I particularly enjoyed)

Technical work on AI alignment
Empirical work by DeepMind's Safety team on alignment
Empirical work by Anthropic on alignment
Talk (and transcript) by Paul Christiano describing the AI alignment landscape in 2020
Podcast (and transcript) by Rohin Shah, describing the state of AI value alignment in 2021
Alignment Newsletter and ML Safety Newsletter
Unsolved Problems in ML Safety by Hendrycks et al. (2022)
Alignment Research Center
Interpretability work aimed at alignment: Elhage et al. (2021) and Olah et al. (2020)
AI Safety Resources by Victoria Krakovna (DeepMind) and the Technical Alignment Curriculum

Introduction to large-scale risks from humanity, including "existential risks" that could lead to the extinction of humanity
The first third of this book summary (copied below) of the book "The Precipice: Existential Risk and the Future of Humanity" by Toby Ord
Chapter 3 is on natural risks, including risks of asteroid and comet impacts, supervolcanic eruptions, and stellar explosions. Ord argues that we can appeal to the fact that we have already survived for 2,000 centuries as evidence that the total existential risk posed by these threats from nature is relatively low (less than one in 2,000 per century).
Chapter 4 is on anthropogenic risks, including risks from nuclear war, climate change, and environmental damage. Ord estimates these risks as significantly higher, each posing about a one in 1,000 chance of existential catastrophe within the next 100 years. However, the odds are much higher that climate change will result in non-existential catastrophes, which could in turn make us more vulnerable to other existential risks.
Chapter 5 is on future risks, including engineered pandemics and artificial intelligence. Worryingly, Ord puts the risk of engineered pandemics causing...

Artificiality
Mark Nitzberg: Human-Compatible AI

Artificiality

Play Episode Listen Later May 15, 2022 53:33


We hear a lot about harm from AI and how the big platforms are focused on using AI and user data to enhance their profits. What about developing AI for good, for the rest of us? What would it take to design AI systems that are beneficial to humans? In this episode, we talk with Mark Nitzberg, who is Executive Director of CHAI, the UC Berkeley Center for Human-Compatible AI, and head of strategic outreach for Berkeley AI Research. Mark began studying AI in the early 1980s and completed his PhD in Computer Vision and Human Perception under David Mumford at Harvard. He has built companies and products in various AI fields, including The Blindsight Corporation, a maker of assistive technologies for low vision and active aging, which was acquired by Amazon. Mark is also co-author of The AI Generation, which examines how AI reshapes human values, trust, and power around the world. We talk with Mark about CHAI's goal of reorienting AI research towards provably beneficial systems, why it's hard to develop beneficial AI, variability in human thinking and preferences, the parallels between management OKRs and AI objectives, human-centered AI design, and how AI might help humans realize the future we prefer.

Links:
Learn more about UC Berkeley CHAI
Subscribe to get Artificiality delivered to your email
Learn more about Sonder Studio

P.S. Thanks to Jonathan Coulton for our music

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit artificiality.substack.com

The Nonlinear Library
LW - An Inside View of AI Alignment by Ansh Radhakrishnan

The Nonlinear Library

Play Episode Listen Later May 12, 2022 3:04


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: An Inside View of AI Alignment, published by Ansh Radhakrishnan on May 11, 2022 on LessWrong. I started to take AI Alignment seriously around early 2020. I'd been interested in AI, and machine learning in particular, since 2014 or so, taking several online ML courses in high school and implementing some simple models for various projects. I leaned into the same niche in college, taking classes in NLP, Computer Vision, and Deep Learning to learn more of the underlying theory and modern applications of AI, with a continued emphasis on ML. I was very optimistic about AI capabilities then (and still am), and if you'd asked me about AI alignment or safety as late as my sophomore year of college (2018-2019), I probably would have quoted Steven Pinker or Andrew Ng at you. Somewhere in the process of reading The Sequences, portions of the AI Foom Debate, and texts like Superintelligence and Human Compatible, I changed my mind. Some 80,000 Hours podcast episodes were no doubt influential as well, particularly the episodes with Paul Christiano. By late 2020, I probably took AI risk as seriously as I do today, believing it to be one of the world's most pressing problems (perhaps the most), and was interested in learning more about it. I binged most of the sequences on the Alignment Forum at this point, learning about proposals and concepts like IDA, Debate, Recursive Reward Modeling, Embedded Agency, Attainable Utility Preservation, CIRL, etc. Throughout 2021 I continued to keep a finger on the pulse of the field: I got a large amount of value out of the Late 2021 MIRI Conversations in particular, shifting away from a substantial amount of optimism in prosaic alignment methods, slower takeoff speeds, longer timelines, and a generally "Christiano-ish" view of the field, and more towards a "Yudkowsky-ish" position. I had a vague sense that AI safety would eventually be the problem I wanted to work on in my life, but going through the EA Cambridge AGI Safety Fundamentals Course helped make it clear that I could productively contribute to AI safety work right now or in the near future. This sequence is going to be an attempt to explicate my current model or "inside view" of the field. These viewpoints have been developed over several years and are no doubt influenced by my path into and through AI safety research: for example, I tend to take aligning modern ML models extremely seriously, perhaps more seriously than is deserved, because of my greater amount of experience with ML compared to other AI paradigms. I'm writing with the express goal of having my beliefs critiqued and scrutinized: there's a lot I don't know and no doubt a large amount that I'm misunderstanding. I plan on writing on a wide variety of topics: the views of various researchers, my understanding of and confidence in specific alignment proposals, timelines, takeoff speeds, the scaling hypothesis, interpretability, etc. I also don't have a fixed timeline or planned order in which I plan to publish different pieces of the model. Without further ado, the posts that follow comprise Ansh's (current) Inside View of AI Alignment. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.

The Nonlinear Library
EA - How to become an AI safety researcher by peterbarnett

The Nonlinear Library

Play Episode Listen Later Apr 12, 2022 22:25


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How to become an AI safety researcher, published by peterbarnett on April 12, 2022 on The Effective Altruism Forum. What skills do you need to work on AI safety? And what can we learn from the paths people have taken into the field? We were inspired by the 80,000 Hours podcast with Catherine Olsson and Daniel Ziegler, which had great personal stories and advice about getting into AI safety, so we wanted to do it for a larger sample size. To better understand the lives and careers of AI safety researchers, I talked to eleven AI safety researchers in a variety of organizations, roles, and subfields. If you're interested in getting into AI safety research, we hope this helps you be better informed about what pursuing a career in the field might entail, including things like:

How to develop research taste
Which specific technical skills to build
What non-technical skills you'll need

The first section is about the general patterns we noticed, and the second section describes each person's individual path. Of note, the people we talked with are not a random sample of AI safety researchers, and it is also important to consider the effects of survivorship bias. However, we still think it's useful and informative to hear about how they got into the field and what skills they have found valuable. This post is part of a project I've been working on at Nonlinear.

Paths into AI safety

What degrees did people get? Perhaps unsurprisingly, the researchers we talked to universally studied at least one STEM field in college, most commonly computer science or mathematics. Most had done research as undergraduates, although this often wasn't in AI safety specifically; people often said that getting early research experience was valuable. It is sometimes joked that the qualification needed for doing AI safety work is dropping out of a PhD program, which three people here have done (not that we would exactly recommend doing this!). Aside from those three, almost everyone else is doing or has completed a PhD. These PhD programs were often, but not universally, in machine learning, or else in related fields like computer science or cognitive science. All of the researchers we talked with were at least familiar with Effective Altruism and/or Rationality, with most people being actively involved in at least one of these communities. For influential reading, Superintelligence and writing by 80,000 Hours were each mentioned by three people as being particularly impactful on their decision to work on AI safety. It is worth noting that Superintelligence was one of the main books about risks from AI when the people we talked with were becoming interested, but may not be the best book to recommend to people now. More recent books would include Human Compatible by Stuart Russell or The Alignment Problem by Brian Christian. Finally, many of the safety researchers participated in a program designed for early-career researchers, such as those run by MIRI, CHAI, and FHI.

Skills

The researchers interviewed described the utility of both technical skills (e.g. machine learning, linear algebra) and more general research skills (e.g. developing research taste, writing well). What technical skills should you learn? Technical AI safety research requires a strong understanding of the technical side of machine learning. By 'technical' here I basically mean skills related to programming and math. Indeed, a strong command of concepts in the field is important even for those engaged in less technical roles such as field building and strategy. These skills still seem important for understanding the field, especially if you're talking to technical researchers. Depending on the area you work on, some specific areas will be more useful than others. If you want to do "hands-on" machine learning where you trai...

Jacques Ludik Podcast
#78 (45) Building Human-compatible, Ethical, Trustworthy and Beneficial AI

Jacques Ludik Podcast

Play Episode Listen Later Jan 3, 2022 49:18


45. Building Human-compatible, Ethical, Trustworthy and Beneficial AI. An extract from Chapter 11, "Democratizing AI to Help Shape a Beneficial Human-centric Future", in "Democratizing Artificial Intelligence to Benefit Everyone: Shaping a Better Future in the Smart Technology Era", authored and narrated by Dr Jacques Ludik (https://jacquesludik.com). See https://jacquesludik.com/books-2/ for details about the book. The Kindle and paperback versions of the book Democratizing Artificial Intelligence to Benefit Everyone have recently been updated and can currently be obtained via the following Amazon marketplaces: United States, United Kingdom, Canada, Germany, France, Spain, Italy, The Netherlands, Japan, Brazil, Mexico, Australia, and India. https://www.amazon.com/Democratizing-Artificial-Intelligence-Benefit-Everyone-ebook/dp/B08ZYW9487/ The audiobook is available on many audiobook marketplaces worldwide, such as Audible, Apple Books, Audiobooks.com, Google Play, SCRIBD, Libro.fm, Downpour, Nook, Kobo, Chirp Books, AudiobooksNZ, and MLOL: https://www.audible.com/pd/Democratizing-Artificial-Intelligence-to-Benefit-Everyone-Audiobook/B09LNL4JHC

Video: https://youtu.be/8R8K1EBMfzE
YouTube channel: https://www.youtube.com/channel/UCRdG_cB69R0tzRRdURTg5jw
Podcasts: https://jacquesludik.com/podcasts/
Spotify: https://open.spotify.com/show/1mozteqpuyVqu2RdanLEHU
Website: https://jacquesludik.com

unSILOed with Greg LaBlanc
Deep Dive into AI: Ethics, Design, and Human Compatibility feat. Stuart Russell

unSILOed with Greg LaBlanc

Play Episode Listen Later Aug 23, 2021 51:01


Popular culture often portrays artificial intelligence (AI) as a super-powerful, ominous threat to jobs, lives, and ultimately humanity. As AI is moving out of the lab into the real world, how can we harness its potential for good? In this episode, Stuart Russell talks about how to avoid this impending problem, a topic he covers in his book Human Compatible. Listen as he takes us on a deep dive into the history of AI and how a new foundation can be used to develop machines that are sensitive to human preferences.

Episode Quotes:

Should computer science students be seriously concerned about ethics?
Computer science students do need to understand more than their technical discipline. And that's what one would expect for a discipline that starts to impact the real world. I think there's more to it. Because you know, the products of civil engineering, for example, the bridges, don't think and participate and act in our democracy the way some AI systems may be starting to do. So, in the long run, if we are making things that function as if they were minds — I'm not going to say that they are minds, but functioning as if they were minds — then you're going to bring in all the considerations.

How long ago did experts begin seriously thinking about AI's practical applications, from the time Charles Babbage conceived the first computer?
So, the real impetus for AI was the development of the computer in the Second World War, which arose from Turing's mathematical work in his 1936 paper. And Turing himself, as soon as he figured out that you could actually start computing, and he understood this idea of universal computation, he wanted to build intelligent machines.

Misconceptions about AI as a practice
AI is a problem, not a technology. So, it can't fail. Right? It can just take longer to solve, you know? And you wouldn't say physics failed because of confusion. So people have always had this strange idea, in the outside world and the media, that AI is a technology. So, nowadays, people often confuse deep learning and AI.

What happens every time there's a new discovery in AI?
Every time there's a small gain in function in one branch of AI, because of the generality of these techniques, there's a big explosion in economic interest. As long as people see all kinds of things in the real world that they can apply them to, that will continue to happen. Deep learning is just one step. There'll be another half dozen such steps. And each of those will probably increase the scope of applications by a factor of 10.

Show Links

Guest Profile
Bio at University of California Berkeley
Profile at International Telecommunication Union
Stuart Russell on LinkedIn

His Work
Stuart Russell on TED
Stuart Russell on Google Scholar
Artificial Intelligence: A Modern Approach (4th Edition)
Human Compatible: Artificial Intelligence and the Problem of Control
Do the Right Thing: Studies in Limited Rationality (Artificial Intelligence Series)
The use of knowledge in analogy and induction (Research notes in artificial intelligence)

Himura Podcast
Reflection on AI Design & Human Impact | Himura Podcast #13

Himura Podcast

Play Episode Listen Later Jun 17, 2021 10:18


Khalil could not reflect on his conversation with Firdaus Adib, a fellow student of the "Abundance & Exponential Mindset" course, because of trade secrets. Instead, he shares something related to that conversation: AI design and its impact on humanity. The episode is based on Stuart Russell's 2019 book "Human Compatible". Get the book summary here: https://blinki.st/d2572e61331d?blinkspack=human-compatible-en

Indivisible
Stuart Russell on Human Compatible AI: Saving Us From the Tyranny of Fixed Objectives

Indivisible

Play Episode Listen Later Apr 21, 2021 63:39


Today's guest is Stuart Russell, and when it comes to AI you might just say, "he wrote the book." In fact, Stuart is a co-author of the standard textbook used to teach AI at universities across the world. He has also written multiple books for general audiences, testifying to both his range and his prolific body of work. Stuart is currently a Professor of Computer Science at UC Berkeley (where he is also Director of the Center for Human-Compatible AI) and has been a renowned voice in the AI field for years. In his latest book, "Human Compatible: Artificial Intelligence and the Problem of Control," Stuart argues that if we continue to design AI by optimizing for fixed objectives (the standard approach), then in the context of superhuman AI this will create disastrous consequences that unfold outside of our control. Stuart also describes this as "the King Midas problem" of AI. Thankfully, he proposes a new approach, derived from inverse reinforcement learning and designated "provably beneficial AI", that just might save us from this fate. In this model, AI is designed to 1) optimize for human preferences, yet 2) remain inherently uncertain about those preferences, and 3) defer to human behavior in figuring those preferences out over time (a toy sketch of this loop appears after the show notes below). So how do we get to a place where this model becomes the industry standard? Stuart walks us through the practical mechanics of standing it up. We'll discuss the behavioral and collective challenge of identifying human preferences, and the research steps that must happen first to change the industry's foundation for building AI. We also couldn't end the conversation without briefly touching on the opportunity to promote human thriving in a new paradigm for the future of work. Whether you're a casual observer or have been working in the field for years, my guess is you will come away from this conversation with a better understanding of how we should, and how we must, think about controlling systems with capabilities that exceed our own.

Show Notes:
3:15 - the economic potential of general-purpose AI
7:50 - explanation of the standard AI model
12:40 - fixed-objective failure in the social media context
16:45 - introduction to provably beneficial AI
25:10 - understanding human preferences through behavior
37:15 - considering a normative framework for negotiating shared preferences
42:00 - standing up beneficial AI in practice
51:15 - mistakes made by regulators
53:25 - how to consider an "off switch" in the context of integrated systems
56:10 - maximizing human potential in the future of work
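
As a rough illustration of how those three properties can fit together, here is a minimal Python sketch. It is not from the episode; the preference model, likelihoods, and deference threshold are all illustrative assumptions. The assistant starts with a uniform belief over the human's preference, updates it from observed behavior, and defers to the human whenever it is not yet confident.

```python
# A toy sketch of the "provably beneficial" loop described above: the assistant
# (1) optimizes for the human's preference, (2) starts out uncertain about it,
# and (3) narrows that uncertainty only by observing human behavior.
# The preference model, likelihoods, and threshold below are illustrative assumptions.

CANDIDATE_PREFS = {"likes_fancy": 0.5, "dislikes_fancy": 0.5}  # uniform prior belief

def likelihood(observed_choice, pref):
    """Assumed probability that the human makes this choice given a preference."""
    if pref == "likes_fancy":
        return 0.8 if observed_choice == "fancy" else 0.2
    return 0.2 if observed_choice == "fancy" else 0.8

def update_belief(belief, observed_choice):
    """Bayes' rule over the two candidate preferences."""
    posterior = {p: belief[p] * likelihood(observed_choice, p) for p in belief}
    total = sum(posterior.values())
    return {p: v / total for p, v in posterior.items()}

def act(belief, defer_threshold=0.9):
    """Act only when confident about the human's preference; otherwise defer."""
    best = max(belief, key=belief.get)
    if belief[best] < defer_threshold:
        return "ask the human"  # deference under uncertainty
    return "book fancy room" if best == "likes_fancy" else "book plain room"

belief = dict(CANDIDATE_PREFS)
for choice in ["fancy", "fancy"]:  # observed human behavior
    belief = update_belief(belief, choice)
    print({p: round(v, 3) for p, v in belief.items()}, "->", act(belief))
```

In this toy run the assistant asks rather than acts after the first observation and only commits once the observed behavior pushes its belief past the threshold, which is the deferential behavior the episode describes.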

Book Movement
SBM 034 | Human Compatible - Stuart Russell | Lesly Zerna

Book Movement

Play Episode Listen Later Feb 2, 2021 68:27


Science Book Movement - Notion360. Online review of the book Human Compatible by Stuart Russell. Guest: Lesly Zerna. Join our community on Discord via the following link: https://bookmovement.co/discord See acast.com/privacy for privacy and opt-out information.

丽莎老师讲机器人
丽莎老师讲机器人: Teaching Robots to Understand Humans' True Wishes

丽莎老师讲机器人

Play Episode Listen Later May 2, 2020 9:39


Teacher Lisa Talks Robots (丽莎老师讲机器人): Teaching Robots to Understand Humans' True Wishes. Welcome to Teacher Lisa Talks Robots. If you want coaching for your child in robotics competitions, creative programming, or maker contests, contact Teacher Lisa! Add WeChat: 153 5359 2068, or search for DingTalk group 31532843.

Stuart Russell, a computer scientist at UC Berkeley, argues that despite enormous successes at specific tasks such as playing Go, recognizing images and text, and even composing music and prose, today's AI is still limited. Asking machines to optimize a "reward function" (the objective in a reinforcement learning problem, which the AI continually maximizes) will inevitably lead to misaligned AI, because a reward function cannot enumerate and correctly weigh every goal, sub-goal, exception, and caveat; the machines do not even know what the correct objective is. Handing objectives to free-running "autonomous" robots will become increasingly dangerous, because as they grow more intelligent, robots will "relentlessly" pursue maximum reward and try to prevent us from switching them off.

The newly proposed approach is not to have machines pursue their own goals but to have them satisfy human preferences: the AI's only goal should be to learn more about what we prefer. Russell argues that uncertainty about human preferences, together with the need to ask humans for guidance, will keep AI systems safe for humans. In his recent book Human Compatible, Russell lays out his view in the form of three "principles for beneficial machines". These echo Isaac Asimov's 1942 "Three Laws of Robotics" but are far more mature. Russell's version: (1) the machine's only objective is to maximize the realization of human preferences; (2) the machine is initially uncertain about what those preferences are; (3) the ultimate source of information about human preferences is human behavior. If we build AI around purely fixed, rational objectives, a lot of trouble follows; for example, "if you ask an AI to fetch you a coffee, you are not asking it to get a coffee at any cost." The priority for AI development is to redirect research.

Over the past few years, Russell and colleagues at Berkeley, Stanford, the University of Texas, and other institutions have been developing new methods that give AI systems clues for understanding our preferences without ever having to specify what those preferences are. Labs are teaching robots to learn human preferences that were never spelled out, even when the concrete goal is uncertain. Robots can learn about our desires by watching imperfect demonstrations, and can even learn to handle human uncertainty. This suggests that AI may be surprisingly good at inferring our state of mind and preferences, even preferences that arise on the spot as we do something. "This is the first attempt to formalize the problem"; people are starting to realize that we need to look more carefully at the interaction between humans and robots. Whether these new attempts, together with Russell's three new principles for machines, really herald a bright future for AI remains to be seen. This approach measures a robot's performance by its ability to understand what humans really like. Paul Christiano, a researcher at OpenAI, says Russell and his team have pushed this agenda forward considerably.

How do we understand humans? Russell's view seems to have come from an epiphany. In 2014, while on sabbatical from Berkeley in Paris, "it suddenly occurred to me that the most important concern of AI is the overall quality of human experience." A robot's goal should not be a concrete objective like "maximize watch time"; it should try to improve our lives. There is just one problem: "if the machines' goal is to try to optimize the overall quality of human experience, how exactly do they know what they should do?" In a lab at UT Austin, a robot named Gemini is learning how to place a vase in the center of a table. The human demonstrations are ambiguous, because the intent the machine infers could be to put the vase to the right of the green plate, or to the left of the red bowl. After a few attempts, however, the robot performs quite well. Humans are not rational: we cannot compute which action at a given moment will lead to the best outcome a long time later, and neither can AI. Human decision-making is hierarchical: we pursue vague long-term goals through mid-term goals while paying most attention to our immediate circumstances, which makes us approximately rational. Robots, he argues, need to do something similar, or at least understand how we do it. If computers do not know what humans like, "they can do a kind of inverse reinforcement learning to learn more about it."

Back at Berkeley, Russell began working with colleagues on a new kind of "cooperative inverse reinforcement learning", in which robots and humans work together and the robot learns humans' true preferences through various "assistance games". The abstract scenarios in these games stand in for real-world situations. One of them, the "off-switch game", targets exactly the place where an autonomous robot is most likely to diverge from our true intent: an autonomous robot might disable its own off switch. In a 1951 BBC radio lecture, Turing proposed "keeping the machine in a subordinate position, for example by turning off the power at a particular moment." The off-switch problem is "the core of the control problem for intelligent systems. If we cannot switch a machine off because it won't let us, we are really in trouble." Uncertainty about human preferences may be the key.

The off-switch game has two players: a human, Harriet, and a robot, Robbie. Robbie has to make a decision on Harriet's behalf, say whether to book a nice but expensive hotel room, while being unsure what she prefers. There are three cases (a quick numerical sketch of these payoffs follows this description). First, Robbie chooses for Harriet: Robbie expects Harriet's payoff to lie somewhere between -£40 and £60, with an average of £10 (Robbie thinks she might like the fancy room, but isn't sure). Second, Robbie does nothing: the payoff is 0. Third, Robbie asks Harriet whether she wants it to go ahead with the decision or would prefer to "switch it off", that is, take the hotel-booking decision away from Robbie. If she lets the robot proceed, the average expected payoff is more than £10. So Robbie will decide to ask Harriet, and let her switch it off if she wishes. In general, unless Robbie is completely certain what Harriet herself would do, it is better to let her decide. It turns out that uncertainty about the objective is crucial to ensuring that we can switch machines off, even when they are smarter than we are.

Researchers are implementing Russell's ideas with deep learning, helping AI systems learn human preferences in order to reduce uncertainty. "Of course, more research work is needed to make this happen." Researchers face two big challenges. "One is the fact that our behavior is far from rational, so learning our true underlying preferences is hard." AI systems will need to reason over hierarchies of long-term, medium-term, and short-term goals. Only by knowing the desires we hold subconsciously can robots truly help us (and avoid making serious mistakes). In a driving simulator at the Center for Automotive Research at Stanford, self-driving cars are learning human drivers' preferences. The second challenge is that human preferences change. Our minds change over the course of our lives, and also over trivial things; preferences can depend on our mood, and robots may find such changes hard to track. And of course there is a third problem: what about the preferences of bad people? How do we prevent a robot from serving its evil owner's evil ends? AI systems are good at finding ways around prohibitions, just as YouTube keeps trying to fix a recommendation algorithm that exploits ubiquitous human impulses. Even so, the researchers remain optimistic: although more algorithmic and game-theoretic research is needed, in teaching robots to "be good" we may also find a way to teach ourselves.
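
To make the arithmetic of the off-switch game concrete, here is a minimal Python sketch. It is not from the episode; it simply assumes Harriet's payoff is uniformly distributed over the -£40 to +£60 range mentioned above (the uniform distribution, and everything else in the snippet, is an illustrative assumption) and compares Robbie's three options.

```python
# A back-of-the-envelope sketch of the off-switch game payoffs described above.
# The description only gives a payoff range of -40..+60 (in pounds) with mean +10;
# the uniform distribution used here is an assumption for illustration.
import random

random.seed(0)
N = 100_000
payoffs = [random.uniform(-40, 60) for _ in range(N)]  # Harriet's unknown payoff

act_directly = sum(payoffs) / N          # Robbie books the room without asking
do_nothing = 0.0                         # Robbie effectively switches itself off
# Robbie asks first: Harriet lets it proceed only when her payoff is positive.
ask_first = sum(p for p in payoffs if p > 0) / N

print(f"act directly : {act_directly:6.2f}")  # roughly +10
print(f"do nothing   : {do_nothing:6.2f}")    # 0
print(f"ask Harriet  : {ask_first:6.2f}")     # roughly +18, so asking wins
```

Under this assumption, asking (and accepting the possibility of being switched off) has the highest expected payoff for Robbie, which is the point the episode makes: uncertainty about Harriet's preferences is exactly what makes keeping the off switch rational.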

Víðsjá
Listahátíð (Reykjavík Arts Festival), Marat, Human Compatible (Mannsamræmanleiki), The Magic Mountain (Töfrafjallið)

Víðsjá

Play Episode Listen Later Apr 2, 2020 55:00


In this episode, listeners hear about the upcoming Reykjavík Arts Festival (Listahátíð í Reykjavík) in a conversation with Vigdís Jakobsdóttir, the festival's artistic director. The writer Hermann Stefánsson reflects on a famous painting depicting the revolutionary Marat lying dead in his bath. Gauti Kristmannsson discusses his forthcoming translation of the novel The Magic Mountain (Töfrafjallið) by the German author Thomas Mann. Björn Þór Vilhjálmsson discusses the book Human Compatible (Mannsamræmanleiki) by the English author and computer scientist Stuart Russell, which is both an overview of the state of knowledge about the development of artificial intelligence and of the dangers that such an invention entails. And, as usual, listeners hear a poem for the nation.

Technovation with Peter High (CIO, CTO, CDO, CXO Interviews)
AI Pioneer Stuart Russell, author of "Human Compatible"

Technovation with Peter High (CIO, CTO, CDO, CXO Interviews)

Play Episode Listen Later Jan 13, 2020 33:58


428: AI Pioneer and UC Berkeley Professor Stuart Russell warns that AI is reshaping society in unintended ways. For example, the social media content-selection algorithms that choose what individuals watch and read do not even know that human beings exist. As AI becomes more capable, he suggests that we are going to see bigger failures of this kind unless we change the way we think about AI altogether. Stuart argues that to ensure AI is provably beneficial for human beings, we must design machines to be inherently uncertain about human preferences. This way, we can ensure they are humble, altruistic, and committed to pursuing our objectives even as they set their own goals. We also discuss why AI needs regulation similar to civil engineering and medicine, the impact AI is going to make over the next decade, and autonomous vehicles, among other topics.