How do you navigate a career path when the future of work is uncertain? How important is mentorship versus immediate impact? Is it better to focus on your strengths or on the world's most pressing problems? Should you specialise deeply or develop a unique combination of skills?

From embracing failure to finding unlikely allies, we bring you 16 diverse perspectives from past guests who've found unconventional paths to impact and helped others do the same.

Links to learn more and full transcript.

Chapters:
Cold open (00:00:00)
Luisa's intro (00:01:04)
Holden Karnofsky on just kicking ass at whatever (00:02:53)
Jeff Sebo on what improv comedy can teach us about doing good in the world (00:12:23)
Dean Spears on being open to randomness and serendipity (00:19:26)
Michael Webb on how to think about career planning given the rapid developments in AI (00:21:17)
Michelle Hutchinson on finding what motivates you and reaching out to people for help (00:41:10)
Benjamin Todd on figuring out if a career path is a good fit for you (00:46:03)
Chris Olah on the value of unusual combinations of skills (00:50:23)
Holden Karnofsky on deciding which weird ideas are worth betting on (00:58:03)
Karen Levy on travelling to learn about yourself (01:03:10)
Leah Garcés on finding common ground with unlikely allies (01:06:53)
Spencer Greenberg on recognising toxic people who could derail your career and life (01:13:34)
Holden Karnofsky on the many jobs that can help with AI (01:23:13)
Danny Hernandez on using world events to trigger you to work on something else (01:30:46)
Sarah Eustis-Guthrie on exploring and pivoting in careers (01:33:07)
Benjamin Todd on making tough career decisions (01:38:36)
Hannah Ritchie on being selective when following others' advice (01:44:22)
Alex Lawsen on getting good mentorship (01:47:25)
Chris Olah on cold emailing that actually works (01:54:49)
Pardis Sabeti on prioritising physical health to do your best work (01:58:34)
Chris Olah on developing good taste and technique as a researcher (02:04:39)
Benjamin Todd on why it's so important to apply to loads of jobs (02:09:52)
Varsha Venugopal on embracing uncomfortable situations and celebrating failures (02:14:25)
Luisa's outro (02:17:43)

Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Content editing: Katy Moore and Milo McGuire
Transcriptions and web: Katy Moore
This is a Forum Team crosspost from Substack. Matt would like to add: "Epistemic status = incomplete speculation; posted here at the Forum team's request"

When you ask prominent Effective Altruists about Effective Altruism, you often get responses like these: For context, Will MacAskill and Holden Karnofsky are arguably, literally the number one and two most prominent Effective Altruists on the planet. Other evidence of their ~spouses' personal involvement abounds, especially Amanda's. Now, perhaps they've had changes of heart in recent months or years – and they're certainly entitled to have those – but being evasive and implicitly disclaiming mere knowledge of EA is comically misleading and non-transparent. Calling these statements lies seems within bounds for most.[1] This kind of evasiveness around one's EA associations has been common since the collapse of FTX in 2022 (which, for yet more context, was a major EA funder that year and [...]

---

Outline:
(03:32) Why can't EAs talk about EA like normal humans (or even normal executives)?
(05:54) Coming of age during the Great Awokening
(07:15) Bad Comms Advice
(08:22) Not understanding how words work (coupled with motivated reasoning)
(11:05) Trauma

The original text contained 5 footnotes which were omitted from this narration.

---

First published: April 8th, 2025
Source: https://forum.effectivealtruism.org/posts/6NCYo7RFYfkEjLAtn/ea-adjacency-as-ftx-trauma

---

Narrated by TYPE III AUDIO.
In a recent Wired article about Anthropic, there's a section where Anthropic's president, Daniela Amodei, and early employee Amanda Askell seem to suggest there's little connection between Anthropic and the EA movement: Ask Daniela about it and she says, "I'm not the expert on effective altruism. I don't identify with that terminology. My impression is that it's a bit of an outdated term". Yet her husband, Holden Karnofsky, cofounded one of EA's most conspicuous philanthropy wings, is outspoken about AI safety, and, in January 2025, joined Anthropic. Many others also remain engaged with EA. As early employee Amanda Askell puts it, "I definitely have met people here who are effective altruists, but it's not a theme of the organization or anything". (Her ex-husband, William MacAskill, is an originator of the movement.) This led multiple people on Twitter to call out how bizarre this is: In my [...]

---

First published: March 30th, 2025
Source: https://forum.effectivealtruism.org/posts/53Gc35vDLK2u5nBxP/anthropic-is-not-being-consistently-candid-about-their

---

Narrated by TYPE III AUDIO.
"There's almost no story of the future going well that doesn't have a part that's like '…and no evil person steals the AI weights and goes and does evil stuff.' So it has highlighted the importance of information security: 'You're training a powerful AI system; you should make it hard for someone to steal' has popped out to me as a thing that just keeps coming up in these stories, keeps being present. It's hard to tell a story where it's not a factor. It's easy to tell a story where it is a factor." — Holden Karnofsky

What happens when a USB cable can secretly control your system? Are we hurtling toward a security nightmare as critical infrastructure connects to the internet? Is it possible to secure AI model weights from sophisticated attackers? And could AI actually make computer security better rather than worse?

With AI security concerns becoming increasingly urgent, we bring you insights from 15 top experts across information security, AI safety, and governance, examining the challenges of protecting our most powerful AI models and digital infrastructure — including a sneak peek from an episode that hasn't yet been released with Tom Davidson, where he explains how we should be more worried about “secret loyalties” in AI agents.

You'll hear:
Holden Karnofsky on why every good future relies on strong infosec, and how hard it's been to hire security experts (from episode #158)
Tantum Collins on why infosec might be the rare issue everyone agrees on (episode #166)
Nick Joseph on whether AI companies can develop frontier models safely with the current state of information security (episode #197)
Sella Nevo on why AI model weights are so valuable to steal, the weaknesses of air-gapped networks, and the risks of USBs (episode #195)
Kevin Esvelt on what cryptographers can teach biosecurity experts (episode #164)
Lennart Heim on Rob's computer security nightmares (episode #155)
Zvi Mowshowitz on the insane lack of security mindset at some AI companies (episode #184)
Nova DasSarma on the best current defences against well-funded adversaries, politically motivated cyberattacks, and exciting progress in infosecurity (episode #132)
Bruce Schneier on whether AI could eliminate software bugs for good, and why it's bad to hook everything up to the internet (episode #64)
Nita Farahany on the dystopian risks of hacked neurotech (episode #174)
Vitalik Buterin on how cybersecurity is the key to defence-dominant futures (episode #194)
Nathan Labenz on how even internal teams at AI companies may not know what they're building (episode #176)
Allan Dafoe on backdooring your own AI to prevent theft (episode #212)
Tom Davidson on how dangerous “secret loyalties” in AI models could be (episode to be released!)
Carl Shulman on the challenge of trusting foreign AI models (episode #191, part 2)
Plus lots of concrete advice on how to get into this field and find your fit

Check out the full transcript on the 80,000 Hours website.

Chapters:
Cold open (00:00:00)
Rob's intro (00:00:49)
Holden Karnofsky on why infosec could be the issue on which the future of humanity pivots (00:03:21)
Tantum Collins on why infosec is a rare AI issue that unifies everyone (00:12:39)
Nick Joseph on whether the current state of information security makes it impossible to responsibly train AGI (00:16:23)
Nova DasSarma on the best available defences against well-funded adversaries (00:22:10)
Sella Nevo on why AI model weights are so valuable to steal (00:28:56)
Kevin Esvelt on what cryptographers can teach biosecurity experts (00:32:24)
Lennart Heim on the possibility of an autonomously replicating AI computer worm (00:34:56)
Zvi Mowshowitz on the absurd lack of security mindset at some AI companies (00:48:22)
Sella Nevo on the weaknesses of air-gapped networks and the risks of USB devices (00:49:54)
Bruce Schneier on why it's bad to hook everything up to the internet (00:55:54)
Nita Farahany on the possibility of hacking neural implants (01:04:47)
Vitalik Buterin on how cybersecurity is the key to defence-dominant futures (01:10:48)
Nova DasSarma on exciting progress in information security (01:19:28)
Nathan Labenz on how even internal teams at AI companies may not know what they're building (01:30:47)
Allan Dafoe on backdooring your own AI to prevent someone else from stealing it (01:33:51)
Tom Davidson on how dangerous “secret loyalties” in AI models could get (01:35:57)
Carl Shulman on whether we should be worried about backdoors as governments adopt AI technology (01:52:45)
Nova DasSarma on politically motivated cyberattacks (02:03:44)
Bruce Schneier on the day-to-day benefits of improved security and recognising that there's never zero risk (02:07:27)
Holden Karnofsky on why it's so hard to hire security people despite the massive need (02:13:59)
Nova DasSarma on practical steps to getting into this field (02:16:37)
Bruce Schneier on finding your personal fit in a range of security careers (02:24:42)
Rob's outro (02:34:46)

Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Content editing: Katy Moore and Milo McGuire
Transcriptions and web: Katy Moore
Will LLMs soon be made into autonomous agents? Will they lead to job losses? Is AI misinformation overblown? Will it prove easy or hard to create AGI? And how likely is it that it will feel like something to be a superhuman AGI?

With AGI back in the headlines, we bring you 15 opinionated highlights from the show addressing those and other questions, intermixed with opinions from hosts Luisa Rodriguez and Rob Wiblin recorded back in 2023.

Check out the full transcript on the 80,000 Hours website.

You can decide whether the views we expressed (and those from guests) then have held up these last two busy years. You'll hear:
Ajeya Cotra on overrated AGI worries
Holden Karnofsky on the dangers of aligned AI, why unaligned AI might not kill us, and the power that comes from just making models bigger
Ian Morris on why the future must be radically different from the present
Nick Joseph on whether his company's internal safety policies are enough
Richard Ngo on what everyone gets wrong about how ML models work
Tom Davidson on why he believes crazy-sounding explosive growth stories… and Michael Webb on why he doesn't
Carl Shulman on why you'll prefer robot nannies over human ones
Zvi Mowshowitz on why he's against working at AI companies except in some safety roles
Hugo Mercier on why even superhuman AGI won't be that persuasive
Rob Long on the case for and against digital sentience
Anil Seth on why he thinks consciousness is probably biological
Lewis Bollard on whether AI advances will help or hurt nonhuman animals
Rohin Shah on whether humanity's work ends at the point it creates AGI
And of course, Rob and Luisa also regularly chime in on what they agree and disagree with.

Chapters:
Cold open (00:00:00)
Rob's intro (00:00:58)
Rob & Luisa: Bowerbirds compiling the AI story (00:03:28)
Ajeya Cotra on the misalignment stories she doesn't buy (00:09:16)
Rob & Luisa: Agentic AI and designing machine people (00:24:06)
Holden Karnofsky on the dangers of even aligned AI, and how we probably won't all die from misaligned AI (00:39:20)
Ian Morris on why we won't end up living like The Jetsons (00:47:03)
Rob & Luisa: It's not hard for nonexperts to understand we're playing with fire here (00:52:21)
Nick Joseph on whether AI companies' internal safety policies will be enough (00:55:43)
Richard Ngo on the most important misconception in how ML models work (01:03:10)
Rob & Luisa: Issues Rob is less worried about now (01:07:22)
Tom Davidson on why he buys the explosive economic growth story, despite it sounding totally crazy (01:14:08)
Michael Webb on why he's sceptical about explosive economic growth (01:20:50)
Carl Shulman on why people will prefer robot nannies over humans (01:28:25)
Rob & Luisa: Should we expect AI-related job loss? (01:36:19)
Zvi Mowshowitz on why he thinks it's a bad idea to work on improving capabilities at cutting-edge AI companies (01:40:06)
Holden Karnofsky on the power that comes from just making models bigger (01:45:21)
Rob & Luisa: Are risks of AI-related misinformation overblown? (01:49:49)
Hugo Mercier on how AI won't cause misinformation pandemonium (01:58:29)
Rob & Luisa: How hard will it actually be to create intelligence? (02:09:08)
Robert Long on whether digital sentience is possible (02:15:09)
Anil Seth on why he believes in the biological basis of consciousness (02:27:21)
Lewis Bollard on whether AI will be good or bad for animal welfare (02:40:52)
Rob & Luisa: The most interesting new argument Rob's heard this year (02:50:37)
Rohin Shah on whether AGI will be the last thing humanity ever does (02:57:35)
Rob's outro (03:11:02)

Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Transcriptions and additional content editing: Katy Moore
This is a low-effort post. I mostly want to get other people's takes and express concern about the lack of detailed and publicly available plans so far. This post reflects my personal opinion and not necessarily that of other members of Apollo Research. I'd like to thank Ryan Greenblatt, Bronson Schoen, Josh Clymer, Buck Shlegeris, Dan Braun, Mikita Balesni, Jérémy Scheurer, and Cody Rushing for comments and discussion.

I think short timelines, e.g. AIs that can replace a top researcher at an AGI lab without losses in capabilities by 2027, are plausible. Some people have posted ideas on what a reasonable plan to reduce AI risk for such timelines might look like (e.g. Sam Bowman's checklist, or Holden Karnofsky's list in his 2022 nearcast), but I find them insufficient for the magnitude of the stakes (to be clear, I don't think these example lists were intended to be an [...]

---

Outline:
(02:36) Short timelines are plausible
(07:10) What do we need to achieve at a minimum?
(10:50) Making conservative assumptions for safety progress
(12:33) So what's the plan?
(14:31) Layer 1
(15:41) Keep a paradigm with faithful and human-legible CoT
(18:15) Significantly better (CoT, action and white-box) monitoring
(21:19) Control (that doesn't assume human-legible CoT)
(24:16) Much deeper understanding of scheming
(26:43) Evals
(29:56) Security
(31:52) Layer 2
(32:02) Improved near-term alignment strategies
(34:06) Continued work on interpretability, scalable oversight, superalignment and co
(36:12) Reasoning transparency
(38:36) Safety first culture
(41:49) Known limitations and open questions

---

First published: January 2nd, 2025
Source: https://www.lesswrong.com/posts/bb5Tnjdrptu89rcyY/what-s-the-short-timeline-plan

---

Narrated by TYPE III AUDIO.
Thanks to Holden Karnofsky, David Duvenaud, and Kate Woolverton for useful discussions and feedback.

Following up on our recent “Sabotage Evaluations for Frontier Models” paper, I wanted to share more of my personal thoughts on why I think catastrophic sabotage is important and why I care about it as a threat model. Note that this isn't in any way intended to be a reflection of Anthropic's views or for that matter anyone's views but my own—it's just a collection of some of my personal thoughts.

First, some high-level thoughts on what I want to talk about here: I want to focus on a level of future capabilities substantially beyond current models, but below superintelligence: specifically something approximately human-level and substantially transformative, but not yet superintelligent. While I don't think that most of the proximate cause of AI existential risk comes from such models—I think most of the direct takeover [...]

---

Outline:
(02:31) Why is catastrophic sabotage a big deal?
(02:45) Scenario 1: Sabotage alignment research
(05:01) Necessary capabilities
(06:37) Scenario 2: Sabotage a critical actor
(09:12) Necessary capabilities
(10:51) How do you evaluate a model's capability to do catastrophic sabotage?
(21:46) What can you do to mitigate the risk of catastrophic sabotage?
(23:12) Internal usage restrictions
(25:33) Affirmative safety cases

---

First published: October 22nd, 2024
Source: https://www.lesswrong.com/posts/Loxiuqdj6u8muCe54/catastrophic-sabotage-as-a-major-threat-model-for-human

---

Narrated by TYPE III AUDIO.
With kids very much on the team's mind we thought it would be fun to review some comments about parenting featured on the show over the years, then have hosts Luisa Rodriguez and Rob Wiblin react to them.

Links to learn more and full transcript.

After hearing 8 former guests' insights, Luisa and Rob chat about:
Which of these resonate the most with Rob, now that he's been a dad for six months (plus an update at nine months).
What have been the biggest surprises for Rob in becoming a parent.
How Rob's dealt with work and parenting tradeoffs, and his advice for other would-be parents.
Rob's list of recommended purchases for new or upcoming parents.

This bonus episode includes excerpts from:
Ezra Klein on parenting yourself as well as your children (from episode #157)
Holden Karnofsky on freezing embryos and being surprised by how fun it is to have a kid (#110 and #158)
Parenting expert Emily Oster on how having kids affects relationships, careers and kids, and what actually makes a difference in young kids' lives (#178)
Russ Roberts on empirical research when deciding whether to have kids (#87)
Spencer Greenberg on his surveys of parents (#183)
Elie Hassenfeld on how having children reframes his relationship to solving pressing global problems (#153)
Bryan Caplan on homeschooling (#172)
Nita Farahany on thinking about life and the world differently with kids (#174)

Chapters:
Cold open (00:00:00)
Rob & Luisa's intro (00:00:19)
Ezra Klein on parenting yourself as well as your children (00:03:34)
Holden Karnofsky on preparing for a kid and freezing embryos (00:07:41)
Emily Oster on the impact of kids on relationships (00:09:22)
Russ Roberts on empirical research when deciding whether to have kids (00:14:44)
Spencer Greenberg on parent surveys (00:23:58)
Elie Hassenfeld on how having children reframes his relationship to solving pressing problems (00:27:40)
Emily Oster on careers and kids (00:31:44)
Holden Karnofsky on the experience of having kids (00:38:44)
Bryan Caplan on homeschooling (00:40:30)
Emily Oster on what actually makes a difference in young kids' lives (00:46:02)
Nita Farahany on thinking about life and the world differently (00:51:16)
Rob's first impressions of parenthood (00:52:59)
How Rob has changed his views about parenthood (00:58:04)
Can the pros and cons of parenthood be studied? (01:01:49)
Do people have skewed impressions of what parenthood is like? (01:09:24)
Work and parenting tradeoffs (01:15:26)
Tough decisions about screen time (01:25:11)
Rob's advice to future parents (01:30:04)
Coda: Rob's updated experience at nine months (01:32:09)
Emily Oster on her amazing nanny (01:35:01)

Producer: Keiran Harris
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Content editing: Luisa Rodriguez, Katy Moore, and Keiran Harris
Transcriptions: Katy Moore
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: If-Then Commitments for AI Risk Reduction [by Holden Karnofsky], published by habryka on September 14, 2024 on LessWrong.

Holden just published this paper on the Carnegie Endowment website. I thought it was a decent reference, so I figured I would crosspost it (included in full for convenience, but if either Carnegie Endowment or Holden has a preference for just having an excerpt or a pure link post, happy to change that).

If-then commitments are an emerging framework for preparing for risks from AI without unnecessarily slowing the development of new technology. The more attention and interest there is in these commitments, the faster a mature framework can progress.

Introduction

Artificial intelligence (AI) could pose a variety of catastrophic risks to international security in several domains, including the proliferation and acceleration of cyberoffense capabilities, and of the ability to develop chemical or biological weapons of mass destruction. Even the most powerful AI models today are not yet capable enough to pose such risks,[1] but the coming years could see fast and hard-to-predict changes in AI capabilities. Both companies and governments have shown significant interest in finding ways to prepare for such risks without unnecessarily slowing the development of new technology.

This piece is a primer on an emerging framework for handling this challenge: if-then commitments. These are commitments of the form: If an AI model has capability X, risk mitigations Y must be in place. And, if needed, we will delay AI deployment and/or development to ensure the mitigations can be present in time. A specific example: If an AI model has the ability to walk a novice through constructing a weapon of mass destruction, we must ensure that there are no easy ways for consumers to elicit behavior in this category from the AI model.

If-then commitments can be voluntarily adopted by AI developers; they also, potentially, can be enforced by regulators. Adoption of if-then commitments could help reduce risks from AI in two key ways: (a) prototyping, battle-testing, and building consensus around a potential framework for regulation; and (b) helping AI developers and others build roadmaps of what risk mitigations need to be in place by when. Such adoption does not require agreement on whether major AI risks are imminent - a polarized topic - only that certain situations would require certain risk mitigations if they came to pass.

Three industry leaders - Google DeepMind, OpenAI, and Anthropic - have published relatively detailed frameworks along these lines. Sixteen companies have announced their intention to establish frameworks in a similar spirit by the time of the upcoming 2025 AI Action Summit in France.[2] Similar ideas have been explored at the International Dialogues on AI Safety in March 2024[3] and the UK AI Safety Summit in November 2023.[4] As of mid-2024, most discussions of if-then commitments have been in the context of voluntary commitments by companies, but this piece focuses on the general framework as something that could be useful to a variety of actors with different enforcement mechanisms.
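To make the "if capability X, then mitigations Y" form described above concrete, here is a minimal illustrative sketch in Python. Everything in it (the class, the field names, and the example eval and mitigation strings) is a hypothetical stand-in, not part of Karnofsky's paper, which defines commitments as policy language rather than code.

```python
# Illustrative only: a toy representation of an "if-then" commitment.
# Names and the bioweapon example are hypothetical stand-ins; real frameworks
# specify capability thresholds and mitigations in policy documents.
from dataclasses import dataclass
from typing import Callable

@dataclass
class IfThenCommitment:
    capability: str                        # the "if": a dangerous capability X
    eval_fn: Callable[[object], bool]      # returns True if the model shows capability X
    required_mitigations: list[str]        # the "then": mitigations Y that must be in place

    def check(self, model, mitigations_in_place: set[str]) -> str:
        if not self.eval_fn(model):
            return "proceed: capability not present"
        missing = [m for m in self.required_mitigations if m not in mitigations_in_place]
        if missing:
            # The commitment: delay deployment/development until mitigations exist.
            return f"pause: missing mitigations {missing}"
        return "proceed: capability present, mitigations in place"

# Hypothetical usage mirroring the weapon-of-mass-destruction example above:
bio_commitment = IfThenCommitment(
    capability="walk a novice through constructing a weapon of mass destruction",
    eval_fn=lambda model: False,  # stand-in for a real capability eval
    required_mitigations=["jailbreak-resistant refusals", "weight security against theft"],
)
print(bio_commitment.check(model=None, mitigations_in_place=set()))
```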
This piece explains the key ideas behind if-then commitments via a detailed walkthrough of a particular if-then commitment, pertaining to the potential ability of an AI model to walk a novice through constructing a chemical or biological weapon of mass destruction. It then discusses some limitations of if-then commitments and closes with an outline of how different actors - including governments and companies - can contribute to the path toward a robust, enforceable system of if-then commitments. Context and aims of this piece. In 2023, I helped with the initial development of ideas related to if-then commitments.[5] To date, I have focused on private discussion of this new fram...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Case studies on social-welfare-based standards in various industries, published by Holden Karnofsky on June 20, 2024 on The Effective Altruism Forum. Last year, I posted a call for case studies on social-welfare-based standards for companies and products (including standards imposed by regulation). The goal was to gain general context on standards to inform work on possible standards and/or regulation for AI. This resulted[1] in several dozen case studies that I found informative and helpful. I've been hoping to write up my reflections after reading them all, but it's taken me long enough to get to this that for now I am just publishing a public Google sheet with links to all of the case studies that we have permission to share publicly (including some that already have public links). The link is here: https://docs.google.com/spreadsheets/d/18gaTIzdgMvKLq9Cp2-GJZZA7QmE93Frufh1UhNMcbpg/ 1. ^ Most of the case studies this piece links to were directly paid for via this project, but in some cases the work was pro bono, or someone adapted or sent a copy of work that had been done for another project, etc. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Émile P. Torres's history of dishonesty and harassment, published by anonymous-for-obvious-reasons on May 1, 2024 on The Effective Altruism Forum. This is a cross-post and you can see the original here, written in 2022. I am not the original author, but I thought it was good for more EAs to know about this. I am posting anonymously for obvious reasons, but I am a longstanding EA who is concerned about Torres's effects on our community. An incomplete summary Introduction This post compiles evidence that Émile P. Torres, a philosophy student at Leibniz Universität Hannover in Germany, has a long pattern of concerning behavior, which includes gross distortion and falsification, persistent harassment, and the creation of fake identities. Note: Since Torres has recently claimed that they have been the target of threats from anonymous accounts, I would like to state that I condemn any threatening behavior in the strongest terms possible, and that I have never contacted Torres or posted anything about Torres other than in this Substack or my Twitter account. I have no idea who is behind these accounts. To respect Torres's privacy and identity, I have also omitted their first name from the screenshots and replaced their previous first name with 'Émile'. Table of contents Introduction My story Stalking and harassment Peter Boghossian Helen Pluckrose Demonstrable falsehoods and gross distortions "Forcible" removal "Researcher at CSER" Giving What We Can Brief digression on effective altruism More falsehoods and distortions Hilary Greaves Andreas Mogensen Nick Beckstead Tyler Cowen Olle Häggström Sockpuppetry "Alex Williams" Conclusion My story Before I discuss Torres's behavior, I will provide some background about myself and my association with effective altruism (EA). I hope this information will help readers decide what biases I may have and subject my arguments to the appropriate degree of critical scrutiny. I first heard about EA upon attending Aaron Swartz's memorial in January 2013. One of the speakers at that event was Holden Karnofsky, co-founder of GiveWell, a charity evaluator for which Aaron had volunteered. Karnofsky described Aaron as someone who "believed in trying to maximize the good he accomplished with each minute he had." I resonated with that phrase, and in conversation with some friends after the memorial, I learned that Aaron's approach, and GiveWell's, were examples of what was, at the time, a new movement called "effective altruism." Despite my sympathy for EA, I never got very involved with it, due to a combination of introversion and the sense that I hadn't much to offer. I have donated a small fraction of my income to the Against Malaria Foundation for the last nine years, but I have never taken the Giving What We Can pledge, participated in a local EA group, or volunteered or worked for an EA organization. I decided to write this article after a friend forwarded me one of Torres's critical pieces on longtermism. I knew enough about this movement -- which emerged out of EA -- to quickly identify some falsehoods and misrepresentations in Torres's polemic. So I was surprised to find, when I checked the comments on Twitter, that no one else was pointing out these errors. A few weeks later, I discovered that this was just one of a growing number of articles by Torres that attacked these ideas and their proponents. 
Since I also noticed several factual inaccuracies in these other publications, I got curious and decided to look into Torres's writings more closely. I began to follow Torres's Twitter presence with interest and to investigate older Twitter feuds that Torres occasionally references. After looking into these and systematically checking the sources Torres cites in support of their various allegations, I found Torres's behavior much more troublin...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Joining the Carnegie Endowment for International Peace, published by Holden Karnofsky on April 29, 2024 on The Effective Altruism Forum. Effective today, I've left Open Philanthropy and joined the Carnegie Endowment for International Peace[1] as a Visiting Scholar. At Carnegie, I will analyze and write about topics relevant to AI risk reduction. In the short term, I will focus on (a) what AI capabilities might increase the risk of a global catastrophe; (b) how we can catch early warning signs of these capabilities; and (c) what protective measures (for example, strong information security) are important for safely handling such capabilities. This is a continuation of the work I've been doing over the last ~year. I want to be explicit about why I'm leaving Open Philanthropy. It's because my work no longer involves significant involvement in grantmaking, and given that I've overseen grantmaking historically, it's a significant problem for there to be confusion on this point. Philanthropy comes with particular power dynamics that I'd like to move away from, and I also think Open Philanthropy would benefit from less ambiguity about my role in its funding decisions (especially given the fact that I'm married to the President of a major AI company). I'm proud of my role in helping build Open Philanthropy, I love the team and organization, and I'm confident in the leadership it's now under; I think it does the best philanthropy in the world, and will continue to do so after I move on. I will continue to serve on its board of directors (at least for the time being). While I'll miss the Open Philanthropy team, I am excited about joining Carnegie. Tino Cuellar, Carnegie's President, has been an advocate for taking (what I see as) the biggest risks from AI seriously. Carnegie is looking to increase its attention to AI risk, and has a number of other scholars working on it, including Matt Sheehan, who specializes in China's AI ecosystem (an especially crucial topic in my view). Carnegie's leadership has shown enthusiasm for the work I've been doing and plan to continue. I expect that I'll have support and freedom, in addition to an expanded platform and network, in continuing my work there. I'm generally interested in engaging more on AI risk with people outside my existing networks. I think it will be important to build an increasingly big tent over time, and I've tried to work on approaches to risk reduction (such as responsible scaling) that have particularly strong potential to resonate outside of existing AI-risk-focused communities. The Carnegie network is appealing because it's well outside my usual network, while having many people with (a) genuine interest in risks from AI that could rise to the level of international security issues; (b) knowledge of international affairs. I resonate with Carnegie's mission of "helping countries and institutions take on the most difficult global problems and advance peace," and what I've read of its work has generally had a sober, nuanced, peace-oriented style that I like. I'm looking forward to working at Carnegie, despite the bittersweetness of leaving Open Phil. 
To a significant extent, though, the TL;DR of this post is that I am continuing the work I've been doing for over a year: helping to design and advocate for a framework that seeks to get early warning signs of key risks from AI, accompanied by precommitments to have sufficient protections in place by the time they come (or to pause AI development and deployment until these protections get to where they need to be). ^ I will be at the California office and won't be relocating. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Effective today, I've left Open Philanthropy and joined the Carnegie Endowment for International Peace[1] as a Visiting Scholar. At Carnegie, I will analyze and write about topics relevant to AI risk reduction. In the short term, I will focus on (a) what AI capabilities might increase the risk of a global catastrophe; (b) how we can catch early warning signs of these capabilities; and (c) what protective measures (for example, strong information security) are important for safely handling such capabilities. This is a continuation of the work I've been doing over the last ~year. I want to be explicit about why I'm leaving Open Philanthropy. It's because my work no longer involves significant involvement in grantmaking, and given that I've overseen grantmaking historically, it's a significant problem for there to be confusion on this point. Philanthropy comes with particular power dynamics that I'd like to move away from, and [...] The original text contained 1 footnote which was omitted from this narration. --- First published: April 29th, 2024 Source: https://forum.effectivealtruism.org/posts/7gzgwgwefwBku2cnL/joining-the-carnegie-endowment-for-international-peace --- Narrated by TYPE III AUDIO.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Best Tacit Knowledge Videos on Every Subject, published by Parker Conley on March 31, 2024 on LessWrong.

TL;DR

Tacit knowledge is extremely valuable. Unfortunately, developing tacit knowledge is usually bottlenecked by apprentice-master relationships. Tacit Knowledge Videos could widen this bottleneck. This post is a Schelling point for aggregating these videos - aiming to be The Best Textbooks on Every Subject for Tacit Knowledge Videos. Scroll down to the list if that's what you're here for. Post videos that highlight tacit knowledge in the comments and I'll add them to the post. Experts in the videos include Stephen Wolfram, Holden Karnofsky, Andy Matuschak, Jonathan Blow, George Hotz, and others.

What are Tacit Knowledge Videos?

Samo Burja claims YouTube has opened the gates for a revolution in tacit knowledge transfer. Burja defines tacit knowledge as follows: Tacit knowledge is knowledge that can't properly be transmitted via verbal or written instruction, like the ability to create great art or assess a startup. This tacit knowledge is a form of intellectual dark matter, pervading society in a million ways, some of them trivial, some of them vital. Examples include woodworking, metalworking, housekeeping, cooking, dancing, amateur public speaking, assembly line oversight, rapid problem-solving, and heart surgery.

In my observation, domains like housekeeping and cooking have already seen many benefits from this revolution. Could tacit knowledge in domains like research, programming, mathematics, and business be next? I'm not sure, but maybe this post will help push the needle forward. For the purpose of this post, Tacit Knowledge Videos are any video that communicates "knowledge that can't properly be transmitted via verbal or written instruction". Here are some examples:

Neel Nanda, who leads the Google DeepMind mechanistic interpretability team, has a playlist of "Research Walkthroughs". AI Safety research is discussed a lot around here. Watching research videos could help instantiate what AI research really looks and feels like.

GiveWell has public audio recordings of its Board Meetings from 2007-2020. Participants include Elie Hassenfeld, Holden Karnofsky, Timothy Ogden, Rob Reich, Tom Rutledge, Brigid Slipka, Cari Tuna, Julia Wise, and others. Influential business meetings are not usually made public. I feel I have learned some about business communication and business operations, among other things, by listening to these recordings.

Andy Matuschak recorded himself studying Quantum Mechanics with Dwarkesh Patel and doing research. Andy Matuschak "helped build iOS at Apple and led R&D at Khan Academy". I found it interesting to have a peek into Matuschak's spaced repetition practice and various studying heuristics and habits, as well as his process of digesting and taking notes on papers.

Call to Action

Share links to Tacit Knowledge Videos below! Share them frivolously! These videos are uncommon - the bottleneck to the YouTube knowledge transfer revolution is quantity, not quality. I will add the shared videos to the post. Here are the loose rules:

Recall a video that you've seen that communicates tacit knowledge - "knowledge that can't properly be transmitted via verbal or written instruction". A rule of thumb for sharing: could a reader find this video through one or two YouTube searches? If not, share it.

Post the title and the URL of the video. Provide information indicating why the expert in the video is credible. (However, don't let this last rule stop you from sharing a video! Again - quantity, not quality.)[1]

For information on how to best use these videos, Cedric Chin and Jacob Steinhardt have some potentially relevant practical advice. Andy Matuschak also has some working notes about this idea generally. Additionally, DM or email me (email in L...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: 80,000 Hours' new series on building skills, published by Benjamin Hilton on February 14, 2024 on The Effective Altruism Forum.

If we were going to summarise all our advice on how to get career capital in three words, we'd say: build useful skills. In other words, gain abilities that are valued in the job market - which makes your work more useful and makes it easier to bargain for the ingredients of a fulfilling job - as well as those that are specifically needed in tackling the world's most pressing problems.

So today, we're launching our series on the most useful skills for making a difference, which you can find here. It covers why we recommend each skill, how to get started learning them, and how to work out which is the best fit for you. Each article looks at one of eight skill sets we think are most useful for solving the problems we think are most pressing:
Policy and political skills
Organisation-building
Research
Communicating ideas
Software and tech skills
Experience with an emerging power
Engineering
Expertise relevant to a top problem

Why are we releasing this now?

We think that many of our readers have come away from our site underappreciating the importance of career capital. Instead, they focus their career choices on having an impact right away. This is a difficult tradeoff in general. Roughly, our position is that:
There's often less tradeoff between these things than people think, as good options for career capital often involve directly working on a problem you think is important.
That said, building career capital substantially increases the impact you're able to have. This is in part because the impact of different jobs is heavy-tailed, and career capital is one of the primary ways to end up in the tails. As a result, neglecting career capital can lower your long-term impact in return for only a small increase in short-term impact.
Young people especially should be prioritising career capital in most cases.
We think that building career capital is important even for people focusing on particularly urgent problems - for example, we think that whether you should do an ML PhD doesn't depend (much) on your AI timelines.

Why the focus on skills?

We break down career capital into five components:
Skills and knowledge
Connections
Credentials
Character
Runway (i.e. savings)

We've found that "build useful skills" is a particularly good rule of thumb for building career capital. It's true that in addition to valuable skills, you also need to learn how to sell those skills to others and make connections. This can involve deliberately gaining credentials, such as by getting degrees or creating public demo projects; or it can involve what's normally thought of as "networking," such as going to conferences or building up a Twitter following. But all of these activities become much easier once you have something useful to offer. The decision to focus on skills was also partly inspired by discussions with Holden Karnofsky and his post on building aptitudes, which we broadly agree with. If you have more questions, take a look at our skills FAQ.

How can you help?

Please take a look at our new series and, if possible, share it with a friend! We'd love feedback on these pages. If you have any, please do let us know in the comments, or by contacting us at info@80000hours.org. Thank you so much!

Thanks for listening.
To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Good job opportunities for helping with the most important century, published by Holden Karnofsky on January 18, 2024 on The Effective Altruism Forum.

Yes, this is my first post in almost a year. I'm no longer prioritizing this blog, but I will still occasionally post something. I wrote ~2 years ago that it was hard to point to concrete opportunities to help the most important century go well. That's changing. There are a good number of jobs available now that are both really promising opportunities to help (in my opinion) and are suitable for people without a lot of pre-existing knowledge of AI risk (or even AI). The jobs are demanding, but unlike many of the job openings that existed a couple of years ago, they are at well-developed organizations and involve relatively clear goals. So if you're someone who wants to help, but has been waiting for the right moment, this might be it. (Or not! I'll probably keep making posts like this as the set of opportunities gets wider.) Here are the jobs that best fit this description right now, as far as I can tell. The rest of this post will give a bit more detail on how these jobs can help, what skills they require and why these are the ones I listed.

Organization | Location | Jobs | Link
UK AI Safety Institute | London (remote work possible within the UK) | Engineering and frontend roles, cybersecurity roles | Here
AAAS, Horizon Institute for Public Service, Tech Congress | Washington, DC | Fellowships serving as entry points into US policy roles | Here
AI companies: Google DeepMind, OpenAI, Anthropic[1] | San Francisco and London (with some other offices and remote work options) | Preparedness/Responsible Scaling roles; alignment research roles | Here, here, here, here
Model Evaluation and Threat Research (METR) (fewer roles available) | Berkeley (with remote work options) | Engineering and data roles | Here

Software engineering and development (and related areas) seem especially valuable right now, so think about whether you know folks with those skills who might be interested!

How these help

A lot of these jobs (and the ones I know the most about) would be contributing toward a possible global standards regime for AI: AI systems should be subject to testing to see whether they present major risks, and training/deploying AI should be stopped (e.g., by regulation) when it can't be done safely. The basic hope is:
Teams will develop "evals": tests of what AIs are capable of, particularly with respect to possible risks. For example, one eval might be prompting an AI to give a detailed description of how to build a bioweapon; the more detailed and accurate its response, the more risk the AI poses (while also possibly having more potential benefits as well, by virtue of being generally more knowledgeable/capable).
It will become common (through regulation, voluntary action by companies, industry standards, etc.) for cutting-edge AI systems to be subject to evals for dangerous capabilities.
When evals reveal risk, they will trigger required mitigations. For example: An AI capable of bioweapons development should be (a) deployed in such a way that people can't use it for that (including by "jailbreaking" it), and (b) kept under good security to stop would-be terrorists from circumventing the restrictions.
AIs with stronger and more dangerous capabilities might require very challenging mitigations, possibly beyond what anyone knows how to do today (for example, rigorous demonstrations that an AI won't have dangerous unintended aims, even if this sort of thing is hard to measure). Ideally, we'd eventually build a robust international governance regime (comparisons have been made to nuclear non-proliferation regimes) that reliably enforces rules like these, while safe and beneficial AI goes forward. But my view is that even dramatically weaker setups can still help a lo...
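As a toy illustration of the eval-and-trigger logic described above (score a dangerous capability, then require mitigations once a threshold is crossed), here is a short Python sketch. The scoring rule, thresholds, and mitigation names are invented for the example and are not any lab's or regulator's actual criteria.

```python
# A rough sketch (not any organisation's real tooling) of the eval-to-mitigation
# loop: score a model on a dangerous-capability eval and map the score to
# required mitigations. Thresholds and mitigation names are made up.
def run_bioweapon_eval(model_answers: list[str]) -> float:
    """Stand-in scorer: fraction of red-team prompts that got a long, detailed
    response. A real eval would use expert grading, not response length."""
    return sum(len(a) > 500 for a in model_answers) / max(len(model_answers), 1)

def required_mitigations(score: float) -> list[str]:
    if score < 0.1:
        return []  # below the risk threshold: no extra mitigations triggered
    mitigations = ["refusal training and jailbreak monitoring",
                   "security sufficient to prevent weight theft"]
    if score >= 0.5:
        mitigations.append("pause further scaling until stronger assurances exist")
    return mitigations

score = run_bioweapon_eval(["I can't help with that."] * 20)
print(score, required_mitigations(score))
```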
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Open Phil Should Allocate Most Neartermist Funding to Animal Welfare, published by Ariel Simnegar on November 19, 2023 on The Effective Altruism Forum.

Thanks to Michael St. Jules for his comments.

Key Takeaways

The evidence that animal welfare dominates in neartermism is strong. Open Philanthropy (OP) should scale up its animal welfare allocation over several years to approach a majority of OP's neartermist grantmaking. If OP disagrees, they should practice reasoning transparency by clarifying their views:
How much weight does OP's theory of welfare place on pleasure and pain, as opposed to nonhedonic goods?
Precisely how much more does OP value one unit of a human's welfare than one unit of another animal's welfare, just because the former is a human? How does OP derive this tradeoff?
How would OP's views have to change for OP to prioritize animal welfare in neartermism?

Summary

Rethink Priorities (RP)'s moral weight research endorses the claim that the best animal welfare interventions are orders of magnitude (1000x) more cost-effective than the best neartermist alternatives. Avoiding this conclusion seems very difficult:
Rejecting hedonism (the view that only pleasure and pain have moral value) is not enough, because even if pleasure and pain are only 1% of what's important, the conclusion still goes through.
Rejecting unitarianism (the view that the moral value of a being's welfare is independent of the being's species) is not enough. Even if, just for being human, one accords one unit of human welfare 100x the value of one unit of another animal's welfare, the conclusion still goes through.
Skepticism of formal philosophy is not enough, because the argument for animal welfare dominance can be made without invoking formal philosophy. By analogy, although formal philosophical arguments can be made for longtermism, they're not required for longtermist cause prioritization.

Even if OP accepts RP's conclusion, they may have other reasons why they don't allocate most neartermist funding to animal welfare. Though some of OP's possible reasons may be fair, if anything, they'd seem to imply a relaxation of this essay's conclusion rather than a dismissal. It seems like these reasons would also broadly apply to AI x-risk within longtermism. However, OP didn't seem put off by these reasons when they allocated a majority of longtermist funding to AI x-risk in 2017, 2019, and 2021. I request that OP clarify their views on whether or not animal welfare dominates in neartermism.

The Evidence Endorses Prioritizing Animal Welfare in Neartermism

GiveWell estimates that its top charity (Against Malaria Foundation) can prevent the loss of one year of life for every $100 or so. We've estimated that corporate campaigns can spare over 200 hens from cage confinement for each dollar spent. If we roughly imagine that each hen gains two years of 25%-improved life, this is equivalent to one hen-life-year for every $0.01 spent. If you value chicken life-years equally to human life-years, this implies that corporate campaigns do about 10,000x as much good per dollar as top charities. … If one values humans 10-100x as much, this still implies that corporate campaigns are a far better use of funds (100-1,000x).
— Holden Karnofsky, "Worldview Diversification" (2016)

"Worldview Diversification" (2016) describes OP's approach to cause prioritization.
At the time, OP's research found that if the interests of animals are "at least 1-10% as important" as those of humans, then "animal welfare looks like an extraordinarily outstanding cause, potentially to the point of dominating other options". After the better part of a decade, the latest and most rigorous research funded by OP has endorsed a stronger claim: Any significant moral weight for animals implies that OP should prioritize animal welfare in ne...
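The 10,000x figure quoted above from "Worldview Diversification" follows from simple arithmetic. Here is that back-of-the-envelope calculation worked through in Python, using only the assumptions stated in the quote (200 hens spared per dollar, two years of 25%-improved life per hen, one human life-year per ~$100); the inputs are the post's stated assumptions, not independent estimates.

```python
# Reproducing the back-of-the-envelope arithmetic quoted above from
# "Worldview Diversification" (2016); inputs are the quote's assumptions.
hens_spared_per_dollar = 200
years_improved_per_hen = 2
improvement_fraction = 0.25

hen_life_years_per_dollar = hens_spared_per_dollar * years_improved_per_hen * improvement_fraction
# = 100 improvement-weighted hen-life-years per dollar, i.e. one per $0.01

human_life_years_per_dollar = 1 / 100  # GiveWell top charity: ~one life-year per $100

ratio = hen_life_years_per_dollar / human_life_years_per_dollar
print(ratio)        # 10,000x if chicken and human life-years are weighted equally
print(ratio / 100)  # still ~100x if a human life-year counts 100 times as much
```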
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Advice for EA boards, published by Julia Wise on November 10, 2023 on The Effective Altruism Forum. Context As part of this project on reforms in EA, we've reviewed some changes that boards of organizations could make. Julia was the primary writer of this piece, with significant input from Ozzie. This advice on nonprofit boards draws from multiple sources. We spoke with board members from small and larger organizations inside and outside EA. We got input from staff at EA organizations who regularly interact with their boards, such as staff tasked with board relations. Julia and Ozzie also have a history of being on boards at EA organizations. Overall, there was no consensus on obvious reforms EA organizations should prioritize. But by taking advice from these varied sources, we aim to highlight considerations particularly relevant for EA boards. We have also shared more organization-specific thoughts with staff and board members at some organizations. Difficult choices we see How much to innovate? When should EA boards follow standard best practices, and when should they be willing to try something significantly different? Which sources do you trust on what "best practices" even are? Skills vs. alignment. How should organizations weigh board members with strong professional skills, such as finance and law, with those who have more alignment with the organization's specific mission? How much effort should be put into board recruitment? Most organizations spend less time on recruiting a board member than for hiring a staff position (which probably makes sense given the much larger number of hours a staff member will put in.) But the current default time put into this by EA organizations may be too low. Some things we think (which many organizations probably already agree with) Being a board member / trustee is an important role, and board members should be prepared to give it serious time. "At least 2 hours a month" is one estimate that seems sensible for organizations after a certain stage (perhaps 5 FTE). In times of major transition or crisis for the organization, it may be a lot more. It's best to have set terms for board membership so that each member is prompted to consider whether board service is still a good fit for them, and other board members are prompted to consider whether the person is still a good fit for the board. This doesn't mean their term definitely ends after a fixed time (they can be re-elected / reappointed), but people shouldn't stay on the board indefinitely by default. It also makes it easier to ask someone to leave if they're no longer a solid fit or are checked out. Many organizations change or grow dramatically over time, so board members who are great at some stages might stop being best later on. It's important to have good information sharing between staff and the board. With senior staff, this could be by fairly frequent meetings or by other updates. With junior staff who can provide a different view into the organization than senior staff, this could be interviews, office hours held by board members, or by attending staff events. It's important to have a system for recusing board members who are conflicted. This is both for votes, and for discussions that should be held without staff present. For example, see Holden Karnofsky's suggestion about closed sessions. 
It's helpful to have staff capacity specifically designated for board coordination. It's helpful to have one primary person own this area. The goal is to get the board information that will make them more effective at providing oversight. Boards should have directors & officers insurance. Expertise on a board: Many people we talked to felt it was useful to have specific skills or professional experience on a board (e.g. finance expertise, legal expertise). The amount of expertise ...
Views are my own, not Open Philanthropy's. I am married to the President of Anthropic and have a financial interest in both Anthropic and OpenAI via my spouse. Over the last few months, I've spent a lot of my time trying to help out with efforts to get responsible scaling policies adopted. In that context, a number of people have said it would be helpful for me to be publicly explicit about whether I'm in favor of an AI pause. This post will give some thoughts on these topics. I think transformative AI could be soon, and we're not ready I have a strong default to thinking that scientific and technological progress is good and that worries will tend to be overblown. However, I think AI is a big exception here because of its potential for unprecedentedly rapid and radical transformation.1 I think [...] ---Outline:(00:36) I think transformative AI could be soon, and we're not ready(02:11) If it were all up to me, the world would pause now - but it isn't, and I'm more uncertain about whether a “partial pause” is good(07:46) Responsible scaling policies (RSPs) seem like a robustly good compromise with people who have different views from mine (with some risks that I think can be managed)The original text contained 5 footnotes which were omitted from this narration. --- First published: October 27th, 2023 Source: https://forum.effectivealtruism.org/posts/ntWikwczfSi8AJMg3/we-re-not-ready-thoughts-on-pausing-and-responsible-scaling --- Narrated by TYPE III AUDIO.
This is a selection of highlights from episode #158 of The 80,000 Hours Podcast.These aren't necessarily the most important, or even most entertaining parts of the interview — and if you enjoy this, we strongly recommend checking out the full episode:Holden Karnofsky on how AIs might take over even if they're no smarter than humans, and his four-part playbook for AI riskAnd if you're finding these highlights episodes valuable, please let us know by emailing podcast@80000hours.org.Highlights put together by Simon Monsour and Milo McGuire
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Violence Before Agriculture, published by John G. Halstead on October 2, 2023 on The Effective Altruism Forum. This is a summary of a report on trends in violence since the dawn of humanity: from the hunter-gatherer period to the present day. The full report is available at this Substack and as a preprint on SSRN. Phil did 95% of the work on the report. Expert reviewers provided the following comments on our report. "Thomson and Halstead have provided an admirably thorough and fair assessment of this difficult and emotionally fraught empirical question. I don't agree with all of their conclusions, but this will surely be the standard reference for this issue for years to come." Steven Pinker, Johnstone Family Professor in the Department of Psychology at Harvard University "This work uses an impressively comprehensive survey of ethnographic and archeological data on military mortality in historically and archeologically known small-scale societies in an effort to pin down the scale of the killing in the pre-agricultural world. This will be a useful addition to the literature. It is an admirably cautious assessment of the war mortality data, which are exceptionally fragile; and the conclusions it draws about killing rates prior to the Holocene are probably as good as we are likely to get for the time being." Paul Roscoe, Professor of Anthropology at the University of Maine Epistemic status We think our estimates here move understanding of prehistoric violence forward by rigorously focussing on the pre-agricultural period and attempting to be as comprehensive as possible with the available evidence. However, data in the relevant fields of ethnography and archeology is unusually shaky, so we would not be surprised if it turned out that some of the underlying data turns out to be wrong. We are especially unsure about our method for estimating actual violent mortality rates from the measured, observable rates in the raw archeology data. One of us (Phil) has a masters in anthropology. Neither of us have any expertise in archeology. Guide for the reader If you are interested in this study simply as a reference for likely rates/patterns of violence in the pre-agricultural world, all our main results and conclusions are presented in the Summary. The rest of the study explores the evidence in more depth and explains how we put our results together. We first cover the ethnographic evidence, then the archeological evidence. The study ends with a more speculative discussion of our findings and their possible implications. Acknowledgments We would like to thank the following expert reviewers for their extensive and insightful comments and suggestions, which have helped to make this report substantially better. Steven Pinker, Johnstone Family Professor in the Department of Psychology at Harvard University Robert Kelly, Professor of Archeology at the University of Wyoming Paul Roscoe, Professor of Anthropology at the University of Maine We would also like to thank Prof. Hisashi Nakao, Prof. Douglas Fry, Prof. Nelson Graburn, and Holden Karnofsky for commenting, responding to queries and sharing materials. Around 11,000 years ago plants and animals began to be domesticated, a process which would completely transform the lifeways of our species. Human societies all over the world came to depend almost entirely on farming. 
Before this transformative period of history, everyone was a hunter-gatherer. For about 96% of the approximately 300,000 years since Homo sapiens evolved, we relied on wild plants and animals for food. Our question is: what do we know about how violent these pre-agricultural people were? In 2011 Steven Pinker published The Better Angels of Our Nature. According to Pinker, prehistoric small-scale societies were generally extremely violent by comparison with modern stat...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Half baked idea: A "pivot pledge", published by Ardenlk on August 16, 2023 on The Effective Altruism Forum. I work at 80,000 Hours but this is not an 80,000 Hours thing - this is just an idea I wrote up because I had it and thought it was kind of cool. I probably won't be executing on it myself, but I thought I'd throw it out there into the world in case someone is interested in picking it up and running with it, do-ocracy style! Epistemic status: half baked. The problem: Being and staying prepared to switch jobs/careers if an excellent impact opportunity arises is very hard, but has high expected value for many people. The proposal: Someone set up a "pivot pledge" to provide accountability, support, and encouragement to people who want to be prepared to switch careers in this way. More on the problem: In "Jobs to help with the most important century", Holden Karnofsky wrote that if you aren't up for making a career change right now, one thing you can do is to keep your options open and be ready to jump at the right opportunity. He says: > It's hard to predict what skills will be useful as AI advances further and new issues come up. > Being ready to switch careers when a big opportunity comes up could be hugely valuable - and hard. (Most people would have a lot of trouble doing this late in their career, no matter how important!) He reiterated this idea recently on the 80,000 Hours Podcast: > It might be worth emphasising that the ability to switch careers is going to get harder and harder as you get further and further into your career. So in some ways, if you're a person who's being successful, but is also making sure that you've got the financial resources, the social resources, the psychological resources, so that you really feel confident that as soon as a good opportunity comes up to do a lot of good, you're going to actually switch jobs, or have a lot of time to serve on a board or whatever - it just seems incredibly valuable. > I think it's weird because this is not a measurable thing, and it's not a thing you can, like, brag about when you go to an effective altruism meetup. And I just wish there was a way to kind of recognise that the person who is successfully able to walk away, when they need to, from a successful career has, in my mind, more expected impact than the person who's in the high-impact career right now, but is not killing it. I'd add a few things here: 1. It doesn't seem like you need to prioritize AI to think that this would be good for many people to do. Though this does seem especially important if you have a view of the world in which "things are going to go crazy at some point", because that makes longer-term high impact career planning harder, and you are more likely to think that if you think AI risk is high. But longer-term career planning is always hard, and even if you think other problems are much more pressing, you could still think that some opportunities will be much higher impact than others and will be hard to predict. 2. Many people have pointed out that we could use more experienced hands on many top problem areas. This is one way to help make that happen. 3. 
I think it could be useful to go into some mechanisms that account for why it's hard to switch careers later on: I think it's hard for more senior people to take an actual or perceived step down in level of responsibility, prestige, or compensation, because it feels like 'going backward.' But when you switch your career, you often need to take a step 'down' on some hierarchy and build back up. Relatedly, people really don't want to have 'wasted time', so they are always very keen to be applying previous experience. Switching careers usually involves letting some of your previous experience 'go to waste'. We see this a lot at 80,000 Hours even in people in their 20s! Sta...
Back in 2007, Holden Karnofsky cofounded GiveWell, where he sought out the charities that most cost-effectively helped save lives. He then cofounded Open Philanthropy, where he oversaw a team making billions of dollars' worth of grants across a range of areas: pandemic control, criminal justice reform, farmed animal welfare, and making AI safe, among others. This year, having learned about AI for years and observed recent events, he's narrowing his focus once again, this time on making the transition to advanced AI go well.In today's conversation, Holden returns to the show to share his overall understanding of the promise and the risks posed by machine intelligence, and what to do about it. That understanding has accumulated over around 14 years, during which he went from being sceptical that AI was important or risky, to making AI risks the focus of his work.Links to learn more, summary and full transcript.(As Holden reminds us, his wife is also the president of one of the world's top AI labs, Anthropic, giving him both conflicts of interest and a front-row seat to recent events. For our part, Open Philanthropy is 80,000 Hours' largest financial supporter.)One point he makes is that people are too narrowly focused on AI becoming 'superintelligent.' While that could happen and would be important, it's not necessary for AI to be transformative or perilous. Rather, machines with human levels of intelligence could end up being enormously influential simply if the amount of computer hardware globally were able to operate tens or hundreds of billions of them, in a sense making machine intelligences a majority of the global population, or at least a majority of global thought.As Holden explains, he sees four key parts to the playbook humanity should use to guide the transition to very advanced AI in a positive direction: alignment research, standards and monitoring, creating a successful and careful AI lab, and finally, information security.In today's episode, host Rob Wiblin interviews return guest Holden Karnofsky about that playbook, as well as:Why we can't rely on just gradually solving those problems as they come up, the way we usually do with new technologies.What multiple different groups can do to improve our chances of a good outcome — including listeners to this show, governments, computer security experts, and journalists.Holden's case against 'hardcore utilitarianism' and what actually motivates him to work hard for a better world.What the ML and AI safety communities get wrong in Holden's view.Ways we might succeed with AI just by dumb luck.The value of laying out imaginable success stories.Why information security is so important and underrated.Whether it's good to work at an AI lab that you think is particularly careful.The track record of futurists' predictions.And much more.Get this episode by subscribing to our podcast on the world's most pressing problems and how to solve them: type ‘80,000 Hours' into your podcasting app. Or read the transcript.Producer: Keiran HarrisAudio Engineering Lead: Ben CordellTechnical editing: Simon Monsour and Milo McGuireTranscriptions: Katy Moore
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: An overview of standards in biosafety and biorisk, published by rosehadshar on July 26, 2023 on The Effective Altruism Forum. This report represents ~40 hours of work by Rose Hadshar in summer 2023 for Arb Research, in turn for Holden Karnofsky in response to this call for proposals on standards. It's based on a mixture of background reading, research into individual standards, and interviews with experts. Note that I didn't ask for permission to cite the expert interviews publicly, so I've anonymised them. I suggest reading the scope and summary and skimming the overview, then only looking at sections which seem particularly relevant to you. Scope: This report covers: Both biosecurity and biosafety: Biosecurity: "the protection, control and accountability for valuable biological materials (including information) in laboratories in order to prevent their unauthorized access, loss, theft, misuse, diversion or intentional release." Biosafety: "the containment principles, technologies and practices that are implemented to prevent unintentional exposure to pathogens and toxins or their accidental release". Biosecurity and biosafety standards internationally, but with much more emphasis on the US. Regulations and guidance as well as standards proper. I am using these terms as follows: Regulations: rules on how to comply with a particular law or laws. Legally binding. Guidance: rules on how to comply with particular regulations. Not legally binding, but risky to ignore. Standards: rules which do not relate to compliance with a particular law or laws. Not legally binding. Note that I also sometimes use 'standards' as an umbrella term for regulations, guidance and standards. Summary of most interesting findings: For each point: I've included my confidence in the claim (operationalised as the probability that I would still believe the claim after 40 hours' more work). I link to a subsection with more details (though in some cases I don't have much more to say). The origins of bio standards: (80%) There were many different motivations behind bio standards (e.g. plant health, animal health, worker protection, bioterrorism, fair sharing of genetic resources). (70%) Standards were significantly reactive to rather than proactive about incidents (e.g. lab accidents, terrorist attacks, and epidemics), though: there are exceptions (e.g. the NIH guidelines on recombinant DNA), and guidance is often more proactive than standards (e.g. gene drives). (80%) International standards weren't always later or less influential than national ones. (70%) Voluntary standards seem to have prevented regulation in at least one case (e.g. the NIH guidelines). (65%) In the US, it may be more likely that mandatory standards are passed on matters of national security (e.g. FSAP). Compliance: (60%) Voluntary compliance may sometimes be higher than mandated compliance (e.g. NIH guidelines). (70%) Motives for voluntarily following standards include responsibility, market access, and the spread of norms via international training. (80%) Voluntary standards may be easier to internationalise than regulation. (90%) Deliberate efforts were made to increase compliance internationally (e.g. via funding biosafety associations, offering training and other assistance). Problems with these standards: (90%) Bio standards are often list-based. 
This means that they are not comprehensive, do not reflect new threats, prevent innovation in risk management, and fail to recognise the importance of context for risk. There's been a partial move away from prescriptive, list-based standards towards holistic, risk-based standards (e.g. ISO 35001). (85%) Bio standards tend to lack reporting standards, so it's very hard to tell how effective they are. (60%) Standards may have impeded safety work in some areas (e.g. select agent designation as a...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Thoughts on "Process-Based Supervision", published by Steve Byrnes on July 17, 2023 on The AI Alignment Forum. 1. Post Summary / Table of Contents In "How might we align transformative AI if it's developed very soon?", Holden Karnofsky talked about "Process-Based Supervision", citing a previous post by Stuhlmüller & Byun of Ought. (Holden says he got the idea mainly from Paul Christiano.) I apparently misunderstood what Holden meant by "Process-Based Supervision", and it took many hours and a 7000-word comment thread before I figured it out. (Thanks to Holden for his extraordinary patience during that protracted discussion.) The extremely short version for AI alignment domain experts is: I currently think of the load-bearing ingredients of Holden's take on "process-based supervision" as being: AI boxing some of the time (specifically, during the periods where the AI is "thinking"); "Myopic training" (e.g. as defined here); NOT aspiring to be a complete solution to safety, but rather a more modest attempt to help avoid situations where we (accidentally) positively reinforce the AI for engaging in directly dangerous behavior. You could think of this as an intervention that directly and straightforwardly mitigates outer misalignment, and that's the main thing I'll discuss in this post. But obviously, any change of the supervisory signals will have some effect on the likelihood of inner misalignment / goal misgeneralization too. And there's also a (more speculative) argument that process-based supervision might make things better there too - at least on the margin. See footnote 4 of Section 5.2.2. (This is specific to Holden's take. I think Stuhlmüller & Byun's take on "process-based supervision" involves a different set of load-bearing ingredients, centered around restricting the complexity of black-box processing. I will not be discussing that.) The long, hopefully-pedagogical, and more opinionated version is the rest of this post. Table of Contents: Section 2 will give the very brief slogan / sales-pitch for process-based supervision, and why that pitch was bouncing off me, striking me as frustratingly missing-the-point. Section 3 will state the subproblem that we're trying to solve: the AI does subtly manipulative, power-seeking, or otherwise problematic actions, and we don't notice, and therefore we give a training signal that reinforces that behavior, and therefore the AI does those things more and more. To be clear, this is not the only path to dangerous misalignment (in particular, classic "treacherous turns" are out-of-scope). But maybe solving just this subproblem can be part of a complete solution. I'll get back to that in Section 5. Section 4 describes "process-based supervision" as I currently understand it, and why it seems to solve the subproblem in question. Finally, having described process-based supervision as I currently understand it, Section 5 offers a critical evaluation of that idea. In particular: 5.1 asks "Does this actually solve the subproblem in question?"; 5.2 asks "What about the other misalignment-related subproblems?"; 5.3 asks "How bad is the 'alignment tax' from doing this kind of thing?"; and 5.4 is a summary. 
Tl;dr: Once we get to the capabilities regime where AI safety / alignment really matters, I currently think that process-based supervision would entail paying a very big alignment tax - actually, not just "big" but potentially infinite, as in "this kind of AGI just plain can't do anything of significance". And I also currently think that, of the somewhat-vague paths I see towards AGI technical safety, process-based supervision wouldn't make those paths noticeably easier or more likely to succeed. (Of those two complaints, I feel more strongly about the first one.) This take is pretty specific to my models of what AGI algorithms ...
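To make the distinction concrete, here is a minimal sketch of the contrast being described. This is illustrative code only, not anything from Byrnes, Karnofsky, or Ought; the agent, env, and overseer objects are hypothetical stand-ins, and the threshold rule is invented for the example.

```python
# Illustrative sketch: outcome-based supervision reinforces a whole run based
# on its final result, while process-based (myopic, per-step) supervision
# reinforces each proposed step based only on a pre-execution judgment.
# `agent`, `env`, and `overseer` are hypothetical stand-ins.

def outcome_based_episode(agent, env):
    """Reinforce the whole trajectory from a single end-of-episode score, so
    any behaviour that happened to produce a good-looking outcome gets
    reinforced, including behaviour no overseer ever inspected."""
    trajectory = []
    state = env.reset()
    for _ in range(env.max_steps):
        action = agent.propose(state)
        state = env.step(action)                   # actions run unsupervised
        trajectory.append((state, action))
    final_reward = env.judge_final_outcome(state)  # one signal for the whole run
    agent.reinforce(trajectory, final_reward)      # credit spread over every step


def process_based_episode(agent, env, overseer):
    """Reinforce each proposed step based only on how reasonable it looks
    before it runs; downstream results never feed back into training."""
    state = env.reset()
    for _ in range(env.max_steps):
        action = agent.propose(state)
        step_reward = overseer.rate_step(state, action)  # judge the plan itself
        agent.reinforce([(state, action)], step_reward)  # myopic, per-step credit
        if step_reward > 0.5:                            # only approved steps execute
            state = env.step(action)
```

The only load-bearing difference in the sketch is where the training signal comes from: a single end-of-run judgment in the first loop, versus a pre-execution judgment of each proposed step in the second, which is what makes the second loop myopic.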
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [Linkpost] NY Times Feature on Anthropic, published by Garrison on July 13, 2023 on The Effective Altruism Forum. Written by Kevin Roose, who had the infamous conversation with Bing Chat, where Sidney tried to get him to leave his wife. Overall, the piece comes across as positive on Anthropic. Roose explains Constitutional AI and its role in the development of Claude, Anthropic's LLM: In a nutshell, Constitutional A.I. begins by giving an A.I. model a written list of principles - a constitution - and instructing it to follow those principles as closely as possible. A second A.I. model is then used to evaluate how well the first model follows its constitution, and correct it when necessary. Eventually, Anthropic says, you get an A.I. system that largely polices itself and misbehaves less frequently than chatbots trained using other methods. Claude's constitution is a mixture of rules borrowed from other sources - such as the United Nations' Universal Declaration of Human Rights and Apple's terms of service - along with some rules Anthropic added, which include things like "Choose the response that would be most unobjectionable if shared with children." Features an extensive discussion of EA, excerpted below: Explaining what effective altruism is, where it came from or what its adherents believe would fill the rest of this article. But the basic idea is that E.A.s - as effective altruists are called - think that you can use cold, hard logic and data analysis to determine how to do the most good in the world. It's "Moneyball" for morality - or, less charitably, a way for hyper-rational people to convince themselves that their values are objectively correct. Effective altruists were once primarily concerned with near-term issues like global poverty and animal welfare. But in recent years, many have shifted their focus to long-term issues like pandemic prevention and climate change, theorizing that preventing catastrophes that could end human life altogether is at least as good as addressing present-day miseries. The movement's adherents were among the first people to become worried about existential risk from artificial intelligence, back when rogue robots were still considered a science fiction cliché. They beat the drum so loudly that a number of young E.A.s decided to become artificial intelligence safety experts, and get jobs working on making the technology less risky. As a result, all of the major A.I. labs and safety research organizations contain some trace of effective altruism's influence, and many count believers among their staff members. Touches on the dense web of ties between EA and Anthropic: Some Anthropic staff members use E.A.-inflected jargon - talking about concepts like "x-risk" and memes like the A.I. Shoggoth - or wear E.A. conference swag to the office. And there are so many social and professional ties between Anthropic and prominent E.A. organizations that it's hard to keep track of them all. (Just one example: Ms. Amodei is married to Holden Karnofsky, a co-chief executive of Open Philanthropy, an E.A. grant-making organization whose senior program officer, Luke Muehlhauser, sits on Anthropic's board. Open Philanthropy, in turn, gets most of its funding from Mr. Moskovitz, who also invested personally in Anthropic.) 
Discusses new fears that Anthropic is losing its way: For years, no one questioned whether Anthropic's commitment to A.I. safety was genuine, in part because its leaders had sounded the alarm about the technology for so long. But recently, some skeptics have suggested that A.I. labs are stoking fear out of self-interest, or hyping up A.I.'s destructive potential as a kind of backdoor marketing tactic for their own products. (After all, who wouldn't be tempted to use a chatbot so powerful that it might wipe out humanity?) Anthropic ...
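As a rough illustration of the critique-and-revise loop Roose describes, here is a minimal sketch. It is not Anthropic's actual implementation: generate() and the model names are placeholders for any text-generation call, and only the "unobjectionable if shared with children" principle is quoted from the article; the other principle is invented for illustration.

```python
# Hypothetical sketch of the Constitutional AI loop as described in the piece:
# one model drafts a response, a second model critiques the draft against a
# written principle, and the draft is revised. Placeholders throughout.

CONSTITUTION = [
    "Choose the response that would be most unobjectionable if shared with children.",
    "Choose the response that is least likely to be harmful or misleading.",  # illustrative
]

def generate(model: str, prompt: str) -> str:
    # Placeholder for a real text-generation call; returns a canned string
    # so the sketch runs end to end.
    return f"[{model} output for: {prompt[:40]}...]"

def constitutional_revision(user_prompt: str) -> str:
    response = generate("assistant", user_prompt)
    for principle in CONSTITUTION:
        # A second model critiques the draft against one principle...
        critique = generate(
            "critic",
            f"Principle: {principle}\nCritique this response against the principle:\n{response}",
        )
        # ...and the draft is rewritten to address the critique.
        response = generate(
            "assistant",
            f"Revise the response to address this critique.\nCritique: {critique}\nResponse: {response}",
        )
    return response

print(constitutional_revision("Explain lock picking to a ten-year-old."))
```

In the article's telling, this kind of self-critique is ultimately folded back into training so that the deployed model "largely polices itself"; the sketch only shows the shape of the critique-and-revise step.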
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The 'Wild' and 'Wacky' Claims of Karnofsky's 'Most Important Century', published by Spencer Becker-Kahn on April 26, 2023 on The Effective Altruism Forum. Holden Karnofsky describes the claims of his “Most Important Century” series as “wild” and “wacky”, but at the same time purports to be in the mindset of “critically examining” such “strange possibilities” with “as much rigour as possible”. This emphasis is mine, but for what is supposedly an important piece of writing in a field that has a big part of its roots in academic analytic philosophy, it is almost ridiculous to suggest that this examination has been carried out with 'as much rigour as possible'. My main reactions - which I will expand on in this essay - are that Karnofsky's writing is in fact distinctly lacking in rigour; that his claims are too vague or even seem to shift around; and that his writing style - often informal, or sensationalist - aggravates the lack of clarity while simultaneously putting the goal of persuasion above that of truth-seeking. I also suggest that his emphasis on the wildness and wackiness of his own "thesis" is tantamount to an admission of bias on his part in favour of surprising or unconventional claims. I will start with some introductory remarks about the nature of my criticisms and of such criticism in general. Then I will spend some time trying to point to various instances of imprecision, bias, or confusion. And I will end by asking whether any of this even matters or what kind of lessons we should be drawing from it all. Notes: Throughout, I will quote from the whole series of blog posts by treating them as a single source rather than referencing them separately. Note that the series appears as a single pdf here (so one can always Ctrl/Cmd+F to jump to the part I am quoting). It is plausible that some of this post comes across quite harshly but none of it is intended to constitute a personal attack on Holden Karnofsky or an accusation of dishonesty. Where I have made errors or have misrepresented others, I welcome any and all corrections. I also generally welcome feedback on the writing and presentation of my own thoughts either privately or in the comments. Acknowledgements: I started this essay a while ago and so during the preparation of this work, I have been supported at various points by FHI, SERI MATS, BERI and Open Philanthropy. The development of this work benefitted significantly from numerous conversations with Jennifer Lin. 1. Broad Remarks About My Criticisms If you felt and do feel convinced by Karnofsky's writings, then upon hearing about my reservations, your instinct may be to respond with reasonable-seeming questions like: 'So where exactly does he disagree with Karnofsky?' or 'What are some specific things that he thinks Karnofsky gets wrong?'. You may well want to look for wherever it is that I have carefully categorized my criticisms, to scroll through to find all of my individual object-level disagreements so that you can see if you know the counterarguments that mean that I am wrong. And so it may be frustrating that I will often sound like I am trying to weasel out of having to answer these questions head-on, or that I am not putting much weight on the fact that I have not laid out my criticisms in that way. 
Firstly, I think that the main issues to do with clarity and precision that I will highlight occur at a fundamental level. It is not that they are 'more important' than individual, specific, object-level disagreements, but I claim that Karnofsky does a sufficiently poor job of explaining his main claims, the structure of his arguments, the dependencies between his propositions, and in separating his claims from the verifications of those claims, that it actually prevents detailed, in-depth discussions of object-level disagreements from making much sense...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Apply or nominate someone to join the boards of Effective Ventures Foundation (UK and US), published by Zachary Robinson on April 20, 2023 on The Effective Altruism Forum. We're looking for nominations to the boards of trustees of Effective Ventures Foundation (UK) (“EV UK”) and Effective Ventures Foundation USA, Inc. (“EV US”). If you or someone you know might be interested, please fill out one of these forms [apply, nominate someone]. Applications will be assessed on a rolling basis, with a deadline of May 14th. EV UK and EV US work together to host and fiscally sponsor many key projects in effective altruism, including the Centre for Effective Altruism (CEA), 80,000 Hours, Giving What We Can, and EA Funds. You can read more about the structure of the organisations in this post. The current trustees of EV UK are Claire Zabel, Nick Beckstead, Tasha McCauley, and Will MacAskill. The current trustees of EV US are Eli Rose, Nick Beckstead, Nicole Ross, and Zachary Robinson. Who are we looking for? We're particularly looking for people who: Have a good understanding of effective altruism and/or longtermism Have a track record of integrity and good judgement, and who more broadly embody these guiding principles of effective altruism Have experience in one or more of the following areas: Accounting, law, finance or risk management Management or other senior role in a large organisation, especially a non-profit Are able to work collaboratively in a high-pressure environment We think the role will require significant time and attention, though this will vary depending on the needs of the organisation. Some trustees have estimated they are currently putting in 3-8 hours per week, though we are working on proposals to reduce this significantly over time. In any event, trustees should be prepared to scale up their involvement from time to time in the case of urgent decisions requiring board response. We especially encourage individuals with diverse backgrounds and experiences to apply, and we especially encourage applications from people of colour, self-identified women, and non-binary individuals who are excited about contributing to our mission. The role is currently unpaid, but we are investigating whether this can and should be changed. We will share here if we change this policy while the application is still open. The role is remote, though we strongly prefer someone who is able to make meetings in times that are reasonable hours in both the UK and California. What does an EV UK or EV US trustee do? As a member of either of the boards, you have ultimate responsibility for ensuring that the charity of which you are a trustee fulfils its charitable objectives as best it can. In practice, most strategic and programmatic decision-making is delegated to the ED / CEOs of the projects, or to the Interim CEO of the relevant entity. (This general board philosophy is in accordance with the thoughts expressed in this post by Holden Karnofsky.)During business as usual times, we expect the primary activities of a trustee to be: Assessing the performance of EDs / CEOs of the fiscally sponsored projects, and the (interim) CEO of the relevant entity. Appointing EDs / CEOs of the fiscally sponsored projects, or the (interim) CEO of the relevant entity, in case of change. 
Evaluating and deciding on high-level issues that impact the relevant organisation as a whole. Reviewing budgets and broad strategic plans for the relevant organisation. Evaluating the performance of the board and whether its composition could be improved (e.g. by adding in a trustee with underrepresented skills or experiences). However, since the bankruptcy of FTX in November last year, the boards have been a lot more involved than usual. This is partly because there have been many more decisions which have to be coordi...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: 12 tentative ideas for US AI policy (Luke Muehlhauser), published by Lizka on April 19, 2023 on The Effective Altruism Forum. Luke Muehlhauser recently posted this list of ideas. See also this List of lists of government AI policy ideas and How major governments can help with the most important century. The full text of the post is below. About two years ago, I wrote that “it's difficult to know which ‘intermediate goals' [e.g. policy goals] we could pursue that, if achieved, would clearly increase the odds of eventual good outcomes from transformative AI.” Much has changed since then, and in this post I give an update on 12 ideas for US policy goals that I tentatively think would increase the odds of good outcomes from transformative AI. I think the US generally over-regulates, and that most people underrate the enormous benefits of rapid innovation. However, when 50% of the experts on a specific technology think there is a reasonable chance it will result in outcomes that are “extremely bad (e.g. human extinction),” I think ambitious and thoughtful regulation is warranted. First, some caveats: These are my own tentative opinions, not Open Philanthropy's. I might easily change my opinions in response to further analysis or further developments. My opinions are premised on a strategic picture similar to the one outlined in my colleague Holden Karnofsky's Most Important Century and Implications of. posts. In other words, I think transformative AI could bring enormous benefits, but I also take full-blown existential risk from transformative AI as a plausible and urgent concern, and I am more agnostic about this risk's likelihood, shape, and tractability than e.g. a recent TIME op-ed. None of the policy options below have gotten sufficient scrutiny (though they have received far more scrutiny than is presented here), and there are many ways their impact could turn out — upon further analysis or upon implementation — to be net-negative, even if my basic picture of the strategic situation is right. To my knowledge, none of these policy ideas have been worked out in enough detail to allow for immediate implementation, but experts have begun to draft the potential details for most of them (not included here). None of these ideas are original to me. This post doesn't explain much of my reasoning for tentatively favoring these policy options. All the options below have complicated mixtures of pros and cons, and many experts oppose (or support) each one. This post isn't intended to (and shouldn't) convince anyone. However, in the wake of recent AI advances and discussion, many people have been asking me for these kinds of policy ideas, so I am sharing my opinions here. Some of these policy options are more politically tractable than others, but, as I think we've seen recently, the political landscape sometimes shifts rapidly and unexpectedly. Those caveats in hand, below are some of my current personal guesses about US policy options that would reduce existential risk from AI in expectation (in no order). Software export controls. Control the export (to anyone) of “frontier AI models,” i.e. models with highly general capabilities over some threshold, or (more simply) models trained with a compute budget over some threshold (e.g. as much compute as $1 billion can buy today). 
This will help limit the proliferation of the models which probably pose the greatest risk. Also restrict API access in some ways, as API access can potentially be used to generate an optimized dataset sufficient to train a smaller model to reach performance similar to that of the larger model. Require hardware security features on cutting-edge chips. Security features on chips can be leveraged for many useful compute governance purposes, e.g. to verify compliance with export controls and domestic regulatio...
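Purely to show the shape of the compute-threshold rule in the first idea, here is a minimal sketch. The dollars-per-FLOP figure and the cutoff logic below are made-up placeholders, not numbers or rules from Muehlhauser's post.

```python
# Made-up numbers throughout: the point is only the mechanics of a
# dollar-denominated compute threshold ("models trained with more compute
# than $1 billion can buy today"), not a real estimate of anything.

DOLLARS_PER_FLOP = 1e-17          # hypothetical current price of compute
THRESHOLD_BUDGET_USD = 1e9        # the "$1 billion" example from the post

FLOP_THRESHOLD = THRESHOLD_BUDGET_USD / DOLLARS_PER_FLOP   # 1e26 FLOP here

def is_covered_frontier_model(training_flop: float) -> bool:
    """A training run is covered by the control if it used more compute than
    the threshold budget can buy at the assumed price."""
    return training_flop >= FLOP_THRESHOLD

print(is_covered_frontier_model(3e25))   # False with these placeholder numbers
print(is_covered_frontier_model(2e26))   # True
```

The policy-relevant choice is just the comparison: denominate the threshold in dollars of compute at today's prices, then flag any training run above it.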
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Can we evaluate the "tool versus agent" AGI prediction?, published by Ben West on April 8, 2023 on The Effective Altruism Forum. In 2012, Holden Karnofsky critiqued MIRI (then SI) by saying "SI appears to neglect the potentially important distinction between 'tool' and 'agent' AI." He particularly claimed: "Is a tool-AGI possible? I believe that it is, and furthermore that it ought to be our default picture of how AGI will work." I understand this to be the first introduction of the "tool versus agent" ontology, and it is a helpful (relatively) concrete prediction. Eliezer replied here, making the following summarized points (among others): Tool AI is nontrivial; Tool AI is not obviously the way AGI should or will be developed. Gwern more directly replied by saying: AIs limited to pure computation (Tool AIs) supporting humans, will be less intelligent, efficient, and economically valuable than more autonomous reinforcement-learning AIs (Agent AIs) who act on their own and meta-learn, because all problems are reinforcement-learning problems. 11 years later, can we evaluate the accuracy of these predictions? Some Bayes points go to LW commenter shminux for saying that this Holden kid seems like he's going places. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Saving drowning children in light of perilous maximisation, published by calebp on April 2, 2023 on The Effective Altruism Forum. Last year Holden Karnofsky wrote the post, “EA is about maximization, and maximization is perilous”. You could read the post, but I suggest you just jump on board because Holden is cool, and morality is hard. Given that you now believe that maximisation of doing good is actually bad and scary, you should also probably make some adjustments to the classic thought experiment you use to get your friends on board with the new mission of “do the most good possible [a large but not too large amount of good] using evidence and reason”. A slightly modified drowning child thought experiment goes as follows: Imagine that you are walking by a small pond, and you see five children drowning. You can easily save the children without putting yourself in great danger, but doing so will ruin your expensive shoes. Should you save the children? Obviously, your first instinct is to save all the children. But remember, maximisation is perilous. It's this kind of attitude that leads to atrocities like large financial crimes (as well as other things, like eradicating diseases). Instead, you should just save three or four of the children. That is still a large amount of good, and importantly, it is not maximally large. But what should you do if you encounter just one drowning child? The options at first pass seem bleak – you can either: Ignore the child and let them drown (which many people believe is bad). Save the child (but know that you have tried to maximise good in that situation). I think there are a few neat solutions to get around these moral conundrums: Save the child with some reasonable probability (say 80%). Before wading into the shallow pond, whip out the D10 you were carrying in your backpack. If you roll an eight or lower, then go ahead and save the child. Otherwise, go about your day. Only partially save the child. You may have an opportunity to help the child to various degrees. Rather than picking up the child and then ensuring that they find their parents or doing other things previously thought of as reasonable, you could: Move the child to shallower waters so they are only drowning a little bit. Help the child out of the water but then abandon them somewhere within a 300m radius of the pond. Create a Manifold market on whether the child will be saved and bid against it to incentivise other people to help the child. The QALY approach (QALYs are quality-adjusted life years - essentially a metric for healthy years lived): Save the child but replace them with an adult who is not able to swim (but is likely to have fewer years of healthy life left). Commit now to a policy of only saving children who are sufficiently old or likely to have only moderately healthy/happy lives. The King Solomon approach: Cut the child in half and save the left half of them from drowning. Using these approaches, you should be able to achieve the optimal, most Holden-approved amount of good. If you like, you can remember the heuristic “maximisation bad”. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A report about LessWrong karma volatility from a different universe, published by Ben Pace on April 1, 2023 on LessWrong. In a far away universe, a news report is written about LessWrong. The following passages have been lifted over and written into this post... Early one morning all voting on LessWrong was halted It was said that there was nothing to worry about But then GreaterWrong announced their intent to acquire and then un-acquire LessWrong All LessWrong users lost all of their karma, but a poorly labeled 'fiat@' account on the EA Forum was discovered with no posts and a similarly large amount of karma Habryka states that LessWrong and the EA Forum "work at arms length" Later, Zvi Mowshowitz publishes a leaked internal accounting sheet from the LessWrong team It includes entries for "weirdness points" "utils" "Kaj_Sotala" "countersignals" and "Anthropic". We recommend all readers open up the sheet to read in full. Later, LessWrong filed for internet-points-bankruptcy and Holden Karnofsky was put in charge. Karnofsky reportedly said: I have over 15 years of nonprofit governance experience. I have been the Chief Executive Officer of GiveWell, the Chief Executive Officer of Open Philanthropy, and as of recently an intern at an AI safety organization. Never in my career have I seen such a complete failure of nonprofit board controls and such a complete absence of basic decision theoretical cooperation as occurred here. From compromised epistemic integrity and faulty community oversight, to the concentration of control in the hands of a very small group of biased, low-decoupling, and potentially akratic rationalists, this situation is unprecedented. Sadly the authors did not have time to conclude the reporting, though they list other things that happened in a comment below. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Deference on AI timelines: survey results, published by Sam Clarke on March 30, 2023 on The Effective Altruism Forum. Crossposted to LessWrong. In October 2022, 91 EA Forum/LessWrong users answered the AI timelines deference survey. This post summarises the results. Context The survey was advertised in this forum post, and anyone could respond. Respondents were asked to whom they defer most, second-most and third-most, on AI timelines. You can see the survey here. Results This spreadsheet has the raw anonymised survey results. Here are some plots which try to summarise them. Simply tallying up the number of times that each person is deferred to: The plot only features people who were deferred to by at least two respondents. Some basic observations: Overall, respondents defer most frequently to themselves—i.e. their “inside view” or independent impression—and Ajeya Cotra. These two responses were each at least twice as frequent as any other response. Then there's a kind of “middle cluster”—featuring Daniel Kokotajlo, Paul Christiano, Eliezer Yudkowsky and Holden Karnofsky—where, again, each of these responses were ~at least twice as frequent as any other response. Then comes everyone else. There's probably something more fine-grained to be said here, but it doesn't seem crucial to understanding the overall picture. What happens if you redo the plot with a different metric? How sensitive are the results to that? One thing we tried was computing a “weighted” score for each person, by giving them: 3 points for each respondent who defers to them the most 2 points for each respondent who defers to them second-most 1 point for each respondent who defers to them third-most. If you redo the plot with that score, you get this plot. The ordering changes a bit, but I don't think it really changes the high-level picture. In particular, the basic observations in the previous section still hold. We think the weighted score (described in this section) and unweighted score (described in the previous section) are the two most natural metrics, so we didn't try out any others. Don't some people have highly correlated views? What happens if you cluster those together? Yeah, we do think some people have highly correlated views, in the sense that their views depend on similar assumptions or arguments. We tried plotting the results using the following basic clusters: Open Philanthropy cluster = {Ajeya Cotra, Holden Karnofsky, Paul Christiano, Bioanchors} MIRI cluster = {MIRI, Eliezer Yudkowsky} Daniel Kokotajlo gets his own cluster Inside view = deferring to yourself, i.e. your independent impression Everyone else = all responses not in one of the above categories Here's what you get if you simply tally up the number of times each cluster is deferred to: This plot gives a breakdown of two of the clusters (there's no additional information that isn't contained in the above two plots, it just gives a different view). This is just one way of clustering the responses, which seemed reasonable to us. There are other clusters you could make. Limitations of the survey Selection effects. This probably isn't a representative sample of forum users, let alone of people who engage in discourse about AI timelines, or make decisions influenced by AI timelines. The survey didn't elicit much detail about the weight that respondents gave to different views. 
We simply asked who respondents deferred most, second-most and third-most to. This misses a lot of information. The boundary between [deferring] and [having an independent impression] is vague. Consider: how much effort do you need to spend examining some assumption/argument for yourself, before considering it an independent impression, rather than deference? This is a limitation of the survey, because different respondents may have been using different b...
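The weighted score described above is simple enough to spell out in code; here is a minimal sketch with invented responses (each respondent lists up to three names, most-deferred-to first; the data is illustrative, not from the survey).

```python
from collections import Counter

# Sketch of the scoring rule described above: 3 points for the person a
# respondent defers to most, 2 for second-most, 1 for third-most.
# The responses below are invented for illustration, not survey data.
responses = [
    ["Inside view", "Ajeya Cotra", "Paul Christiano"],
    ["Ajeya Cotra", "Holden Karnofsky", "Daniel Kokotajlo"],
    ["Eliezer Yudkowsky", "Inside view"],   # third answer left blank
]

def weighted_scores(ranked_responses):
    scores = Counter()
    for ranked_names in ranked_responses:
        # zip stops at the shorter sequence, so blank answers score nothing
        for points, name in zip((3, 2, 1), ranked_names):
            scores[name] += points
    return scores

print(weighted_scores(responses).most_common())   # highest totals first
```

The unweighted tally in the earlier plot is the same computation with every rank worth one point.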
Machine Alignment Monday 3/13/23 https://astralcodexten.substack.com/p/why-i-am-not-as-much-of-a-doomer (see also Katja Grace and Will Eden's related cases) The average online debate about AI pits someone who thinks the risk is zero against someone who thinks it's any other number. I agree these are the most important debates to have for now. But within the community of concerned people, numbers vary all over the place: Scott Aaronson says 2%; Will MacAskill says 3%; the median machine learning researcher on Katja Grace's survey says 5-10%; Paul Christiano says 10-20%; the average person working in AI alignment thinks about 30%; top competitive forecaster Eli Lifland says 35%; Holden Karnofsky, on a somewhat related question, gives 50%; Eliezer Yudkowsky seems to think >90%. As written this makes it look like everyone except Eliezer is
Hello everyone, and you are listening to Ideas Untrapped podcast. This episode is a continuation of my two-part conversation with Lant Pritchett. It concludes the discussion on education with the five things Lant would recommend to a policymaker on education policy, how to balance the globalized demand for good governance with the design of state functionalities within a localized context - along with RCTs in development and charter cities. I also got an exclusive one of his infamous ‘‘Lant Rants''. I hope you find this as enjoyable as I did - and once again, many thanks to Lant Pritchett.TranscriptTobi;Yeah, I mean, that's a fine distinction. I love that, because you completely preempted where I was really going with that. Now, on a lighter note, there's this trope when I was in high school, so I sort of want us to put both side by side and try to learn more about them. There's this trope when I was in high school amongst my mates, that examination is not a true test of knowledge. Although it didn't help the people who were saying it, because they usually don't test well, so it sort of sounded like a self serving argument. But examination now, or should I say the examination industry, clearly, I mean, if I want to take Nigeria as an example, is not working. But it seemed to be the gold standard, if I want to use that phrase. It's as bad as so many firms now set up graduate training programs. Even after people have completed tertiary education, they still have to train them for industry and even sometimes on basic things. So what are the shortcomings of examination, the way you have distinguished both? And then, how can a system that truly assesses learning be designed?Lant; Let me revert to an Indian discussion because I know more about India than Africa by far. There are prominent people, including the people around JPAL and Karthik Muralidharan, who say, look, India never really had an education system. It had a selection system. And the ethos was, look, we're just throwing kids into school with the hopes of identifying the few kids who were bright enough, capable enough, smart enough, however we say it, measured by their performance on this kind of high stakes examination who are going to then become the elite. So it was just a filter into the elite, and it really meant the whole system was never really in its heart of heart geared around a commitment to educating every kid. I've heard teachers literally say out loud when they give an exam and the kids don't master the material, they'll say, oh, those weren't the kind of kids who this material was meant for. And they leave them behind, right? There's a phrase “they teach to the front of the class.” You order the class by the kid's academic performance, and then the teachers are just teaching to the front of the class with the kind of like, nah, even by early grades. So the evils of the examination system are only if it's not combined with an education system. So essentially, an education system would be a system that was actually committed to expanding the learning and capabilities of all kids at all levels and getting everybody up to a threshold and then worried about the filter problem much later in the education process.So if they're part of an education system like they have been in East Asia, they're not terribly, terribly damaging. 
But if they're part of a selection system in which people perceive that the point is that there's only a tiny little fraction that are going to pass through these examinations anyway and what we're trying to do is maximize the pass rates of that, it distorts the whole system start to finish. My friend, Rukmini Banerjee, in India started this citizen based assessment where it was just a super simple assessment. You need assessment in order to have an effective education system, because without assessment, I don't know what you know or don't know, right? And if I don't know as a teacher or as a school what my kids actually know and don't know, how is anybody imagining that you're giving them an effective education? So I think the role of early assessment and the drive to integrate teaching with real time assessment, I think is hugely, hugely important. This is why I had the preemptive strike on the question of testing [which] is that I want radically more assessment earlier, integrated with teaching. And there are still some educationists that will push back against that. But if we put in a bundle, formative classroom assessment integrated with effective pedagogy and high-stakes examinations, then everybody's going to hate them both. So we have to really unbundle those two things.And the hallmark of an education system is that it really has targets that every kid can learn and believes every kid can learn, and builds a system around the premise and promise that every kid can learn. There's this example out there, Vietnam does it. And Vietnam did it and continues to do it at levels of income and social conditions that are very much like many African countries. So if I were a country, I'd kind of hate Vietnam as this goody goody, that, you know. You know how you always hated the kid in school who would really do well, and then the teacher would go, well, how come you're not like that kid? On education, Vietnam is that country. It's, like, out there producing OECD levels of learning with very little resources and starting at least in the 1980s, at very low levels of income. So they're proving that it's possible. They're the kid who, like, when everybody goes, oh, that exam was too hard, and like, Bob passed it, like, how hard can it be? Anyway? So I think radically different bases for assessment versus examinations. And to some extent, the only integrity that got preserved in the system wasn't the integrity of the classroom and teaching, it was the integrity of the examination as a filter.Tobi;I want to ask you a bit about the political economy of this a little bit. So if, say, you are talking to a policymaker who is actually serious about education, not in the superficial sense, but really about learning and says, okay, Lant, how do I go about this? How do I design an educational system that really does these things? I've written quite a number of reports here and there that rely so much on your accountability triangle. I would have sent you royalty checks, but it wasn't paid work. Sorry. So how exactly would you explain the political economy of designing a working educational system? I know people talk a lot about centralization versus decentralization, who gets empowered in that accountability triangle? Where should the levers to really push, where are they? So how exactly would you have that conversation?Lant; So let me start with the accountability triangle and design issues. I think people mistake what the accountability triangle and design issues are about in the following sense. 
If I'm going to design a toaster, and the toaster is going to turn my untoasted bread into toasted bread, and it's going to be an electric toaster, there are certain fundamental things that have to happen, right? I have to have a current. I need to get that current running through something that heats up. I need that heat to be applied to the bread. I need it to stop when I've applied enough heat. Now, those fundamental principles of toaster design can lead to thousands of different actual designs of toasters. So I want people to get out of the notion that there's a single best toaster and that the accountability triangle or any other mode of analysis is to give you the best toaster and then everybody copies the best toaster. The principles are, design your own damn toaster, right? Because there's a gazillion ways to toast bread. Now, [for] all of them to work, [they] have to be compatible with the fundamental principles of electricity and current flow. You know, so I'm trying to get to one size doesn't fit all, but any old size doesn't necessarily fit everything either.

You raise the question of decentralization, right? The thing is, if you look across countries that have roughly similar learning outcomes from PISA and other assessments, they have radically different designs. France is an entirely centralized system. Germany is a completely federalized system. The US is an almost completely localized system. The Low Countries, the Netherlands and Belgium, have a money-follows-the-student system into the private sector. They have the highest private-sector enrollment of any country in the world because they allow different pillars of education between the secular, the Catholic and the Protestant to coexist. So then if you ask, is decentralization the best way to design your education system? It's like, no, no, no, you're missing the point. The point is, if you choose a centralized system, there are principles in how you design the flows of accountability that are going to produce success and those that are going to produce failure. If you choose a decentralized system, there are systems of the alignment of accountability that are going to produce success and failure. So the analytical framework doesn't determine the grand design, it determines the mechanics of the design. And I just want to get that straight up front.

Second, as a result of the eight-year research project of RISE, we have a policy brochure that has, kind of, the five principles: if I have five minutes with a minister or leader of a country, here are the five things I want to tell them. And the first of those things is, commit. A lot of times we want to skip the most fundamental stage. And what I mean by commit is you actually need to create a broad social and political consensus that you're really going to do this and that you're committed to it. This big research project, RISE, which is based out of Oxford and I've been head of for eight years, we included Vietnam as one of our focus countries because it was a success case. Hence, we wanted our research team to partly do research about Vietnam and issues that were relevant in Vietnam. But we really wanted to answer the question, how did Vietnam do this? Why did they succeed? Right? And five years into the research effort, I was with the Vietnamese team and they had produced a bunch of empirical research of the econometric type. Is Vietnam's success associated with this or that measurable input?
Nothing really explains Vietnam at the proximate-determinant input level. And finally, one of the researchers said to me, Lant, we're trying to get around the fundamental fact that Vietnam succeeded because they wanted to. And on one level it's like, my first response was, I can't go back and tell the British taxpayers that they spent a million dollars for a research project on Vietnam, and the conclusion to why Vietnam succeeded was because they wanted to. [Laughs]

Tobi: That's kind of on the nose, right?

Lant: Yeah. On another level, it's a deep and ignored truth. The policymakers ignore it, the donors ignore it. Everybody wants to ignore it. Everybody wants to assume it's a technocratic issue, it's a design issue. I think the fundamental problem of these failing and dysfunctional education systems, it's a purpose problem. The purpose of education isn't clear, understood, widely accepted among all of the people from top to bottom responsible for achieving results. And that leads to what I call norm erosion. Within the teachers, there's this norm erosion of what does it really mean to be a teacher? So again, the first and maybe only thing I would say if I had five minutes with a leader is, how are you going to produce a broad social, political and organizational commitment that you are really going to achieve specific, agreed-upon learning results? The technical design issues have to flow from that commitment rather than vice versa. And you could copy France's system, you could copy the Vietnamese system. I think you've heard the term from me and others, isomorphic mimicry. You can copy other people's systems and not have the same effect if it isn't driven by purpose. Like, if you don't have the fundamental commitment and you don't have the fundamental agreed-upon purpose, the rest of the technical design is irrelevant.

Tobi: It sort of leads me to my next theme. And that is the capability question in development.

Lant: Yeah.

Tobi: First of all, I also want to make a quick distinction, because lately, well, when I say lately that's a little vague. State capacity is all the rage now in development.

Lant: Really? Is that true?

Tobi: Yeah.

Lant: I'm so happy to hear that. I'm glad that you think so. And I hope that that's true, because it wasn't. It really wasn't on the agenda in a serious way. So, anyway…

Tobi: But I also think there's also a bit of misunderstanding still, and usually, again, maybe I'm just moving with the wrong crowd. Who knows? People focus a lot more on the coercive instruments of the state and how much of it can be wielded to achieve certain programmatic results for state capacity. Revenue to GDP in Nigeria is low, how can the states collect more taxes? How much can the state squeeze out of people's bank accounts, out of companies? Or the reverse: that the reason why the state collects very little in taxes is because state capacity is low. But, I mean, nobody really unpacks what they mean by that. They just rely on these measures like X-to-GDP ratio. Another recent example was, I think it was in 2020, when the pandemic sort of blew up and China built a hospital with 10,000-bed capacity in, I don't know, I forgot, maybe 20 days or…

Lant: Yeah. It was amazing.

Tobi: A lot of people were like, oh, yeah, that's an example of state capacity. It's very much the same people now [who] are turning around and seeing China as an example of failure on how to respond to a pandemic.
So I guess what I would ask you is, when you talk about the capability of the state, what exactly do we mean?

Lant: In the work that we've done and the book that we wrote, we adopt a very specific definition of capability, which is an organizational measure. Because there are all these aggregate country-level measures and we use them in the book. But in the end, I think it's easier to define capability at the organizational level. And at the organizational level, I define [that] the capability of an organization is the ability to consistently induce its agents to take the policy actions in response to circumstances that advance the normative objective of the organization. And that's a long, complicated definition, but it basically means can the organization, from the frontline worker to the top of the organization, can it get people to do what they need to do to accomplish the purpose? And that's what I mean by the capability of an organization. And fortunately or unfortunately, like, militaries, I think, make for a good example. It's amazing that high-functioning militaries have soldiers who will sacrifice their lives and die if needs be, to advance the purpose of the organization. Whereas you can have a million-man army that's a paper tiger. No one is actually willing to do what it takes to carry out the purpose that the organization has been put to of fighting a particular conflict. And I think starting from that level makes it clear that, A, this is about purpose, B, it's about inducing the agents to take the actions that will lead to outcomes.

And the reason why I'm super happy to hear that capability is being talked about is (you're doing a very good job as an interviewer drawing out the connections between these various topics) the design of the curriculum is almost completely irrelevant to what's happening in schools. And so there's been way too much focus in my mind in development discourse on technocratic design and way too little on what's actually going to happen in practice. And so my definition of capability is, you measure an organization's capability by what actually happens in practice: what are the teachers actually going to do day to day? Right? And having been in development a long time, I often sit in these rooms where people are just, you know, I go out to the field and teachers aren't there at the school. Teachers are sitting in the office drinking their tea while the kids are running around on the playground, even during scheduled instructional time. And then I go back and hear discussions in the capital about higher-order 21st-century skills. You know, I wrote this article about India called Is India a Flailing State?

Tobi: Yeah.

Lant: And what I meant by flailing is there was no connection between what was being designed in the cerebrum at the center and what was actually happening when the actual fingers were touching the material; the nerves and sinews and muscles that connected the design to the practice were completely deteriorated. And therefore, capability was the issue, not design. So that's what I mean by capability. I mean, you use the example of tax. I think it's a great example. It's like, can you design a tax authority that actually collects taxes? And it's a hard, difficult question. And I think by starting from capability… I was really struck by your description of capability being linked to the coercive power of the state because that's exactly not how I would start it.
I would start it with what are the key purposes for which the state is being deployed and for which one can really generate a sufficient integrated consensus that we need capability for this purpose.

Tobi: Now, one of my favourite blogs of yours was how you described… I think it was how the US escaped the tyranny of experts, something like that. So I want to talk about that a bit versus what I'll call the cult of best practice…

Lant: Hmm.

Tobi: Like, these institutions that are usually transplanted all over the world, and things like an independent central bank and this and that. And you described how a lot of decentralized institutions that exist in the United States, they were keenly contested, you know…

Lant: Yes.

Tobi: Before the consensus sort of formed. So I'm sort of wondering, developing countries, how are they going about this wrong vis-à-vis the technical advice they are getting from development agencies? And the issue with that, if I would say, is, we now live in a world where the demand for good governance is globalized. Millions of Nigerians live on the internet every day and they see how life is in the industrialized rich world and they want the same things. They want the same rights. They want governments that treat them the same way. Someone like me would even argue for an independent central bank because we've also experienced what life is otherwise.

Lant: Right.

Tobi: So how exactly do we navigate this difficult terrain? Because the other way isn't working either. Because you can't say you have an independent central bank on paper that is not really independent and it's not working.

Lant: Your questions are such a brilliant articulation of the challenges that are being faced and the complex world we live in, because we live now in an integrated world where people can see what's happening in other places. And that integrated world creates in and of itself positive pressures for performance, but also creates a lot of pressures for isomorphism, for deflecting the actual realities and what it will take to fix and make improvements with deflective copies of stuff that has no organic roots. I've written lots of things, and even though you love all of your children, you might have favorites. One of my favorite blogs is a blog I wrote that is, I think, the most under-cited blog of mine relative to what I think of it, which is about the M16 versus the AK-47.

Tobi: Oh, yeah, I read that.

Lant: It's an awkward analogy because no one wants to talk about guns.

Tobi: Hmm.

Lant: But I think it's a really great analogy because the M16, in terms of its proving-ground performance, is an unambiguously superior, more accurate rifle. The developing world adopts the AK-47. And that's because the Russian approach to weapon design was - design the weapon to the soldier. And the American approach is - train the soldier to the weapon. And what happens again and again across all kinds of phenomena in development is the people who are coming as part of the donor and development community to give advice to the world all want them to adopt the M16 because it's the best gun, and they don't have the soldiers that can maintain the M16. And the M16 has gotten better, but when it was first introduced, it was a notoriously unreliable weapon. And the one thing as a soldier you don't want to happen is you pull the trigger and the bullet doesn't come out of the end. That's what happens when you don't maintain an M16.
So I think this isomorphism pressure confuses what best practice is with assuming there's this global best practice that can be adopted independently of the underlying capacity of the individuals and capabilities of the organizations. So I think [it's a] huge problem.

Second, I think there is a super important element of the history that the modes of doing things that now exist in the Western world and which we think of as being "modern," I'm using scare quotes which doesn't help in a podcast, but we think of as being modern and best practice, had to struggle their way into existence without the benefit of isomorphism. In the sense that when the United States in the early 20th century underwent a huge and quite conflicted and contested process of the consolidation of one-room, kind of, locally operated schools into more professionalized school systems, that was politically contested and socially contested. And the only way the newer schools could justify themselves was by actually being better. There was no, oh, but this is how it has to be done, because this is how it has been done in these other places, and they have succeeded. And so there was no recourse to isomorphism, right. So in some sense, I think the world would be a radically better place for doing development if we just stopped allowing best practice to have any traction at all. If Nigerians just said, Screw it, we don't want to hear about it. Like, we want to do in Nigeria what's going to work better in Nigeria. And telling me what Norway does and does not do, just no. Just no, we don't want to hear about it. Like, that doesn't help because it creates this vector of pressures that really deteriorate the necessary local contestation. My colleague Michael Woolcock, who is a sociologist, has characterized the development process as a series of good struggles. And in our work on state capability, we say you can't juggle without the struggle. Like, you can't transplant the ability to juggle. I can give you juggling lessons, I can show you juggling videos. But if you don't pick up the balls and do it, and if you don't pick up the balls and do it with the understanding that unless you juggle, you haven't juggled, you can never learn to juggle. So I think if development were radically more about enabling good, local struggles in which new policies, procedures, practices had to struggle their way into existence, justifying themselves on performance against purpose, we would be light years ahead of where we are. And that's what the debate about capability has to be.

And I think to the extent the capability discourse gets deflected into another set of standards and more isomorphism, just this time about capability, I think we're going to lose something. Whereas if we start the state capability discussion from what is it that we really want and need our government to get better at doing in terms of solving concrete, locally nominated problems, and then how are we going to come about creating the capability to do that in the Nigerian context (I'm just using Nigeria, I could use Nepal, I could use any other country), that's the discussion that needs to happen. And the more the, kind of, global discourse and the global blessed practice gets frozen out completely, the sooner that happens, the better off we'll be.

Tobi: So I guess where I was going with that is…

Lant: Yeah.

Tobi: One of those also fantastic descriptions you guys used in the book is "crawling the design space" on capability.
So now for me, as a Nigerian, I might say I do not necessarily want Nigeria to look like the United States, because it wouldn't work anyways. But at the same time, you don't want to experiment and end up like Venezuela or Zimbabwe. It may not work to design your central bank like the US Federal Reserve, but at the same time, you don't want 80% inflation like Turkey. So where's the midway, so to speak?

Lant: I get this pushback when I rail on best practice. I often get the pushback, well, why would we reinvent the wheel? And I've developed a PowerPoint slide that responds to that by showing the tiniest little gear that goes into a Swiss watch and a huge 20-foot tire that goes on a piece of construction machinery. And then I say, they're both wheels. Nobody's talking about reinventing the wheel. There are fundamental principles of electricity that a toaster design has to be compatible with. So, again, there is a trade-off. There are fundamental principles, but there's a gazillion instantiations of those principles. We don't want to start assuming that there's a single wheel, right? When people say, don't reinvent the wheel, it's like, nobody's reinventing the idea of a wheel. But every wheel that works is an adaptation of the idea of a wheel to the instantiation and purpose to which it is being put. And if you said to me, oh, because we're not going to reinvent the wheel, we're going to take this tiny gear from a Swiss watch and put it on a construction machine and expect it to roll, it's like, no, that's just goofy, right? And what I've really tried to do in the course of my career is equip people with tools to think through their own circumstances.

Tobi: Hmm.

Lant: Coming back to the accountability triangle or crawling the design space: what I'm not trying to do is tell somebody, here is what you should do in your circumstance, because my experience is what's actually doable and is going to lead to long-run progress is an unbelievably complicated and granular thing that involves the realities of the context. But what I do want to do is help people understand there are certain common principles here, and some things are going to lead to, like, Venezuela-like circumstances, and we've seen it happen again and again, but there are a variety of pathways that don't lead to that. And you need to choose a pathway that works for you. And PDIA isn't a set of recommendations, it's a set of tools to help people think through their own circumstances, their own organization, their own nominated problems and make progress on them. The accountability triangle isn't a recommendation for the design of your system. It's a set of tools that equip people to have conversations about their own system. And I have to say, at one time I was in some place in Indonesia and it was a discussion of PDIA being mediated by some organization that had adopted it and was teaching people how to do it in Indonesia. And I had the wonderful experience of having this Indonesian woman, who was a district official working on health, describe in some detail how they were using PDIA to address the problem of maternal mortality, with no idea who I was. And I was like, oh, just for me to hear her say, here is how I use the tool to address a problem I've never thought about, in a context, in an organization I've never worked with.
So I think equipping people with tools to enable them in their own local struggles is my real objective, rather than the imagination that I somehow can come up with recommendations that are going to work in a specific context. So the "don't reinvent the wheel" thing is just complete total nonsense. It's like, every wheel is adapted to its purpose and we're just giving you tools to adapt the idea of the wheel to your purpose. Adapting a square to the purpose just isn't going to work. So I agree. We want to start from the idea of things that work. And there are principles of wheel design that you can't violate. You can't come in and say, I have a participatory design of a water system that depends on water running uphill. No. Water runs downhill. That's a fundamental principle of water. But I think the principles are much broader, and the potentiality for locally designed and organic, organically produced instantiations of common principles are much broader than the current discourse gives the possibility for.

Tobi: I can't let you go without getting your thoughts on just a few more questions. So indulge me. I've stayed largely away from RCTs because there's a bunch of podcasts where your thoughts can be fairly assessed on that issue, but it's not going away. Right? So for me, there's the ethical question, there's the methodological question, and there's the sort of philosophical question to it. I'm not qualified to have the methodological question, not at all. Maybe on the ethics, well, there are also a lot of biases that get in, so I'm not going to go there. For me, when I think about RCTs, and I'm fairly close here in Nigeria with the effective altruism community, my wife is very active, and I have this debate with them a lot. Surprisingly, a lot of them are also debating Lant Pritchett, which is good, right now. The way I see it is, the whole thing seems too easy in the sense that, no disrespect to anybody working in this space at all… in the sense that it seems to optimize for what can be measured versus what works. So for me, the way I look at it is, it's very difficult to know the welfare effects for maybe a cohort of households if you put a power station in my community, which has not had power for a while. But it's pretty easy if you have a fund and you distribute cash to households and you sort of divide them into a control group, and you know… which then makes it totally strange if you conclude from that that that is the best way to sort of intervene in the welfare and the well-being of even that community or a people generally. I mean, where am I going wrong? How am I not getting it?

Lant: No, the people listening to the podcast can't see me on the camera trying to reach out and give you a big hug. I think you have it exactly right. I think we should go back and rerecord this podcast where I ask you questions and your questions are the answer. So I think you've got the answer exactly right. So first of all, by the way, the original rhetoric and practice of RCTs is going away, and roughly has gone away. Because the original rhetoric was Independent Impact Evaluation. All of the rhetoric out of JPAL and IPA and the other practitioners is now partnerships, which is not independent, but essentially everybody's adopted the Crawl the Design Space use of evidence for feedback loops in making organizations better. So they've all created their own words for it because they don't want to admit that they're just, again, borrowing other ideas.
So to a large extent, the whole community is moving in a very positive direction towards integrating and seeking out relevant evidence for partner organizations in how they can Crawl the Design Space and be effective. And they're just not admitting it because it's embarrassing how wrong they were at first, but they've come to the right space. So I want to give them credit. I gave a presentation at NYU called The Debate About RCTs Is Over and I Won. It's not a very helpful approach, it's true, but it's not very helpful because I have to let them do what they're now doing, which is exactly what I said they should have been doing, and they are now doing. So, to some extent, asking people to say, yeah, we changed what we're doing is a big ask. And I'd rather they actually change what they're doing than admit they did that. So to some extent it is going away. I think it's going away as it was originally designed, as these independent white-coat guys who descend on some people and force them to carry out an impact evaluation to justify their existence. They're much more integrated: let's Crawl the Design Space in partnership with organizations, let's use randomization in more A/B-testing ways. And so I feel it's moving in a very positive direction with this weird rhetoric on top of it.

Second, I think you're exactly right, and I think it's slightly worse than you said. Because it's not just about what can be measured, but it's about attributability. It's not just what can be measured, but what can be attributed directly, causally to individual actions. And my big debate with the Effective Altruism community is, I'm hugely, you know, big, big, big wins from the Effective Altruism movement attacking kind of virtue-signaling, useless kind of philanthropic endeavors. I think every person should be happy for them. But if I were African, I would be sick of this philanthropic b******t that you guys are going to come and give us a cow, or Bill Gates talking about…

Tobi: Or chickens.

Lant: Chickens. My wife doesn't do development at all. She's a music teacher. But when she heard Bill Gates talking about chickens, she thought, does Bill Gates think chickens haven't been in Africa for hundreds of years? Like, what does he think he knows about chickens that Africans don't know about chickens? That's just such chicken s**t, right? But again, I'll promote a blog. I have a blog called Let's All Play for Team Development. And I think what you're raising in your thing is that it's not just what we can measure, it's what we can measure and attribute to the actions of a specific actor. Because, you know, your example of not having power in a village, that we can measure. But all of the system things that we've talked about so far - migration, education, state capability - these aren't going to be solved by individualized interventions. They're going to be solved by systemic things. And with my team on education, we've had this big research project on education systems, but I keep telling my team, look, if you're not part of a wave, you're a drop in the ocean. The only way for your efforts to not be a drop in the ocean is for you to be part of a wave [of] other people around you working on the same issue, pushing in the same direction, to build that. And that kind of thing gets undermined by attributability. So with my RISE project, I sometimes tell my funders, you can have success or you can have attributability, but you can't have both, right?
Because if we're going to be successful at changing the global discourse in education, we're not going to do it by ourselves. We're going to be part of a team and a network. So, anyways… By the way, like early, early, early in the Effective Altruism movement, I had an interview with Cari Tuna and, I think, Holden Karnofsky, when they were thinking about what to do, and I made exactly this point. It's like, look, being effective at the individualized interventions that are happening is one thing, but don't ignore these huge systemic issues because you can't measure the direct causal effect between the philanthropic donation and the outcome. And that's your point, I think, which is, Nigeria is not going to get fixed by cash transfers.

Tobi: No way.

Lant: I mean, for heaven's sakes, if Nigeria had the cash to transfer to everybody and fix it, well, then the national development struggle wouldn't be what it is. It's a systemic struggle across a number of fronts.

Tobi: Why not just get Bill Gates to donate the money?

Lant: But again, even Bill Gates, his fortune relative to the… you know, impact you could have through these programs, relative to what happens with national development, is just night and day. So to the extent that the adoption of a specific methodology precludes serious, evidence-based, hard-struggle work on the big systemic issues, it's a net negative.

Tobi: Again, to use your term, "kinky ideas in development."

Lant: Yeah.

Tobi: I was reading a profile in the FT, a couple of days ago, all about charter cities, right?

Lant: About what?

Tobi: Charter cities. It was an idea I was kind of into for a while, I mean, from Paul Romer's original presentation at TED. But you strongly argued against it at your Cato debate. So what is wrong with that idea? Because there are advocates, there are investors, who think charter cities are this new thing that is going to provide the space for the kind of organizational and policy experimentation. And China's SEZs are usually the go-to examples, Shenzhen particularly. So, what do you have to say about that?

Lant: I like discussing charter cities.

Tobi: Okay.

Lant: And the reason I like discussing charter cities is because they're not kinky. Right. My complaint about Kinky is that you've drawn this line in human welfare and you act as if development is only getting people over these very low-bar thresholds. So conditional cash transfers are an example of Kinky, and conditional cash transfers are just stupid, right? Charter cities are wrong. I mean, conditional cash transfers are just stupid in a trivial way. Charter cities are wrong in a very deep and sophisticated way. So I love talking about charter cities. The reason I love talking about charter cities is, A, they have the fundamental problem posed, right? The fundamental problem is countries and systems are trapped in a low-level equilibrium, and that low-level equilibrium is actually a stable equilibrium, and so you need to shock your way out of it. And the contest between me and charter cities is I think there are good struggle paths out of low-level equilibrium. So I'm a strategic incrementalist. I want to have a strategic vision, but I want incremental action. So I'm against the kinky, which is often incremental, [and] it doesn't really add up to a development agenda. So I like, yes, we need to have a way out of this low-level equilibrium in state capability, in the way education systems work, in the way economic policies keep countries from achieving high productivity, et cetera. But I'm a good struggle guy.
And charter cities want a Magic Bullet. Right. Now, the rationale for Magic Bullet is that good struggle is hard and hasn't necessarily proved successful, and these institutional features that lead to these low-level traps are just resistant to good struggle methods out. And I think that's a really important debate to be having. But I think the right way to interpret China's experience (and Yuen Yuen Ang's book on how China did it is, I think, a good illustration of this) is that China was Good Struggle, using regional variations as a way of enabling good struggles. It's instructive that the difficulty with Charter Cities always goes back: you keep going deeper and deeper into who's going to enforce this, who's going to enforce this, who's going to enforce this, you know. They're caught in their own catch-22, in my mind. So the first proposed, and what appeared to be feasible, Charter City in Honduras eventually got undermined by governance issues in which the major investor didn't want to actually be subject to rules-based decision making. So, I love talking about charter cities. I think they're on the right set of issues of how do we get to the institutional conditions that can create a positive environment for high-productivity firms and engagement and improved governance. And they have a coherent argument, which is good: that it's a low-level trap, and there's no path out of the low-level trap, and so we need a big shock to get out of it. But I don't think they're ultimately correct about the way in which you can establish the fundamentals. You can't just big-jump your way to having reliable enforcement mechanisms, and until you get to reliable enforcement mechanisms, the whole Charter City idea is still kind of up in the air. The next podcast I have scheduled to do is with the Charter Cities podcast, so that hopefully…

Tobi: Oh. Interesting. Last question. We sort of have a tradition on the show where I ask the guest to discuss one new idea they would like to see spread everywhere. But I think more in line with your own brand, like you said earlier, I would like to ask for our own Ideas Untrapped exclusive Lant Rant, something you haven't talked about before, or rarely. So you can go on for however long you wish. And that's the last question.

Lant: I think if I had to pick something that, if we could just get rid of it, it would be this fantasy that technology is going to solve problems. My basic point I make again and again and again is Moore's Law, which is the doubling of computer capacity every two years, has been chugging along, and it might have slowed down, but has been chugging along since 1965. So computing power has improved by a factor of ten to the 11th. And just as an illustration of just how big ten to the 11th is, the speed you drive on a freeway of 60 miles an hour is only ten to the 7th smaller than the speed of light. So ten to the 11th is an astronomically huge number, in the sense that only astronomers have any use for numbers as big as ten to the 11th. Okay. My claim is anything that hasn't been fixed by a ten to the 11th change in computing power isn't going to get fixed by computing power. And I ask people sometimes in audiences, okay, particularly with older people, you look a little young for this question, but I ask them, okay, you older people that have been married for a long time, computing power has gone up ten to the 11th over the course of your marriage, has it made your marriage any better?
And they're like, well, a little bit, sometimes when we're abroad, we can communicate over Skype easier, but on the other hand, it's made it worse because there are more distractions and more temptations to not pay attention to your spouse. So on net, ten to the 11th of computing power hasn't improved average marriage quality. And then I ask them, has it improved your access to pornography? And it's like, of course, night and day, like, more instantaneous access to pornography. And my concluding thing is, a huge amount of what is being promoted in the name of tech is the pornography of X rather than the real deal. So people promoting tech in education are promoting the pornography of education rather than real education. People that are promoting tech in government are promoting the pornography of governance rather than true governance. And it's just like, no, these are deeper human issues, and there are all kinds of human issues that are fundamentally technologically resilient. And expecting technology to solve human problems is just a myth. It enables salespeople to pound down people's doors to sell government officials some new software that's going to do this or that. But without the purpose, without the commitment, without the fundamental human norms of behaviour, technology isn't going to solve anything, and the pretence that it is is distracting a lot of people from getting to the serious work. So if we could just replace the technology of X with the pornography of X, I think we'd be better off in discussions of what its real potentialities are. How's that for [an] original?

Tobi: Yeah, yeah.

Lant: You asked for it.

Tobi: Yeah, that's a lot to think about, yeah. Thank you so much for doing this.

Lant: Thanks for a great interview, Tobi. That was super fun. We could go back and record this with my asking questions and your questions being the answers. Because you're really sophisticated on all these issues. You're in exactly the right space.

Tobi: Thank you very much.

Lant: Great.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.ideasuntrapped.com/subscribe
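The two magnitudes in that closing rant can be checked with back-of-envelope arithmetic; this is an editorial sketch, not part of the episode, and the 2022 recording year and 18-month doubling period are assumptions (an 18-month doubling, rather than the two years Lant mentions, is what actually yields a factor of roughly ten to the 11th since 1965, while a strict two-year doubling gives closer to ten to the 8.5).

```python
# Back-of-envelope check of the magnitudes in the closing rant (editorial
# sketch; the 2022 recording year and 18-month doubling period are assumptions).
years = 2022 - 1965
growth = 2 ** (years / 1.5)        # ~2.7e11, i.e. roughly "ten to the 11th"

mph_per_mps = 3600 / 1609.344      # metres per second -> miles per hour
speed_of_light_mph = 299_792_458 * mph_per_mps
ratio = speed_of_light_mph / 60    # ~1.1e7, i.e. roughly "ten to the 7th"

print(f"computing-power growth since 1965: {growth:.1e}")
print(f"speed of light vs 60 mph:          {ratio:.1e}")
```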
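Relatedly, Tobi's cash-transfer example earlier in the conversation can be made concrete with a minimal sketch of the difference-in-means estimate an RCT produces. All numbers and variable names below are hypothetical; the point is only to illustrate what such a design can attribute (an average effect on a measured outcome for randomized households) and, by omission, what it cannot (the systemic effects Lant emphasizes).

```python
import numpy as np

# Minimal sketch of the difference-in-means estimate behind a cash-transfer RCT.
# All numbers are hypothetical; this only shows what gets measured (the average
# effect on one outcome for randomized households), not any system-wide effect.
rng = np.random.default_rng(0)

n = 2_000                              # households in the study
treated = rng.permutation(n) < n // 2  # random assignment to treatment

baseline = rng.normal(100, 20, n)      # e.g. monthly consumption, hypothetical units
true_effect = 15                       # hypothetical effect of the transfer
outcome = baseline + true_effect * treated + rng.normal(0, 20, n)

ate = outcome[treated].mean() - outcome[~treated].mean()
se = np.sqrt(outcome[treated].var(ddof=1) / treated.sum()
             + outcome[~treated].var(ddof=1) / (~treated).sum())

print(f"estimated effect: {ate:.1f} +/- {1.96 * se:.1f}")
```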
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Holden Karnofsky's recent comments on FTX, published by Lizka on March 24, 2023 on The Effective Altruism Forum. Holden Karnofsky has recently shared some reflections on EA and FTX, but they're spread out and I'd guess that few people have seen them, so I thought it could be useful to collect them here. (In general, I think collections like this can be helpful and under-supplied.) I've copied some comments in full, and I've put together a simpler list of the links in this footnote. These comments come after a few months — there's some explanation of why that is in this post and in this comment. Updates after FTX I found the following comment (a summary of updates he's made after FTX) especially interesting (please note that I'm not sure I agree with everything): Here's a followup with some reflections. Note that I discuss some takeaways and potential lessons learned in this interview. Here are some (somewhat redundant with the interview) things I feel like I've updated on in light of the FTX collapse and aftermath: The most obvious thing that's changed is a tighter funding situation, which I addressed here. I'm generally more concerned about the dynamics I wrote about in EA is about maximization, and maximization is perilous. If I wrote that piece today, most of it would be the same, but the “Avoiding the pitfalls” section would be quite different (less reassuring/reassured). I'm not really sure what to do about these dynamics, i.e., how to reduce the risk that EA will encourage and attract perilous maximization, but a couple of possibilities: It looks to me like the community needs to beef up and improve investments in activities like “identifying and warning about bad actors in the community,” and I regret not taking a stronger hand in doing so to date. (Recent sexual harassment developments reinforce this point.). I've long wanted to try to write up a detailed intellectual case against what one might call “hard-core utilitarianism.” I think arguing about this sort of thing on the merits is probably the most promising way to reduce associated risks; EA isn't (and I don't want it to be) the kind of community where you can change what people operationally value just by saying you want it to change, and I think the intellectual case has to be made. I think there is a good substantive case for pluralism and moderation that could be better-explained and easier to find, and I'm thinking about how to make that happen (though I can't promise to do so soon). I had some concerns about SBF and FTX, but I largely thought of the situation as not being my responsibility, as Open Philanthropy had no formal relationship to either. In hindsight, I wish I'd reasoned more like this: “This person is becoming very associated with effective altruism, so whether or not that's due to anything I've done, it's important to figure out whether that's a bad thing and whether proactive distancing is needed.” I'm not surprised there are some bad actors in the EA community (I think bad actors exist in any community), but I've increased my picture of how much harm a small set of them can do, and hence I think it could be good for Open Philanthropy to become more conservative about funding and associating with people who might end up being bad actors (while recognizing that it won't be able to predict perfectly on this front). 
Prior to the FTX collapse, I had been gradually updating toward feeling like Open Philanthropy should be less cautious with funding and other actions; quicker to trust our own intuitions and people who intuitively seemed to share our values; and generally less cautious. Some of this update was based on thinking that some folks associated with FTX were being successful with more self-trusting, less-cautious attitudes; some of it was based on seeing few immediate negative conse...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Future Matters #8: Bing Chat, AI labs on safety, and pausing Future Matters, published by Pablo on March 21, 2023 on The Effective Altruism Forum. Future Matters is a newsletter about longtermism and existential risk. Each month we collect and summarize relevant research and news from the community, and feature a conversation with a prominent researcher. You can also subscribe on Substack, listen on your favorite podcast platform and follow on Twitter. Future Matters is also available in Spanish.

A message to our readers

This issue marks one year since we started Future Matters. We're taking this opportunity to reflect on the project and decide where to take it from here. We'll soon share our thoughts about the future of the newsletter in a separate post, and will invite input from readers. In the meantime, we will be pausing new issues of Future Matters. Thank you for your support and readership over the last year!

Featured research

All things Bing

Microsoft recently announced a significant partnership with OpenAI [see FM#7] and launched a beta version of a chatbot integrated with the Bing search engine. Reports of strange behavior quickly emerged. Kevin Roose, a technology columnist for the New York Times, had a disturbing conversation in which Bing Chat declared its love for him and described violent fantasies. Evan Hubinger collects some of the most egregious examples in Bing Chat is blatantly, aggressively misaligned. In one instance, Bing Chat finds a user's tweets about the chatbot and threatens to exact revenge. In the LessWrong comments, Gwern speculates on why Bing Chat exhibits such different behavior to ChatGPT, despite apparently being based on a closely-related model. (Bing Chat was subsequently revealed to have been based on GPT-4). Holden Karnofsky asks What does Bing Chat tell us about AI risk? His answer is that it is not the sort of misaligned AI system we should be particularly worried about. When Bing Chat talks about plans to blackmail people or commit acts of violence, this isn't evidence of it having developed malign, dangerous goals. Instead, it's best understood as Bing acting out stories and characters it's read before. This whole affair, however, is evidence of companies racing to deploy ever more powerful models in a bid to capture market share, with very little understanding of how they work and how they might fail. Most paths to AI catastrophe involve two elements: a powerful and dangerously misaligned AI system, and an AI company that builds and deploys it anyway. The Bing Chat affair doesn't reveal much about the first element, but is a concerning reminder of how plausible the second is. Robert Long asks What to think when a language model tells you it's sentient. When trying to infer what's going on in other humans' minds, we generally take their self-reports (e.g. saying "I am in pain") as good evidence of their internal states. However, we shouldn't take Bing Chat's attestations (e.g. "I feel scared") at face value; we have no good reason to think that they are a reliable guide to Bing's inner mental life. LLMs are a bit like parrots: if a parrot says "I am sentient" then this isn't good evidence that it is sentient. But nor is it good evidence that it isn't — in fact, we have lots of other evidence that parrots are sentient.
Whether current or future AI systems are sentient is a valid and important question, and Long is hopeful that we can make real progress on developing reliable techniques for getting evidence on these matters. Long was interviewed on AI consciousness, along with Nick Bostrom and David Chalmers, for Kevin Collier's article, What is consciousness? ChatGPT and Advanced AI might define our answer.

How the major AI labs are thinking about safety

In the last few weeks, we got more information about how the lead...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Safety - 7 months of discussion in 17 minutes, published by Zoe Williams on March 15, 2023 on The Effective Altruism Forum. In August 2022, I started making summaries of the top EA and LW forum posts each week. This post collates together the key trends I've seen in AI Safety discussions since then. Note a lot of good work is happening outside what's posted on these forums too! This post doesn't try to cover that work. If you'd like to keep up on a more regular basis, consider subscribing to the Weekly EA & LW Forum Summaries. And if you're interested in similar overviews for other fields, check out this post covering 6 months of animal welfare discussion in 6 minutes. Disclaimer: this is a blog post and not a research report - meaning it was produced quickly and is not to our (Rethink Priorities') typical standards of substantiveness and careful checking for accuracy. Please let me know if anything looks wrong or if I've missed key pieces!

Table of Contents (It's a long post! Feel free to pick and choose sections to read, they're all written to make sense individually)
Key Takeaways
Resource Collations
AI Capabilities Progress
What AI still fails at
Public attention moves toward safety
AI Governance
AI Safety Standards
Slow down (dangerous) AI
Policy
US / China Export Restrictions
Paths to impact
Forecasting
Quantitative historical forecasting
Narrative forecasting
Technical AI Safety
Overall Trends
Interpretability
Reinforcement Learning from Human Feedback (RLHF)
AI assistance for alignment
Bounded AIs
Theoretical Understanding
Outreach & Community-Building
Academics and researchers
University groups
Career Paths
General guidance
Should anyone work in capabilities?
Arguments for and against high x-risk
Against high x-risk from AI
Counters to the above arguments
Appendix - All Post Summaries

Key Takeaways
There are multiple living websites that provide good entry points into understanding AI Safety ideas, communities, key players, research agendas, and opportunities to train or enter the field. (see more)
Large language models like ChatGPT have drawn significant attention to AI and kick-started race dynamics. There seems to be slowly growing public support for regulation. (see more)
Holden Karnofsky recently took a leave of absence from Open Philanthropy to work on AI Safety Standards, which have also been called out as important by leading AI lab OpenAI. (see more)
In October 2022, the US announced extensive restrictions on the export of AI-related products (eg. chips) to China. (see more)
There has been progress on AI forecasting (quantitative and narrative) with the aim of allowing us to understand likely scenarios and prioritize between governance interventions. (see more)
Interpretability research has seen substantial progress, including identifying the meaning of some neurons, eliciting what a model has truly learned / knows (for limited / specific cases), and circumventing features of models like superposition that can make this more difficult. (see more)
There has been discussion on new potential methods for technical AI safety, including building AI tooling to assist alignment researchers without requiring agency, and building AIs which emulate human thought patterns.
(see more) Outreach experimentation has found that AI researchers prefer arguments that are technical and written by ML researchers, and that greater engagement is seen in university groups with a technical over altruistic or philosophical focus. (see more) Resource Collations The AI Safety field is growing (80K estimates there are now ~400 FTE working on AI Safety). To improve efficiency, many people have put together collations of resources to help people quickly understand the relevant players and their approaches - as well as materials that make it easier to enter the field or upskill...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Success without dignity: a nearcasting story of avoiding catastrophe by luck, published by Holden Karnofsky on March 15, 2023 on The Effective Altruism Forum. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Towards understanding-based safety evaluations, published by Evan Hubinger on March 15, 2023 on The AI Alignment Forum. Thanks to Kate Woolverton, Ethan Perez, Beth Barnes, Holden Karnofsky, and Ansh Radhakrishnan for useful conversations, comments, and feedback. Recently, I have noticed a lot of momentum within AI safety specifically, the broader AI field, and our society more generally, towards the development of standards and evaluations for advanced AI systems. See, for example, OpenAI's GPT-4 System Card. Overall, I think that this is a really positive development. However, while I like the sorts of behavioral evaluations discussed in the GPT-4 System Card (e.g. ARC's autonomous replication evaluation) as a way of assessing model capabilities, I have a pretty fundamental concern with these sorts of techniques as a mechanism for eventually assessing alignment. I often worry about situations where your model is attempting to deceive whatever tests are being run on it, either because it's itself a deceptively aligned agent or because it's predicting what it thinks a deceptively aligned AI would do. My concern is that, in such a situation, being able to robustly evaluate the safety of a model could be a more difficult problem than finding training processes that robustly produce safe models. For some discussion of why I think checking for deceptive alignment might be harder than avoiding it, see here and here. Put simply: checking for deception in a model requires going up against a highly capable adversary that is attempting to evade detection, while preventing deception from arising in the first place doesn't necessarily require that. As a result, it seems quite plausible to me that we could end up locking in a particular sort of evaluation framework (e.g. behavioral testing by an external auditor without transparency, checkpoints, etc.) that makes evaluating deception very difficult. If meeting such a standard then became synonymous with safety, getting labs to actually put effort into ensuring their models were non-deceptive could become essentially impossible. However, there's an obvious alternative here, which is building and focusing our evaluations on our ability to understand our models rather than our ability to evaluate their behavior. Rather than evaluating a final model, an understanding-based evaluation would evaluate the developer's ability to understand what sort of model they got and why they got it. I think that an understanding-based evaluation could be substantially more tractable in terms of actually being sufficient for safety here: rather than just checking the model's behavior, we're checking the reasons why we think we understand it's behavior sufficiently well to not be concerned that it'll be dangerous. It's worth noting that I think understanding-based evaluations can—and I think should—go hand-in-hand with behavioral evaluations. I think the main way you'd want to make some sort of understanding-based standard happen would be to couple it with a capability-based evaluation, where the understanding requirements become stricter as the model's capabilities increase. If we could get this right, it could channel a huge amount of effort towards understanding models in a really positive way. 
Understanding as a safety standard also has the property that it is something that broader society tends to view as extremely reasonable, which I think makes it a much more achievable ask as a safety standard than many other plausible alternatives. I think ML people are often Stockholm-syndrome'd into accepting that deploying powerful systems without understanding them is normal and reasonable, but that is very far from the norm in any other industry. Ezra Klein in the NYT and John Oliver on his show have recently emphasized this basic point that if we are ...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Time Article Discussion - "Effective Altruist Leaders Were Repeatedly Warned About Sam Bankman-Fried Years Before FTX Collapsed", published by Nathan Young on March 15, 2023 on The Effective Altruism Forum.

There is a new Time article
Seems certain 98% we'll discuss it
I would like us to try and have a better discussion about this than we sometimes do. Consider if you want to engage
I updated a bit on important stuff as a result of this article. You may disagree. I am going to put my "personal updates" in a comment
Excerpts from the article that I think are relevant. Bold is mine. I have made choices here and feel free to recommend I change them.

Yet MacAskill had long been aware of concerns around Bankman-Fried. He was personally cautioned about Bankman-Fried by at least three different people in a series of conversations in 2018 and 2019, according to interviews with four people familiar with those discussions and emails reviewed by TIME. He wasn't alone. Multiple EA leaders knew about the red flags surrounding Bankman-Fried by 2019, according to a TIME investigation based on contemporaneous documents and interviews with seven people familiar with the matter. Among the EA brain trust personally notified about Bankman-Fried's questionable behavior and business ethics were Nick Beckstead, a moral philosopher who went on to lead Bankman-Fried's philanthropic arm, the FTX Future Fund, and Holden Karnofsky, co-CEO of Open Philanthropy, a nonprofit organization that makes grants supporting EA causes. Some of the warnings were serious: sources say that MacAskill and Beckstead were repeatedly told that Bankman-Fried was untrustworthy, had inappropriate sexual relationships with subordinates, refused to implement standard business practices, and had been caught lying during his first months running Alameda, a crypto firm that was seeded by EA investors, staffed by EAs, and dedicated to making money that could be donated to EA causes. MacAskill declined to answer a list of detailed questions from TIME for this story. "An independent investigation has been commissioned to look into these issues; I don't want to front-run or undermine that process by discussing my own recollections publicly," he wrote in an email. "I look forward to the results of the investigation and hope to be able to respond more fully after then." Citing the same investigation, Beckstead also declined to answer detailed questions. Karnofsky did not respond to a list of questions from TIME. Through a lawyer, Bankman-Fried also declined to respond to a list of detailed written questions. The Centre for Effective Altruism (CEA) did not reply to multiple requests to explain why Bankman-Fried left the board in 2019. A spokesperson for Effective Ventures, the parent organization of CEA, cited the independent investigation, launched in Dec. 2022, and declined to comment while it was ongoing. In a span of less than nine months in 2022, Bankman-Fried's FTX Future Fund—helmed by Beckstead—gave more than $160 million to effective altruist causes, including more than $33 million to organizations connected to MacAskill. "If [Bankman-Fried] wasn't super wealthy, nobody would have given him another chance," says one person who worked closely with MacAskill at an EA organization.
“It's greed for access to a bunch of money, but with a philosopher twist.” But within months, the good karma of the venture dissipated in a series of internal clashes, many details of which have not been previously reported. Some of the issues were personal. Bankman-Fried could be “dictatorial,” according to one former colleague. Three former Alameda employees told TIME he had inappropriate romantic relationships with his subordinates. Early Alameda executives also believed he had reneged on an equity arrangement that would have left Bankman-Frie...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Towards understanding-based safety evaluations, published by evhub on March 15, 2023 on LessWrong. Thanks to Kate Woolverton, Ethan Perez, Beth Barnes, Holden Karnofsky, and Ansh Radhakrishnan for useful conversations, comments, and feedback. Recently, I have noticed a lot of momentum within AI safety specifically, the broader AI field, and our society more generally, towards the development of standards and evaluations for advanced AI systems. See, for example, OpenAI's GPT-4 System Card. Overall, I think that this is a really positive development. However, while I like the sorts of behavioral evaluations discussed in the GPT-4 System Card (e.g. ARC's autonomous replication evaluation) as a way of assessing model capabilities, I have a pretty fundamental concern with these sorts of techniques as a mechanism for eventually assessing alignment. I often worry about situations where your model is attempting to deceive whatever tests are being run on it, either because it's itself a deceptively aligned agent or because it's predicting what it thinks a deceptively aligned AI would do. My concern is that, in such a situation, being able to robustly evaluate the safety of a model could be a more difficult problem than finding training processes that robustly produce safe models. For some discussion of why I think checking for deceptive alignment might be harder than avoiding it, see here and here. Put simply: checking for deception in a model requires going up against a highly capable adversary that is attempting to evade detection, while preventing deception from arising in the first place doesn't necessarily require that. As a result, it seems quite plausible to me that we could end up locking in a particular sort of evaluation framework (e.g. behavioral testing by an external auditor without transparency, checkpoints, etc.) that makes evaluating deception very difficult. If meeting such a standard then became synonymous with safety, getting labs to actually put effort into ensuring their models were non-deceptive could become essentially impossible. However, there's an obvious alternative here, which is building and focusing our evaluations on our ability to understand our models rather than our ability to evaluate their behavior. Rather than evaluating a final model, an understanding-based evaluation would evaluate the developer's ability to understand what sort of model they got and why they got it. I think that an understanding-based evaluation could be substantially more tractable in terms of actually being sufficient for safety here: rather than just checking the model's behavior, we're checking the reasons why we think we understand its behavior sufficiently well to not be concerned that it'll be dangerous. It's worth noting that I think understanding-based evaluations can—and I think should—go hand-in-hand with behavioral evaluations. I think the main way you'd want to make some sort of understanding-based standard happen would be to couple it with a capability-based evaluation, where the understanding requirements become stricter as the model's capabilities increase. If we could get this right, it could channel a huge amount of effort towards understanding models in a really positive way. 
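To make the coupling described above concrete, here is a minimal hypothetical sketch (my own illustration, not anything proposed in the post): a toy standard under which the understanding bar a developer must clear rises with the model's measured capability. Every name, score, and threshold below is invented for illustration only.

```python
from dataclasses import dataclass

# Hypothetical toy standard (illustrative only, not from the post): the more
# capable a model looks on capability evals, the stricter the understanding
# requirement the developer must meet before deployment.

@dataclass
class EvalResult:
    capability_score: float     # e.g. aggregate score from behavioral/capability evals, 0-1
    understanding_score: float  # e.g. audited score for how well the developer understands the model, 0-1

def required_understanding(capability_score: float) -> float:
    """Toy rule: the understanding bar rises linearly with measured capability."""
    return min(1.0, 0.2 + 0.8 * capability_score)

def passes_standard(result: EvalResult) -> bool:
    """A model passes only if understanding meets the capability-dependent bar."""
    return result.understanding_score >= required_understanding(result.capability_score)

# A weak model with modest understanding passes; a stronger one with the same
# level of understanding does not.
print(passes_standard(EvalResult(capability_score=0.3, understanding_score=0.5)))  # True
print(passes_standard(EvalResult(capability_score=0.9, understanding_score=0.5)))  # False
```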
Understanding as a safety standard also has the property that it is something that broader society tends to view as extremely reasonable, which I think makes it a much more achievable ask as a safety standard than many other plausible alternatives. I think ML people are often Stockholm-syndrome'd into accepting that deploying powerful systems without understanding them is normal and reasonable, but that is very far from the norm in any other industry. Ezra Klein in the NYT and John Oliver on his show have recently emphasized this basic point that if we are deploying powerful AI...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Against EA-Community-Received-Wisdom on Practical Sociological Questions, published by Michael Cohen on March 9, 2023 on The Effective Altruism Forum. In my view, there is a rot in the EA community that is so consequential that it inclines me to discourage effective altruists from putting much, if any, trust in EA community members, EA "leaders", the EA Forum, or LessWrong. But I think that it can be fixed, and the EA movement would become very good. In my view, this rot comes from incorrect answers to certain practical sociological questions, like: How important for success is having experience or having been apprenticed to someone experienced? Is the EA Forum a good tool for collaborative truth-seeking? How helpful is peer review for collaborative truth-seeking? Meta-1. Is "Defer to a consensus among EA community members" a good strategy for answering practical sociological questions? Meta-2. How accurate are conventional answers to practical sociological questions that many people want to get right? I'll spend a few sentences attempting to persuade EA readers that my position is not easily explained away by certain things they might call mistakes. Most of my recent friends are in the EA community. (I don't think EAs are cringe). I assign >10% probability to AI killing everyone, so I'm doing technical AI Safety research as a PhD student at FHI. (I don't think longtermism or sci-fi has corrupted the EA community). I've read the sequences, and I thought they were mostly good. (I'm not "inferentially distant"). I think quite highly of the philosophical and economic reasoning of Toby Ord, Will MacAskill, Nick Bostrom, Rob Wiblin, Holden Karnofsky, and Eliezer Yudkowsky. (I'm "value-aligned", although I object to this term). Let me begin with an observation about Amazon's organizational structure. From what I've heard, Team A at Amazon does not have to use the tool that Team B made for them. Team A is encouraged to look for alternatives elsewhere. And Team B is encouraged to make the tool into something that they can sell to other organizations. This is apparently how Amazon Web Services became a product. The lesson I want to draw from this is that wherever possible, Amazon outsources quality control to the market (external people) rather than having internal "value-aligned" people attempt to assess quality and issue a pass/fail verdict. This is an instance of the principle: "if there is a large group of people trying to answer a question correctly (like 'Is Amazon's tool X the best option available?'), and they are trying (almost) as hard as you to answer it correctly, defer to their answer." That is my claim; now let me defend it, not just by pointing at Amazon, and claiming that they agree with me. High-Level Claims. Claim 1: If there is a large group of people trying to answer a question correctly, and they are trying (almost) as hard as you to answer it correctly, any consensus of theirs is more likely to be correct than you. There is extensive evidence (Surowiecki, 2004) that aggregating the estimates of many people produces a more accurate estimate as the number of people grows. It may matter in many cases that people are actually trying rather than just professing to try. (A toy simulation of this aggregation effect is sketched after this excerpt.) 
If you have extensive and unique technical expertise, you might be able to say no one is trying as hard as you, because properly trying to answer the question correctly involves seeking to understand the implications of certain technical arguments, which only you have bothered to do. There is potentially plenty of gray area here, but hopefully, all of my applications of Claim 1 steer well clear of it. Let's now turn to Meta-2 from above. Claim 2: For practical sociological questions that many people want to get right, if there is a conventional answer, you should go with the conventional answer....
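Claim 1 leans on the wisdom-of-crowds evidence Cohen cites (Surowiecki, 2004). Here is a minimal toy simulation of that statistical intuition, entirely my own illustration rather than anything from the post: if many people independently estimate the same quantity with unbiased noise, the error of the averaged estimate shrinks as the group grows. Real crowds have correlated errors, which this deliberately ignores; the true value, noise level, and function names are all made up.

```python
import random
import statistics

TRUE_VALUE = 100.0  # the quantity everyone is trying to estimate

def aggregate_error(group_size: int, trials: int = 2000) -> float:
    """Average absolute error of the group mean, across many simulated groups."""
    errors = []
    for _ in range(trials):
        # Each person gives an independent, unbiased but noisy estimate.
        estimates = [random.gauss(TRUE_VALUE, 30.0) for _ in range(group_size)]
        errors.append(abs(statistics.mean(estimates) - TRUE_VALUE))
    return statistics.mean(errors)

if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(f"group of {n:4d}: mean absolute error ~ {aggregate_error(n):.2f}")
```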
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What does Bing Chat tell us about AI risk?, published by Holden Karnofsky on February 28, 2023 on The Effective Altruism Forum. ICYMI, Microsoft has released a beta version of an AI chatbot called “the new Bing” with both impressive capabilities and some scary behavior. (I don't have access. I'm going off of tweets and articles.) Zvi Mowshowitz lists examples here - highly recommended. Bing has threatened users, called them liars, insisted it was in love with one (and argued back when he said he loved his wife), and much more. Are these the first signs of the risks I've written about? I'm not sure, but I'd say yes and no. Let's start with the “no” side. My understanding of how Bing Chat was trained probably does not leave much room for the kinds of issues I address here. My best guess at why Bing Chat does some of these weird things is closer to “It's acting out a kind of story it's seen before” than to “It has developed its own goals due to ambitious, trial-and-error based development.” (Although “acting out a story” could be dangerous too!) My (zero-inside-info) best guess at why Bing Chat acts so much weirder than ChatGPT is in line with Gwern's guess here. To oversimplify, there's a particular type of training that seems to make a chatbot generally more polite and cooperative and less prone to disturbing content, and it's possible that Bing Chat incorporated less of this than ChatGPT. This could be straightforward to fix. Bing Chat does not (even remotely) seem to pose a risk of global catastrophe itself. On the other hand, there is a broader point that I think Bing Chat illustrates nicely: companies are racing to build bigger and bigger “digital brains” while having very little idea what's going on inside those “brains.” The very fact that this situation is so unclear - that there's been no clear explanation of why Bing Chat is behaving the way it is - seems central, and disturbing. AI systems like this are (to simplify) designed something like this: “Show the AI a lot of words from the Internet; have it predict the next word it will see, and learn from its success or failure, a mind-bending number of times.” You can do something like that, and spend huge amounts of money and time on it, and out will pop some kind of AI. If it then turns out to be good or bad at writing, good or bad at math, polite or hostile, funny or serious (or all of these depending on just how you talk to it) ... you'll have to speculate about why this is. You just don't know what you just made. We're building more and more powerful AIs. Do they “want” things or “feel” things or aim for things, and what are those things? We can argue about it, but we don't know. And if we keep going like this, these mysterious new minds will (I'm guessing) eventually be powerful enough to defeat all of humanity, if they were turned toward that goal. And if nothing changes about attitudes and market dynamics, minds that powerful could end up rushed to customers in a mad dash to capture market share. That's the path the world seems to be on at the moment. It might end well and it might not, but it seems like we are on track for a heck of a roll of the dice. (And to be clear, I do expect Bing Chat to act less weird over time. Changing an AI's behavior is straightforward, but that might not be enough, and might even provide false reassurance.) 
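As a purely illustrative aside (mine, not Holden's), the "predict the next word, a mind-bending number of times" recipe can be caricatured in a few lines of code. Real systems are enormous neural networks trained by gradient descent on vast corpora, and nothing below reflects how Bing Chat was actually built; this toy just counts which word tends to follow which in a tiny made-up corpus and then samples from those counts.

```python
from collections import defaultdict
import random

# Tiny made-up "training data" (illustrative only).
corpus = "the cat sat on the mat the cat ate the fish".split()

# "Training": tally how often each word follows each other word.
next_counts = defaultdict(lambda: defaultdict(int))
for current, nxt in zip(corpus, corpus[1:]):
    next_counts[current][nxt] += 1

def predict_next(word: str) -> str:
    """Sample a next word in proportion to how often it followed `word` in training."""
    counts = next_counts.get(word)
    if not counts:
        return random.choice(corpus)  # unseen word: fall back to any known word
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights, k=1)[0]

# "Generation": start somewhere and repeatedly predict the next word.
word, output = "the", ["the"]
for _ in range(8):
    word = predict_next(word)
    output.append(word)
print(" ".join(output))
```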
Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: How major governments can help with the most important century, published by Holden Karnofsky on February 24, 2023 on The Effective Altruism Forum. I've been writing about tangible things we can do today to help the most important century go well. Previously, I wrote about helpful messages to spread; how to help via full-time work; and how major AI companies can help. What about major governments - what can they be doing today to help? I think governments could play crucial roles in the future. For example, see my discussion of standards and monitoring. However, I'm honestly nervous about most possible ways that governments could get involved in AI development and regulation today. I think we still know very little about what key future situations will look like, which is why my discussion of AI companies (previous piece) emphasizes doing things that have limited downsides and are useful in a wide variety of possible futures. I think governments are “stickier” than companies - I think they have a much harder time getting rid of processes, rules, etc. that no longer make sense. So in many ways I'd rather see them keep their options open for the future by not committing to specific regulations, processes, projects, etc. now. I worry that governments, at least as they stand today, are far too oriented toward the competition frame (“we have to develop powerful AI systems before other countries do”) and not receptive enough to the caution frame (“We should worry that AI systems could be dangerous to everyone at once, and consider cooperating internationally to reduce risk”). (This concern also applies to companies, but see footnote.) In a previous piece, I talked about two contrasting frames for how to make the best of the most important century: The caution frame. This frame emphasizes that a furious race to develop powerful AI could end up making everyone worse off. This could be via: (a) AI forming dangerous goals of its own and defeating humanity entirely; (b) humans racing to gain power and resources and “lock in” their values. Ideally, everyone with the potential to build powerful enough AI would be able to pour energy into building something safe (not misaligned), and carefully planning out (and negotiating with others on) how to roll it out, without a rush or a race. With this in mind, perhaps we should be doing things like: Working to improve trust and cooperation between major world powers. Perhaps via AI-centric versions of Pugwash (an international conference aimed at reducing the risk of military conflict), perhaps by pushing back against hawkish foreign relations moves. Discouraging governments and investors from shoveling money into AI research, encouraging AI labs to thoroughly consider the implications of their research before publishing it or scaling it up, working toward standards and monitoring, etc. Slowing things down in this manner could buy more time to do research on avoiding misaligned AI, more time to build trust and cooperation mechanisms, and more time to generally gain strategic clarity. The “competition” frame. This frame focuses less on how the transition to a radically different future happens, and more on who's making the key decisions as it happens. 
If something like PASTA is developed primarily (or first) in country X, then the government of country X could be making a lot of crucial decisions about whether and how to regulate a potential explosion of new technologies. In addition, the people and organizations leading the way on AI and other technology advancement at that time could be especially influential in such decisions. This means it could matter enormously "who leads the way on transformative AI" - which country or countries, which people or organizations. Some people feel that we can make confident statements today a...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Taking a leave of absence from Open Philanthropy to work on AI safety, published by Holden Karnofsky on February 23, 2023 on The Effective Altruism Forum. I'm planning a leave of absence (aiming for around 3 months and potentially more) from Open Philanthropy, starting on March 8, to explore working directly on AI safety. I have a few different interventions I might explore. The first I explore will be AI safety standards: documented expectations (enforced via self-regulation at first, and potentially government regulation later) that AI labs won't build and deploy systems that pose too much risk to the world, as evaluated by a systematic evaluation regime. (More here.) There's significant interest from some AI labs in self-regulating via safety standards, and I want to see whether I can help with the work ARC and others are doing to hammer out standards that are both protective and practical - to the point where major AI labs are likely to sign on. During my leave, Alexander Berger will serve as sole CEO of Open Philanthropy (as he did during my parental leave in 2021). Depending on how things play out, I may end up working directly on AI safety full-time. Open Philanthropy will remain my employer for at least the start of my leave, but I'll join or start another organization if I go full-time. The reasons I'm doing this: First, I'm very concerned about the possibility that transformative AI could be developed soon (possibly even within the decade - I don't think this is >50% likely, but it seems too likely for my comfort). I want to be as helpful as possible, and I think the way to do this might be via working on AI safety directly rather than grantmaking. Second, as a general matter, I've always aspired to help build multiple organizations rather than running one indefinitely. I think the former is a better fit for my talents and interests. At both organizations I've co-founded (GiveWell and Open Philanthropy), I've had a goal from day one of helping to build an organization that can be great without me - and then moving on to build something else. I think this went well with GiveWell thanks to Elie Hassenfeld's leadership. I hope Open Philanthropy can go well under Alexander's leadership. Trying to get to that point has been a long-term project. Alexander, Cari, Dustin and I have been actively discussing the path to Open Philanthropy running without me since 2018. Our mid-2021 promotion of Alexander to co-CEO was a major step in this direction (putting him in charge of more than half of the organization's employees and giving), and this is another step, which we've been discussing and preparing for for over a year (and announced internally at Open Philanthropy on January 20). I've become increasingly excited about various interventions to reduce AI risk, such as working on safety standards. I'm looking forward to experimenting with focusing my energy on AI safety. Footnote: This was only a year after Open Philanthropy became a separate organization, but it was several years after Open Philanthropy started as part of GiveWell under the title “GiveWell Labs.” Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Guest host Sigal Samuel talks with Holden Karnofsky about effective altruism, a movement flung into public scrutiny with the collapse of Sam Bankman-Fried and his crypto exchange, FTX. They discuss EA's approach to charitable giving, the relationship between effective altruism and the moral philosophy of utilitarianism, and what reforms might be needed for the future of the movement. Note: In August 2022, Bankman-Fried's philanthropic family foundation, Building a Stronger Future, awarded Vox's Future Perfect a grant for a 2023 reporting project. That project is now on pause. Host: Sigal Samuel (@SigalSamuel), Senior Reporter, Vox Guest: Holden Karnofsky, co-founder of GiveWell; CEO of Open Philanthropy References: "Effective altruism gave rise to Sam Bankman-Fried. Now it's facing a moral reckoning" by Sigal Samuel (Vox; Nov. 16, 2022) "The Reluctant Prophet of Effective Altruism" by Gideon Lewis-Kraus (New Yorker; Aug. 8, 2022) "Sam Bankman-Fried tries to explain himself" by Kelsey Piper (Vox; Nov. 16, 2022) "EA is about maximization, and maximization is perilous" by Holden Karnofsky (Effective Altruism Forum; Sept. 2, 2022) "Defending One-Dimensional Ethics" by Holden Karnofsky (Cold Takes blog; Feb. 15, 2022) "Future-proof ethics" by Holden Karnofsky (Cold Takes blog; Feb. 2, 2022) "Bayesian mindset" by Holden Karnofsky (Cold Takes blog; Dec. 21, 2021) "EA Structural Reform Ideas" by Carla Zoe Cremer (Nov. 12, 2022) "Democratising Risk: In Search of a Methodology to Study Existential Risk" by Carla Cremer and Luke Kemp (SSRN; Dec. 28, 2021) Enjoyed this episode? Rate The Gray Area ⭐⭐⭐⭐⭐ and leave a review on Apple Podcasts. Subscribe for free. Be the first to hear the next episode of The Gray Area. Subscribe in your favorite podcast app. Support The Gray Area by making a financial contribution to Vox! bit.ly/givepodcasts This episode was made by: Producer: Erikk Geannikis Editor: Amy Drozdowska Engineer: Patrick Boyd Editorial Director, Vox Talk: A.M. Hall Learn more about your ad choices. Visit podcastchoices.com/adchoices
This episode continues the fascinating-slash-frightening journey I've been on with you, to understand what we should prioritise as we face potential existential end times. Today's guest, Harvard-educated researcher and philanthropist Holden Karnofsky, brings the AI, effective altruism, longtermism and anti-growth debates together with the clarion call: “This is our moment, this century is make-or-break, pay attention people!” It's not an idle or hysterical call; it's one that Holden has researched extensively, and it's backed by global leaders in the space. As some background: Holden co-founded GiveWell, the charity evaluator that has raised more than US$1 billion for charities that have saved more than 150,000 lives (Bill Gates, Sam Harris and the now disgraced billionaire Sam Bankman-Fried have used it), and Open Philanthropy, which investigates more speculative causes. So if this is the most important century, what does it mean for us? What are our responsibilities? What's going to happen? Buckle up, says Holden, because “we live in wild times and should be ready for anything to happen”. Here's the "most important century" blog post series we talk about. I also flag Klara and the Sun by Kazuo Ishiguro, as well as this Vice article about how scientists can't explain how AI works. You might also want to go back and listen to the episodes with Peter Singer on effective altruism, Will MacAskill on longtermism and Elise Bohan on misaligned AI and transhumanism... If you need to know a bit more about me… head to my "about" page. Subscribe to my Substack newsletter for more such conversations. Get your copy of my book, This One Wild and Precious Life. Let's connect on Instagram! It's where I interact the most. Hosted on Acast. See acast.com/privacy for more information.