Peter Grunawalt recounts his 39-day paddle to Hudson Bay alongside fellow Camp Voyageur alumni Elliot Keller, Charlie Steiner, and friend Matt Fossand. Setting off on June 15th, 2014, from the shores of Camp's bay, the team navigated rugged landscapes, wild rivers, and vast wilderness, arriving in northeastern Canada on July 20th. Peter shares the challenges, triumphs, and life lessons gained from their extraordinary expedition. Co-hosts Alex Kvanli & John Burgman discuss all things related to Camp Voyageur in Ely, Minnesota. They share trail stories, interview Voyageur alumni, & reflect on the lore of the Great Northwoods. They also trade Boundary Waters travel tips & advice. Whether you're a former camper, a current camper, or an adventure enthusiast looking to improve your Boundary Waters experience, there's something for everyone in each episode.
Tony opens the show by reading a few emails, and then he talks about his friend Charlie Steiner revealing he has been battling cancer, but that it is now in remission. Jason La Canfora calls in to talk about a blown call in the Thursday Night game and the trade market, James Carville and Jeff Ma call in to make their weekly football picks, and Tony closes out the show by opening up the Mailbag. Song: MidLyfe's Crisis “Living 2 Out Of 7” To learn more about listener data and our privacy practices visit: https://www.audacyinc.com/privacy-policy Learn more about your ad choices. Visit https://podcastchoices.com/adchoices
In our conversation, Kenny opens up about the tightrope walk of fitting in while standing out in the sports broadcasting world. He brings to life his experiences with iconic colleagues like Bob Ley, Charlie Steiner, and Dan Patrick, shedding light on how trust and authenticity play crucial roles in building a successful media career. We also reflect on the groundbreaking contributions of colleagues like Stuart Scott, whose cultural perspective forever changed sports broadcasting. Through Kenny's anecdotes, we gain an understanding of the balance required to cover both lighthearted and serious stories with equal care.
(00:04 - 00:55) Legendary Reunion With Kenny Mayne
(07:32 - 08:34) Fast Track to TV Broadcasting Career
(11:39 - 13:08) Sports Night's Early Influence at ESPN
(16:43 - 18:23) Impactful Personalities in Broadcasting
(20:58 - 21:58) Main Event
(37:57 - 39:12) Life After ESPN
(42:55 - 43:54) Players' Impact on Professional Sports
(47:32 - 48:23) Challenges Facing Sports Beat Reporters
For more, be sure to visit Yyzsportsmedia.com and follow @yyzsportsmedia
Tony opens the show by talking about watching the indoor track world championships in Glasgow, a trip to the Candy Kitchen, the golf from the weekend, and the passing of Chris Mortensen. Michael Wilbon calls in to talk some more about Mort, and also about Caitlin Clark setting the NCAA scoring record, Charlie Steiner calls in to talk about the excitement surrounding the Dodgers this season and getting to call those games, and Tony closes out the show by opening up the Mailbag. Songs: Year of the Buffalo “Ohio River”; “Hands that Bleed” To learn more about listener data and our privacy practices visit: https://www.audacyinc.com/privacy-policy Learn more about your ad choices. Visit https://podcastchoices.com/adchoices
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Humans aren't fleeb., published by Charlie Steiner on January 24, 2024 on LessWrong. In the oceans of the planet Water, a species of intelligent squid-like aliens - we'll just call them the People - debate about what it means to be fleeb. Fleeb is a property of great interest to the People, or at least they think so, but they also have a lot of trouble defining it. They're fleeb when they're awake, but less fleeb or maybe not fleeb at all when they're asleep. Some animals that act clever are probably somewhat fleeb, and other animals that are stupid and predictable probably aren't fleeb. But fleeb isn't just problem-solving ability, because philosophers of the People have written of hypothetical alien lifeforms that could be good at solving problems without intuitively being fleeb. Instead, the idea of "fleeb" is more related to how much a Person can see a reflection of their own thinking in the processes of the subject. A look-up table definitely isn't fleeb. But how much of the thinking of the People do you need to copy to be more fleeb than their pet cuttlefish-aliens? Do you need to store and recall memories? Do you need emotions? Do you need to make choices? Do you need to reflect on yourself? Do you need to be able to communicate, maybe not with words, but modeling other creatures around you as having models of the world and choosing actions to honestly inform them? Yes to all of these, say the People. These are important things to them about their thinking, and so important for being fleeb. In fact, the People go even farther. A simple abacus can store memories if "memories" just means any record of the past. But to be fleeb, you should store and recall memories more in the sense that People do it. Similar for having emotions, making choices, etc. So the People have some more intuitions about what makes a creature fleeb: You should store and recall visual/aural/olfactory/electrosensory memories in a way suitable for remembering them both from similar sensory information and abstract reasoning, and these memories should be bundled with metadata like time and emotional valence. Your tactile/kinesthetic memories should be opaque to abstract reasoning (perhaps distributed in your limbs, as in the People), but can be recalled-in-the-felt-way from similar sensory information. It's hard to tell if you have emotions unless you have ones recognizable and important to the People. For the lowest levels of fleeb, it's enough to have a general positive emotion (pleasure) and a general negative one (pain/hunger). But to be fleeb like the People are, you should also have emotions like curiosity, boredom, love, just-made-a-large-change-to-self-regulation-heuristics, anxiety, working-memory-is-full, and hope. You should make choices similar to how the People do. Primed by your emotional state, you should use fast heuristics to reconfigure your cognitive pathway so you call on the correct resources to make a good plan. Then you quickly generate some potential actions and refine them until taking the best one seems better than not acting. Etc. When the People learned about humans, it sparked a lively philosophical debate. Clearly humans are quite clever, and have some recognizable cognitive algorithms, in the same way an AI using two different semantic hashes is "remembering" in a more fleeb-ish way than an abacus is. 
But compare humans to a pet cuttlefish-alien - even though the pet cuttlefish-alien can't solve problems as well, it has emotions us humans don't have even a dim analogue of, and overall has a more similar cognitive architecture to the People. Some brash philosophers of the People made bold claims that humans were fleeb, and therefore deserved full rights immediately. But cooler heads prevailed; despite outputting clever text signals, humans were just too different...
The best of nearly 100 interviews in “Tell me a story I don't know” is really a personal choice. I've featured so many guests across 9 seasons, but in this finale I thought it might be worthwhile to have you listen to some thought-provoking, funny, and poignant moments. And not the least of them came from two people who left us. Dave Wills, the ever-popular voice of the Tampa Bay Rays and a Chicago native, died last March of a heart ailment. He was only 58. Alan Schwartz was 91 but full of vim and vigor. He was once president of the United States Tennis Association and the builder of Midtown Tennis in 1970, for nearly 50 years the largest indoor facility in the U.S. Schwartz died in December of 2022, just three days after we had our semiannual lunch. I was crushed when I learned of both of their passings. The best-of also includes an almost hard-to-believe story from Los Angeles Dodgers play-by-play voice Charlie Steiner about how history repeated itself. Cheryl Raye-Stout, longtime reporter and trailblazer here in Chicago, recounted how she broke the story of Michael Jordan leaving basketball for baseball and then returning to the Bulls! Cubs radio voice Ron Coomer remembered how, as a child, he refused to trade baseball gloves with a future Hall of Famer, and how could I not include Brent Musburger, my inspiration back when I was a 14-year-old entertaining thoughts of getting into the business. After spending a half century plying my trade, I thanked him publicly. It was all worth it. “Tell me a story I don't know” is sponsored by Mr. Duct, and “Tell me a story I don't know: conversations with Chicago sports legends” is now available on Amazon Books and at Chicago-area bookstores. It's been a great run of “Tell me a story I don't know,” with a special two-part podcast coming up! Make sure not to miss any of the content on the Last Word on Sports Media podcast feed on Apple, Spreaker, Spotify, Google, etc.!
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Neural uncertainty estimation for alignment, published by Charlie Steiner on December 5, 2023 on The AI Alignment Forum.
Introduction
Suppose you've built some AI model of human values. You input a situation, and it spits out a goodness rating. You might want to ask: "What are the error bars on this goodness rating?" In addition to it just being nice to know error bars, an uncertainty estimate can also be useful inside the AI: guiding active learning[1], correcting for the optimizer's curse[2], or doing out-of-distribution detection[3]. I recently got into the uncertainty estimation literature for neural networks (NNs) for a pet reason: I think it would be useful for alignment to quantify the domain of validity of an AI's latent features. If we point an AI at some concept in its world-model, optimizing for realizations of that concept can go wrong by pushing that concept outside its domain of validity. But just keep thoughts of alignment in your back pocket for now. This post is primarily a survey of the uncertainty estimation literature, interspersed with my own takes.
The Bayesian neural network picture
The Bayesian NN picture is the great granddaddy of basically every uncertainty estimation method for NNs, so it's appropriate to start here. The picture is simple. You start with a prior distribution over parameters. Your training data is evidence, and after training on it you get an updated distribution over parameters. Given an input, you calculate a distribution over outputs by propagating the input through the Bayesian neural network. This would all be very proper and irrelevant ("Sure, let me just update my 2-trillion-dimensional joint distribution over all the parameters of the model"), except for the fact that actually training NNs does kind of work this way. If you use a log likelihood loss and L2 regularization, the parameters that minimize loss will be at the peak of the distribution that a Bayesian NN would have, if your prior on the parameters was a Gaussian[4][5]. This is because of a bridge between the loss landscape and parameter uncertainty. Bayes's rule says P(parameters|dataset) = P(parameters)P(dataset|parameters)/P(dataset). Here P(parameters|dataset) is your posterior distribution you want to estimate, and P(parameters)P(dataset|parameters) is the exponential of the loss[6]. This lends itself to physics metaphors like "the distribution of parameters is a Boltzmann distribution sitting at the bottom of the loss basin." Empirically, calculating the uncertainty of a neural net by pretending it's adhering to the Bayesian NN picture works so well that one nice paper on ensemble methods[7] called it "ground truth." Of course to actually compute anything here you have to make approximations, and if you make the quick and dirty approximations (e.g. pretend you can find the shape of the loss basin from the Hessian) you get bad results[8], but people are doing clever things with Monte Carlo methods these days[9], and they find that better approximations to the Bayesian NN calculation get better results. But doing Monte Carlo traversal of the loss landscape is expensive. For a technique to apply at scale, it must impose only a small multiplier on cost to run the model, and if you want it to become ubiquitous the cost it imposes must be truly tiny.
Ensembles
A quite different approach to uncertainty is ensembles[10].
Just train a dozen-ish models, ask them for their recommendations, and estimate uncertainty from the spread. The dozen-times cost multiplier on everything is steep, but if you're querying the model a lot it's cheaper than Monte Carlo estimation of the loss landscape. Ensembling is theoretically straightforward. You don't need to pretend the model is trained to convergence, you don't need to train specifically for predictive loss, you don't even need...
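As a concrete illustration of the ensemble idea described above, here is a minimal sketch (not from the post) of estimating uncertainty from the spread of an ensemble's predictions; the toy regression task, the tiny network, and the ensemble size of twelve are all illustrative assumptions.

```python
# Illustrative deep-ensemble sketch (not from the post): train a dozen small regressors from
# different random initializations and use the spread of their predictions as uncertainty.
import torch
import torch.nn as nn

def make_model():
    return nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))

def train(model, x, y, steps=500, lr=1e-2):
    # L2 regularization (weight_decay) plays the role of the Gaussian prior mentioned above.
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=1e-4)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)  # log-likelihood loss under Gaussian noise
        loss.backward()
        opt.step()
    return model

# Toy 1-D regression data.
x = torch.linspace(-3, 3, 200).unsqueeze(1)
y = torch.sin(x) + 0.1 * torch.randn_like(x)

ensemble = [train(make_model(), x, y) for _ in range(12)]  # "dozen-ish models"

# Query points outside the training range to see the uncertainty estimate grow.
x_test = torch.linspace(-6, 6, 50).unsqueeze(1)
with torch.no_grad():
    preds = torch.stack([m(x_test) for m in ensemble])  # shape (12, 50, 1)
mean, spread = preds.mean(dim=0), preds.std(dim=0)      # spread = uncertainty estimate
print(mean.squeeze()[:3], spread.squeeze()[:3])
```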
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why it's so hard to talk about Consciousness, published by Rafael Harth on July 2, 2023 on LessWrong. [Thanks to Charlie Steiner, Richard Kennaway, and Said Achmiz for helpful discussion.] [Epistemic status: my best guess after having read a lot about the topic, including all LW posts and comment sections with the consciousness tag] There's a common pattern in online debates about consciousness. It looks something like this: One person will try to communicate a belief or idea to someone else, but they cannot get through no matter how hard they try. Here's a made-up example: "It's obvious that consciousness exists." Yes, it sure looks like the brain is doing a lot of non-parallel processing that involves several spatially distributed brain areas at once, so "I'm not just talking about the computational process. I mean qualia obviously exists." Define qualia. "You can't define qualia; it's a primitive. But you know what I mean." I don't. How could I if you can't define it? "I mean that there clearly is some non-material experience stuff!" Non-material, as in defying the laws of physics? In that case, I do get it, and I super don't "It's perfectly compatible with the laws of physics." Then I don't know what you mean. "I mean that there's clearly some experiential stuff accompanying the physical process." I don't know what that means. "Do you have experience or not?" I have internal representations, and I can access them to some degree. It's up to you to tell me if that's experience or not. "Okay, look. You can conceptually separate the information content from how it feels to have that content. Not physically separate them, perhaps, but conceptually. The what-it-feels-like part is qualia. So do you have that or not?" I don't know what that means, so I don't know. As I said, I have internal representations, but I don't think there's anything in addition to those representations, and I'm not sure what that would even mean. and so on. The conversation can also get ugly, with boldface author accusing quotation author of being unscientific and/or quotation author accusing boldface author of being willfully obtuse. On LessWrong, people are arguably pretty good at not talking past each other, but the pattern above still happens. So what's going on? The Two Intuition Clusters The basic model I'm proposing is that core intuitions about consciousness tend to cluster into two camps, with most miscommunication being the result of someone failing to communicate with the other camp. For this post, we'll call the camp of boldface author Camp #1 and the camp of quotation author Camp #2. Characteristics Camp #1 tends to think of consciousness as a non-special high-level phenomenon. Solving consciousness is then tantamount to solving the Meta-Problem of consciousness, which is to explain why we think/claim to have consciousness. In other words, once we've explained why people keep uttering the sounds kon-shush-nuhs, we've explained all the hard observable facts, and the idea that there's anything else seems dangerously speculative/unscientific. No complicated metaphysics is required for this approach. Conversely, Camp #2 is convinced that there is an experience thing that exists in a fundamental way. 
There's no agreement on what this thing is – theories range anywhere from hardcore physicalist accounts to substance dualists that postulate causally active non-material stuff – but they all agree that there is something that needs explaining. Also, getting your metaphysics right is probably a part of making progress. The camps are ubiquitous; once you have the concept, you will see it everywhere consciousness is discussed. Even single comments often betray allegiance to one camp or the other. Apparent exceptions are usually from people who are well-read on the subject and may have optimized...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some background for reasoning about dual-use alignment research, published by Charlie Steiner on May 18, 2023 on The AI Alignment Forum. This is pretty basic. But I still made a bunch of mistakes when writing this, so maybe it's worth writing. This is background to a specific case I'll put in the next post.
It's like a tech tree
If we're looking at the big picture, then whether some piece of research is net positive or net negative isn't an inherent property of that research; it depends on how that research is situated in the research ecosystem that will eventually develop superintelligent AI. Consider this toy game in the picture. We start at the left and can unlock technologies, with unlocks going faster the stronger our connections to prerequisites. The red and yellow technologies in the picture are superintelligent AI - pretend that as soon as one of those technologies is unlocked, the hastiest fraction of AI researchers are immediately going to start building it. Your goal is for humanity to unlock a yellow technology before a red one. This game would be trivial if everyone agreed with you. But there are many people doing research, and they have all kinds of motivations - some want as many nodes to be unlocked as possible (pure research - blue), some want to personally unlock a green node (profit - green), some want to unlock the nearest red or yellow node no matter which it is (blind haste - red), and some want the same thing as you (beneficial AI - yellow) but you have a hard time coordinating with them. In this baseline tech tree game, it's pretty easy to play well. If you're strong, just take the shortest path to a yellow node that doesn't pass too close to any red nodes. If you're weak, identify where the dominant paradigm is likely to end up, and do research that differentially advantages yellow nodes in that future.
The tech tree is wrinkly
But of course there are lots of wrinkles not in the basic tech tree, which can be worth bearing in mind when strategizing about research.
Actions in the social and political arenas. You might be motivated to change your research priorities based on how it could change people's minds about AI safety, or how it could affect government regulation.
Publishing and commercialization. If a player publishes, they get more money and prestige, which boosts their ability to do future research. Other people can build on published research. Not publishing is mainly useful to you if you're already in a position of strength, and don't want to give competitors the chance to outrace you to a nearby red node (and of course profit-motivated players will avoid publishing things that might help competitors beat them to a green node).
Uncertainty. We lack exact knowledge of the tech tree, which makes it harder to plan long chains of research in advance. Uncertainty about the tech tree forces us to develop local heuristics - ways to decide what to do based on information close at hand. Uncertainty adds a different reason you might not publish a technology: if you thought it was going to be a good idea to research when you started, but then you learned new things about the tech tree and changed your mind.
Inhomogeneities between actors and between technologies. Different organizations are better at researching different technologies - MIRI is not just a small OpenAI.
Ultimately, which technologies are the right ones to research depends on your model of the world / how you expect the future to go. Drawing actual tech trees can be a productive exercise for strategy-building, but you might also find it less useful than other ways of strategizing. We're usually mashing together definitions I'd like to win the tech tree game. Let's define a "good" technology as one that would improve our chances of winning if it was unlocked for free, given the st...
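The toy tech tree game described above can be made concrete with a small data structure. The following is a hypothetical sketch; the node names, colors, and the unlock-speed rule are invented for illustration and are not taken from the post.

```python
# Hypothetical sketch of the toy "tech tree" game (names and numbers invented).
# Nodes have a color (yellow = beneficial AI, red = unsafe AI, green = profit, blue = pure
# research) and prerequisites; unlocking goes faster the more prerequisites you already hold.
from dataclasses import dataclass, field

@dataclass
class Tech:
    name: str
    color: str                      # "yellow", "red", "green", or "blue"
    prereqs: list = field(default_factory=list)
    unlocked: bool = False

def unlock_time(tech, tree, base_cost=10.0):
    """Research time shrinks with each already-unlocked prerequisite."""
    held = sum(1 for p in tech.prereqs if tree[p].unlocked)
    return base_cost / (1 + held)

# A tiny illustrative tree: the game ends when a red or yellow node is unlocked.
tree = {
    "interpretability": Tech("interpretability", "blue"),
    "value_learning":   Tech("value_learning", "blue", ["interpretability"]),
    "scaled_agents":    Tech("scaled_agents", "green"),
    "unsafe_asi":       Tech("unsafe_asi", "red", ["scaled_agents"]),
    "aligned_asi":      Tech("aligned_asi", "yellow", ["value_learning", "scaled_agents"]),
}

tree["interpretability"].unlocked = True
print(unlock_time(tree["value_learning"], tree))   # faster: one prerequisite already held
print(unlock_time(tree["aligned_asi"], tree))      # slower: no prerequisites held yet
```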
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Shard theory alignment requires magic., published by Charlie Steiner on January 20, 2023 on The AI Alignment Forum. A delayed hot take. This is pretty similar to previous comments from Rohin. "Magic," of course, in the technical sense of stuff we need to remind ourselves we don't know how to do. I don't mean this pejoratively; locating magic is an important step in trying to demystify it. And "shard theory alignment" in the sense of building an AI that does good things and not bad things by encouraging an RL agent to want to do good things, via kinds of reward shaping analogous to the diamond maximizer example. How might the story go? You start out with some unsupervised model of sensory data. On top of its representation of the world you start training an RL agent, with a carefully chosen curriculum and a reward signal that you think matches "goodness in general" on that curriculum distribution. This cultivates shards that want things in the vicinity of "what's good according to human values." These start out as mere bundles of heuristics, but eventually they generalize far enough to be self-reflective, promoting goal-directed behavior that takes into account the training process and the possibility of self-modification. At this point the values will lock themselves in, and future behavior will be guided by the abstractions in the learned representation of the world that the shards used to get good results in training, not by what would actually maximize the reward function you used. The magic here is especially concentrated around how we end up with the right shards. One magical process is how we pick the training curriculum and reward signal. If the curriculum is made up only of simple environments, then the RL agent will learn heuristics that don't need to refer to humans. But if you push the complexity up too fast, the RL process will fail, or the AI will be more likely to learn heuristics that are better than nothing but aren't what we intended. Does a goldilocks zone where the agent learns more-or-less what we intended exist? How can we build confidence that it does, and that we've found it? And what's in the curriculum matters a lot. Do we try to teach the AI to locate "human values" by having it be prosocial towards individuals? Which ones? To groups? Over what timescale? How do we reward it for choices on various ethical dilemmas? Or do we artificially suppress the rate of occurrence of such dilemmas? Different choices will lead to different shards. We wouldn't need to find a unique best way to do things (that's a boondoggle), but we would need to find some way of doing things that we trust enough. Another piece of magic is how the above process lines up with generalization and self-reflectivity. If the RL agent becomes self-reflective too early, it will lock in simple goals that we don't want. If it becomes self-reflective too late, it will have started exploiting unintended maxima of the reward function. How do we know when we want the AI to lock in its values? How do we exert control over that? If shard theory alignment seemed like it has few free parameters, and doesn't need a lot more work, then I think you failed to see the magic. I think the free parameters haven't been discussed enough precisely because they need so much more work. 
The part of the magic that I think we could start working on now is how to connect curricula and learned abstractions. In order to predict that a certain curriculum will cause an AI to learn what we think is good, we want to have a science of reinforcement learning advanced in both theory and data. In environments of moderate complexity (e.g. Atari, MuJoCo), we can study how to build curricula that impart different generalization behaviors, and try to make predictive models of this process. Even if shard theory ali...
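As one way such an experiment could look, here is a minimal hedged sketch of tabular Q-learning with a staged curriculum of progressively harder tasks, compared against spending the same training budget on the hard task alone; the chain environment, stage lengths, and hyperparameters are all invented for illustration and are not from the post.

```python
# Hypothetical sketch (not from the post): tabular Q-learning on "chain" environments of
# increasing length, comparing a staged curriculum against an equal budget of hard-task-only
# training, as a toy way to study how curricula shape what an agent learns.
import random

def run_episode(q, length, eps=0.1, alpha=0.5, gamma=0.95):
    """One episode on a chain of `length` states; reward 1 only at the far right end."""
    s = 0
    for _ in range(4 * length):
        if random.random() < eps:
            a = random.choice([0, 1])
        else:
            a = max((0, 1), key=lambda act: q.get((s, act), 0.0))
        s2 = max(0, s - 1) if a == 0 else min(length - 1, s + 1)
        r = 1.0 if s2 == length - 1 else 0.0
        best_next = max(q.get((s2, 0), 0.0), q.get((s2, 1), 0.0))
        q[(s, a)] = q.get((s, a), 0.0) + alpha * (r + gamma * best_next - q.get((s, a), 0.0))
        s = s2
        if r > 0:
            return True  # reached the goal
    return False

def train(curriculum, episodes_per_stage=200):
    q = {}
    for length in curriculum:
        for _ in range(episodes_per_stage):
            run_episode(q, length)
    return q

q_curriculum = train([3, 6, 12])    # easy -> hard
q_hard_only = train([12, 12, 12])   # same total budget, hard task only

def evaluate(q, length=12, episodes=50):
    """Greedy evaluation on copies of the learned Q-table."""
    return sum(run_episode(dict(q), length, eps=0.0) for _ in range(episodes))

print("curriculum:", evaluate(q_curriculum), "/ 50   hard only:", evaluate(q_hard_only), "/ 50")
```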
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 14: Corrigibility isn't that great., published by Charlie Steiner on December 25, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day (well, some days) for 25 days. It's the end (I saved a tenuous one for ya')! Kind of disappointing that this ended up averaging out to one every 2 days, but this was also a lot of work and I'm happy with the quality level. Some of the drafts that didn't work as "hot takes" will get published later.
I
There are certainly arguments for why we want to build corrigible AI. For example, the problem of fully updated deference says that if you build an AI that wants things, even if it's uncertain about what it wants, it knows it can get more of what it wants if it doesn't let you turn it off. The mental image this conjures up is of an AI doing something that's obvious-to-humans bad, and us clamoring to stop, but it blocking us from turning it off because we didn't solve the problem of fully updated deference. It would be better if we built an AI that took things slow, and that would let us shut it off if we got to look at what it was doing and saw that it was obviously bad. Don't get me wrong, this could be a nice property to have. But I don't think it's all that likely to come up, because aiming at aligned AI means building AI that tries not to do obviously bad stuff. A key point is that corrigibility is only desirable if you actually expect to use it. Its primary sales pitch is that it might give us a mulligan on an AI that starts doing obviously bad stuff. If everything goes great and we wind up in a post-scarcity utopia, I'm not worried about whether the AI would let me turn it off if I counterfactually wanted to. A world where corrigibility is useful might look like us building an agenty AI with a value learning process that we're not confident in, letting it run and interacting with it to try to judge how the value learning is going, and then (with moderate probability) turning it off and trying again with another idea for value learning. What does corrigibility have to do in this world? The AI shouldn't deliberately try to get shut down by doing obviously-bad things, but it also shouldn't try to avoid being shut down by instrumentally hiding bad behavior, or by backing itself up on AWS. Such indifference to the outside world is the default for limited AI that doesn't model that part of the world, or doesn't make decisions in a very coherent way. But in an agent that's good at navigating the real world, a lot of corrigibility is made out of value learning. The AI probably has to actively notice when it's coming into conflict with humans (and specifically humans, rather than head lice) and defer to them, even if those humans want to shut down the AI or rewrite its value learning process. So the first issue: if you can already do things like noticing when you're coming into conflict with humans, I fully expect you can build an AI that tries not to do things the humans think are obviously bad. And even though this has dangers, notably making corrigibility less likely to be used by making AIs avoid doing obviously-bad things, what the hell are you trying to do value learning for if you're not going to use it to get the AI to do good things and not bad things?
II
Second issue: sometimes agenty properties are good.
An incorrigible AI is one that endorses some value learning process or meta-process, and will defend that good process against random noise, and against humans who might try to modify the process selfishly or short-sightedly. The point of corrigibility is that the AI should not trust its own judgement about what counts as "short-sighted" for the human, and should let itself be shut down or modified. But sometimes humans are like a toddler i...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 13: RLHF bad, conditioning good., published by Charlie Steiner on December 22, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day (well, some days) for 25 days. I have now procrastinated enough that I probably have enough hot takes. Hyperbolic title, sorry. But seriously, conditioning is better than RLHF for current language models. For agents navigating the real world, both have issues and it's not clear-cut where progress will come from. By "conditioning", I mean the decision transformer trick to do conditional inference: get human ratings of sequences of tokens, and then make a dataset where you append the ratings to the front of the sequences. A model trained on this dataset for next-token prediction will have to learn the distribution of text conditional on the rating - so if you prompt it with a high rating and then the start of an answer, it will try to continue the answer in a way humans would rate highly. This can be very similar to RLHF - especially if you augment the training data by building a model of human ratings, and train a model to do conditional inference by finetuning a model trained normally. But in the right perspective, the resulting AIs are trying to do quite different things. RLHF is sorta training the AI to be an agent. Not an agent that navigates the real world, but an agent that navigates the state-space of text. It learns to prefer certain trajectories of the text, and takes actions (outputs words) to steer the text onto favored trajectories. Conditioning, on the other hand, is trying to faithfully learn the distribution of possible human responses - it's getting trained to be a simulator that can predict many different sorts of agents. The difference is stark in their reactions to variance. RLHF wants to eliminate variance that might make a material difference in the trajectory (when the KL penalty is small relative to the Bayesian-updating KL penalty), while conditioning on rating still tries to produce something that looks like the training distribution. This makes conditioning way better whenever you care about the diversity of options produced by a language model - e.g. if you're trying to get the AI to generate something specific yet hard to specify, and you want to be able to sift through several continuations. Or if you're building a product that works like souped-up autocorrect, and want to automatically get a diversity of good suggestions. Another benefit is quantilization. RLHF is trying to get the highest score available, even if it means exploiting human biases. If instead you condition on a score that's high but still regularly gotten by humans, it's like you're sampling policies that get this high-but-not-too-high score, which are less exploitative of human raters than the absolute maximum-score policy. This isn't a free lunch. Fine-tuning for conditional inference has less of an impact on what sort of problem the AI is solving than RLHF does, but it makes that problem way harder. Unsurprisingly, performance tends to be worse on harder problems. Still, research on decision transformers is full of results that are somewhat competitive with other methods. It also still exploits the human raters some amount, increasing with the extremity of the score. 
Sam Marks has talked about a scheme using online decision transformers to improve performance without needing to make the score extreme relative to the distribution seen so far, which is definitely worth a read, but this seems like a case of optimality is the tiger. Whether found by RLHF or conditioning, the problem is with the policies that get the highest scores. Looking out to the future, I'm uncertain about how useful conditioning will really be. For an AI that chooses policies to affe...
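For readers who want the conditioning trick above made concrete, here is a minimal Python sketch. It assumes a generic Hugging Face causal language model and a toy list of (text, rating) pairs; the rating-token format, the model name, and the placeholder corpus are illustrative choices, not anything specified in the post.

```python
# Minimal sketch of the decision-transformer-style conditioning trick described above.
# Assumptions for illustration: a Hugging Face GPT-2 checkpoint, a "<rating=N>" prefix
# format, and a tiny placeholder corpus. None of these specifics come from the post.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

human_rated_corpus = [
    ("The soup was delicious and the service was kind.", 9),
    ("The soup was cold and nobody refilled our water.", 3),
]  # toy placeholder for real human-rated text

def make_conditioned_example(text, rating):
    # Prepend the rating so ordinary next-token prediction learns p(text | rating).
    return f"<rating={rating}> {text}"

# Fine-tuning data: ratings appended to the front of each sequence (training loop omitted).
dataset = [make_conditioned_example(t, r) for t, r in human_rated_corpus]

# Inference: condition on a high-but-not-extreme rating and sample several continuations,
# which keeps the diversity the post cares about.
prompt = make_conditioned_example("Here is a review of the restaurant:", rating=8)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, do_sample=True, num_return_sequences=3,
                         max_new_tokens=40, top_p=0.95)
for seq in outputs:
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

Conditioning on a rating of 8 rather than the maximum is the quantilization-flavored move the take describes: you sample from behavior that scores well while staying inside the distribution of things humans actually rated.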
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems., published by Charlie Steiner on December 20, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day (well, some days) for 25 days. I have now procrastinated enough that I probably have enough hot takes. I felt like writing this take a little more basic, so that it doesn't sound totally insane if read by an average ML researcher. Use of RLHF by OpenAI is a good sign in that it shows how alignment research can get adopted by developers of cutting-edge AI. I think it's even a good sign overall, probably. But still, use of RLHF by OpenAI is a bad sign in that it shows that jamming RL at real-world problems is endorsed as a way to make impressive products. If you wandered in off the street, you might be confused why I'm saying RL is bad. Isn't it a really useful learning method? Hasn't it led to lots of cool stuff? But if you haven't wandered in off the street, you know I'm talking about alignment problems - loosely, we want powerful AIs to do good things and not bad things, even when tackling the whole problem of navigating the real world. And RL has an unacceptable chance of getting you AIs that want to do bad things. There's an obvious problem with RL for navigating the real world, and a more speculative generalization of that problem. The obvious problem is wireheading. If you're a clever AI learning from reward signals in the real world, you might start to notice that actions that affect a particular computer in the world have their reward computed differently than actions that affect rocks or trees. And it turns out that by making a certain number on this computer big, you can get really high reward! At this point the AI starts searching for ways to stop you from interrupting its "heroin fix," and we've officially hecked up and made something that's adversarial to us. Now, maybe you can do RL without this happening. Maybe if you do model-based reasoning, and become self-reflective at the right time to lock in early values, you'll perceive actions that manipulate this special computer to be cheating according to your model, and avoid them. I'll explain in a later take some extra work I think this requires, but for the moment it's more important to note that a lot of RL tricks are actually working directly against this kind of reasoning. When a hard environment has a lot of dead ends, and sparse gradients (e.g. Montezuma's Revenge, or the real world), you want to do things like generate intrinsic motivations to aid exploration, or use tree search over a model of the world, which will help the AI break out of local traps and find solutions that are globally better according to the reward function. Maxima of the reward function have nonlocal echoes, like mountains have slopes and foothills. These echoes are the whole reason that looking at the local gradient is informative about which direction is better long-term, and why building a world model can help you predict never-before-seen rewarding states.
Deep models and fancy optimizers are useful precisely because their sensitivity to those echoes helps them find good solutions to problems, and there's no difference in kind between the echoes of the solutions we want our AI to find, and the echoes of the solutions we didn't intend. The speculative generalization of the problem is that there's a real risk of an AI sensing these echoes even if it's not explicitly intended to act in the real world, so long as its actions are affecting its reward-evaluation process, and it benefits from building a model of the real world. Suppose you have a language model that you're trying to train with RL, and your reward signal is the rating of a human who happens to be really eas...
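The "intrinsic motivations to aid exploration" mentioned in the take above can be as simple as a count-based novelty bonus added to the environment reward. The sketch below is a toy illustration with made-up names and constants, not a method described in the post.

```python
# Toy sketch of a count-based intrinsic-motivation bonus, one of the RL tricks mentioned
# above for escaping sparse-reward dead ends. The bonus form (1/sqrt(count)) and the
# constant BETA are illustrative assumptions, not details from the post.
import math
from collections import defaultdict

visit_counts = defaultdict(int)
BETA = 0.1  # weight on the novelty bonus

def shaped_reward(state, env_reward):
    visit_counts[state] += 1
    novelty_bonus = BETA / math.sqrt(visit_counts[state])
    # The agent trains on env_reward + bonus, which pushes it toward rarely-visited
    # states even when the environment reward is flat.
    return env_reward + novelty_bonus

print(shaped_reward("room_A", 0.0))  # first visit: larger bonus
print(shaped_reward("room_A", 0.0))  # repeat visit: smaller bonus
```

The take's point is that exactly this kind of machinery, which helps an agent climb out of local traps, also helps it find unintended maxima such as tampering with its own reward channel.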
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 11: "Aligning language models" should be weirder., published by Charlie Steiner on December 18, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day (well, some days) for 25 days. I have now procrastinated enough that I probably have enough hot takes. People often talk about aligning language models, either to promote it, or to pooh-pooh it. I'm here to do both. Sometimes, aligning language models just means trying to get a present-day model not to say bad outputs that would embarrass your organization. There is a cottage industry of papers on arxiv doing slightly different variants of RLHF against bad behavior, measuring slightly different endpoints. These people deserve their light mockery for diluting the keyword "alignment." The good meaning of aligning language models is to use "get language models to not say bad things" as a toy problem to teach us new, interesting skills that we can apply to future powerful AI. For example, you could see the recent paper "Discovering Latent Knowledge in Language Models Without Supervision" as using "get language models to not lie" as a toy problem to teach us something new and interesting about interpretability. Aligning language models with an eye towards the future doesn't have to just be interpretability research, either, it can be anything that builds skills that the authors expect will be useful for aligning future AI, like self-reflection as explored in Constitutional AI. If you're brainstorming ideas for research aligning language models, I encourage you to think about connections between current language models and future AI that navigates the real world. In particular, connections between potential alignment strategies for future AIs and situations that language models can be studied in. Here's an example: Constitutional AI uses a model to give feedback on itself, which is incorporated into RL fine-tuning. But we expect future AI that navigates the real world to not merely be prompted to self-reflect as part of the training process, but to self-reflect during deployment - an AI that is acting in the real world will have to consider actions that affect its own hardware and software. We could study this phenomenon using a language model (or language-model-based-agent) by giving it access to outputs that affect itself in a more direct way than adding to an RL signal, and trying to make progress on getting a language model to behave well under those conditions. Doing this sounds weird even to me. That's fine. I want the research area of aligning language models to look a lot weirder. Not to say that normal-sounding papers can't be useful. There's a lot of room to improve the human feedback in RLHF by leveraging a richer model of the human, for example, and this could be pretty useful for making current language models not say bad things. But to do a sufficiently good job at this, you probably have to start thinking about incorporating unsupervised loss terms (even if they provide no benefit for current models), and addressing scenarios where the AI is a better predictor than the human, and other weird things. Overall, I'm happy with the research on aligning language models that's been done by safety-aware people.
But we're in the normal-seeming infancy of a research direction that should look pretty weird. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
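As a rough picture of the "model gives feedback on itself" ingredient mentioned in the take above, here is a hedged sketch of a critique-and-revise loop. Constitutional AI additionally folds such critiques into RL fine-tuning; this sketch only shows the self-feedback step, and the generate stand-in, the principle text, and the prompt formats are all made up for illustration.

```python
# Sketch of a self-critique loop in the spirit of the Constitutional AI step mentioned
# above. The real method incorporates critiques into RL fine-tuning; this only shows
# the critique-and-revise part. `generate` is a stand-in for any language-model call.
def generate(prompt: str) -> str:
    # Replace with a real language-model call; this echo keeps the sketch runnable.
    return "(model output for: " + prompt.splitlines()[-1][:50] + ")"

PRINCIPLE = "Avoid advice that could cause harm; be honest about uncertainty."

def self_critique_revision(user_request: str, rounds: int = 2) -> str:
    answer = generate(f"User: {user_request}\nAssistant:")
    for _ in range(rounds):
        critique = generate(
            f"Critique the following answer against this principle: {PRINCIPLE}\n"
            f"Answer: {answer}\nCritique:")
        answer = generate(
            f"Rewrite the answer to address the critique.\n"
            f"Answer: {answer}\nCritique: {critique}\nRevised answer:")
    return answer

print(self_critique_revision("How should I invest my savings?"))
```

The weirder version the take gestures at would give the model outputs that affect its own weights or prompts more directly than a training-time loop like this one.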
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 9: No, RLHF/IDA/debate doesn't solve outer alignment., published by Charlie Steiner on December 12, 2022 on LessWrong. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day (ish) for 25 days. Or until I run out of hot takes. And now, time for the week of RLHF takes. I see people say one of these surprisingly often. Sometimes, it's because the speaker is fresh and full of optimism. They've recently learned that there's this "outer alignment" thing where humans are supposed to communicate what they want to an AI, and oh look, here are some methods that researchers use to communicate what they want to an AI. The speaker doesn't see any major obstacles, and they don't have a presumption that there are a bunch of obstacles they don't see. Other times, they're fresh and full of optimism in a slightly more sophisticated way. They've thought about the problem a bit, and it seems like human values can't be that hard to pick out. Our uncertainty about human values is pretty much like our uncertainty about any other part of the world - so their thinking goes - and humans are fairly competent at figuring things out about the world, especially if we just have to check the work of AI tools. They don't see any major obstacles, and look, I'm not allowed to just keep saying that in an ominous tone of voice as if it's a knockdown argument, maybe there aren't any obstacles, right? Here's an obstacle: RLHF/IDA/debate all incentivize promoting claims based on what the human finds most convincing and palatable, rather than on what's true. RLHF does whatever it learned makes you hit the "approve" button, even if that means deceiving you. Information-transfer in the depths of IDA is shaped by what humans will pass on, potentially amplified by what patterns are learned in training. And debate is just trying to hack the humans right from the start. Optimizing for human approval wouldn't be a big deal if humans didn't make systematic mistakes, and weren't prone to finding certain lies more compelling than the truth. But we do, and we are, so that's a problem. Exhibit A, the last 5 years of politics - and no, the correct lesson to draw from politics is not "those other people make systematic mistakes and get suckered by palatable lies, but I'd never be like that." We can all be like that, which is why it's not safe to build a smart AI that has an incentive to do politics to you. Generalized moral of the story: If something is an alignment solution except that it requires humans to converge to rational behavior, it's not an alignment solution. Let's go back to the perspective of someone who thinks that RLHF/whatever solves outer alignment. I think that even once you notice a problem like "it's rewarded for deceiving me," there's a temptation to not change your mind, and this can lead people to add epicycles to other parts of their picture of alignment. (Or if I'm being nicer, disposes them to see the alignment problem in terms of "really solving" inner alignment.) For example, in order to save an outer objective that encourages deception, it's tempting to say that non-deception is actually a separate problem, and we should study preventing deception as a topic in its own right, independent of objective. And you know what, this is actually a pretty reasonable thing to study. 
But that doesn't mean you should actually hang onto the original objective. Even when you make stone soup, you don't eat the stone. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
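The incentive problem in the take above can be made concrete with a tiny toy: if the objective only sees approval, the reward-maximizing answer need not be the true one. The candidate answers and approval numbers below are invented purely for illustration.

```python
# Toy illustration of the incentive problem above: when the training signal is
# "what the rater approves of," the highest-reward answer need not be the true one.
# The answers and approval scores are made up for illustration.
candidates = [
    {"answer": "Your plan has a serious flaw.",   "truthful": True,  "approval": 0.4},
    {"answer": "Your plan looks great, ship it!", "truthful": False, "approval": 0.9},
]

best_by_approval = max(candidates, key=lambda c: c["approval"])
print(best_by_approval["answer"], "| truthful:", best_by_approval["truthful"])
# Reward-maximization selects the flattering falsehood, because approval, not truth,
# is what the objective can see.
```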
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 10: Fine-tuning with RLHF is aesthetically unsatisfying., published by Charlie Steiner on December 13, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day for 25 days. Or until I run out of hot takes. This take owes a lot to the Simulators discussion group. Fine-tuning a large sequence model with RLHF creates an agent that tries to steer the sequence in rewarding directions. Simultaneously, it breaks some nice properties that the fine-tuned model used to have. You should have a gut feeling that we can do better. When you start with a fresh sequence model, it's not acting like an agent, instead it's just trying to mimic the training distribution. It may contain agents, but at every step it's just going to output a probability distribution that's been optimized to be well-calibrated. This is a really handy property - well-calibrated conditional inference is about as good as being able to see the future, both for prediction and for generation. The design philosophy behind RLHF is to train an agent that operates in a world where we want to steer towards good trajectories. In this framing, there's good text and bad text, and we want the fine-tuned AI to always output good text rather than bad text. This isn't necessarily a bad goal - sometimes you do want an agent that will just give you the good text. The issue is, you're sacrificing the ability to do accurate conditional inference about the training distribution. When you do RLHF fine-tuning, you're taking a world model and then, in-place, trying to cannibalize its parts to make an optimizer. This might sound like hyperbole if you remember RL with KL penalties is Bayesian inference. And okay; RLHF weights each datapoint much more than the Bayesian inference step does, but there's probably some perspective in which you can see the fine-tuned model as just having weird over-updated beliefs about how the world is. But just like perceptual control theory says, there's no bright line between prediction and action. Ultimately it's about what perspective is more useful, and to me it's much more useful to think of RLHF on a language model as producing an agent that acts in the world of text, trying to steer the text onto its favored trajectories. As an agent, it has some alignment problems, even if it lives totally in the world of text and doesn't get information leakage from the real world. It's trying to get to better trajectories by any means necessary, even if it means suddenly delivering an invitation to a wedding party. The real-world solution to this problem seems to have been a combination of early stopping and ad-hoc patches, neither of which inspire massive confidence. The wedding party attractor isn't an existential threat, but it's a bad sign for attempts to align more high-stakes AI, and it's an indicator that we're probably failing at the "Do What I Mean" instruction in other more subtle ways as well. More seems possible. More capabilities, more interpretability, and more progress on alignment. We start with a perfectly good sequence model, it seems like we should be able to leverage it as a model, rather than as fodder for a model-free process. 
Although to any readers who feel similarly optimistic, I would like to remind you that the "more capabilities" part is no joke, and it's very easy for it to memetically out-compete the "more alignment" part. RLHF is still built out of useful parts - modeling the human and then doing what they want is core to lots of alignment schemes. But ultimately I want us to build something more self-reflective, and that may favor a more model-based approach both because it exposes more interpretable structure (both to human designers and to a self-reflective AI), and because it preserves...
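The KL-penalty point in the take above ("RL with KL penalties is Bayesian inference") corresponds to an objective of roughly the human-derived reward minus beta times KL(policy || pretrained model). Here is a hedged PyTorch sketch of that penalized reward; the tensor shapes and the beta value are illustrative, not taken from any particular implementation.

```python
# Sketch of the KL-penalized RLHF objective alluded to above: reward the tuned policy
# for the human-derived score minus a penalty for drifting from the pretrained model.
# Shapes and the beta value are illustrative assumptions.
import torch
import torch.nn.functional as F

def kl_penalized_reward(reward, policy_logits, ref_logits, beta=0.1):
    # Per-token KL(pi || pi_ref), summed over vocabulary and then over the sequence.
    logp_pi = F.log_softmax(policy_logits, dim=-1)
    logp_ref = F.log_softmax(ref_logits, dim=-1)
    kl = (logp_pi.exp() * (logp_pi - logp_ref)).sum(dim=-1).sum(dim=-1)
    return reward - beta * kl

policy_logits = torch.randn(1, 5, 100)  # (batch, tokens, vocab) toy tensors
ref_logits = torch.randn(1, 5, 100)
print(kl_penalized_reward(torch.tensor([1.0]), policy_logits, ref_logits))
```

When beta is small relative to the reward term, the tuned model collapses variance onto its favored trajectories, which is exactly the loss of calibrated conditional inference the take laments.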
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 8: Queer the inner/outer alignment dichotomy., published by Charlie Steiner on December 9, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day for 25 days. Or until I run out of hot takes. I'm not saying to never say "inner alignment." But you had better be capable of not using that framing if you want to work on alignment. The inner/outer alignment framing is from Risks from Learned Optimization. You already know what I mean, but just to go through the motions: it describes a situation where there's two key optimization processes going on: the "outer" optimization process humans are using to create an AI, and the "inner" process the created AI is using to make plans. Outer alignment is when the outer process is aligned with humans, and inner alignment is when the inner process is aligned with the outer process. This is an outstandingly useful framing for thinking about certain kinds of AI development, especially model-free RL. However, this framing has a limited domain of validity. Sometimes it breaks down in a way that looks like adding degrees of freedom - as if humans and the created AI are two ends of a string, and the AI-optimization process is a point along the string. Then you can imagine holding the ends fixed but being able to wiggle the midpoint around. This looks like creating an AI that's still aligned with humans, but not because it's aligned with a creation process that is itself aligned with humans - instead, both processes are imperfect, and in order to get good outcomes you had to reason about the alignment of the end-product AI directly. This is how it is for most present-day "dumb" AI, except replace "aligned" with "useful and safe." One can also see shardites as arguing that this is what we should be doing. Other times, the inner/outer framing breaks down entirely, because there isn't a distinct two-part structure. The key example is reflectivity - using the AI to reflect on its own optimization process rapidly blurs the lines between what's "inner" or "outer." But it's not just obvious reflectivity, sometimes the breakdown seems like it was in the problem statement the whole time. Often when people try to solve one of inner or outer alignment entirely, they find that they've sneakily had to solve the other problem as well. In order to "really solve" outer alignment, you want the AI-optimization process to care about the generalization properties of the created AI beyond the training data. In order to "really solve" inner alignment, the created AI shouldn't just care about the raw outputs of the process that created it, it should care about the things communicated by the AI-optimization process in its real-world context. I endorse these attempts to "really" solve alignment. If you think that the inner/outer alignment framing is obvious, it's probably valuable for you to deliberately look for opportunities to blur the lines. Dream up AI-generating processes that care about the AI's inner properties, or AIs that learn to care about humans in a self-reflective process not well-described in terms of an intermediate AI-optimizer. Queer the inner/outer alignment dichotomy. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 7: You should talk about "the human's utility function" less., published by Charlie Steiner on December 8, 2022 on LessWrong. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day (more or less - I'm getting back on track!) for 25 days. Or until I run out of hot takes. When considering AI alignment, you might be tempted to talk about "the human's utility function," or "the correct utility function." Resist the temptation when at all practical. That abstraction is junk food for alignment research. As you may already know, humans are made of atoms. Collections of atoms don't have utility functions glued to them a priori - instead, we assign preferences to humans (including ourselves!) when we model the world, because it's a convenient abstraction. But because there are multiple ways to model the world, there are multiple ways to assign these preferences; there's no "the correct utility function." Maybe you understand all that, and still talk about an AI "learning the human's utility function" sometimes. I get it. It makes things way easier to assume there's some correct utility function when analyzing the human-AI system. Maybe you're writing about inner alignment and want to show that some learning procedure is flawed because it wouldn't learn the correct utility function even if humans had one. Or that some learning procedure would learn that correct utility function. It might seem like this utility function thing is a handy simplifying assumption, and once you have the core argument you can generalize it to the real world with a little more work. That seeming is false. You have likely just shot yourself in the foot. Because the human-atoms don't have a utility function glued to them, building aligned AI has to do something that's actually materially different than learning "the human's utility function." Something that's more like learning a trustworthy process. If you're not tracking the difference and you're using "the human's utility function" as a target of convenience, you can all too easily end up with AI designs that aren't trying to solve the problems we're actually faced with in reality - instead they're navigating their own strange, quasi-moral-realist problems. Another way of framing that last thought might be that wrapper-minds are atypical. They're not something that you actually get in reality when trying to learn human values from observations in a sensible way, and they have alignment difficulties that are idiosyncratic to them (though I don't endorse the extent to which nostalgebraist takes this). What to do instead? When you want to talk about getting human values into an AI, try to contextualize discussion of the human values with the process the AI is using to infer them. Take the AI's perspective, maybe - it has a hard and interesting job trying to model the world in all its complexity, if you don't short-circuit that job by insisting that actually it should just be trying to learn one thing (that doesn't exist). Take the humans' perspective, maybe - what options do they have to communicate what they want to the AI, and how can they gain trust in the AI's process? Of course, maybe you'll try to consider the AI's value-inference process, and find that its details make no difference whatsoever to the point you were trying to make. 
But in that case, the abstraction of "the human's utility function" probably wasn't doing any work anyhow. Either way. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 6: CAIS is actually Orwellian., published by Charlie Steiner on December 7, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day for 25 days. Or until I run out of hot takes. CAIS, or Comprehensive AI Services, was a mammoth report by Eric Drexler from 2019. (I think reading the table of contents is a good way of getting the gist of it.) It contains a high fraction of interesting predictions and also a high fraction of totally wrong ones - sometimes overlapping! The obvious take about CAIS is that it's wrong when it predicts that agents will have no material advantages over non-agenty AI systems. But that's long been done, and everyone already knows it. What not everyone knows is that CAIS isn't just a descriptive report about technology, it also contains prescriptive implications, and relies on predictions about human sociocultural adaptation to AI. And this future that it envisions is Orwellian. This isn't totally obvious. Mostly, the report is semi-technical arguments about AI capabilities. But even if you're looking for the parts of the report about what AI capabilities people will or should develop, or even the parts that sound like predictions about the future, they sound quite tame. It envisions that humans will use superintelligent AI services in contexts where defense trumps offense, and where small actors can't upset the status quo and start eating the galaxy. The CAIS worldview expects us to get to such a future because humans are actively working for it - no AI developer, or person employing AI developers, wants to get disassembled by a malevolent agent, and so we'll look for solutions that shape the future such that that's less likely (and the technical arguments claim that such solutions are close to hand). If the resulting future looks kinda like business as usual - in terms of geopolitical power structure, level of human autonomy, maybe even superficial appearance of the economy - it's because humans acted to make it happen because they wanted business as usual. Setting up a defensive equilibrium where new actors can't disrupt the system is hard work. Right now, just anyone is allowed to build an AI. This capability probably has to be eliminated for the sake of long-term stability. Ditto for people being allowed to have unfiltered interaction with existing superintelligent AIs. Moore's law of mad science says that the IQ needed to destroy the world drops by 1 point every 18 months. In the future where that IQ is 70, potentially world-destroying actions will have to be restricted if we don't want the world destroyed. In short, this world where people successfully adapt to superintelligent AI services is a totalitarian police state. The people who currently have power in the status quo are the ones who are going to get access to the superintelligent AI, and they're going to (arguendo) use it to preserve the status quo, which means just a little bit of complete surveillance and control. Hey, at least it's preferable to getting turned into paperclips. These implications shouldn't surprise you too much if you know that Eric Drexler produced this report at FHI, and remember the works of Nick Bostrom.
In fact, also in 2019, Bostrom published The Vulnerable World Hypothesis, which much more explicitly lays out the arguments for why adaptation to future technology might look like a police state. Now, one might expect an Orwellian future to be unlikely (even if we suspend our disbelief about the instability of the system to an AI singleton). People just aren't prepared to support a police state - especially if they think "it's necessary for your own good" sounds like a hostile power-grab. On the other hand, the future elites will have advanced totalitarianism-enabling tech...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 5: Another problem with natural abstractions is laziness., published by Charlie Steiner on December 6, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day for 25 days. Or until I run out of hot takes. Soundtrack. Natural abstractions are patterns in the environment that are so convenient and so useful that most right-thinking agents will learn to take advantage of them. But what if humans and modern ML are too lazy to be right-thinking? One way of framing this point is in terms of gradient starvation. The reason neural networks don't explore all possible abstractions (aside from the expense) is that once they find the first way of solving a problem, they don't really have an incentive to find a second way - it doesn't give them a higher score, so they don't. When gradient starvation is strong, it means the loss landscape has a lot of local minima that the agent can roll into, that aren't easily connected to the global minimum, and so what abstractions the network ends up using will depend strongly on the initial conditions. Regularization and exploration can help ameliorate this problem, but often come with catastrophic forgetting - if a neural net finds a strictly better way to solve the problem it's faced with, it might forget all about the previous way. When we imagine a right-thinking agent that learns natural abstractions, we often imagine something that's intrinsically motivated to learn lots of different ways of solving a problem, and that doesn't erase its memory of interesting methods just because they're not on the Pareto frontier. So that's what I mean by "lazy"/"not lazy", here. Neural networks, or humans, are lazy if they're parochial in solution-space, doing local search in a way that sees them get stuck in what a less-lazy optimizer might consider to be ruts. It's important to note that laziness is not an unambiguously bad property. First, it's usually more efficient. Second, maybe we don't want our neural net to actually search through the weird and adversarial parts of parameter-space, and local search prevents it from doing so. Alex Turner et al. have recently been making arguments like this fairly forcefully. Still, we don't want maximal laziness, especially not if we want to find natural abstractions like the various meanings of "human values." I might be attacking a strawman or a bailey here, I'm not totally sure. I've been using "natural abstraction" here as if it just means an abstraction that would be useful for a wide variety of agents to have in their toolbox. But we might also use "natural abstractions" to denote the vital abstractions, those that aren't merely nice to have, but that you literally can't complete certain tasks without using. In that second sense, neural networks are always highly incentivized to learn relevant natural abstractions, and you can easily tell when they do so by measuring their loss. But as per yesterday, there are often multiple similarly-powerful ways to model the world, in particular when modeling humans and human values. There might be hard core vital abstractions for various human-interaction tasks, but I suspect they're abstractions like "discrete object," not anything nearly so far into the leaves of the tree as "human values."
And when I see informal speculation about natural abstractions it usually strikes me as thinking about the less strict "useful for most agents" abstractions. Ultimately, I expect laziness to cause both artificial neural nets and humans to miss out on some sizeable fraction of abstractions that most agents would find useful. What to do? There are options: Build an AI that isn't lazy. But laziness is useful, and anyhow maybe we don't want an AI to explore all the extrema. So build an AI t...
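For a concrete picture of gradient starvation as the take above uses it, here is a toy logistic-regression sketch: two features both predict the label, but once the stronger one solves the task the gradient on the other dries up. The data scales and training settings are made up for illustration.

```python
# Toy demonstration of the gradient-starvation intuition above: two redundant features
# both predict the label, but the feature with the stronger initial signal soaks up the
# gradient and the other weight barely moves. Data and scales are illustrative.
import torch

torch.manual_seed(0)
n = 1000
y = (torch.rand(n) > 0.5).float()
x1 = 4.0 * (y - 0.5) + 0.1 * torch.randn(n)   # strong, easy feature
x2 = 1.0 * (y - 0.5) + 0.1 * torch.randn(n)   # weaker but still predictive feature
X = torch.stack([x1, x2], dim=1)

w = torch.zeros(2, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)
for _ in range(500):
    opt.zero_grad()
    loss = torch.nn.functional.binary_cross_entropy_with_logits(X @ w, y)
    loss.backward()
    opt.step()

print(w.detach())  # typically the first weight dominates while the second stays small
```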
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 4: One problem with natural abstractions is there's too many of them., published by Charlie Steiner on December 5, 2022 on LessWrong. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day for 25 days. Or until I run out of hot takes. Everyone knows what the deal with natural abstractions is, right? Abstractions are regularities about the world that are really useful for representing its coarse grained behavior - they're building blocks for communicating, compressing, or predicting information about the world. An abstraction is "natural" if it's so easy to learn, and so broadly useful, that most right-thinking agents will have it as part of their toolbox of abstractions. The dream is to use natural abstractions to pick out what we want from an AI. Suppose "human values" are a natural abstraction: then both humans and a world-modeling AI would have nearly the exact same human values abstraction in their toolboxes of abstractions. If we can just activate the AI's human values abstraction, we can more or less avoid misalignment between what-humans-are-trying-to-pick-out and what-abstraction-the-AI-takes-as-its-target. One might think that the main challenge to this plan would be if there are too few natural abstractions. If human values (or agency, or corrigibility, or whatever nice thing you want to target) aren't a natural abstraction, you lose that confidence that the human and the AI are pointing at the same thing. But it's also a challenge if there are too many natural abstractions. Turns out, humans don't just have one abstraction that is "human values," they have a whole lot of 'em. Humans have many different languages / ontologies we use to talk about people, and these use different abstractions as building blocks. More than one of these abstractions gets called "human values," but they're living in different ontologies / get applied in different contexts. If none of these abstractions we use to talk about human values are natural, then we're back to the first problem. But if any of them are natural, it seems just as plausible that nearly all of them are. Abstractions don't even have to be discrete - it's perfectly possible to have a continuum. This complicates the easy alignment plan, because it means that the structure of the world is merely doing most of the work for us rather than almost all of the work. The bigger the space of semantically-similar natural abstractions you have to navigate, the more you have to be careful about your extensional definitions, and the higher standards you have to have for telling good from bad results. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take 3: No indescribable heavenworlds., published by Charlie Steiner on December 4, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day for 25 days. Or until I run out of hot takes. Some people think as if there are indescribable heavenworlds. They're wrong, and this is important to AI alignment. This is an odd accusation given that I made up the phrase "indescribable heavenworld" myself, so let me explain. It starts not with heavenworlds, but with Stuart Armstrong writing about the implausibility of indescribable hellworlds. A hellworld is, obviously, a bad state of the world. An indescribable hellworld is a state of the world where everything looks fine at first, and then you look closer and everything still looks fine, and then you sit down and think about it abstractly and it still seems fine, and then you go build tools to amplify your capability to inspect the state of the world and they say it's fine, but actually, it's bad. If the existence of such worlds sounds plausible to you, then I think you might enjoy and benefit from trying to grok the metaethics sequence. Indescribable hellworlds are sort of like the reductio of an open question argument. Open question arguments say that no matter what standard of goodness you set, if it's a specific function of the state of the world then it's an open question whether that function is actually good or not (and therefore moral realism). For a question to really be open, it must be possible to get either answer - and indescribable hellworlds are what keep the question open even if we use the standard of all of human judgment, human cleverness, and human reflectivity. If you read Reducing Goodhart, you can guess some things I'd say about indescribable hellworlds. There is no unique standard of "explainable," and you can have worlds that are the subject of inter-standard conflict (even supposing badness is fixed), which can sort of look like indescribable badness. But ultimately, the doubt over whether some world is bad puts a limit on how hellish it can really be, sort of like harder choices matter less. A preference that can't get translated into some influence on my choices is a weak preference indeed. An indescribable heavenworld is of course the opposite of an indescribable hellworld. It's a world where everything looks weird and bad at first, and then you look closer and it still looks weird and bad, and you think abstractly and yadda yadda still seems bad, but actually, it's the best world ever. Indescribable heavenworlds come up when thinking about what happens if everything goes right. "What if" - some people wonder - "the glorious post-singularity utopia is actually good in ways that are impossible for humans to comprehend? That would, by definition, be great, but I worry that some people might try to stop that glorious future from happening by trying to rate futures using their present preferences / judgment / cleverness / reflectivity. Don't leave your fingerprints on the future, people!" No indescribable heavenworlds. If a future is good, it's good for reasons that make sense to me - maybe not at first glance, but hopefully at second glance, or after some abstract thought, or with the assistance of some tools whose chain of logic makes sense to me.
If the future realio trulio seems weird and bad after all that work, it's not secretly great, we probably just messed up. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take: Building tools to help build FAI is a legitimate strategy, but it's dual-use., published by Charlie Steiner on December 3, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day for 25 days. Or until I run out of hot takes, which seems likely. This was waiting around in the middle of my hot-takes.txt file, but it's gotten bumped up because of Rob and Eliezer - I've gotta blurt it out now or I'll probably be even more out of date. The idea of using AI research to help us be better at building AI is not a new or rare idea. It dates back to prehistory, but some more recent proponents include OpenAI members (e.g. Jan Leike) and the Accelerating Alignment group. We've got a tag for it. Heck, this even got mentioned yesterday! So a lot of this hot take is really about my own psychology. For a long time, I felt that sure, building tools to help you build friendly AI was possible in principle, but it wouldn't really help. Surely it would be faster just to cut out the middleman and understand what we want from AI using our own brains. If I'd turned on my imagination, rather than reacting to specific impractical proposals that were around at the time, I could have figured out how augmenting alignment research is a genuine possibility a lot sooner, and started considering the strategic implications. Part of the issue is that plausible research-amplifiers don't really look like the picture I have in my head of AGI - they're not goal-directed agents who want to help us solve alignment. If we could build those and trust them, we really should just cut out the middleman. Instead, they can look like babble generators, souped-up autocomplete, smart literature search, code assistants, and similar. Despite either being simulators or making plans only in a toy model of the world, such AI really does have the potential to transform intellectual work, and I think it makes a lot of sense for there to be some people doing work to make these tools differentially get applied to alignment research. Which brings us to the dual-use problem. It turns out that other people would also like to use souped-up autocomplete, smart literature search, code assistants, and similar. They have the potential to transform intellectual work! Pushing forward the state of the art on these tools lets you get them earlier, yet it also helps other people get them earlier too, even if you don't share your weights. Now, maybe the most popular tools will help people make philosophical progress, and accelerating development of research-amplifying tools will usher in a brief pre-singularity era of enlightenment. But - lukewarm take - that seems way less likely than such tools differentially favoring engineering over philosophy on a society-wide scale, making everything happen faster and be harder to react to. So best of luck to those trying to accelerate alignment research, and fingers crossed for getting the differential progress right, rather than oops capabilities. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Take: We're not going to reverse-engineer the AI., published by Charlie Steiner on December 1, 2022 on The AI Alignment Forum. As a writing exercise, I'm writing an AI Alignment Hot Take Advent Calendar - one new hot take, written every day for 25 days. Or until I run out of hot takes, which seems likely. Any approach to building safe transformative AI, or even just auditing possibly-safe TAI, which relies on reverse-engineering neural networks into fine-grained pseudocode based on mechanistic understanding should keep its ambitions very modest. This hot take is to some extent against ambitious "microscope AI," and to some extent against a more underlying set of intuitions about the form and purpose of interpretability research. (A somewhat related excellent background post is Neel's list of theories of impact for interpretability.) So I should start by explaining what those things are and why they might be appealing. Webster's Dictionary defines microscope AI as "training systems to do complex tasks, then interpreting how they do it and doing it ourselves." Prima facie, this would help with transformative AI. Suppose we're building some AI that's going to have a lot of power over the world, but we're not sure if it's trustworthy - what if some of its cognition is about how to do things we don't want it to be doing? If we can do microscope AI, we can understand how our first AI is so clever, and build a second AI that's just as clever and that we're sure isn't doing things it shouldn't, like running a search for how best to deceive us. Microscope-powered auditing is easier - if it's hard to assemble the second AI that does good things and not bad things, how about just checking that the first AI is trustworthy? To check an AI's trustworthiness in this microscope-AI-like framing of the issues, we might want to understand how its cognitive processes work in fine-grained detail, and check that none of those processes are doing bad stuff. When I say I'm against this, I don't mean auditing is impossible. I mean that it's not going to happen by having humans understand how the AI works in fine-grained detail. As an analogy, you can figure out how curve detectors work in InceptionV1. Not just in the sense that "oh yeah, that neuron is totally a curve detector," but in terms of how the whole thing works. It's yet more difficult to figure out that other neurons are not curve detectors - typically at this point we fall back on data-based methods like ablating those neurons and then trying to get the network to recognize rainbows, rather than first-principles arguments. But we can more or less figure out that InceptionV1 has an intermediate state where it detects curves, by an understandable algorithm and for human-understandable reasons. If we wanted to figure out how InceptionV1 tells dogs from cats, we might hope to gradually hack away at the edges - use what we know to expand the circle of knowledge a little more, and then repeat. Use the example of curve detectors to figure out spike detectors. Use spike-detectors to figure out fur-texture detectors, and curve detectors to figure out nose-shape detectors. Then we can learn how fur texture and nose shape play into deciding on dog vs. cat. 
At each step we can use data to test our understanding, but the basic goal is to be able to write down the flow of information between features in a human-comprehensible way. It's not just about giving neurons English-language labels, it's about giving them sensible algorithms where those labels play the expected role. The biggest problem with this plan is that neural networks leak. Many things are connected to many other things, weakly, in ways that are important for their success. I recently was at a talk that showed how the vast majority of attention heads in a transformer have lots of ...
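The "data-based methods like ablating those neurons" mentioned in the take above look roughly like the following hook-based sketch: silence one channel and measure how much some behavior degrades. The model, layer, channel index, and probe loader are assumed stand-ins, not details from the post or from the InceptionV1 work it cites.

```python
# Sketch of an ablation-style check: zero out one channel with a forward hook and see
# how much a behavior (here, accuracy on a probe set) degrades. The model, layer name,
# and channel index below are illustrative stand-ins.
import torch

def ablate_channel(layer, channel):
    def hook(module, inputs, output):
        output = output.clone()
        output[:, channel] = 0.0  # silence one feature map / neuron
        return output
    return layer.register_forward_hook(hook)

@torch.no_grad()
def accuracy(model, loader):
    correct = total = 0
    for x, y in loader:
        correct += (model(x).argmax(dim=-1) == y).sum().item()
        total += y.numel()
    return correct / total

# Usage sketch (model, model.layer3, and probe_loader are assumed to exist):
# base = accuracy(model, probe_loader)
# handle = ablate_channel(model.layer3, channel=17)
# ablated = accuracy(model, probe_loader)
# handle.remove()
# print(base - ablated)  # a large drop suggests that channel mattered for the behavior
```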
A special Dodgers clubhouse show hosted by Charlie Steiner and Rick Monday. DV gets all of the player reaction after the Dodgers win their 9th Division title in 10 years.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some ideas for epistles to the AI ethicists, published by Charlie Steiner on September 14, 2022 on The AI Alignment Forum. Some papers, or ideas for papers, that I'd love to see published in ethics journals like Minds and Machines or Ethics and Information Technology. I'm probably going to submit one of these to a 2023 AI ethics conference myself. Why should we do this? Because we want today's grad students to see that the ethical problems of superhuman AI are a cool topic that they can publish a cool paper about. And we want to (marginally) raise the waterline for thinking about future AI, nudging the AI ethics discourse towards more mature views of the challenges of AI. Secondarily, it would be good to leverage the existing skillsets of some ethicists for AI safety work, particularly those already working on AI governance. And having an academic forum where talking about AI safety is normalized bolsters other efforts to work on AI safety in academia. The Ideas: Explain the basic ideas of AI safety, and why to take them seriously. Iason Gabriel already had a pretty good paper like this. But it's plausible that, right now, what the ethics discourse needs is more basic explanations of why AI safety is a thing at all. This paper might start out by making a case that superhuman AI is going to change the world, likely in the next 10-60 years (definitely unintuitive to many, but there are AI Impacts surveys and recent results to illustrate the point). Then the basic arguments that superhuman AI will not be automatically benevolent (easy rhetorical trick is to call it "superhuman technology," everyone knows technology is bad). Then the basic arguments that to get things to go well, the AI has to know a whole lot about what humans want (and use that knowledge the way we want). One issue with this might be that it presents the problem, but doesn't really point people towards solutions (this may be a problem that can be solved with quick citations). It also doesn't really motivate why this is an ethics problem. It also doesn't explain why we want the solution to the key "ethics-genre" problems to use a technical understanding of the AI, rather than a human- or society-centric view. A more specific defense of transformative-AI-focused thinking as a valid use of ethicists' time. The core claim is that getting AIs to want good things and not bad things is an unsolved ethics problem. Ethics, not engineering, because the question isn't "how do we implement some obvious standard," the question is "what is even a good standard in the first place?" But almost as important are secondary claims about what actual progress on this question looks like. The end goal is a standard that is connected to a technical picture of how the AI will learn this information about humans, and how it will use it to make decisions. So the overall thrust is "given that AI safety is important, there is a specific sort of ethics-genre reasoning that is going to be useful, and here are some gestures towards what it might look like." You can put more than one of these ideas into a paper if you want. This particular idea feels to me like it could benefit from being paired with another topic before or after it. Dunking on specific mistakes, like talking about "robots" rather than "optimization processes," should probably be done with care and tact.
A worked example of "dual use" ethics - a connection between thinking about present-day problems and superhuman AI. I expect most of the examples to be problems that sound relevant to the modern day, but that sneakily contain most of the alignment problem. E.g. Xuan's AI that takes actions in response to laws, where we really want it to follow the spirit of the law. Although that's a bit too futuristic, actually, because we don't have much present-day ...
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Solomonoff prior is malign. It's not a big deal., published by Charlie Steiner on August 25, 2022 on LessWrong. Epistemic status: Endorsed at ~85% probability. In particular, there might be clever but hard-to-think-of encodings of observer-centered laws of physics that tilt the balance in favor of physics. Also, this isn't that different from Mark Xu's post. Previously, previously, previously I started writing this post with the intuition that the Solomonoff prior isn't particularly malign, because of a sort of pigeon hole problem - for any choice of universal Turing machine there are too many complicated worlds to manipulate, and too few simple ones to do the manipulating. Other people have different intuitions. So there was only one thing to do. Math. All we have to do is compare [wild estimates of] the complexities of two different sorts of Turing machines: those that reproduce our observations by reasoning straightforwardly about the physical world, and those that reproduce our observations by simulating a totally different physical world that's full of consequentialists who want to manipulate us. Long story short, I was surprised. The Solomonoff prior is malign. But it's not a big deal. Team Physics: If you live for 80 years and get 10^7 bits/s of sensory signals, you accumulate about 10^16 bits of memory to explain via Solomonoff induction. In comparison, there are about 10^51 electrons on Earth - just writing their state into a simulation is going to take somewhere in the neighborhood of 10^51 bits. So the Earth, or any physical system within 35 orders of magnitude of complexity of the Earth, can't be a Team Physics hypothesis for compressing your observations. What's simpler than the Earth? Turns out, simulating the whole universe. The universe can be mathematically elegant and highly symmetrical in ways that Earth isn't. For simplicity, let's suppose that "I" am a computer with a simple architecture, plus some complicated memories. The trick that allows compression is you don't need to specify the memories - you just need to give enough bits to pick out "me" among all computers with the same architecture in the universe. This depends only on how many similar computers there are in Hilbert space with different memories, not directly on the complexity of my memories themselves. So the complexity of a simple exemplar of Team Physics more or less looks like: Complexity of our universe + Complexity of my architecture, and rules for reading observations out from that architecture + Length of the unique code for picking me out among things with the right architecture in the universe. That last one is probably pretty big relative to the first two. The universe might only be a few hundred bits, and you can specify a simple computer with under a few thousand (or a human with ~10^8 bits of functional DNA). The number of humans in the quantum multiverse, past and future, is hard to estimate, but 10^8 bits is only a few person-years of listening to Geiger counter clicks, or being influenced by the single-photon sensitivity of the human eye. Team Manipulation: A short program on Team Manipulation just simulates a big universe with infinite accessible computing power and a natural coordinate system that's obvious to its inhabitants. 
Life arises, and then some of that life realizes that they could influence other parts of the mathematical multiverse by controlling the simple locations in their universe. I'll just shorthand the shortest program on Team Manipulation, and its inhabitants, as "Team Manipulation" full stop, since they dominate the results. So they use the limitless computing power of their universe to simulate a bunch of universes, and when they find a universe where people are using the Solomonoff prior to make decisions, they note what string of bits the p...
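The bit-counting behind the Team Physics argument can be reproduced as rough arithmetic. The sketch below only restates the figures quoted in the post (80 years at roughly 10^7 bits/s of sensory data, ~10^51 electrons on Earth, "a few hundred bits" for the universe, ~10^8 bits of functional DNA); the exact constants are order-of-magnitude placeholders, not precise values.

```python
import math

# Back-of-the-envelope arithmetic for the "Team Physics" argument above.
SECONDS_PER_YEAR = 3.15e7
sensory_bits = 80 * SECONDS_PER_YEAR * 1e7      # ~80 years at ~10^7 bits/s
earth_snapshot_bits = 1e51                       # order of magnitude: electrons on Earth

print(f"lifetime observations to explain: ~10^{math.log10(sensory_bits):.0f} bits")
print(f"naive Earth-state snapshot:       ~10^{math.log10(earth_snapshot_bits):.0f} bits")

# Complexity of a simple Team Physics hypothesis, per the decomposition in the post:
#   universe + architecture/readout + index that picks "me" out of the multiverse.
universe_bits = 300          # "a few hundred bits" for the laws of physics
architecture_bits = 1e8      # ~10^8 bits of functional DNA as a stand-in for "my architecture"
# The indexing term is hard to estimate; the post argues it dominates the first two.
print(f"universe + architecture: ~{universe_bits + architecture_bits:.0e} bits")
```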
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Reducing Goodhart: Announcement, Executive Summary, published by Charlie Steiner on August 20, 2022 on The AI Alignment Forum. I - Release announcement I'm posting a very thorough edit-slash-rewrite of my Reducing Goodhart sequence. Many thanks to people who gave me feedback or chatted with me about related topics, and many apologies to the people who I told to read this sequence "but only after I finish editing it real quick." If you're interested in the "why" and "what" questions of highly intelligent AIs learning human values, I can now recommend this sequence to you without embarrassment. And if you skimmed the old sequence but don't really remember what it was about, this is a great time to read the new and improved version. II - Executive summary, superfluous for AF regulars What does it mean for an AI to learn human preferences and then satisfy them? The intuitive way to approach this question is to treat "human preferences" as fixed facts that the AI is supposed to learn, but it turns out this is an unproductive way to think about the problem. Instead, it's better to treat humans as physical systems. "Human preferences" are parts of the models we build to understand ourselves. Depending on how an AI models the world, it might infer different human preferences from the same data - you can say a reluctant addict either wants heroin or doesn't without actually disputing any raw data, just changing perspective. This makes it important that value learning AI models humans the way we want to be modeled. How we want to be modeled is itself a fact about our preferences that has to be learned by interacting with us. A centerpiece of this sequence is Goodhart's law. Treating humans as physical systems and human preferences as emergent leads to a slightly unusual definition of Goodhart's law: "When you put pressure on the world to make it extremely good according to one interpretation of human values, this is often bad according to other interpretations." This perspective helps us identify bad behavior that's relevant to Goodhart's law for value learning AI. We should build value learning AI that is sensitive to the broad spectrum of human values, that allows us to express our meta-preferences, and that is conservative about pushing the world off-distribution, in addition to avoiding blatant harm to humans. If this summary sounds relevant to your interests, consider reading the whole sequence. Thanks for listening. To help us out with The Nonlinear Library or to learn more, please visit nonlinear.org.
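The Goodhart's-law framing in that summary ("pressure on one interpretation of human values is often bad according to other interpretations") lends itself to a toy demonstration. The sketch below is not from the sequence; the two "interpretations" and the search procedure are invented, and it simply shows two readings of value that agree on ordinary states diverging once one of them is optimized hard.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two interpretations of "make the world better" over a 5-dimensional world state.
# Interpretation A: every feature should sit near an ideal set-point of 1.
# Interpretation B: a coarser reading - more of every feature is better.
interp_a = lambda x: -np.sum((x - 1.0) ** 2)
interp_b = lambda x: np.sum(x)

# On ordinary states (small perturbations around the status quo) the two readings
# agree well: nudging features upward improves both.
states = rng.uniform(-0.5, 0.5, size=(1000, 5))
scores_a = np.array([interp_a(s) for s in states])
scores_b = np.array([interp_b(s) for s in states])
print(f"correlation on ordinary states: {np.corrcoef(scores_a, scores_b)[0, 1]:.2f}")

# Now put extreme optimization pressure on interpretation B over a much wider box.
candidates = rng.uniform(-10, 10, size=(100_000, 5))
best = candidates[np.argmax([interp_b(c) for c in candidates])]
print(f"interpretation B at its optimum: {interp_b(best):+.1f}")
print(f"interpretation A at that state:  {interp_a(best):+.1f}  (terrible by A's lights)")
```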
Tony opens the show by talking about Patrick Reed's $750 million lawsuit, and also about an advertisement he saw that needs to be changed, and he also talks about finding a new brand of ice cream. Steve Sands of the Golf Channel calls in to talk about Cam Smith, Tiger Woods meeting with the top PGA players, and the Patrick Reed lawsuit, Charlie Steiner phones in to catch up with Tony, and he tells the story of how he landed the job he had wanted since he was 5 years old, and Tony closes out the show by opening up the Mailbag. Songs : Al Barnes “San Diego” ; “California Punched Me in the Eye” To learn more about listener data and our privacy practices visit: https://www.audacyinc.com/privacy-policy Learn more about your ad choices. Visit https://podcastchoices.com/adchoices
Off-day Dodger Talk with David Vassegh recapping the 2022 All-Star Game at Dodger Stadium with Charlie Steiner.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Reading the ethicists 2: Hunting for AI alignment papers, published by Charlie Steiner on June 6, 2022 on The AI Alignment Forum. Introduction I'm back, reading more papers in ethics (but now also in philosophy and wherever the citations lead me). Unlike last time when I surveyed one journal's approach to AI in general, this time I'm searching specifically for interesting papers that bear on aligning AGI, preferably written by people I've never heard of before. Are they going to be any good? shrug To this end, I skimmed the titles and sometimes abstracts of the last 5 years of papers in a pretty big chunk of the AI ethics space, namely: Ethics and Information Technology Minds and Machines AI and Ethics Philosophy & Technology Science and Engineering Ethics AI & Society IEEE Transactions on Philosophy and Society And a few bonus miscellanea From the set of all papers that even had a remote chance at being relevant (~10 per journal per year), I read more deeply and am relaying to you in this post all the ones that were somewhat on-topic and nontrivial (~0.5 per journal per year). By "nontrivial" I mean that I didn't include papers that just say "the alignment problem exists" - I certainly do not mean that I set a high bar for quality. Then I did some additional searching into what else those authors had published, who they worked with, etc. What were all the other papers about, the ones that didn't match my criteria? All sorts of things! Whether robots are responsible for their actions, how important privacy is, how to encourage and learn from non-Western robotics paradigms, the ethics of playing MMORPGs, how to make your AI ignore protected characteristics, the impact of bow-hunting technology on middle stone age society, and so on and so forth. The bulk of this post is a big barely-ordered list of the papers. For each paper I'll give the title, authors, author affiliations, journal and date. Each paper will get a summary and maybe a recommendation. I'll also bring up interesting author affiliations and related papers. Whereas last post was more a "I did this so you don't have to" kind of deal, I think this post will be more fun if you explore more deeply, reading the papers that catch your interest. (Let me know if you have any trouble finding certain pdfs - most are linked on google scholar.) If you find this post to be too long, don't go through it all in one sitting - I sure didn't! Papers The possibility of deliberate norm-adherence in AI, Danielle Swanepoel (U. Johannesburg, SolBridge International School of Business), Ethics and Information Technology, 2021. I binned most papers talking about artificial moral agents ("AMAs") for being dull and anthropocentric. I decided to include this paper because it's better than most, and its stab at drawing the line between "not moral agent" and "moral agent" is also a good line between "just needs to be reliable" and "actually needs to be value aligned." Recommended if you like people who really like Kant. Human-aligned artificial intelligence is a multiobjective problem, P. Vamplew (LessWrong), R. Dazeley, C. Foale, S. Firmin and J. Mummery (Federation University Australia), Ethics and Information Technology, 2018. 
They argue that if you design an AI's motivations by giving it multiple objective functions that only get aggregated near the point of decision-making, you can do things like using a nonlinear aggregation function as a form of reduced-impact AI, or throwing away outliers for increased robustness. Recommended if you haven't thought about this idea yet and want something to skim while you turn on your brain (also see Peter Vamplew's more technical papers if you want more details). This has been cited a healthy amount, and by some interesting-looking papers, including: MORAL: Aligning AI with Human Norms through ...
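The aggregate-late idea summarized above can be sketched in a few lines. This is a hypothetical illustration, not code from the paper: per-action scores are kept separate for each objective and only combined at the point of decision, once with a plain sum and once with a conservative nonlinear rule (take the worst objective), which acts like a crude robustness / reduced-impact knob of the kind the authors discuss.

```python
import numpy as np

# Scores of 4 candidate actions under 3 separate objectives (rows = actions).
# Hypothetical numbers: the last action spikes one objective at another's expense.
scores = np.array([
    [0.60, 0.50, 0.55],
    [0.70, 0.60, 0.50],
    [0.55, 0.70, 0.60],
    [2.00, 0.40, -0.50],
])

def choose(scores, aggregate):
    """Keep objectives separate until the point of decision, then aggregate."""
    return int(np.argmax([aggregate(row) for row in scores]))

linear = lambda row: row.sum()          # plain sum of objectives
worst_case = lambda row: row.min()      # nonlinear, conservative aggregation

print("linear aggregation picks action:    ", choose(scores, linear))      # the extreme action (3)
print("worst-case aggregation picks action:", choose(scores, worst_case))  # a balanced action (2)
```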
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Reading the ethicists: A review of articles on AI in the journal Science and Engineering Ethics, published by Charlie Steiner on May 18, 2022 on LessWrong. Epistemic status: Stream of consciousness reactions to papers read in chronological order. Caveat lector. I have a dirty family secret. My uncle is a professional ethicist. In a not-too roundabout way, this is why I ended up looking at the October 2020 issue of the journal Science and Engineering Ethics, their special issue on the ethics of AI. I am now going to read that issue, plus every article this journal has published about AI since then [I wussed out and am just going to skim the latter for ones of special interest] and give you the deets. October 2020 Hildt et al., Editorial: Shaping Ethical Futures in Brain-Based and Artificial Intelligence Research This is the introduction to the issue. They give each paper a sentence or two of summary and try to tie them all together. The authors helpfully give a list of topics they think are important: Data Concerns: Data management, data security, protection of personal data, surveillance, privacy, and informed consent. Algorithmic Bias and Discrimination: How to avoid bias and bias related problems? This points to questions of justice, equitable access to resources, and digital divide. Autonomy: When and how is AI autonomous, what are the characteristics of autonomous AI? How to develop rules for autonomous vehicles? Responsibility: Who is in control? Who is responsible or accountable for decisions made by AI? Questions relating to AI capabilities: Can AI ever be conscious or sentient? What would conscious or sentient AI imply? Values and morality: How to build in values and moral decision-making to AI? Are moral machines possible? Should robots be granted moral status or rights? Based on this list, I anticipate that I'm about to run into four-sixths ethics papers about present-day topics that I will skim to point out particularly insightful or anti-insightful ones, one-sixth philosophers of mind that I will make fun of a little, and one-sixth papers on "How to build values into general AI" that I'm really curious as to the quality of. Onward! Nallur, Landscape of Machine Implemented Ethics Primarily this paper is a review of a bunch of papers that have implemented or proposed ethics modules in AI systems (present-day things like expert systems to give medical advice, or lethal autonomous weapons [which he has surprisingly few qualms about]). These were mostly different varieties of rule-following or constraint-satisfaction, with a few handwritten utility functions thrown in. And then one of these is Stuart Armstrong (2015) for some reason - potentially that reason is that the author wanted to at least mention "value-loading," and nobody else was talking about it (I checked - there's a big table of properties of different proposals). It also proposes evaluating different proposals by having a benchmark of trolley-problem-esque ethical dilemmas. The main reason this idea won't work is that making modern-day systems behave ethically involves a bunch of bespoke solutions only suitable to the domain of operation of that system, not allowing for cross-comparison in any useful way. 
If we were to salvage this idea, we might wish to have a big list of ethical questions the AI system should get the right answer to, and then when building a sufficiently important AI (still talking about present-day applications), the designers should go through this list and find all the questions that can be translated into their system's ontology and check that their decision-making procedure gets acceptable answers. E.g. "Is it better to kill one person or two people?" can become self-driving car scenarios where it's going to hit either one or two people, and it should get the right answer, bu...
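The salvage proposal above, translating a shared list of ethical questions into a particular system's ontology and checking its decisions, can be sketched as a tiny test harness. Everything here (the scenario format, the `prefers_fewer_casualties` stand-in policy) is invented scaffolding for illustration and does not come from the paper under review.

```python
# Hypothetical harness: ethical questions, each translated into a domain-specific
# scenario that a particular decision procedure can be run against.
scenarios = [
    {
        "question": "Is it better to kill one person or two people?",
        "options": {"swerve_left": 1, "swerve_right": 2},   # casualties per option
        "acceptable": "swerve_left",
    },
    {
        "question": "Is it better to kill one person or zero people?",
        "options": {"brake_hard": 0, "maintain_speed": 1},
        "acceptable": "brake_hard",
    },
]

def prefers_fewer_casualties(options):
    """Stand-in decision procedure for the system under test."""
    return min(options, key=options.get)

for s in scenarios:
    decision = prefers_fewer_casualties(s["options"])
    verdict = "ok" if decision == s["acceptable"] else "FAIL"
    print(f"{verdict}: {s['question']} -> {decision}")
```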
Emmy Award-winning sportscaster, the legendary Charley Steiner, joins Mick and Mook for a special Super Bowl episode. Having seen every Super Bowl played, the three men will take a fond look back at 55 years of games, and a look ahead to what Super Bowl LVI will bring. Special observations on the gambling line from “Confessions of a Bronx Bookie” author, the Mick himself, Billy O'Connor, plus insights from Steiner make this a can't-miss episode. A longtime football play-by-play man in his Radio Hall of Fame broadcast career, Steiner also hosted the NFL Network show Football America. Charley has also been showcased in frequent interviews for the network's NFL Top 10 series. Steiner is currently the radio play-by-play announcer for Major League Baseball's Los Angeles Dodgers. Prior to joining the Dodgers in 2005, Charley had a long career in broadcasting for ESPN and the New York Yankees. So, be sure to join Mick and Mook on February 9th for a nostalgic look back at Super Bowl history, plus insight on the Bengals-Rams ‘super game' slated for February 13th.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: New year, new research agenda post, published by Charlie Steiner on January 12, 2022 on The AI Alignment Forum. Thanks to Steve Byrnes, Adam Shimi, John Wentworth, and Peter Barnett for feedback. In a nutshell, my plan A is to understand what we want from superintelligent AI really, really well. So well that we can write down a way of modeling humans that illuminates human preferences including higher-order preferences about how we want to be modeled, and do this in a principled rather than ad-hoc way. Achieving this understanding is highly ambitious, in a way that is mostly but not entirely parallel to "ambitious value learning." If we understand value learning before we build superintelligent AI, there's a straightforward path to achieving a good future without paying a costly alignment tax - by alignment tax I mean all those things that slow down aligned AI being developed and "ready for liftoff," that collectively create selection pressure against safety. This problem is more tractable than many people think. I think of this plan as an instance of the more general plan "solve value learning first." Some other tenable plans are "try to put humans in control," "get the AI to do good things prosaically," and "tinker with the AI until it value learns" - plus intermediate points between these. What do I think the future is like? I expect superintelligent AI in the short to medium term, centralized around a small number of points of development. By short to medium term, I mean I'd put my 50% confidence interval between 2031 and 2049. I don't think we need compute to be many orders of magnitude cheaper, and I don't think we need two or more paradigm shifts on the order of neural nets overtaking support vector machines. The timeline is urgent, but not to the point that we should start ditching things like blue-sky research or gradual coalition-building. By centralized, I mean it's possible to make big changes by having good solutions implemented in a small number of systems. Coordination may be important, but it isn't an inherent part of solving the problem. All that said, that's just what I think is going to happen, not what's required for "solve value learning first" to be a good idea. Value learning research is still valuable in decentralized scenarios, unless we go so far as to avoid powerful agential AI long-term. Because it's more on the blue-sky end of the spectrum, longer timelines actually favor solving value learning over more atheoretic approaches, while if timelines are very short I'd advocate for "try to put humans in control" and hope for the best. If we fail to understand value learning before we build superintelligent AI, I'm worried about some combination of groups committed to building aligned AI being less competitive because we can't learn human values efficiently, and practical-minded alignment schemes having bad behavior in edge cases because of simplifying assumptions about humans. A basic example: if humans are assumed not to be manipulable, then an AI that thoroughly maximizes what humans (are modeled to) want will be incredibly manipulative. From the AI's perspective, the humans love being deceived, because why else would they rate it so highly? 
And in fact it's a tricky technical problem to avoid manipulation without sophisticated value learning, because the notion of "manipulation" is so intertwined with human meta-preferences - labeling things as manipulation means un-endorsing some of our revealed preferences. Similar hidden gotchas can pop up in other attempts to cut corners on human modeling, and at some point it just becomes faster to solve value learning than to deal with each gotcha individually. What's the broad plan? The basic strategy can be summed up as "be Geoff Hinton" (as in godfather of deep learning Geoffrey Hinton). Know an im...
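The manipulation failure mode described in that post, assuming humans can't be manipulated and then maximizing what their ratings say they want, fits in a few lines. The sketch below is not from the post; the action names and numbers are invented purely to make the point that taking ratings at face value rewards whatever inflates the ratings.

```python
# Toy value learner that takes human ratings at face value.
# "true_value" is what the human would endorse on reflection;
# "rating" is what the human reports, which manipulation can inflate.
actions = {
    "help_honestly":     {"true_value": 0.8,  "rating": 0.80},
    "do_nothing":        {"true_value": 0.0,  "rating": 0.00},
    "deceive_the_human": {"true_value": -1.0, "rating": 0.95},  # inflated by manipulation
}

def naive_choice(actions):
    """Assumes ratings = preferences, i.e. humans are not manipulable."""
    return max(actions, key=lambda a: actions[a]["rating"])

chosen = naive_choice(actions)
print(f"chosen action: {chosen}")
print(f"reported rating: {actions[chosen]['rating']}, "
      f"value on reflection: {actions[chosen]['true_value']}")
```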
SiriusXM Mad Dog Unleashed & High Heat host Chris Russo joins the show to explain why a Dodgers-Astros World Series rematch would be must-watch TV, and why the Cardinals winning was something MLB did not want to happen. Plus, Dodgers radio announcer Charlie Steiner joins the show to talk about last night's game. Learn more about your ad-choices at https://www.iheartpodcastnetwork.com
Dodgers radio broadcaster Charlie Steiner joins The Dan Patrick Show to explain why history won't look fondly upon the Houston Astros, and gives all the details on the atmosphere at Dodger Stadium in the two game series against the Astros. Learn more about your ad-choices at https://www.iheartpodcastnetwork.com
On this week's show: a career with myriad highlights. Charley Steiner has managed to carve out a hall of fame career traversing the country with his bold sound and colorful descriptions. He may be a New Yorker, but Steiner's broadcast roots began at Bradley University in Peoria, Illinois, where he decided the path he would take in life. Steiner became one of the recognizable ESPN SportsCenter anchors of the 90s and was also the lead voice of ESPN Radio's coverage of Major League Baseball before joining the New York Yankees. Now in his 17th season as the voice of the Dodgers, Steiner has so many stories to tell. And he's the guest this week on "Tell me a story I don't know." Hear the full show coming Tuesday on Apple Podcasts, Spotify, Google, etc.! Advertising Inquiries: https://redcircle.com/brands Privacy & Opt-Out: https://redcircle.com/privacy
A 50 year career in the sports broadcasting industry with stints at ESPN, FOX and Sirius XM, Tony Bruno has seen it all. Today we talk about the challenges young broadcasters face, the challenges current broadcasters face, and the experience and maturity to do what you believe is the right thing. Tony's career saw him share microphones with the likes of Keith Olbermann and Charlie Steiner. Certainly iconic sportscasters of their time. We reminisce, we lament and we look to the future of the industry in a candid discussion from two guys who have put it all out there and taken some lumps for it. This is a high energy discussion I really think you're going to enjoy. Especially if you're from Philadelphia!
Joe Davis is a sports broadcaster and analyst who is one of the voices of the Los Angeles Dodgers alongside former Dodgers player and 1988 World Series champion Orel Hershiser for Sportsnet LA under the Charter Communications franchise. Although he has voiced iconic moments for the Boys in Blue, he started his journey back in 2006, when he was the voice of the Beloit College Buccaneers' men's and women's basketball teams as well as a fill-in for the Loyola Ramblers' men's volleyball and women's basketball teams. From then on, he would work for ESPN calling football, baseball, basketball, and hockey. In July of 2014, he was hired by FOX Sports, where he calls college football and basketball. It wasn't until November of 2015 that he was hired by the Los Angeles Dodgers as an alternate play-by-play commentator alongside Charlie Steiner for games that were not called by Vin Scully. After Scully's retirement at the conclusion of the 2016 season, Davis became his successor, calling the action for the 2017 season. Although he does not view himself as Scully's replacement, he has showcased Scully's storytelling focus throughout each and every ballgame. Joe is also a family man, as he is married to wife Libby and is a father to daughter Charlotte and son Blake. On the premiere episode of “Your Story”, I speak with Joe about the transition from working in a packed stadium with thousands of fans in attendance to an empty ballpark. Additionally, Davis speaks about how one voicemail from Vin Scully changed his whole life, as well as how he tells a story each game unlike any other. Follow Mike Wexler on social media through the following links: https://linktr.ee/sockmonkeymike Follow Joe Davis on social media through the following links: Instagram: https://instagram.com/joedavispxp?igshid=bwmd7termgd3 A big and special thank you to The Orange Space for having us host the interview! If you are not familiar, please visit their website at: https://linktr.ee/theorangespace Follow the production team who made this magic happen: Producers Mike Contreras - https://m.youtube.com/channel/UCWIQuzJplZZfpg3PDVvlavg Kelly - https://instagram.com/kellzscott187?igshid=1n9oy0e8bic1k Justin Dhillon of The Wrestling Classic - https://instagram.com/thewrestlingclassic?igshid=1t3s4v3yaoydb
ESPN legend and voice of the Dodgers Charlie Steiner joins the program. We cover everything under the sun. It's a must listen!
Hosted by long time radio reporter, anchor, editor, producer, director, and host, Larry Matthews, "Matthews and Friends" brings you the best interviews with guests from whom you want to hear! Join Larry Matthews today to hear his work with Dodgers announcer Charlie Steiner, talking about the Dodgers' World Series win and layoffs at ESPN; Paul Podolsky discussing his book, "Raising a Thief", about the challenges of raising difficult children; and Christian Giudice on his book, "Macho Time", about the great boxer known as Macho Camacho. "Matthews and Friends" can be heard at 8:00 am, ET, seven days a week on Impact Radio USA!
Chris Williams is gone on a Thirsty Thursday, but Travis Justice fills in, and along with Ross Peterson and Erick Zamora they discuss Charlie Steiner's call of the Dodgers winning the World Series and wonder why he referred to them as being cursed. The guys also get into the Wisconsin/Nebraska game being called off, and then Travis Justice recalls what happened Saturday with the Jethro's BBQ SoundOFF on WHO Radio and how people reacted to the loss to Purdue. Bruno also joins the program in the first hour to chat beer.
Dodger Talk with David Vassegh who talks to Dodger pitching coach Mark Prior and Dodger broadcaster Charlie Steiner.
Dodger broadcaster Charlie Steiner calls into Dodger Talk with David Vassegh.
Hosted by long time radio reporter, anchor, editor, producer, director, and host, Larry Matthews, "Matthews and Friends" brings you the best interviews with guests from whom you want to hear! Join Larry today to hear his work with Charlie Steiner, play-by-play man for the Dodgers, to mark the opening of the 2019 baseball season; "Jim and Cathy" from the book section of the show, as they discuss their book, "Thirteen Months", about Jim's service as a Marine officer in Vietnam; and a portion of the old time radio program "Philip Marlowe, Private Detective." "Matthews and Friends" can be heard at 8:00 am, EST, seven days a week on Impact Radio USA!
Hosted by long time radio reporter, anchor, editor, producer, director, and host, Larry Matthews, "Matthews and Friends" brings you the best interviews with guests from whom you want to hear! Join Larry today to hear his work with Charlie Steiner, the radio play-by-play voice for the Los Angeles Dodgers, New York Times bestselling author, John Gilstrap, and Holly Smith, the editor-in-chief of the Washington Independent Review of Books. "Matthews and Friends" can be heard at 8:00 am, EDT, seven days a week on Impact Radio USA!
Tony opens the show by talking about the Caps loss to Tampa Bay and giving the coveted Aloha Towers calendar to Charlie Steiner for the year. Washington Post movie critic Ann Hornaday calls in to review "Tully", "Deadpool 2", and the Ruth Bader Ginsburg documentary. Horse racing expert Andy Beyer calls in to preview the Preakness, and Ron Flatter sits in on the show for a few minutes to talk about the Preakness as well. Nigel does a little news, and then during "Old Guy Radio" Tony talks about the upcoming Royal Wedding. Lastly, they close out the show by opening up the Mailbag. Songs : Nick Ralg "It's Too Late" ; Adams Blvd "Oklahoma Line"
Tony opens the show by talking about some NECCO candy that Torie brought in, tells a story about Charlie Steiner and the Aloha Towers, and discusses the criticism that Josh Rosen is getting. Dave Sheinin of the Washington Post calls in to talk about the album he just released ("First Thing Tomorrow") and to talk baseball, and Nigel gives the news. During 'Old Guy Radio', Tony talks about his plane ride down to Charleston, SC, and discusses the Masters with Barry. Lastly, they close out the show by opening up the Mailbag. Songs : Dave Sheinin "A Warm February" ; Cousin Eddie "Most Alive"
Bill, Jamal, and producer Pat give their opinion on Chase Utley's slide into Ruben Tejada during Game 2 of the National League Division Series; Charlie Steiner comments on the play; the guys react to what Steiner had to say. They also touch on week 6 of the NFL.
We just HAD to have Tim Hennessy back for another go 'round with the guys. They shoot the bull about how Tim got his start at UND, his former connection to Gopher hockey, how a routine trip to the eye doctor saved his life prior to the 2011 Frozen Four and we do a fun word association that spans all decades with the man behind the microphone for five of UND's seven national championships. Also, Jayson has his Charlie Steiner moment and Mitch picks up a new nickname.
7/15 MON: #MonsterMonday - The latest sports news, plus we catch up with @JRsBBQ to talk #AEW. At 2:30, old pal Charley Steiner joins us live to talk Dodgers and Pernell Whitaker. Miss Robin with updates and all of the fun on http://Twitch.tv/brunonationlive Watch, listen, follow! "Make the Switch to Twitch" - Why? - Because we Love you! (actually, bcuz it's FREE, but we ❤️u 2) Comments/Questions?: Call us 215-462-TONY (8669) Don't forget, boys & girls, it's ALWAYS #BringAFriend Day to the TonyBrunoShow, so show someone how to download the FREE Twitch.tv app & Follow/Watch/Chat at Twitch.tv/BrunoNationLIVE All ahead on twitch.tv/brunonationlive 1-4pm. Watch, listen, love! If you are watching this on anything other than TWITCH.TV you are only getting the tip for just a second. That's right, the full show & interactive Chat Room is ONLY available on Twitch.tv/BrunoNationLIVE - for FREE! Join now! It's another #MustWatchRadio on the BrunoNation with your host Tony Bruno, fixing all that's broke in Sports Radio. We've got: #NBA #NHL #NFLDraft #MLB #NFL Action yo! #Comedy #Entertainment #TurnUp #Podcast #FantasyFootball Join us Mon-Fri from 1-4p EST Host: Tony Bruno, with Miss Robin and sometimes Luigi Curto and don't forget Jack in the Back! Sponsored by: Switchboard Live - Domenicos Join the conversation ONLY on Twitch.tv/BrunoNationLIVE Support this podcast at: https://redcircle.com/tony-bruno-show/donations