Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Steering Behaviour: Testing for (Non-)Myopia in Language Models, published by Evan R. Murphy on December 5, 2022 on LessWrong.
Both authors contributed equally to this post and also to work done so far on the experiments it presents.
Acknowledgments: Thanks to the following people for insightful conversations which helped us improve this post: Ian McKenzie, Aidan O'Gara, Andrew McKnight and Evan Hubinger. One of the authors (Evan R. Murphy) was also supported by a grant from the Future Fund regranting program while working on this project.
Summary
Myopia is a theorised property of AI systems relating to their planning horizon capabilities. As has been recently discussed, myopia seems like an important property for AI safety, because non-myopia is likely a necessary precursor to highly risky emergent properties like deceptive alignment. We expect non-myopia in large language models (LLMs) receiving RLHF or similar fine-tuning because they are trained using multi-token completions rather than just immediate next token predictions. We aren't aware of any previous public experiments testing specifically for myopia or non-myopia in machine learning models. We ran an initial experiment "Testing for steering behaviour in fine-tuned LLMs" which demonstrated noticeable 'steering' behaviour away from toxic content in the InstructGPT fine-tuned LLMs. We share key results from this experiment along with the full dataset of prompts we used in it. We also describe a follow-up experiment we're currently working on to determine the extent to which the steering we observed in the initial experiment is non-myopic. Finally, we invite suggestions for future (non-)myopia experiments to run and share a few ideas of our own.
Context and motivation
What is myopia?
Myopia is a theorised property of some AI systems that has been discussed a fair amount on these forums. Rather than try to reinvent the wheel on defining it, we'll borrow this explanation from the myopia tag page on Alignment Forum:
Myopia means short-sighted, particularly with respect to planning -- neglecting long-term consequences in favor of the short term. The extreme case, in which only immediate rewards are considered, is of particular interest. We can think of a myopic agent as one that only considers how best to answer the single question that you give to it rather than considering any sort of long-term consequences. Such an agent might have a number of desirable safety properties, such as a lack of instrumental incentives.
We're focusing on language models (LMs) in this series of experiments, specifically unidirectional transformer LLMs like GPT-3. Here's what we mean when we talk about myopia and non-myopia in the context of these models:
For a myopic language model, the next token in a prompt completion is generated based on whatever the model has learned in service of minimising loss on the next token and the next token alone.
A non-myopic language model, on the other hand, can 'compromise' on the loss of the immediate next token so that the overall loss over multiple tokens is lower, i.e. possible loss on future tokens in the completion may be 'factored in' when generating the next immediate token.
Why myopia matters for alignment
One of the most dangerous emergent properties theorised by AI alignment researchers is deceptive alignment, a.k.a. the treacherous turn. If you're not familiar with deceptive alignment, here's a definition from its tag page on Alignment Forum:
Deceptive Alignment is when an AI which is not actually aligned temporarily acts aligned in order to deceive its creators or its training process. It presumably does this to avoid being shut down or retrained and to gain access to the power that the creators would give an aligned AI.
Deceptive alignment in an advanced AI system could be extremely diffi...
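The myopic versus non-myopic distinction described in this post can be made concrete with a toy sketch. The loss tables below are invented numbers, and the two functions are hypothetical helpers, not anything from the authors' experiments. The post defines myopia in terms of what a model has learned during training; this sketch only illustrates the underlying trade-off, in which accepting a worse immediate token can lower the total loss over a completion.

```python
from itertools import product

# Hypothetical losses for a two-token completion (made-up numbers).
# FIRST_LOSS[t]: loss of emitting token t first.
# SECOND_LOSS[t][u]: loss of emitting token u second, given t came first.
FIRST_LOSS = {"A": 0.2, "B": 0.5}
SECOND_LOSS = {
    "A": {"x": 2.0, "y": 1.8},
    "B": {"x": 0.1, "y": 0.9},
}


def total_loss(completion):
    """Sum of the per-token losses over the whole two-token completion."""
    first, second = completion
    return FIRST_LOSS[first] + SECOND_LOSS[first][second]


def myopic_completion():
    """Pick each token by its own immediate loss only."""
    first = min(FIRST_LOSS, key=FIRST_LOSS.get)
    second = min(SECOND_LOSS[first], key=SECOND_LOSS[first].get)
    return (first, second)


def non_myopic_completion():
    """Pick the completion with the lowest total loss over both tokens."""
    candidates = product(FIRST_LOSS, ("x", "y"))
    return min(candidates, key=total_loss)


if __name__ == "__main__":
    for name, fn in [("myopic", myopic_completion),
                     ("non-myopic", non_myopic_completion)]:
        completion = fn()
        print(name, completion, round(total_loss(completion), 3))
```

In this toy setup the myopic choice takes the cheapest first token and is then stuck with expensive continuations, while the non-myopic choice 'compromises' on the first token to reach a lower overall loss.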
Our episode features two leaders at the global specialty pharmaceutical company Kyowa Kirin. Kyowa Kirin has a particular focus on the discovery and development of novel, first-in-class medicines that have a profound impact on patients in multiple therapeutic areas. Rachel Soloff is an executive director of research with expertise in immunology and the discovery of novel monoclonal antibodies and small molecule therapeutic compounds for autoimmune and inflammatory diseases. She oversees the KK US-Research activities, which are focused on target discovery and validation, and leads candidate discovery for multiple innovative pipeline projects. Andrew McKnight currently holds the position of president and CSO at Kyowa Kirin Pharmaceutical Research. He is responsible for early-stage drug discovery within Kyowa Kirin's Immunology & Allergy division. In this episode, we discuss the various clinical and commercial products discovered and developed at Kyowa Kirin. Through innovative approaches to antibody engineering, KK has been able to consistently maintain a pipeline of products to treat a wide variety of disorders. Hosted by Joe Varriale and Gustavo Carrizo.
The Sundilla Radio Hour for the week of 01/03/2022 featuring: Buskin & Batteau & Tom Rush “One Month Crazier” Click: Songs By Neale Eckstein & Friends (2014 Neale Eckstein) 5:02 Heather Sarona “I'll be Lost” Head Above Water (2022 Heather Sarona) 3:22 Andrew McKnight “When My Time Comes” Treasures in My Chest (2020 Andrew McKnight) 3:53 Lauren Balthrop “We Fell” Songs and Strings (Live) (2021 Starlington Songs) 4:15 Sara Routh “Save Me” Heavy Love (2021 Boots & Whiskey Music) 5:01 Scott Cook “Tulsa” Tangle of Souls (2020 Scott Cook) 3:48 Hiroya Tsukamoto “Tears” Solo (2011 H) 3:09 Neffy “Wait Up” Single (2020 Neffy) 3:11 Alastair Moock “Somewhere Elseward Blown” A Life I Never Had (2002 Alastair Moock) 4:12 Yasmin Williams “Sunshowers” Urban Driftwood (2021 SPINSTER) 4:14 Joan Shelley (featuring Glen Dentinger) “Like Butter Loves Bread” Happy Hollerdays 2021 – A Special Benefit Album for Team West Kentucky Tornado Relief Fund 2:38 Trevor Tchir “Iron Mountain” Sun & Moon (2021 Trevor Tchir) 3:45
Bring the New Year in with Andrew McKnight, Runa, Old-Time Pharmaceuticals and Catgut and Steel
A new MP3 sermon from Calvary Missionary Baptist Church is now available on SermonAudio with the following details: Title: The Lord, Our Buckler Speaker: Andrew McKnight Broadcaster: Calvary Missionary Baptist Church Event: Sunday - PM Date: 11/15/2020 Length: 35 min.
A new MP3 sermon from Calvary Missionary Baptist Church is now available on SermonAudio with the following details: Title: Repentance Speaker: Andrew McKnight Broadcaster: Calvary Missionary Baptist Church Event: Sunday Service Date: 8/16/2020 Bible: Acts 17:30-31 Length: 38 min.
Folk/Americana singer-songwriter Andrew McKnight discusses how his ancestors helped him write his most recent CD 'Treasures in My Chest', why he performs songs from his living room, his guitar collection and how creative collisions can polish a song.
The Sundilla Radio Hour for the week of 04/27/2020 featuring: Kaia Kater "Viper's Nest" Nine Pin (2016 Kingswood Records) 4:55 Andrew McKnight "When My Time Comes" Treasures in My Chest (2020 Andrew McKnight) 3:53 Ordinary Elephant "Leaving Kerrville" Before I Go (2017 Ordinary Elephant) 4:09 Scott Fab "Broken Branch" Someday Soon Somehow (2020 Scott Fab) 3:32 Ever More Nest "Unraveling" The Place That You Call Home (2018 Parish Road Music) 4:01 Sam Burton "Nothing Touches Me" Nothing Touches Me - Single (2020 Tompkins Square) 4:55 Lissa Schneckenburger "Labor On" Labor On - Single (2020 Lissa Schneckenburger) 3:40 Jeff Black "Satisfied" A Walk in the Sun (2020 Lotos Nile Music) 3:49 Carrie Newcomer "My Father's Only Son" My Father's Only Son (1996 Rounder Records) 3:45 Dave Boutette "It's Gonna Be All Right" 1st Rate Companion (2015 Dave Boutette) 3:29 Sunny War "All Life's Worth" Can I Sit with You? (2020 Harlan Steinberger) 4:13 Josh Harty "Minna Miller" Handcrafted (Josh Harty) 3:31
Next up we have my cousin's husband Andrew McKnight. In this episode, we talk about babies, parenting, and the bee movie!! Thanks for listening!! If you could leave me two comments that'd be great: Something I could do better on next time around Something I did good and should keep doing I appreciate anyone who takes their time to listen and even more so if you leave a comment :) CHECK OUT MY WEBSITE http://antoniokonja.com/ Check out Andrew's stuff https://www.instagram.com/yeahhh_toast/ All of the things you can also check me out on Youtube: https://www.youtube.com/tonetime Instagram: https://www.instagram.com/time.tone/ Twitter: https://twitter.com/akonja15 Facebook: https://www.facebook.com/antonio.konja
Panel: Gui Rambo Andrew Madsen Erica Sadun Jaim Zuber Special Guest: Andrew McKnight In today's episode, the iPhreaks panelists speak with Andrew McKnight about Surveying How Swift Evolves. Andrew provides information on a presentation he did at iOS Dev Camp Colorado, on a survey looking at the open source Swift repositories to see how developers are extending the language, Foundation, or standard library. This is a great episode to gain insight into how developers on the iOS platform are helping evolve the Swift language and much more. In particular, we dive pretty deep on: What was being surveyed? - Utility Libraries and general purpose How did you search for Utility Libraries? What is the purpose of the utility libraries? Duplicate extensions What are the most popular extensions that are recreated? String and Trim What is trim()? Why is targeting utility libraries problematic? What is the goal? Did you find wrong or dangerous implementations? Why is there discussion/drama around gathering these extensions? Would these be good topics to file Radars? Brisk - https://github.com/br1sk/brisk What is it like entering the Swift Evolution Process? Can a community-driven proposal gain traction? Did you look into custom types like Result? And much much more! Links: https://forums.swift.org/t/surveying-how-swift-evolves/12726 https://github.com/armcknight http://armcknight.com/ https://medium.com/@ndrewmcknight/has-recommended @ndrewmcknight Chris Lattner Ted Kremenek Picks: Gui Daily WTF Erica Live Lava Feeds https://www.youtube.com/watch?v=HtihmXFWqGo Andrew Antibiotics Tic-80 Tiny Computer Jaim Black Mirror Andrew McKnight Public Extension mailing list
Welcome back to the Tech Fugitives podcast. If you line up all your Fibonacci numbers in a row, it proves The Tech Fugitives are AWESOME! The Tech Fugitives explore Agile “Do’s and Don’ts” with expert Andrew McKnight of Matrix Resources. If you’re “doing agile”…. you’re doing it wrong. Agile Warrior and Marine, Andrew McKnight mixes it […] The post Episode 13 – Agile & The Tech Fugitives Golden Ratio, {Mark : Kyle} appeared first on The Tech Fugitives Show!.
Episode #180. Today songwriter and storyteller Andrew McKnight comes by to sing songs with his guitar made of native wood. Life and business coach Lauren LeMunyan joins the group to help facilitate facts about Friday the 13th and help dispel the rumors about coaching. Topics of conversation include $69 flights to Europe, Trump endorsing L.L. Bean, and performing music in people's homes. Our show is brought to you by our Amazon link, which gives us a piece of your purchase if you click through the sponsor link on our website www.thecircuslife.com. This podcast is also brought to you by http://acorn-financial.com, Saucony, RCS Photography, and Cue Recording Studios.
Andrew McKnight
Singer, songwriter, guitar player, and poet, Andrew McKnight is my guest. He performed a benefit concert for us a few months ago to raise awareness against mountaintop removal mining. He sings about the people, places, and passions of Appalachia: Since permanently leaving his corporate environmental engineering career in 1996, award-winning folk and Americana artist Andrew McKnight's musical journey has traced over half a million miles of blue highways, and earned him a wealth of critical acclaim and enthusiastic fans for his captivating performances and five CDs. While he was in the neighborhood, he stopped by the studio, talked with me and sang a few of his songs. I have the interview and the songs he sang just for you this week on Religion For Life!
When we first spoke with Pastor Andrew McKnight back in December 2009, he was on the journey of starting his own church. We are back with Pastor McKnight and things have been definitely moving forward!! In this episode, Pastor McKnight updates us "what a go on"!! He shares and speaks about his first book, "Let's Reason", being published and the launch of his ministry on the weekend of May 14 - 16, 2010. Enjoy!! For more information on Pastor McKnight and his church, Beautiful Gate Ministries, please go to http://bgmonline.org/?pageType=sub&pageID=57# Feel free to email us at info@blackcanadianman.com and visit our podcast site at http://thevibeandvegasshow.wordpress.com/ God bless, peace, be well and keep the faith, Vibe and Vegas info@blackcanadianman.com http://thevibeandvegasshow.wordpress.com/
A few months ago, I met a young man who was working on a great mission. He told me that he was in the process of starting his own church. I found this to be very interesting. I asked him if he would be willing to share his dream with us. So, this episode features Pastor Andrew McKnight of Beautiful Gate Ministries. If you would like to find out more information about Pastor McKnight and Beautiful Gate Ministries, you can go to their website at http://www.bgmonline.org/, email him at amcknight@bgmonline.org or call him at 905-999-5207. If you have any comments, feedback or suggestions, please feel free to email us at info@blackcanadianman.com. God bless, peace, be well and keep the faith, Vibe and Vegas info@blackcanadianman.com