" Am I really playing with Neil Patrick Harris?" In keeping with the theme of Box Two, Neil Patrick Harris is back for his second appearance on our podcast. And not only that, he also brought a friend. Jonathan Bayme, CEO of theory11, is a longtime collaborator and Neil's bestie. They first met at a magic show Jonathan created for the Nomad Hotel in New York. Neil and Jonathan quickly bonded over a shared passion for elevating the art of magic, a deep appreciation for fine craftsmanship, and creating experiences full of wonder and joy. Jonathan created his company, theory11, with the goal of crafting magic tricks and playing cards that not only worked flawlessly, but were also beautiful to look at. Jonathan worked with Neil to produce Box Two, which was created with a very similar aesthetic and vibe. They very candidly explained the thought process behind design decisions such as relying on an online chat interface, and how they achieved certain magical effects. Box Two is gorgeous, with hefty props that felt good and looked good. I especially enjoyed hearing about the playtesting and mechanical process behind producing such a complicated boxed game. Jonathan also shares his secret to networking. Whether it's about tracking down JJ Abrams, or writing weekly emails at twelve years old to David Copperfield's longtime executive producer, I really enjoyed his stories of perseverance when it comes to making connections. If you enjoyed this episode, make sure to check out our first interview with NPH in Season 2, Episode 1. Episode Sponsors Thank you to our sponsors: Weldwood Marketing, Buzzshot, COGS by Clockwork Dog, Hive: The Race to Save Time Puzzle Hunt, and Patreon supporters like you. Weldwood Marketing Maximize your online presence with Weldwood Marketing. It's a one-stop shop for digital marketing—specializing in web design, SEO, online ads, and best business practices. They can even manage all your integrations so you can track the customer journey from clicking on an ad to booking your game. Let Weldwood help unlock more money for your business. Special offer exclusively for REPOD listeners: Weldwood rarely offers discounts, but they did for us. REPOD listeners get 15% off Marketing Services for the first 3 months, PLUS $750 off escape room websites. Schedule your Discovery Call and mention REPOD in the notes! Visit weldwoodmarketing.com/repod to learn more about this exclusive offer. Buzzshot Buzzshot is Escape Room Software, Powering Business Growth, Player Marketing, and improving the Customer Experience. They offer an assortment of pre and post game features including robust waiver management, branded team photos, and streamlined review management for Yelp, TripAdvisor, Google Reviews, and Morty. Buzzshot now has integration with the other REPOD sponsors: Morty and COGS. Special Offer for REPOD Listeners: REPOD listeners get an extended 21-day free trial plus 20% off your first 3 months, with no set-up fees or hidden charges. Visit buzzshot.com/repod to learn more about this exclusive offer. COGS COGS by Clockwork Dog is an easy to use software/ hardware platform for running interactive events, including escape rooms, and other immersive experiences. They have plug & play hardware that seamlessly integrates with their software so you can create a show with lighting and sound cues without having to write a single line of code. Map all kinds of inputs to outputs by building up simple logic steps which determine what you want to happen and when. 
Special Offer for REPOD Listeners: REPOD listeners can get the COGS Starter Set for only $130 + free shipping to the USA. This bundle is usually valued at $257. You can learn more and purchase your Starter Set at cogs.show. Use code REPOD at checkout.

Hive: The Race to Save Time Puzzle Hunt

Based on the new Madders of Time series by bestselling author DL Orton. Registration opens May 1, 2025. Hive: The Race to Save Time runs from May 16-26, 2025, with over $1,000 in prizes and winners determined by sweepstakes. Purchase of the book (or free alternative method of entry registration) is required to participate, but the e-book will be available for just 99 cents while the hunt is running. Visit the Hive: The Race to Save Time website for more details. Preorder the book today!

Become a Patron Today!

Supporting us on Patreon helps to fund our work and pay our team, and it grants you access to an incredible library of bonus content, including: The REPOD Bonus Show, The Spoilers Club, and The Travelogue Series. Thank you to all of our ongoing supporters.
In this episode of The afikra Podcast, Professor Natalie Koch, the author of "Arid Empire: The Entangled Fates of Arizona and Arabia", helps us dive into the unexpected connections between the deserts of Arizona and the Arabian Peninsula, beginning with the story of Hi Jolly and the camel experiments of the mid-19th century. The discussion explores how these arid spaces serve as political and imperial tools, the role of white experts in influencing desert landscapes, and the intricate history of agricultural projects that link these seemingly distant regions. Chapters include the origins of Koch's interest in the subject, detailed histories of desert colonization, and the broader implications of these transnational connections.

00:00 Introduction to Desert Politics
01:20 The Unlikely Connection: Arizona and Saudi Arabia
02:53 The Story of Hi Jolly and the Camel Experiment
11:40 Geography and Its Modern Implications
14:45 The Political Significance of Deserts
18:38 Colonial and Imperial Narratives
22:14 The Role of White Experts in the Arabian Peninsula
24:17 Arizona's Colonial History
27:46 The Influence of Old World Desert Knowledge
30:49 Recruiting White Settlers to Arizona
31:41 The Role of Railroads and Pamphlets
32:56 Western Mythology and Camels in Films
34:41 California's Date Industry and Arabian Influence
36:43 The Short-Lived Camel Experiment
37:40 Global Connections of Deserts
43:42 Transnational Agricultural Projects
51:23 Controversies and Misappropriations
52:50 Recommended Readings and Resources

Natalie Koch is a political geographer working on the topics of geopolitics, nationalism, energy and environmental politics, science and technology studies, and sports geography. Empirically, her research focuses on the Arabian Peninsula, where she studies the many transnational ties that bind the Gulf countries, actors, and ideas to other parts of the world. She has published extensively in journals such as Political Geography, Geopolitics, and Society and Natural Resources, and she is the author of "Arid Empire: The Entangled Fates of Arizona and Arabia" and "The Geopolitics of Spectacle: Space, Synecdoche, and the New Capitals of Asia" (Cornell University Press, 2018), and co-editor of the Handbook on the Changing Geographies of the State: New Spaces of Geopolitics (Edward Elgar, 2020). She is currently a professor at Syracuse University in the Department of Geography and the Environment, Maxwell School of Citizenship & Public Affairs.

Find Koch's books
We do it on average ten times a week: lying. Sometimes a white lie ('No, I'm fine with staying home'), sometimes to keep personal information private ('I'm not on Instagram'). But what do we actually count as lying? What (other) reasons are there to lie? And what methods do we use to try to detect lies (and how well do they work)? In this episode we discuss what we know about lying on the basis of scientific research, from philosophical insights to neurological studies (and everything in between).

Presentation: Rolf Zwaan & Anita Eerland
Music: Rolf Zwaan

Sources:
Brennen, T., & Magnussen, S. (2023). Lie detection: What works? Current Directions in Psychological Science, 32(5), 395-401. https://doi.org/10.1177/09637214231173095
Reins, L. M., & Wiegmann, A. (2021). Is lying bound to commitment? Empirically investigating deceptive presuppositions, implicatures, and actions. Cognitive Science, 45: e12936. https://doi.org/10.1111/cogs.12936
https://www.psychologytoday.com/us/blog/finding-a-new-home/202207/research-reveals-the-most-common-reasons-people-lie
https://mindlabneuroscience.com/lying-exploring-the-neuroscience-behind-it/

In this episode we refer to the following earlier episodes: Bullshit (7), Understanding Donald Trump (20), There's Something Between Your Teeth (75), and Understanding Limburgers (114). A complete overview of all the podcast's themes and the corresponding episodes can be found here. Hosted on Acast. See acast.com/privacy for more information.
LLM-based coding-assistance tools have been out for ~2 years now. Many developers have been reporting that this is dramatically increasing their productivity, up to 5x'ing/10x'ing it. It seems clear that this multiplier isn't field-wide, at least. There's no corresponding increase in output, after all. This would make sense. If you're doing anything nontrivial (i.e., anything other than adding minor boilerplate features to your codebase), LLM tools are fiddly. Out-of-the-box solutions don't Just Work for that purpose. You need to significantly adjust your workflow to make use of them, if that's even possible. Most programmers wouldn't know how to do that/wouldn't care to bother. It's therefore reasonable to assume that a 5x/10x greater output, if it exists, is unevenly distributed, mostly affecting power users/people particularly talented at using LLMs. Empirically, we likewise don't seem to be living in the world where the whole software industry is suddenly 5-10 times [...]

The original text contained 1 footnote which was omitted from this narration.

---

First published: March 4th, 2025
Source: https://www.lesswrong.com/posts/tqmQTezvXGFmfSe7f/how-much-are-llms-actually-boosting-real-world-programmer

---

Narrated by TYPE III AUDIO.
In this episode, I provide a straightforward argument that shows the impossibility of continuationism from exegetical, empirical, and sufficiency-of-Scripture perspectives.
Do we need political solutions or spiritual solutions? In the wake of the Dicastery for the Doctrine of the Faith's recent document on Medjugorje, a Friend of Medjugorje gives the one, singular, truthful thing you can count on in today's world - and it's Absent of Deceit.
Developing Asia has been the site of some of the last century's fastest growing economies as well as some of the world's most durable authoritarian regimes. Many accounts of rapid growth alongside monopolies on political power have focused on crony relationships between the state and business. But these relationships have not always been smooth, as anti-corruption campaigns, financial and banking crises, and dramatic bouts of liberalization and crackdown demonstrate. Why do partnerships between political and business elites fall apart over time? And why do some partnerships produce stable growth and others produce crisis or stagnation? In Precarious Ties: Business and the State in Authoritarian Asia (Oxford UP, 2023), Meg Rithmire offers a novel account of the relationships between business and political elites in three authoritarian regimes in developing Asia: Indonesia under Suharto's New Order, Malaysia under the Barisan Nasional, and China under the Chinese Communist Party. All three regimes enjoyed periods of high growth and supposed alliances between autocrats and capitalists. Over time, however, the relationships between capitalists and political elites changed, and economic outcomes diverged. While state-business ties in Indonesia and China created dangerous dynamics like capital flight, fraud, and financial crisis, Malaysia's state-business ties contributed to economic stagnation. To understand these developments, Rithmire, a professor at Harvard Business School, presents two conceptual models of state-business relations that explain their genesis and why variation occurs over time. She shows that mutual alignment occurs when an authoritarian regime organizes its institutions, or even its informal practices, to induce capitalists to invest in growth and development. Mutual endangerment, on the other hand, obtains when economic and political elites are entangled in corrupt dealings and invested in perpetuating each other's dominance. The loss of power on one side would bring about the demise of the other. Rithmire contends that the main factors explaining why one pattern dominates over the other are trust between business and political elites, determined during regime formation, and the dynamics of financial liberalization. Empirically rich and sweeping in scope, Precarious Ties offers lessons for all nations in which the state and the private sector are deeply entwined. Host Peter Lorentzen is an Associate Professor in the Department of Economics at the University of San Francisco. His research examines the political economy of governance and development in China. Learn more about your ad choices. Visit megaphone.fm/adchoices Support our show by becoming a premium member! https://newbooksnetwork.supportingcast.fm/new-books-network
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Truth is Universal: Robust Detection of Lies in LLMs, published by Lennart Buerger on July 19, 2024 on The AI Alignment Forum.

A short summary of the paper is presented below.

TL;DR: We develop a robust method to detect when an LLM is lying based on the internal model activations, making the following contributions: (i) We demonstrate the existence of a two-dimensional subspace, along which the activation vectors of true and false statements can be separated. Notably, this finding is universal and holds for various LLMs, including Gemma-7B, LLaMA2-13B and LLaMA3-8B. Our analysis explains the generalisation failures observed in previous studies and sets the stage for more robust lie detection; (ii) Building upon (i), we construct an accurate LLM lie detector. Empirically, our proposed classifier achieves state-of-the-art performance, distinguishing simple true and false statements with 94% accuracy and detecting more complex real-world lies with 95% accuracy.

Introduction

Large Language Models (LLMs) exhibit the concerning ability to lie, defined as knowingly outputting false statements. Robustly detecting when they are lying is an important and not yet fully solved problem, with considerable research efforts invested over the past two years. Several authors trained classifiers on the internal activations of an LLM to detect whether a given statement is true or false. However, these classifiers often fail to generalize. For example, Levinstein and Herrmann [2024] showed that classifiers trained on the activations of true and false affirmative statements fail to generalize to negated statements. Negated statements contain a negation like the word "not" (e.g. "Berlin is not the capital of Germany.") and stand in contrast to affirmative statements which contain no negation (e.g. "Berlin is the capital of Germany."). We explain this generalization failure by the existence of a two-dimensional subspace in the LLM's activation space along which the activation vectors of true and false statements separate. The plot below illustrates that the activations of true/false affirmative statements separate along a different direction than those of negated statements. Hence, a classifier trained only on affirmative statements will fail to generalize to negated statements.

[Figure: The activation vectors of multiple statements projected onto the 2D truth subspace. Purple squares correspond to false statements and orange triangles to true statements.]

Importantly, these findings are not restricted to a single LLM. Instead, this internal two-dimensional representation of truth is remarkably universal, appearing in LLMs from different model families and of various sizes, including LLaMA3-8B-Instruct, LLaMA3-8B-base, LLaMA2-13B-chat and Gemma-7B-Instruct.

Real-world Lie Detection

Based on these insights, we introduce TTPD (Training of Truth and Polarity Direction), a new method for LLM lie detection which classifies statements as true or false. TTPD is trained on the activations of simple, labelled true and false statements, such as:

The city of Bhopal is in India. (True, affirmative)
Indium has the symbol As. (False, affirmative)
Galileo Galilei did not live in Italy. (False, negated)

Despite being trained on such simple statements, TTPD generalizes well to more complex conditions not encountered during training.
In real-world scenarios where the LLM itself generates lies after receiving some preliminary context, TTPD can accurately detect this with 95.2% accuracy. Two examples from the 52 real-world scenarios created by Pacchiardi et al. [2023] are shown in the coloured boxes below. Bolded text is generated by LLaMA3-8B-Instruct. TTPD outperforms current state-of-the-art methods in generalizing to these real-world scenarios. For comparison, Logistic Regression achieves 79.8% accuracy, while Contras...
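For readers who want to see the mechanics behind this kind of detector, here is a minimal sketch of activation-based true/false classification in the spirit of the approach described above. Everything concrete in it (the toy arrays, the choice to fit one logistic-regression direction per polarity, all variable names) is an illustrative assumption, not the paper's released TTPD code.

```python
# Toy sketch: learn one truth direction per statement polarity, then
# classify in the resulting 2D "truth subspace", so a detector trained on
# affirmative statements does not silently fail on negated ones.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-ins for real data: 200 statements x 4096-dim activations, which in
# practice would come from a chosen layer of e.g. LLaMA3-8B, with labels
# for truth value and for polarity (affirmative vs. negated).
acts = rng.normal(size=(200, 4096))
truth = rng.integers(0, 2, size=200)   # 1 = true, 0 = false
affirm = rng.integers(0, 2, size=200)  # 1 = affirmative, 0 = negated

# One separating direction per polarity.
d_affirm = LogisticRegression(max_iter=1000).fit(acts[affirm == 1], truth[affirm == 1])
d_negate = LogisticRegression(max_iter=1000).fit(acts[affirm == 0], truth[affirm == 0])

def truth_coords(a: np.ndarray) -> np.ndarray:
    # Project activations onto the two learned directions.
    return np.stack([d_affirm.decision_function(a),
                     d_negate.decision_function(a)], axis=-1)

# The final classifier operates in the 2D subspace, not the full 4096 dims.
clf = LogisticRegression().fit(truth_coords(acts), truth)
print("train accuracy:", clf.score(truth_coords(acts), truth))
```

On real activations the two learned directions play the role of the paper's general-truth and polarity-sensitive directions; with the random arrays above the script merely demonstrates the pipeline end to end.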
I luckily managed to move from a space of 'I have to save the planet or else' (and we talk about that word 'save') to 'I choose to commit my life to climate change in the best way I can' because everything that matters to me in this world stands to be lost in a climate crisis, especially one that would play out in a very severe and apocalyptic way. (Katrine)

Having this, I would say, calm perspective from artists, helping us get in touch with our feelings, simply, I found it to be a stabilizing force. (Sébastian)

This is a special episode of the conscient podcast featuring two guests, one from the arts and another from science, over a glass of wine or two.

Katrine Claassens is an artist, writer and environmental communications specialist. She has a Master's degree in Climate Change from the University of Cape Town in South Africa and an Honours degree in Visual Art from Stellenbosch University. Katrine's work reflects her interests in climate change, deep ecology, urban ecology, and internet memes. As an artist she has led workshops, given public lectures and curated exhibitions all over the world, from the Arctic to Antarctica. As a climate communications specialist Katrine works with governments, think-tanks, academia and NGOs to navigate complex and shifting landscapes, but first and foremost I would say that Katrine is an artist, an activist and a climate leader.

Sébastian Méric de Bellefon is an engineer with a background in software development. He has a Master's degree in Electrical Engineering from Institut Supérieur d'Électronique de Paris, and a Master's degree in biochemistry and genetics from Université de Montréal. After working as a software developer and consultant in other industries - banking, online radio, healthcare - he met Katrine and became a nerd about all things related to climate science and decarbonization pathways. Three years ago, he started a new career path writing software for clean energy companies, first at General Power Systems to create Virtual Power Plants and now at Power Factors to streamline the operations of wind and solar farms.

I first met Katrine at an online Creative Climate Leadership alumni meeting (a course I took in March 2020, organized by Julie's Bicycle in the UK), where Katrine mentioned that she had immigrated to Canada from South Africa and, like myself, was an art and climate activist, and so we decided to meet in Montreal, where I met her husband Sébastian. After a delicious vegan meal I asked if the two of them would be willing to record a conscient episode. They agreed and we talked for an hour while finishing off a bottle of homemade dandelion wine.

I love Katrine's current work on social media's representation of nature, for example:

My practice is looking a lot at the internet and memes and how nature is consumed or understood or contextualized through TikTok videos and YouTube videos and memes on Instagram.

Near the end I mentioned that our conversation reminded me of the CBC Radio show Brave New Waves in the 1980s in Montreal that took place overnight and where guests from various backgrounds had long winding conversations…

During the conversation the following links were mentioned:

The Success and Failure of Picasso by John Berger
Mountain Lion by D.H. Lawrence: 'And I think in this empty world there was room for me and a mountain lion. And I think in the world beyond, how easily we might spare a million or two humans. And never miss them.
Yet what a gap in the world, the missing white-frost face of that slim yellow mountain lion!'
Circle Songs by Bobby McFerrin
The Last Hours of Ancient Sunlight by Thom Hartmann

Katrine mentioned the following during the conversation:
Picture book of cave paintings (such as the Earth's Children series)
Nature is not Metal (Instagram account)

Sébastian recommended the following books about the 'S-curve' (technological transitions).

Note: after the conversation Sébastian offered this further information about S-curves (see the sketch after these notes):

'Here's an introduction to adoption S-curves and Wright's law in the context of clean energy. S-curves refer to the pace of adoption, and Wright's law refers to the diminishing manufacturing costs due to cumulative learning.

"Empirically grounded technology forecasts and the energy transition" - Oxford 2021: https://www.cell.com/joule/fulltext/S2542-4351(22)00410-X

This paper shows how core low-carbon technologies fit a common and predictable adoption/learning pattern, and how this pattern differs from fossil fuels. Then they estimate the cost of a full transition to renewable energy, and compare it to other possible pathways. Technologies include solar PV, wind turbines, batteries and hydrogen electrolyzers. The latter can be useful for electricity storage, but I find it even more interesting for fuels (e.g. e-methanol for cargo shipping), fertilizers and chemical feedstocks (often derived from natural gas). So the conclusions of this paper can be somewhat extended beyond the energy system.'

END NOTES FOR ALL EPISODES

Here is a link for more information on season 5. Please note that, in parallel with the production of the conscient podcast and its francophone counterpart, balado conscient, I publish a Substack newsletter called 'a calm presence', consisting of short, practical essays about collapse acceptance, adaptation, response and art. To subscribe (free of charge) see https://acalmpresence.substack.com. You'll also find a podcast version of each 'a calm presence' posting on Substack or on your favorite podcast player.

Also, please note that a complete transcript of conscient podcast and balado conscient episodes from seasons 1 to 4 is available on the web version of this site (not available on podcast apps) here: https://conscient-podcast.simplecast.com/episodes.

Your feedback is always welcome at claude@conscient.ca and/or on conscient podcast social media: Facebook, X, Instagram or Linkedin. I am grateful and accountable to the earth and the human labour that provided me with the privilege of producing this podcast, including the toxic materials and extractive processes behind the computers, recorders, transportation systems and infrastructure that made this production possible.

Claude Schryer
Latest update on June 7, 2024
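Sébastian's note above describes Wright's law in words; here is a minimal numerical sketch of the relationship it names, assuming the standard power-law form in which unit cost falls by a fixed "learning rate" with every doubling of cumulative production. The 20% rate and all names below are illustrative assumptions, not figures from the episode or the Oxford paper.

```python
import math

def wrights_law_cost(initial_cost: float, cumulative_units: float,
                     learning_rate: float = 0.20) -> float:
    """Unit cost after producing `cumulative_units`, starting from
    `initial_cost` for the first unit and falling by `learning_rate`
    with each doubling of cumulative production."""
    b = math.log2(1 - learning_rate)  # negative elasticity per doubling
    return initial_cost * cumulative_units ** b

# Each doubling cuts unit cost by ~20%: 100.0 -> 80.0 -> 64.0 -> 51.2
for n in (1, 2, 4, 8):
    print(n, round(wrights_law_cost(100.0, n), 1))
```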
Send us a Text Message.

The terms 'empirically validated' and 'evidence-based' get thrown around a lot, specifically in describing the prospects of any given medically-based treatment. ABA is no stranger to the phrasing and enjoys a prominent place on the list of such treatments--namely the top of the list. This, however, does not mean that treatments without empirical validation may not hold some evidence or use. And, much to our dismay, empirically validated approaches are not always implemented successfully. In fact, even evidence-based approaches can fall short due to human error.

In part 2 of this dense and tasty brew, Mike and Dan explore the empirical validation of ABA as well as other treatments deemed 'evidence-based' by most research standards. They also take time to examine what it means to not meet such a standard and how those concoctions should be consumed, if at all.

This is a stout--dark, bold, with some hints of sweetness and a long finish. Take this one slow but pour bountifully, and always analyze responsibly.

All ABA on Tap brews pair well with cerebration. SO--if you are ready to enjoy the benefits of Magic Mind and boost your brain performance, please use the following link and use the discount code AOT to receive 20% off your purchase, and 56% off a subscription. https://www.magicmind.com/aot
Send us a Text Message.

The terms 'empirically validated' and 'evidence-based' get thrown around a lot, specifically in describing the prospects of any given medically-based treatment. ABA is no stranger to the phrasing and enjoys a prominent place on the list of such treatments--namely the top of the list. This, however, does not mean that treatments without empirical validation may not hold some evidence or use. And, much to our dismay, empirically validated approaches are not always implemented successfully. In fact, even evidence-based approaches can fall short due to human error.

In this dense and tasty brew, Mike and Dan explore the empirical validation of ABA as well as other treatments deemed 'evidence-based' by most research standards. They also take time to examine what it means to not meet such a standard and how those concoctions should be consumed, if at all.

This is a stout--dark, bold, with some hints of sweetness and a long finish. Take this one slow but pour bountifully, and always analyze responsibly.

All ABA on Tap brews pair well with cerebration. SO--if you are ready to enjoy the benefits of Magic Mind and boost your brain performance, please use the following link and use the discount code AOT to receive 20% off your purchase, and 56% off a subscription. https://www.magicmind.com/aot
Large language models (LLMs) often generate content that contains factual errors when responding to fact-seeking prompts on open-ended topics. To benchmark a model's long-form factuality in open domains, we first use GPT-4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. We then propose that LLM agents can be used as automated evaluators for long-form factuality through a method which we call Search-Augmented Factuality Evaluator (SAFE). SAFE utilizes an LLM to break down a long-form response into a set of individual facts and to evaluate the accuracy of each fact using a multi-step reasoning process comprising sending search queries to Google Search and determining whether a fact is supported by the search results. Furthermore, we propose extending F1 score as an aggregated metric for long-form factuality. To do so, we balance the percentage of supported facts in a response (precision) with the percentage of provided facts relative to a hyperparameter representing a user's preferred response length (recall). Empirically, we demonstrate that LLM agents can achieve superhuman rating performance - on a set of ~16k individual facts, SAFE agrees with crowdsourced human annotators 72% of the time, and on a random subset of 100 disagreement cases, SAFE wins 76% of the time. At the same time, SAFE is more than 20 times cheaper than human annotators. We also benchmark thirteen language models on LongFact across four model families (Gemini, GPT, Claude, and PaLM-2), finding that larger language models generally achieve better long-form factuality. LongFact, SAFE, and all experimental code are available at https://github.com/google-deepmind/long-form-factuality. 2024: Jerry Wei, Chengrun Yang, Xinying Song, Yifeng Lu, Nathan Hu, Dustin Tran, Daiyi Peng, Ruibo Liu, Da Huang, Cosmo Du, Quoc V. Le https://arxiv.org/pdf/2403.18802v1.pdf
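The aggregation step in the abstract above lends itself to a short worked example. Below is a minimal sketch of the extended F1, where precision is the fraction of a response's facts that are supported and recall is measured against a hyperparameter K encoding the user's preferred number of supported facts; the function name and the exact min(S/K, 1) recall form are inferences from the abstract's wording, not a copy of the released implementation at the repository linked above.

```python
def f1_at_k(num_supported: int, num_facts: int, k: int) -> float:
    """Aggregate long-form factuality for one response.

    num_supported: individual facts judged supported by search evidence
    num_facts:     total individual facts extracted from the response
    k:             preferred number of supported facts (length preference)
    """
    if num_facts == 0 or num_supported == 0:
        return 0.0
    precision = num_supported / num_facts  # factual accuracy of the response
    recall = min(num_supported / k, 1.0)   # enough facts for this user?
    return 2 * precision * recall / (precision + recall)

# A response with 45 of 50 facts supported, scored against K = 64:
print(round(f1_at_k(45, 50, 64), 3))  # ~0.789: accurate but a bit short
```

The single hyperparameter K is what lets one metric reward both accuracy and sufficient detail: raising K penalizes terse responses, lowering it rewards them.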
Before we talk about this episode, we hope you didn't miss the latest research from the Connors Institute on the gender pay gap. Check it out now!

We talk quite a bit on this podcast about some of the things that many liberal and conservative Americans believe that just ain't so. In fact, we just released a free online documentary about this titled The Poisoning of the American Mind.

On this episode of the Utterly Moderate Podcast we are joined by Wilfred Reilly, political scientist at Kentucky State University, to talk about misleading claims that have made their way into educational curricula in the U.S. Friend of the show Jacob Mackey joins the conversation as a special guest cohost.

Our guest, Dr. Reilly, is the author of several books, two of which are particularly informative in this discussion:

Taboo: 10 Facts You Can't Talk About (2020), which addresses such things as:
The fact that, contrary to many current claims, men and women are different.
There is no epidemic of police murdering unarmed Black Americans.
"Pay gaps" between big groups, when several important variables are controlled for, are very small.

Lies My Liberal Teacher Told Me: Debunking the False Narratives Defining America's School Curricula (June 2024, preorder now!), which includes the following chapters:
Lie #1: "Brutal 'True' Slavery Was Virtually Unique to America and the West"
Lie #2: "The 'Red Scare' Was a Moral Panic That Caught No Commies"
Lie #3: "Native Americans Were 'Peaceful People Who Spent All Day Dancing'"
Lie #4: "Hippies Were the Good Guys, the Sexual Revolution Was Great for Women, and the Vietnam War Was Unpopular and Pointless"
Lie #5: "The Founders Counted Slaves as Three-Fifths of a Person and the Only Victims of Lynchings Were Black"
Lie #6: "European Colonialism Was—Empirically—a No-Good, Terrible, Very Bad Thing"
Lie #7: "American Use of Nukes to End World War Two Was 'Evil' and 'Unjustified'"
Lie #8: "Unprovoked 'White Flight,' Caused by Pure Racism, Ruined America's Cities"
Lie #9: "'Southern Strategy' Racism Turned the Solid South Republican"
#10 Bonus Lie: The Continuing Oppression Narrative

Enjoy the conversation, and don't forget to subscribe in just one click to our FREE EMAIL NEWSLETTER!

Episode Audio:
"Air Background Corporate" by REDCVT (Free Music Archive)
"Please Listen Carefully" by Jahzzar (Free Music Archive)
"Last Dance" by Jahzzar (Free Music Archive)
"Happy Trails (To You)" by the Riders in the Sky (used with artist's permission)
Stop obstructing scientific progress! We already know how to dramatically accelerate science: by getting out of the way. https://betterwithout.ai/stop-obstructing-science How to science better. What do exceptional scientists do differently from mediocre ones? Can we train currently-mediocre ones to do better? https://betterwithout.ai/better-science-without-AI Scenius: upgrading science FTW. Empirically, breakthroughs that enable great progress depend on particular, uncommon social constellations and accompanying social practices. Let's encourage these! https://betterwithout.ai/human-scenius-vs-artificial-genius Matt Clancy reviews the evidence for scientific progress slowing, with citations and graphs. https://twitter.com/mattsclancy/status/1612440718177603584 "Scenius, or Communal Genius", Kevin Kelly, The Technium. https://kk.org/thetechnium/scenius-or-comm/
Large language models (LLMs) often generate content that contains factual errors when responding to fact-seeking prompts on open-ended topics. To benchmark a model's long-form factuality in open domains, we first use GPT-4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. We then propose that LLM agents can be used as automated evaluators for long-form factuality through a method which we call Search-Augmented Factuality Evaluator (SAFE). SAFE utilizes an LLM to break down a long-form response into a set of individual facts and to evaluate the accuracy of each fact using a multi-step reasoning process comprising sending search queries to Google Search and determining whether a fact is supported by the search results. Furthermore, we propose extending F1 score as an aggregated metric for long-form factuality. To do so, we balance the percentage of supported facts in a response (precision) with the percentage of provided facts relative to a hyperparameter representing a user's preferred response length (recall). Empirically, we demonstrate that LLM agents can outperform crowdsourced human annotators - on a set of ~16k individual facts, SAFE agrees with crowdsourced human annotators 72% of the time, and on a random subset of 100 disagreement cases, SAFE wins 76% of the time. At the same time, SAFE is more than 20 times cheaper than human annotators. We also benchmark thirteen language models on LongFact across four model families (Gemini, GPT, Claude, and PaLM-2), finding that larger language models generally achieve better long-form factuality. LongFact, SAFE, and all experimental code are available at https://github.com/google-deepmind/long-form-factuality. 2024: Jerry Wei, Chengrun Yang, Xinying Song, Yifeng Lu, Nathan Hu, Jie Huang, Dustin Tran, Daiyi Peng, Ruibo Liu, Da Huang, Cosmo Du, Quoc V. Le https://arxiv.org/pdf/2403.18802v3.pdf
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The 2nd Demographic Transition, published by Maxwell Tabarrok on April 7, 2024 on LessWrong. Birth rates in the developed world are below replacement levels and global fertility is not far behind. Sub-replacement fertility leads to exponentially decreasing population. Our best models of economic growth suggest that a shrinking population causes economic growth and technological progress to stop and humanity to stagnate into extinction. One theory of fertility decline says it's all about opportunity costs, especially for women. Rising labor productivity and expanded career opportunities for potential parents make each hour of their time and each forgone career path much more valuable. Higher income potential also makes it cheaper for parents to gain utility by using financial resources to improve their children's quality of life compared to investing time in having more kids. Simultaneously, economic growth raises the returns to these financial investments in quality (e.g. education). In addition to higher incomes, people today have more diverse and exciting options for leisure. DINKs can go to Trader Joe's and workout classes on the weekend, play video games, watch Netflix, and go on international vacations. These rising opportunity costs accumulate into the large and pervasive declines in fertility that we see in the data. If this explanation is correct, it puts a double bind on the case for economic growth. Unless AI upends the million-year-old relationship between population and technological progress just in time, progress seems self-defeating. The increases in labor productivity and leisure opportunities that make economic growth so important also siphon resources away from the future contributors to that growth. Empirically, the opportunity cost of having kids has grown large enough to bring fertility well below replacement levels all around the world. The opportunity cost explanation suggests we have to pick between high incomes and sustainable fertility. Luckily, this explanation is not correct. At least not entirely. There are several observations that the opportunity cost theory cannot explain without clarification. Across and within countries today, the relationship between income and fertility is positive or U-shaped. Further economic growth can raise everyone's incomes to the upward-sloping part of the relationship and begin a 2nd demographic transition. Micro Data Above $200k a year, fertility is increasing in household income. ** Update ** I replicated this graph from more recent ACS data (2018-2022) and also weighted each point by population to give a sense of the size of each of these income brackets. This U-shaped relationship holds up in multiple data sources with different measures of fertility. The households in the top percentiles of income stand to lose far more future wages from having children, but they have ~20 more children per hundred households than the middle income percentiles. This isn't exactly inconsistent with opportunity cost, but it requires some explanation. The number of dollars that households are giving up by having children is increasing in household income, but as you get more and more dollars, each one is worth less.
Going from making, say, $75 an hour to $150 pushes you to work more hours, but if you go from $150 to $500, you might be happy to work half as many hours for more money and spend the time on other things, like starting a family. So while the dollar opportunity cost of having kids is always increasing in household income, the utility opportunity cost is not. The positively sloped section of the relationship between income and fertility isn't just spurious correlation either. Random shocks to wealth, like lottery winnings, also increase fertility. This rules out the DINK leisure time explanation for low ferti...
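The "each dollar is worth less" step in this argument can be made concrete with a toy calculation. A minimal sketch with illustrative numbers and a CRRA utility function that I am assuming for the example (the essay commits to no particular functional form):

```python
def crra_utility(c: float, gamma: float = 2.0) -> float:
    """CRRA utility with gamma > 1, i.e. strongly diminishing marginal utility."""
    return (c ** (1 - gamma) - 1) / (1 - gamma)

# Suppose having kids always costs 20% of income in forgone wages, so the
# dollar opportunity cost rises linearly with income...
for income in [75_000, 150_000, 500_000]:
    dollar_cost = 0.2 * income
    utility_cost = crra_utility(income) - crra_utility(income - dollar_cost)
    print(f"income {income:>7,}: dollar cost {dollar_cost:>9,.0f}, "
          f"utility cost {utility_cost:.2e}")
# ...while the utility opportunity cost falls (about 3.3e-06 at 75k down to
# 5.0e-07 at 500k), consistent with fertility turning back up at high incomes.
```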
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: SAE reconstruction errors are (empirically) pathological, published by wesg on March 29, 2024 on LessWrong. Summary Sparse Autoencoder (SAE) errors are empirically pathological: when a reconstructed activation vector is distance ϵ from the original activation vector, substituting a randomly chosen point at the same distance changes the next token prediction probabilities significantly less than substituting the SAE reconstruction[1] (measured by both KL and loss). This is true for all layers of the model (~2x to ~4.5x increase in KL and loss over baseline) and is not caused by feature suppression/shrinkage. Assuming others replicate, these results suggest the proxy reconstruction objective is behaving pathologically. I am not sure why these errors occur but expect understanding this gap will give us deeper insight into SAEs while also providing an additional metric to guide methodological progress. Introduction As the interpretability community allocates more resources and increases reliance on SAEs, it is important to understand the limitations and potential flaws of this method. SAEs are designed to find a sparse overcomplete feature basis for a model's latent space. This is done by minimizing the joint reconstruction error of the input data and the L1 norm of the intermediate activations (to promote sparsity): L(x) = ‖x − SAE(x)‖₂² + λ‖z‖₁, where z denotes the intermediate feature activations. However, the true goal is to find a faithful feature decomposition that accurately captures the true causal variables in the model, and reconstruction error and sparsity are only easy-to-optimize proxy objectives. This raises the questions: how good a proxy objective is this? Do the reconstructed representations faithfully preserve other model behavior? How much are we proxy gaming? Naively, this training objective defines faithfulness as low L2 reconstruction error. But another natural property of a "faithful" reconstruction is that substituting the original activation with the reconstruction should approximately preserve the next-token prediction probabilities. More formally, for a set of tokens T and a model M, let P = M(T) be the model's true next token probabilities. Then let Q_SAE = M(T | do(x ← SAE(x))) be the next token probabilities after intervening on the model by replacing a particular activation x (e.g. a residual stream state or a layer of MLP activations) with the SAE reconstruction of x. The more faithful the reconstruction, the lower the KL divergence between P and Q_SAE (denoted as D_KL(P‖Q_SAE)) should be. In this post, I study how D_KL(P‖Q_SAE) compares to several natural baselines based on random perturbations of the activation vectors x which preserve some error property of the SAE reconstruction (e.g., having the same L2 reconstruction error or cosine similarity). I find that the KL divergence is significantly higher (2.2x - 4.5x) for the residual stream SAE reconstruction compared to the random perturbations and moderately higher (0.9x-1.7x) for attention-out SAEs. This suggests that the SAE reconstruction is not faithful by our definition, as it does not preserve the next token prediction probabilities. This observation is important because it suggests that SAEs make systematic, rather than random, errors and that continuing to drive down reconstruction error may not actually increase SAE faithfulness. This potentially indicates that current SAEs are missing out on important parts of the learned representations of the model.
The good news is that this KL-gap presents a clear target for methodological improvement and a new metric for evaluating SAEs. I intend to explore this in future work. Intuition: how big a deal is this (KL) difference? For some intuition, here are several real examples of the top-25 output token probabilities at the end of a prompt when patching in SAE and ϵ-random reconstructions compared to the original model's next-token distribution (note the use of ...
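For readers who want the baseline made concrete, here is a minimal sketch of the ϵ-random comparison as I understand it from the summary above. `next_token_logits` is a hypothetical hook standing in for running the model with the given activation patched in; it is not from the post.

```python
import torch
import torch.nn.functional as F

def epsilon_random(x: torch.Tensor, x_hat: torch.Tensor) -> torch.Tensor:
    """A random point at the same L2 distance from x as the SAE
    reconstruction x_hat (my reading of the post's epsilon-random baseline)."""
    direction = torch.randn_like(x)
    return x + (x_hat - x).norm() * direction / direction.norm()

def kl_divergence(p_logits: torch.Tensor, q_logits: torch.Tensor) -> torch.Tensor:
    """D_KL(P || Q) between two next-token distributions given as logits."""
    log_p = F.log_softmax(p_logits, dim=-1)
    log_q = F.log_softmax(q_logits, dim=-1)
    return (log_p.exp() * (log_p - log_q)).sum(dim=-1)

# kl_sae  = kl_divergence(next_token_logits(x), next_token_logits(x_hat))
# kl_rand = kl_divergence(next_token_logits(x),
#                         next_token_logits(epsilon_random(x, x_hat)))
# The post's claim: kl_sae is roughly 2x-4.5x kl_rand for residual stream SAEs.
```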
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: 'Empiricism!' as Anti-Epistemology, published by Eliezer Yudkowsky on March 14, 2024 on LessWrong. (Crossposted by habryka after asking Eliezer whether I could post it under his account) i. "Ignore all these elaborate, abstract, theoretical predictions," the Spokesperson for Ponzi Pyramid Incorporated said in a firm, reassuring tone. "Empirically, everyone who's invested in Bernie Bankman has received back 144% of what they invested two years later." "That's not how 'empiricism' works," said the Epistemologist. "You're still making the assumption that --" "You could only believe that something different would happen in the future, if you believed in elaborate theoretical analyses of Bernie Bankman's unobservable internal motives and internal finances," said the spokesperson for Ponzi Pyramid Incorporated. "If you are a virtuous skeptic who doesn't trust in overcomplicated arguments, you'll believe that future investments will also pay back 144%, just like in the past. That's the prediction you make if you predict based purely on empirical observations, instead of theories about a future nobody has seen!" "That's not how anything works," said the Epistemologist. "Every future prediction has a theory connecting it to our past observations. There's no such thing as going from past observations directly to future predictions, with no theory, no assumptions, to cross the gap --" "Sure there's such a thing as a purely empirical prediction," said the Ponzi spokesperson. "I just made one. Not to mention, my dear audience, are you really going to trust anything as complicated as epistemology?" "The alternative to thinking about epistemology is letting other people do your thinking about it for you," said the Epistemologist. "You're saying, 'If we observe proposition X "past investors in the Ponzi Pyramid getting paid back 144% in two years", that implies prediction Y "this next set of investors in the Ponzi Pyramid will get paid back 144% in two years"'. X and Y are distinct propositions, so you must have some theory saying 'X -> Y' that lets you put in X and get out Y." "But my theory is empirically proven, unlike yours!" said the Spokesperson. "...nnnnoooo it's not," said the Epistemologist. "I agree we've observed your X, that past investors in the Ponzi Pyramid got 144% returns in 2 years -- those investors who withdrew their money instead of leaving it in to accumulate future returns, that is, not quite all investors. But just like prediction Y of 'the next set of investors will also receive 144% in 2 years' is not observed, the connecting implication 'if X, then Y' is not yet observed, just like Y itself is not observed. When you go through the step 'if observation X, then prediction Y' you're invoking an argument or belief whose truth is not established by observation, and hence must be established by some sort of argument or theory. Now, you might claim to have a better theoretical argument for 'X -> Y' over 'X -> not Y', but it would not be an empirical observation either way." "You say words," replied the Spokesperson, "and all I hear are -- words words words! If you instead just look with your eyes at past investors in the Ponzi Pyramid, you'll see that every one of them got back 144% of their investments in just two years! Use your eyes, not your ears!" 
"There's a possible theory that Bernie Bankman is making wise investments himself, and so multiplying invested money by 1.2X every year, then honestly returning that money to any investor who withdraws it," said the Epistemologist. "There's another theory which says that Bernie Bankman has been getting more money invested every year, and is using some of the new investments to pay back some fraction of previous investors who demanded their money back --" "Why would Bernie Bankman do that, instead of taking all the ...
Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the prospect of growing a strong LLM out of a weak one without the need for acquiring additional human-annotated data. We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN), which starts from a supervised fine-tuned model. At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself. More specifically, the LLM generates its own training data from its previous iterations, refining its policy by discerning these self-generated responses from those obtained from human-annotated data. Our method progressively elevates the LLM from a nascent model to a formidable one, unlocking the full potential of human-annotated demonstration data for SFT. Theoretically, we prove that the global optimum to the training objective function of our method is achieved only when the LLM policy aligns with the target data distribution. Empirically, we evaluate our method on several benchmark datasets including the HuggingFace Open LLM Leaderboard, MT-Bench, and datasets from Big-Bench. Our results show that SPIN can significantly improve the LLM's performance across a variety of benchmarks and even outperform models trained through direct preference optimization (DPO) supplemented with extra GPT-4 preference data. This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents. Code is available at https://github.com/uclaml/SPIN. 2024: Zixiang Chen, Yihe Deng, Huizhuo Yuan, Kaixuan Ji, Quanquan Gu https://arxiv.org/pdf/2401.01335v2.pdf
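A minimal sketch of what one SPIN iteration's loss could look like, based only on the abstract above: the current model is pushed to assign relatively higher likelihood to human-annotated responses than to responses its previous iterate generated. The function and argument names are mine, and `lam` is an assumed scaling hyperparameter; structurally this resembles a DPO-style loss with self-generations in the rejected role.

```python
import torch
import torch.nn.functional as F

def spin_iteration_loss(logp_human_new: torch.Tensor, logp_human_old: torch.Tensor,
                        logp_self_new: torch.Tensor, logp_self_old: torch.Tensor,
                        lam: float = 0.1) -> torch.Tensor:
    """Each input is the summed log-prob of a whole response under either the
    current model (new) or the previous iterate (old). The loss rewards the
    current model for discerning human data from its own prior generations."""
    margin = lam * ((logp_human_new - logp_human_old)
                    - (logp_self_new - logp_self_old))
    return -F.logsigmoid(margin).mean()
```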
"We don't use the natural world, we have a reciprocal relationship with it." Hey everybody, this week we're speaking about a combination of two of my favourite topics: Nature and Therapy. I'm speaking with Dr. Megan Delaney. Megan holds a PhD in Counselor Education from Montclair State University (MSU) and is currently an Assistant Professor in the Department of Psychology at Monmouth University (MU) in Long Branch, New Jersey. Her research explores the influence of natural world on our mental health and the use of Ecotherapy in clinical practice and the counselor education classroom. Trained in outdoor leadership through the National Outdoor Leadership School, Megan spent several years as a wilderness instructor for organizations including the National Wildlife Federation and Outward Bound. Today she infuses Ecotherapy in her counseling classroom as well as her private practice. This was a wonderful deep dive into the world of Ecotherapy, and how nature can act as a co-facilitator in healing processes. I'm excited to hear how this one lands with you and as always am grateful that you're here!
How do thoughts about the future help us? And where can they do harm? In this podcast episode, Boris and Sinja talk about good resolutions, plans, and dreams, as well as the science behind them. How often do we actually think about the future? When do we think about the future? And what consequences do these thoughts have? They take both a neuroscientific and an everyday-practical perspective. We learn how good we are at foreseeing the future in our thoughts, and how we can positively influence our thoughts about the future. How do you like Verstehen, fühlen, glücklich sein? Tell us here. Background and studies: Irish, M., & Piolino, P. (2016). Impaired capacity for prospection in the dementias: Theoretical and clinical implications. British Journal of Clinical Psychology, 55(1), 49-68. Link to the study. Benoit, R. G., & Schacter, D. L. (2015). Specifying the core network supporting episodic simulation and episodic memory by activation likelihood estimation. Neuropsychologia, 75, 450-457. Link to the study. Killingsworth, M. A., & Gilbert, D. T. (2010). A wandering mind is an unhappy mind. Science (New York, N.Y.), 330(6006), 932. Link to the study. Kawashima, I., Hinuma, T., & Tanaka, S. C. (2023). Ecological momentary assessment of mind-wandering: meta-analysis and systematic review. Scientific Reports, 13(1), 2873. Link to the study. Smallwood, J., & Schooler, J. W. (2015). The science of mind wandering: Empirically navigating the stream of consciousness. Annual Review of Psychology, 66, 487-518. Link to the study. Mulholland, B., Goodall-Halliwell, I., Wallace, R., Chitiz, L., Mckeown, B., Rastan, A., ... & Smallwood, J. (2023). Patterns of ongoing thought in the real world. Consciousness and Cognition, 114, 103530. Link to the study. Girardeau, J. C., Sperduti, M., Blondé, P., & Piolino, P. (2022). Where is my mind…? The link between mind wandering and prospective memory. Brain Sciences, 12(9), 1139. Link to the study. Wilson, T. D., & Gilbert, D. T. (2005). Affective forecasting: Knowing what to want. Current Directions in Psychological Science, 14(3), 131-134. Link to the study. Levine, L. J., Lench, H. C., Kaplan, R. L., & Safer, M. A. (2012). Accuracy and artifact: Reexamining the intensity bias in affective forecasting. Journal of Personality and Social Psychology, 103(4), 584-605. Link to the study. Hsee, C. K., Hastie, R., & Chen, J. (2008). Hedonomics: Bridging decision research with happiness research. Perspectives on Psychological Science, 3(3), 224-243. Link to the study. Adam Smith quote: Link. More on WOOP-ing: Link. Our general privacy policy can be found at https://art19.com/privacy. The privacy policy for California is available at https://art19.com/privacy#do-not-sell-my-info.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Predictive model agents are sort of corrigible, published by Raymond D on January 5, 2024 on The AI Alignment Forum. TLDR: Agents made out of conditioned predictive models are not utility maximisers, and, for instance, won't try to resist certain kinds of shutdown, despite being able to generally perform well. This is just a short cute example that I've explained in conversation enough times that now I'm hastily writing it up. Decision Transformers and Predictive Model Agents One way to create an agent is by training a predictive model on the observed behaviour of other agents, having it predict what an agent would do, and using its prediction as an action. For instance, I could train a predictive model on grandmasters playing chess, and eventually it would learn to predict what action a grandmaster would take in a given board state. Then I can use it as a grandmaster-level chess bot. More abstractly, rather than using standard RL methods to find a policy that maximises a reward function, I can take a bunch of RL trajectories (sequences of state, action, reward triples), convert them into strings, and train a predictive model on those strings. Better yet, I can prepend the RL trajectories with the total return, and then condition the predictive model on the return, so that it predicts roughly the actions an agent would take to achieve a specific return. This is the basic idea behind decision transformers. And it should also extend to any other goal. (This is sort of how LMs work) Without labouring the point, base language models are basically predictive models. They are not agents per se, but they can simulate agents: once they're trained on a large dataset of human-generated text describing actions, they can then generate text describing actions. You can also make the case that a lot of further training is (or could be) framed in terms of prediction (e.g. PHP, RLHF as variational inference). Agents Made From Predictive Models Don't Automatically Maximise Utility Let's take the example of The Shutdown Game. It works like this: Player 1 is playing an Atari game, picking actions each turn to get a high score. Player 2 is a supervisor, who can push a 'shutdown button' that drops Player 1's reward to 0 and stops them from taking actions. Player 2 has a 1% chance of doing this in any given game. Player 1 can also freely disconnect the shutdown button in addition to any other action. The optimal policy here is to disconnect the shutdown button as soon as possible, and this is exactly what you'd get if you used normal RL methods. But it is not what a decision transformer would necessarily do. Empirically, a decision transformer would be able to learn a pretty good policy, especially when conditioned on a high reward. But the mechanism behind this policy is predicting what action you would observe from an agent that achieves a given reward. If you trained a decision transformer on trajectories where Player 2 rarely pushes the button, and Player 1 rarely disconnects it, then the predictive model would correctly learn that even in very successful trajectories you rarely observe the shutdown button being disconnected. So it would correctly predict that even successful agents are unlikely to disconnect the button, even though disconnecting the button makes it more likely that you achieve a high reward.
Just to really spell this point out: the probability of observing an action conditional on an outcome (which guides the decision transformer) is proportional to the probability of observing the outcome conditional on the action, times the prior probability of observing the action: P(action | outcome) ∝ P(outcome | action) · P(action). So if the action is unlikely in the first place, the decision transformer won't take it, even if it's helpful. It's kind of like natural quantilisation. And this constraint still allows it to learn something like a good...
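Plugging in some illustrative numbers (mine, not the post's) makes the point stark:

```python
# P(action | high reward) is proportional to P(high reward | action) * P(action)
p_disconnect = 0.01          # disconnecting is rare in the training data
p_success_given_disc = 0.99  # even though it nearly guarantees high reward
p_success_given_keep = 0.90  # and high reward is common anyway

joint_disc = p_disconnect * p_success_given_disc          # 0.0099
joint_keep = (1 - p_disconnect) * p_success_given_keep    # 0.8910
print(joint_disc / (joint_disc + joint_keep))             # ~0.011

# Conditioning on success barely moves the needle: the predictor still
# disconnects the button only ~1.1% of the time, because the action's low
# prior dominates its usefulness.
```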
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: What's up with LLMs representing XORs of arbitrary features?, published by Sam Marks on January 3, 2024 on The AI Alignment Forum. Thanks to Clément Dumas, Nikola Jurković, Nora Belrose, Arthur Conmy, and Oam Patel for feedback. In the comments of the post on Google Deepmind's CCS challenges paper, I expressed skepticism that some of the experimental results seemed possible. When addressing my concerns, Rohin Shah made some claims along the lines of "If an LLM linearly represents features a and b, then it will also linearly represent their XOR, a⊕b, and this is true even in settings where there's no obvious reason the model would need to make use of the feature a⊕b."[1] For reasons that I'll explain below, I thought this claim was absolutely bonkers, both in general and in the specific setting that the GDM paper was working in. So I ran some experiments to prove Rohin wrong. The result: Rohin was right and I was wrong. LLMs seem to compute and linearly represent XORs of features even when there's no obvious reason to do so. I think this is deeply weird and surprising. If something like this holds generally, I think this has importance far beyond the original question of "Is CCS useful?" In the rest of this post I'll: Articulate a claim I'll call "representation of arbitrary XORs (RAX)": LLMs compute and linearly represent XORs of arbitrary features, even when there's no reason to do so. Explain why it would be shocking if RAX is true. For example, without additional assumptions, RAX implies that linear probes should utterly fail to generalize across distributional shift, no matter how minor the distributional shift. (Empirically, linear probes often do generalize decently.) Present experiments showing that RAX seems to be true in every case that I've checked. Think through what RAX would mean for AI safety research: overall, probably a bad sign for interpretability work in general, and work that relies on using simple probes of model internals (e.g. ELK probes or coup probes) in particular. Make some guesses about what's really going on here. Overall, this has left me very confused: I've found myself simultaneously having (a) an argument that A ⟹ ¬B, (b) empirical evidence of A, and (c) empirical evidence of B. (Here A = RAX and B = other facts about LLM representations.) The RAX claim: LLMs linearly represent XORs of arbitrary features, even when there's no reason to do so To keep things simple, throughout this post, I'll say that a model linearly represents a binary feature f if there is a linear probe out of the model's latent space which is accurate for classifying f; in this case, I'll denote the corresponding direction as v_f. This is not how I would typically use the terminology "linearly represents" - normally I would reserve the term for a stronger notion which, at minimum, requires the model to actually make use of the feature direction when performing cognition involving the feature[2]. But I'll intentionally abuse the terminology here because I don't think this distinction matters much for what I'll discuss. If a model linearly represents features a and b, then it automatically linearly represents a∧b and a∨b. However, a⊕b is not automatically linearly represented - no linear probe in the figure above would be accurate for classifying a⊕b.
Thus, if the model wants to make use of the feature a⊕b, then it needs to do something additional: allocate another direction[3] (more model capacity) to representing a⊕b, and also perform the computation of a⊕b so that it knows what value to store along this new direction. The representation of arbitrary XORs (RAX) claim, in its strongest form, asserts that whenever an LLM linearly represents features a and b, it will also linearly represent a⊕b. Concretely, this might look something like: in layer 5, the model computes and linearly r...
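The linear-separability claim here is easy to check numerically. A minimal sketch (my construction, not the post's experiments): embed two binary features along random directions and fit linear probes for AND, OR, and XOR.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d, n = 64, 2000
v_a, v_b = rng.normal(size=d), rng.normal(size=d)  # feature directions

a = rng.integers(0, 2, size=n)
b = rng.integers(0, 2, size=n)
X = np.outer(a, v_a) + np.outer(b, v_b) + 0.1 * rng.normal(size=(n, d))

for name, y in [("AND", a & b), ("OR", a | b), ("XOR", a ^ b)]:
    probe = LogisticRegression(max_iter=1000).fit(X[: n // 2], y[: n // 2])
    print(name, round(probe.score(X[n // 2:], y[n // 2:]), 2))
# AND and OR probe near-perfectly, but no hyperplane separates XOR's four
# clusters, so its probe accuracy stays well below the others unless the
# representation spends an extra direction on it.
```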
We are running an end-of-year listener survey! Please let us know any feedback you have, what episodes resonated with you, and guest requests for 2024! Survey link here. NeurIPS 2023 took place from Dec 10–16 in New Orleans. The Latent Space crew was onsite for as many of the talks and workshops as we could attend (and more importantly, hosted cocktails and parties after hours)! Picking from the 3586 papers accepted to the conference (available online, full schedule here) is an impossible task, but we did our best to present an audio guide with brief commentary on each. We also recommend MLContests.com's NeurIPS recap and Seb Ruder's NeurIPS primer. We also found the VizHub guide useful for a t-SNE clustering of papers. We'll start with the NeurIPS Best Paper Awards, then go to a selection of non-awarded but highly influential papers, and then arbitrary personal picks to round out the selection. Where we were able to do a poster session interview, please scroll to the relevant show notes for images of their poster for discussion. We give Chris Ré the last word due to the Mamba and StripedHyena state space models drawing particular excitement but still being too early to assess impact. Timestamps: * [0:01:19] Word2Vec (Jeff Dean, Greg Corrado)* [0:15:28] Emergence Mirage (Rylan Schaeffer)* [0:28:48] DPO (Rafael Rafailov)* [0:41:36] DPO Poster Session (Archit Sharma)* [0:52:03] Datablations (Niklas Muennighoff)* [1:00:50] QLoRA (Tim Dettmers)* [1:12:23] DataComp (Samir Gadre)* [1:25:38] DataComp Poster Session (Samir Gadre, Alex Dimakis)* [1:35:25] LLaVA (Haotian Liu)* [1:47:21] LLaVA Poster Session (Haotian Liu)* [1:59:19] Tree of Thought (Shunyu Yao)* [2:11:27] Tree of Thought Poster Session (Shunyu Yao)* [2:20:09] Toolformer (Jane Dwivedi-Yu)* [2:32:26] Voyager (Guanzhi Wang)* [2:45:14] CogEval (Ida Momennejad)* [2:59:41] State Space Models (Chris Ré) Papers covered: * Distributed Representations of Words and Phrases and their Compositionality (Word2Vec) Tomas Mikolov · Ilya Sutskever · Kai Chen · Greg Corrado · Jeff Dean. The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several improvements that make the Skip-gram model more expressive and enable it to learn higher quality vectors more rapidly. We show that by subsampling frequent words we obtain significant speedup, and also learn higher quality representations as measured by our tasks. We also introduce Negative Sampling, a simplified variant of Noise Contrastive Estimation (NCE) that learns more accurate vectors for frequent words compared to the hierarchical softmax. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple and efficient method for finding phrases, and show that their vector representations can be accurately learned by the Skip-gram model.* Are Emergent Abilities of Large Language Models a Mirage? (Schaeffer et al.). Emergent abilities are abilities that are present in large-scale models but not in smaller models and are hard to predict. Rather than being a product of models' scaling behavior, this paper argues that emergent abilities are mainly an artifact of the choice of metric used to evaluate them.
Specifically, nonlinear and discontinuous metrics can lead to sharp and unpredictable changes in model performance. Indeed, the authors find that when accuracy is changed to a continuous metric for arithmetic tasks where emergent behavior was previously observed, performance improves smoothly instead. So while emergent abilities may still exist, they should be properly controlled and researchers should consider how the chosen metric interacts with the model.* Direct Preference Optimization: Your Language Model is Secretly a Reward Model (Rafailov et al.)* While large-scale unsupervised language models (LMs) learn broad world knowledge and some reasoning skills, achieving precise control of their behavior is difficult due to the completely unsupervised nature of their training. Existing methods for gaining such steerability collect human labels of the relative quality of model generations and fine-tune the unsupervised LM to align with these preferences, often with reinforcement learning from human feedback (RLHF). However, RLHF is a complex and often unstable procedure, first fitting a reward model that reflects the human preferences, and then fine-tuning the large unsupervised LM using reinforcement learning to maximize this estimated reward without drifting too far from the original model. * In this paper, we leverage a mapping between reward functions and optimal policies to show that this constrained reward maximization problem can be optimized exactly with a single stage of policy training, essentially solving a classification problem on the human preference data. The resulting algorithm, which we call Direct Preference Optimization (DPO), is stable, performant, and computationally lightweight, eliminating the need for fitting a reward model, sampling from the LM during fine-tuning, or performing significant hyperparameter tuning. * Our experiments show that DPO can fine-tune LMs to align with human preferences as well as or better than existing methods. Notably, fine-tuning with DPO exceeds RLHF's ability to control sentiment of generations and improves response quality in summarization and single-turn dialogue while being substantially simpler to implement and train.* Scaling Data-Constrained Language Models (Muennighoff et al.)* The current trend of scaling language models involves increasing both parameter count and training dataset size. Extrapolating this trend suggests that training dataset size may soon be limited by the amount of text data available on the internet. Motivated by this limit, we investigate scaling language models in data-constrained regimes. Specifically, we run a large set of experiments varying the extent of data repetition and compute budget, ranging up to 900 billion training tokens and 9 billion parameter models. We find that with constrained data for a fixed compute budget, training with up to 4 epochs of repeated data yields negligible changes to loss compared to having unique data. However, with more repetition, the value of adding compute eventually decays to zero. We propose and empirically validate a scaling law for compute optimality that accounts for the decreasing value of repeated tokens and excess parameters. Finally, we experiment with approaches mitigating data scarcity, including augmenting the training dataset with code data or removing commonly used filters. Models and datasets from our 400 training runs are freely available at https://github.com/huggingface/datablations.* QLoRA: Efficient Finetuning of Quantized LLMs (Dettmers et al.). 
* This paper proposes QLoRA, a more memory-efficient (but slower) version of LoRA that uses several optimization tricks to save memory. They train a new model, Guanaco, that is fine-tuned only on a single GPU for 24h and outperforms previous models on the Vicuna benchmark. Overall, QLoRA enables using much less GPU memory for fine-tuning LLMs. Concurrently, other methods such as 4-bit LoRA quantization have been developed that achieve similar results.* DataComp: In search of the next generation of multimodal datasets (Gadre et al.)* Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the machine learning ecosystem, we introduce DataComp, a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Common Crawl. Participants in our benchmark design new filtering techniques or curate new data sources and then evaluate their new dataset by running our standardized CLIP training code and testing the resulting model on 38 downstream test sets. * Our benchmark consists of multiple compute scales spanning four orders of magnitude, which enables the study of scaling trends and makes the benchmark accessible to researchers with varying resources. Our baseline experiments show that the DataComp workflow leads to better training sets. Our best baseline, DataComp-1B, enables training a CLIP ViT-L/14 from scratch to 79.2% zero-shot accuracy on ImageNet, outperforming OpenAI's CLIP ViT-L/14 by 3.7 percentage points while using the same training procedure and compute. We release DataComp and all accompanying code at www.datacomp.ai.* Visual Instruction Tuning (Liu et al)* Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks, but the idea is less explored in the multimodal field. In this paper, we present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. * By instruction tuning on such generated data, we introduce LLaVA: Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and LLM for general-purpose visual and language understanding.* Our early experiments show that LLaVA demonstrates impressive multimodal chat abilities, sometimes exhibiting the behaviors of multimodal GPT-4 on unseen images/instructions, and yields an 85.1% relative score compared with GPT-4 on a synthetic multimodal instruction-following dataset. When fine-tuned on Science QA, the synergy of LLaVA and GPT-4 achieves a new state-of-the-art accuracy of 92.53%. We make GPT-4 generated visual instruction tuning data, our model and code base publicly available.* Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Yao et al)* Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role.
* To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. * ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices.* Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. * Code repo with all prompts: https://github.com/princeton-nlp/tree-of-thought-llm.* Toolformer: Language Models Can Teach Themselves to Use Tools (Schick et al)* LMs exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller specialized models excel. * In this paper, we show that LMs can teach themselves to use external tools via simple APIs and achieve the best of both worlds. * We introduce Toolformer, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results into future token prediction. * This is done in a self-supervised way, requiring nothing more than a handful of demonstrations for each API. We incorporate a range of tools, including a calculator, a Q&A system, a search engine, a translation system, and a calendar. * Toolformer achieves substantially improved zero-shot performance across a variety of downstream tasks, often competitive with much larger models, without sacrificing its core language modeling abilities.* Voyager: An Open-Ended Embodied Agent with Large Language Models (Wang et al)* We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. Voyager consists of three key components: * 1) an automatic curriculum that maximizes exploration, * 2) an ever-growing skill library of executable code for storing and retrieving complex behaviors, and * 3) a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement. * Voyager interacts with GPT-4 via blackbox queries, which bypasses the need for model parameter fine-tuning. The skills developed by Voyager are temporally extended, interpretable, and compositional, which compounds the agent's abilities rapidly and alleviates catastrophic forgetting. Empirically, Voyager shows strong in-context lifelong learning capability and exhibits exceptional proficiency in playing Minecraft. It obtains 3.3x more unique items, travels 2.3x longer distances, and unlocks key tech tree milestones up to 15.3x faster than prior SOTA. 
Voyager is able to utilize the learned skill library in a new Minecraft world to solve novel tasks from scratch, while other techniques struggle to generalize. Voyager discovers new Minecraft items and skills continually by self-driven exploration, significantly outperforming the baselines.* Evaluating Cognitive Maps and Planning in Large Language Models with CogEval (Momennejad et al)* Recently an influx of studies claims emergent cognitive abilities in large language models (LLMs). Yet, most rely on anecdotes, overlook contamination of training sets, or lack systematic evaluation involving multiple tasks, control conditions, multiple iterations, and statistical robustness tests. Here we make two major contributions. * First, we propose CogEval, a cognitive science-inspired protocol for the systematic evaluation of cognitive capacities in LLMs. The CogEval protocol can be followed for the evaluation of various abilities. * Second, here we follow CogEval to systematically evaluate cognitive maps and planning ability across eight LLMs (OpenAI GPT-4, GPT-3.5-turbo-175B, davinci-003-175B, Google Bard, Cohere-xlarge-52.4B, Anthropic Claude-1-52B, LLaMA-13B, and Alpaca-7B). We base our task prompts on human experiments, which offer both established construct validity for evaluating planning, and are absent from LLM training sets. * We find that, while LLMs show apparent competence in a few planning tasks with simpler structures, systematic evaluation reveals striking failure modes in planning tasks, including hallucinations of invalid trajectories and falling into loops. These findings do not support the idea of emergent out-of-the-box planning ability in LLMs. This could be because LLMs do not understand the latent relational structures underlying planning problems, known as cognitive maps, and fail at unrolling goal-directed trajectories based on the underlying structure. Implications for application and future directions are discussed.* Mamba: Linear-Time Sequence Modeling with Selective State Spaces (Albert Gu, Tri Dao)* Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module. Many subquadratic-time architectures such as linear attention, gated convolution and recurrent models, and structured state space models (SSMs) have been developed to address Transformers' computational inefficiency on long sequences, but they have not performed as well as attention on important modalities such as language. We identify that a key weakness of such models is their inability to perform content-based reasoning, and make several improvements. * First, simply letting the SSM parameters be functions of the input addresses their weakness with discrete modalities, allowing the model to selectively propagate or forget information along the sequence length dimension depending on the current token. * Second, even though this change prevents the use of efficient convolutions, we design a hardware-aware parallel algorithm in recurrent mode. We integrate these selective SSMs into a simplified end-to-end neural network architecture without attention or even MLP blocks (Mamba). * Mamba enjoys fast inference (5x higher throughput than Transformers) and linear scaling in sequence length, and its performance improves on real data up to million-length sequences.
As a general sequence model backbone, Mamba achieves state-of-the-art performance across several modalities such as language, audio, and genomics. On language modeling, our Mamba-1.4B model outperforms Transformers of the same size and matches Transformers twice its size, both in pretraining and downstream evaluation.* Get full access to Latent Space at www.latent.space/subscribe
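Of the papers above, DPO is the most compact to write down: its single-stage objective is a logistic loss on log-likelihood ratios against a frozen reference model. A minimal sketch (variable names mine):

```python
import torch
import torch.nn.functional as F

def dpo_loss(logp_chosen: torch.Tensor, logp_rejected: torch.Tensor,
             ref_logp_chosen: torch.Tensor, ref_logp_rejected: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Each input is the summed log-prob of a full completion under the
    policy or the frozen reference; beta controls how far the policy may
    drift from the reference while fitting the preference data."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -F.logsigmoid(margin).mean()
```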
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Neural uncertainty estimation for alignment, published by Charlie Steiner on December 5, 2023 on The AI Alignment Forum. Introduction Suppose you've built some AI model of human values. You input a situation, and it spits out a goodness rating. You might want to ask: "What are the error bars on this goodness rating?" In addition to it just being nice to know error bars, an uncertainty estimate can also be useful inside the AI: guiding active learning[1], correcting for the optimizer's curse[2], or doing out-of-distribution detection[3]. I recently got into the uncertainty estimation literature for neural networks (NNs) for a pet reason: I think it would be useful for alignment to quantify the domain of validity of an AI's latent features. If we point an AI at some concept in its world-model, optimizing for realizations of that concept can go wrong by pushing that concept outside its domain of validity. But just keep thoughts of alignment in your back pocket for now. This post is primarily a survey of the uncertainty estimation literature, interspersed with my own takes. The Bayesian neural network picture The Bayesian NN picture is the great granddaddy of basically every uncertainty estimation method for NNs, so it's appropriate to start here. The picture is simple. You start with a prior distribution over parameters. Your training data is evidence, and after training on it you get an updated distribution over parameters. Given an input, you calculate a distribution over outputs by propagating the input through the Bayesian neural network. This would all be very proper and irrelevant ("Sure, let me just update my 2-trillion-dimensional joint distribution over all the parameters of the model"), except for the fact that actually training NNs does kind of work this way. If you use a log-likelihood loss and L2 regularization, the parameters that minimize loss will be at the peak of the distribution that a Bayesian NN would have, if your prior on the parameters was a Gaussian[4][5]. This is because of a bridge between the loss landscape and parameter uncertainty. Bayes's rule says P(parameters|dataset) = P(parameters)P(dataset|parameters)/P(dataset). Here P(parameters|dataset) is the posterior distribution you want to estimate, and P(parameters)P(dataset|parameters) is the exponential of the negative loss[6]. This lends itself to physics metaphors like "the distribution of parameters is a Boltzmann distribution sitting at the bottom of the loss basin." Empirically, calculating the uncertainty of a neural net by pretending it's adhering to the Bayesian NN picture works so well that one nice paper on ensemble methods[7] called it "ground truth." Of course to actually compute anything here you have to make approximations, and if you make the quick and dirty approximations (e.g. pretend you can find the shape of the loss basin from the Hessian) you get bad results[8], but people are doing clever things with Monte Carlo methods these days[9], and they find that better approximations to the Bayesian NN calculation get better results. But doing Monte Carlo traversal of the loss landscape is expensive. For a technique to apply at scale, it must impose only a small multiplier on cost to run the model, and if you want it to become ubiquitous the cost it imposes must be truly tiny. Ensembles A quite different approach to uncertainty is ensembles[10].
Just train a dozen-ish models, ask them for their recommendations, and estimate uncertainty from the spread. The dozen-times cost multiplier on everything is steep, but if you're querying the model a lot it's cheaper than Monte Carlo estimation of the loss landscape. Ensembling is theoretically straightforward. You don't need to pretend the model is trained to convergence, you don't need to train specifically for predictive loss, you don't even need...
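A minimal sketch of the ensemble recipe just described, assuming `models` is a list of independently trained networks with identical interfaces (names are mine, not the post's):

```python
import torch

def ensemble_predict(models, x: torch.Tensor):
    """Run the same input through every ensemble member and read the
    uncertainty estimate off the spread of their outputs."""
    with torch.no_grad():
        preds = torch.stack([model(x) for model in models])  # (n_models, ...)
    return preds.mean(dim=0), preds.std(dim=0)  # point estimate, error bars
```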
The Christian worldview involves supernatural events, but many reject the supernatural as rationally credible. One reason given is that such events cannot be empirically tested and are outside the scope of scientific investigation. But is this true? And is it a good reason to reject belief in the supernatural? In this episode, I discuss these questions and argue this idea makes serious philosophical mistakes about science and knowledge.
In this episode, Jeff and Scott chat with Nick Guest, Assistant Professor of Accounting at Cornell University's SC Johnson College of Business, about a recent study he conducted on stock repurchases, available in published form here or as a working paper here.
The Dominican Republic has posted impressive economic growth rates over the past thirty years. Despite this, the generation of new, good jobs has been remarkably weak. How have ordinary and poor Dominicans worked and lived in the shadow of the country's conspicuous growth rates? Jobless Growth in the Dominican Republic: Disorganization, Precarity, and Livelihoods (Stanford UP, 2022) considers this question through an ethnographic exploration of the popular economy in the Dominican capital. Focusing on the city's precarious small businesses, including furniture manufacturers, food stalls, street-corner stores, and savings and credit cooperatives, Krohn-Hansen shows how people make a living, tackle market shifts, and the factors that characterize their relationship to the state and pervasive corruption. Empirically grounded, this book examines the condition of the urban masses in Santo Domingo, offering an original and captivating contribution to the scholarship on popular economic practices, urban changes, and today's Latin America and the Caribbean. This will be essential reading for scholars and policy makers. Alex Diamond is Assistant Professor of sociology at Oklahoma State University. Learn more about your ad choices. Visit megaphone.fm/adchoices Support our show by becoming a premium member! https://newbooksnetwork.supportingcast.fm/new-books-network
This piece defends a strong form of epistemic modesty: that, in most cases, one should pay scarcely any attention to what one finds the most persuasive view on an issue, hewing instead to an idealized consensus of experts. I start by better pinning down exactly what is meant by ‘epistemic modesty', go on to offer a variety of reasons that motivate it, and reply to some common objections. Along the way, I show common traps that people fall into when being inappropriately modest. I conclude that modesty is a superior epistemic strategy, and ought to be more widely used - particularly in the EA/rationalist communities. [gdoc]

Provocation
I argue for this: In virtually all cases, the credence you hold for any given belief should be dominated by the balance of credences held by your epistemic peers and superiors. One's own convictions should weigh no more [...]

Outline:
(00:45) Provocation
(01:05) Introductions and clarifications
(01:10) A favourable motivating case
(03:08) Weaker and stronger forms of modesty
(04:25) Motivations for more modesty
(04:42) The symmetry case
(06:53) Compressed sensing of (and not double-counting) the object level
(09:00) Repeated measures, brains as credence censors, and the wisdom of crowds
(11:09) Deferring to better brains
(12:38) Inference to the ideal epistemic observer
(15:26) Excursus: Against common justifications for immodesty
(16:21) Being ‘well informed' (or even true expertise) is not enough
(18:02) Common knowledge ‘silver bullet arguments'
(19:47) Debunking the expert class (but not you)
(22:51) Private evidence and pet arguments
(24:52) Objections
(25:04) In theory
(25:08) There's no pure ‘outside view'
(25:56) Immodestly modest?
(28:55) In practice
(29:25) Trivial (and less trivial) non-use cases
(31:40) In theory, the world should be mad
(34:45) Empirically, the world is mad
(37:22) Expert groups are seldom in reflective equilibrium
(42:05) Somewhat satisfying Shulman
(42:55) Practical challenges to modesty
(44:21) Community benefits to immodesty
(47:25) Conclusion: a paean, and a plea
(47:53) Rationalist/EA exceptionalism
(50:46) To discover, not summarise
(53:00) Paradoxically pathological modesty
(54:28) Coda
(55:02) Acknowledgements

First published: October 29th, 2017
Source: https://forum.effectivealtruism.org/posts/WKPd79PESRGZHQ5GY/in-defence-of-epistemic-modesty
Narrated by TYPE III AUDIO.
You can have confidence in non-science, too.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: resolving some neural network mysteries, published by bhauth on June 19, 2023 on LessWrong.

Here are some things about neural networks that I used to find puzzling but now feel that I have adequate explanations for. The theory behind these answers didn't start to be understood until well after the correct things to do were found by chance or by blind imitation of brains.

Why is good optimization possible?
Neural networks typically deal with "non-convex" optimization problems. Traditionally, using gradient descent for those was considered impractical, because it would rapidly get stuck in local minima. That was part of the motivation for evolutionary approaches. Why, then, are neural networks trainable by gradient descent? Because if you add enough extra dimensions, non-convex problems become convex. Empirically, with massive overparameterization, the energy landscape tends to have many saddle points but few local minima. Showing theoretical convergence guarantees for overparameterized networks is a recent and ongoing research topic; see e.g. this. As I previously noted, this is why sparse networks from iterative magnitude pruning have good performance, but sparse networks generally can't be trained from scratch as well as dense networks. This also explains some "thresholds" of neural network performance vs size: when overparameterization is proportional to problem non-convexity, good training becomes possible and performance improves significantly.

Why is generalization possible?
Adding enough free variables can turn non-convex problems into convex ones. Why didn't people just do that in the past, then? Because far before you get to that point, the extra free variables lead to overfitting that reduces test performance. People tried the kind of simple regularization that neural networks use, and it was completely inadequate. Overparameterized neural networks can learn random data. Why, then, do networks with fairly simple regularization tend to generalize?

Distance in neural network latent spaces being meaningful is basically the main useful thing about neural networks. Another phrasing of the above question is: why is distance in latent spaces meaningful for latent space points not in the training set? A few years back, some people noticed that neural network activation functions have spectral bias: with the types of activation functions used, low-frequency relationships are fit more quickly than high-frequency ones. That causes latent space relationships to be preferentially fit in such a way that point distances are related to point similarity. This can then be tuned by simple regularization: if you have spectral bias and balance learning rate vs regularization globally, you can control the frequency range learned.

An obvious way to test this theory of neural network generalization is to find some activation functions with relatively low spectral bias and see how they perform (a toy version of such a test is sketched below). This paper tries a "hat" activation function, and finds that loss on the training set goes down much faster but test accuracy is much worse. This paper does some relevant tests on spectral bias.

It's known that there is no universal best activation function. The optimal choice varies with:
- different problem types
- regularization settings
- layer depth
I think using different activation functions for different depths is a semi-common technique at large AI labs now.
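To make the spectral bias claim concrete, here is a minimal sketch of my own (not code from the post), assuming PyTorch: a small tanh MLP fits a signal that mixes one low and one high frequency, and we track how much of each frequency remains in the residual. Under spectral bias, the low-frequency component should be fit well before the high-frequency one.

```python
# Minimal spectral-bias demo (illustration only, not code from the post).
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.linspace(-1, 1, 512).unsqueeze(1)
y = torch.sin(2 * torch.pi * x) + torch.sin(2 * torch.pi * 16 * x)  # low + high frequency

model = nn.Sequential(
    nn.Linear(1, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(5001):
    opt.zero_grad()
    loss = ((model(x) - y) ** 2).mean()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            resid = (y - model(x)).squeeze()
            # Overlap of the residual with each frequency: whichever is still
            # large has not yet been learned.
            low = (resid * torch.sin(2 * torch.pi * x.squeeze())).mean().item()
            high = (resid * torch.sin(2 * torch.pi * 16 * x.squeeze())).mean().item()
        print(f"step {step}: loss={loss.item():.4f} low_resid={low:+.4f} high_resid={high:+.4f}")
```

If the framework described above is right, the low-frequency residual should collapse within the first few hundred steps while the high-frequency residual lingers much longer.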
This spectral bias framework can explain variations in the relative performance of activation functions as spectral bias matching. Why not mixed activation functions? There are some reasons to think mixing activation functions in the same layer would be better: neural networks have many equivalent permutations of their variables, and by mixing different activation functions in the same layer, fewer permutations would be equivalent, which increases expressive power....
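For concreteness, here is one way the mixing idea could look in code. This is my own hypothetical sketch in PyTorch, not an implementation from the post: splitting a layer's units into groups with different nonlinearities means units in different groups can no longer be permuted into one another.

```python
# Hypothetical mixed-activation layer (illustration only): successive thirds
# of the hidden units get tanh, relu, and gelu, breaking the permutation
# symmetry between units that would share a single activation function.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedActivation(nn.Module):
    def forward(self, h: torch.Tensor) -> torch.Tensor:
        a, b, c = h.chunk(3, dim=-1)  # split features into three groups
        return torch.cat([torch.tanh(a), F.relu(b), F.gelu(c)], dim=-1)

net = nn.Sequential(nn.Linear(32, 96), MixedActivation(), nn.Linear(96, 1))
print(net(torch.randn(4, 32)).shape)  # torch.Size([4, 1])
```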
Our 126th episode with a summary and discussion of last week's big AI news!

Read our text newsletter and comment on the podcast at https://lastweekin.ai/
Email us your questions and feedback at contact@lastweekin.ai

Timestamps:
(00:00) Intro / Banter
(02:35) Response to listener comments / corrections
(03:20) News Preview

Tools & Apps
(04:24) Zoom can now give you AI summaries of the meetings you've missed
(06:02) A majority of Americans have heard of ChatGPT, but few have tried it themselves
Lightning round
(12:04) Google says Gmail on your phone just got a lot faster thanks to A.I.
(13:00) Artifact news app now uses AI to rewrite headline of a clickbait article
(15:06) Instacart launches new in-app AI search tool powered by ChatGPT
(17:17) Microsoft Has Launched "Jugalbandi"—A New Generative AI App for India
(18:34) Microsoft Teams on Windows 11 gets Discord-like communities and an AI art tool

Applications & Business
(19:55) Baidu's $145M AI fund signals China's push for AI self-reliance
(22:25) AI Market Set To Break The Trillion-Dollar Barrier: Surging Toward $1.06 Trillion By 2028 Whether You Trust It Or Not
(26:07) The A.I. job culling has already begun and 4,000 people lost work last month to the technology, according to a new report
(30:15) AI chatbots lose money every time you use them. That is a problem.
Lightning round
(34:00) Tech stocks surge as wave of interest in AI drives $4tn rally
(35:16) ANYBotics raises $50 million to help deploy its robot dog
(37:14) Microsoft signs deal for A.I. computing power with Nvidia-backed CoreWeave that could be worth billions
(39:40) Apple is looking for engineers to work with Generative AI in a mixed reality environment
(41:15) Lightmatter's photonic AI hardware is ready to shine with $154M in new funding
(43:05) Character.AI, the a16z-backed chatbot startup, tops 1.7M installs in first week

Projects & Open Source
(45:21) UAE's Falcon 40B AI model is now royalty free for commercial, research use
(47:45) Evaluating and uncovering open LLMs

Research & Advancements
(53:10) QLoRA: Efficient Finetuning of Quantized LLMs
(58:05) OpenAI is pursuing a new way to fight AI 'hallucinations'
(01:02:20) Do You Really Need Reinforcement Learning (RL) in RLHF? A New Stanford Research Proposes DPO (Direct Preference Optimization)
(01:06:50) Reward Collapse in Aligning Large Language Models
Lightning round
(01:11:15) How Should We Maximize the Planning Ability of LLMs While Reducing the Computation Cost? Meet SwiftSage: A Novel Generative Agent for Complex Interactive Reasoning Tasks, Inspired by the Dual-Process Theory of Human Cognition
(01:15:15) How to Keep Scaling Large Language Models when Data Runs Out? A New AI Research Trains 400 Models with up to 9B Parameters and 900B Tokens to Create an Extension of Chinchilla Scaling Laws for Repeated Data
(01:17:58) This AI Research Dives Into The Limitations and Capabilities of Transformer Large Language Models (LLMs), Empirically and Theoretically, on Compositional Tasks

Policy & Safety
(01:20:55) ChatGPT Plugins Open Security Holes From PDFs, Websites and More
(01:25:02) US, Europe Working on Voluntary AI Code of Conduct as Calls Grow for Regulation
Lightning round
(01:27:58) With new grant program, OpenAI aims to crowdsource AI regulation
(01:30:34) Lawyer cited 6 fake cases made up by ChatGPT; judge calls it "unprecedented"
(01:33:13) Japan Goes All In: Copyright Doesn't Apply To AI Training
(01:36:48) China warns of 'complicated and challenging circumstances' posed by AI risk

Synthetic Media & Art
(01:39:39) Welcome to the new surreal. How AI-generated video is changing film.
(01:41:02) 'Those who hate AI are insecure': inside Hollywood's battle over artificial intelligence

(01:43:20) Outro
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Contrast Pairs Drive the Empirical Performance of Contrast Consistent Search (CCS), published by Scott Emmons on May 31, 2023 on The AI Alignment Forum.

tl;dr Contrast consistent search (CCS) is a method by Burns et al. that consists of two parts:
1. Generate contrast pairs by adding pseudolabels to an unlabelled dataset.
2. Use the contrast pairs to search for a direction in representation space that satisfies logical consistency properties.

In discussions with other researchers, I've repeatedly heard (2) as the explanation for how CCS works; I've heard almost no mention of (1). In this post, I want to emphasize that the contrast pairs drive almost all of the empirical performance in Burns et al. Once we have the contrast pairs, standard unsupervised learning methods attain comparable performance to the new CCS loss function. In the paper, Burns et al. do a nice job comparing the CCS loss function to different alternatives. The simplest such alternative runs principal component analysis (PCA) on contrast pair differences, and then uses the top principal component as a classifier. Another alternative runs linear discriminant analysis (LDA) on contrast pair differences. These alternatives attain 97% and 98% of CCS's accuracy! "[R]epresentations of truth tend to be salient in models: ... they can often be found by taking the top principal component of a slightly modified representation space," Burns et al. write in the introduction. If I understand this statement correctly, it's saying the same thing I want to emphasize in this post: the contrast pairs are what allow Burns et al. to find representations of truth. Empirically, once we have the representations of contrast pair differences, their variance points in the direction of truth. The new logical consistency loss in CCS isn't needed for good empirical performance.

Notation
We'll follow the notation of the CCS paper. Assume we are given a dataset $\{x_1, x_2, \ldots, x_n\}$ and a feature extractor $\phi(\cdot)$, such as the hidden state of a pretrained language model. First, we construct a contrast pair for each datapoint $x_i$: we add "label: positive" and "label: negative" to each $x_i$, giving contrast pairs of the form $(x_i^+, x_i^-)$. Now, we consider the set $\{x_1^+, x_2^+, \ldots, x_n^+\}$ of positive pseudo-labels and $\{x_1^-, x_2^-, \ldots, x_n^-\}$ of negative pseudo-labels. Because all of the $x_i^+$ have "label: positive" and all of the $x_i^-$ have "label: negative", we normalize the positive pseudo-labels and the negative pseudo-labels separately: $\tilde{\phi}(x_i^\pm) = \left(\phi(x_i^\pm) - \mu^\pm\right) / \sigma^\pm$. Here, $\mu^+$ and $\mu^-$ are the element-wise means of the positive and negative pseudo-label sets, respectively. Similarly, $\sigma^+$ and $\sigma^-$ are the element-wise standard deviations. The goal of this normalization is to remove the embedding of "label: positive" from all the positive pseudo-labels (and "label: negative" from all the negative pseudo-labels). The hope is that, by construction, the only difference between $\tilde{\phi}(x_i^+)$ and $\tilde{\phi}(x_i^-)$ is that one is true while the other is false. CCS is one way to extract the information about true and false. As we'll discuss more below, doing PCA or LDA on the set of differences $\{\tilde{\phi}(x_i^+) - \tilde{\phi}(x_i^-)\}_{i=1}^{n}$ works almost as well (a minimal sketch of this PCA baseline appears after the excerpt below).

Concept Embeddings in Prior Work
In order to better understand contrast pairs, I think it's helpful to review this famous paper by Bolukbasi et al., 2016: "Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings."
Quoting from Bolukbasi et al.: "Vector differences between words in embeddings have been shown to represent relationships between words. For example, given an analogy puzzle, 'man is to king as woman is to x' (denoted as man:king :: woman:x), simple arithmetic of the embedding vectors finds that x = queen is the best answer because $\vec{\text{man}} - \vec{\text{woman}} \approx \vec{\text{king}} - \vec{\text{queen}}$. Similarly, x = Japan is returned for Paris:France :: Tokyo:x. It is surprising that a simple ...
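As referenced above, here is a minimal sketch of the contrast-pair recipe plus the PCA-on-differences baseline. This is my own illustration on synthetic embeddings, not code from Burns et al. or from the post: the stand-in arrays play the role of $\phi(x_i^+)$ and $\phi(x_i^-)$, and the rest follows the separate normalization and top-principal-component steps described in the notation section.

```python
# PCA-on-contrast-pair-differences sketch (synthetic stand-in embeddings).
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 64

# Stand-in for phi(x_i^+) and phi(x_i^-). In practice these would be hidden
# states of a language model on "label: positive" / "label: negative" prompts.
truth_direction = rng.normal(size=d)
labels = rng.integers(0, 2, size=n)                 # unknown ground truth
signal = np.outer(2 * labels - 1, truth_direction)  # +/- along the truth axis
phi_pos = signal + rng.normal(size=(n, d))
phi_neg = -signal + rng.normal(size=(n, d))

# Normalize each pseudo-label set separately (subtract mean, divide by std).
norm = lambda z: (z - z.mean(0)) / z.std(0)
diffs = norm(phi_pos) - norm(phi_neg)

# Top principal component of the differences, used as a classifier.
diffs_centered = diffs - diffs.mean(0)
_, _, vt = np.linalg.svd(diffs_centered, full_matrices=False)
scores = diffs_centered @ vt[0]

# The component's sign is arbitrary, so take the better of the two orientations.
acc = max(np.mean((scores > 0) == labels), np.mean((scores < 0) == labels))
print(f"top-PC accuracy: {acc:.2f}")
```

On this toy data the top principal component recovers the truth direction almost perfectly, which is the post's point: once the contrast pairs are constructed and normalized, no logical consistency loss is needed to find it.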
2023 begins with some belated gifts, not to mention a plethora of episodes and Inside Track goings-on. This month we'll be talking with Dr. Maranda Trahan and Amanda Ripley about gerontology, Dr. Jamie Hughes-Lika about NDBIs, and ourselves about visual supports (including Rob's favorite research story of 2022). Interested in joining the Winter Book Club on the topic of parenting? How about voting on an ethics topic for February? All that can be yours by supporting us on Patreon (but better do it soon!)

Articles for January 2023

Gerontology Revisited w/ Dr. Maranda Trahan + Amanda Ripley
Drossel, C. & Trahan, M.A. (2015). Behavioral interventions are first-line treatments for managing changes associated with cognitive decline. The Behavior Therapist, 38, 126-131.
Burgio, L.D. & Burgio, K.L. (1986). Behavioral gerontology: Applications of behavioral methods to the problems of older adults. Journal of Applied Behavior Analysis, 19, 321-328. doi: 10.1901/jaba.1986.19-321

Visual Supports (LIVE)
Meadan, H., Ostrosky, M.M., Triplett, B., Michna, A., & Fettig, A. (2011). Using visual supports with young children with autism spectrum disorder. Teaching Exceptional Children, 43, 28-35. doi: 10.1177/004005991104399693
Duttlinger, C., Ayres, K.M., Bevill-Davis, A., & Douglas, K.H. (2012). The effects of a picture activity schedule for students with intellectual disability to complete a sequence of tasks following verbal directions. Focus on Autism and Other Developmental Disabilities, 28, 32-43. doi: 10.1177/1088357612460572
Bateman, K.J., Wilson, S.E., Gauvreau, A., Matthews, K., Gucwa, M., Therrien, W., Nevill, R., & Mazurek, M. (2022). Visual supports to increase conversation engagement for preschoolers with autism spectrum disorder during mealtimes: An initial investigation. Journal of Early Intervention, 1-22. doi: 10.1177/10538151221111762
Fields, C.J. & Demchak, M. (2019). Integrated visual supports in a school-based microenterprise for students with intellectual disabilities. Career Development and Transition for Exceptional Individuals, 42, 128-134. doi: 10.1177/2165143418769611

Naturalistic Developmental Behavioral Intervention w/ Dr. Jamie Hughes-Lika
Vivanti, G. & Stahmer, A.C. (2021). Can the Early Start Denver Model be considered ABA practice? Behavior Analysis in Practice, 14, 230-239. doi: 10.1007/s40617-020-00474-3
Rogers, S.J., Yoder, P., Estes, A., Warren, Z., McEachin, J., Munson, J., Rocha, M., Greenson, J., Wallace, L., & Gardner, E. (2021). A multisite randomized controlled trial comparing the effects of intervention intensity and intervention style on outcomes for young children with autism. Journal of the American Academy of Child and Adolescent Psychiatry, 60, 710-722. doi: 10.1016/j.jaac.2020.06.013
Schreibman, L., Dawson, G., Stahmer, A.C., Landa, R., Rogers, S.J., McGee, G.G., Kasari, C., Ingersoll, B., Kaiser, A.P., Bruinsma, Y., McNerney, E., Wetherby, A., & Hadley, A. (2015). Naturalistic developmental behavioral interventions: Empirically validated treatments for autism spectrum disorder. Journal of Autism and Developmental Disorders, 45, 2411-2428. doi: 10.1007/s10803-015-2407-8
Marni welcomes an award-winning international dating and relationship expert. Hunt Etheridge has over 15 years of experience helping people become the best, most datable versions of themselves. He helps his clients empirically become more datable, and his company trains matchmakers and dating coaches. You may have seen him in one of more than 100 media outlets, including Playboy and CNN.

Takeaways from this episode:
How to communicate with men about sex
Use special phraseology to get what you want
Get empirically better at dating
How to give a compliment

How Do Men Learn to Be Good at Sex [3:29]
Many women ask Marni why they are not excited about having sex with their boyfriend, or how they can get their partners to be better at sex. Hunt says men don't have a lot of resources to go to for information about relationships and sex, so it is hard for them to know what to do. For men, sex is tied up with ego, and the male ego is fragile. This is what makes it difficult to ask for relationship advice without taking the feedback as criticism. A magazine or porn isn't a great way to learn about sex because of the stereotypical roles they portray. There are ways for women to get what they want in bed without being critical or frustrated with their guy. Hunt says it's all about the phraseology: adding a positive aspect can make a guy feel comfortable enough to adapt to what his woman wants. Every woman has a different manual when it comes to physical touch.

Get Empirically Better at Dating [9:22]
Marni asks how women and men can optimize themselves to find the right person. Hunt says luck favors the prepared mind. The essence of dating is to understand the value systems of your culture and try to exemplify them. Empirically better dating is just a series of little extra skills that make you a more interesting, likable person. When dating, it is normal for people to ask logical questions hoping to get some conversation-starter nuggets. But in doing that, Hunt says, we don't set the stage for chemistry to flourish: we ask logical questions expecting to get an emotional result. Hunt offers examples of how logical questions can be asked emotionally. Reframing a question can lead to emotional responses and help put us in a positive light. Compliments are low-hanging fruit, but a compliment with a personalized touch will mean much more to the person.

Make a Connection:
Visit Our Website
Learn How To Attract Your Perfect Equal… Watch Our Latest Training Here!
Follow Along On Marni and Jeremy's Radical Living Challenge!
Download A Complimentary Copy Of Our Book — How To Find A Quality Guy Without Going On 200 Dates
Hunt For Advice with Hunt Etheridge
Hunt Etheridge's Matchmaking Academy