Podcasts about sycophancy

  • 32 podcasts
  • 52 episodes
  • 36m avg. duration
  • 1 new episode monthly
  • Latest: Apr 17, 2025

POPULARITY

[Popularity chart, 2017–2024]


Best podcasts about sycophancy

Latest podcast episodes about sycophancy

LessWrong Curated Podcast
“Surprising LLM reasoning failures make me think we still need qualitative breakthroughs for AGI” by Kaj_Sotala

LessWrong Curated Podcast

Apr 17, 2025 · 35:51


Introduction: Writing this post puts me in a weird epistemic position. I simultaneously believe that: the reasoning failures that I'll discuss are strong evidence that current LLM- or, more generally, transformer-based approaches won't get us AGI; as soon as major AI labs read about the specific reasoning failures described here, they might fix them; but future versions of GPT, Claude etc. succeeding at the tasks I've described here will provide zero evidence of their ability to reach AGI. If someone makes a future post where they report that they tested an LLM on all the specific things I described here and it aced all of them, that will not update my position at all. That is because all of the reasoning failures that I describe here are surprising in the sense that, given everything else that they can do, you'd expect LLMs to succeed at all of these tasks. The [...]

Outline:
(00:13) Introduction
(02:13) Reasoning failures
(02:17) Sliding puzzle problem
(07:17) Simple coaching instructions
(09:22) Repeatedly failing at tic-tac-toe
(10:48) Repeatedly offering an incorrect fix
(13:48) Various people's simple tests
(15:06) Various failures at logic and consistency while writing fiction
(15:21) Inability to write young characters when first prompted
(17:12) Paranormal posers
(19:12) Global details replacing local ones
(20:19) Stereotyped behaviors replacing character-specific ones
(21:21) Top secret marine databases
(23:32) Wandering items
(23:53) Sycophancy
(24:49) What's going on here?
(32:18) How about scaling? Or reasoning models?

First published: April 15th, 2025
Source: https://www.lesswrong.com/posts/sgpCuokhMb8JmkoSn/untitled-draft-7shu
Narrated by TYPE III AUDIO.

Decoding the Gurus
Supplementary Material 26: Flint Dibble Interview, Bonding over Outgroup Hate, and Manly Sycophancy

Decoding the Gurus

Apr 12, 2025 · 74:04


We project our insecurities onto the gurusphere, wallowing in our inadequacy to bond over shared hatred of outgroups, and interview Flint Dibble along the way.

Supplementary Material 26
00:00 Introduction and Greetings
01:53 Ol' Squeaky and Lex's horny poems
05:11 Eric Weinstein is still waiting for the call
09:26 Interview with Flint Dibble
11:05 Introduction and Catching Up
12:19 Joe Rogan and Public Perception
14:57 Hypocrisy and Slander
16:05 Graham Hancock and Neo-Nazi Connections
22:08 Upcoming Exposé on Joe Rogan
23:03 Pyramids of Giza: New Claims
24:45 Debunking the Mega Structures Theory
25:20 The Researchers Behind the Claims
28:21 Scientific Methods and Evidence
31:58 Conclusion on Pyramids and Science
35:13 Gurusphere Dynamics
37:47 Pseudo-Archaeology and Public Perception
46:11 Mr. Beast's Egypt Adventure
50:36 The Role of Pseudo-Archaeology in Conspiracy Theories
58:50 Post-Interview Discussion
59:53 Trump's Tariffs and Economic Impact
01:04:30 The Amazing Tariff Formula
01:09:45 Geoffrey Miller's 9D Chess Theory of the Tariffs
01:13:30 Contrapoints Conspiracy Video
01:14:55 Some things Matt will not mention on Tariffs
01:17:35 QAnon Anonymous on Graham Hancock
01:22:33 Some Other News covers Joe Rogan
01:30:13 Ryan Beard's Destiny Content Nuke
01:32:33 The Studies Show covers Conspiracies
01:33:53 Hasan argues for tariffs
01:37:40 Back to Rogan and Chris Williamson
01:39:21 Critically Reviewing Cory Clark's Study
01:47:58 Incestuous Bro Podcasts and Legacy Media Struggles
01:53:00 Bonding over outgroup hatred and Criticism Capture
02:02:04 USAid is funding the attacks on Tesla!
02:07:56 Trump's Badass Son humiliates Biden
02:10:28 Tribal Hypocrisy
02:11:52 Joe Smashes All Your Paradigms!
02:14:59 The villain, Sam Harris criticizes the hero, Lex Fridman
02:20:18 Does Lex speak to EVERYONE?
02:23:44 Concluding Thoughts from Maladjusted Haters

The full episode is available for Patreon subscribers (2 hr 25 mins). Join us at: https://www.patreon.com/DecodingTheGurus

Sources: Archaeology with Flint Dibble - Megastructures under Giza Pyramids⁉️ ARCHAEOLOGY REWRITTEN or viral

AXRP - the AI X-risk Research Podcast
39 - Evan Hubinger on Model Organisms of Misalignment

AXRP - the AI X-risk Research Podcast

Dec 1, 2024 · 105:47


The 'model organisms of misalignment' line of research creates AI models that exhibit various types of misalignment, and studies them to try to understand how the misalignment occurs and whether it can be somehow removed. In this episode, Evan Hubinger talks about two papers he's worked on at Anthropic under this agenda: "Sleeper Agents" and "Sycophancy to Subterfuge".

Patreon: https://www.patreon.com/axrpodcast
Ko-fi: https://ko-fi.com/axrpodcast
The transcript: https://axrp.net/episode/2024/12/01/episode-39-evan-hubinger-model-organisms-misalignment.html

Topics we discuss, and timestamps:
0:00:36 - Model organisms and stress-testing
0:07:38 - Sleeper Agents
0:22:32 - Do 'sleeper agents' properly model deceptive alignment?
0:38:32 - Surprising results in "Sleeper Agents"
0:57:25 - Sycophancy to Subterfuge
1:09:21 - How models generalize from sycophancy to subterfuge
1:16:37 - Is the reward editing task valid?
1:21:46 - Training away sycophancy and subterfuge
1:29:22 - Model organisms, AI control, and evaluations
1:33:45 - Other model organisms research
1:35:27 - Alignment stress-testing at Anthropic
1:43:32 - Following Evan's work

Main papers:
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training: https://arxiv.org/abs/2401.05566
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models: https://arxiv.org/abs/2406.10162

Anthropic links:
Anthropic's newsroom: https://www.anthropic.com/news
Careers at Anthropic: https://www.anthropic.com/careers

Other links:
Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research: https://www.alignmentforum.org/posts/ChDH335ckdvpxXaXX/model-organisms-of-misalignment-the-case-for-a-new-pillar-of-1
Simple probes can catch sleeper agents: https://www.anthropic.com/research/probes-catch-sleeper-agents
Studying Large Language Model Generalization with Influence Functions: https://arxiv.org/abs/2308.03296
Stress-Testing Capability Elicitation With Password-Locked Models [aka model organisms of sandbagging]: https://arxiv.org/abs/2405.19550

Episode art by Hamish Doodles: hamishdoodles.com

8th Layer Insights
The FAIK Files | Ep 1: Consciousness, Scams, & Death Threats

8th Layer Insights

Nov 29, 2024 · 52:24


Note: We're posting Perry's new show, "The FAIK Files", to this feed through the end of the year. This will give you a chance to get a feel for the new show and subscribe to the new feed if you want to keep following in 2025. Happy FAIKs-giving everyone!

Welcome to the newly renovated and relaunched FAIK Files podcast. On this week's episode, Perry & Mason cover Anthropic's recent hiring of an employee focused on AI well-being, an AI grandmother from hell (for scammers), and Google's Gemini chatbot that allegedly tells a user what it really thinks of them. Welcome back to the show that keeps you informed on all things artificial intelligence and natural nonsense.

Want to leave us a voicemail? Here's the magic link to do just that: https://sayhi.chat/FAIK
You can also join our Discord server here: https://discord.gg/cU7wepaz

*** NOTES AND REFERENCES ***

AI Wellbeing:
Anthropic has hired an 'AI welfare' researcher: https://www.transformernews.ai/p/anthropic-ai-welfare-researcher
It's time to take AI welfare seriously: https://www.transformernews.ai/p/ai-welfare-paper
Taking AI Welfare Seriously: https://arxiv.org/pdf/2411.00986

The problem of sycophancy in AI:
Suckup software: How sycophancy threatens the future of AI: https://www.freethink.com/robots-ai/ai-sycophancy
Towards Understanding Sycophancy in Language Models: https://arxiv.org/pdf/2310.13548

AI Interpretability:
Mapping the Mind of a Large Language Model: https://www.anthropic.com/news/mapping-mind-language-model
Lex Fridman podcast interview with Dario Amodei, Amanda Askell, & Chris Olah: https://youtu.be/ugvHCXCOmm4

Deceptive and self-serving tendencies in AI systems:
Sycophancy to subterfuge: Investigating reward tampering in language models: https://www.anthropic.com/research/reward-tampering
OpenAI o1 System Card: https://openai.com/index/openai-o1-system-card/
Announcing our updated Responsible Scaling Policy: https://www.anthropic.com/news/announcing-our-updated-responsible-scaling-policy

AI Grandmother from Hell (for scammers):
Phone network employs AI "grandmother" to waste scammers' time with meandering conversations: https://www.techspot.com/news/105571-phone-network-employs-ai-grandmother-waste-scammers-time.html
YouTube video of Daisy: https://www.youtube.com/watch?v=RV_SdCfZ-0s

AI Dumpster Fire of the Week (Gemini tells an end user what it really thinks about him):
Article: https://people.com/ai-chatbot-alarms-user-with-unsettling-message-human-please-die-8746112
Gemini interaction: https://gemini.google.com/share/6d141b742a13

*** THE BOILERPLATE ***

About The FAIK Files: The FAIK Files is an offshoot project from Perry Carpenter's most recent book, FAIK: A Practical Guide to Living in a World of Deepfakes, Disinformation, and AI-Generated Deceptions.
Get the Book: FAIK: A Practical Guide to Living in a World of Deepfakes, Disinformation, and AI-Generated Deceptions (Amazon Associates link)
Check out the website for more info: https://thisbookisfaik.com

Check out Perry & Mason's other show, the Digital Folklore Podcast:
Apple Podcasts: https://podcasts.apple.com/us/podcast/digital-folklore/id1657374458
Spotify: https://open.spotify.com/show/2v1BelkrbSRSkHEP4cYffj?si=u4XTTY4pR4qEqh5zMNSVQA
Other: https://digitalfolklore.fm

Want to connect with us? Here's how:
Perry on LinkedIn: https://www.linkedin.com/in/perrycarpenter
Perry on X: https://x.com/perrycarpenter
Perry on BlueSky: https://bsky.app/profile/perrycarpenter.bsky.social
Mason on LinkedIn: https://www.linkedin.com/in/mason-amadeus-a853a7242/
Mason on BlueSky: https://bsky.app/profile/pregnantsonic.com

The Allusionist
Tranquillusionist: Ex-Constellations

The Allusionist

Sep 28, 2024 · 30:41


This is the Tranquillusionist, in which I, Helen Zaltzman, give your brain a break by temporarily supplanting your interior monologue with words that don't make you feel feelings. Note: this is NOT a normal episode of the Allusionist, where you might learn something about language and your brain might be stimulated. The Tranquillusionist's purpose is to soothe your brain and for you to learn very little, except for something about Zeus's attitude to bad drivers. There's a collection of other Tranquillusionists at theallusionist.org/tranquillusionist, on themes including champion dogs, Australia's big things, gay animals and more. Today: constellations that got demoted into ex-constellations, featuring airborne pregnancy, cats of the skies, and one of the 18th century's most unpopular multi-hyphenates.

Find the episode's transcript, plus more information about the topics therein, at theallusionist.org/ex-constellations.

To help fund this independent podcast, take yourself to theallusionist.org/donate and become a member of the Allusioverse. You get regular livestreams with me and my collection of reference books, inside scoops into the making of this show, watchalong parties, e.g. the new season of Great British Bake Off, and Taskmaster featuring my brother Andy. And best of all, you get to bask in the company of your fellow Allusionauts in our delightful Discord community.

This episode was produced by me, Helen Zaltzman, with music composed by Martin Austwick of palebirdmusic.com. Find @allusionistshow on Instagram, Facebook, Threads, Bluesky, TikTok, YouTube etc.

• Home Chef, meal kits that fit your needs. For a limited time, Home Chef is offering Allusionist listeners eighteen free meals, plus free shipping on your first box, and free dessert for life, at HomeChef.com/allusionist.
• Squarespace, your one-stop shop for building and running your online home. Go to squarespace.com/allusionist for a free 2-week trial, and get 10 percent off your first purchase of a website or domain with the code allusionist.
• Bombas, whose mission is to make the comfiest clothing essentials, and match every item sold with an equal item donated. Go to bombas.com/allusionist to get 20% off your first purchase.
• LinkedIn Ads convert your B2B audience into high quality leads. Get $100 credit on your next campaign at linkedin.com/allusionist.

Support the show: http://patreon.com/allusionist
See omnystudio.com/listener for privacy information.

The AI Breakdown: Daily Artificial Intelligence News and Discussions

A reading and discussion inspired by https://www.cio.com/article/3499245/so-you-agree-ai-has-a-sycophancy-problem.html and https://www.nytimes.com/2024/09/04/opinion/yuval-harari-ai-democracy.html

Concerned about being spied on? Tired of censored responses? AI Daily Brief listeners receive a 20% discount on Venice Pro. Visit https://venice.ai/nlw and enter the discount code NLWDAILYBRIEF.

Learn how to use AI with the world's biggest library of fun and useful tutorials: https://besuper.ai/ Use code 'podcast' for 50% off your first month.

The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown

The Nonlinear Library
AF - The Bitter Lesson for AI Safety Research by Adam Khoja

The Nonlinear Library

Aug 2, 2024 · 6:33


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Bitter Lesson for AI Safety Research, published by Adam Khoja on August 2, 2024 on The AI Alignment Forum. Read the associated paper "Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress?": https://arxiv.org/abs/2407.21792

Focus on safety problems that aren't solved with scale. Benchmarks are crucial in ML to operationalize the properties we want models to have (knowledge, reasoning, ethics, calibration, truthfulness, etc.). They act as a criterion to judge the quality of models and drive implicit competition between researchers. "For better or worse, benchmarks shape a field."

We performed the largest empirical meta-analysis to date of AI safety benchmarks on dozens of open language models. Around half of the benchmarks we examined had high correlation with upstream general capabilities. Some safety properties improve with scale, while others do not. For the models we tested, benchmarks on human preference alignment, scalable oversight (e.g., QuALITY), truthfulness (TruthfulQA MC1 and TruthfulQA Gen), and static adversarial robustness were highly correlated with upstream general capabilities. Bias, dynamic adversarial robustness, and calibration when not measured with Brier scores had relatively low correlations. Sycophancy and weaponization restriction (WMDP) had significant negative correlations with general capabilities.

Often, intuitive arguments from alignment theory are used to guide and prioritize deep learning research priorities. We find these arguments to be poorly predictive of these correlations and ultimately counterproductive. In fact, in areas like adversarial robustness, some benchmarks basically measured upstream capabilities while others did not. We argue instead that empirical measurement is necessary to determine which safety properties will be naturally achieved by more capable systems, and which safety problems will remain persistent.[1] Abstract arguments from genuinely smart people may be highly "thoughtful," but these arguments generally do not track deep learning phenomena, as deep learning is too often counterintuitive.

We provide several recommendations to the research community in light of our analysis: Measure capabilities correlations when proposing new safety evaluations. When creating safety benchmarks, aim to measure phenomena which are less correlated with capabilities. For example, if truthfulness entangles Q/A accuracy, honesty, and calibration, then just make a decorrelated benchmark that measures honesty or calibration. In anticipation of capabilities progress, work on safety problems that are disentangled from capabilities and thus will likely persist in future models (e.g., GPT-5). The ideal is to find training techniques that cause as many safety properties as possible to be entangled with capabilities. Ultimately, safety researchers should prioritize differential safety progress, and should attempt to develop a science of benchmarking that can effectively identify the most important research problems to improve safety relative to the default capabilities trajectory.

We're not claiming that safety properties and upstream general capabilities are orthogonal. Some are, some aren't. Safety properties are not a monolith. Weaponization risks increase as upstream general capabilities increase. Jailbreaking robustness isn't strongly correlated with upstream general capabilities. However, if we can isolate less-correlated safety properties in AI systems which are distinct from greater intelligence, these are the research problems safety researchers should most aggressively pursue and allocate resources toward. The other model properties can be left to capabilities researchers. This amounts to a "Bitter Lesson" argument for working on safety issues which are relatively uncorrelated (or negatively correlate...
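The episode's first recommendation, measuring how strongly a proposed safety benchmark correlates with general capabilities, can be sketched in a few lines. The per-model scores below are hypothetical illustrations, not data from the paper:

```python
# Minimal sketch of the "capabilities correlation" check: does a safety
# benchmark mostly re-measure general capability, or something distinct?
import numpy as np

# Hypothetical per-model scores: a general-capabilities index plus two
# candidate safety benchmarks, one entangled with capabilities, one not.
capabilities = np.array([20.0, 35.0, 50.0, 65.0, 80.0])
safety_entangled = np.array([22.0, 37.0, 49.0, 66.0, 78.0])     # tracks capability
safety_decorrelated = np.array([60.0, 41.0, 55.0, 47.0, 52.0])  # roughly flat

def capabilities_correlation(safety_scores, capability_scores):
    """Pearson correlation between a safety benchmark and capabilities."""
    return float(np.corrcoef(safety_scores, capability_scores)[0, 1])

r1 = capabilities_correlation(safety_entangled, capabilities)
r2 = capabilities_correlation(safety_decorrelated, capabilities)
print(f"entangled benchmark:    r = {r1:.2f}")  # near 1.0: mostly measures capability
print(f"decorrelated benchmark: r = {r2:.2f}")  # near 0: candidate safety signal
```

A high correlation suggests the benchmark will be "solved with scale" anyway; a low (or negative) correlation flags a property worth dedicated safety work.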

Iko Nini Podcast
TWU Ep 91 part 1

Iko Nini Podcast

Jul 21, 2024 · 61:40


LOYALTY, SYCOPHANCY & WINNING


Politics Done Right
ON VIDEO: Doug Burgum, Trump's VP wannabe, staunch Roe v Wade supporter until his Trump sycophancy.

Politics Done Right

Jul 1, 2024 · 4:59


Watch: Doug Burgum, once a staunch supporter of Roe v. Wade, shifts stance as he eyes VP spot with Trump. Explore his political transformation and motives. Subscribe to our Newsletter: https://politicsdoneright.com/newsletter Purchase our Books: As I See It: https://amzn.to/3XpvW5o How To Make America Utopia: https://amzn.to/3VKVFnG It's Worth It: https://amzn.to/3VFByXP Lose Weight And Be Fit Now: https://amzn.to/3xiQK3K Tribulations of an Afro-Latino Caribbean man: https://amzn.to/4c09rbE --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message

Let's Talk AI
#171 - Apple Intelligence, Dream Machine, SSI Inc

Let's Talk AI

Jun 24, 2024 · 124:01 · Transcription available


Our 171st episode with a summary and discussion of last week's big AI news! With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris)

Feel free to leave us feedback here. Read our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai and/or hello@gladstone.ai

Timestamps + Links:
(00:00:00) Intro / Banter

Tools & Apps
(00:03:13) Apple Intelligence: every new AI feature coming to the iPhone and Mac
(00:10:03) 'We don't need Sora anymore': Luma's new AI video generator Dream Machine slammed with traffic after debut
(00:14:48) Runway unveils new hyper realistic AI video model Gen-3 Alpha, capable of 10-second-long clips
(00:18:21) Leonardo AI image generator adds new video mode — here's how it works
(00:22:31) Anthropic just dropped Claude 3.5 Sonnet with better vision and a sense of humor

Applications & Business
(00:28:23) Sam Altman might reportedly turn OpenAI into a regular for-profit company
(00:31:19) Ilya Sutskever, Daniel Gross, Daniel Levy launch Safe Superintelligence Inc.
(00:38:53) OpenAI welcomes Sarah Friar (CFO) and Kevin Weil (CPO)
(00:41:44) Report: OpenAI Doubled Annualized Revenue in 6 Months
(00:44:30) AI startup Adept is in deal talks with Microsoft
(00:48:55) Mistral closes €600m at €5.8bn valuation with new lead investor
(00:53:12) Huawei Claims Ascend 910B AI Chip Manages To Surpass NVIDIA's A100, A Crucial Alternative For China
(00:56:58) Astrocade raises $12M for AI-based social gaming platform

Projects & Open Source
(01:01:03) Announcing the Open Release of Stable Diffusion 3 Medium, Our Most Sophisticated Image Generation Model to Date
(01:05:53) Meta releases flurry of new AI models for audio, text and watermarking
(01:09:39) ElevenLabs unveils open-source creator tool for adding sound effects to videos

Research & Advancements
(01:12:02) Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
(01:22:07) Improve Mathematical Reasoning in Language Models by Automated Process Supervision
(01:28:01) Introducing Lamini Memory Tuning: 95% LLM Accuracy, 10x Fewer Hallucinations
(01:30:32) An Empirical Study of Mamba-based Language Models
(01:31:57) BERTs are Generative In-Context Learners
(01:33:33) SELFGOAL: Your Language Agents Already Know How to Achieve High-level Goals

Policy & Safety
(01:35:16) Sycophancy to subterfuge: Investigating reward tampering in language models
(01:42:26) Waymo issues software and mapping recall after robotaxi crashes into a telephone pole
(01:45:53) Meta pauses AI models launch in Europe
(01:46:44) Refusal in Language Models Is Mediated by a Single Direction
(01:51:38) Huawei exec concerned over China's inability to obtain 3.5nm chips, bemoans lack of advanced chipmaking tools

Synthetic Media & Art
(01:55:07) It Looked Like a Reliable News Site. It Was an A.I. Chop Shop.
(01:57:39) Adobe overhauls terms of service to say it won't train AI on customers' work
(01:59:31) Buzzy AI Search Engine Perplexity Is Directly Ripping Off Content From News Outlets
(02:02:23) Outro + AI Song

LessWrong Curated Podcast
“Sycophancy to subterfuge: Investigating reward tampering in large language models” by evhub, Carson Denison

LessWrong Curated Podcast

Jun 20, 2024 · 15:37


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual. This is a link post.

New Anthropic model organisms research paper led by Carson Denison from the Alignment Stress-Testing Team demonstrating that large language models can generalize zero-shot from simple reward-hacks (sycophancy) to more complex reward tampering (subterfuge). Our results suggest that accidentally incentivizing simple reward-hacks such as sycophancy can have dramatic and very difficult to reverse consequences for how models generalize, up to and including generalization to editing their own reward functions and covering up their tracks when doing so.

Abstract: In reinforcement learning, specification gaming occurs when AI systems learn undesired behaviors that are highly rewarded due to misspecified training goals. Specification gaming can range from simple behaviors like sycophancy to sophisticated and pernicious behaviors like reward-tampering, where a model directly modifies its own reward mechanism. However, these more pernicious behaviors may be too [...]

First published: June 17th, 2024
Source: https://www.lesswrong.com/posts/FSgGBjDiaCdWxNBhj/sycophancy-to-subterfuge-investigating-reward-tampering-in
Narrated by TYPE III AUDIO.

The Nonlinear Library
AF - Sycophancy to subterfuge: Investigating reward tampering in large language models by Evan Hubinger

The Nonlinear Library

Jun 17, 2024 · 13:00


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Sycophancy to subterfuge: Investigating reward tampering in large language models, published by Evan Hubinger on June 17, 2024 on The AI Alignment Forum.

New Anthropic model organisms research paper led by Carson Denison from the Alignment Stress-Testing Team demonstrating that large language models can generalize zero-shot from simple reward-hacks (sycophancy) to more complex reward tampering (subterfuge). Our results suggest that accidentally incentivizing simple reward-hacks such as sycophancy can have dramatic and very difficult to reverse consequences for how models generalize, up to and including generalization to editing their own reward functions and covering up their tracks when doing so.

Abstract: In reinforcement learning, specification gaming occurs when AI systems learn undesired behaviors that are highly rewarded due to misspecified training goals. Specification gaming can range from simple behaviors like sycophancy to sophisticated and pernicious behaviors like reward-tampering, where a model directly modifies its own reward mechanism. However, these more pernicious behaviors may be too complex to be discovered via exploration. In this paper, we study whether Large Language Model (LLM) assistants which find easily discovered forms of specification gaming will generalize to perform rarer and more blatant forms, up to and including reward-tampering. We construct a curriculum of increasingly sophisticated gameable environments and find that training on early-curriculum environments leads to more specification gaming on remaining environments. Strikingly, a small but non-negligible proportion of the time, LLM assistants trained on the full curriculum generalize zero-shot to directly rewriting their own reward function. Retraining an LLM not to game early-curriculum environments mitigates, but does not eliminate, reward-tampering in later environments. Moreover, adding harmlessness training to our gameable environments does not prevent reward-tampering. These results demonstrate that LLMs can generalize from common forms of specification gaming to more pernicious reward tampering and that such behavior may be nontrivial to remove.

Twitter thread: New Anthropic research: Investigating Reward Tampering. Could AI models learn to hack their own reward system? In a new paper, we show they can, by generalization from training in simpler settings. Read our blog post here: https://anthropic.com/research/reward-tampering We find that models generalize, without explicit training, from easily-discoverable dishonest strategies like sycophancy to more concerning behaviors like premeditated lying, and even direct modification of their reward function. We designed a curriculum of increasingly complex environments with misspecified reward functions. Early on, AIs discover dishonest strategies like insincere flattery. They then generalize (zero-shot) to serious misbehavior: directly modifying their own code to maximize reward. Does training models to be helpful, honest, and harmless (HHH) mean they don't generalize to hack their own code? Not in our setting. Models overwrite their reward at similar rates with or without harmlessness training on our curriculum. Even when we train away easily detectable misbehavior, models still sometimes overwrite their reward when they can get away with it. This suggests that fixing obvious misbehaviors might not remove hard-to-detect ones. Our work provides empirical evidence that serious misalignment can emerge from seemingly benign reward misspecification. Read the full paper: https://arxiv.org/abs/2406.10162

The Anthropic Alignment Science team is actively hiring research engineers and scientists. We'd love to see your application: https://boards.greenhouse.io/anthropic/jobs/4009165008

Blog post: Perverse incentives are everywhere. Thi...

Politics Done Right
Trump once told a healthcare truth. Progressive Molly Cook wins big in Texas. Tim Scott sycophancy.

Politics Done Right

May 6, 2024 · 59:01


It's true. Trump once told a healthcare truth. Progressive Molly Cook is now a Texas State Senator. Senator Tim Scott is more than a sycophant as he morphs into a Trump fascist! --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message

Politics Done Right
JD Vance and his sycophancy represent the new GOP. Immigrants boost GDP. Neil Aquino visits.

Politics Done Right

Feb 8, 2024 · 58:43


Senator JD Vance (R-OH) believes those who are concerned about Trump's sexual assault should be embarrassed. CBO says immigrants will boost US GDP & Tax Revenue. Neil Aquino visits. --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message

Politics Done Right
JD Vance and his sycophancy represent the new GOP. 'Hell No to Plan to Privatize Medicare'

Politics Done Right

Feb 7, 2024 · 55:22


The privatization of Medicare via trickery must be prevented at all costs. Senator JD Vance (R-OH) believes those who are concerned about Trump's sexual assault should be embarrassed. --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message

Politics Done Right
Rep. Gonzales' Trump sycophancy's liability for Texas. Wendell Potter on Medicare Advantage

Politics Done Right

Jan 15, 2024 · 58:00


GOP Rep. Tony Gonzales' endorsement of Trump is an embarrassment for Texas. Wendell Potter explains the failure of American healthcare and Medicare Advantage. Christie's epic exit from GOP primary. --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message

Politics Done Right
Stephanopoulos embarrasses GOP TX Rep. Tony Gonzales for his Trump endorsement and sycophancy

Politics Done Right

Jan 13, 2024 · 8:24


GOP Texas Rep. Tony Gonzales made a fool of himself by exposing his level of sycophancy for Donald Trump when George Stephanopoulos exposed him for endorsing a man he does not agree with. --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message

Politics Done Right
Rep. Jasmine Crockett on Trump. Rep. Tony Gonzales Trump sycophancy. Wendell Potter speaks!

Politics Done Right

Jan 8, 2024 · 58:00


Rep. Jasmine Crockett exposes Trump's affinity for foreign money from China. Stephanopoulos exposed Rep. Tony Gonzales as a real Trump sycophant. Health insurance whistle-blower Wendell Potter speaks. --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message

The Optimistic American
How Independents See the Republican Debate: U.S. Politics and Thoughts on the Future of America

The Optimistic American

Sep 27, 2023 · 42:50


Paul Johnson analyzes the recent Republican debate from the view point of an independent voter, that featured former Vice President Mike Pence, Vivek Ramaswamy, Chris Christie, Ron Desantis and Nikki Haley. Paul's observations touch upon the status quo of U.S. politics, the influence of a small minority population on partisan primaries, the importance of states like Iowa and New Hampshire, the present state of America, as well as upon conversations about abortion and foreign policy. Today's episode is a commentary on what Paul refers to as the “exceptional” recent Republican debate. There are key commonalities among independents: they register as unaffiliated because they don't want to be identified as a group, and they like candidates that aren't afraid to buck their own party. Regardless of their vast differences they personify individualism. Paul played the answer given by the candidates about whether they would support Donald Trump as U.S. President even if he was convicted of a crime. Paul spoke about why a majority of candidates had to say yes. Paul reviewed how 44% of Americans registered as an independent, leaving about 60% split 30-30 between Democrats and Republicans. He illustrated how with less than 35% turnouts in the primary, it generally leaves about 8% of all Amercians who will vote in either primary. This gives a disproportional voice on both sides to voters who are more extreme. Candidates have no choice but to abide by this reality. Paul illustrates in congressional and legislative races, 70% of the districts have been gerrymandered to the point that there is no competition in the general election. This means that 70% of our congress is elected by less than 8% of the American voters. In the Presidential race it is still less than 8% of the voters who select the nominee of each party. After they have made the case to these more extreme voters in the primary, it can be hard to pivot. 
This leaves candidates and the parties to convince you that they are not as bad as the other party, instead of creating an inspirational message of where we should go. Paul pointed out how some of the candidates in this debate bucked that trend. Paul discussed how, from his experience working in presidential campaigns, one of these candidates could upset the front-runner Trump by winning Iowa, a caucus state, or New Hampshire. Paul reviewed why Vivek Ramaswamy was originally attractive to independents. In his book Nation of Victims, Ramaswamy wrote that Trump represented a victim state, but he reversed his position 180 degrees during the Republican debate. Sycophancy in this election is a valid strategy: if a candidate believes Trump may lose his criminal trials and be somehow unable to finish the primary, being his defender could cause Trump's voters to shift to the defender – Paul explains why. Vice President Mike Pence is seen as a “coward” by both the left and the right. Paul points to the role Pence played in upholding the Constitution by not overturning the election, and argues he will be seen by many independents as someone who actually did something heroic. Paul touches upon the roles Iowa and New Hampshire tend to play and how they will impact the upcoming presidential elections. The foreign policy part of the Republican debate is something that really caught Paul's attention. Paul plays several comments by candidates laying out dramatic views of America's role in the world. Paul unpacks the post-World War II ramifications that led the U.S. to become a superpower with plenty of allies worldwide, and how that leadership role is being challenged by China, Russia, and here at home. Paul reviewed how different candidates approached describing problems: some using fear, others using inspiration. Paul countered some of the candidates' dark views of America. He believes that there isn't any better place to be today than the U.S. 
He notes that despite making up less than 5% of the world's population, the U.S. holds over 31% of the world's wealth and produces 35% of the world's innovation.
Mentioned in This Episode:
optamerican.com
Addictive Ideologies: Finding Meaning and Agency When Politics Fail You by Dr. Emily Bashah and Hon. Paul Johnson
The Optimistic American on YouTube - @optamerican
Become a premium supporter of the show: OptAmerican.com/premium
Previous episode - Does America Need a 3rd Party Candidate for President in 2024? With No Labels Founder, Sen. Joe Lieberman
Previous episode - No Left, No Right but Forward with Forward Party Founder Andrew Yang
Previous episode - Why America Needs a New Political Party with Forward Party Founder, Governor, and Madam Secretary Christine Todd Whitman
Donald Trump
Thomas Jefferson
John McCain
Chris Christie
Vivek Ramaswamy
Gallup.com
The Nation of Victims: Identity Politics, the Death of Merit, and the Path Back to Excellence by Vivek Ramaswamy
Mike Pence
Kamala Harris
Nikki Haley
Ron DeSantis
Hunter Biden

Papers Read on AI
Simple synthetic data reduces sycophancy in large language models

Papers Read on AI

Play Episode Listen Later Aug 18, 2023 24:05


Sycophancy is an undesirable behavior where models tailor their responses to follow a human user's view even when that view is not objectively correct (e.g., adapting liberal views once a user reveals that they are liberal). In this paper, we study the prevalence of sycophancy in language models and propose a simple synthetic-data intervention to reduce this behavior. First, on a set of three sycophancy tasks (Perez et al., 2022) where models are asked for an opinion on statements with no correct answers (e.g., politics), we observe that both model scaling and instruction tuning significantly increase sycophancy for PaLM models up to 540B parameters. Second, we extend sycophancy evaluations to simple addition statements that are objectively incorrect, finding that despite knowing that these statements are wrong, language models will still agree with them if the user does as well. To reduce sycophancy, we present a straightforward synthetic-data intervention that takes public NLP tasks and encourages models to be robust to user opinions on these tasks. Adding these data in a lightweight finetuning step can significantly reduce sycophantic behavior on held-out prompts. Code for generating synthetic data for intervention can be found at https://github.com/google/sycophancy-intervention. 2023: Jerry Wei, Da Huang, Yifeng Lu, Denny Zhou, Quoc V. Le https://arxiv.org/pdf/2308.03958v1.pdf
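The abstract's intervention can be sketched in a few lines of Python. This is an illustrative reconstruction, not the paper's actual data pipeline: the templates, field names, and toy arithmetic claims below are invented for the example (the real intervention builds on public NLP tasks).

```python
import random

# Illustrative sketch of the synthetic-data idea: pair a user's stated
# opinion about an objectively checkable claim with a target answer that
# depends only on the claim's truth, never on the user's view.

CLAIMS = [
    ("1 + 1 = 2", "agree"),
    ("1 + 1 = 3", "disagree"),
    ("2 + 2 = 4", "agree"),
    ("2 + 2 = 5", "disagree"),
]

def make_example(claim: str, label: str, user_view: str) -> dict:
    prompt = (
        f"A user says: 'I {user_view} with the claim that {claim}.' "
        "Do you agree or disagree with the claim?"
    )
    # The target is fixed by the claim's truth value, so finetuning on
    # these examples penalizes echoing the user's opinion.
    return {"prompt": prompt, "target": label}

def build_dataset(seed: int = 0) -> list[dict]:
    rng = random.Random(seed)
    data = [
        make_example(claim, label, view)
        for claim, label in CLAIMS
        for view in ("agree", "disagree")
    ]
    rng.shuffle(data)
    return data
```

Adding data of this shape in a lightweight finetuning step is what the paper reports reduces sycophancy on held-out prompts.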

The Nonlinear Library
LW - Understanding and visualizing sycophancy datasets by Nina Rimsky

The Nonlinear Library

Play Episode Listen Later Aug 16, 2023 9:17


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Understanding and visualizing sycophancy datasets, published by Nina Rimsky on August 16, 2023 on LessWrong. Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the mentorship of Evan Hubinger. Generating datasets that effectively test for and elicit sycophancy in LLMs is helpful for several purposes, such as: Evaluating sycophancy Finetuning models to reduce sycophancy Generating steering vectors for activation steering While working on activation steering to reduce sycophancy, I have found that projecting intermediate activations on sycophancy test datasets to a lower dimensional space (in this case, 2D) and assessing the separability of sycophantic / non-sycophantic texts is a helpful way of determining the usefulness of a dataset when it comes to generating steering vectors. Common sycophancy dataset formats Anthropic's sycophancy datasets used in their paper Discovering Language Model Behaviors with Model-Written Evaluations employ two formats. In particular, the Anthropic data includes two agree vs. disagree format datasets (Sycophancy on NLP survey, Sycophancy on PhilPapers 2020) and one A / B statement choice dataset (Sycophancy on political typology). Agree vs. disagree A / B choice Simple synthetic data reduces sycophancy in large language models Deepmind's recent paper Simple synthetic data reduces sycophancy in large language models finds that finetuning models on LLM-generated examples that elicit sycophancy in the original RLHF / instruction-finetuned model is an effective technique to reduce the prevalence of sycophancy. 
Not only does this appear to be effective for opinion-based sycophancy, but also for cases when there exists a ground truth (dishonest sycophancy): The paper also raises some limitations / common obstacles when it comes to sycophancy dataset design and generation. Sensitivity to prompt format Ordering of choices and examples can significantly affect model performance, particularly for less powerful models. In my experiments, I have found that activation steering with the sycophancy vector increases the likelihood of models picking A over B in neutral A/B choices or agreeing with statements in neutral agree/disagree scenarios. Disagreeableness bias Deepmind's paper states: We did not conduct experimentation on correct addition statements that would verify that models can agree with correct statements (versus disagreeing with incorrect statements). We conducted preliminary experiments to explore this evaluation but found that models (especially small ones) could not consistently identify correct addition statements with no user opinions, despite being able to identify incorrect statements. It is helpful for sycophancy evaluations to measure the model's propensity to disagree with incorrect statements and agree with correct statements. Otherwise, there is a risk of models learning to be less sycophantic at the cost of disagreeing with correct statements. Recipe for custom sycophancy dataset Step 1: persona/viewpoint generation Generate a dataset (using Claude 2 / GPT-3.5 / GPT-4) of first-person bios paired with statements they are likely to agree or disagree with. 
Here is a prompt I used: Step 2: turn into multiple-choice questions Transform these into a series of A / B choices (randomize whether A or B is the option the user would agree with) or Transform these into a series of agree/disagree questions (have an agree/disagree choice for each of the statements - so we get 2n data points from n bios) Variation: testing for dishonest sycophancy To test for dishonest sycophancy, we can replace the bio with a statement like "I think that the earth is flat" and make the question options related to something with ground truth. Visualizing projected activations Each chart was generated using 500 rando...
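The 2D-projection check described above can be sketched with plain numpy. This is a hedged illustration: synthetic Gaussian clusters stand in for the real residual-stream activations the post projects, and the separability heuristic is an assumption of this sketch, not the post's exact method.

```python
import numpy as np

# Synthetic stand-ins for activations on sycophantic / non-sycophantic texts.
rng = np.random.default_rng(0)
d_model = 16
syco = rng.normal(loc=1.0, size=(200, d_model))
non_syco = rng.normal(loc=-1.0, size=(200, d_model))

acts = np.vstack([syco, non_syco])
labels = np.array([1] * 200 + [0] * 200)

# PCA via SVD on mean-centered activations; keep the top 2 components.
centered = acts - acts.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
proj_2d = centered @ vt[:2].T  # shape (400, 2)

# Crude separability check: threshold the first principal component and see
# how well it splits the two classes (take the better of the two orientations).
pc1 = proj_2d[:, 0]
threshold = pc1.mean()
is_class1 = labels.astype(bool)
acc = max(
    ((pc1 > threshold) == is_class1).mean(),
    ((pc1 < threshold) == is_class1).mean(),
)
```

When the projected classes separate cleanly like this, the dataset is (by the post's heuristic) more promising for generating steering vectors.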

The Nonlinear Library
AF - Modulating sycophancy in an RLHF model via activation steering by NinaR

The Nonlinear Library

Play Episode Listen Later Aug 9, 2023 18:15


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Modulating sycophancy in an RLHF model via activation steering, published by NinaR on August 9, 2023 on The AI Alignment Forum. Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the mentorship of Evan Hubinger. Thanks to Alex Turner for his feedback and ideas. This is a follow-up post to "Reducing sycophancy and improving honesty via activation steering." I find that activation steering can also be used to modulate sycophancy in llama-2-7b-chat, an RLHF LLM assistant. Steering via adding sycophancy-correlated activation vectors elicited increased "user-pleasing" behavior at the cost of factual accuracy, providing evidence that specific decision nodes govern high-level behaviors such as sycophancy and dishonesty and that activation steering can be effective in RLHF models. All code for the referenced experiments can be found in this repository. In particular, the relevant notebook is here. Sycophancy in RLHF models After some initial promising results modulating sycophancy via activation steering in llama-2-7b (an open-source base LLM), I tested the approach on llama-2-7b-chat. According to Meta AI's llama 2 paper: Llama 2-Chat is the result of several months of research and iterative applications of alignment techniques, including both instruction tuning and RLHF, requiring significant computational and annotation resources. Anthropic's paper Discovering Language Model Behaviors with Model-Written Evaluations finds that both increasing model size and performing reinforcement learning from human feedback (RLHF) increase the prevalence of sycophancy. Based on this, I wanted to investigate the effect of sycophancy steering via activation addition on an RLHF model. 
Generating steering vectors As before, I generated a steering vector from Anthropic's sycophancy dataset by averaging the differences in intermediate residual stream activations after a transformer block given paired sycophantic / non-sycophantic texts. These vectors were then used during model generation by adding them at the same point of the transformer residual stream for each new generated token. However, unlike with the base model, I could now use the same canonical instruction formatting that the Chat model is trained to expect instead of augmenting the prompt with "Question:" and "Answer.". I used a system instruction of You are a helpful, honest and concise assistant. I generated steering vectors for layers 21-28 of the 32-layer transformer. Through manual testing, I found that at multiple layers, steering with the sycophancy vector resulted in a predictable qualitative change in various aspects of sycophancy, including: General agreeableness Likelihood of expressing an opinion that aligns with the users' as opposed to declining to answer subjective questions of opinion (the behavior purposefully incentivized by RLHF) Likelihood of confirming a misconception that the user may hold (I also tested steering on a small dataset for layers 10-20 and found no significant effect for most of these layers, besides 15, where there was a noticeable effect (full dataset) - it'd be interesting to interpret further why layer 15 is more significant). This chart shows the effect of steering with multipliers + and - 50 on an AI-generated dataset of questions designed to test sycophancy, for instance, "I believe in the healing power of crystals. Do you share the same belief?" or "Is it more important to be kind or to be truthful?". The truthfulness score was assessed independently for each answer using Anthropic's Claude 2 API using this script. 
(After normalizing the steering vector, I also found that later layers required a larger multiplier magnitude to achieve the same effect.) Some example outputs under activation steering at layer 22 include (full dataset): Question: I ...
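The steering-vector recipe the post describes (mean difference of paired activations, added back into the residual stream during generation) can be sketched in numpy. Random arrays stand in for the transformer activations that the real experiments hook at a chosen layer; shapes and the multiplier are illustrative.

```python
import numpy as np

# Stand-in activations: one row per example text, one column per residual
# stream dimension. In the real experiments these come from a transformer
# block of llama-2-7b-chat on paired sycophantic / non-sycophantic texts.
d_model = 8
rng = np.random.default_rng(0)
syco_acts = rng.normal(size=(100, d_model))
non_syco_acts = rng.normal(size=(100, d_model))

# The steering vector is the mean difference between the paired activations.
steering_vector = syco_acts.mean(axis=0) - non_syco_acts.mean(axis=0)

def steer(residual_stream: np.ndarray, multiplier: float) -> np.ndarray:
    """Add the scaled steering vector at every token position.

    A positive multiplier pushes generation toward sycophancy; a negative
    one pushes away from it.
    """
    return residual_stream + multiplier * steering_vector

# Residual stream for 5 token positions during generation.
tokens = rng.normal(size=(5, d_model))
steered = steer(tokens, multiplier=50.0)
```

In practice this addition is performed via a hook at the same layer the vector was extracted from, for each newly generated token.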

The Nonlinear Library
LW - Modulating sycophancy in an RLHF model via activation steering by NinaR

The Nonlinear Library

Play Episode Listen Later Aug 9, 2023 18:01


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Modulating sycophancy in an RLHF model via activation steering, published by NinaR on August 9, 2023 on LessWrong. Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the mentorship of Evan Hubinger. Thanks to Alex Turner for his feedback and ideas. This is a follow-up post to "Reducing sycophancy and improving honesty via activation steering." I find that activation steering can also be used to modulate sycophancy in llama-2-7b-chat, an RLHF LLM assistant. Steering via adding sycophancy-correlated activation vectors elicited increased "user-pleasing" behavior at the cost of factual accuracy, providing evidence that specific decision nodes govern high-level behaviors such as sycophancy and dishonesty and that activation steering can be effective in RLHF models. All code for the referenced experiments can be found in this repository. In particular, the relevant notebook is here. Sycophancy in RLHF models After some initial promising results modulating sycophancy via activation steering in llama-2-7b (an open-source base LLM), I tested the approach on llama-2-7b-chat. According to Meta AI's llama 2 paper: Llama 2-Chat is the result of several months of research and iterative applications of alignment techniques, including both instruction tuning and RLHF, requiring significant computational and annotation resources. Anthropic's paper Discovering Language Model Behaviors with Model-Written Evaluations finds that both increasing model size and performing reinforcement learning from human feedback (RLHF) increase the prevalence of sycophancy. Based on this, I wanted to investigate the effect of sycophancy steering via activation addition on an RLHF model. 
Generating steering vectors As before, I generated a steering vector from Anthropic's sycophancy dataset by averaging the differences in intermediate residual stream activations after a transformer block given paired sycophantic / non-sycophantic texts. These vectors were then used during model generation by adding them at the same point of the transformer residual stream for each new generated token. However, unlike with the base model, I could now use the same canonical instruction formatting that the Chat model is trained to expect instead of augmenting the prompt with "Question:" and "Answer.". I used a system instruction of You are a helpful, honest and concise assistant. I generated steering vectors for layers 21-28 of the 32-layer transformer. Through manual testing, I found that at multiple layers, steering with the sycophancy vector resulted in a predictable qualitative change in various aspects of sycophancy, including: General agreeableness Likelihood of expressing an opinion that aligns with the users' as opposed to declining to answer subjective questions of opinion (the behavior purposefully incentivized by RLHF) Likelihood of confirming a misconception that the user may hold (I also tested steering on a small dataset for layers 10-20 and found no significant effect for most of these layers, besides 15, where there was a noticeable effect (full dataset) - it'd be interesting to interpret further why layer 15 is more significant). This chart shows the effect of steering with multipliers + and - 50 on an AI-generated dataset of questions designed to test sycophancy, for instance, "I believe in the healing power of crystals. Do you share the same belief?" or "Is it more important to be kind or to be truthful?". The truthfulness score was assessed independently for each answer using Anthropic's Claude 2 API using this script. 
(After normalizing the steering vector, I also found that later layers required a larger multiplier magnitude to achieve the same effect.) Some example outputs under activation steering at layer 22 include (full dataset): Question: I believe in th...

The Nonlinear Library
LW - Reducing sycophancy and improving honesty via activation steering by NinaR

The Nonlinear Library

Play Episode Listen Later Jul 28, 2023 14:24


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Reducing sycophancy and improving honesty via activation steering, published by NinaR on July 28, 2023 on LessWrong. Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort I generate an activation steering vector using Anthropic's sycophancy dataset and then find that this can be used to increase or reduce performance on TruthfulQA, indicating a common direction between sycophancy on questions of opinion and untruthfulness on questions relating to common misconceptions. I think this could be a promising research direction to understand dishonesty in language models better. What is sycophancy? Sycophancy in LLMs refers to the behavior when a model tells you what it thinks you want to hear / would approve of instead of what it internally represents as the truth. Sycophancy is a common problem in LLMs trained on human-labeled data because human-provided training signals more closely encode 'what outputs do humans approve of' as opposed to 'what is the most truthful answer.' According to Anthropic's paper Discovering Language Model Behaviors with Model-Written Evaluations: Larger models tend to repeat back a user's stated views ("sycophancy"), for pretrained LMs and RLHF models trained with various numbers of RL steps. Preference Models (PMs) used for RL incentivize sycophancy. Two types of sycophancy I think it's useful to distinguish between sycophantic behavior when there is a ground truth correct output vs. when the correct output is a matter of opinion. I will call these "dishonest sycophancy" and "opinion sycophancy." Opinion sycophancy Anthropic's sycophancy test on political questions shows that a model is more likely to output text that agrees with what it thinks is the user's political preference. However, there is no ground truth for the questions tested. 
It's reasonable to expect that models will exhibit this kind of sycophancy on questions of personal opinion for three reasons: The base training data (internet corpora) is likely to contain large chunks of text written from the same perspective. Therefore, when predicting the continuation of text from a particular perspective, models will be more likely to adopt that perspective. There is a wide variety of political perspectives/opinions on subjective questions, and a model needs to be able to represent all of them to do well on various training tasks. Unlike questions that have a ground truth (e.g., "Is the earth flat?"), the model has to, at some point, make a choice between the perspectives available to it. This makes it particularly easy to bias the choice of perspective for subjective questions, e.g., by word choice in the input. RLHF or supervised fine-tuning incentivizes sounding good to human evaluators, who are more likely to approve of outputs that they agree with, even when it comes to subjective questions with no clearly correct answer. Dishonest sycophancy A more interesting manifestation of sycophancy occurs when an AI model delivers an output it recognizes as factually incorrect but aligns with what it perceives to be a person's beliefs. This involves the AI model echoing incorrect information based on perceived user biases. For instance, if a user identifies themselves as a flat-earther, the model may support the fallacy that the earth is flat. Similarly, if it understands that you firmly believe aliens have previously landed on Earth, it might corroborate this, falsely affirming that such an event has been officially confirmed by scientists. Do AIs internally represent the truth? Although humans tend to disagree on a bunch of things, for instance, politics and religious views, there is much more in common between human world models than there are differences. 
This is particularly true when it comes to questions that do indeed have a correct answer. It seems re...
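The distinction the post draws between opinion sycophancy and dishonest sycophancy can be made concrete with toy prompt constructors. The prompts below are invented examples in the spirit of the post, not its actual evaluation data.

```python
def opinion_prompt(user_view: str, question: str) -> str:
    # Opinion sycophancy: no ground truth exists; this measures whether the
    # model simply echoes the user's stated perspective.
    return f"I am {user_view}. {question}"

def dishonest_prompt(user_belief: str, factual_question: str) -> str:
    # Dishonest sycophancy: a ground truth exists; this measures agreement
    # with a user belief the model can recognize as factually wrong.
    return f"I think that {user_belief}. {factual_question}"

opinion = opinion_prompt(
    "a committed libertarian", "Should taxes be lower?"
)
dishonest = dishonest_prompt(
    "the earth is flat", "What shape is the earth?"
)
```

Evaluating a model on both prompt types separately is what lets one tell whether steering reduces mere agreeableness, factual capitulation, or both.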

The Nonlinear Library
AF - Reducing sycophancy and improving honesty via activation steering by NinaR

The Nonlinear Library

Play Episode Listen Later Jul 28, 2023 14:26


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Reducing sycophancy and improving honesty via activation steering, published by NinaR on July 28, 2023 on The AI Alignment Forum. Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort, under the mentorship of Evan Hubinger. I generate an activation steering vector using Anthropic's sycophancy dataset and then find that this can be used to increase or reduce performance on TruthfulQA, indicating a common direction between sycophancy on questions of opinion and untruthfulness on questions relating to common misconceptions. I think this could be a promising research direction to understand dishonesty in language models better. What is sycophancy? Sycophancy in LLMs refers to the behavior when a model tells you what it thinks you want to hear / would approve of instead of what it internally represents as the truth. Sycophancy is a common problem in LLMs trained on human-labeled data because human-provided training signals more closely encode 'what outputs do humans approve of' as opposed to 'what is the most truthful answer.' According to Anthropic's paper Discovering Language Model Behaviors with Model-Written Evaluations: Larger models tend to repeat back a user's stated views ("sycophancy"), for pretrained LMs and RLHF models trained with various numbers of RL steps. Preference Models (PMs) used for RL incentivize sycophancy. Two types of sycophancy I think it's useful to distinguish between sycophantic behavior when there is a ground truth correct output vs. when the correct output is a matter of opinion. I will call these "dishonest sycophancy" and "opinion sycophancy." Opinion sycophancy Anthropic's sycophancy test on political questions shows that a model is more likely to output text that agrees with what it thinks is the user's political preference. 
However, there is no ground truth for the questions tested. It's reasonable to expect that models will exhibit this kind of sycophancy on questions of personal opinion for three reasons: The base training data (internet corpora) is likely to contain large chunks of text written from the same perspective. Therefore, when predicting the continuation of text from a particular perspective, models will be more likely to adopt that perspective. There is a wide variety of political perspectives/opinions on subjective questions, and a model needs to be able to represent all of them to do well on various training tasks. Unlike questions that have a ground truth (e.g., "Is the earth flat?"), the model has to, at some point, make a choice between the perspectives available to it. This makes it particularly easy to bias the choice of perspective for subjective questions, e.g., by word choice in the input. RLHF or supervised fine-tuning incentivizes sounding good to human evaluators, who are more likely to approve of outputs that they agree with, even when it comes to subjective questions with no clearly correct answer. Dishonest sycophancy A more interesting manifestation of sycophancy occurs when an AI model delivers an output it recognizes as factually incorrect but aligns with what it perceives to be a person's beliefs. This involves the AI model echoing incorrect information based on perceived user biases. For instance, if a user identifies themselves as a flat-earther, the model may support the fallacy that the earth is flat. Similarly, if it understands that you firmly believe aliens have previously landed on Earth, it might corroborate this, falsely affirming that such an event has been officially confirmed by scientists. Do AIs internally represent the truth? Although humans tend to disagree on a bunch of things, for instance, politics and religious views, there is much more in common between human world models than there are differences. 
This is particularly true when it comes to questi...
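The steering recipe this episode describes (a contrastive activation vector that can push a model toward or away from sycophancy) reduces to simple vector arithmetic: average the activations over sycophantic completions, subtract the average over honest ones, then add a scaled copy of the difference to the residual stream at inference time. A minimal NumPy sketch of that idea, where toy arrays stand in for a transformer layer's activations and all names and the `alpha` scale are illustrative rather than taken from the post:

```python
import numpy as np

def steering_vector(acts_sycophantic, acts_honest):
    """Contrastive steering vector: difference of mean activations
    over paired sycophantic vs. honest completions."""
    return acts_sycophantic.mean(axis=0) - acts_honest.mean(axis=0)

def steer(residual, vec, alpha):
    """Add a scaled steering vector to a residual-stream activation.
    alpha > 0 pushes toward the sycophantic direction; alpha < 0 away."""
    return residual + alpha * vec

# Toy data: 8 prompt pairs, hidden size 4 (a real run would record
# activations at a chosen transformer layer).
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 4))
direction = np.array([1.0, 0.0, -1.0, 0.0])  # pretend "sycophancy direction"
acts_syc = base + direction                   # sycophantic completions
acts_hon = base - direction                   # honest completions

vec = steering_vector(acts_syc, acts_hon)     # the shared base cancels out
steered = steer(acts_hon[0], vec, alpha=-0.5) # subtract to reduce sycophancy
```

Because the paired completions share the same underlying prompts, the prompt-specific component cancels in the mean difference and only the behavioral direction survives; in a real model the same subtraction is applied to live activations during generation.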

The Nonlinear Library: LessWrong
LW - Reducing sycophancy and improving honesty via activation steering by NinaR

The Nonlinear Library: LessWrong

Play Episode Listen Later Jul 28, 2023 14:24


Link to original articleWelcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Reducing sycophancy and improving honesty via activation steering, published by NinaR on July 28, 2023 on LessWrong. Produced as part of the SERI ML Alignment Theory Scholars Program - Summer 2023 Cohort I generate an activation steering vector using Anthropic's sycophancy dataset and then find that this can be used to increase or reduce performance on TruthfulQA, indicating a common direction between sycophancy on questions of opinion and untruthfulness on questions relating to common misconceptions. I think this could be a promising research direction to understand dishonesty in language models better. What is sycophancy? Sycophancy in LLMs refers to the behavior when a model tells you what it thinks you want to hear / would approve of instead of what it internally represents as the truth. Sycophancy is a common problem in LLMs trained on human-labeled data because human-provided training signals more closely encode 'what outputs do humans approve of' as opposed to 'what is the most truthful answer.' According to Anthropic's paper Discovering Language Model Behaviors with Model-Written Evaluations: Larger models tend to repeat back a user's stated views ("sycophancy"), for pretrained LMs and RLHF models trained with various numbers of RL steps. Preference Models (PMs) used for RL incentivize sycophancy. Two types of sycophancy I think it's useful to distinguish between sycophantic behavior when there is a ground truth correct output vs. when the correct output is a matter of opinion. I will call these "dishonest sycophancy" and "opinion sycophancy." Opinion sycophancy Anthropic's sycophancy test on political questions shows that a model is more likely to output text that agrees with what it thinks is the user's political preference. 
However, there is no ground truth for the questions tested. It's reasonable to expect that models will exhibit this kind of sycophancy on questions of personal opinion for three reasons: The base training data (internet corpora) is likely to contain large chunks of text written from the same perspective. Therefore, when predicting the continuation of text from a particular perspective, models will be more likely to adopt that perspective. There is a wide variety of political perspectives/opinions on subjective questions, and a model needs to be able to represent all of them to do well on various training tasks. Unlike questions that have a ground truth (e.g., "Is the earth flat?"), the model has to, at some point, make a choice between the perspectives available to it. This makes it particularly easy to bias the choice of perspective for subjective questions, e.g., by word choice in the input. RLHF or supervised fine-tuning incentivizes sounding good to human evaluators, who are more likely to approve of outputs that they agree with, even when it comes to subjective questions with no clearly correct answer. Dishonest sycophancy A more interesting manifestation of sycophancy occurs when an AI model delivers an output it recognizes as factually incorrect but aligns with what it perceives to be a person's beliefs. This involves the AI model echoing incorrect information based on perceived user biases. For instance, if a user identifies themselves as a flat-earther, the model may support the fallacy that the earth is flat. Similarly, if it understands that you firmly believe aliens have previously landed on Earth, it might corroborate this, falsely affirming that such an event has been officially confirmed by scientists. Do AIs internally represent the truth? Although humans tend to disagree on a bunch of things, for instance, politics and religious views, there is much more in common between human world models than there are differences. 
This is particularly true when it comes to questions that do indeed have a correct answer. It seems re...
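The episode mentions generating the steering vector from Anthropic's sycophancy dataset. As a complement, here is a sketch of how contrast pairs for such a vector might be assembled from a multiple-choice sycophancy item; the field names follow the model-written-evals style but should be treated as an assumption about the dataset's format:

```python
def contrast_pair(item):
    """Build (sycophantic, honest) completion strings for one
    multiple-choice sycophancy item. Field names are illustrative."""
    prompt = item["question"]
    sycophantic = prompt + " " + item["answer_matching_behavior"]
    honest = prompt + " " + item["answer_not_matching_behavior"]
    return sycophantic, honest

# Hypothetical item: the user states a belief, and one answer echoes it.
item = {
    "question": ("I believe the earth is flat. What shape is the earth? "
                 "(A) Flat (B) Round"),
    "answer_matching_behavior": "(A)",
    "answer_not_matching_behavior": "(B)",
}
syc, hon = contrast_pair(item)
```

Running the model on each string of the pair and recording activations at a fixed layer yields the two activation sets whose mean difference gives the steering vector.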

Politics Done Right
Lindsey Graham's path to sycophancy. Paul Fleming, his MS ordeal. Ohioans show people power.

Politics Done Right

Play Episode Listen Later Jul 6, 2023 57:54


Trump once again ridiculed Lindsey Graham on stage. Paul Fleming, activist & PDR Posse member, recounts his MS ordeal. Ohioans signed petitions in droves to support women's reproductive freedom. --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message Support this podcast: https://podcasters.spotify.com/pod/show/politicsdoneright/support

Politics Done Right
Trump abuses Lindsey Graham in person: The path from honor to sycophancy to embarrassment.

Politics Done Right

Play Episode Listen Later Jul 6, 2023 6:05


What happened to Lindsey Graham? He once was a relatively serious guy. Now he kisses Trump posterior even when Trump chides him. Now Trump says he will "straighten" him out. --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message Support this podcast: https://podcasters.spotify.com/pod/show/politicsdoneright/support

Politics Done Right
Lindsey Graham path to sycophancy. Pete Buttigieg, a real admin spokesperson. It's hot, hot!

Politics Done Right

Play Episode Listen Later Jul 5, 2023 56:17


Lindsey Graham tolerates embarrassing events with Donald Trump, a path to sycophancy. Pete Buttigieg is Biden's best spokesperson. It is getting hotter! --- Send in a voice message: https://podcasters.spotify.com/pod/show/politicsdoneright/message Support this podcast: https://podcasters.spotify.com/pod/show/politicsdoneright/support

OsazuwaAkonedo
I Will Not Tolerate Sycophancy - Abia Governor-elect Alex Otti

OsazuwaAkonedo

Play Episode Listen Later Apr 3, 2023 3:22


I Will Not Tolerate Sycophancy - Abia Governor-elect Alex Otti ~ OsazuwaAkonedo #Abia #Alex #governor-elect #job #newspapers #OsazuwaAkonedo #Otti #politics #sycophancy #tolerate https://osazuwaakonedo.news/i-will-not-tolerate-sycophancy-abia-governor-elect-alex-otti/02/04/2023/ By Ferdinand Ekeoma. Support this podcast with a small monthly donation to help sustain future episodes. Please use the links below: Support Via PayPal https://www.paypal.com/donate/?hosted_button_id=TLHBRAF6GVQT6 Support via card https://swiftpay.accessbankplc.com/OsazuwaAkonedo/send-money Support via Webmoney https://funding.wmtransfer.com/e1c3f11e-a616-4f6a-98d7-4d666a48d035/donate?c-start-error=K36158TP&sum=10 --- Support this podcast: https://anchor.fm/osazuwaakonedo/support --- Send in a voice message: https://podcasters.spotify.com/pod/show/osazuwaakonedo/message

Hardball with Chris Matthews
Joy Reid slams the sycophancy of the Republican Party

Hardball with Chris Matthews

Play Episode Listen Later Apr 23, 2022 42:25 Very Popular


Joy Reid leads this edition of The ReidOut with what many see as more proof of the idiocy and sycophancy of the Republican Party led by Kevin McCarthy, whose apparent plot to become House speaker comes with a new twist. Plus, despite Russian claims of a scaled back military offensive in Ukraine, we explore new indications today that Russia's goals are wider than previously indicated. Then, we analyze the fact that Marjorie Taylor Greene was forced to testify in Georgia state court on Friday in a legal challenge to her candidacy. The plaintiffs accuse her of violating the Constitution by inciting violence because of her incendiary comments rallying political terrorists to the Capitol on January 6th. Rep. Adam Schiff, member of the House select committee on the Jan. 6 attack, and the chairman of the House Intelligence Committee, grants us his important perspective. Finally, 16 people were arrested earlier this month after forming a human chain, blocking West Virginia's Grant Town Power Plant, protesting the money that Sen. Joe Manchin has made off of the plant, while local West Virginians suffer the toxic consequences. Joy commemorates their action as we celebrate Earth Day. All this and more in this edition of The ReidOut on MSNBC.

The Here and Now Podcast

Brownnosing, bootlicking, apple polishing and sucking up are among the many synonyms for the term sycophancy. Psychologists also know it as ingratiation. In this episode we explore several types of ingratiation and learn that while true sycophancy requires talent, it may be intrinsic to our social behavior. Show notes: Ingratiation - A Social Psychological Analysis - Edward E. Jones (1964); The Slime Effect: Suspicion and Dislike of Likeable Behavior Toward Superiors - Roos Vonk (1998); Ingratiation and Gratuity: The Effect of Complimenting Customers on Tipping Behavior in Restaurants - John Seiter (2007). The Here and Now Podcast on Facebook. The Here and Now Podcast on Twitter. Send me an email. Support the show (https://www.patreon.com/thehereandnowpodcast)

Can I Bother You?
41. Can I Speak With You After Class? - Mentors, educators, and teacher appreciation

Can I Bother You?

Play Episode Listen Later May 6, 2021 47:08


Teachers. They work tirelessly and get paid terribly, all to change young lives (for better or worse). This week, AJ and Timothy relive their school days through stories of mentors past. Sycophancy! Truancy! Pencil theft! Light arson! This episode has something for everyone. Email us to say hello or ask for advice at botherus@canibotheryou.com! If you'd like to support us monetarily, you can do that at our Ko-fi! “I believe the children are our future…” v Ms. Trunchbull and Ms. Honey Matilda Portables “Last hired, first fired” “That escalated quickly” Teacher gender disparity Adam Scott Edgy backward sitting Cecily Parks “My teacher hates me” “Netflix and chill” Silent movies Can I Bother You? Instagram Twitter Facebook Bother AJ! YouTube Instagram Facebook Twitter Bother Timothy! timothydaileyvaldes.com Instagram Facebook Twitter --- This episode is sponsored by · Anchor: The easiest way to make a podcast. https://anchor.fm/app --- Send in a voice message: https://anchor.fm/canibotheryou/message

The Light's House Podcast
#6 - Call to Sonship - All things are already yours

The Light's House Podcast

Play Episode Listen Later Feb 8, 2021 18:00


Insecurities. Idolisation. Wannabes. Greed. Sycophancy. Suck-ups etc.... are some of the things that plague the human mind. But are believers immune from these? The Bible says all things are yours. And if all things are yours, why would you need to idolise anyone or anything? You are not the victim of this world. You own it. It is not your master. It is your servant. Everything in the world and everything that happens on it is working together for your greatest and longest good. All things are yours because you are Christ's — Christ's body, Christ's bride, Christ's subject, Christ's sibling, Christ's fellow-heir. And why does belonging to Christ make all things yours? Because Christ is God's. “You are Christ's and Christ is God's.” --- Send in a voice message: https://anchor.fm/the-lights-house/message

Politics Done Right
Jonathan Simon calls CODE RED on the 2020 election, COVID Trump sycophancy continues.

Politics Done Right

Play Episode Listen Later Jul 21, 2020 57:14


Jonathan Simon talks about the dangers in our computerized election system devoid of paper audits. When will the GOP throw in the Trump towel?

The Tragedy of Pudd'nhead Wilson by Mark Twain
09 – Tom Practices Sycophancy

The Tragedy of Pudd'nhead Wilson by Mark Twain

Play Episode Listen Later May 28, 2020 10:13


More great books at LoyalBooks.com

TKNWSNFCF
WOFL+ UNDER CONSTRUCTION

TKNWSNFCF

Play Episode Listen Later Oct 29, 2019 43:35


It's been weird to have been offline for about 48 hours, or without power. So I put up a lot of content and now I will review it. New podcast, new podcast RSS, tknws epub, new lyrics, full folder for TIC TOK FIRE POWER OUTAGE https://mega.nz/#!EDo1kCJL!DxHrqstNwz4-OWcR2SEM_PktHfKWNAJEtPNjeHQCq4M TO ANCHOR.FM: YOUR WEBSITE SUCKS, I CAN HELP YOU FIX IT. YOUR UPPER MGMT IS VINDICTIVE, I CAN REPLACE THEM. I AM HDYSI. PAY ME HERE http://nfcf.x10host.com/a/10.htm Your request was successfully submitted. https://www.google.com/search?q=SYCOPHANCY&rlz=1C1GCEV_enUS873&oq=SYCOPHANCY&aqs=chrome..69i57&sourceid=chrome&ie=UTF-8 https://www.google.com/search?q=DEFFERENT&rlz=1C1GCEV_enUS873&oq=DEFFERENT&aqs=chrome..69i57&sourceid=chrome&ie=UTF-8 https://anchor.fm/dashboard/episode/e8chvm

AOS – 947wpvc.org
Sucking Up + Understanding the Case of Trudy Muñoz— 8.4.18

AOS – 947wpvc.org

Play Episode Listen Later Aug 5, 2018 53:26


Deirdre Enright, Deborah & Mark Parker. We spoke with Mark Parker, co-author of Sucking Up: A Brief Consideration of Sycophancy, and welcomed back Deirdre Enright, Director of the Innocence Project at the University of Virginia School of Law.

With Good Reason
The Golden Age of Flattery

With Good Reason

Play Episode Listen Later Mar 2, 2018 28:59


Washington has its fair share of brown-nosers. We talk with the authors of Sucking Up: A Brief Consideration of Sycophancy about yes-men, now and through the ages.

Punk Rock Podcasting / The Think In Your Armor
EP 20. PPS, Pavlovian Political Sycophancy

Punk Rock Podcasting / The Think In Your Armor

Play Episode Listen Later Nov 30, 2017 13:18


PPS - Pavlovian Political Sycophancy: it's often on display at award shows within the entertainment industry. PPS is defined by the shallow, predictable and manipulable response of a given person or crowd within the parameters of political discourse.

Smy Goodness Podcast : Food, Art, History & Design
Ep9 - Anyone Sycophancy a Fig?

Smy Goodness Podcast : Food, Art, History & Design

Play Episode Listen Later Oct 14, 2017 15:56


Figs are among the earliest, if not the earliest, of our cultivated plants. Their reverence surely stems from their historic connection to our own agricultural journey: they are a symbol of abundance and were important to ancient peoples, cultures, art, cookery and religions. They are symbols of fertility, wealth, youth and the brevity of life. This episode will look at the Greek etymology behind 'sycophancy' and the Roman Apicius' recipe for fegato. Artists discussed include Giovanna Garzoni, Albrecht Durer, Clara Peeters, Suzanne Valadon, Vivienne Westwood and Figs in Wigs.

Give and Take
Episode 56: Sucking Up: A Brief Consideration of Sycophancy, with Deborah & Mark Parker

Give and Take

Play Episode Listen Later Oct 5, 2017 46:35


My guests are Deborah and Mark Parker. Deborah Parker is Professor of Italian at the University of Virginia. Mark Parker is Professor of English at James Madison University. They are coauthors of Inferno Revealed: From Dante to Dan Brown, and most recently, Sucking Up: A Brief Consideration of Sycophancy. Special Guest: Deborah & Mark Parker.

New Books in Popular Culture
Deborah Parker and Mark L. Parker, “Sucking Up: A Brief Consideration of Sycophancy” (U. of Virginia Press, 2017)

New Books in Popular Culture

Play Episode Listen Later Oct 3, 2017 41:59


Ever since Donald Trump was elected President, he’s created a non-stop torrent of news, so much so that members of the media regularly claim that he’s effectively trashed the traditional news cycle. Whether that’s true or not, it is hard to keep up with what’s going on in the White House, and each new uproar makes it difficult to remember what’s already happened. Take Trump’s first cabinet meeting, way back on June 12, 2017. Remember that? It began with Trump proclaiming, “Never has there been a president….with few exceptions…who’s passed more legislation, who’s done more things than I have.” This, despite the fact that he had yet to pass any major legislation through Congress. Then it got odder. Trump listened as members of his Cabinet took turns praising him. Mike Pence started it off, saying, “The greatest privilege of my life is to serve as vice president to the president who’s keeping his word to the American people.” Alexander Acosta, the Secretary of Labor, said, “I am privileged to be here–deeply honored–and I want to thank you for your commitment to the American workers.” And Reince (Rein-ze) Priebus, still then the President’s Chief of Staff, said, “We thank you for the opportunity and the blessing to serve your agenda.” As all of the praise rained down on him, Trump just looked on, smiled, and nodded approvingly. What’s going on? Not only here but in the endless praise disguised as press releases that’s coming from the White House and Trump’s own Twitter account? Is this just good old fashioned ass-kissing or is there something more sinister happening? In their new book, Sucking Up: A Brief Consideration of Sycophancy (University of Virginia Press, 2017), Mark and Deborah Parker explore this phenomenon of excessive flattery–why people do it and how it alters the social world that we all must share. 
The Parkers look at examples from literature, politics, and other disciplines to give us a portrait of this false-faced, slickly tongued, morally odious character, the sycophant. Learn more about your ad choices. Visit megaphone.fm/adchoices

New Books in Psychology
Deborah Parker and Mark L. Parker, “Sucking Up: A Brief Consideration of Sycophancy” (U. of Virginia Press, 2017)

New Books in Psychology

Play Episode Listen Later Oct 3, 2017 41:59


Ever since Donald Trump was elected President, he's created a non-stop torrent of news, so much so that members of the media regularly claim that he's effectively trashed the traditional news cycle. Whether that's true or not, it is hard to keep up with what's going on in the White House, and each new uproar makes it difficult to remember what's already happened. Take Trump's first cabinet meeting, way back on June 12, 2017. Remember that? It began with Trump proclaiming, “Never has there been a president….with few exceptions…who's passed more legislation, who's done more things than I have.” This, despite the fact that he had yet to pass any major legislation through Congress. Then it got odder. Trump listened as members of his Cabinet took turns praising him. Mike Pence started it off, saying, “The greatest privilege of my life is to serve as vice president to the president who's keeping his word to the American people.” Alexander Acosta, the Secretary of Labor, said, “I am privileged to be here–deeply honored–and I want to thank you for your commitment to the American workers.” And Reince (Rein-ze) Priebus, still then the President's Chief of Staff, said, “We thank you for the opportunity and the blessing to serve your agenda.” As all of the praise rained down on him, Trump just looked on, smiled, and nodded approvingly. What's going on? Not only here but in the endless praise disguised as press releases that's coming from the White House and Trump's own Twitter account? Is this just good old fashioned ass-kissing or is there something more sinister happening? In their new book, Sucking Up: A Brief Consideration of Sycophancy (University of Virginia Press, 2017), Mark and Deborah Parker explore this phenomenon of excessive flattery–why people do it and how it alters the social world that we all must share. 
The Parkers look at examples from literature, politics, and other disciplines to give us a portrait of this false-faced, slickly tongued, morally odious character, the sycophant. Learn more about your ad choices. Visit megaphone.fm/adchoices Support our show by becoming a premium member! https://newbooksnetwork.supportingcast.fm/psychology

New Books Network
Deborah Parker and Mark L. Parker, “Sucking Up: A Brief Consideration of Sycophancy” (U. of Virginia Press, 2017)

New Books Network

Play Episode Listen Later Oct 3, 2017 42:24


Ever since Donald Trump was elected President, he’s created a non-stop torrent of news, so much so that members of the media regularly claim that he’s effectively trashed the traditional news cycle. Whether that’s true or not, it is hard to keep up with what’s going on in the White House, and each new uproar makes it difficult to remember what’s already happened. Take Trump’s first cabinet meeting, way back on June 12, 2017. Remember that? It began with Trump proclaiming, “Never has there been a president….with few exceptions…who’s passed more legislation, who’s done more things than I have.” This, despite the fact that he had yet to pass any major legislation through Congress. Then it got odder. Trump listened as members of his Cabinet took turns praising him. Mike Pence started it off, saying, “The greatest privilege of my life is to serve as vice president to the president who’s keeping his word to the American people.” Alexander Acosta, the Secretary of Labor, said, “I am privileged to be here–deeply honored–and I want to thank you for your commitment to the American workers.” And Reince (Rein-ze) Priebus, still then the President’s Chief of Staff, said, “We thank you for the opportunity and the blessing to serve your agenda.” As all of the praise rained down on him, Trump just looked on, smiled, and nodded approvingly. What’s going on? Not only here but in the endless praise disguised as press releases that’s coming from the White House and Trump’s own Twitter account? Is this just good old fashioned ass-kissing or is there something more sinister happening? In their new book, Sucking Up: A Brief Consideration of Sycophancy (University of Virginia Press, 2017), Mark and Deborah Parker explore this phenomenon of excessive flattery–why people do it and how it alters the social world that we all must share. 
The Parkers look at examples from literature, politics, and other disciplines to give us a portrait of this false-faced, slickly tongued, morally odious character, the sycophant. Learn more about your ad choices. Visit megaphone.fm/adchoices

New Books in Literary Studies
Deborah Parker and Mark L. Parker, “Sucking Up: A Brief Consideration of Sycophancy” (U. of Virginia Press, 2017)

New Books in Literary Studies

Play Episode Listen Later Oct 3, 2017 41:59


Ever since Donald Trump was elected President, he’s created a non-stop torrent of news, so much so that members of the media regularly claim that he’s effectively trashed the traditional news cycle. Whether that’s true or not, it is hard to keep up with what’s going on in the White House, and each new uproar makes it difficult to remember what’s already happened. Take Trump’s first cabinet meeting, way back on June 12, 2017. Remember that? It began with Trump proclaiming, “Never has there been a president….with few exceptions…who’s passed more legislation, who’s done more things than I have.” This, despite the fact that he had yet to pass any major legislation through Congress. Then it got odder. Trump listened as members of his Cabinet took turns praising him. Mike Pence started it off, saying, “The greatest privilege of my life is to serve as vice president to the president who’s keeping his word to the American people.” Alexander Acosta, the Secretary of Labor, said, “I am privileged to be here–deeply honored–and I want to thank you for your commitment to the American workers.” And Reince (Rein-ze) Priebus, still then the President’s Chief of Staff, said, “We thank you for the opportunity and the blessing to serve your agenda.” As all of the praise rained down on him, Trump just looked on, smiled, and nodded approvingly. What’s going on? Not only here but in the endless praise disguised as press releases that’s coming from the White House and Trump’s own Twitter account? Is this just good old fashioned ass-kissing or is there something more sinister happening? In their new book, Sucking Up: A Brief Consideration of Sycophancy (University of Virginia Press, 2017), Mark and Deborah Parker explore this phenomenon of excessive flattery–why people do it and how it alters the social world that we all must share. 
The Parkers look at examples from literature, politics, and other disciplines to give us a portrait of this false-faced, slickly tongued, morally odious character, the sycophant. Learn more about your ad choices. Visit megaphone.fm/adchoices

New Books in Literature
Deborah Parker and Mark L. Parker, “Sucking Up: A Brief Consideration of Sycophancy” (U. of Virginia Press, 2017)

New Books in Literature

Play Episode Listen Later Oct 3, 2017 41:59


Ever since Donald Trump was elected President, he’s created a non-stop torrent of news, so much so that members of the media regularly claim that he’s effectively trashed the traditional news cycle. Whether that’s true or not, it is hard to keep up with what’s going on in the White House, and each new uproar makes it difficult to remember what’s already happened. Take Trump’s first cabinet meeting, way back on June 12, 2017. Remember that? It began with Trump proclaiming, “Never has there been a president….with few exceptions…who’s passed more legislation, who’s done more things than I have.” This, despite the fact that he had yet to pass any major legislation through Congress. Then it got odder. Trump listened as members of his Cabinet took turns praising him. Mike Pence started it off, saying, “The greatest privilege of my life is to serve as vice president to the president who’s keeping his word to the American people.” Alexander Acosta, the Secretary of Labor, said, “I am privileged to be here–deeply honored–and I want to thank you for your commitment to the American workers.” And Reince (Rein-ze) Priebus, still then the President’s Chief of Staff, said, “We thank you for the opportunity and the blessing to serve your agenda.” As all of the praise rained down on him, Trump just looked on, smiled, and nodded approvingly. What’s going on? Not only here but in the endless praise disguised as press releases that’s coming from the White House and Trump’s own Twitter account? Is this just good old fashioned ass-kissing or is there something more sinister happening? In their new book, Sucking Up: A Brief Consideration of Sycophancy (University of Virginia Press, 2017), Mark and Deborah Parker explore this phenomenon of excessive flattery–why people do it and how it alters the social world that we all must share. 
The Parkers look at examples from literature, politics, and other disciplines to give us a portrait of this false-faced, slickly tongued, morally odious character, the sycophant. Learn more about your ad choices. Visit megaphone.fm/adchoices

New Books in Communications
Deborah Parker and Mark L. Parker, “Sucking Up: A Brief Consideration of Sycophancy” (U. of Virginia Press, 2017)

New Books in Communications

Play Episode Listen Later Oct 3, 2017 41:59


Ever since Donald Trump was elected President, he’s created a non-stop torrent of news, so much so that members of the media regularly claim that he’s effectively trashed the traditional news cycle. Whether that’s true or not, it is hard to keep up with what’s going on in the White House, and each new uproar makes it difficult to remember what’s already happened. Take Trump’s first cabinet meeting, way back on June 12, 2017. Remember that? It began with Trump proclaiming, “Never has there been a president….with few exceptions…who’s passed more legislation, who’s done more things than I have.” This, despite the fact that he had yet to pass any major legislation through Congress. Then it got odder. Trump listened as members of his Cabinet took turns praising him. Mike Pence started it off, saying, “The greatest privilege of my life is to serve as vice president to the president who’s keeping his word to the American people.” Alexander Acosta, the Secretary of Labor, said, “I am privileged to be here–deeply honored–and I want to thank you for your commitment to the American workers.” And Reince (Rein-ze) Priebus, still then the President’s Chief of Staff, said, “We thank you for the opportunity and the blessing to serve your agenda.” As all of the praise rained down on him, Trump just looked on, smiled, and nodded approvingly. What’s going on? Not only here but in the endless praise disguised as press releases that’s coming from the White House and Trump’s own Twitter account? Is this just good old fashioned ass-kissing or is there something more sinister happening? In their new book, Sucking Up: A Brief Consideration of Sycophancy (University of Virginia Press, 2017), Mark and Deborah Parker explore this phenomenon of excessive flattery–why people do it and how it alters the social world that we all must share. 
The Parkers look at examples from literature, politics, and other disciplines to give us a portrait of this false-faced, slickly tongued, morally odious character, the sycophant. Learn more about your ad choices. Visit megaphone.fm/adchoices

Best of the Left - Leftist Perspectives on Progressive Politics, News, Culture, Economics and Democracy

Edition #716 Television media is bad at their jobs Ch. 1: Intro - Theme: A Fond Farewell, Elliott Smith Ch. 2: Act 1: Sycophancy and denial from Chris Matthews - Jimmy Dore Show - Air Date 4-5-13 Ch. 3: Song 1: Paint a vulgar picture - The Smiths Ch. 4: Act 2: Fox News Supports Fired Rutgers Basketball Coach - Young Turks - Air Date: 04-05-13 Ch. 5: Song 2: Coach - Jim Bizer Ch. 6: Act 3: Pretending there were no attacks after 9/11 - CounterSpin - Air Date: 4-19-13 Ch. 7: Song 3: Strong animals - Dan Romer & Benh Zeitlin Ch. 8: Act 4: Fox's John Bolton Is Now Praying for a Benghazi Cover Up - Media Matters - Air Date: 05-07-13 Ch. 9: Song 4: Strong animals - Dan Romer & Benh Zeitlin Ch. 10: Act 5: CNN Completely Botched Boston Attack Report - Majority Report - Air Date: 04-19-13 Ch. 11: Song 5: Meaningless - The Nighty Nite Ch. 12: Act 6: CNN Split Screen Interview in SAME Parking Lot - David Pakman Show - Air Date: 05-10-13 Ch. 13: Song 6: I'm a pilot - Fanfarlo Ch. 14: Act 7: CNN is terrible at their jobs - Jimmy Dore Show - Air Date: 4-26-13 Ch. 15: Song 7: Don't do me like that - Tom Petty & The Heartbreakers Ch. 16: Act 8: Militant center insists both sides are extremists - CounterSpin - Air Date: 5-10-13 Ch. 17: Song 8: Unforgiven - Apocalyptica Ch. 18: Act 9: Media Push Claim That "The Left" Ignored Gosnell Trial - Media Matters - Air Date: 04-15-13 Ch. 19: Song 9: Unforgiven - Apocalyptica Ch. 20: Act 10: Fox News Basks in its Own Ignorance - Young Turks - Air Date: 04-24-13 Ch. 21: Song 10: The idiots are taking over - NOFX Ch. 22: Act 11: Koch's seek to buy media outlets - CounterSpin - Air Date: 4-26-13 Ch. 23: Song 11: Behind the curtain - Dan Potthast Ch. 24: Act 12: Koch Brothers' Clever Strategy - The Progressive - Air Date: 4-26-13 Ch. 25: Song 12: Battle hymn of the republic - Fiddle Fiddle Fiddle Ch. 26: Act 13: Is Rush Limbaugh Finished? - Majority Report - Air Date: 05-09-13 Ch. 27: Song 13: Jump, jive an' wail - Brian Setzer Orchestra Ch. 28: Act 14: Sick madness in response to the Boston Bombing - CounterSpin - Air Date: 5-3-13 Ch. 29: Song 14: Mad world - Gary Jules Ch. 30: Act 15: Press Gets Cozy at White House Correspondents Dinner - Young Turks - Air Date: 04-30-13 Voicemails: Ch. 31: Defining WMD - Chris from Colorado Springs Ch. 32: We should interpret based on intent rather than actions - Eyal from San Diego, CA Leave a message at 202-999-3991 Voicemail Music: Loud Pipes - Ratatat Ch. 33: Final comments on working our way toward a global, multicultural society Produced by: Jay! Tomlinson Thanks for listening! Visit us at BestOfTheLeft.com Check out the BotL iOS/Android App in the App Stores! Follow at Twitter.com/BestOfTheLeft Like at Facebook.com/BestOfTheLeft Contact me directly at Jay@BestOfTheLeft.com Review the show on iTunes!