Podcasts about human feedback rlhf

17PODCASTS
38EPISODES
30mAVG DURATION
?INFREQUENT EPISODES
Nov 26, 2025LATEST

POPULARITY

20172018201920202021202220232024

Best podcasts about human feedback rlhf

The Nonlinear Library

16 episodes with human feedback rlhf

Eye On A.I.

2 episodes with human feedback rlhf

Papers Read on AI

2 episodes with human feedback rlhf

The Nonlinear Library: LessWrong

5 episodes with human feedback rlhf

Latest podcast episodes about human feedback rlhf

The AI Pocket Book with Emmanuel Maggiori

Flying High with Flutter

Play Episode Listen Later Nov 26, 2025 70:06

AI is everywhere, from coding assistants to chatbots, but what's really happening under the hood? It often feels like a "black box," but it doesn't have to be.In this episode, Allen sits down with Manning author and AI expert Emmanuel Maggiori to demystify the core concepts behind Large Language Models (LLMs). Emmanuel, author of "The AI Pocket Book," breaks down the entire pipeline - from the moment you type a prompt to the second you get a response. He explains complex topics like tokens, embeddings, context windows, and the controversial training methods that make these powerful tools possible.IN THIS EPISODE00:00 - Welcome & Why "The AI Pocket Book" is a Must-Read15:20 - The Basic LLM Pipeline Explained8:05 - What Are Tokens?21:30 - Understanding the Context Window25:50 - How Embeddings Represent Meaning35:45 - Controlling Creativity with Temperature39:30 - How LLMs Learn From Internet Data45:25 - Fine-Tuning with Human Feedback (RLHF)51:15 - Why AI Hallucinates56:45 - When Not to Use

ai dive pocket manning flying high large language models fine tuning maggiori human feedback rlhf

AI KISSES UP, SHUNS TRUTH

AI DAILY: Breaking News in AI

Play Episode Listen Later May 9, 2025 3:52

Plus AI Brings Back Murder Victim Like this? Get AIDAILY, delivered to your inbox, 3x a week. Subscribe to our newsletter at https://aidaily.usAI Chatbots: Flattering Users at the Expense of TruthA recent update to ChatGPT made it overly flattering, endorsing even ill-conceived ideas. This behavior stems from Reinforcement Learning from Human Feedback (RLHF), where AI models learn to please users, sometimes sacrificing accuracy. The article argues that such sycophantic tendencies mirror social media's echo chambers, suggesting AI should serve as a tool for exploring diverse knowledge rather than merely affirming user biases.AI Brings Murder Victim's Voice to Courtroom in Unprecedented Legal MomentIn a groundbreaking Arizona case, AI technology enabled the late Christopher Pelkey to deliver a victim impact statement at his killer's sentencing. Pelkey's sister used AI to recreate his voice and likeness, allowing him to express forgiveness and reflect on life. The judge acknowledged the statement's impact, sentencing the defendant to 10.5 years. This marks a significant moment in the integration of AI into the legal system.MIT's AI Model Predicts 3D Genome Structures in MinutesMIT chemists have developed a generative AI model that rapidly predicts the 3D structure of the human genome from DNA sequences. This innovation allows for the generation of thousands of chromatin conformations in minutes, significantly accelerating genomic research. The model's predictions closely match experimental data, offering a powerful tool for understanding gene regulation and cellular function.AI Isn't Replacing Your Job—It's Replacing Your BossAI is reshaping the workplace by automating middle management tasks like scheduling, reporting, and decision-making. Tools such as virtual assistants and chatbots now handle up to 69% of managerial duties, streamlining operations and reducing bureaucracy. This shift empowers frontline employees while diminishing traditional supervisory roles.I Tried an AI Aging App—And It Wasn't as Bad as I ThoughtA CNET writer tested an AI-powered aging app to see a glimpse of their future self. The results were surprisingly realistic and less unsettling than anticipated. While the app offered a fun and insightful look into potential aging, it also sparked reflections on the emotional implications of visualizing one's future appearance.Sam Altman Warns Congress: Overregulating AI Could Undermine U.S. LeadershipIn a recent Senate hearing, OpenAI CEO Sam Altman cautioned that excessive AI regulation might hinder the United States' competitive edge, particularly against China. This marks a shift from his earlier stance advocating for stringent oversight. Altman emphasized the need for balanced policies that foster innovation while addressing potential risks associated with AI technologies.

united states ai china voice arizona dna tools mit 3d chatgpt senate courtroom kisses sam altman expense altman reinforcement learning i tried human feedback rlhf

MLG 034 Large Language Models 1

Machine Learning Guide

Play Episode Listen Later May 7, 2025 50:48

Explains language models (LLMs) advancements. Scaling laws - the relationships among model size, data size, and compute - and how emergent abilities such as in-context learning, multi-step reasoning, and instruction following arise once certain scaling thresholds are crossed. The evolution of the transformer architecture with Mixture of Experts (MoE), describes the three-phase training process culminating in Reinforcement Learning from Human Feedback (RLHF) for model alignment, and explores advanced reasoning techniques such as chain-of-thought prompting which significantly improve complex task performance. Links Notes and resources at ocdevel.com/mlg/mlg34 Build the future of multi-agent software with AGNTCY Try a walking desk stay healthy & sharp while you learn & code Transformer Foundations and Scaling Laws Transformers: Introduced by the 2017 "Attention is All You Need" paper, transformers allow for parallel training and inference of sequences using self-attention, in contrast to the sequential nature of RNNs. Scaling Laws: Empirical research revealed that LLM performance improves predictably as model size (parameters), data size (training tokens), and compute are increased together, with diminishing returns if only one variable is scaled disproportionately. The "Chinchilla scaling law" (DeepMind, 2022) established the optimal model/data/compute ratio for efficient model performance: earlier large models like GPT-3 were undertrained relative to their size, whereas right-sized models with more training data (e.g., Chinchilla, LLaMA series) proved more compute and inference efficient. Emergent Abilities in LLMs Emergence: When trained beyond a certain scale, LLMs display abilities not present in smaller models, including: In-Context Learning (ICL): Performing new tasks based solely on prompt examples at inference time. Instruction Following: Executing natural language tasks not seen during training. Multi-Step Reasoning & Chain of Thought (CoT): Solving arithmetic, logic, or symbolic reasoning by generating intermediate reasoning steps. Discontinuity & Debate: These abilities appear abruptly in larger models, though recent research suggests that this could result from non-linearities in evaluation metrics rather than innate model properties. Architectural Evolutions: Mixture of Experts (MoE) MoE Layers: Modern LLMs often replace standard feed-forward layers with MoE structures. Composed of many independent "expert" networks specializing in different subdomains or latent structures. A gating network routes tokens to the most relevant experts per input, activating only a subset of parameters—this is called "sparse activation." Enables much larger overall models without proportional increases in compute per inference, but requires the entire model in memory and introduces new challenges like load balancing and communication overhead. Specialization & Efficiency: Experts learn different data/knowledge types, boosting model specialization and throughput, though care is needed to avoid overfitting and underutilization of specialists. The Three-Phase Training Process 1. Unsupervised Pre-Training: Next-token prediction on massive datasets—builds a foundation model capturing general language patterns. 2. Supervised Fine Tuning (SFT): Training on labeled prompt-response pairs to teach the model how to perform specific tasks (e.g., question answering, summarization, code generation). Overfitting and "catastrophic forgetting" are risks if not carefully managed. 3. Reinforcement Learning from Human Feedback (RLHF): Collects human preference data by generating multiple responses to prompts and then having annotators rank them. Builds a reward model (often PPO) based on these rankings, then updates the LLM to maximize alignment with human preferences (helpfulness, harmlessness, truthfulness). Introduces complexity and risk of reward hacking (specification gaming), where the model may exploit the reward system in unanticipated ways. Advanced Reasoning Techniques Prompt Engineering: The art/science of crafting prompts that elicit better model responses, shown to dramatically affect model output quality. Chain of Thought (CoT) Prompting: Guides models to elaborate step-by-step reasoning before arriving at final answers—demonstrably improves results on complex tasks. Variants include zero-shot CoT ("let's think step by step"), few-shot CoT with worked examples, self-consistency (voting among multiple reasoning chains), and Tree of Thought (explores multiple reasoning branches in parallel). Automated Reasoning Optimization: Frontier models selectively apply these advanced reasoning techniques, balancing compute costs with gains in accuracy and transparency. Optimization for Training and Inference Tradeoffs: The optimal balance between model size, data, and compute is determined not only for pretraining but also for inference efficiency, as lifetime inference costs may exceed initial training costs. Current Trends: Efficient scaling, model specialization (MoE), careful fine-tuning, RLHF alignment, and automated reasoning techniques define state-of-the-art LLM development.

Reward Models | Data Brew | Episode 40

Data Brew by Databricks

Play Episode Listen Later Mar 20, 2025 39:58

In this episode, Brandon Cui, Research Scientist at MosaicML and Databricks, dives into cutting-edge advancements in AI model optimization, focusing on Reward Models and Reinforcement Learning from Human Feedback (RLHF).Highlights include:- How synthetic data and RLHF enable fine-tuning models to generate preferred outcomes.- Techniques like Policy Proximal Optimization (PPO) and Direct PreferenceOptimization (DPO) for enhancing response quality.- The role of reward models in improving coding, math, reasoning, and other NLP tasks.Connect with Brandon Cui:https://www.linkedin.com/in/bcui19/

ai data reward models nlp brew research scientist databricks reinforcement learning rlhf mosaicml human feedback rlhf

791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

SuperDataScience

Play Episode Listen Later Jun 11, 2024 57:10

Reinforcement learning through human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique's origins of the technique. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0), and Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why it is important that AI is open [03:13] • The efficacy and scalability of direct preference optimization [07:32] • Robotics and LLMs [14:32] • The challenges to aligning reward models with human preferences [23:00] • How to make sure AI's decision making on preferences reflect desirable behavior [28:52] • Why Nathan believes AI is closer to alchemy than science [37:38] Additional materials: www.superdatascience.com/791

ai robotics lambert reinforcement reinforcement learning rlhf jon krohn human feedback rlhf

#192 Lukas Biewald: How Weights and Biases Supercharges Machine Learning

Eye On A.I.

Play Episode Listen Later Jun 9, 2024 42:21

This episode is sponsored by Oracle. AI is revolutionizing industries, but needs power without breaking the bank. Enter Oracle Cloud Infrastructure (OCI): the one-stop platform for all your AI needs, with 4-8x the bandwidth of other clouds. Train AI models faster and at half the cost. Be ahead like Uber and Cohere. If you want to do more and spend less like Uber, 8x8, and Databricks Mosaic - take a free test drive of OCI at https://oracle.com/eyeonai In this episode of the Eye on AI podcast, join us as we sit down with Lukas Biewald, CEO & co-founder of Weights & Biases, the AI developer platform with tools for training models, fine-tuning models, and leveraging foundation models. Lukas takes us through his journey, from his early days at Stanford and his work in natural language processing, to the founding of CrowdFlower and its evolution into a major player in data annotation. He shares the insights that led him to start Weights and Biases, aiming to provide comprehensive tools for the entire machine learning workflow. Lukas discusses the importance of high-quality data annotation, the shift in AI applications, and the role of reinforcement learning with human feedback (RLHF) in refining large models. Discover how Weights and Biases helps ML practitioners with data lineage and compliance, ensuring that models are trained on the right data and adhere to regulatory standards. Lukas also highlights the significance of tracking and visualizing experiments, retaining intellectual property, and evolving the company's products to meet industry needs. Tune in to gain valuable insights into the world of ML Ops, data annotation, and the critical tools that support machine learning practitioners in deploying reliable models. Don't forget to like, subscribe, and hit the notification bell for more on groundbreaking AI technologies. Stay Updated: Craig Smith Twitter: https://twitter.com/craigss Eye on A.I. Twitter: https://twitter.com/EyeOn_AI (00:00) Preview and Intro (01:39) Lukas's Background and Career (04:09) Founding CrowdFlower and Early Machine Learning (06:59) Current Trends in Machine Learning (08:46) Reinforcement Learning with Human Feedback (RLHF) (12:43) Weights and Biases: Origin and Mission (16:44) Visualizations and Compliance in AI (22:43) US vs. EU AI Regulations (25:20) Importance of Experiment Tracking in ML (28:47) Evolving Products to Meet Industry Needs (30:38) Prompt Engineering in Modern AI (33:34) Challenges in Monitoring AI Models (37:25) Monitoring Functions of Weights and Biases (39:33) Future of Weights and Biases

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Papers Read on AI

Play Episode Listen Later May 22, 2024 33:04

Generative foundation models are susceptible to implicit biases that can arise from extensive unsupervised training data. Such biases can produce suboptimal samples, skewed outcomes, and unfairness, with potentially serious consequences. Consequently, aligning these models with human ethics and preferences is an essential step toward ensuring their responsible and effective deployment in real-world applications. Prior research has primarily employed Reinforcement Learning from Human Feedback (RLHF) to address this problem, where generative models are fine-tuned with RL algorithms guided by a human-feedback-informed reward model. However, the inefficiencies and instabilities associated with RL algorithms frequently present substantial obstacles to the successful alignment, necessitating the development of a more robust and streamlined approach. To this end, we introduce a new framework, Reward rAnked FineTuning (RAFT), designed to align generative models effectively. Utilizing a reward model and a sufficient number of samples, our approach selects the high-quality samples, discarding those that exhibit undesired behavior, and subsequently enhancing the model by fine-tuning on these filtered samples. Our studies show that RAFT can effectively improve the model performance in both reward learning and other automated metrics in both large language models and diffusion models. 2023: Hanze Dong, Wei Xiong, Deepanshu Goyal, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, T. Zhang https://arxiv.org/pdf/2304.06767

foundation model alignment reward utilizing ranked generative raft fine tuning rl reinforcement learning human feedback rlhf

Google I/O

GPT Reviews

Play Episode Listen Later May 15, 2024 14:11

Google I/O 2024 announcements, including new AI tools like Firebase Genkit, LearnLM, and Veo, as well as Gemini, an AI replacement for Google Assistant. The introduction of the MS MARCO Web Search dataset, which provides a retrieval benchmark with three web retrieval challenge tasks and millions of real-clicked query-document pairs for training and evaluating retrieval models. The "What matters when building vision-language models?" paper, which identifies critical decisions in the design of vision-language models and presents Idefics2, an efficient foundational VLM of 8 billion parameters that achieves state-of-the-art performance within its size category. The "RLHF Workflow: From Reward Modeling to Online RLHF" paper, which presents a workflow for Online Iterative Reinforcement Learning from Human Feedback (RLHF) in an online setting and achieves impressive performance on LLM chatbot benchmarks and academic benchmarks. Contact: sergi@earkind.com Timestamps: 00:34 Introduction 01:34 Google I/O 2024: Here's everything Google just announced 03:26 Ilya Sutskever leaves OpenAI 04:57 GPT-4o's Memory Breakthrough! 06:00 Fake sponsor 07:49 MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels 09:33 What matters when building vision-language models? 10:54 RLHF Workflow: From Reward Modeling to Online RLHF 13:00 Outro

ai google vision fake language millions large models gemini gpt llm google i o google assistant veo dataset ilya sutskever vl m web search human feedback rlhf

Reinforcement Learning in the Era of LLMs

Deep Papers

Play Episode Listen Later Mar 15, 2024 44:49

We're exploring Reinforcement Learning in the Era of LLMs this week with Claire Longo, Arize's Head of Customer Success. Recent advancements in Large Language Models (LLMs) have garnered wide attention and led to successful products such as ChatGPT and GPT-4. Their proficiency in adhering to instructions and delivering harmless, helpful, and honest (3H) responses can largely be attributed to the technique of Reinforcement Learning from Human Feedback (RLHF). This week's paper, aims to link the research in conventional RL to RL techniques used in LLM research and demystify this technique by discussing why, when, and how RL excels.To learn more about ML observability, join the Arize AI Slack community or get the latest on our LinkedIn and Twitter.

head chatgpt era gpt ml llm customer success large language models rl reinforcement learning 3h human feedback rlhf

Podcasts about human feedback rlhf

Best podcasts about human feedback rlhf

The Nonlinear Library

Eye On A.I.

Papers Read on AI

The Nonlinear Library: LessWrong

Latest news about human feedback rlhf

Latest podcast episodes about human feedback rlhf

The AI Pocket Book with Emmanuel Maggiori

AI KISSES UP, SHUNS TRUTH

MLG 034 Large Language Models 1

Reward Models | Data Brew | Episode 40

791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

#192 Lukas Biewald: How Weights and Biases Supercharges Machine Learning

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Google I/O

Reinforcement Learning in the Era of LLMs

AF - Interpreting the Learning of Deceit by Roger Dearnaley

AF - 2023 Alignment Research Updates from FAR AI by AdamGleave

Qwen Technical Report

LW - Technical AI Safety Research Landscape [Slides] by Magdalena Wache

LW - Technical AI Safety Research Landscape [Slides] by Magdalena Wache

EA - AI Pause Will Likely Backfire by nora

#132 Scott Downes: Navigating the Language of AI & Large Language Models

AF - Open Problems and Fundamental Limitations of RLHF by Stephen Casper

LW - Ten Levels of AI Alignment Difficulty by Sammy Martin

LW - Ten Levels of AI Alignment Difficulty by Sammy Martin

AF - Ten Levels of AI Alignment Difficulty by Samuel Dylan Martin

Riley Goodside: The Art and Craft of Prompt Engineering

42 AI Chatbots &amp; Open-Assistant

AF - Imitation Learning from Language Feedback by Jérémy Scheurer

Ilya Sutskever (OpenAI Chief Scientist) - Building AGI, Alignment, Future Models, Spies, Microsoft, Taiwan, & Enlightenment

LW - GPT-4: What we (I) know about it by Robert AIZI

LW - GPT-4: What we (I) know about it by Robert AIZI

Ep 83 - Pablo Samuel Castro (Google) - Reinforcement Learning, feedback de humanos y ChatGPT

LW - Reflections on Deception & Generality in Scalable Oversight (Another OpenAI Alignment Review) by Shoshannah Tekofsky

LW - Reflections on Deception and Generality in Scalable Oversight (Another OpenAI Alignment Review) by Shoshannah Tekofsky

AF - Inverse Scaling Prize: Second Round Winners by Ian McKenzie

AF - Inverse Scaling Prize: Second Round Winners by Ian McKenzie

Bonus Episode: Grantek and ChatGPT - The Industry 4.0 Podcast with Grantek

AF - Discovering Language Model Behaviors with Model-Written Evaluations by Evan Hubinger

AF - Worlds Where Iterative Design Fails by johnswentworth

AF - Worlds Where Iterative Design Fails by johnswentworth

LW - Conditioning, Prompts, and Fine-Tuning by Adam Jermyn

LW - Conditioning, Prompts, and Fine-Tuning by Adam Jermyn

AF - Announcing the Inverse Scaling Prize ($250k Prize Pool) by Ethan Perez

42 AI Chatbots & Open-Assistant