Podcast appearances and mentions of John Yang

  • 45 PODCASTS
  • 1,502 EPISODES
  • 7m AVG DURATION
  • 1 WEEKLY EPISODE
  • May 28, 2026 LATEST

POPULARITY (chart by year, 2019 to 2026)


Latest podcast episodes about John Yang

The Rational Reminder Podcast
Market Simulations & Financial Planning | #411 (John Yang)

May 28, 2026 · 77:24


In this episode, Ben Felix and Braden Warwick unpack the surprisingly complex world of expected return modeling and why it matters so much for retirement projections, portfolio construction, and financial advice. They explain how PWL Capital currently estimates expected returns across asset classes, why traditional Monte Carlo methods relying on Gaussian distributions may miss important market behaviors, and how new research could improve the realism of long-term financial planning simulations. The conversation also explores a fascinating collaboration between PWL and Columbia Engineering student John Yang, who worked with Professor Michael Robbins on a project to build more realistic synthetic return data for financial planning. John explains how his team used empirical distributions, t-copulas, and Extreme Value Theory to better capture market crashes, fat tails, and asset co-movements during periods of stress. Ben and Braden then analyze how these improved simulation methods affect financial planning outcomes, sustainable spending estimates, and projections for long-term wealth accumulation.

Key Points From This Episode:

(0:00:00) Introduction to expected return modeling and why it matters for financial planning.
(0:00:25) The importance of volatility, correlations, distribution shape, and time-series behavior in portfolio projections.
(0:01:26) How Scott Cederburg's research on block bootstrapping influenced PWL's thinking on simulations.
(0:02:03) Introduction to Columbia Engineering student John Yang and the industry research collaboration.
(0:03:30) How Conquest Planning allows PWL to upload custom return simulations.
(0:04:05) A new PWL client's detailed reasoning for moving from DIY investing to working with an advisor.
(0:06:22) Why financial planning and Monte Carlo simulations were central to the client's decision.
(0:07:22) Cross-border financial complexity and the value of professional advice.
(0:08:03) Estate planning, cognitive decline, and the role of trusted financial relationships.
(0:10:02) Research on cognitive decline and its impact on financial decision-making.
(0:12:00) Delegation, accountability, and reducing mental overhead through advisory relationships.
(0:13:47) Why the client chose PWL specifically and the appeal of evidence-based investing.
(0:15:25) Ben and Braden discuss the perceived disconnect between online discourse and demand for AUM advisors.
(0:16:12) Overview of PWL's methodology for estimating expected returns across asset classes.
(0:17:05) How PWL combines historical returns with market-implied expected returns.
(0:18:07) The use of factor premiums and expected return composition in taxable projections.
(0:18:48) Why PWL previously relied on Gaussian multivariate normal distributions for simulations.
(0:19:41) Arithmetic vs. geometric mean returns and why the distinction matters.
(0:21:01) A simple example illustrating volatility drag.
(0:23:29) Why diversification benefits must be incorporated into expected portfolio returns.
(0:25:15) How correcting portfolio math improved expected return estimates by 20–30 basis points.
(0:27:12) Transition to John Yang's interview and introduction to synthetic data generation.
(0:30:07) John explains the limitations of Gaussian return assumptions.
(0:31:04) Why realistic sequences of returns matter for retirement planning.
(0:32:16) Empirical evidence that returns are not truly random.
(0:33:25) The three modeling challenges: unique asset behavior, realistic co-movement, and tail risk.
(0:37:49) Separating marginal distributions from dependency structures in the modeling process.
(0:38:48) Using a t-copula to better model asset co-movement during market stress.
(0:39:39) Why historical data alone struggles to capture rare crisis events.
(0:40:06) Applying Extreme Value Theory and Generalized Pareto Distributions to model tail risk.
(0:42:15) How Monte Carlo simulations generate many realistic future return paths.
(0:43:00) Imposing forward-looking expected returns and volatility assumptions onto the simulations.
(0:44:56) How the new framework better preserves skewness and kurtosis.
(0:46:38) Evaluating the new model using marginal shape, tail behavior, and co-movement scores.
(0:48:10) Why the new model significantly improved tail realism without sacrificing correlations.
(0:49:05) Future extensions including dynamic correlations and volatility clustering.
(0:50:28) Potential future use of GANs and machine learning for synthetic financial data.
(0:52:02) Key takeaway: financial planning requires realistic return paths, not just summary statistics.
(0:53:41) Braden analyzes how the new simulation framework affects financial advice.
(0:55:04) Why monthly index data produced fatter tails than long-term annual DMS data.
(0:58:47) The new model improved Monte Carlo success rates by roughly 2–3%.
(1:00:25) Sustainable spending estimates changed only modestly under the new simulations.
(1:02:27) Why the improved methodology matters more for alternative asset classes.
(1:04:25) The surprising finding that median wealth outcomes increased while mean outcomes decreased.
(1:05:47) Why Gaussian simulations can create unrealistic runaway wealth scenarios.
(1:07:20) The practical implications for estate planning and multi-generational wealth projections.
(1:08:30) Why better simulation methods are especially important for concentrated and alternative investments.

Links From Today's Episode:

Meet with PWL Capital: https://calendly.com/d/3vm-t2j-h3p
Rational Reminder on iTunes: https://itunes.apple.com/ca/podcast/the-rational-reminder-podcast/id1426530582
Rational Reminder on Instagram: https://www.instagram.com/rationalreminder/
Rational Reminder on YouTube: https://www.youtube.com/channel/
Benjamin Felix: https://pwlcapital.com/our-team/
Benjamin on X: https://x.com/benjaminwfelix
Benjamin on LinkedIn: https://www.linkedin.com/in/benjaminwfelix/

Editing and post-production work for this episode was provided by The Podcast Consultant (https://thepodcastconsultant.com)

Power Station
I am an accidental Asian American activist

Apr 20, 2026 · 39:06


A conversation with John Yang, President and Executive Director of Asian Americans Advancing Justice, reverberates with facts and feelings. To start, we talk about the recent Supreme Court hearing on birthright citizenship, an outcome of President Trump's preoccupation with erasing this foundational constitutional right. As John explains on this episode of Power Station, this impulse is rooted in the desire to control who should and should not be considered an American. We are seeing this play out in real time in immigration sweeps and detention centers across the country. And while we do not see a lot of reporting about this, the birthright citizenship issue has a disproportionate impact on Asian Americans. In fact, about 1 in 25 people in the community would be impacted by an adverse ruling. But when John talks about birthright citizenship and his organization's broader civil and human rights mission, he is not advocating for Asian Americans alone. He collaborates with African American, Latino and LGBTQ organizations to advance together towards a more just America for all. This value of inclusion runs deep within John and is embedded in AAJC's strategies and programs. Hear him!

PBS NewsHour - Shields and Brooks
Brooks and Marcus on voters fed up with gridlock in Congress

Mar 27, 2026 · 11:07


David Brooks of The Atlantic and Ruth Marcus of The New Yorker join John Yang to discuss the week in politics, including the collapse of a deal to end the partial government shutdown and more fallout from the war in Iran. PBS News is supported by - https://www.pbs.org/newshour/about/funders. Hosted on Acast. See acast.com/privacy

PBS NewsHour - Segments
Brooks and Marcus on voters fed up with gridlock in Congress

Mar 27, 2026 · 11:07


David Brooks of The Atlantic and Ruth Marcus of The New Yorker join John Yang to discuss the week in politics, including the collapse of a deal to end the partial government shutdown and more fallout from the war in Iran.

PBS NewsHour - Segments
Iran warns of 'surprise' for U.S. troops if ground invasion begins

Mar 26, 2026 · 3:31


Airstrikes continue in Iran as the U.S. says it's negotiating with the Islamic Republic. John Yang spoke with special correspondent Reza Sayah for the view from Tehran.

PBS NewsHour - Segments
How Major League Baseball's new 'robo ump' challenge system works

Mar 26, 2026 · 6:49


Major League Baseball is back with a new automated ball-strike system, or ABS. In every ballpark, the precise location of pitches will be tracked by electronic monitors. Teams can challenge up to two ball or strike calls in a nine-inning game. John Yang discussed this new era of baseball with Dan Evans, a former general manager of the Los Angeles Dodgers.

PBS NewsHour - Segments
As more U.S. forces head to Mideast, military experts break down capabilities

Mar 26, 2026 · 9:22


As President Trump says he's working on a deal to end the Iran war, more troops are heading to the region. John Yang discussed the capabilities of the forces and how they could be used with Joel Rayburn and Frederic Wehrey. Rayburn is a retired Army colonel and is now at the Hudson Institute. Wehrey is a retired Air Force lieutenant colonel now at the Carnegie Endowment for International Peace.

PBS NewsHour - World
As more U.S. forces head to Mideast, military experts break down capabilities

Mar 26, 2026 · 9:22


As President Trump says he's working on a deal to end the Iran war, more troops are heading to the region. John Yang discussed the capabilities of the forces and how they could be used with Joel Rayburn and Frederic Wehrey. Rayburn is a retired Army colonel and is now at the Hudson Institute. Wehrey is a retired Air Force lieutenant colonel now at the Carnegie Endowment for International Peace.

PBS NewsHour - World
Iran warns of 'surprise' for U.S. troops if ground invasion begins

Mar 26, 2026 · 3:31


Airstrikes continue in Iran as the U.S. says it's negotiating with the Islamic Republic. John Yang spoke with special correspondent Reza Sayah for the view from Tehran.

PBS NewsHour - Politics
As more U.S. forces head to Mideast, military experts break down capabilities

Mar 26, 2026 · 9:22


As President Trump says he's working on a deal to end the Iran war, more troops are heading to the region. John Yang discussed the capabilities of the forces and how they could be used with Joel Rayburn and Frederic Wehrey. Rayburn is a retired Army colonel and is now at the Hudson Institute. Wehrey is a retired Air Force lieutenant colonel now at the Carnegie Endowment for International Peace.

PBS NewsHour - Segments
Jury finds Meta and YouTube liable in landmark youth addiction case

Mar 25, 2026 · 7:39


In a span of less than 24 hours, juries have returned historic verdicts in a pair of high-profile lawsuits that accuse big tech companies of putting children and teens in harm's way on their social media platforms. John Yang discussed more with Jacob Ward of The Rip Current.

PBS NewsHour - Segments
Mideast experts analyze state of Iran war and diplomatic efforts to end it

Mar 25, 2026 · 7:25


To discuss the state of the war with Iran and the diplomatic efforts to end it, John Yang spoke with Ray Takeyh and Alan Eyre. Takeyh was a senior State Department adviser on Iran during the Obama administration and is now at the Council on Foreign Relations. Eyre was part of the Obama administration's negotiating team for the Iran nuclear deal and is now at the Middle East Institute.

PBS NewsHour - World
Mideast experts analyze state of Iran war and diplomatic efforts to end it

Mar 25, 2026 · 7:25


To discuss the state of the war with Iran and the diplomatic efforts to end it, John Yang spoke with Ray Takeyh and Alan Eyre. Takeyh was a senior State Department adviser on Iran during the Obama administration and is now at the Council on Foreign Relations. Eyre was part of the Obama administration's negotiating team for the Iran nuclear deal and is now at the Middle East Institute.

PBS NewsHour - Segments
Hundreds of thousands without power in aftermath of massive winter storm

Jan 26, 2026 · 3:11


A massive winter storm blanketed much of the country with snow, sleet and ice over the weekend. At least 25 deaths were reported amid the winter weather, including hypothermia and sledding accidents. Millions of Americans now face bitter temperatures for days and widespread power outages in some states that may last well into the week. John Yang reports.

PBS NewsHour - Segments
Highlights from PBS News Weekend as show goes off the air

Jan 11, 2026 · 6:43


This Sunday is the final broadcast of PBS News Weekend, at least for the foreseeable future. PBS cancelled the show due to the loss of federal funding for public media. As our team signs off the air, anchor John Yang looks back at some of our top stories and highlights over the years.

PBS NewsHour - Segments
Investigation raises concerns about lack of FDA quality testing for generic drugs

Jan 11, 2026 · 5:16


By some estimates, about 90% of prescriptions in the U.S. are filled with generic drugs. The Food and Drug Administration says that all agency-approved generic drugs "have the same high quality" as brand-name drugs, but a ProPublica investigation found that the FDA rarely tests the quality of generic drugs. John Yang speaks with investigative reporter Debbie Cenziper for more.

PBS NewsHour - Health
Investigation raises concerns about lack of FDA quality testing for generic drugs

Jan 11, 2026 · 5:16


By some estimates, about 90% of prescriptions in the U.S. are filled with generic drugs. The Food and Drug Administration says that all agency-approved generic drugs "have the same high quality" as brand-name drugs, but a ProPublica investigation found that the FDA rarely tests the quality of generic drugs. John Yang speaks with investigative reporter Debbie Cenziper for more.

PBS NewsHour - Segments
ICE shootings spark outrage, protests across the country demanding accountability

Jan 10, 2026 · 5:45


This week's series of shootings by federal agents enforcing Trump's crackdown on illegal immigration has sparked a weekend of protests. Voices of anger and outrage were heard at rallies and demonstrations across the country. John Yang speaks with Lisa Gilbert, co-president of Public Citizen, a progressive advocacy group that helped organize Saturday's protests, for more.

PBS NewsHour - Segments
Why San Francisco is suing top U.S. food manufacturers over ultra-processed foods

Jan 3, 2026 · 6:47


In the first lawsuit of its kind, the city of San Francisco is suing 11 of the nation's top food companies, saying they sell ultra-processed foods knowing they are harmful to health. By some estimates, more than 60% of food consumed in the U.S. is ultra-processed. John Yang speaks with Ashley Gearhardt, a University of Michigan psychology professor who studies addiction, to learn more.

PBS NewsHour - Health
Why San Francisco is suing top U.S. food manufacturers over ultra-processed foods

Jan 3, 2026 · 6:47


In the first lawsuit of its kind, the city of San Francisco is suing 11 of the nation's top food companies, saying they sell ultra-processed foods knowing they are harmful to health. By some estimates, more than 60% of food consumed in the U.S. is ultra-processed. John Yang speaks with Ashley Gearhardt, a University of Michigan psychology professor who studies addiction, to learn more.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
[State of Code Evals] After SWE-bench, Code Clash & SOTA Coding Benchmarks recap — John Yang

Dec 31, 2025 · 17:45


From creating SWE-bench in a Princeton basement to shipping CodeClash, SWE-bench Multimodal, and SWE-bench Multilingual, John Yang has spent the last year and a half watching his benchmark become the de facto standard for evaluating AI coding agents, trusted by Cognition (Devin), OpenAI, Anthropic, and every major lab racing to solve software engineering at scale. We caught up with John live at NeurIPS 2025 to dig into the state of code evals heading into 2026: why SWE-bench went from ignored (October 2023) to the industry standard after Devin's launch (and how Walden emailed him two weeks before the big reveal), how the benchmark evolved from Django-heavy to nine languages across 40 repos (JavaScript, Rust, Java, C, Ruby), why unit tests as verification are limiting and long-running agent tournaments might be the future (CodeClash: agents maintain codebases, compete in arenas, and iterate over multiple rounds), the proliferation of SWE-bench variants (SWE-bench Pro, SWE-bench Live, SWE-Efficiency, AlgoTune, SciCode) and how benchmark authors are now justifying their splits with curation techniques instead of just “more repos,” why Tau-bench's “impossible tasks” controversy is actually a feature not a bug (intentionally including impossible tasks flags cheating), the tension between long autonomy (5-hour runs) vs. interactivity (Cognition's emphasis on fast back-and-forth), how Terminal-bench unlocked creativity by letting PhD students and non-coders design environments beyond GitHub issues and PRs, the academic data problem (companies like Cognition and Cursor have rich user interaction data, academics need user simulators or compelling products like LMArena to get similar signal), and his vision for CodeClash as a testbed for human-AI collaboration, which would freeze model capability, vary the collaboration setup (solo agent, multi-agent, human+agent), and measure how interaction patterns change as models climb the ladder from code completion to full codebase reasoning.

We discuss:

* John's path: Princeton → SWE-bench (October 2023) → Stanford PhD with Diyi Yang and the Iris Group, focusing on code evals, human-AI collaboration, and long-running agent benchmarks
* The SWE-bench origin story: released October 2023, mostly ignored until Cognition's Devin launch kicked off the arms race (Walden emailed John two weeks before: “we have a good number”)
* SWE-bench Verified: the curated, high-quality split that became the standard for serious evals
* SWE-bench Multimodal and Multilingual: nine languages (JavaScript, Rust, Java, C, Ruby) across 40 repos, moving beyond the Django-heavy original distribution
* The SWE-bench Pro controversy: independent authors used the “SWE-bench” name without John's blessing, but he's okay with it (“congrats to them, it's a great benchmark”)
* CodeClash: John's new benchmark for long-horizon development, in which agents maintain their own codebases, edit and improve them each round, then compete in arenas (programming games like Halite, economic tasks like GDP optimization)
* SWE-Efficiency (Jeffrey Maugh, John's high school classmate): optimize code for speed without changing behavior (parallelization, SIMD operations)
* AlgoTune, SciCode, Terminal-bench, Tau-bench, SecBench, SRE-bench: the Cambrian explosion of code evals, each diving into different domains (security, SRE, science, user simulation)
* The Tau-bench “impossible tasks” debate: some tasks are underspecified or impossible, but John thinks that's actually a feature (flags cheating if you score above 75%)
* Cognition's research focus: codebase understanding (retrieval++), helping humans understand their own codebases, and automatic context engineering for LLMs (research sub-agents)
* The vision: CodeClash as a testbed for human-AI collaboration, varying the setup (solo agent, multi-agent, human+agent), freezing model capability, and measuring how interaction changes as models improve

John Yang

* SWE-bench: https://www.swebench.com
* X: https://x.com/jyangballin

Timestamps

00:00:00 Introduction: John Yang on SWE-bench and Code Evaluations
00:00:31 SWE-bench Origins and Devin's Impact on the Coding Agent Arms Race
00:01:09 SWE-bench Ecosystem: Verified, Pro, Multimodal, and Multilingual Variants
00:02:17 Moving Beyond Django: Diversifying Code Evaluation Repositories
00:03:08 Code Clash: Long-Horizon Development Through Programming Tournaments
00:04:41 From Halite to Economic Value: Designing Competitive Coding Arenas
00:06:04 Ofir's Lab: SWE-ficiency, AlgoTune, and SciCode for Scientific Computing
00:07:52 The Benchmark Landscape: TAU-bench, Terminal-bench, and User Simulation
00:09:20 The Impossible Task Debate: Refusals, Ambiguity, and Benchmark Integrity
00:12:32 The Future of Code Evals: Long Autonomy vs Human-AI Collaboration
00:14:37 Call to Action: User Interaction Data and Codebase Understanding Research

Get full access to Latent.Space at www.latent.space/subscribe

PBS NewsHour - Segments
What to know about the U.S.-Ukraine talks and proposal to end Russia's war

Dec 28, 2025 · 5:46


President Trump and Ukrainian President Zelenskyy said Sunday that they are closing in on a peace proposal aimed at ending the war with Russia. The two leaders met at Mar-a-Lago in Florida for talks that involved just the U.S. and Ukraine. John Yang speaks with Michael McFaul, who teaches at Stanford University and was U.S. ambassador to Russia in the Obama administration, to learn more.

PBS NewsHour - Segments
A look at Christmas festivities and traditions around the world

Dec 21, 2025 · 3:24


From twinkling Christmas markets across Europe to vibrant displays of poinsettia in Mexico City, the Christmas spirit takes many forms. John Yang takes a look at how Christians around the world are celebrating the season.

PBS NewsHour - Segments
U.S. Coast Guard ramps up oil tanker interceptions off Venezuelan coast

Dec 21, 2025 · 5:40


Trump’s pressure on Venezuelan President Maduro mounted Sunday as the Coast Guard went after another oil tanker that U.S. officials accused of helping Venezuela circumvent sanctions. Last week, Trump announced a “total and complete blockade of all sanctioned tankers heading to and from Venezuela.” John Yang speaks with Reuters national security correspondent Idrees Ali for more.

PBS NewsHour - Segments
A conversation with Temple Grandin, world-renowned animal scientist and autism advocate

Dec 20, 2025 · 9:38


Four new portraits have gone up at the Smithsonian National Portrait Gallery, showcasing this year’s recipients of the Portrait of America award for their transformative contributions to American history and culture. One of them is Temple Grandin, who has transformed animal welfare around the world and affected public perception of autism. John Yang speaks with Grandin for our Weekend Spotlight.

PBS NewsHour - Segments
Justice Department’s heavily redacted Epstein file release draws criticism from lawmakers

PBS NewsHour - Segments

Play Episode Listen Later Dec 20, 2025 5:33


Overnight, the Justice Department released hundreds more heavily redacted pages of material it had gathered on convicted sex offender Jeffrey Epstein. They come in addition to the thousands of pages released Friday, but what has been made public so far falls short of the full disclosure required by the law Congress passed last month. John Yang speaks with Reuters correspondent Jeff Mason for more.

PBS NewsHour - Segments
Providence community reels from deadly shooting and lockdown at Brown University

PBS NewsHour - Segments

Play Episode Listen Later Dec 14, 2025 6:28


In Providence, Rhode Island, two Brown University students were killed and nine others wounded in a shooting Saturday in a classroom. Authorities say a person of interest was taken into custody at a hotel about 20 miles from Providence. John Yang speaks with Ocean State Media reporter Ian Donnis in Rhode Island for more.

PBS NewsHour - Segments
New book ‘Dirtbag Billionaire’ tells story of Patagonia’s unconventional founder

PBS NewsHour - Segments

Play Episode Listen Later Dec 14, 2025 6:45


Surveys consistently rank Patagonia as one of the most reputable brands in America, not just for its outdoor gear, but also for being a good environmental steward. The story of both the company and its iconoclastic founder is told in a new book, “Dirtbag Billionaire: How Yvon Chouinard Built Patagonia, Made a Fortune, and Gave It All Away.” John Yang speaks with author David Gelles for more.

PBS NewsHour - Segments
Beverly and Dereck Joubert reflect on 40 years of African wildlife photography in new book

PBS NewsHour - Segments

Play Episode Listen Later Dec 13, 2025 10:50


For more than 40 years, Beverly and Dereck Joubert have lived with, photographed and filmed African wildlife. Their images bear witness not just to the majesty of life on the continent, but also to the host of threats that confront both the animals and the wilderness. John Yang speaks with the Jouberts about their new book, “Wild Eye: A Life in Photographs,” and their decades of work.

PBS NewsHour - Segments
How tariffs on China are making the holiday season less merry for shoppers

PBS NewsHour - Segments

Play Episode Listen Later Dec 13, 2025 3:56


This year it might not be the Grinch who threatens to steal Christmas, but tariffs. According to an analysis by LendingTree, if Trump’s tariffs had been in place last year, they would have increased consumer costs by $28 billion, or about $130 per shopper. John Yang speaks with Nathan Gordon, president of online retailer Christmas Central, about the effect of tariffs on seasonal shopping.

PBS NewsHour - Segments
What the end of a Biden-era student loan program means for borrowers

PBS NewsHour - Segments

Play Episode Listen Later Dec 10, 2025 6:54


The Trump administration has reached a joint settlement with seven states that will effectively shut down a key Biden-era student loan relief program. But what about the roughly 7 million people currently enrolled in it? Danielle Douglas-Gabriel, The Washington Post’s national higher education reporter, joins John Yang to break down the impact on borrowers in the months ahead.

PBS NewsHour - Segments
National security strategist analyzes Trump administration’s new global policy

PBS NewsHour - Segments

Play Episode Listen Later Dec 6, 2025 5:58


White House envoys met again with Ukrainian officials on Saturday to discuss Trump’s proposed path to peace. The administration’s national security strategy released this week says ending the war in Ukraine is a “core” U.S. interest, reflecting a shift from the stance of previous administrations, including Trump’s first term. John Yang speaks with the Atlantic Council’s Matthew Kroenig for more.

PBS NewsHour - Segments
More 4-year colleges offer 2-year degrees to reach new groups of students

PBS NewsHour - Segments

Play Episode Listen Later Dec 2, 2025 8:22


About one in four college students is both first-generation and from a low-income background, making the path to a college degree especially challenging. At Boston College’s Messina College, a new, two-year, fully residential associate degree program, a wide range of support is helping change that. John Yang visited the campus to learn more as part of our ongoing series, Rethinking College.

PBS NewsHour - Segments
The story behind one man’s historic descent of Mount Everest on skis

PBS NewsHour - Segments

Play Episode Listen Later Nov 30, 2025 6:01


When adventurers talk about Mount Everest, most often it's about climbing the world's highest peak. In October, Jim Morrison became the first person to ski down Everest’s most dangerous route. The feat was chronicled by mountaineer and Academy Award-winning filmmaker Jimmy Chin for an upcoming National Geographic documentary. John Yang speaks with Morrison for more.

PBS NewsHour - Shields and Brooks
Capehart and Wehner on Trump’s reaction to the National Guard shooting

PBS NewsHour - Shields and Brooks

Play Episode Listen Later Nov 28, 2025 10:34


Jonathan Capehart of MS NOW and Peter Wehner, a contributing writer at The Atlantic and a senior fellow at the Trinity Forum, join John Yang to discuss the week in politics, including President Trump's push for an even tougher crackdown on immigration in the days following the shooting of two National Guard members by an Afghan national on the streets of Washington.

PBS NewsHour - Segments
Top Zelenskyy aide resigns in midst of Ukraine corruption scandal

PBS NewsHour - Segments

Play Episode Listen Later Nov 28, 2025 4:18


A political earthquake has struck Ukraine: President Volodymyr Zelenskyy’s chief of staff Andrii Yermak, the country’s second-most-powerful person, was forced to resign amid a corruption scandal. This comes as Ukraine is enmeshed in negotiations with the Trump administration on a possible end to Russia’s war in Ukraine. Jack Hewson joins John Yang with the latest from Kyiv.

PBS NewsHour - Segments
Capehart and Wehner on Trump’s reaction to the National Guard shooting

PBS NewsHour - Segments

Play Episode Listen Later Nov 28, 2025 10:34


Jonathan Capehart of MS NOW and Peter Wehner, a contributing writer at The Atlantic and a senior fellow at the Trinity Forum, join John Yang to discuss the week in politics, including President Trump's push for an even tougher crackdown on immigration in the days following the shooting of two National Guard members by an Afghan national on the streets of Washington.

PBS NewsHour - Segments
Trump’s deployment of National Guard in U.S. cities gets renewed scrutiny

PBS NewsHour - Segments

Play Episode Listen Later Nov 27, 2025 5:34


The shooting of two National Guard troops near the White House has intensified focus on the Trump administration’s use of military force to crack down on crime in cities led by Democrats. Juliette Kayyem, faculty director of the Harvard Kennedy School’s Homeland Security Project and an assistant DHS secretary during the Obama administration, joins John Yang to discuss.

PBS NewsHour - Segments
Holiday travel delays piling up as winter storm wreaks havoc

PBS NewsHour - Segments

Play Episode Listen Later Nov 26, 2025 2:20


On the day before Thanksgiving, a major winter storm and a plunge in temperatures are wreaking havoc with many travelers' schedules. Temperatures will drop to 20 degrees below normal in much of the central and eastern parts of the country, and flight delays are piling up. John Yang reports.

PBS NewsHour - Segments
Deep in the Amazon, scientists build a ‘time capsule’ to predict future of climate change

PBS NewsHour - Segments

Play Episode Listen Later Nov 23, 2025 2:18


Hundreds of miles from the U.N. conference on climate change that wrapped this weekend in Belém, Brazil, scientists are conducting a first-of-its-kind experiment that could help future policymakers address the issue. John Yang reports.

PBS NewsHour - Segments
Questions linger in a Georgia town more than a year after the toxic BioLab fire

PBS NewsHour - Segments

Play Episode Listen Later Nov 16, 2025 6:57


Last September, a chemical fire in Conyers, Georgia, sent a toxic cloud over the area. A Georgia Public Broadcasting podcast called “Manufacturing Danger: The BioLab Story” examined that day, what led up to it, and the immediate aftermath. Now, a second season of the podcast looks at health consequences for residents a year later. John Yang speaks with GPB’s Pamela Kirkland for more.

PBS NewsHour - Segments
Italy’s oldest barista, who has served coffee since WWII, turns 101

PBS NewsHour - Segments

Play Episode Listen Later Nov 16, 2025 1:53


In a small town in northern Italy, there’s a barista who has been brewing espressos and serving coffees for more than 80 years. She’s still going strong as she turns 101 this weekend, with no intention of retiring. John Yang reports.

PBS NewsHour - Segments
Key takeaways from COP30 halfway through the UN climate summit

PBS NewsHour - Segments

Play Episode Listen Later Nov 16, 2025 5:46


This weekend is the halfway point for the 30th U.N. climate summit known as COP30. In a report issued days before the meeting began, the World Meteorological Organization said 2025 is “on track to be among the three warmest years on record.” New York Times international climate reporter Somini Sengupta, who just returned from COP30, joins John Yang to discuss.

PBS NewsHour - Segments
Trump feuds with MAGA ally ahead of vote to release Epstein files

PBS NewsHour - Segments

Play Episode Listen Later Nov 15, 2025 5:48


President Trump continues to be dogged by Jeffrey Epstein, a man who’s been dead for more than six years. The president on Friday broke with Rep. Marjorie Taylor Greene, a one-time staunch ally who was among four House Republicans who joined all 214 Democrats to force a vote next week on releasing the Justice Department’s Epstein files. Jonathan Lemire of The Atlantic joins John Yang to discuss.

PBS NewsHour - Segments
New study suggests link between medical imaging and pediatric cancer risk

PBS NewsHour - Segments

Play Episode Listen Later Nov 9, 2025 5:02


Medical imaging tests, like X-rays and CT scans, are routine, non-invasive and painless tools used by doctors to make diagnoses. But a recent study of about 4 million children published in the New England Journal of Medicine suggests that the radiation exposure from imaging could pose a risk for pediatric cancer. John Yang speaks with Dr. Rebecca Smith-Bindman, the study’s lead author, to learn more.

PBS NewsHour - Segments
Children exposed to ‘horrific violence’ in Sudan’s civil war, UNICEF says

PBS NewsHour - Segments

Play Episode Listen Later Nov 9, 2025 4:08


Aid groups say tens of thousands of people have fled violence in el-Fasher, a city in the Darfur region of Sudan, which is in the midst of a yearslong civil war. This follows an official declaration that famine is spreading through the northeastern African nation. John Yang speaks with Sheldon Yett, UNICEF’s representative in Sudan, for more.

PBS NewsHour - Segments
Longest shutdown on record disrupts air travel and food assistance for Americans

PBS NewsHour - Segments

Play Episode Listen Later Nov 8, 2025 6:45


Any possible optimism that lawmakers would reach a deal this weekend to end the longest government shutdown on record has faded. The Senate held its first Saturday session since the shutdown began, but no votes were scheduled. John Yang speaks with former FAA administrator Randy Babbitt and Supreme Court analyst Amy Howe about two widespread effects of the shutdown: air travel and SNAP benefits.

PBS NewsHour - Segments
What the ‘bird theory’ test may reveal about your relationship

PBS NewsHour - Segments

Play Episode Listen Later Nov 8, 2025 6:51


One of the latest relationship tests to go viral is the “bird theory,” racking up millions of views on social media. It’s based on a theory developed by couples researcher John Gottman about the importance of engaging with partners when looking for a connection. John Yang speaks with licensed clinical psychologist Alexandra Solomon to learn more about the test and what it reveals.

PBS NewsHour - Segments
A look at Dick Cheney’s influential and polarizing legacy

PBS NewsHour - Segments

Play Episode Listen Later Nov 4, 2025 11:35


Dick Cheney, one of the most influential and polarizing vice presidents in American history, died at age 84. He served alongside President George W. Bush for two terms, a period that saw the 9/11 attacks and the start of two major wars. Cheney's family said he passed away due to complications of pneumonia, along with cardiac and vascular disease. John Yang looks back at Cheney's career and legacy.