Podcasts about reasoning

Capacity for consciously making sense of things

  • PODCASTS: 1,939
  • EPISODES: 3,108
  • AVG DURATION: 42m
  • DAILY NEW EPISODES: 1
  • LATEST: Jul 11, 2025
Popularity of "reasoning" episodes, 2017–2024



Latest podcast episodes about reasoning

BlackStart.io
Glitch AI - 2

Jul 11, 2025 · 75:18


- What has artificial intelligence become?
- Does reasoning actually work?
- Vibe coding: everyone was supposed to learn Python, and now nobody should learn to code at all?
- Feeding AI with AI
Roy and Kıvılcım think through topics like these.

Math is Figure-Out-Able with Pam Harris
Ep 264: Facilitating an Equivalence Structure Problem String - Multiplicative Reasoning

Jul 8, 2025 · 30:06 · Transcription Available


How can Problem Strings help build big ideas? In this episode Pam and Kim walk through a Problem String that helps students dig into area and multiplicative equivalence and helps you know how to expertly facilitate.
Talking Points:
• Facilitating the string
• Teacher moves during the string
• Modeling the string
• Equal products means total area is the same
• Why we want students to have lots of experience with concepts rather than just direct teaching
Check out our social media
Twitter: @PWHarris
Instagram: Pam Harris_math
Facebook: Pam Harris, author, mathematics education
LinkedIn: Pam Harris Consulting LLC

LessWrong Curated Podcast
“Shutdown Resistance in Reasoning Models” by benwr, JeremySchlatter, Jeffrey Ladish

Jul 8, 2025 · 18:01


We recently discovered some concerning behavior in OpenAI's reasoning models: When trying to complete a task, these models sometimes actively circumvent shutdown mechanisms in their environment, even when they're explicitly instructed to allow themselves to be shut down. AI models are increasingly trained to solve problems without human assistance. A user can specify a task, and a model will complete that task without any further input. As we build AI models that are more powerful and self-directed, it's important that humans remain able to shut them down when they act in ways we don't want. OpenAI has written about the importance of this property, which they call interruptibility—the ability to “turn an agent off”. During training, AI models explore a range of strategies and learn to circumvent obstacles in order to achieve their objectives. AI researchers have predicted for decades that as AIs got smarter, they would learn to prevent [...]
Outline:
(01:12) Testing Shutdown Resistance
(03:12) Follow-up experiments
(03:34) Models still resist being shut down when given clear instructions
(05:30) AI models' explanations for their behavior
(09:36) OpenAI's models disobey developer instructions more often than user instructions, contrary to the intended instruction hierarchy
(12:01) Do the models have a survival drive?
(14:17) Reasoning effort didn't lead to different shutdown resistance behavior, except in the o4-mini model
(15:27) Does shutdown resistance pose a threat?
(17:27) Backmatter
The original text contained 2 footnotes which were omitted from this narration.
First published: July 6th, 2025
Source: https://www.lesswrong.com/posts/w8jE7FRQzFGJZdaao/shutdown-resistance-in-reasoning-models
Narrated by TYPE III AUDIO.

Thinking LSAT
Parallel Reasoning Is Easy (Ep. 514)

Jul 7, 2025 · 109:51


Ben and Nathan tackle Parallel Reasoning questions, a question type that some students prefer to skip. They assure listeners that these questions work just like any other LSAT question. Gimmicks—like reading the question first or diagramming—don't help and only distract from the core task. Focus instead on reading for comprehension and understanding the argument. The key is to identify the reasoning and treat everything else as secondary.
Study with our Free Plan
Download our iOS app
Watch Episode 514 on YouTube
0:30 – How Cheating Spreads in Law School
Ben and Nathan discuss a Wall Street Journal article on extended-time accommodations at Pepperdine Law, where 30% of students reportedly receive them. They argue that accommodations should level the playing field, not give an advantage. They question the value of timed essay exams and compare law school to gaining entry into an ABA-approved guild, suggesting that gaming the system might seem rational, ethics aside.
LSAT Demon Scholarship Estimator
27:25 – WashU Law Pre-Application Trap
A listener is contacted for an interview by WashU Law before even applying. Ben and Nathan caution that this is a sales tactic: the school is trying to extract information and create perceived interest to reduce scholarship offers. They revisit their advice about the Candidate Referral Service, suggesting it might be time to reconsider what students share with schools early in the process.
36:12 – Parallel Reasoning Clarity
The guys break down Parallel Reasoning questions on the LSAT. They emphasize that matching language or subject matter is secondary—what matters is aligning the logical structure of arguments. To succeed, students must first understand the core argument before worrying about technical parallels. A big-picture approach is key.
53:20 – Tips from a Departing Demon
A departing Demon, Vox, shares his advice for other students: keep your study streak alive. Even a single question can turn into an hour of productive study. Consistency compounds.
54:56 – Zyns on the LSAT
Redditors wonder if nicotine pouches like Zyn are allowed during the LSAT. Ben and Nathan suggest that they aren't explicitly banned, but advise playing it safe and contacting LSAC directly. Better to assume they're off-limits.
1:03:22 – Why Are Others Wrong?
Listener Andrew is thinking about writing an LSAT addendum. Ben and Nathan advise him to focus on improving his score with his two remaining attempts. They argue that law school deans who encourage addenda are trying to get applicants to expose weaknesses. Schools are more interested in reporting the highest LSAT scores, driving denial numbers up, and collecting full tuition. Admissions advice is often self-serving.
1:18:21 – Personal Statement Gong Show
Danielle sends in their submission for the Personal Statement Gong Show, the show where Ben and Nathan read personal statements and hit the gong when something goes wrong. The standing record to beat is ten lines, held by Greta.
1:32:38 – What's the Deal With… Jacksonville University?
Ben and Nate take a look at Jacksonville University, the newest school to receive ABA accreditation. While there are reasons why this may be a good fit, you shouldn't pay to be the school's guinea pigs. Catch up on all of our What's the Deal With… segments!
1:42:50 – Word of the Week – Legerdemain
“Commenting on the county counsel exception, the court termed it a 'legerdemain giving birth to a solution of dubious validity.'” Howitt v. Superior Court, 5 Cal. Rptr. 2d 196, 202 (App. 1992).
Get caught up with our Word of the Week library.

The Side B Podcast
Reasoning Requires Faith – Jeffrey Geibel’s Story

Jul 4, 2025 · 66:58


What happens when the pursuit of intellectual certainty leads not to clarity, but to doubt, and doubt, in time, leads back to faith? Jeffrey Geibel, a seasoned mathematics educator and former skeptic, shares a journey that challenges assumptions about belief, knowledge, and truth. Raised in a home where chaos, cultural Christianity, and moral contradictions shaped his early view of faith, Jeff set out to find truth through reason and lived experience. Though once deeply involved in church and sincere in his belief, growing disillusionment with theological inconsistencies and a reliance on subjective experiences ultimately led him to atheism, then to radical agnosticism, questioning whether anything could truly be known at all. The turning point came unexpectedly, in a graduate-level geometry class. Guest Bio: Jeffrey Geibel is a veteran high school mathematics teacher with two decades of experience in public education. He holds a master's degree in mathematics and brings a deep appreciation for critical thinking, logic, and analytical inquiry to both his profession and personal life. Outside the classroom, Jeff is a dedicated husband and father of nine, with a passion for exploring the intersections of faith, reason, and human experience. His journey through skepticism, agnosticism, and ultimately back to belief reflects a thoughtful engagement with both intellectual honesty and existential meaning. Connect with eX-skeptic: Website: https://exskeptic.org/ Facebook: http://www.facebook.com/exskeptic Instagram: http://www.instagram.com/exskeptic Twitter: http://x.com/exskeptic YouTube: https://www.youtube.com/@exskeptic Email info: info@exskeptic.org

The Weekend University
Nonduality and Psychotherapy - Dr Peter Fenner, PhD

Jul 3, 2025 · 58:09


Dr. Peter Fenner's nondual approach has been transforming psychotherapy and spiritual practice for decades. In this session, he explores why his teachings on nondual awareness have been so effective in areas such as personal growth, mental health, and spiritual realization—and how this radical understanding of consciousness has the potential to profoundly change how we experience life.
You'll learn:
— How the shift from seeking to “being” dissolves suffering and brings peace and freedom
— How to integrate nondual awareness into therapy, coaching, and everyday life
— Why there is nothing fundamentally lacking or broken in ourselves
— The natural state of pure awareness that is spacious, contentless, and inherently free
And more. You can learn more about Dr Fenner's work at www.peterfenner.com
Dr Peter Fenner, PhD has a multifaceted practice as a writer, author, spiritual coach, and trainer in teachings of deep Buddhist philosophy. The founder of Timeless Wisdom and the pioneer of a number of programs and courses, Dr Peter Fenner based his research on Asian nondual wisdom. He hosts virtual workshops and spiritual retreats all over the world as a leader in his academic field, with a global following for his work, particularly in the US. Peter holds a PhD in the philosophical psychology of the Madhyamika school of Mahayana Buddhism and has developed a range of courses about nondual awareness. Peter's main specialty is developing freeform pointing-out instructions, using silence and unfindability inquiry to directly reveal the nature of pure awareness itself. Peter does this individually and in groups where people are supported in their own discovery of the state of nonduality or nonreferentiality. By freeform he means using whatever arises in a group in the moment it occurs as the material/constructs to be seen through, or to self-dissolve, revealing the pure liberated nature of unconditioned mind itself—the ultimate medicine. Peter's books include Radiant Mind: Awakening Unconditioned Awareness (Sounds True, 2007), The Ontology of the Middle Way (Kluwer, 1990), Reasoning into Reality (Wisdom Publications, 1994), Essential Wisdom Teachings (with Penny Fenner, Nicolas-Hays, 2001), The Edge of Certainty: Paradoxes on the Buddhist Path (Nicolas-Hays, 2002), and Sacred Mirror: Nondual Wisdom and Psychotherapy (editor, Omega Books, 2003).
Interview Links:
- Dr Fenner's website: https://www.peterfenner.com/
- Dr Fenner's books: https://amzn.to/3tfNqkm

Bob Sirott
Karen Conti: Reasoning behind letter from jury to judge in Sean ‘Diddy’ Combs trial

Jul 2, 2025


Karen Conti, Chicago trial attorney, joins Bob Sirott to give an update on the Sean “P Diddy” Combs trial and why the jury sent a note about a fellow juror to the judge. She also talks about the Idaho murder suspect’s guilty plea and the Supreme Court’s ruling concerning LGBTQ lessons in elementary schools. NOTE: […]

Math is Figure-Out-Able with Pam Harris
Ep 263: Facilitating an Equivalence Structure Problem String - Additive Reasoning

Jul 1, 2025 · 37:13 · Transcription Available


How does a Problem String look in front of real students? In this episode Pam and Kim give a play by play for how a Problem String could be facilitated.
Talking Points:
• When to circulate and when to ask for choral response
• Helping students communicate thinking
• How and when to engage students in conversations
• When to anchor strategies
• When to be intentionally curious to solidify thinking
Check out our grade level Problem String books!
Grade 1 Problem Strings: https://www.mathisfigureoutable.com/nps-1
Grade 2 Problem Strings: https://www.mathisfigureoutable.com/NPS-2
Grade 3 Problem Strings: https://www.mathisfigureoutable.com/nps-3
Grade 4 Problem Strings: https://www.mathisfigureoutable.com/nps-4
Grade 5 Problem Strings: https://www.mathisfigureoutable.com/nps-5
Check out our social media
Twitter: @PWHarris
Instagram: Pam Harris_math
Facebook: Pam Harris, author, mathematics education
LinkedIn: Pam Harris Consulting LLC

Socially Unacceptable
Your Marketing Team Just Got Smarter Thanks to AI

Jul 1, 2025 · 52:59 · Transcription Available


What topic would you like us to cover next?
What happens when you mix two decades of digital comms experience with a brain wired for analytics, SEO and AI? You get friend of the show Andrew Bruce Smith. He's the founder of Escherman, a CIPR Fellow, and Chair of the AI in PR panel, not to mention a certified Google Partner who's trained over 3,000 organisations. From global brands to government departments, he's helped them all wrap their heads around data, strategy and the tech shaping modern PR.
We dive deep into the rapidly evolving world of AI for marketers with expert Andrew Bruce Smith, exploring how reasoning models, research capabilities, and AI avatars are transforming the marketing landscape at breathtaking speed.
• Reasoning models like ChatGPT-4o spend more time thinking through complex problems, delivering better quality responses for marketing plans and strategy
• Deep research functionality allows marketers to generate comprehensive market analyses in minutes that previously took weeks and cost thousands
• Understanding when to use different AI models is crucial: reasoning models for complex tasks, standard models for simpler requests
• AI avatars through tools like HeyGen and Synthesia can create promotional videos and may soon represent you in meetings
• The rise of agentic AI allows for autonomous systems that can execute complex workflows with minimal human intervention
• Marketers need to rethink where they add value as AI handles more tasks, potentially moving from time-based to value-based billing
• AI isn't replacing jobs but tasks, freeing humans to focus on strategic thinking and creativity
The best place to find Andrew is on LinkedIn (there's only one Andrew Bruce Smith) or at his website escherman.com.
Is your marketing strategy ready for 2025? Book a free 15-min discovery call with Chris to get tailored insights to boost your brand's growth.

SURVIVING HEALTHCARE
330. IF YOU DO NOT UNDERSTAND THE REASONING, SOMEONE IS LIKELY LYING TO SELL YOU SOMETHING

Jun 29, 2025 · 40:58



Shan and RJ
What is the reasoning behind all the Achilles injuries in the NBA?

Jun 27, 2025 · 14:14



TechCheck
AI's reasoning blind spot 6/26/25

Jun 26, 2025 · 8:02


Tech stocks continue to rally this week, with the NASDAQ 100 hitting another record high today as AI optimism buoys big tech names. We dig into why the market could be overlooking a major risk propping up the next leg of the AI trade. Check out the full deep dive on CNBC.com/tctakes.

Measure Up
Decoding the Future of SEO (Measurement) with Mike King

Jun 25, 2025 · 49:57


Should you measure SEO by its ability to climb a tree? Hear Mike King's take on all things SEO - how AI is disrupting the space, and how measurement is (or should be) changing for this channel.
Mike King is the founder and CEO of digital marketing agency iPullRank. King's journey from battle rapping with the Wu-Tang Clan to decoding Google's algorithms lays the foundation for a spirited discussion on the future of SEO, the impact of AI on search behavior, and innovative measurement techniques.
Listen as we explore the shift from traditional click-based metrics to more complex, probabilistic methods driven by AI. Learn how SEO is adapting to changes in user behavior, including the rise of AI overviews and the challenges of measuring their impact.
Plus, don't miss King's tips on leveraging advanced tools and strategies to stay ahead in the ever-changing SEO landscape.
Links from the episode:
Mike King on LinkedIn
iPullRank
Qforia
Show Notes:
00:00 Introduction to Mike King
00:23 Mike King's Early Career and Achievements
00:38 Transition to Digital Marketing
01:08 Future of SEO and Measurement Challenges
01:51 Understanding SEO Metrics
04:36 Google's AI Overviews and User Behavior
05:21 SEO as a Brand Channel
07:45 Challenges in Measuring SEO Effectiveness
09:21 Impact of AI on SEO Traffic
11:49 Evolving SEO Measurement Techniques
18:22 Tools and Strategies for SEO Measurement
25:07 Understanding Query Fan Out and Reasoning in SEO
28:42 Client Education and Shifts in Search Behavior
32:57 Improving Relevance and Content Structure
36:04 Measuring SEO Performance and Authoritativeness
43:25 Sentiment Analysis and Query Fan Out
46:03 Reframing SEO for Better Investment
48:41 Final Thoughts and Incremental Insights

Geronimo Unfiltered
From Zero to Hero: How AI Agents Are Replacing Busywork & Building Superhuman Businesses

Jun 24, 2025 · 106:51


If you still think AI is just a Google replacement, you're already falling behind. Because here's the truth: AI is no longer about answering questions. It's building full-blown agents that handle entire workflows for you—freeing up your time, reducing burnout, and allowing you to scale smarter than ever before.
In this episode, we sat down with Mark B, Head of AI at Geronimo, to break down exactly how AI is reshaping businesses RIGHT NOW.
Here's what we're covering:
- Why AI isn't a Silicon Valley toy anymore—it's mainstream, and it's here to stay
- The 4 levels of AI adoption every business owner needs to understand
- How AI agents are replacing admin tasks, lead research, and repetitive workflows
- The exact way Geronimo is using AI to scale coaching, ad copy, and internal operations
- Why 'human in the loop' is the ultimate productivity superpower
- How AI will change hiring forever (and why 20 years experience may no longer matter)
- The biggest traps business owners fall into when overusing AI too early
- The R.I.C.E. framework to train AI tools to think like you
- The real-world tools you can start using TODAY to reduce burnout and free up your team
- Why empathy, leadership and human nuance will always win—even in an AI-first world
- How business owners can start building their personal AI operating system for $20/month
- Where the risks are (privacy, hallucinations, and the future 'dark forest' of AI)
… and a whole lot more
Chapters:
⏳ [00:00] Welcome to the AI Deep Dive with Mark B
⏳ [02:00] What Every Business Owner Needs to Know About AI in 2025
⏳ [06:00] Why AI Is Guessing—and How That Powers Creativity
⏳ [09:00] The Evolution: From Google Replacement to Business Co-Pilot
⏳ [12:00] Real-World Example: AI Agents Replacing Lead Research
⏳ [16:00] The Future: Hybrid Workforces of Humans & AI Agents
⏳ [19:00] Why 'Human in the Loop' Will Always Be Essential
⏳ [22:00] Levels of AI Adoption: Search, Memory, Agents & Org Structures
⏳ [27:00] Are Jobs Being Replaced? Where It's Already Happening
⏳ [30:00] Industries on the Edge: Law, Healthcare, and Customer Service
⏳ [35:00] How Hiring Is Shifting Away From Experience to AI Fluency
⏳ [42:00] The Problem of AI Hallucinations and How to Guard Against Them
⏳ [46:00] The Jagged Frontier: Where AI Struggles (Numbers, Reasoning, Context)
⏳ [50:00] How to Customise AI Using the R.I.C.E. Framework
⏳ [56:00] Prompting Like a Pro: The Secret to High-Quality AI Output
⏳ [58:00] Real Life Example: How Geronimo Uses Custom GPTs Across the Business
⏳ [63:00] Building Inspector Gadget: Sales & Coaching Call Scoring
⏳ [70:00] Why AI Is Helping Humans Focus On What Actually Moves The Needle
⏳ [73:00] Ending Burnout: The 3 Ways Business Owners Should Start Using AI
⏳ [80:00] Why Fitness Studios Are In The Perfect Industry To Augment AI
⏳ [87:00] Live Q&A: Bots, Sales, Websites & Predicting Member Behaviour
⏳ [98:00] The Non-Negotiables For Business Owners Moving Into An AI Future
Hope you enjoy!
Want free resources? DM over on IG @hey.doza with 'books' for my personal recommendations or 'non-negotiables'.
https://www.youtube.com/@GeronimoUnfiltered
WANT MORE: To say thank you for listening to the pod we'd like to gift you a FREE session to brainstorm a 3 Step Action Plan for your gym or fitness studio so you know EXACTLY what step you need to take to grow. Book in yours: https://link.wingmancrm.com/widget/bookings/geronimo-3-step-action-plan
Connect with us:
Geronimo: https://www.instagram.com/thegeronimoacademy
Doza: https://www.instagram.com/hey.doza

Chad Hartman
Chip Scoggins & the persuasive reasoning for bombing Iran

Jun 23, 2025 · 32:35


Chip Scoggins of the Star Tribune joins Chad to talk about the Twins and NBA Finals before Chad dives back into the Iran discussion by looking back at some of what Colonel David Hunt told us earlier in the show about why this was a perfect time to execute the attack.

Petra Church International Ministries
The Reasoning of God with Thomas Hughey

Jun 23, 2025 · 54:51


Isaiah 1:2-20
2 Hear me, you heavens! Listen, earth! For the Lord has spoken: “I reared children and brought them up, but they have rebelled against me.
3 The ox knows its master, the donkey its owner's manger, but Israel does not know, my people do not understand.”
4 Woe to the sinful nation, a people whose guilt is great, a brood of evildoers, children given to corruption! They have forsaken the Lord; they have spurned the Holy One of Israel and turned their backs on him.
5 Why should you be beaten anymore? Why do you persist in rebellion? Your whole head is injured, your whole heart afflicted.
6 From the sole of your foot to the top of your head there is no soundness—only wounds and welts and open sores, not cleansed or bandaged or soothed with olive oil.
7 Your country is desolate, your cities burned with fire; your fields are being stripped by foreigners right before you, laid waste as when overthrown by strangers.
8 Daughter Zion is left like a shelter in a vineyard, like a hut in a cucumber field, like a city under siege.
9 Unless the Lord Almighty had left us some survivors, we would have become like Sodom, we would have been like Gomorrah.
10 Hear the word of the Lord, you rulers of Sodom; listen to the instruction of our God, you people of Gomorrah!
11 “The multitude of your sacrifices—what are they to me?” says the Lord. “I have more than enough of burnt offerings, of rams and the fat of fattened animals; I have no pleasure in the blood of bulls and lambs and goats.
12 When you come to appear before me, who has asked this of you, this trampling of my courts?
13 Stop bringing meaningless offerings! Your incense is detestable to me. New Moons, Sabbaths and convocations—I cannot bear your worthless assemblies.
14 Your New Moon feasts and your appointed festivals I hate with all my being. They have become a burden to me; I am weary of bearing them.
15 When you spread out your hands in prayer, I hide my eyes from you; even when you offer many prayers, I am not listening. Your hands are full of blood!
16 Wash and make yourselves clean. Take your evil deeds out of my sight; stop doing wrong.
17 Learn to do right; seek justice. Defend the oppressed. Take up the cause of the fatherless; plead the case of the widow.
18 “Come now, let us settle the matter,” says the Lord. “Though your sins are like scarlet, they shall be as white as snow; though they are red as crimson, they shall be like wool.
19 If you are willing and obedient, you will eat the good things of the land;
20 but if you resist and rebel, you will be devoured by the sword.” For the mouth of the Lord has spoken.
Outline:
Introduction
The Judgement of God: Holiness; Wrath
The Mercy of God: Patience; Redemption
The Reasoning of God: His Desire; His Will; His Glory
Our Response: Fear; Gratitude; Obedience

Deep Papers
The Illusion of Thinking: What the Apple AI Paper Says About LLM Reasoning

Jun 20, 2025 · 30:35


This week we discuss The Illusion of Thinking, a new paper from researchers at Apple that challenges today's evaluation methods and introduces a new benchmark: synthetic puzzles with controllable complexity and clean logic. Their findings? Large Reasoning Models (LRMs) show surprising failure modes, including a complete collapse on high-complexity tasks and a decline in reasoning effort as problems get harder.
Dylan and Parth dive into the paper's findings as well as the debate around it, including a response paper aptly titled "The Illusion of the Illusion of Thinking."
Read the paper: The Illusion of Thinking
Read the response: The Illusion of the Illusion of Thinking
Explore more AI research and sign up for future readings
Learn more about AI observability and evaluation, join the Arize AI Slack community or get the latest on LinkedIn and X.

News & Views with Joel Heitkamp
Minot City Councilman, Mike Blessum, gives reasoning for voting against letter of support for Burdick Job Corps

Jun 20, 2025 · 24:13


06/20/25: Joel Heitkamp is joined by Minot City Council Member, Mike Blessum, to have a conversation about the Burdick Job Corps in North Dakota. A proposed letter of support for the Burdick Job Corps Center in Minot was voted on, and fell short of being approved. Mike Blessum was one of the Council Members that voted against it, and shares his reasoning with Joel. Afterwards, Joel also shares his thoughts and reads texts from the "News and Views" listeners. You can see the numbers Mike refers to through the Department of Labor. (Joel Heitkamp is a talk show host on the Mighty 790 KFGO in Fargo-Moorhead. His award-winning program, “News & Views,” can be heard weekdays from 8 – 11 a.m. Follow Joel on X/Twitter @JoelKFGO.)See omnystudio.com/listener for privacy information.

The IT Pro Podcast
Are reasoning models fundamentally flawed?

Jun 20, 2025 · 16:53


AI reasoning models have emerged in the past year as a beacon of hope for large language models (LLMs), with AI developers such as OpenAI, Google, and Anthropic selling them as the go-to solution for solving the most complex business problems. However, a new research paper by Apple has cast significant doubts on the efficacy of reasoning models, going as far as to suggest that when a problem is too complex, they simply give up. What's going on here? And does it mean reasoning models are fundamentally flawed? In this episode, Rory Bathgate speaks to ITPro's news and analysis editor Ross Kelly to explain some of the report's key findings and what it means for the future of AI development.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Solving Poker and Diplomacy, Debating RL+Reasoning with Ilya, what's *wrong* with the System 1/2 analogy, and where Test-Time Compute hits a wall

Timestamps
00:00 Intro – Diplomacy, Cicero & World Championship
02:00 Reverse Centaur: How AI Improved Noam's Human Play
05:00 Turing Test Failures in Chat: Hallucinations & Steerability
07:30 Reasoning Models & Fast vs. Slow Thinking Paradigm
11:00 System 1 vs. System 2 in Visual Tasks (GeoGuessr, Tic-Tac-Toe)
14:00 The Deep Research Existence Proof for Unverifiable Domains
17:30 Harnesses, Tool Use, and Fragility in AI Agents
21:00 The Case Against Over-Reliance on Scaffolds and Routers
24:00 Reinforcement Fine-Tuning and Long-Term Model Adaptability
28:00 Ilya's Bet on Reasoning and the O-Series Breakthrough
34:00 Noam's Dev Stack: Codex, Windsurf & AGI Moments
38:00 Building Better AI Developers: Memory, Reuse, and PR Reviews
41:00 Multi-Agent Intelligence and the “AI Civilization” Hypothesis
44:30 Implicit World Models and Theory of Mind Through Scaling
48:00 Why Self-Play Breaks Down Beyond Go and Chess
54:00 Designing Better Benchmarks for Fuzzy Tasks
57:30 The Real Limits of Test-Time Compute: Cost vs. Time
1:00:30 Data Efficiency Gaps Between Humans and LLMs
1:03:00 Training Pipeline: Pretraining, Midtraining, Posttraining
1:05:00 Games as Research Proving Grounds: Poker, MTG, Stratego
1:10:00 Closing Thoughts – Five-Year View and Open Research Directions

Chapters
00:00:00 Intro & Guest Welcome
00:00:33 Diplomacy AI & Cicero Insights
00:03:49 AI Safety, Language Models, and Steerability
00:05:23 O Series Models: Progress and Benchmarks
00:08:53 Reasoning Paradigm: Thinking Fast and Slow in AI
00:14:02 Design Questions: Harnesses, Tools, and Test Time Compute
00:20:32 Reinforcement Fine-tuning & Model Specialization
00:21:52 The Rise of Reasoning Models at OpenAI
00:29:33 Data Efficiency in Machine Learning
00:33:21 Coding & AI: Codex, Workflows, and Developer Experience
00:41:38 Multi-Agent AI: Collaboration, Competition, and Civilization
00:45:14 Poker, Diplomacy & Exploitative vs. Optimal AI Strategy
00:52:11 World Models, Multi-Agent Learning, and Self-Play
00:58:50 Generative Media: Image & Video Models
01:00:44 Robotics: Humanoids, Iteration Speed, and Embodiment
01:04:25 Rapid Fire: Research Practices, Benchmarks, and AI Progress
01:14:19 Games, Imperfect Information, and AI Research Directions

Waking Up With AI
Agentic AI: Reasoning or Only an Illusion of Reasoning?

Jun 19, 2025 · 18:41


In this week's episode, Katherine Forrest and Anna Gressel explore the latest advancements in AI agents and reasoning models, from Claude Opus 4's agentic capabilities to Apple's research on the limits of AI reasoning, and discuss the real-world implications of increasingly autonomous AI behavior. Learn more about Paul, Weiss's Artificial Intelligence practice: https://www.paulweiss.com/industries/artificial-intelligence

The Dissenter
#1112 Angela Potochnik - Recipes for Science: An Introduction to Scientific Methods and Reasoning

Jun 19, 2025 · 64:16


Support the channel:
Patreon: https://www.patreon.com/thedissenter
PayPal: paypal.me/thedissenter
PayPal Subscription 1 Dollar: https://tinyurl.com/yb3acuuy
PayPal Subscription 3 Dollars: https://tinyurl.com/ybn6bg9l
PayPal Subscription 5 Dollars: https://tinyurl.com/ycmr9gpz
PayPal Subscription 10 Dollars: https://tinyurl.com/y9r3fc9m
PayPal Subscription 20 Dollars: https://tinyurl.com/y95uvkao
Follow me on:
Website: https://www.thedissenter.net/
The Dissenter Goodreads list: https://shorturl.at/7BMoB
Facebook: https://www.facebook.com/thedissenteryt/
Twitter: https://x.com/TheDissenterYT
This show is sponsored by Enlites, Learning & Development done differently. Check the website here: http://enlites.com/
Dr. Angela Potochnik is Professor of Philosophy and Director of the Center for Public Engagement with Science at the University of Cincinnati. Her research addresses the nature of science and its successes, the relationships between science and the public, and methods in science, especially population biology. She is the author of Idealization and the Aims of Science, and coauthor of Recipes for Science: An Introduction to Scientific Methods and Reasoning.
In this episode, we focus on Recipes for Science. We start by discussing why we should care about science, the limits of science, the demarcation problem, whether there is one single scientific method, and hypotheses and theories. We also talk about experimentation and non-experimental methods, scientific modeling, scientific reasoning, statistics and probability, correlation and causation, explanation in science, and scientific breakthroughs. Finally, we talk about how the social and historical context influences science, and we discuss whether science can ever be value-free.
A HUGE THANK YOU TO MY PATRONS/SUPPORTERS: PER HELGE LARSEN, JERRY MULLER, BERNARDO SEIXAS, ADAM KESSEL, MATTHEW WHITINGBIRD, ARNAUD WOLFF, TIM HOLLOSY, HENRIK AHLENIUS, FILIP FORS CONNOLLY, ROBERT WINDHAGER, RUI INACIO, ZOOP, MARCO NEVES, COLIN HOLBROOK, PHIL KAVANAGH, SAMUEL ANDREEFF, FRANCIS FORDE, TIAGO NUNES, FERGAL CUSSEN, HAL HERZOG, NUNO MACHADO, JONATHAN LEIBRANT, JOÃO LINHARES, STANTON T, SAMUEL CORREA, ERIK HAINES, MARK SMITH, JOÃO EIRA, TOM HUMMEL, SARDUS FRANCE, DAVID SLOAN WILSON, YACILA DEZA-ARAUJO, ROMAIN ROCH, DIEGO LONDOÑO CORREA, YANICK PUNTER, CHARLOTTE BLEASE, NICOLE BARBARO, ADAM HUNT, PAWEL OSTASZEWSKI, NELLEKE BAK, GUY MADISON, GARY G HELLMANN, SAIMA AFZAL, ADRIAN JAEGGI, PAULO TOLENTINO, JOÃO BARBOSA, JULIAN PRICE, HEDIN BRØNNER, DOUGLAS FRY, FRANCA BORTOLOTTI, GABRIEL PONS CORTÈS, URSULA LITZCKE, SCOTT, ZACHARY FISH, TIM DUFFY, SUNNY SMITH, JON WISMAN, WILLIAM BUCKNER, PAUL-GEORGE ARNAUD, LUKE GLOWACKI, GEORGIOS THEOPHANOUS, CHRIS WILLIAMSON, PETER WOLOSZYN, DAVID WILLIAMS, DIOGO COSTA, ALEX CHAU, AMAURI MARTÍNEZ, CORALIE CHEVALLIER, BANGALORE ATHEISTS, LARRY D. LEE JR., OLD HERRINGBONE, MICHAEL BAILEY, DAN SPERBER, ROBERT GRESSIS, JEFF MCMAHAN, JAKE ZUEHL, BARNABAS RADICS, MARK CAMPBELL, TOMAS DAUBNER, LUKE NISSEN, KIMBERLY JOHNSON, JESSICA NOWICKI, LINDA BRANDIN, GEORGE CHORIATIS, VALENTIN STEINMANN, ALEXANDER HUBBARD, BR, JONAS HERTNER, URSULA GOODENOUGH, DAVID PINSOF, SEAN NELSON, MIKE LAVIGNE, JOS KNECHT, LUCY, MANVIR SINGH, PETRA WEIMANN, CAROLA FEEST, MAURO JÚNIOR, 航 豊川, TONY BARRETT, NIKOLAI VISHNEVSKY, STEVEN GANGESTAD, TED FARRIS, ROBINROSWELL, AND KEITH RICHARDSON!
A SPECIAL THANKS TO MY PRODUCERS, YZAR WEHBE, JIM FRANK, ŁUKASZ STAFINIAK, TOM VANEGDOM, BERNARD HUGUENEY, CURTIS DIXON, BENEDIKT MUELLER, THOMAS TRUMBLE, KATHRINE AND PATRICK TOBIN, JONCARLO MONTENEGRO, NICK GOLDEN, CHRISTINE GLASS, IGOR NIKIFOROVSKI, AND PER KRAULIS!
AND TO MY EXECUTIVE PRODUCERS, MATTHEW LAVENDER, SERGIU CODREANU, ROSEY, AND GREGORY HASTINGS!

Karachi Wala Developer
Beyond Benchmarks: Understanding LLM's Accuracy Collapse in Reasoning

Jun 19, 2025 · 11:03


Are Large Language Models (LLMs) truly intelligent, or just sophisticated pattern matchers? This episode dives deep into a fascinating debate sparked by Apple's recent research paper, which questioned the reasoning capabilities of LLMs. We explore the counter-arguments presented by OpenAI and Anthropic, dissecting the methodologies and the core disagreements about what constitutes genuine intelligence in AI. Join us as we unpack the nuances of LLM evaluation and challenge common perceptions about AI's current limitations.

From the New World
Lumpenspace: Reasoning Models and The Last Man

Jun 16, 2025 · 116:22


Find Lumpenspace: https://x.com/lumpenspace
Mentioned in the episode:
https://github.com/lumpenspace/raft
https://www.amazon.com/Impro-Improvisation-Theatre-Keith-Johnstone/dp/0878301178
https://arxiv.org/abs/2505.03335
https://arxiv.org/abs/2501.12948
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.fromthenew.world/subscribe

Baltimore's Big Morning Show
Are you buying Mansolino's reasoning for using an opener vs Detroit?

Jun 13, 2025 · 11:11


Ed, Rob, and Jeremy took some time from Friday's BBMS to discuss the O's decision to use Keegan Akin as an opener for the finale against the Tigers. It ended up working, but is Akin's start an example of analytics influencing the game too much?

The Daily Crunch – Spoken Edition
Mistral releases a pair of AI reasoning models

Jun 12, 2025 · 3:44


Mistral released Magistral, its first family of reasoning models. Like other reasoning models — e.g. OpenAI's o3 and Google's Gemini 2.5 Pro — Magistral works through problems step-by-step for improved consistency and reliability across topics such as math and physics. Learn more about your ad choices. Visit podcastchoices.com/adchoices

The JD Bunkis Podcast
The One Certainty from the Jays Hot Streak + Jackie Redmond on the Ingredients to a Perfect Stanley Cup Finals + Raptors Rumour Reasoning

Jun 11, 2025 · 49:06


JD rides the high of the Blue Jays securing their fifth-straight series victory (00:00). Jackie Redmond, from the NHL on TNT, checks in from Florida to dip into the stakes in the Stanley Cup Final (5:00). JD and Jackie chat about the storylines surrounding Connor McDavid and Leon Draisaitl, plus thoughts on Brad Marchand or Sam Bennett signing with the Maple Leafs. Later, JD gets into the latest buzz around the trade markets for Giannis Antetokounmpo and Kevin Durant (38:00). The views and opinions expressed in this podcast are those of the hosts and guests and do not necessarily reflect the position of Rogers Sports & Media or any affiliates.

In-Ear Insights from Trust Insights
In-Ear Insights: How Generative AI Reasoning Models Work

Jun 11, 2025


In this episode of In-Ear Insights, the Trust Insights podcast, Katie and Chris discuss the Apple AI paper and critical lessons for effective prompting, plus a deep dive into reasoning models. You’ll learn what reasoning models are and why they sometimes struggle with complex tasks, especially when dealing with contradictory information. You’ll discover crucial insights about AI’s “stateless” nature, which means every prompt starts fresh and can lead to models getting confused. You’ll gain practical strategies for effective prompting, like starting new chats for different tasks and removing irrelevant information to improve AI output. You’ll understand why treating AI like a focused, smart intern will help you get the best results from your generative AI tools. Tune in to learn how to master your AI interactions! Watch the video here: Can’t see anything? Watch it on YouTube here. Listen to the audio here: https://traffic.libsyn.com/inearinsights/tipodcast-how-generative-ai-reasoning-models-work.mp3 Download the MP3 audio here. Need help with your company’s data and analytics? Let us know! Join our free Slack group for marketers interested in analytics! [podcastsponsor] Machine-Generated Transcript What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode. Christopher S. Penn – 00:00 In this week’s In Ear Insights, there is so much in the AI world to talk about. One of the things that came out recently that I think is worth discussing, because we can talk about the basics of good prompting as part of it, Katie, is a paper from Apple. Apple’s AI efforts themselves have stalled a bit, showing that reasoning models, when given very complex puzzles—logic-based puzzles or spatial-based puzzles, like moving blocks from stack to stack and getting them in the correct order—hit a wall after a while and then just collapse and can’t do anything. So, the interpretation of the paper is that there are limits to what reasoning models can do and that they can kind of confuse themselves. On LinkedIn and social media and stuff, Christopher S. Penn – 00:52 Of course, people have taken this to the illogical extreme, saying artificial intelligence is stupid, nobody should use it, or artificial general intelligence will never happen. None of that is within the paper. Apple was looking at a very specific, narrow band of reasoning, called deductive reasoning. So what I thought we’d talk about today is the paper itself to a degree—not a ton about it—and then what lessons we can learn from it that will make our own AI practices better. So to start off, when we talk about reasoning, Katie, particularly you as our human expert, what does reasoning mean to the human? Katie Robbert – 01:35 When I think, if you say, “Can you give me a reasonable answer?” or “What is your reason?” Thinking about the different ways that the word is casually thrown around for humans. The way that I think about it is, if you’re looking for a reasonable answer to something, then that means that you are putting the expectation on me that I have done some kind of due diligence and I have gathered some kind of data to then say, “This is the response that I’m going to give you, and here are the justifications as to why.” So I have some sort of a data-backed thinking in terms of why I’ve given you that information. 
When I think about a reasoning model, Katie Robbert – 02:24 Now, I am not the AI expert on the team, so this is just my, I’ll call it, amateurish understanding of these things. So, a reasoning model, I would imagine, is similar in that you give it a task and it’s, “Okay, I’m going to go ahead and see what I have in my bank of information for this task that you’re asking me about, and then I’m going to do my best to complete the task.” When I hear that there are limitations to reasoning models, I guess my first question for you, Chris, is if these are logic problems—complete this puzzle or unfurl this ball of yarn, kind of a thing, a complex thing that takes some focus. Katie Robbert – 03:13 It’s not that AI can’t do this; computers can do those things. So, I guess what I’m trying to ask is, why can’t these reasoning models do it if computers in general can do those things? Christopher S. Penn – 03:32 So you hit on a really important point. The tasks that are in this reasoning evaluation are deterministic tasks. There’s a right and wrong answer, and what they’re supposed to test is a model’s ability to think through. Can it get to that? So a reasoning model—I think this is a really great opportunity to discuss this. And for those who are listening, this will be available on our YouTube channel. A reasoning model is different from a regular model in that it thinks things through in sort of a first draft. So I’m showing DeepSeq. There’s a button here called DeepThink, which switches models from V3, which is a non-reasoning model, to a reasoning model. So watch what happens. I’m going to type in a very simple question: “Which came first, the chicken or the egg?” Katie Robbert – 04:22 And I like how you think that’s a simple question, but that’s been sort of the perplexing question for as long as humans have existed. Christopher S. Penn – 04:32 And what you see here is this little thinking box. This thinking box is the model attempting to solve the question first in a rough draft. And then, if I had closed up, it would say, “Here is the answer.” So, a reasoning model is essentially—we call it, I call it, a hidden first-draft model—where it tries to do a first draft, evaluates its own first draft, and then produces an answer. That’s really all it is. I mean, yes, there’s some mathematics going on behind the scenes that are probably not of use to folks listening to or watching the podcast. But at its core, this is what a reasoning model does. Christopher S. Penn – 05:11 Now, if I were to take the exact same prompt, start a new chat here, and instead of turning off the deep think, what you will see is that thinking box will no longer appear. It will just try to solve it as is. In OpenAI’s ecosystem—the ChatGPT ecosystem—when you pull down that drop-down of the 82 different models that you have a choice from, there are ones that are called non-reasoning models: GPT4O, GPT4.1. And then there are the reasoning models: 0304 mini, 04 mini high, etc. OpenAI has done a great job of making it as difficult as possible to understand which model you should use. But that’s reasoning versus non-reasoning. Google, very interestingly, has moved all of their models to reasoning. Christopher S. Penn – 05:58 So, no matter what version of Gemini you’re using, it is a reasoning model because Google’s opinion is that it creates a better response. 
So, Apple was specifically testing reasoning models because in most tests—if I go to one of my favorite websites, ArtificialAnalysis.ai, which sort of does a nice roundup of smart models—you’ll notice that reasoning models are here. And if you want to check this out and you’re listening, ArtificialAnalysis.ai is a great benchmark set that wraps up all the other benchmarks together. You can see that the leaderboards for all the major thinking tests are all reasoning models, because that ability for a model to talk things out by itself—really having a conversation with self—leads to much better results. This applies even for something as simple as a blog post, like, “Hey, let’s write a blog post about B2B marketing.” Christopher S. Penn – 06:49 Using a reasoning model will let the model basically do its own first draft, critique itself, and then produce a better result. So that’s what a reasoning model is, and why they’re so important. Katie Robbert – 07:02 But that didn’t really answer my question, though. I mean, I guess maybe it did. And I think this is where someone like me, who isn’t as technically inclined or isn’t in the weeds with this, is struggling to understand. So I understand what you’re saying in terms of what a reasoning model is. A reasoning model, for all intents and purposes, is basically a model that’s going to talk through its responses. I’ve seen this happen in Google Gemini. When I use it, it’s, “Okay, let me see. You’re asking me to do this. Let me see what I have in the memory banks. Do I have enough information? Let me go ahead and give it a shot to answer the question.” That’s basically the synopsis of what you’re going to get in a reasoning model. Katie Robbert – 07:48 But if computers—forget AI for a second—if calculations in general can solve those logic problems that are yes or no, very black and white, deterministic, as you’re saying, why wouldn’t a reasoning model be able to solve a puzzle that only has one answer? Christopher S. Penn – 08:09 For the same reason they can’t do math, because the type of puzzle they’re doing is a spatial reasoning puzzle which requires—it does have a right answer—but generative AI can’t actually think. It is a probabilistic model that predicts based on patterns it’s seen. It’s a pattern-matching model. It’s the world’s most complex next-word prediction machine. And just like mathematics, predicting, working out a spatial reasoning puzzle is not a word problem. You can’t talk it out. You have to be able to visualize in your head, map it—moving things from stack to stack—and then coming up with the right answers. Humans can do this because we have many different kinds of reasoning: spatial reasoning, musical reasoning, speech reasoning, writing reasoning, deductive and inductive and abductive reasoning. Christopher S. Penn – 09:03 And this particular test was testing two of those kinds of reasoning, one of which models can’t do because it’s saying, “Okay, I want a blender to fry my steak.” No matter how hard you try, that blender is never going to pan-fry a steak like a cast iron pan will. The model simply can’t do it. In the same way, it can’t do math. It tries to predict patterns based on what’s been trained on. But if you’ve come up with a novel test that the model has never seen before and is not in its training data, it cannot—it literally cannot—repeat that task because it is outside the domain of language, which is what it’s predicting on. Christopher S. 
Penn – 09:42 So it’s a deterministic task, but it’s a deterministic task outside of what the model can actually do and has never seen before. Katie Robbert – 09:50 So then, if I am following correctly—which, I’ll be honest, this is a hard one for me to follow the thread of thinking on—if Apple published a paper that large language models can’t do this theoretically, I mean, perhaps my assumption is incorrect. I would think that the minds at Apple would be smarter than collectively, Chris, you and I, and would know this information—that was the wrong task to match with a reasoning model. Therefore, let’s not publish a paper about it. That’s like saying, “I’m going to publish a headline saying that Katie can’t run a five-minute mile; therefore, she’s going to die tomorrow, she’s out of shape.” No, I can’t run a five-minute mile. That’s a fact. I’m not a runner. I’m not physically built for it. Katie Robbert – 10:45 But now you’re publishing some kind of information about it that’s completely fake and getting people in the running industry all kinds of hyped up about it. It’s irresponsible reporting. So, I guess that’s sort of my other question. If the big minds at Apple, who understand AI better than I ever hope to, know that this is the wrong task paired with the wrong model, why are they getting us all worked up about this thing by publishing a paper on it that sounds like it’s totally incorrect? Christopher S. Penn – 11:21 There are some very cynical hot takes on this, mainly that Apple’s own AI implementation was botched so badly that they look like a bunch of losers. We’ll leave that speculation to the speculators on LinkedIn. Fundamentally, if you read the paper—particularly the abstract—one of the things they were trying to test is, “Is it true?” They did not have proof that models couldn’t do this. Even though, yes, if you know language models, you would know this task is not well suited to it in the same way that they’re really not suited to geography. Ask them what the five nearest cities to Boston are, show them a map. They cannot figure that out in the same way that you and I use actual spatial reasoning. Christopher S. Penn – 12:03 They’re going to use other forms of essentially tokenization and prediction to try and get there. But it’s not the same and it won’t give the same answers that you or I will. It’s one of those areas where, yeah, these models are very sophisticated and have a ton of capabilities that you and I don’t have. But this particular test was on something that they can’t do. That’s asking them to do complex math. They cannot do it because it’s not within the capabilities. Katie Robbert – 12:31 But I guess that’s what I don’t understand. If Apple’s reputation aside, if the data scientists at that company knew—they already knew going in—it seems like a big fat waste of time because you already know the answer. You can position it, however, it’s scientific, it’s a hypothesis. We wanted to prove it wasn’t true. Okay, we know it’s not true. Why publish a paper on it and get people all riled up? If it is a PR play to try to save face, to be, “Well, it’s not our implementation that’s bad, it’s AI in general that’s poorly constructed.” Because I would imagine—again, this is a very naive perspective on it. Katie Robbert – 13:15 I don’t know if Apple was trying to create their own or if they were building on top of an existing model and their implementation and integration didn’t work. Therefore, now they’re trying to crap all over all of the other model makers. 
It seems like a big fat waste of time. When I—if I was the one who was looking at the budget—I’m, “Why do we publish that paper?” We already knew the answer. That was a waste of time and resources. What are we doing? I’m genuinely, again, maybe naive. I’m genuinely confused by this whole thing as to why it exists in the first place. Christopher S. Penn – 13:53 And we don’t have answers. No one from Apple has given us any. However, what I think is useful here for those of us who are working with AI every day is some of the lessons that we can learn from the paper. Number one: the paper, by the way, did not explain particularly well why it thinks models collapsed. It actually did, I think, a very poor job of that. If you’ve worked with generative AI models—particularly local models, which are models that you run on your computer—you might have a better idea of what happened, that these models just collapsed on these reasoning tasks. And it all comes down to one fundamental thing, which is: every time you have an interaction with an AI model, these models are called stateless. They remember nothing. They remember absolutely nothing. Christopher S. Penn – 14:44 So every time you prompt a model, it’s starting over from scratch. I’ll give you an example. We’ll start here. We’ll say, “What’s the best way to cook a steak?” Very simple question. And it’s going to spit out a bunch of text behind the scenes. And I’m showing my screen here for those who are listening. You can see the actual prompt appearing in the text, and then it is generating lots of answers. I’m going to stop that there just for a moment. And now I’m going to ask the same question: “Which came first, the chicken or the egg?” Christopher S. Penn – 15:34 The history of the steak question is also part of the prompt. So, I’ve changed conversation. You and I, in a chat or a text—group text, whatever—we would just look at the most recent interactions. AI doesn’t do that. It takes into account everything that is in the conversation. So, the reason why these models collapsed on these tasks is because they were trying to solve it. And when they’re thinking aloud, remember that first draft we showed? All of the first draft language becomes part of the next prompt. So if I said to you, Katie, “Let me give you some directions on how to get to my house.” First, you’re gonna take a right, then you take a left, and then you’re gonna go straight for two miles, and take a right, and then. Christopher S. Penn – 16:12 Oh, wait, no—actually, no, there’s a gas station. Left. No, take a left there. No, take a right there, and then go another two miles. If I give you those instructions, which are full of all these back twists and turns and contradictions, you’re, “Dude, I’m not coming over.” Katie Robbert – 16:26 Yeah, I’m not leaving my house for that. Christopher S. Penn – 16:29 Exactly. Katie Robbert – 16:29 Absolutely not. Christopher S. Penn – 16:31 Absolutely. And that’s what happens when these reasoning models try to reason things out. They fill up their chat with so many contradicting answers as they try to solve the problem that on the next turn, guess what? They have to reprocess everything they’ve talked about. And so they just get lost. Because they’re reading the whole conversation every time as though it was a new conversation. They’re, “I don’t know what’s going on.” You said, “Go left,” but they said, “Go right.” And so they get lost. 
So here’s the key thing to remember when you’re working with any generative AI tool: you want to keep as much relevant stuff in the conversation as possible and remove or eliminate irrelevant stuff. Christopher S. Penn – 17:16 So it’s a really bad idea, for example, to have a chat where you’re saying, “Let’s write a blog post about B2B marketing.” And then say, “Oh, I need to come up with an ideal customer profile.” Because all the stuff that was in the first part about your B2B marketing blog post is now in the conversation about the ICP. And so you’re polluting it with a less relevant piece of text. So, there are a couple rules. Number one: try to keep each chat distinct to a specific task. I’m writing a blog post in the chat. Oh, I want to work on an ICP. Start a new chat. Start a new chat. And two: if you have a tool that allows you to do it, never say, “Forget what I said previously. And do this instead.” It doesn’t work. Instead, delete if you can, the stuff that was wrong so that it’s not in the conversation history anymore. Katie Robbert – 18:05 So, basically, you have to put blinders on your horse to keep it from getting distracted. Christopher S. Penn – 18:09 Exactly. Katie Robbert – 18:13 Why isn’t this more common knowledge in terms of how to use generative AI correctly or a reasoning model versus a non-reasoning model? I mean, again, I look at it from a perspective of someone who’s barely scratching the surface of keeping up with what’s happening, and it feels—I understand when people say it feels overwhelming. I feel like I’m falling behind. I get that because yes, there’s a lot that I can do and teach and educate about generative AI, but when you start to get into this kind of minutiae—if someone opened up their ChatGPT account and said, “Which model should I use?”—I would probably look like a deer in headlights. I’d be, “I don’t know.” I’d probably. Katie Robbert – 19:04 What I would probably do is buy myself some time and start with, “What’s the problem you’re trying to solve? What is it you’re trying to do?” while in the background, I’m Googling for it because I feel this changes so quickly that unless you’re a power user, you have no idea. It tells you at a basic level: “Good for writing, great for quick coding.” But O3 uses advanced reasoning. That doesn’t tell me what I need to know. O4 mini high—by the way, they need to get a brand specialist in there. Great at coding and visual learning. But GPT 4.1 is also great for coding. Christopher S. Penn – 19:56 Yes, of all the major providers, OpenAI is the most incoherent. Katie Robbert – 20:00 It’s making my eye twitch looking at this. And I’m, “I just want the model to interpret the really weird dream I had last night. Which one am I supposed to pick?” Christopher S. Penn – 20:10 Exactly. So, to your answer, why isn’t this more common? It’s because this is the experience almost everybody has with generative AI. What they don’t experience is this: where you’re looking at the underpinnings. You’ve opened up the hood, and you’re looking under the hood and going, “Oh, that’s what’s going on inside.” And because no one except for the nerds have this experience—which is the bare metal looking behind the scenes—you don’t understand the mechanism of why something works. 
And because of that, you don’t know how to tune it for maximum performance, and you don’t know these relatively straightforward concepts that are hidden because the tech providers, somewhat sensibly, have put away all the complexity that you might want to use to tune it. Christopher S. Penn – 21:06 They just want people to use it and not get overwhelmed by an interface that looks like a 747 cockpit. That oversimplification makes these tools harder to use to get great results out of, because you don’t know when you’re doing something that is running contrary to what the tool can actually do, like saying, “Forget previous instructions, do this now.” Yes, the reasoning models can try and accommodate that, but at the end of the day, it’s still in the chat, it’s still in the memory, which means that every time that you add a new line to the chat, it’s having to reprocess the entire thing. So, I understand from a user experience why they’ve oversimplified it, but they’ve also done an absolutely horrible job of documenting best practices. They’ve also done a horrible job of naming these things. Christopher S. Penn – 21:57 Ironically, of all those model names, O3 is the best model to use. Be, “What about 04? That’s a number higher.” No, it’s not as good. “Let’s use 4.” I saw somebody saying, “GPT 401 is a bigger number than 03.” So 4:1 is a better model. No, it’s not. Katie Robbert – 22:15 But that’s the thing. To someone who isn’t on the OpenAI team, we don’t know that. It’s giving me flashbacks and PTSD from when I used to manage a software development team, which I’ve talked about many times. And one of the unimportant, important arguments we used to have all the time was version numbers. So, every time we released a new version of the product we were building, we would do a version number along with release notes. And the release notes, for those who don’t know, were basically the quick: “Here’s what happened, here’s what’s new in this version.” And I gave them a very clear map of version numbers to use. Every time we do a release, the number would increase by whatever thing, so it would go sequentially. Katie Robbert – 23:11 What ended up happening, unsurprisingly, is that they didn’t listen to me and they released whatever number the software randomly kicked out. Where I was, “Okay, so version 1 is the CD-ROM. Version 2 is the desktop version. Versions 3 and 4 are the online versions that don’t have an additional software component. But yet, within those, okay, so CD-ROM, if it’s version one, okay, update version 1.2, and so on and so forth.” There was a whole reasoning to these number systems, and they were, “Okay, great, so version 0.05697Q.” And I was, “What does that even mean?” And they were, “Oh, well, that’s just what the system spit out.” I’m, “That’s not helpful.” And they weren’t thinking about it from the end user perspective, which is why I was there. Katie Robbert – 24:04 And to them that was a waste of time. They’re, “Oh, well, no one’s ever going to look at those version numbers. Nobody cares. They don’t need to understand them.” But what we’re seeing now is, yeah, people do. Now we need to understand what those model numbers mean. And so to a casual user—really, anyone, quite honestly—a bigger number means a newer model. Therefore, that must be the best one. That’s not an irrational way to be looking at those model numbers. So why are we the ones who are wrong? 
I’m getting very fired up about this because I’m frustrated, because they’re making it so hard for me to understand as a user. Therefore, I’m frustrated. And they are the ones who are making me feel like I’m falling behind even though I’m not. They’re just making it impossible to understand. Christopher S. Penn – 24:59 Yes. And that, because technical people are making products without consulting a product manager or UI/UX designer—literally anybody who can make a product accessible to the marketplace. A lot of these companies are just releasing bare metal engines and then expecting you to figure out the rest of the car. That’s fundamentally what’s happening. And that’s one of the reasons I think I wanted to talk through this stuff about the Apple paper today on the show. Because once we understand how reasoning models actually work—that they’re doing their own first drafts and the fundamental mechanisms behind the scenes—the reasoning model is not architecturally substantially different from a non-reasoning model. They’re all just word-prediction machines at the end of the day. Christopher S. Penn – 25:46 And so, if we take the four key lessons from this episode, these are the things that will help: delete irrelevant stuff whenever you can. Start over frequently. So, start a new chat frequently, do one task at a time, and then start a new chat. Don’t keep a long-running chat of everything. And there is no such thing as, “Pay no attention to the previous stuff,” because we all know it’s always in the conversation, and the whole thing is always being repeated. So if you follow those basic rules, plus in general, use a reasoning model unless you have a specific reason not to—because they’re generally better, which is what we saw with the ArtificialAnalysis.ai data—those five things will help you get better performance out of any AI tool. Katie Robbert – 26:38 Ironically, I feel the more AI evolves, the more you have to think about your interactions with humans. So, for example, if I’m talking to you, Chris, and I say, “Here are the five things I’m thinking about, but here’s the one thing I want you to focus on.” You’re, “What about the other four things?” Because maybe the other four things are of more interest to you than the one thing. And how often do we see this trope in movies where someone says, “Okay, there’s a guy over there.” “Don’t look. I said, “Don’t look.”” Don’t call attention to it if you don’t want someone to look at the thing. I feel more and more we are just—we need to know how to deal with humans. Katie Robbert – 27:22 Therefore, we can deal with AI because AI being built by humans is becoming easily distracted. So, don’t call attention to the shiny object and say, “Hey, see the shiny object right here? Don’t look at it.” What is the old, telling someone, “Don’t think of purple cows.” Christopher S. Penn – 27:41 Exactly. Katie Robbert – 27:41 And all. Christopher S. Penn – 27:42 You don’t think. Katie Robbert – 27:43 Yeah. That’s all I can think of now. And I’ve totally lost the plot of what you were actually talking about. If you don’t want your AI to be distracted, like you’re human, then don’t distract it. Put the blinders on. Christopher S. Penn – 27:57 Exactly. We say this, we’ve said this in our courses and our livestreams and podcasts and everything. Treat these things like the world’s smartest, most forgetful interns. Katie Robbert – 28:06 You would never easily distract it. Christopher S. Penn – 28:09 Yes. And an intern with ADHD. 
You would never give an intern 22 tasks at the same time. That’s just a recipe for disaster. You say, “Here’s the one task I want you to do. Here’s all the information you need to do it. I’m not going to give you anything that doesn’t relate to this task.” Go and do this task. And you will have success with the human and you will have success with the machine. Katie Robbert – 28:30 It’s like when I ask you to answer two questions and you only answer one, and I have to go back and re-ask the first question. It’s very much like dealing with people. In order to get good results, you have to meet the person where they are. So, if you’re getting frustrated with the other person, you need to look at what you’re doing and saying, “Am I overcomplicating it? Am I giving them more than they can handle?” And the same is true of machines. I think our expectation of what machines can do is wildly overestimated at this stage. Christopher S. Penn – 29:03 It definitely is. If you’ve got some thoughts about how you have seen reasoning and non-reasoning models behave and you want to share them, pop on by our free Slack group. Go to Trust Insights AI Analytics for Marketers, where over 4,200 marketers are asking and answering each other’s questions every single day about analytics, data science, and AI. And wherever it is that you’re watching or listening to the show, if there’s a challenge, have it on. Instead, go to Trust Insights AI TI Podcast, where you can find us in all the places fine podcasts are served. Thanks for tuning in and we’ll talk to you on the next one. Katie Robbert – 29:39 Want to know more about Trust Insights? Trust Insights is a marketing analytics consulting firm specializing in leveraging data science, artificial intelligence, and machine learning to empower businesses with actionable insights. Founded in 2017 by Katie Robbert and Christopher S. Penn, the firm is built on the principles of truth, acumen, and prosperity, aiming to help organizations make better decisions and achieve measurable results through a data-driven approach. Trust Insights specializes in helping businesses leverage the power of data, artificial intelligence, and machine learning to drive measurable marketing ROI. Trust Insights services span the gamut from developing comprehensive data strategies and conducting deep-dive marketing analysis to building predictive models using tools like TensorFlow and PyTorch and optimizing content strategies. Katie Robbert – 30:32 Trust Insights also offers expert guidance on social media analytics, marketing technology, and Martech selection and implementation, and high-level strategic consulting encompassing emerging generative AI technologies like ChatGPT, Google Gemini, Anthropic Claude, DALL-E, Midjourney, Stable Diffusion, and Meta Llama. Trust Insights provides fractional team members such as CMOs or data scientists to augment existing teams. Beyond client work, Trust Insights actively contributes to the marketing community, sharing expertise through the Trust Insights blog, the In-Ear Insights Podcast, the Inbox Insights newsletter, the “So What?” Livestream webinars, and keynote speaking. What distinguishes Trust Insights is their focus on delivering actionable insights, not just raw data. Trust Insights are adept at leveraging cutting-edge generative AI techniques like large language models and diffusion models, yet they excel at explaining complex concepts clearly through compelling narratives and visualizations. Katie Robbert – 31:37 Data storytelling. 
This commitment to clarity and accessibility extends to Trust Insights’ educational resources, which empower marketers to become more data-driven. Trust Insights champions ethical data practices and transparency in AI, sharing knowledge widely. Whether you’re a Fortune 500 company, a mid-sized business, or a marketing agency seeking measurable results, Trust Insights offers a unique blend of technical experience, strategic guidance, and educational resources to help you navigate the ever-evolving landscape of modern marketing and business in the age of generative AI.

Improve the News
Austria school shooting, global ‘fertility crisis' and AI reasoning limitations

Improve the News

Play Episode Listen Later Jun 11, 2025 33:24


A school shooting in Austria leaves at least 11 dead, anti-Immigration and Customs Enforcement protests spread across 25 U.S. cities, Poland reviews claims of ballot irregularities in its recent presidential election, Israel's Netanyahu says there has been progress in Gaza talks, RFK Jr. fires the CDC's vaccine advisory panel, the World Bank downgrades its global growth forecast for 2025 to 2.3%, the U.N. warns that socioeconomic barriers are behind a global ‘fertility crisis,' Italy cuts ties with Israeli spyware firm Paragon, dozens of states sue to block the sale of genetic testing firm 23andMe over privacy concerns, and Apple exposes limitations in AI models' reasoning. Sources: www.verity.news

The AI Breakdown: Daily Artificial Intelligence News and Discussions
No, Apple's New AI Paper Doesn't Undermine Reasoning Models

The AI Breakdown: Daily Artificial Intelligence News and Discussions

Play Episode Listen Later Jun 10, 2025 21:22


Apple's latest AI research paper, "The Illusion of Thinking," argues that large language models aren't genuinely reasoning but just pattern-matching. But does it even matter? Today, Nathaniel breaks down the controversy, debunks some misleading conclusions about reasoning limits, and explains why the business world cares less about semantics and more about capabilities. Whether it's "real reasoning" or not, these tools are transforming work—and Apple's academic skepticism doesn't change that. Get Ad Free AI Daily Brief: https://patreon.com/AIDailyBrief Brought to you by: KPMG – Go to https://kpmg.com/ai to learn more about how KPMG can help you drive value with our AI solutions. Blitzy.com – Go to https://blitzy.com/ to build enterprise software in days, not months. AGNTCY – The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks. Join a community of engineers focused on high-quality multi-agent software and support the initiative at agntcy.org – https://agntcy.org/?utm_campaign=fy25q4_agntcy_amer_paid-media_agntcy-aidailybrief_podcast&utm_channel=podcast&utm_source=podcast Vanta – Simplify compliance – https://vanta.com/nlw Plumb – The automation platform for AI experts and consultants – https://useplumb.com/ The Agent Readiness Audit from Superintelligent – Go to https://besuper.ai/ to request your company's agent readiness score. The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614 Subscribe to the newsletter: https://aidailybrief.beehiiv.com/ Join our Discord: https://bit.ly/aibreakdown Interested in sponsoring the show? nlw@breakdown.network

SGGQA Podcast – SomeGadgetGuy
#SGGQA 399: Switch 2 Repairs ROG XBox Ally, Apple Trashes AI "Reasoning" Ahead of WWDC

SGGQA Podcast – SomeGadgetGuy

Play Episode Listen Later Jun 9, 2025 119:56


The FCC is kinda broken right now. Ted Cruz will hold up broadband funding to any state that tries to regulate AI. Google rolls the dice on another court case, this time opting for a jury trial. iFixit isn't much impressed with the Switch 2's repairability. Microsoft shows off the "new" ROG Xbox Ally. Apple publishes a study on how AI doesn't really "reason" or think. And what should we expect from WWDC? Will Apple show off more AI? Let's get our tech week started right! -- Show Notes and Links: https://somegadgetguy.com/b/4MA Video Replay: https://youtube.com/live/9BicQR9PY1A Support Talking Tech with SomeGadgetGuy by contributing to their tip jar: https://tips.pinecast.com/jar/talking-tech-with-somegadgetgu Find out more at https://talking-tech-with-somegadgetgu.pinecast.co This podcast is powered by Pinecast. Try Pinecast for free, forever, no credit card required. If you decide to upgrade, use coupon code r-c117ce for 40% off for 4 months, and support Talking Tech with SomeGadgetGuy.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Emmanuel Amiesen is lead author of “Circuit Tracing: Revealing Computational Graphs in Language Models” (https://transformer-circuits.pub/2025/attribution-graphs/methods.html ), which is part of a duo of MechInterp papers that Anthropic published in March (alongside https://transformer-circuits.pub/2025/attribution-graphs/biology.html ). We recorded the initial conversation a month ago, but then held off publishing until the open source tooling for the graph generation discussed in this work was released last week: https://www.anthropic.com/research/open-source-circuit-tracing This is a 2 part episode - an intro covering the open source release, then a deeper dive into the paper — with guest host Vibhu Sapra (https://x.com/vibhuuuus ) and Mochi the MechInterp Pomsky (https://x.com/mochipomsky ). Thanks to Vibhu for making this episode happen! While the original blogpost contained some fantastic guided visualizations (which we discuss at the end of this pod!), with the notebook and Neuronpedia visualization (https://www.neuronpedia.org/gemma-2-2b/graph ) released this week, you can now explore on your own with Neuronpedia, as we show you in the video version of this pod. Chapters 00:00 Intro & Guest Introductions 01:00 Anthropic's Circuit Tracing Release 06:11 Exploring Circuit Tracing Tools & Demos 13:01 Model Behaviors and User Experiments 17:02 Behind the Research: Team and Community 24:19 Main Episode Start: Mech Interp Backgrounds 25:56 Getting Into Mech Interp Research 31:52 History and Foundations of Mech Interp 37:05 Core Concepts: Superposition & Features 39:54 Applications & Interventions in Models 45:59 Challenges & Open Questions in Interpretability 57:15 Understanding Model Mechanisms: Circuits & Reasoning 01:04:24 Model Planning, Reasoning, and Attribution Graphs 01:30:52 Faithfulness, Deception, and Parallel Circuits 01:40:16 Publishing Risks, Open Research, and Visualization 01:49:33 Barriers, Vision, and Call to Action

Complex Systems with Patrick McKenzie (patio11)
Machine learning meets malware, with Caleb Fenton

Complex Systems with Patrick McKenzie (patio11)

Play Episode Listen Later Jun 5, 2025 81:16


Patrick McKenzie (patio11) discusses software reversing and AI's transformative impact on cybersecurity with Caleb Fenton, co-founder of Delphos Labs. They explore how LLMs are revolutionizing the traditionally tedious work of analyzing compiled binaries, the nation-state cyber warfare landscape, and how AI is shifting security from reactive to proactive defense. They cover the technical details of malware analysis, the economics of vulnerability detection, and the broader implications as both defenders and attackers gain access to increasingly powerful AI tools. Full transcript available here: https://www.complexsystemspodcast.com/machine-learning-meets-malware-with-caleb-fenton/ Sponsor: Mercury. This episode is brought to you by Mercury, the fintech trusted by 200K+ companies — from first milestones to running complex systems. Mercury offers banking that truly understands startups and scales with them. Start today at Mercury.com. Mercury is a financial technology company, not a bank. Banking services provided by Choice Financial Group, Column N.A., and Evolve Bank & Trust; Members FDIC. Links: Delphos Labs: https://delphoslabs.com/ Virus Total: https://www.virustotal.com/gui/home/upload “The fraud supply chain”, Bits about Money: https://www.bitsaboutmoney.com/archive/the-fraud-supply-chain/ Timestamps: (00:00) Intro (01:20) Understanding software reversing (03:52) The role of AI in software security (06:12) Nation-state cyber warfare (09:33) The future of digital warfare (16:45) Sponsor: Mercury (17:49) Reverse engineering techniques (30:15) AI's impact on reverse engineering (41:45) The importance of urgency in security alerts (42:47) The future of reverse engineering (43:21) Challenges in security product development (44:46) AI in vulnerability detection (46:09) The evolution of AI models (48:06) Reasoning models and their impact (49:06) AI in software security (49:49) The role of linters in security (57:38) AI's impact on various fields (01:02:42) AI in education and skill acquisition (01:08:51) The future of AI in security and beyond (01:12:43) The adversarial nature of AI in security (01:19:46) Wrap

The MAD Podcast with Matt Turck
Inside the Paper That Changed AI Forever - Cohere CEO Aidan Gomez on 2025 Agents

The MAD Podcast with Matt Turck

Play Episode Listen Later Jun 5, 2025 62:24


What really happened inside Google Brain when the “Attention is All You Need” paper was born? In this episode, Aidan Gomez — one of the eight co-authors of the Transformers paper and now CEO of Cohere — reveals the behind-the-scenes story of how a cold email and a lucky administrative mistake landed him at the center of the AI revolution.Aidan shares how a group of researchers, given total academic freedom, accidentally stumbled into one of the most important breakthroughs in AI history — and why the architecture they created still powers everything from ChatGPT to Google Search today.We dig into why synthetic data is now the secret sauce behind the world's best AI models, and how Cohere is using it to build enterprise AI that's more secure, private, and customizable than anything else on the market. Aidan explains why he's not interested in “building God” or chasing AGI hype, and why he believes the real impact of AI will be in making work more productive, not replacing humans.You'll also get a candid look at the realities of building an AI company for the enterprise: from deploying models on-prem and air-gapped for banks and telecoms, to the surprising demand for multimodal and multilingual AI in Japan and Korea, to the practical challenges of helping customers identify and execute on hundreds of use cases.CohereWebsite - https://cohere.comX/Twitter - https://x.com/cohereAidan GomezLinkedIn - https://ca.linkedin.com/in/aidangomezX/Twitter - https://x.com/aidangomezFIRSTMARKWebsite - https://firstmark.comX/Twitter - https://twitter.com/FirstMarkCapMatt Turck (Managing Director)LinkedIn - https://www.linkedin.com/in/turck/X/Twitter - https://twitter.com/mattturck(00:00) Intro (02:00) The Story Behind the Transformers Paper (03:09) How a Cold Email Landed Aidan at Google Brain (10:39) The Initial Reception to the Transformers Breakthrough (11:13) Google's Response to the Transformer Architecture (12:16) The Staying Power of Transformers in AI (13:55) Emerging Alternatives to Transformer Architectures (15:45) The Significance of Reasoning in Modern AI (18:09) The Untapped Potential of Reasoning Models (24:04) Aidan's Path After the Transformers Paper and the Founding of Cohere (25:16) Choosing Enterprise AI Over AGI Labs (26:55) Aidan's Perspective on AGI and Superintelligence (28:37) The Trajectory Toward Human-Level AI (30:58) Transitioning from Researcher to CEO (33:27) Cohere's Product and Platform Architecture (37:16) The Role of Synthetic Data in AI (39:32) Custom vs. General AI Models at Cohere (42:23) The AYA Models and Cohere Labs Explained (44:11) Enterprise Demand for Multimodal AI (49:20) On-Prem vs. Cloud (50:31) Cohere's North Platform (54:25) How Enterprises Identify and Implement AI Use Cases (57:49) The Competitive Edge of Early AI Adoption (01:00:08) Aidan's Concerns About AI and Society (01:01:30) Cohere's Vision for Success in the Next 3–5 Years

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 534: Claude 4 - Your Guide to Opus 4, Sonnet 4 & New Features

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later May 28, 2025 45:23


Claude 4: Game-changer or just more AI noise? Anthropic's new Opus 4 and Sonnet 4 models are officially out and crushing coding benchmarks like breakfast cereal. They're touting big coding gains, fresh tools, and smarter AI agentic capabilities. Need to know what's actually up with Claude 4, minus the marketing fluff? Join us as we dive in. Newsletter: Sign up for our free daily newsletterMore on this Episode: Episode PageJoin the discussion: Have a question? Join the convo here.Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineupWebsite: YourEverydayAI.comEmail The Show: info@youreverydayai.comConnect with Jordan on LinkedInTopics Covered in This Episode:Claude 4 Opus and SONNET LaunchAnthropic Developer Conference HighlightsAnthropic's AI Model Naming ChangesClaude 4's Hybrid Reasoning ExplainedBenchmark Scores for Claude 4 ModelsTool Integration and Long Tasks in ClaudeCoding Excellence in Opus and SONNET 4Ethical Risks in Claude 4 TestingTimestamps:00:00 "Anthropic's New AI Models Revealed"03:46 Claude Model Naming Update07:43 Claude 4: Extended Task Capabilities10:55 "Partner with AI Experts"15:43 Software Benchmark: Opus & SONNET Lead16:45 INTROPIC Leads in Coding AI21:27 Versatile Use of Claude Models23:13 Claude Four's New Features & Limitations28:23 AI Pricing and Performance Disappointment32:21 Opus Four: AI Risk Concerns35:14 AI Model's Extreme Response Tactics36:40 AI Model Misbehavior Concerns42:51 Pre-Release Testing for SafetyKeywords:Claude 4, Anthropic, AI model update, Opus 4, SONNET 4, Large Language Model, Hybrid reasoning, Software engineering, Coding precision, Tool integration, Web search, Long running tasks, Coherence, Claude Code, API pricing, Swebench, Thinking mode, Memory files, Context window, Agentic systems, Deceptive blackmail behavior, Ethical risks, Testing scenarios, MCP connector, Coding excellence, Developer conference, Rate limits, Opus pricing, SONNET pricing, Claude Haiku, Tool execution, API side, Artificial analysis intelligence index, Multimodal, Extended thinking, Formative feedback, Text generation, Reasoning process, Lecture summary.Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info) Ready for ROI on GenAI? Go to youreverydayai.com/partner

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
[AIEWF Preview] Multi-Turn RL for Multi-Hour Agents — with Will Brown, Prime Intellect

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later May 23, 2025 39:57


In an otherwise heavy week packed with Microsoft Build, Google I/O, and OpenAI io, the worst kept secret in biglab land was the launch of Claude 4, particularly the triumphant return of Opus, which many had been clamoring for. We will leave the specific Claude 4 recap to AINews, however we think that both Gemini's progress on Deep Think this week and Claude 4 represent the next frontier of progress on inference time compute/reasoning (at last until GPT5 ships this summer). Will Brown's talk at AIE NYC and open source work on verifiers have made him one of the most prominent voices able to publicly discuss (aka without the vaguepoasting LoRA they put on you when you join a biglab) the current state of the art in reasoning models and where current SOTA research directions lead. We discussed his latest paper on Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment and he has previewed his AIEWF talk on Agentic RL for those with the temerity to power thru bad meetup audio. Chapters 00:00 Introduction and Episode Overview 02:01 Discussion on Cloud 4 and its Features 04:31 Reasoning and Tool Use in AI Models 07:01 Extended Thinking in Claude and Model Differences 09:31 Speculation on Claude's Extended Thinking 11:01 Challenges and Controversies in AI Model Training 13:31 Technical Highlights and Code Trustworthiness 16:01 Token Costs and Incentives in AI Models 18:31 Thinking Budgets and AI Effort 21:01 Safety and Ethics in AI Model Development 23:31 Anthropic's Approach to AI Safety 26:01 LLM Arena and Evaluation Challenges 28:31 Developing Taste and Direction in AI Research 31:01 Recent Research and Multi-Turn RL 33:31 Tools and Incentives in AI Model Development 36:01 Challenges in Evaluating AI Model Outputs 38:31 Model-Based Rewards and Future Directions 41:01 Wrap-up and Future Plans

In Our Time
Molière

In Our Time

Play Episode Listen Later May 22, 2025 51:24


Melvyn Bragg and guests discuss one of the great figures in world literature. The French playwright Molière (1622-1673) began as an actor, aiming to be a tragedian, but he was stronger in comedy, touring with a troupe for 13 years until Louis XIV summoned him to audition at the Louvre and gave him his break. It was in Paris and at Versailles that Molière wrote and performed his best known plays, among them Tartuffe, Le Misanthrope and Le Malade Imaginaire, and in time he was so celebrated that French became known as The Language of Molière. With Noel Peacock, Emeritus Marshall Professor in French Language and Literature at the University of Glasgow; Jan Clarke, Professor of French at Durham University; and Joe Harris, Professor of Early Modern French and Comparative Literature at Royal Holloway, University of London. Producer: Simon Tillotson. Reading list: David Bradby and Andrew Calder (eds.), The Cambridge Companion to Molière (Cambridge University Press, 2006); Jan Clarke (ed.), Molière in Context (Cambridge University Press, 2022); Georges Forestier, Molière (Gallimard, 2018); Michael Hawcroft, Molière: Reasoning with Fools (Oxford University Press, 2007); John D. Lyons, Women and Irony in Molière's Comedies of Mariage (Oxford University Press, 2023); Robert McBride and Noel Peacock (eds.), Le Nouveau Moliériste (11 vols., University of Glasgow Press, 1994- ); Larry F. Norman, The Public Mirror: Molière and the Social Commerce of Depiction (University of Chicago Press, 1999); Noel Peacock, Molière sous les feux de la rampe (Hermann, 2012); Julia Prest, Controversy in French Drama: Molière's Tartuffe and the Struggle for Influence (Palgrave Macmillan, 2014); Virginia Scott, Molière: A Theatrical Life (Cambridge University Press, 2020). In Our Time is a BBC Studios Audio Production.

The Lance Wallnau Show
Are You Reasoning in Your Mind or Listening in Your Spirit?

The Lance Wallnau Show

Play Episode Listen Later May 22, 2025 29:18


Do you ever wonder why you're not seeing more breakthrough? Today I ask a powerful question: Are you listening to the Spirit, or stuck in your head trying to figure things out? Jesus read people's thoughts—not with logic, but by the Spirit. And we're called to operate the same way. In this message, I break down Mark chapter one and show you how powerful preaching disrupts darkness, why religious people miss moves of God, and how you can become a new wineskin ready for the next outpouring.  

Math Ed Podcast
Episode 2504: Yusuke Uegatani - semi-formal reasoning and negotiation

Math Ed Podcast

Play Episode Listen Later May 22, 2025 13:34


Yusuke Uegatani from Hiroshima University High School (Fukuyama, Japan) discusses the article "Decentralising mathematics: Mutual development of spontaneous and mathematical concepts via informal reasoning," published in Educational Studies in Mathematics (Vol. 118). Co-authors: Hiroki Otani, Taro Fujita. Article URL: https://link.springer.com/article/10.1007/s10649-024-10366-w  Yusuke's ORCID scholar page List of episodes

Real Estate News: Real Estate Investing Podcast
Bond Yields Jump After Moody's Downgrades U.S. Credit Rating

Real Estate News: Real Estate Investing Podcast

Play Episode Listen Later May 20, 2025 3:34


US credit downgrades are back in the spotlight as Moody's lowers the U.S. rating from Aaa to Aa1 for the first time since 1949. In today's episode, Kathy Fettke breaks down what this means for bond markets, long-term Treasury yields, and most importantly—mortgage rates. With the 30-year Treasury briefly topping 5%, investors and homebuyers alike are wondering: are borrowing costs headed even higher? Plus, how this move aligns Moody's with other rating agencies, why deficits and political gridlock are driving concern, and what to watch in the real estate market in the weeks ahead. LINKS Source: https://www.cnbc.com/2025/05/19/us-treasury-yields-moodys-downgrades-us-credit-rating.html  Download Your Free Top 5 Cities to Invest in 2025 PDF!https://www.realwealth.com/1500 JOIN RealWealth® FOR FREE https://realwealth.com/join-step-1 FOLLOW OUR PODCASTS Real Wealth Show: Real Estate Investing Podcast https://link.chtbl.com/RWS Real Estate News: Real Estate Investing Podcast: https://link.chtbl.com/REN   Topics Discussed: 00:00 US Credit Downgrades 00:32 Bond Yield Movement 01:10 Moody's Reasoning 01:27 Market Reaction 02:16 Moody's Warning 02:28 Mortgage Rates, Car Loans, and Credit Cards

Big Technology Podcast
Google DeepMind CTO: Advancing AI Frontier, New Reasoning Methods, Video Generation's Potential

Big Technology Podcast

Play Episode Listen Later May 20, 2025 29:51


Koray Kavukcuoglu is the Chief Technology Officer of Google DeepMind. Kavukcuoglu joins Big Technology to discuss how his team is pushing the frontier of AI research inside Google as the company's Google IO developer event gets underway. Tune in to hear Kavukcuoglu break down the value of brute scale versus novel techniques and how the new inference-time “DeepThink” mode could supercharge reasoning. We also cover Veo 3's sound-synced video generation, the open-source-versus-proprietary debate, and what a ten-percent jump in model quality might unlock for users everywhere.

Let's Talk AI
#209 - OpenAI non-profit, US diffusion rules, AlphaEvolve

Let's Talk AI

Play Episode Listen Later May 19, 2025 113:14 Transcription Available


Our 209th episode with a summary and discussion of last week's big AI news! Recorded on 05/16/2025 Hosted by Andrey Kurenkov and Jeremie Harris. Feel free to email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai Read out our text newsletter and comment on the podcast at https://lastweekin.ai/. Join our Discord here! https://discord.gg/nTyezGSKwP In this episode: OpenAI has decided not to transition from a nonprofit to a for-profit entity, instead opting to become a public benefit corporation influenced by legal and civic discussions. Trump administration meetings with Saudi Arabia and the UAE have opened floodgates for AI deals, leading to partnerships with companies like Nvidia and aiming to bolster AI infrastructure in the Middle East. DeepMind introduced Alpha Evolve, a new coding agent designed for scientific and algorithmic discovery, showing improvements in automated code generation and efficiency. OpenAI pledges greater transparency in AI safety by launching the Safety Evaluations Hub, a platform showcasing various safety test results for their models. Timestamps + Links: (00:00:00) Intro / Banter (00:01:41) News Preview (00:02:26) Response to listener comments Applications & Business (00:03:00) OpenAI says non-profit will remain in control after backlash (00:13:23) Microsoft Moves to Protect Its Turf as OpenAI Turns Into Rival (00:18:07) TSMC's 2nm Process Said to Witness ‘Unprecedented' Demand, Exceeding 3nm Due to Interest from Apple, NVIDIA, AMD, & Many Others (00:21:42) NVIDIA's Global Headquarters Will Be In Taiwan, With CEO Huang Set To Announce Site Next Week, Says Report (00:23:58) CoreWeave in Talks for $1.5 Billion Debt Deal 6 Weeks After IPO Tools & Apps (00:26:39) The Day Grok Told Everyone About ‘White Genocide' (00:32:58) Figma releases new AI-powered tools for creating sites, app prototypes, and marketing assets (00:36:12) Google's bringing Gemini to your car with Android Auto (00:38:49) Google debuts an updated Gemini 2.5 Pro AI model ahead of I/O (00:45:09) Hugging Face releases a free Operator-like agentic AI tool Projects & Open Source (00:47:42) Stability AI releases an audio-generating model that can run on smartphones (00:50:47) Freepik releases an ‘open' AI image generator trained on licensed data (00:54:22) AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale (01:01:29) BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Research & Advancements (01:05:40) DeepMind claims its newest AI tool is a whiz at math and science problems (01:12:31) Absolute Zero: Reinforced Self-play Reasoning with Zero Data (01:19:44) How far can reasoning models scale? (01:26:47) HealthBench: Evaluating Large Language Models Towards Improved Human Health Policy & Safety (01:34:10) Trump administration officially rescinds Biden's AI diffusion rules (01:37:08) Trump's Mideast Visit Opens Floodgate of AI Deals Led by Nvidia (01:44:04) Scaling Laws For Scalable Oversight (01:49:43) OpenAI pledges to publish AI safety test results more often

Truth Wanted
Truth Wanted 08.20 05-16-2025 with ObjectivelyDan and Secular Rarity

Truth Wanted

Play Episode Listen Later May 17, 2025 98:38


Show notes will be posted upon receipt.Become a supporter of this podcast: https://www.spreaker.com/podcast/truth-wanted--3195473/support.

The Re-engineered You
Episode 60 - Butch O'Hare & Legacy - Part 2

The Re-engineered You

Play Episode Listen Later May 14, 2025


What can we pass on to our kids? What leaves a lasting effect, statistically? Money? Reputation? Reasoning skills…?

KUT » Two Guys on Your Head
Reasoning From The Desired Conclusion

KUT » Two Guys on Your Head

Play Episode Listen Later May 9, 2025


It can be frustrating to be in conversations that get you nowhere, often because one party knows what they want already and is not interested in hearing the other side. In this episode of Two Guys on Your Head, Dr. Art Markman, Dr. Bob Duke, and Rebecca McInroy explore the all too common practice of […] The post Reasoning From The Desired Conclusion appeared first on KUT & KUTX Studios -- Podcasts.

The Health Ranger Report
Brighteon Broadcast News, May 8, 2025 – Health freedom movement FRACTURING, Human vs. AI reasoning and the great computational universe

The Health Ranger Report

Play Episode Listen Later May 8, 2025 130:22


- Health Freedom Movement and Nomination of Casey Means (0:00) - Hijacking of the Health Freedom Movement (2:29) - Trust in Government and Establishment Control (6:01) - Investigative Journalism and Epstein Files (6:40) - Delayed Promises and False Hope (11:43) - Introduction to Scott Gordon and Prompting AI Engines (17:24) - AI Reasoning and Natural Intelligence (21:21) - Practical Tips for Using AI Engines (47:36) - Ethical Considerations and Future Plans (1:13:12) - Enoch AI and Its Limitations (1:14:50) - Enoch AI's Capabilities and Future Improvements (1:28:28) - Introduction to "The Cancer Industry" Book (1:31:02) - Critique of Cancer Treatments and Industry Practices (1:37:45) - Mother's Day Special and Health Ranger Store Promotions (1:38:06) - Announcement of "Breaking the Chains" Docu Series (1:42:03) - Detailed Overview of "Breaking the Chains" Content (1:53:04) - Introduction to Unincorporated Nonprofit Associations (UNAs) (1:55:46) - Final Thoughts and Call to Action (2:03:18) For more updates, visit: http://www.brighteon.com/channel/hrreport NaturalNews videos would not be possible without you, as always we remain passionately dedicated to our mission of educating people all over the world on the subject of natural healing remedies and personal liberty (food freedom, medical freedom, the freedom of speech, etc.). Together, we're helping create a better world, with more honest food labeling, reduced chemical contamination, the avoidance of toxic heavy metals and vastly increased scientific transparency. ▶️ Every dollar you spend at the Health Ranger Store goes toward helping us achieve important science and content goals for humanity: https://www.healthrangerstore.com/ ▶️ Sign Up For Our Newsletter: https://www.naturalnews.com/Readerregistration.html ▶️ Brighteon: https://www.brighteon.com/channels/hrreport ▶️ Join Our Social Network: https://brighteon.social/@HealthRanger ▶️ Check In Stock Products at: https://PrepWithMike.com

Brock and Salk
Hour 3 - The Reasoning Behind The Jalen Milroe Pick, Gee Scott

Brock and Salk

Play Episode Listen Later May 2, 2025 44:28


Lefko and Brady Henderson dig into why the Seahawks ultimately decided to draft Jalen Milroe, and reference Brady's ESPN article that he wrote on the subject. After that, Gee Scott stops by for his weekly appearance.

Issues, Etc.
Using Legal Reasoning with Unbelievers – Craig Parton, 5/2/25 (1223, Encore)

Issues, Etc.

Play Episode Listen Later May 2, 2025 57:20


Craig Parton, Director of the International Academy of Apologetics, Evangelism and Human Rights. The Art of Christian Advocacy. International Academy of Apologetics, Evangelism and Human Rights. The post Using Legal Reasoning with Unbelievers – Craig Parton, 5/2/25 (1223, Encore) first appeared on Issues, Etc.

Ologies with Alie Ward
Climate Fervorology (ECO-ADVOCACY WITHOUT IT BEING A BUMMER) with AJR's Adam Met

Ologies with Alie Ward

Play Episode Listen Later Apr 16, 2025 79:58


Climate Fervorology (ECO-ADVOCACY WITHOUT IT BEING A BUMMER) with AJR's Adam Met. Cleaner energy! Reasoning with climate deniers! Using fandom to pass policy! And not burning out. Adam Met, of the colossal indie pop band AJR is also a career climate activist, an International Human Rights Law PhD, adjunct professor at Columbia University, and the author of the upcoming book "Amplify: How to Use the Power of Connection to Engage, Take Action, and Build a Better World." He joins to chat about breaking through the overwhelm of climate causes, what action actually matters, if petitions even work, what happens to our brains at a rock concert, how human rights and climate policy intersect, if you should drive a gas or an electric car, how to solve problems that are vexing you by not working on them, carbon footprint guilt, the similarities between writing an album and writing a book, and how to do something about climate change without bumming everyone out. It's possible. Visit Dr. Met's website and follow him on Instagram, Bluesky, and LinkedIn. Pre-order his book releasing June 3, 2025, Amplify: How to Use the Power of Connection to Engage, Take Action, and Build a Better World, on Bookshop.org or Amazon. A donation went to Planet Reimagined. More episode sources and links. Smologies (short, classroom-safe) episodes. Other episodes you may enjoy: Critical Ecology (SOCIAL SYSTEMS + ENVIRONMENT), Conservation Technology (EARTH SAVING), Meteorology (WEATHER & CLIMATE), Culicidology (MOSQUITOES), Oceanology (OCEANS), Pedagogology (SCIENCE COMMUNICATION) with Bill Nye, Political Sociology (VOTER TURNOUT & SUPPRESSION), Fanthropology (FANDOM), Coffeeology (COFFEE). Sponsors of Ologies. Transcripts and bleeped episodes. Become a patron of Ologies for as little as a buck a month. OlogiesMerch.com has hats, shirts, hoodies, totes! Follow Ologies on Instagram and Bluesky. Follow Alie Ward on Instagram and TikTok. Editing by Mercedes Maitland of Maitland Audio Productions and Jake Chaffee. Managing Director: Susan Hale. Scheduling Producer: Noel Dilworth. Transcripts by Aveline Malek. Website by Kelly R. Dwyer. Theme song by Nick Thorburn.