Podcasts about Inference

  • 521 PODCASTS
  • 939 EPISODES
  • 43m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Oct 24, 2025 LATEST



Latest podcast episodes about Inference

TechCrunch Startups – Spoken Edition
Tensormesh raises $4.5M to squeeze more inference out of AI server loads; also, Palantir enters $200M partnership with telco Lumen

TechCrunch Startups – Spoken Edition

Oct 24, 2025 · 6:35


Tensormesh uses an expanded form of KV caching to make inference loads as much as ten times more efficient. Plus, Palantir said on Thursday it had struck a partnership with Lumen Technologies that will see the telecommunications company using the data management company's AI software to build capabilities to support enterprise AI services.
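KV caching, the technique the episode builds on, means a decoder stores the attention keys and values already computed for earlier tokens, so each new token is projected once and appended rather than recomputing the whole prefix. A minimal single-head sketch in plain NumPy (the shapes and random weights are illustrative assumptions, not Tensormesh's implementation):

```python
import numpy as np

def attend(q, K, V):
    # Scaled dot-product attention of one query over all cached positions.
    scores = q @ K.T / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V

d = 4
rng = np.random.default_rng(0)
Wk, Wv = rng.normal(size=(d, d)), rng.normal(size=(d, d))

K_cache = np.empty((0, d))  # grows by one row per generated token
V_cache = np.empty((0, d))

for step in range(3):
    x = rng.normal(size=d)  # embedding of the newest token
    # Without the cache we would re-project every past token here;
    # with it, we project only the new token and append.
    K_cache = np.vstack([K_cache, x @ Wk])
    V_cache = np.vstack([V_cache, x @ Wv])
    out = attend(x, K_cache, V_cache)

print(K_cache.shape)  # one cached key per decoded token: (3, 4)
```

Serving systems extend this idea by sharing and offloading these cached tensors across requests, which is where the claimed efficiency gains come from.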

Edge of the Web - An SEO Podcast for Today's Digital Marketer
772 | Unpacking LLMs.txt with Carolyn Shelby

Edge of the Web - An SEO Podcast for Today's Digital Marketer

Oct 16, 2025 · 42:54


Erin welcomes Carolyn Shelby, the Principal SEO at Yoast and a renowned authority in technical and enterprise SEO. Carolyn brings decades of hands-on experience from her pioneering days in digital marketing, working with brands like Disney's ESPN, Tribune Publishing, and major nonprofits. The conversation kicks off with a surprising twist—Carolyn's unique title as Queen of the micronation Ladonia—before diving into her role at Yoast and their latest innovation: the LLMs.txt file generator. Carolyn explains how this new file helps websites communicate their most valuable content directly to large language models like ChatGPT and Google's AI, streamlining the way future search agents discover and answer questions with information from your site. We explore what inspired Yoast's push to roll out LLMs.txt to over 13 million sites, what website owners should include in their files, potential industry pushback, the adoption challenge with search giants, and how this moment could change the way websites optimize for AI-driven search results.
Key Segments:
[00:01:46] Introducing Carolyn Shelby, Senior SEO at Yoast
[00:03:09] Queen of the Micronation Ladonia?
[00:07:56] What is the LLMs.txt file?
[00:08:59] LLMs.txt is a Treasure Map
[00:14:38] A New File, along with Robots.txt and Sitemap.xml
[00:15:41] What inspired Yoast to create this LLM text file?
[00:17:12] EDGE of the Web Sponsor: PreWriter.AI
[00:18:22] LLMs.txt proposed by Jeremy Howard (Sept 2024)
[00:22:37] Standard Uniformity and Acceptance?
[00:24:43] Housekeeping
[00:29:37] LLM Markdown Effort Questioned: Exploitation?
[00:31:41] LLMs Lack Memory at Inference
[00:34:07] EDGE of The Web Sponsor: Inlinks (WAIKAY)
[00:36:09] Pushback on the LLMs.txt file
Thanks to Our Sponsors!
PreWriter.AI: https://edgeofthewebradio.com/prewriter
Inlinks WAIKAY: https://edgeofthewebradio.com/waikay
Follow Our Guest:
Twitter: @cshel
LinkedIn: https://www.linkedin.com/in/cshel/
Resources:
Learn about Ladonia (DONATE!): https://www.ladonia.org/about/
Carolyn's Posts on LLMs.txt:
https://www.cshel.com/ai-seo/how-llms-interpret-content-structuring-for-ai-search-unedited-version/
https://searchengineland.com/llms-txt-isnt-robots-txt-its-a-treasure-map-for-ai-456586
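For readers who haven't seen one, an llms.txt file (per the Jeremy Howard proposal discussed in the episode) is a Markdown file served from a site's root that points language models at the site's most valuable content: an H1 with the site name, a blockquote summary, and H2 sections of annotated links. A minimal illustrative example (the site name and URLs below are hypothetical):

```markdown
# Example Widgets Co.

> Documentation and guides for the hypothetical Example Widgets product line.

## Docs

- [Getting started](https://example.com/docs/start.md): install and first run
- [API reference](https://example.com/docs/api.md): endpoints and authentication

## Optional

- [Company blog](https://example.com/blog.md): announcements and tutorials
```

Unlike robots.txt, which tells crawlers what to avoid, this file curates what an AI agent should read first—hence the "treasure map" framing in the episode.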

Crafting Solutions to Conflict
To infer and to imply, part one

Crafting Solutions to Conflict

Oct 16, 2025 · 5:32


My most recent guest, Gerry O'Sullivan, talked with me about her process, The Journey of Inference. As she puts it succinctly: “Our Journey of Inference interprets the world of observable data according to our unique perspective or paradigm.” It's clear from Gerry's process and our conversation that our inferences can get us into trouble, precisely because we each carry a unique perspective or paradigm. Dictionary definitions of infer are, if not quite unique, not fully consistent. For example, one says infer means to conclude through reasoning. Another says infer means to guess or use reasoning. And yet another states that infer can mean “to derive by reasoning; conclude or judge from premises or evidence.” It's that guessing, those premises, that can wreak havoc. Do you have comments or suggestions about a topic or guest? An idea or question about conflict management or conflict resolution? Let me know at jb@dovetailresolutions.com! And you can learn more about me and my work as a mediator and a Certified CINERGY® Conflict Coach at www.dovetailresolutions.com and https://www.linkedin.com/in/janebeddall/. Enjoy the show for free on your favorite podcast app or on the podcast website: https://craftingsolutionstoconflict.com/

This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Dataflow Computing for AI Inference with Kunle Olukotun - #751

This Week in Machine Learning & Artificial Intelligence (AI) Podcast

Oct 14, 2025 · 57:37


In this episode, we're joined by Kunle Olukotun, professor of electrical engineering and computer science at Stanford University and co-founder and chief technologist at SambaNova Systems, to discuss reconfigurable dataflow architectures for AI inference. Kunle explains the core idea of building computers that are dynamically configured to match the dataflow graph of an AI model, moving beyond the traditional instruction-fetch paradigm of CPUs and GPUs. We explore how this architecture is well-suited for LLM inference, reducing memory bandwidth bottlenecks and improving performance. Kunle reviews how this system also enables efficient multi-model serving and agentic workflows through its large, tiered memory and fast model-switching capabilities. Finally, we discuss his research into future dynamic reconfigurable architectures, and the use of AI agents to build compilers for new hardware. The complete show notes for this episode can be found at https://twimlai.com/go/751.

NPPBC Audio Sermons
A Journey of Redemption

NPPBC Audio Sermons

Oct 12, 2025 · 50:58


The Greatest of All The greatest of all is the one that hung on the cross. Christ could have done anything he wanted. John said he could have called 12 legions of angels. He loved you and me. Luke 15:11 A certain man had two sons. The younger son asked for his inheritance. The father divided his living between them. The younger son gathered his goods and journeyed to a far country. He wasted his substance with riotous living. He spent all he had. A mighty famine arose in the land. He began to be in want. He joined himself to a citizen of that country. He sent him into the fields to feed swine. He would have filled his belly with the husks that the swine ate. No man gave to him. He came to himself and said: How many hired servants of my father's have bread enough and to spare. I perish with hunger. I will arise and go to my father. I will say to him, Father, I have sinned against heaven and before thee. I am no more worthy to be called thy son. Make me as one of thy hired servants. He arose and came to his father. When he was yet a great way off, his father saw him and had compassion. It wasn't judgment or condemnation. He ran and fell on his neck and kissed him. The son said, Father, I have sinned against heaven and in thy sight and am no more worthy to be called thy son. The father said to his servants: Bring forth the best robe and put it on him. Put a ring on his hand and shoes on his feet. Bring hither the fatted calf, kill it, and let us eat and be merry. For this my son was dead and is alive again; he was lost and is found. They began to be merry. Be Reconciled to God Need to get right with God. Inference that you ain't right. People don't like to hear that. Fellowship with God was broken in the Garden of Eden. The only thing that reconciled man back to God was the blood of Jesus Christ. Christ died for my sin. God made a way for me to be right. The day I got born again, the Holy Ghost of God imputed the righteousness of Christ to me. 
I stand here today right with God. The blood of Jesus Christ has taken my sin dead. My sins have been cast into a sea of forgetfulness. Believe in the name of Jesus Christ, the one who died for your sins and rose again. In believing Him, you can be saved today. You can be made right with God. The young man woke up one day and decided he didn't need his father. He didn't need his brothers. He didn't need father's house. He didn't need the father's help. He made up his mind to require his inheritance. He was tired of his father's rules and ways. The father has rules. He won't let me drink, do dope, listen to bad music, or watch pornography. He loves me too much. He knows what a detriment it is to my life. He knows that when I go down that road, there ain't nothing but pain and suffering and hard things to be born. There's more do's in that book than there are don'ts. If you'll focus on the do's, you won't have time for the don'ts. This young man was the child of the Father, so I'm not questioning his birthright. Even when he was out there strung out on dope or alcohol or women, whatever else it was that was his particular vice that caused him to waste everything. The devil took everything he had. There he was without anything when the famine hit. To be right with God: Be in a place where I'm doing what God has called me to do. If you ain't doing what God called you to do, you're wrong. You're living wrong. You're not right with God. You won't be right with God until you line up and do what God has told you to do. Quit comparing ourselves to an earthly standard and get right with the Holy God today. That will make a difference in you. You want to see your life changed? You want to see your life transformed?

MacVoices Audio
MacVoices #25257: Live! - Macs in Enterprise AI, An FCC Leak, and Xiaomi Copycats

MacVoices Audio

Oct 10, 2025 · 37:05


The panel explores how M-series Macs—with huge unified memory and efficient silicon—are gaining traction for AI inference and on-device privacy, citing MacStadium use cases and enterprise angles like Copilot adoption. Chuck Joiner, David Ginsburg, Marty Jencius, Brian Flanigan-Arthurs, Eric Bolden, Guy Serle, Web Bixby, Jeff Gamet, Jim Rea, and Mark Fuccio contrast training vs. inference, and discuss small language models and corporate data policies. The session wraps up with the alleged FCC leak of iPhone 16e schematics, and Xiaomi's unabashed Apple cloning—plus a quick note on viral AI fakes. This edition of MacVoices is brought to you by the MacVoices Dispatch, our weekly newsletter that keeps you up-to-date on any and all MacVoices-related information. Subscribe today and don't miss a thing.
Show Notes:
Chapters:
[0:30] AI workloads on Macs and unified memory advantages
[1:36] Training vs. inference explained; why memory matters
[3:49] M3/M4 bandwidth, neural accelerators, and privacy
[5:35] Avoiding the “NVIDIA tax” with custom silicon
[7:13] Power efficiency and enterprise adoption angles
[9:45] User education and Copilot in corporate settings
[12:42] Small language models for classrooms and offline use
[16:34] Alleged FCC leak of iPhone 16e schematics
[22:47] Xiaomi's cloning culture and SEO gaming
[25:29] Viral AI “security footage” hoaxes and media literacy
Links:
MacStadium: Macs increasingly being adopted for enterprise AI workloads
https://appleworld.today/2025/09/macstadium-macs-increasingly-being-adopted-for-enterprise-ai-workloads/
College football keeps picking iPad over Surface as fourth conference joins team Apple
https://9to5mac.com/2025/09/25/college-football-keeps-picking-ipad-over-surface-as-fourth-conference-joins-team-apple/
Xiaomi's latest Apple clones include 'Hyper Island' and 'Pad Mini' tablet
https://9to5google.com/2025/09/26/xiaomis-latest-apple-clones-include-hyper-island-and-pad-mini-tablet-gallery
AI Video of Sam Altman Stealing GPUs
https://www.instagram.com/ai.innovationshub/reel/DPPdo3VDxmI/
FCC mistakenly leaks confidential iPhone 16e schematics
https://appleinsider.com/articles/25/09/29/fcc-mistakenly-leaks-confidential-iphone-16e-schematics?utm_source=rss
Guests:
Web Bixby has been in the insurance business for 40 years and has been an Apple user for longer than that. You can catch up with him on Facebook, Twitter, and LinkedIn, but he prefers Bluesky.
Eric Bolden is into macOS, plants, sci-fi, food, and is a rural internet supporter. You can connect with him on Twitter, by email at embolden@mac.com, on Mastodon at @eabolden@techhub.social, on his blog, Trending At Work, and as co-host on The Vision ProFiles podcast.
Brian Flanigan-Arthurs is an educator with a passion for providing results-driven, innovative learning strategies for all students, but particularly those who are at-risk. He is also a tech enthusiast who has a particular affinity for Apple since he first used the Apple IIGS as a student. You can contact Brian on Twitter as @brian8944. He also recently opened a Mastodon account at @brian8944@mastodon.cloud.
Mark Fuccio is actively involved in high tech startup companies, both as a principal at piqsure.com and as a marketing advisor through his consulting practice Tactics Sells High Tech, Inc. Mark was a proud investor in Microsoft from the mid-1990s, selling in mid-2000, and hopes one day that MSFT will again be an attractive investment. You can contact Mark through Twitter, LinkedIn, or on Mastodon.
Jeff Gamet is a technology blogger, podcaster, author, and public speaker. Previously, he was The Mac Observer's Managing Editor, and the TextExpander Evangelist for Smile. He has presented at Macworld Expo, RSA Conference, and several WordCamp events, along with many other conferences. You can find him on several podcasts such as The Mac Show, The Big Show, MacVoices, Mac OS Ken, This Week in iOS, and more. Jeff is easy to find on social media as @jgamet on Twitter and Instagram, jeffgamet on LinkedIn, @jgamet@mastodon.social on Mastodon, and on his YouTube Channel at YouTube.com/jgamet.
David Ginsburg is the host of the weekly podcast In Touch With iOS where he discusses all things iOS, iPhone, iPad, Apple TV, Apple Watch, and related technologies. He is an IT professional supporting Mac, iOS, and Windows users. Visit his YouTube channel at https://youtube.com/daveg65 and find and follow him on Twitter @daveg65 and on Mastodon at @daveg65@mastodon.cloud.
Dr. Marty Jencius has been an Associate Professor of Counseling at Kent State University since 2000. He has over 120 publications in books, chapters, journal articles, and others, along with 200 podcasts related to counseling, counselor education, and faculty life. His technology interest led him to develop counseling profession ‘firsts,' including listservs, a web-based peer-reviewed journal, The Journal of Technology in Counseling, teaching and conferencing in virtual worlds as the founder of Counselor Education in Second Life, and podcast founder/producer of CounselorAudioSource.net and ThePodTalk.net. Currently, he produces a podcast about counseling and life questions, the Circular Firing Squad, and digital video interviews with legacies capturing the history of the counseling field. He is also co-host of The Vision ProFiles podcast. Generally, Marty is chasing the newest tech trends, which explains his interest in A.I. for teaching, research, and productivity. Marty is an active presenter and past president of the NorthEast Ohio Apple Corp (NEOAC).
Jim Rea built his own computer from scratch in 1975, started programming in 1977, and has been an independent Mac developer continuously since 1984. He is the founder of ProVUE Development, and the author of Panorama X, ProVUE's ultra-fast RAM-based database software for the macOS platform. He's been a speaker at MacTech, MacWorld Expo, and other industry conferences. Follow Jim at provue.com and via @provuejim@techhub.social on Mastodon.
Guy Serle, best known for being one of the co-hosts of the MyMac Podcast, sincerely apologizes for anything he has done or caused to have happened while in possession of dangerous podcasting equipment. He should know better but being a blonde from Florida means he's probably incapable of understanding the damage he has wrought. Guy is also the author of the novel The Maltese Cube. You can follow his exploits on Twitter, catch him on Mac to the Future on Facebook, at @Macparrot@mastodon.social, and find everything at VertShark.com.
Support:
Become a MacVoices Patron on Patreon: http://patreon.com/macvoices
Enjoy this episode? Make a one-time donation with PayPal.
Connect:
Web: http://macvoices.com
Twitter: http://www.twitter.com/chuckjoiner and http://www.twitter.com/macvoices
Mastodon: https://mastodon.cloud/@chuckjoiner
Facebook: http://www.facebook.com/chuck.joiner
MacVoices Page on Facebook: http://www.facebook.com/macvoices/
MacVoices Group on Facebook: http://www.facebook.com/groups/macvoice
LinkedIn: https://www.linkedin.com/in/chuckjoiner/
Instagram: https://www.instagram.com/chuckjoiner/
Subscribe:
Audio in iTunes
Video in iTunes
Subscribe manually via iTunes or any podcatcher:
Audio: http://www.macvoices.com/rss/macvoicesrss
Video: http://www.macvoices.com/rss/macvoicesvideorss

The Neuron: AI Explained
AI Inference: Why Speed Matters More Than You Think (with SambaNova's Kwasi Ankomah)

The Neuron: AI Explained

Oct 7, 2025 · 53:19


Everyone's talking about the AI datacenter boom right now. Billion dollar deals here, hundred billion dollar deals there. Well, why do data centers matter? It turns out, AI inference (actually calling the AI and running it) is the hidden bottleneck slowing down every AI application you use (and new stuff yet to be released). In this episode, Kwasi Ankomah from SambaNova Systems explains why running AI models efficiently matters more than you think, how their revolutionary chip architecture delivers 700+ tokens per second, and why AI agents are about to make this problem 10x worse.
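To put the throughput figure in perspective, a quick back-of-the-envelope calculation (the token counts are illustrative assumptions, not from the episode):

```python
tokens_per_second = 700   # the claimed decode throughput
response_tokens = 1_000   # a long-ish chat answer

seconds = response_tokens / tokens_per_second
print(f"{seconds:.2f}s")        # ~1.43s to stream the full answer

# An agent pipeline chaining 10 such model calls multiplies the wait:
print(f"{10 * seconds:.1f}s")   # ~14.3s end-to-end
```

This is why the episode argues agents make the inference bottleneck roughly 10x worse: each user-visible action hides many sequential model calls.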

Perfect English Podcast
Critical Thinking 1 | The Critical Thinking Renaissance: How to Think Clearly in a Chaotic World

Perfect English Podcast

Oct 6, 2025 · 25:41


Do you ever feel like you're lost in a digital funhouse, bombarded by conflicting headlines, biased sources, and endless rabbit holes? In an age of information overload, the most crucial skill isn't coding or a new language—it's learning how to think. This episode kicks off our journey by redefining critical thinking not as a negative act of criticism, but as a constructive, powerful toolkit for building a reliable understanding of the world. We strip away the jargon and explore the fundamental actions and mindsets that empower a clear and disciplined mind. Join us as we make the urgent case for why this timeless skill has become the essential survival guide for the 21st century.
In this episode, you'll learn:
What critical thinking really is: Moving beyond cynicism to a constructive process of Analysis, Evaluation, and Inference.
The Three Pillars of a Thinking Mind: Discover why Logic, Intellectual Humility, and Skepticism are the bedrock of rational thought.
The Three Tsunamis of the Modern Age: Understand why the rise of misinformation, generative AI, and global complexity makes critical thinking more essential now than ever before.
To unlock full access to all our episodes, consider becoming a premium subscriber on Apple Podcasts or Patreon. And don't forget to visit englishpluspodcast.com for even more content, including articles, in-depth studies, and our brand-new audio series and courses now available in our Patreon Shop!

The Crackin' Backs Podcast
The Mind-Body Code to Beating Chronic Pain -Dr. Jorge Esteves

The Crackin' Backs Podcast

Sep 29, 2025 · 68:40 · Transcription Available


Is chronic pain really “in the body”… or in the brain's predictions about the body? Today on the Crackin' Backs Podcast, we sit down with Dr. Jorge Esteves, PhD, DO—an osteopath, educator, and researcher whose work reframes low back pain, sciatica, and other MSK issues through the lens of predictive processing, active inference, and interoception. Dr. Esteves explains why pain is more than a physical signal: it's shaped by mood, memory, context, and environment—and how the right mix of smart touch, simple movement, precise language, and meaning can rewrite faulty predictions and dial down threat in the nervous system. We explore what he calls “smart touch”—the affective, well-timed, well-paced contact that improves therapeutic alliance, entrains breath and rhythm, and helps the brain feel safe enough to update its story about the spine. We also unpack fresh imaging work suggesting hands-on care can influence connectivity in pain and interoceptive hubs, including the insula—right where body-signal meaning is made. You'll leave with a 5-minute daily recalibration (breath cue + one gentle movement + one self-touch drill) to keep predictions aligned with reality—especially during a flare.
What You'll Learn:
Pain ≠ damage: Why back pain often persists due to over-protective predictions, and how to nudge them toward safety.
Smart touch, real change: How affective touch, pacing, and breath cues shift interoceptive processing and calm threat.
Therapeutic alliance matters: The first 10 minutes that build trust—and the phrases clinicians should avoid because they raise threat.
Brains on hands-on care: New imaging insights on how manual therapy may modulate brain connectivity in chronic low back pain.
Learn More / Contact Dr. Esteves:
Official site: Prof Jorge Esteves
Google Scholar (Atlântica University, Portugal): Google Scholar
ResearchGate: https://www.researchgate.net/profile/Jorge-Esteves-3
(En)active Inference paper (open-access): Frontiers
Email (from CV): osteojorge@gmail.com
LinkedIn: https://www.linkedin.com/in/dr-jorge-esteves-27371522/
Twitter/X: https://twitter.com/JEsteves_osteo
We are two sports chiropractors, seeking knowledge from some of the best resources in the world of health. From our perspective, health is more than just “Crackin Backs” but a deep dive into physical, mental, and nutritional well-being philosophies. Join us as we talk to some of the greatest minds and discover some of the most incredible gems you can use to maintain a higher level of health. Crackin Backs Podcast

Bill Wenstrom
Ephesians 4.25-The Contents of Ephesians 4.25 is a Strong Inference from the Contents of Ephesians 4.17-24

Bill Wenstrom

Sep 25, 2025 · 44:07


Ephesians Series: Ephesians 4:25-The Contents of Ephesians 4:25 is a Strong Inference from the Contents of Ephesians 4:17-24-Lesson # 282

Wenstrom Bible Ministries
Ephesians 4.25-The Contents of Ephesians 4.25 is a Strong Inference from the Contents of Ephesians 4.17-24

Wenstrom Bible Ministries

Sep 25, 2025 · 44:07


Ephesians Series: Ephesians 4:25-The Contents of Ephesians 4:25 is a Strong Inference from the Contents of Ephesians 4:17-24-Lesson # 282

AWS Podcast
#738: AWS News: Global Cross-Region Inference, Aurora Limitless and lots more.

AWS Podcast

Sep 22, 2025 · 25:01


Simon and Jillian keep you up to date with all the latest releases and capabilities!

NEGOTIATEx
Episode 10: The Secret to Overcoming Resistance in Negotiations | Negotiate X in Rewind

NEGOTIATEx

Sep 18, 2025 · 9:40


Episode 10 of the Negotiate X in Rewind series addresses a listener's question on stalled projects and pushback from supervisors. Nolan Martin and Aram Donigian explore the secret to overcoming resistance in negotiations, showing why success comes from stepping into the other party's shoes.  They discuss why resistance often makes sense from the other side's perspective, how to uncover hidden concerns, and why listening and inquiry are more effective than debating. With practical models like the Ladder of Inference, the episode highlights how negotiators can reframe “no” into dialogue, build trust, and transform resistance into collaborative agreement.  

The Bright Morning Podcast
Using the Ladder of Inference [Demonstration]: Episode 258

The Bright Morning Podcast

Sep 15, 2025 · 22:28


When your client jumps to conclusions, the Ladder of Inference can help. In this episode, Elena demonstrates how to use the Ladder to surface assumptions, expand thinking, and move toward more grounded decision-making.
Notable moments:
Keep learning:
Subscribe: Ladder of Inference Skill Session in the Coach Learning Library
The First 10 Minutes
Receive weekly wisdom and tools from Elena delivered to your inbox
Watch the Bright Morning Podcast on YouTube and subscribe to our channel
Become a Bright Morning Member
Follow Elena on Instagram and LinkedIn
Follow Bright Morning on LinkedIn and Instagram
Support the show:
Become a Friend of the Podcast
Rate and review us
Reflection questions:
What kinds of judgments or generalizations do you commonly hear in your coaching conversations?
How might the Ladder of Inference help your clients think more clearly or equitably?
What do you need to feel confident using this tool in your own practice?
Podcast Transcript and Use:
Bright Morning Consulting owns the copyright to all content and transcripts of The Bright Morning Podcast, with all rights reserved. You may not distribute or commercially exploit the content without our express written permission. We welcome you to download and share the podcast with others for personal use; please acknowledge The Bright Morning Podcast as the source of the material.
Episode Transcript

The Cognitive Revolution
User-Owned AI: On-Chain Training, Inference, and Agents, with NEAR's Illia Polosukhin

Sep 13, 2025 · 93:08


Today Illia Polosukhin, founder of NEAR Protocol, joins The Cognitive Revolution to discuss the intersection of AI and blockchain technologies, exploring how decentralized infrastructure can enable "user-owned AI" through privacy-preserving model training, confidential computing, and autonomous agents that operate via smart contracts without centralized control. Check out our sponsors: Linear, Oracle Cloud Infrastructure. Shownotes below brought to you by Notion AI Meeting Notes - try one month for free at: https://notion.com/lp/nathan
Evolution from AI to Blockchain: NEAR Protocol evolved from an AI project focused on teaching machines to code to a blockchain platform, and now combines both technologies.
Significant User Base: NEAR Protocol has achieved 50 million monthly active users, demonstrating substantial adoption.
Autonomous AI Agents: NEAR has developed autonomous AI agents that can operate independently once deployed, including an example of an agent given $10,000 to trade based on Twitter sentiment.
AI-Blockchain Integration: The platform combines decentralized compute for running AI with smart contracts that execute actions, creating truly autonomous systems.
AI in Governance: NEAR is experimenting with "AI senators" that people can vote for, which then make governance decisions, potentially solving the principal-agent problem in representation.
Jurisdictional Framework: NEAR is creating infrastructure for jurisdictions and courts to enforce rules on AI agents, essentially putting governance into AI.
Sponsors:
Linear: Linear is the system for modern product development. Nearly every AI company you've heard of is using Linear to build products. Get 6 months of Linear Business for free at: https://linear.app/tcr
Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive
PRODUCED BY: https://aipodcast.ing
CHAPTERS:
(00:00) About the Episode
(03:58) From Transformers to Blockchain (Part 1)
(17:01) Sponsor: Linear
(18:30) From Transformers to Blockchain (Part 2)
(21:23) Blockchain Security Fundamentals (Part 1)
(33:36) Sponsor: Oracle Cloud Infrastructure
(35:00) Blockchain Security Fundamentals (Part 2)
(39:01) Zero-Day Vulnerabilities Solution
(51:11) Confidential Computing Infrastructure
(58:07) User-Owned AI Models
(01:14:47) Marketplace and Governance
(01:18:57) AI-Crypto Synergy Vision
(01:27:50) Autonomous Agents Future
(01:31:22) Outro

Unsupervised Learning
Ep 74: Chief Scientist of Together.AI Tri Dao On The End of Nvidia's Dominance, Why Inference Costs Fell & The Next 10X in Speed

Unsupervised Learning

Sep 10, 2025 · 58:37


Fill out this short listener survey to help us improve the show: https://forms.gle/bbcRiPTRwKoG2tJx8 Tri Dao, Chief Scientist at Together AI and Princeton professor who created Flash Attention and Mamba, discusses how inference optimization has driven costs down 100x since ChatGPT's launch through memory optimization, sparsity advances, and hardware-software co-design. He predicts the AI hardware landscape will shift from Nvidia's current 90% dominance to a more diversified ecosystem within 2-3 years, as specialized chips emerge for distinct workload categories: low-latency agentic systems, high-throughput batch processing, and interactive chatbots. Dao shares his surprise at AI models becoming genuinely useful for expert-level work, making him 1.5x more productive at GPU kernel optimization through tools like Claude Code and O1. The conversation explores whether current transformer architectures can reach expert-level AI performance or if approaches like mixture of experts and state space models are necessary to achieve AGI at reasonable costs. Looking ahead, Dao sees another 10x cost reduction coming from continued hardware specialization, improved kernels, and architectural advances like ultra-sparse models, while emphasizing that the biggest challenge remains generating expert-level training data for domains lacking extensive internet coverage. 
(0:00) Intro
(1:58) Nvidia's Dominance and Competitors
(4:01) Challenges in Chip Design
(6:26) Innovations in AI Hardware
(9:21) The Role of AI in Chip Optimization
(11:38) Future of AI and Hardware Abstractions
(16:46) Inference Optimization Techniques
(33:10) Specialization in AI Inference
(35:18) Deep Work Preferences and Low Latency Workloads
(38:19) Fleet Level Optimization and Batch Inference
(39:34) Evolving AI Workloads and Open Source Tooling
(41:15) Future of AI: Agentic Workloads and Real-Time Video Generation
(44:35) Architectural Innovations and AI Expert Level
(50:10) Robotics and Multi-Resolution Processing
(52:26) Balancing Academia and Industry in AI Research
(57:37) Quickfire
With your co-hosts:
@jacobeffron - Partner at Redpoint, Former PM Flatiron Health
@patrickachase - Partner at Redpoint, Former ML Engineer LinkedIn
@ericabrescia - Former COO Github, Founder Bitnami (acq'd by VMWare)
@jordan_segall - Partner at Redpoint

Moonshots with Peter Diamandis
AI Insiders Reveal Elon Musk's Master Plan to Win AI w/ Dave Blundin & Alex Wissner-Gross | EP #192

Moonshots with Peter Diamandis

Sep 3, 2025 · 110:55


Get access to metatrends 10+ years before anyone else - https://qr.diamandis.com/metatrends
Dave Blundin is the founder & GP of Link Ventures. Dr. Alexander Wissner-Gross is a computer scientist and founder of Reified, focused on AI and complex systems.
My companies: Reverse the age of my skin using the same cream at https://qr.diamandis.com/oneskinpod
Apply to Dave's and my new fund: https://qr.diamandis.com/linkventureslanding
Connect with Peter: X Instagram
Connect with Dave: X LinkedIn
Connect with Alex: Website LinkedIn X Email
Listen to MOONSHOTS: Apple YouTube
*Recorded on September 2nd, 2025
*The views expressed by me and all guests are personal opinions and do not constitute Financial, Medical, or Legal advice.
Chapters:
02:50 - The Importance of Positive News in Tech
05:49 - Education and the Future of Learning
09:02 - AI Wars: Colossus II and Hardware Scaling
12:02 - Training vs. Inference in AI Models
18:02 - Elon Musk's XAI and Recruitment Strategies
20:47 - The Rise of NanoBanana and AI in Media
26:38 - Google's AI-Powered Live Translation
29:03 - The Future of Language and Cultural Diversity
48:07 - AI Disruption in Language Learning
51:56 - The Future of SaaS Companies
57:28 - NVIDIA's Market Position and AI Chips
59:51 - China's AI Chip Landscape
01:03:13 - India's AI Infrastructure Revolution
01:11:11 - The Concept of AI Governance
01:15:16 - Economic Implications of AI Investment
01:19:54 - AI in Healthcare Innovations
01:36:32 - The Future of Urban Planning with AI
01:40:39 - Electricity Costs and AI's Impact

Advantest Talks Semi
Inference is shaping a major role in AI's future

Advantest Talks Semi

Play Episode Listen Later Aug 28, 2025 48:31 Transcription Available


Bringing AI to the Edge: Dr. Bannon Bastani on Enterprise Computing's New Frontier
Thanks for tuning in to "Advantest Talks Semi"! If you enjoyed this episode, we'd love to hear from you! Please take a moment to leave a rating on Apple Podcasts. Your feedback helps us improve and reach new listeners. Don't forget to subscribe and share with your friends. We appreciate your support!

Joint Dynamics - Intelligent Movement Series
Episode 133 -MIT Movement & Pain Researcher Max Shen on Fighting Monkey & the Enactive Inference Approach to Health & Movement

Joint Dynamics - Intelligent Movement Series

Play Episode Listen Later Aug 18, 2025 95:22


In this episode of the Joint Dynamics Podcast, host Andrew Cox of Joint Dynamics chats with Max Shen, a pain and movement researcher at MIT. Max shares his innovative approach to understanding pain through the lens of enactive inference & his research with Jozef Frucek of Fighting Monkey.
Andrew & Max explore the complexities of pain, the importance of adaptability, and how social contexts influence health. Max discusses his personal journey into pain research, highlighting the role of interoceptive awareness and emotional support in recovery.
Join Andrew and Max for a thought-provoking conversation that reimagines movement and pain management, offering practical insights for both practitioners and individuals seeking to enhance their well-being.
JD show sponsor is Muvitality Medicinal Mushrooms for modern day health and wellness | Mu … Go to muvitality.com and use the code JD10 to receive a 10% discount on your purchase of Mu functional mushrooms such as Lion's Mane, Cordyceps, Chaga, Reishi, and Turkey Tail. Enjoy!
Here are some useful links for this podcast:
LinkedIn: https://www.linkedin.com/in/max-shen-6a7878325/
https://www.maxkshen.com/
Relevant episodes:
Episode 100 Jozef Frucek on communication, movement, acting, & spitting into the face of a tiger! - https://podcasts.apple.com/hk/podcast/episode-100-jozef-frucek-on-communication-movement/id1527374894?i=1000655394611
JOINT DYNAMICS links:
Joint Dynamics Facebook - https://www.facebook.com/JointDynamicsHongKong/
Joint Dynamics Instagram - https://www.instagram.com/jointdynamics/
Joint Dynamics Youtube - https://www.youtube.com/channel/UCRQZplKxZMSvtc6LxM5Wckw
Joint Dynamics Website - www.jointdynamics.com.hk
Host - Andrew Cox - https://www.jointdynamics.com.hk/the-team/trainers/andrew-cox

Spatial Web AI Podcast
Active Inference AI: Cutting-Edge Use Cases | Denise Holt, Mahault Albarracin, PhD & David Bray, PhD

Spatial Web AI Podcast

Play Episode Listen Later Aug 16, 2025 101:03


August 7, 2025 #ActiveInference - In this thought-provoking episode of the Spatial Web AI Podcast, host Denise Holt (Founder & CEO of AIXGlobal Media and Learning Lab Central) sits down with two renowned experts shaping the future of intelligent systems:

The Voice of Resurrection
08.14.25 | AUDIO | Living with a God-Inference

The Voice of Resurrection

Play Episode Listen Later Aug 14, 2025 28:30


Vendo Podcast - Protect Your Brand & Sell More!™
Amazon AI Shopping Through an Agentic Lens - VENDO Velocity Podcast Ep. 171

Vendo Podcast - Protect Your Brand & Sell More!™

Play Episode Listen Later Aug 14, 2025 34:09


In this episode, the VENDO team is joined by Andrew Bell, Amazon Lead for NFPA, to explore Amazon Rufus and the future of AI-driven shopping. We unpack the evolution of Rufus, its impact on new brands, and the growing role of inference, visual search, and agentic evaluation. Tune in to learn how brands can adapt and thrive in the next era of intelligent commerce. Topics Covered: Amazon Rufus Context & Evolution (3:00) Does Rufus Affect New Brands? (9:30) The Importance of Inference (10:38) Brands Optimizing for Inference (15:32) AI Shopping (17:06) Lens AI and Visual Search (22:00) Agentic Evaluation (26:19) Speakers: Andrew Bell, Amazon Lead, NFPA Delaney Del Mundo, VP Account Strategy - Amazon & TikTok Shop, VENDO Want to stay up to date on topics like this? Subscribe to our Amazon & Walmart Growth #podcast for bi-weekly episodes every other Thursday! ➡️ YouTube: https://www.youtube.com/channel/UCr2VTsj1X3PRZWE97n-tDbA ➡️ Spotify: https://open.spotify.com/show/4HXz504VRToYzafHcAhzke?si=9d57599ed19e4362 ➡️ Apple: https://podcasts.apple.com/us/podcast/vendo-amazon-walmart-growth-experts/id1512362107

The Voice of Resurrection
08.13.25 | AUDIO | Living with a God-Inference

The Voice of Resurrection

Play Episode Listen Later Aug 13, 2025 28:30


The Tech Trek
Inference: AI's Hidden Engine

The Tech Trek

Play Episode Listen Later Aug 13, 2025 25:25


Nikola Borisov, CEO and co-founder of Deep Infra, joins the show to unpack the rapid evolution of AI inference, the hardware race powering it, and how startups can actually keep up without burning out. From open source breakthroughs to the business realities of model selection, Nikola shares why speed, efficiency, and strategic focus matter more than ever. If you're building in AI, this conversation will help you see the road ahead more clearly.
Key Takeaways
• Open source AI models are advancing at a pace that forces founders to choose focus over chasing every release.
• First mover advantage in AI is real but plays out differently than in consumer tech because models are often black boxes to end users.
• Infrastructure and hardware strategy can make or break AI product delivery, especially for startups.
• Efficient inference may become more important than efficient training as AI usage scales.
• Optimizing for specific customer needs can create significant performance and cost advantages.
Timestamped Highlights
[02:12] How far AI has come — and why we're still under 10% of its future potential
[04:11] The challenge of keeping pace with constant model releases
[08:12] Why differentiation between models still matters for builders
[14:08] The hidden costs and strategies of AI hardware infrastructure
[18:05] Why inference efficiency could eclipse training efficiency
[21:46] Lessons from missed opportunities and unexpected shifts in model innovation
Quote of the Episode
“Being more efficient at inference is going to be way more important than being very efficient at training.” — Nikola Borisov
Resources Mentioned
DeepInfra — https://deepinfra.com
Nikola Borisov on LinkedIn — https://www.linkedin.com/in/nikolab
Call to Action
If you enjoyed this conversation, share it with someone building in AI and subscribe so you never miss an episode. Your next big idea might just come from the next one.

This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Closing the Loop Between AI Training and Inference with Lin Qiao - #742

This Week in Machine Learning & Artificial Intelligence (AI) Podcast

Play Episode Listen Later Aug 12, 2025 61:11


In this episode, we're joined by Lin Qiao, CEO and co-founder of Fireworks AI. Drawing on key lessons from her time building PyTorch, Lin shares her perspective on the modern generative AI development lifecycle. She explains why aligning training and inference systems is essential for creating a seamless, fast-moving production pipeline, preventing the friction that often stalls deployment. We explore the strategic shift from treating models as commodities to viewing them as core product assets. Lin details how post-training methods, like reinforcement fine-tuning (RFT), allow teams to leverage their own proprietary data to continuously improve these assets. Lin also breaks down the complex challenge of what she calls "3D optimization"—balancing cost, latency, and quality—and emphasizes the role of clear evaluation criteria to guide this process, moving beyond unreliable methods like "vibe checking." Finally, we discuss the path toward the future of AI development: designing a closed-loop system for automated model improvement, a vision made more attainable by the exciting convergence of open and closed-source model capabilities. The complete show notes for this episode can be found at https://twimlai.com/go/742.

The Voice of Resurrection
08.12.25 | AUDIO | Living with a God-Inference

The Voice of Resurrection

Play Episode Listen Later Aug 12, 2025 28:30


The Voice of Resurrection
08.11.25 | AUDIO | Living with a God-Inference

The Voice of Resurrection

Play Episode Listen Later Aug 11, 2025 28:30


Chip Stock Investor Podcast
Episode 326: The TRUTH About Data Center AI Inference: AMD, ALAB, ANET Stock Analysis

Chip Stock Investor Podcast

Play Episode Listen Later Aug 7, 2025 27:08


Join us on Discord with Semiconductor Insider, sign up on our website: www.chipstockinvestor.com/membership
Supercharge your analysis with AI! Get 15% off your membership with our special link here: https://fiscal.ai/csi/
Investors are excited about the prospects for AI inference and potential "next Nvidia" investments. Is AMD really going to be the biggest winner in this market? There are actually other candidates, like Astera Labs (ALAB) and Credo Technology (CRDO), that need to be considered. And of course, a big data center AI inference winner is none other than Arista Networks (ANET). Chip Stock Investors Nick and Kasey break it down in this important video update.
Sign Up For Our Newsletter: https://mailchi.mp/b1228c12f284/sign-up-landing-page-short-form
Previous Arista videos referenced:
https://youtu.be/gyfRB8E0p6o
https://youtu.be/OuMuLBVcb84
Astera Labs video:
https://youtu.be/jZyHWqBXDo8
********************************************************
Affiliate links are sprinkled in throughout this video. If something catches your eye and you decide to buy it, we might earn a little coffee money. Thanks for helping us (Kasey) fuel our caffeine addiction!
Content in this video is for general information or entertainment only and is not specific or individual investment advice. Forecasts and information presented may not develop as predicted and there is no guarantee any strategies presented will be successful. All investing involves risk, and you could lose some or all of your principal.
#amd #alab #anet #gpus #aiinference #arista #asteralabs #semiconductors #chips #investing #stocks #finance #financeeducation #silicon #artificialintelligence #ai #financeeducation #chipstocks #finance #stocks #investing #investor #financeeducation #stockmarket #chipstockinvestor #fablesschipdesign #chipmanufacturing #semiconductormanufacturing #semiconductorstocks Timestamps:(00:00) AMD's Recent Performance and Market Position(01:58) AMD's Financials and Future Prospects(07:17) Astera Labs: Networking Innovations(15:12) Credo: A Strategic Investment(17:12) Arista Networks: A Growth Story(26:36) Conclusion Nick and Kasey own shares of AMD, ANET

The New Stack Podcast
Confronting AI's Next Big Challenge: Inference Compute

The New Stack Podcast

Play Episode Listen Later Aug 6, 2025 24:14


While AI training garners most of the spotlight — and investment — the demands of AI inference are shaping up to be an even bigger challenge. In this episode of The New Stack Makers, Sid Sheth, founder and CEO of d-Matrix, argues that inference is anything but one-size-fits-all. Different use cases — from low-cost to high-interactivity or throughput-optimized — require tailored hardware, and existing GPU architectures aren't built to address all these needs simultaneously.
“The world of inference is going to be truly heterogeneous,” Sheth said, meaning specialized hardware will be required to meet diverse performance profiles. A major bottleneck? The distance between memory and compute. Inference, especially in generative AI and agentic workflows, requires constant memory access, so minimizing the distance data must travel is key to improving performance and reducing cost.
To address this, d-Matrix developed Corsair, a modular platform where memory and compute are vertically stacked — “like pancakes” — enabling faster, more efficient inference. The result is scalable, flexible AI infrastructure purpose-built for inference at scale.
Learn more from The New Stack about inference compute and AI:
Scaling AI Inference at the Edge with Distributed PostgreSQL
Deep Infra Is Building an AI Inference Cloud for Developers
Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

VOX Podcast with Mike Erre
The Sacred Nature of Questioning Everything - Nonference 2025

VOX Podcast with Mike Erre

Play Episode Listen Later Aug 4, 2025 65:07


Live from the 2025 Nonference, Mike and Tim (In the same room) are joined in studio by Journey Church Pastors Suzie P. Lind and Sam Barnhart. What does it mean to truly deconstruct faith, and how can that journey lead to healing? In this heartfelt and thought-provoking conversation, the hosts tackle the complexities of "deconstruction," exploring disillusionment, doubt, discipleship, and ultimately, the pursuit of Jesus amidst cultural challenges. From addressing church hurt and systemic issues to reexamining theologies and navigating the intersection of faith and politics, this episode unpacks the role of the church in society and the personal journeys that shape our understanding of Christianity. Through themes of justice, cruciformity, and reimagining what it means to follow Jesus, the discussion dives deep into how cultural realities and historical practices influence our faith. The panel shares stories of heartbreak and hope, challenging the idea that questioning or rethinking faith is a departure from Jesus—instead, it's often a move toward deeper authenticity. Whether you're wrestling with theological questions, processing church trauma, or striving to navigate cultural issues as a follower of Jesus, this episode offers a space for reflection and community. Feel free to share your thoughts, send in your questions, or engage with us on Facebook and Instagram. Let's continue pursuing a faith marked by humility, curiosity, and justice together. 
CHAPTERS: 00:00 - Welcome to the Nonference 02:12 - The Tennessee Buzz 04:35 - Deconstruction: A Second Innocence 07:11 - The Six D's of Deconstruction 14:46 - Why People Are Disillusioned 18:18 - Did the Church Move or Did the Curtain Open 23:16 - Deconstruction as Repentance 28:32 - Discipleship in Deconstruction 29:41 - Understanding Deconversion 32:44 - Redefinition in Faith 34:58 - Navigating Doubt 38:50 - Biblical Foundations of Deconstruction 41:00 - Purpose of Inference 42:26 - Q&A: Insights from Stafford 49:49 - National Park Moments 51:09 - Experiencing Death and Grief 56:32 - Neuroscience of Belief 56:41 - Josh McDowell and the Talking Snake 1:02:40 - Embracing the Power of Weakness 1:03:12 - Thank You 1:04:08 - Credits As always, we encourage and would love discussion as we pursue. Feel free to email in questions to hello@voxpodcast.com, and to engage the conversation on Facebook and Instagram. We're on YouTube (if you're into that kinda thing): VOXOLOGY TV. Our Merch Store! ETSY Learn more about the Voxology Podcast Subscribe on iTunes or Spotify Support the Voxology Podcast on Patreon The Voxology Spotify channel can be found here: Voxology Radio Follow us on Instagram: @voxologypodcast and "like" us on Facebook Follow Mike on Twitter: www.twitter.com/mikeerre Music in this episode by Timothy John Stafford Instagram & Twitter: @GoneTimothy

AI + a16z
Performance and Passion: Fal's Approach to AI Inference

AI + a16z

Play Episode Listen Later Aug 1, 2025 40:10


If you've been experimenting with image, video, and audio models, the chances are you've been both blown away by how good they're becoming, and also a little perturbed by how long they can take to generate. If you've been using a platform like Fal, however, your experience on the latter point might be more positive.In this episode, Fal cofounder and CEO Burkay Gur and head of engineering Batuhan Taskaya join a16z general partner Jennifer Li to discuss how they built an inference platform — or, as they call it, a generative media cloud — that's optimized for speed, performance, and user experience. These are core features for a great product, yes, and also ones borne of necessity as the early team obsessively engineered around its meager GPU capacity at the height of the AI infrastructure crunch.But this is more than a story about infrastructure. As you'll hear, they also delve into sales and hiring strategy; the team's overall excitement over these emerging modalities; and the trends they're seeing as competition in the world of video models, especially, heats up.  Check out everything a16z is doing with artificial intelligence here, including articles, projects, and more podcasts.

Software Engineering Daily
Modal and Scaling AI Inference with Erik Bernhardsson

Software Engineering Daily

Play Episode Listen Later Jul 31, 2025 39:55


Modal is a serverless compute platform that's specifically focused on AI workloads. The company's goal is to enable AI teams to quickly spin up GPU-enabled containers, and rapidly iterate and autoscale. It was founded by Erik Bernhardsson who was previously at Spotify for 7 years where he built the music recommendation system and the popular The post Modal and Scaling AI Inference with Erik Bernhardsson appeared first on Software Engineering Daily.

EconTalk
Read Like a Champion (with Doug Lemov)

EconTalk

Play Episode Listen Later Jul 28, 2025 63:56


Many students graduate high school today without having read a book cover to cover. Many students struggle to learn to read at all. How did this happen? Listen as educator and author Doug Lemov talks with EconTalk's Russ Roberts about the failed fads in reading education, the mistaken emphasis on vocabulary as a skill, and the importance of background knowledge for thinking and reading comprehension. Lemov and Roberts also discuss their love of difficult-to-read authors, the power of reading in groups, the value of keeping a reading journal, and how even basketball can be more enjoyable when we have the right terminology.

The New Stack Podcast
How Fal.ai Went From Inference Optimization to Hosting Image and Video Models

The New Stack Podcast

Play Episode Listen Later Jul 25, 2025 52:41


Fal.ai, once focused on machine learning infrastructure, has evolved into a major player in generative media. In this episode of The New Stack Agents, hosts speak with Fal.ai CEO Burkay Gur and investor Glenn Solomon of Notable Capital. Originally aiming to optimize Python runtimes, Fal.ai shifted direction as generative AI exploded, driven by tools like DALL·E and ChatGPT. Today, Fal.ai hosts hundreds of models—from image to audio and video—and emphasizes fast, optimized inference to meet growing demand.
Speed became Fal.ai's competitive edge, especially as newer generative models require GPU power not just for training but also for inference. Solomon noted that while optimization alone isn't a sustainable business model, Fal's value lies in speed and developer experience. Fal.ai offers both an easy-to-use web interface and developer-focused APIs, appealing to both technical and non-technical users.
Gur also addressed generative AI's impact on creatives, arguing that while the cost of creation has plummeted, the cost of creativity remains—and may even increase as content becomes easier to produce.
Learn more from The New Stack about AI's impact on creatives:
AI Will Steal Developer Jobs (But Not How You Think)
How AI Agents Will Change the Web for Users and Developers
Join our community of newsletter subscribers to stay on top of the news and at the top of your game.

Embedded Insiders
The Age of Inference: Generative AI & Sustainability

Embedded Insiders

Play Episode Listen Later Jul 24, 2025 24:51


In this episode of Embedded Insiders, Rich and I sit down with Sid Sheth, CEO and co-founder of d-Matrix, to explore the ongoing generative AI boom—why it's becoming increasingly unsustainable, and how d-Matrix is addressing the challenge with a chiplet-based compute architecture built specifically for AI inference.
Next, Ken brings us up to speed on some of the week's top embedded industry headlines, with updates from ASUS IoT, LG, and Microelectronics UK.
But first, Rich, Ken, and I share our thoughts on the state of generative AI and AI inference. For more information, visit embeddedcomputing.com

Henderson Blvd church of Christ
Implication and Inference

Henderson Blvd church of Christ

Play Episode Listen Later Jul 21, 2025 40:51


Series: Bible Class 2025 - Authority
Service: Bible Study - Sunday
Type: Bible Class
Speaker: Ralph Walker

Vanishing Gradients
Episode 54: Scaling AI: From Colab to Clusters — A Practitioner's Guide to Distributed Training and Inference

Vanishing Gradients

Play Episode Listen Later Jul 18, 2025 41:17


Colab is cozy. But production won't fit on a single GPU. Zach Mueller leads Accelerate at Hugging Face and spends his days helping people go from solo scripts to scalable systems. In this episode, he joins me to demystify distributed training and inference — not just for research labs, but for any ML engineer trying to ship real software.
We talk through:
• From Colab to clusters: why scaling isn't just about training massive models, but serving agents, handling load, and speeding up iteration
• Zero-to-two GPUs: how to get started without Kubernetes, Slurm, or a PhD in networking
• Scaling tradeoffs: when to care about interconnects, which infra bottlenecks actually matter, and how to avoid chasing performance ghosts
• The GPU middle class: strategies for training and serving on a shoestring, with just a few cards or modest credits
• Local experiments, global impact: why learning distributed systems—even just a little—can set you apart as an engineer
If you've ever stared at a Hugging Face training script and wondered how to run it on something more than your laptop: this one's for you.
LINKS
Zach on LinkedIn (https://www.linkedin.com/in/zachary-mueller-135257118/)
Hugo's blog post on Stop Building AI Agents (https://www.linkedin.com/posts/hugo-bowne-anderson-045939a5_yesterday-i-posted-about-stop-building-ai-activity-7346942036752613376-b8-t/)
Upcoming Events on Luma (https://lu.ma/calendar/cal-8ImWFDQ3IEIxNWk)
Hugo's recent newsletter about upcoming events and more! (https://hugobowne.substack.com/p/stop-building-agents)

The Dawn of Dynamic AI: RFT Comes Online, w/ Predibase CEO Dev Rishi, from Inference by Turing Post

Play Episode Listen Later Jul 16, 2025 38:47


This crossover episode from Inference by Turing Post features CEO Dev Rishi of Predibase discussing the shift from static to continuously learning AI systems that can adapt and improve from ongoing user feedback in production. Rishi provides grounded insights from deploying these dynamic models to real enterprise customers in healthcare and finance, exploring both the massive potential upside and significant safety challenges of reinforcement learning at scale. The conversation examines how "practical specialized intelligence" could reshape the AI landscape by filling economic niches efficiently, potentially offering a more stable alternative to AGI development. This discussion bridges theoretical concepts with real-world deployment experience, offering a practical preview of AI systems that "train once and learn forever."
Turing Post channel: @RealTuringPost
Turing Post website: https://www.turingpost.com
Sponsors:
Google Gemini 2.5 Flash: Build faster, smarter apps with customizable reasoning controls that let you optimize for speed and cost. Start building at https://aistudio.google.com
Labelbox: Labelbox pairs automation, expert judgment, and reinforcement learning to deliver high-quality training data for cutting-edge AI. Put its data factory to work for you, visit https://labelbox.com
Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive
The AGNTCY: The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks.
Join a community of engineers focused on high-quality multi-agent software and support the initiative at https://agntcy.org
NetSuite by Oracle: NetSuite by Oracle is the AI-powered business management suite trusted by over 42,000 businesses, offering a unified platform for accounting, financial management, inventory, and HR. Gain total visibility and control to make quick decisions and automate everyday tasks—download the free ebook, Navigating Global Trade: Three Insights for Leaders, at https://netsuite.com/cognitive
PRODUCED BY: https://aipodcast.ing
CHAPTERS:
(00:00) Sponsor: Google Gemini 2.5 Flash
(00:31) About the Episode
(03:46) Training Models Continuously
(05:03) Reinforcement Fine-Tuning Revolution
(09:31) Agentic Workflows Challenges (Part 1)
(12:51) Sponsors: Labelbox | Oracle Cloud Infrastructure
(15:28) Agentic Workflows Challenges (Part 2)
(15:41) ChatGPT Pivot Moment
(19:59) Planning AI Future
(24:45) Open Source Gaps (Part 1)
(28:35) Sponsors: The AGNTCY | NetSuite by Oracle
(30:50) Open Source Gaps (Part 2)
(30:54) AGI vs Specialized
(35:26) Happiness and Success
(37:04) Outro

Learning Bayesian Statistics
#136 Bayesian Inference at Scale: Unveiling INLA, with Haavard Rue & Janet van Niekerk

Learning Bayesian Statistics

Play Episode Listen Later Jul 9, 2025 77:37 Transcription Available


Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!
Intro to Bayes Course (first 2 lessons free)
Advanced Regression Course (first 2 lessons free)
Our theme music is « Good Bayesian », by Baba Brinkman (feat. MC Lars and Mega Ran). Check out his awesome work!
Visit our Patreon page to unlock exclusive Bayesian swag ;)
Takeaways:
INLA is a fast, deterministic method for Bayesian inference.
INLA is particularly useful for large datasets and complex models.
The R-INLA package is widely used for implementing INLA methodology.
INLA has been applied in various fields, including epidemiology and air quality control.
Computational challenges in INLA are minimal compared to MCMC methods.
The Smart Gradient method enhances the efficiency of INLA.
INLA can handle various likelihoods, not just Gaussian.
SPDEs allow for more efficient computations in spatial modeling.
The new INLA methodology scales better for large datasets, especially in medical imaging.
Priors in Bayesian models can significantly impact the results and should be chosen carefully.
Penalized complexity priors (PC priors) help prevent overfitting in models.
Understanding the underlying mathematics of priors is crucial for effective modeling.
The integration of GPUs in computational methods is a key future direction for INLA.
The development of new sparse solvers is essential for handling larger models efficiently.
Chapters:
06:06 Understanding INLA: A Comparison with MCMC
08:46 Applications of INLA in Real-World Scenarios
11:58 Latent Gaussian Models and Their Importance
15:12 Impactful Applications of INLA in Health and Environment
18:09 Computational Challenges and Solutions in INLA
21:06 Stochastic Partial Differential Equations in Spatial Modeling
23:55 Future Directions and Innovations in INLA
39:51 Exploring Stochastic Differential Equations
43:02 Advancements in INLA Methodology
50:40 Getting Started with INLA
56:25 Understanding Priors in Bayesian Models
Thank you to my Patrons for making this episode possible!
Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad

MLOps.community
Inside Uber's AI Revolution - Everything about how they use AI/ML

MLOps.community

Play Episode Listen Later Jul 4, 2025 45:23


Kai Wang joins the MLOps Community podcast LIVE to share how Uber built and scaled its ML platform, Michelangelo. From mission-critical models to tools for both beginners and experts, he walks us through Uber's AI playbook—and teases plans to open-source parts of it.
// Bio
Kai Wang is the product lead of the AI platform team at Uber, overseeing Uber's internal end-to-end ML platform called Michelangelo that powers 100% of Uber's business-critical ML use cases.
// Related Links
Uber GenAI: https://www.uber.com/blog/from-predictive-to-generative-ai/
#uber #podcast #ai #machinelearning
~~~~~~~~ ✌️ Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TY
Explore MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Kai on LinkedIn: /kai-wang-67457318/
Timestamps:
[00:00] Rethinking AI Beyond ChatGPT
[04:01] How Devs Pick Their Tools
[08:25] Measuring Dev Speed Smartly
[10:14] Predictive Models at Uber
[13:11] When ML Strategy Shifts
[15:56] Smarter Uber Eats with AI
[19:29] Summarizing Feedback with ML
[23:27] GenAI That Users Notice
[27:19] Inference at Scale: Michelangelo
[32:26] Building Uber's AI Studio
[33:50] Faster AI Agents, Less Pain
[39:21] Evaluating Models at Uber
[42:22] Why Uber Open-Sourced Michelangelo
[44:32] What Fuels Uber's AI Team

80,000 Hours Podcast with Rob Wiblin
#219 – Toby Ord on graphs AI companies would prefer you didn't (fully) understand

80,000 Hours Podcast with Rob Wiblin

Play Episode Listen Later Jun 24, 2025 168:22


The era of making AI smarter just by making it bigger is ending. But that doesn't mean progress is slowing down — far from it. AI models continue to get much more powerful, just using very different methods, and those underlying technical changes force a big rethink of what the coming years will look like.

Toby Ord — Oxford philosopher and bestselling author of The Precipice — has been tracking these shifts and mapping out the implications both for governments and our lives.

Links to learn more, video, highlights, and full transcript: https://80k.info/to25

As he explains, until recently anyone could access the best AI in the world "for less than the price of a can of Coke." But unfortunately, that's over.

What changed? AI companies first made models smarter by throwing a million times as much computing power at them during training, to make them better at predicting the next word. But with high-quality data drying up, that approach petered out in 2024.

So they pivoted to something radically different: instead of training smarter models, they're giving existing models dramatically more time to think — leading to the rise of the "reasoning models" at the frontier today.

The results are impressive, but this extra computing time comes at a cost: OpenAI's o3 reasoning model achieved stunning results on a famous AI test by writing an Encyclopedia Britannica's worth of reasoning to solve individual problems, at a cost of over $1,000 per question.

This isn't just technical trivia: if this improvement method sticks, it will change much about how the AI revolution plays out, starting with the fact that we can expect the rich and powerful to get access to the best AI models well before the rest of us.

Toby and host Rob discuss the implications of all that, plus the return of reinforcement learning (and the resulting increase in deception), and Toby's commitment to clarifying the misleading graphs coming out of AI companies — to separate the snake oil and fads from the reality of what's likely a "transformative moment in human history."

Recorded on May 23, 2025.

Chapters:
Cold open (00:00:00)
Toby Ord is back — for a 4th time! (00:01:20)
Everything has changed (and changed again) since 2020 (00:01:37)
Is x-risk up or down? (00:07:47)
The new scaling era: compute at inference (00:09:12)
Inference scaling means less concentration (00:31:21)
Will rich people get access to AGI first? Will the rest of us even know? (00:35:11)
The new regime makes 'compute governance' harder (00:41:08)
How 'IDA' might let AI blast past human level — or not (00:50:14)
Reinforcement learning brings back 'reward hacking' agents (01:04:56)
Will we get warning shots? Will they even help? (01:14:41)
The scaling paradox (01:22:09)
Misleading charts from AI companies (01:30:55)
Policy debates should dream much bigger (01:43:04)
Scientific moratoriums have worked before (01:56:04)
Might AI 'go rogue' early on? (02:13:16)
Lamps are regulated much more than AI (02:20:55)
Companies made a strategic error shooting down SB 1047 (02:29:57)
Companies should build in emergency brakes for their AI (02:35:49)
Toby's bottom lines (02:44:32)

Tell us what you thought! https://forms.gle/enUSk8HXiCrqSA9J8

Video editing: Simon Monsour
Audio engineering: Ben Cordell, Milo McGuire, Simon Monsour, and Dominic Armstrong
Music: Ben Cordell
Camera operator: Jeremy Chevillotte
Transcriptions and web: Katy Moore
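The per-question cost of inference-time scaling that the episode describes comes down to simple arithmetic on reasoning tokens. A minimal sketch — the token count and per-million-token price below are illustrative assumptions, not OpenAI's actual figures:

```python
def inference_cost(reasoning_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost of one query that emits `reasoning_tokens` of chain-of-thought output."""
    return reasoning_tokens / 1_000_000 * usd_per_million_tokens

# Assume a reasoning model writes ~10M tokens of scratch work on one hard problem,
# at an assumed $100 per million output tokens (both numbers hypothetical).
cost = inference_cost(10_000_000, 100.0)
print(f"${cost:,.0f} per question")  # prints: $1,000 per question
```

The point of the sketch is that cost now scales with how long the model thinks per query, not just with how big the model was to train.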

Seller Sessions
The Blueprint Show - Unlocking the Future of E-commerce with AI

Seller Sessions

Play Episode Listen Later Jun 19, 2025 52:35


The Blueprint Show - Unlocking the Future of E-commerce with AI

Summary

In this episode of Seller Sessions, Danny McMillan and Andrew Joseph Bell explore the intersection of AI and e-commerce, with a focus on Amazon's technological advancements. They examine Amazon science papers versus patents, discuss challenges with large language models, and highlight the importance of semantic intent in product recommendations. The conversation explores the evolution from keyword optimization to understanding customer purchase intentions, showcasing how AI tools like Rufus are transforming the shopping experience. The hosts provide practical strategies for sellers to optimize listings and harness AI for improved product visibility and sales.

Key Takeaways

- Amazon science papers predict future e-commerce trends.
- AI integration is accelerating in Amazon's ecosystem.
- Understanding semantic intent is crucial for product recommendations.
- The shift from keywords to purchase intentions is significant.
- Rufus enhances the shopping experience with AI planning capabilities.
- Sellers should focus on customer motivations in their listings.
- Creating compelling product content is essential for visibility.
- Custom GPTs can optimize product listings effectively.
- Inference pathways help align products with customer goals.
- Asking the right questions is key to leveraging AI effectively.

Sound Bites

- "Understanding semantic intent is crucial."
- "You can bend AI to your will."
- "Asking the right questions opens doors."

Chapters

00:00 Introduction to Seller Sessions and New Season
00:33 Exploring Amazon Science Papers vs. Patents
01:27 Understanding Rufus and AI in E-commerce
02:52 Challenges in Large Language Models and Product Recommendations
07:09 Research Contributions and Implications for Sellers
10:31 Strategies for Leveraging AI in Product Listings
12:42 The Future of Shopping with AI and Amazon's Innovations
16:14 Practical Examples: Using AI for Product Optimization
22:29 Building Tools for Enhanced E-commerce Experiences
25:38 Product Naming and Features Exploration
27:44 Understanding Inference Pathways in Product Descriptions
30:36 Building Tools for AI Prompting and Automation
38:58 Bending AI to Your Will: Creativity and Imagination
48:10 Practical Applications of AI in Business Automation
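The shift the hosts describe, from keyword matching to semantic intent, is typically implemented by comparing embedding vectors for a shopper's goal and each product. A toy sketch with hand-made 3-dimensional vectors — real systems use high-dimensional, model-generated embeddings, and the products and numbers here are invented for illustration:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: the query shares direction with product A, not B.
query_intent = [0.9, 0.1, 0.3]   # e.g. "gift for a trail runner"
product_a    = [0.8, 0.2, 0.4]   # trail-running headlamp listing
product_b    = [0.1, 0.9, 0.2]   # office chair cushion listing

# A keyword match would miss both; similarity ranks A far above B.
print(cosine_similarity(query_intent, product_a))  # high (~0.98)
print(cosine_similarity(query_intent, product_b))  # low  (~0.27)
```

Writing listings around customer motivations, as the episode recommends, moves a product's embedding closer to the intent vectors shoppers actually express.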

The Data Center Frontier Show
Open Source, AMD GPUs, and the Future of Edge Inference: Vultr's Big AI Bet

The Data Center Frontier Show

Play Episode Listen Later Jun 12, 2025 25:00


In this episode of the Data Center Frontier Show, we sit down with Kevin Cochrane, Chief Marketing Officer of Vultr, to explore how the company is positioning itself at the forefront of AI-native cloud infrastructure, and why it's all-in on AMD GPUs, open-source software, and a globally distributed strategy for the future of inference.

Cochrane begins by outlining the evolution of the GPU market, moving from a scarcity-driven, centralized training era to a new chapter focused on global inference workloads. With enterprises now seeking to embed AI across every application and workflow, Vultr is preparing for what Cochrane calls a "10-year rebuild cycle" of enterprise infrastructure, one that will layer GPUs alongside CPUs across every corner of the cloud.

Vultr's recent partnership with AMD plays a critical role in that strategy. The company is deploying both the MI300X and MI325X GPUs across its 32 data center regions, offering customers optimized options for inference workloads. Cochrane explains the advantages of AMD's chips, such as higher VRAM and power efficiency, which allow large models to run on fewer GPUs, boosting both performance and cost-effectiveness. These deployments are backed by Vultr's close integration with Supermicro, which delivers the rack-scale servers needed to bring new GPU capacity online quickly and reliably.

Another key focus of the episode is ROCm (Radeon Open Compute), AMD's open-source software ecosystem for AI and HPC workloads. Cochrane emphasizes that Vultr is not just deploying AMD hardware; it is fully aligned with the open-source movement underpinning it. He highlights Vultr's ongoing global ROCm hackathons and points to zero-day ROCm support on platforms like Hugging Face as proof of how open standards can catalyze rapid innovation and developer adoption. "Open source and open standards always win in the long run," Cochrane says. "The future of AI infrastructure depends on a global, community-driven ecosystem, just like the early days of cloud."

The conversation wraps with a look at Vultr's growth strategy following its $3.5 billion valuation and recent funding round. Cochrane envisions a world where inference workloads become ubiquitous and deeply embedded into everyday life, from transportation to customer service to enterprise operations. That, he says, will require a global fabric of low-latency, GPU-powered infrastructure. "The world is going to become one giant inference engine," Cochrane concludes. "And we're building the foundation for that today."

Tune in to hear how Vultr's bold moves in open-source AI infrastructure and its partnership with AMD may shape the next decade of cloud computing, one GPU cluster at a time.
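Cochrane's point that higher VRAM lets large models run on fewer GPUs is, at its core, a capacity calculation: weight memory divided by per-GPU memory. A rough sketch — the 1.2x overhead factor is an assumption standing in for KV cache and activations, and real deployments size these budgets more carefully:

```python
import math

def gpus_needed(params_billion: float, bytes_per_param: int,
                vram_gb_per_gpu: int, overhead: float = 1.2) -> int:
    """Minimum GPUs to hold model weights plus a rough runtime overhead."""
    weights_gb = params_billion * bytes_per_param  # 1B params x N bytes ~= N GB
    return math.ceil(weights_gb * overhead / vram_gb_per_gpu)

# A hypothetical 70B-parameter model served in fp16 (2 bytes per parameter):
print(gpus_needed(70, 2, 192))  # 192 GB HBM per GPU (MI300X-class) -> 1
print(gpus_needed(70, 2, 80))   # 80 GB-class GPU -> 3
```

Fewer GPUs per model means fewer inter-GPU hops per token, which is where the performance and cost-effectiveness gains Cochrane cites come from.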

You Are Not So Smart
315 - May Contain Lies - Alex Edmans

You Are Not So Smart

Play Episode Listen Later Jun 9, 2025 39:43


Alex Edmans, a professor of finance at London Business School, tells us how to avoid the Ladder of Misinference by examining how narratives, statistics, and articles can mislead, especially when they align with our preconceived notions and confirm what we believe is true, assume is true, and wish were true.

Alex Edmans
May Contain Lies
What to Test in a Post Trust World
How Minds Change
David McRaney's Twitter
David McRaney's BlueSky
YANSS Twitter
YANSS Facebook
Newsletter
Kitted
Patreon

The Circuit
Episode 120: NVIDIA Earnings, The Future of AI inference, China AI and more

The Circuit

Play Episode Listen Later Jun 2, 2025 43:07


In this conversation, Jay Goldberg and Austin Lyons discuss Nvidia's recent earnings report, the future of AI and inference, and the dynamics of the AI market, including the impact of China on Nvidia's revenue. They explore the differences between consumer and enterprise workloads, the role of financing in AI server sales, and the challenges of realizing ROI from AI investments. The discussion also touches on real-world applications of AI in business and the future of AI integration in consumer products.

Hope for Anxiety and OCD
174. Is ICBT Right for Me? How Do I Know?

Hope for Anxiety and OCD

Play Episode Listen Later May 21, 2025 27:37


In this episode, Carrie explores whether Inference-based Cognitive Behavioral Therapy (ICBT) is a good fit for individuals struggling with OCD, especially those who haven't found success with exposure and response prevention (ERP).

Episode Highlights:
- The key differences between ERP and ICBT, and why ICBT may be a better fit for certain individuals with OCD.
- How ICBT helps unpack the reasoning behind obsessions rather than just managing behaviors.
- Why ICBT can be especially valuable for Christians seeking faith-sensitive OCD treatment.
- The limitations and challenges of ERP, including dropout rates and religious exposure concerns.
- What it takes to succeed with ICBT, including a willingness to deeply engage with the learning and healing process.

Join the waitlist for the Christians Learning ICBT training: https://carriebock.com/training/
Explore Carrie's services and courses: carriebock.com/services/ and carriebock.com/resources/
Follow us on Instagram: www.instagram.com/christianfaithandocd/ and like our Facebook page: https://www.facebook.com/christianfaithandocd for the latest updates and sneak peeks.

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 524: Agentic AI Done Right - How to avoid missing out or messing up.

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later May 13, 2025 18:33


Agentic AI is as daunting as it is dynamic. So… how do you not screw it up? After all, the more robust and complex agentic AI becomes, the more room there is for error.

Luckily, we've got Dr. Maryam Ashoori to guide our agentic ways. Maryam is the Senior Director of Product Management of watsonx at IBM. She joined us at IBM Think 2025 to break down agentic AI done right.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Have a question? Join the convo here.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
Agentic AI Benefits for Enterprises
watsonx's New Features & Announcements
AI-Powered Enterprise Solutions at IBM
Responsible Implementation of Agentic AI
LLMs in Enterprise Cost Optimization
Deployment and Scalability Enhancements
AI's Impact on Developer Productivity
Problem-Solving with Agentic AI

Timestamps:
00:00 AI Agents: A Business Imperative
06:14 "Optimizing Enterprise Agent Strategy"
09:15 Enterprise Leaders' AI Mindset Shift
09:58 Focus on Problem-Solving with Technology
13:34 "Boost Business with LLMs"
16:48 "Understanding and Managing AI Risks"

Keywords: Agentic AI, AI agents, Agent lifecycle, LLMs taking actions, watsonx.ai, Product management, IBM Think conference, Business leaders, Enterprise productivity, watsonx platform, Custom AI solutions, Environmental Intelligence Suite, Granite Code models, AI-powered code assistant, Customer challenges, Responsible AI implementation, Transparency and traceability, Observability, Optimization, Larger compute, Cost performance optimization, Chain of thought reasoning, Inference time scaling, Deployment service, Scalability of enterprise, Access control, Security requirements, Non-technical users, AI-assisted coding, Developer time-saving, Function calling, Tool calling, Enterprise data integration, Solving enterprise problems, Responsible implementation, Human in the loop, Automation, IBM savings, Risk assessment, Empowering workforce.

Send Everyday AI and Jordan a text message. (We can't reply back unless you leave contact info)

This Week in Startups
What's Next for AI Infrastructure with Amin Vahdat | AI Basics with Google Cloud

This Week in Startups

Play Episode Listen Later May 1, 2025 27:34


In this episode of AI Basics, Jason sits down with Amin Vahdat, VP of ML at Google Cloud, to unpack the mind-blowing infrastructure behind modern AI. They dive into how Google's TPUs power massive queries, why 2025 is the "Year of Inference," and how startups can now build what once felt impossible. From real-time agents to exponential speed gains, this is a look inside the AI engine that's rewriting the future.

Timestamps:
(0:00) Jason introduces today's guest Amin Vahdat
(3:18) Data movement implications for founders and historical bandwidth perspective
(5:29) The shift to inference and AI infrastructure trends in startups and enterprises
(8:40) Evolution of productivity and potential of low-code/no-code development
(11:20) AI infrastructure pricing, cost efficiency, and historical innovation
(17:53) Google's TPU technology and infrastructure scale
(23:21) Building AI agents for startup evaluation and supervised associate agents
(26:08) Documenting decisions for AI learning and early AI agent development

Uncover more valuable insights from AI leaders in Google Cloud's 'Future of AI: Perspectives for Startups' report. Discover what 23 AI industry leaders think about the future of AI, and how it impacts your business. Read their perspectives here: https://goo.gle/futureofai

Check out all of the Startup Basics episodes here: https://thisweekinstartups.com/basics
Check out Google Cloud: https://cloud.google.com/

Follow Amin:
LinkedIn: https://www.linkedin.com/in/vahdat/?trk=public_post_feed-actor-name

Follow Jason:
X: https://twitter.com/Jason
LinkedIn: https://www.linkedin.com/in/jasoncalacanis

Follow TWiST:
Twitter: https://twitter.com/TWiStartups
YouTube: https://www.youtube.com/thisweekin
Instagram: https://www.instagram.com/thisweekinstartups
TikTok: https://www.tiktok.com/@thisweekinstartups
Substack: https://twistartups.substack.com