Podcasts about Inference

  • 575 PODCASTS
  • 1,027 EPISODES
  • 42m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Mar 4, 2026 LATEST
Inference

POPULARITY (trend chart, 2019-2026)


Best podcasts about Inference

Show all podcasts related to inference

Latest podcast episodes about Inference

Stocks for Beginners
The AI Compute Shift: Why Inference Could Change Everything for Nvidia, Intel & AMD

Stocks for Beginners

Mar 4, 2026 · 47:44


Dive into the heart of the AI revolution with Gary Brode from Deep Knowledge Investing. In this episode, we unravel the complex world of the semiconductors that power AI: from Nvidia's GPU dominance to ARM-based innovations, Intel and AMD's CPU roles, and the massive energy demands of data centres. Learn about key deals like the Nvidia-Meta collaboration, investment risks in hyperscalers, and opportunities in nuclear energy and uranium. Perfect for investors navigating the AI boom.

Shares for Beginners
The AI Compute Shift: Why Inference Could Change Everything for Nvidia, Intel & AMD

Shares for Beginners

Mar 4, 2026 · 47:46


Dive into the heart of the AI revolution with Gary Brode from Deep Knowledge Investing. In this episode, we unravel the complex world of the semiconductors that power AI: from Nvidia's GPU dominance to ARM-based innovations, Intel and AMD's CPU roles, and the massive energy demands of data centres. Learn about key deals like the Nvidia-Meta collaboration, investment risks in hyperscalers, and opportunities in nuclear energy and uranium. Perfect for investors navigating the AI boom.

The Tech Blog Writer Podcast
From Core To Edge: Akamai On Where AI Inference Must Live Next

The Tech Blog Writer Podcast

Mar 3, 2026 · 27:40


What if the real AI race in 2026 isn't about building bigger models, but about where decisions are made, how fast they happen, and whether they deliver measurable value? In this episode, I'm joined by John Bradshaw, Director of Cloud Computing Technology and Strategy at Akamai, to unpack his predictions for the next phase of cloud, AI inference, and the economics that will shape enterprise technology over the next 12 months. As organizations move beyond experimentation, John explains why the boardroom conversation has shifted from capability to return on investment, and how spiraling compute demands are forcing leaders to rethink the balance between performance, cost, and innovation. We explore why this new financial scrutiny is not slowing AI adoption, but refining it. John shares how inefficient GPU workflows, centralized inference, and poorly aligned architectures are being challenged by a more disciplined approach that pushes intelligence closer to the edge. This shift is not only about latency and performance. It is about building scalable, value-driven platforms that can support real-time decision-making, agentic workloads, and global user experiences without breaking traditional IT budgets. Trust is another major theme throughout our conversation. From the rise of everyday AI agents that quietly handle routine tasks to the growing importance of secure, resilient inference pipelines, John outlines how low-latency edge infrastructure, local processing, and hybrid cloud models will redefine reliability for both enterprises and consumers. We also discuss the smart home backlash following recent outages, and why the next generation of connected products will be designed to work even when the network does not. The episode also looks at the future of streaming, where consolidation, intelligent content delivery, and AI-driven personalization are reshaping both the user experience and the economics behind the platforms. Behind the scenes, orchestration is emerging as a defining capability, with multiple models and services working together to validate outputs, reduce hallucinations, and create more dependable AI systems. This is a conversation about moving from possibility to production, from experimentation to accountability, and from centralized architectures to distributed intelligence. So as AI becomes embedded in every workflow and every customer interaction, will the winners be the companies with the biggest models, or the ones that know exactly where their AI should live, how it should be orchestrated, and how it proves its value every single day?

The Deductionist Podcast
The Inference Cycle: How to Think Like an Elite Investigator

The Deductionist Podcast

Feb 27, 2026 · 25:13


Most people don't investigate. They react. In this episode, we break down the Inference Cycle, the psychological defence system elite investigators use to prevent confirmation bias, emotional reasoning, and premature certainty. From early inquisitorial systems to Joseph Bell (the real-life inspiration for Sherlock Holmes), we explore how structured reasoning replaced accusation, and why that matters now more than ever. You'll learn: • Why suspicion is not a verdict • How to build falsifiable hypotheses • The danger of narrative seduction • Why evidence must be designed before it's collected • How cognitive dissonance corrupts smart people • The psychological discipline Sherlock Holmes actually represents. This is not about memorizing facts. It's about training your character to tolerate ambiguity. As Holmes said: “It is a capital mistake to theorize before one has data.” If you want sharper thinking, better judgment, and intellectual humility under pressure, this episode is for you. Access the free tier or go deeper with exclusive paid challenges: https://www.omniscient-insights.com/axiom https://www.omniscient-insights.com/community-home MERCH -- https://the-deductionist.myspreadshop.co.uk/all E-SCAPE GAME -- https://www.youtube.com/@thedeductionistteam Everything else you need -- https://linktr.ee/bencardall Music provided by https://robertjohncollinsmusic.com/ #sherlock #deduction #mystery

JSA Podcasts for Telecom and Data Centers
Kansas' First Neutral IX + AI Inference at the Edge | Connected Nation IXP at MetroConnect 2026

JSA Podcasts for Telecom and Data Centers

Feb 25, 2026 · 8:05


The Cloud Pod
344: Amazon's Coding Bot Bites the Hand That Runs It

The Cloud Pod

Feb 24, 2026 · 61:30


Welcome to episode 344 of The Cloud Pod, where the forecast is always cloudy! Justin is out of the office at a World of Warcraft Tournament (not really), and Ryan is pursuing his lifelong dream of becoming a roadie for The Eagles (maybe?), so it's Jonathan and Matt holding down the fort this week, and they've got a ton of cloud news for you! From security to AI assistants, we've got all the news you need. Let's get started!  Titles we almost went with this week Zero Bus, All Gas, No Kafka Brakes AI Coding Bot Bites the Hand That Runs It When Your Robot Developer Goes Rogue on AWS Kubernetes VPA Finally Stops Evicting Your Database Pods Google Trains 100 Million People, Still No One Reads the Docs  MCP Walks Into a Bar Not Enterprise Ready Yet No More Pod Evictions Kubernetes 1.35 Scales In Place No Keys No Drama Just IAM and Cloud SQL One Agent to Rule Them All in Kubernetes IAM Tired of Writing Policies Manually When Your AI Coding Tool Has Delete Permissions One Dashboard to Rule All Your GPU Clusters Serverless Reservations Prove Nothing Is Truly Free Range Kiro Takes the Wheel on AWS IAM Policies Stop Blaming Backups for Your Bad Architecture AI Agent Goes Rogue, Takes AWS Down With It Everything is Bigger in Texas Except the Water Usage OpenAI launches the college basketball of Inference. Pro service – low cost General News  1:05 Code Mode: give agents an entire API in 1,000 tokens Cloudflare's Code Mode MCP server reduces token consumption by 99.9% compared to a traditional MCP implementation, exposing the entire Cloudflare API (over 2,500 endpoints) through just two tools, search() and execute(), using roughly 1,000 tokens versus 1.17 million for a conventional approach. The architecture works by having the AI agent write JavaScript code against a typed OpenAPI spec representation, rather than loading tool definitions into context, with code executing inside a sandboxed V8 isolate (Dynamic Worker) that restricts file system access, environment variables, and external fetches by default. This approach addresses a fundamental constraint in agentic AI systems: adding more tools to give agents broader capabilities directly competes with the available context space for the task at hand. 01:41 Jonathan- “It's good. I'm not sure I could imagine 2 ½ thousand MCP tool definitions in a context window and still actually use it for anything.”    AI Is Going Great – Or How ML Makes Money  03:58 OpenClaw creator Peter Steinberger joins OpenAI Peter Steinberger, creator of viral AI assistant OpenClaw (formerly Clawdbot/Moltbot), has joined OpenAI.
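For readers new to the pattern described above, here is a minimal, hypothetical Python sketch of the "two tools instead of thousands" idea behind Code Mode: the agent only ever sees a search() tool and an execute() tool, and writes small programs against whatever slice of the API it discovers, instead of loading every endpoint definition into its context. The toy endpoint catalogue and function names are assumptions for illustration, not Cloudflare's actual implementation (which runs agent-written JavaScript inside a sandboxed V8 isolate).

```python
# Illustrative sketch only: two generic tools standing in for a huge API surface.
# This is NOT Cloudflare's Code Mode implementation; names and the tiny spec are made up.

API_SPEC = {
    "zones.list": "GET /zones -> list zones for the account",
    "dns.create_record": "POST /zones/{id}/dns_records -> create a DNS record",
    # ...in a real system, thousands of endpoints would live here, outside the model's context
}

def search(query: str) -> dict:
    """Tool 1: return only the spec entries relevant to the agent's current query."""
    return {name: doc for name, doc in API_SPEC.items() if query.lower() in doc.lower()}

def execute(code: str) -> object:
    """Tool 2: run a short agent-written snippet in a restricted namespace.
    A production system would sandbox this far more aggressively (e.g. an isolate)."""
    namespace = {"search": search}
    exec(code, {"__builtins__": {}}, namespace)  # no builtins, no file or network access
    return namespace.get("result")

# The agent's prompt now needs only these two tool definitions (a few hundred tokens)
# rather than one definition per endpoint.
print(execute("result = search('dns')"))
```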

Intelligence with Everyone: RL @ MiniMax, with Olive Song, from AIE NYC & Inference by Turing Post

Feb 22, 2026 · 55:29


Olive Song from MiniMax shares how her team trains the M series frontier open-weight models using reinforcement learning, tight product feedback loops, and systematic environment perturbations. This crossover episode weaves together her AI Engineer Conference talk and an in-depth interview from the Inference podcast. Listeners will learn about interleaved thinking for long-horizon agentic tasks, fighting reward hacking, and why they moved RL training to FP32 precision. Olive also offers a candid look at debugging real-world LLM failures and how MiniMax uses AI agents to track the fast-moving AI landscape. Use the Granola Recipe Nathan relies on to identify blind spots across conversations, AI research, and decisions: https://bit.ly/granolablindspot LINKS: Conference Talk (AI Engineer, Dec 2025) – https://www.youtube.com/watch?v=lY1iFbDPRlw Interview (Turing Post, Jan 2026) – https://www.youtube.com/watch?v=GkUMqWeHn40 Sponsors: Claude: Claude is the AI collaborator that understands your entire workflow, from drafting and research to coding and complex problem-solving. Start tackling bigger problems with Claude and unlock Claude Pro's full capabilities at https://claude.ai/tcr Tasklet: Tasklet is an AI agent that automates your work 24/7; just describe what you want in plain English and it gets the job done. Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai CHAPTERS: (00:00) About the Episode (04:15) Minimax M2 presentation (Part 1) (17:59) Sponsors: Claude | Tasklet (21:22) Minimax M2 presentation (Part 2) (21:26) Research life and culture (26:27) Alignment, safety and feedback (32:01) Long-horizon coding agents (35:57) Open models and evaluation (43:29) M2.2 and researcher goals (48:16) Continual learning and AGI (52:58) Closing musical summary (55:49) Outro PRODUCED BY: https://aipodcast.ing SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://linkedin.com/in/nathanlabenz/ Youtube: https://youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk

DH Unplugged
DHUnplugged #791: AI Overload

DH Unplugged

Feb 18, 2026 · 70:35


Self Created Valuation Boosts Apple Announces new Podcast push AI – A breakdown Playing them like a fiddle – Warner Brothers PLUS we are now on Spotify and Amazon Music/Podcasts! Click HERE for Show Notes and Links DHUnplugged is now streaming live - with listener chat. Click on the link on the right sidebar. Love the Show? Then how about a Donation? Follow John C. Dvorak on Twitter Follow Andrew Horowitz on Twitter Warm-Up - A NEW CTP just announced - China releasing new AI models - AI - A breakdown - we are on overload - Big Employment news.... Markets - Self Created Valuation Boosts - Apple Announces new Podcast push - Playing them like a fiddle - Warner Brothers Quick Note - Going to rip up the playbook on something this week on TDI Podcast. Anyone who owns an annuity should listen to what is about to come on next Sunday's show.....  No Agenda... Olympics - Anything to discuss? MONEY FOR ALL - The average tax refund is 10.9% higher so far this season, compared to about the same point in 2025, according to early filing data from the IRS. - The 2026 tax season opened Jan. 26, and the average refund amount was $2,290 as of Feb. 6, up from $2,065 about one year prior, the IRS reported Friday night. - As of Feb. 6, the total amount refunded was more than $16.9 billion, up 1.9% compared to last year, according to the IRS release. That figure reflects current-year returns only. - This is partly because there were excess withholdings from last year when the rules changed and paycheck withholdings were not adjusted. This is a one-time situation. Employment - 4.3% - "Better" than expected payrolls number - A major revision was released last Wednesday. Overall 2025 job growth was much weaker than initially reported. The total net change for the full year 2025 was revised down from +584,000 jobs to just +181,000 jobs (seasonally adjusted) — an average of only about 15,000 jobs added per month instead of ~49,000. This made 2025 one of the weakest years for job creation in recent non-recession periods. - Employment levels were consistently overstated throughout 2025 by roughly 800,000 to over 1 million jobs, peaking around mid-year. For example: By March 2025, the level was revised down by 898,000. By December 2025 (preliminary), down by 1,029,000. - Monthly changes were also adjusted downward in most cases (e.g., August's originally reported -26,000 became a larger loss of -70,000; September's +108,000 became +76,000). - The revisions reflect normal annual benchmarking, but this one was unusually large (larger than the typical 0.2% average over the prior decade), likely due to factors like overestimation of business births or other data mismatches. - In short, the data reveals that the U.S. labor market in 2025 was significantly softer than the monthly headlines suggested at the time — job growth was overstated by a substantial margin, painting a much weaker employment picture for the year. AI Updates - While U.S. markets have been focused on the impact of Anthropic and Altruist's tools on software and financial services, China's tech giants have released AI models this week that have shown advancements in robotics and video generation. - Google is reporting that China's AI models are just MONTHS behind western models - However - is this progress? In a video demo, Alibaba showed a robot with pincers for hands that appeared to be able to count oranges, pick them up and place them in a basket. It was also shown taking milk out of a fridge. 
- Alibaba on Monday unveiled a new artificial intelligence model, Qwen 3.5, designed to execute complex tasks independently, with big improvements in performance and cost that the Chinese tech giant claims beat major U.S. rival models on several benchmarks. - Zhipu AI — which trades as Knowledge Atlas Technology in Hong Kong — said the model approaches Anthropic's Claude Opus 4.5 in coding benchmarks while surpassing Google's Gemini 3 Pro on some tests. - Shares of MiniMax also jumped Thursday after it launched its updated M2.5 open-source model with enhanced AI agent tools. Grok Update - Grok, Elon Musk's AI chatbot, has been gaining ground in the U.S. over the past months, data showed, even as it draws global censure and regulatory scrutiny after being used to generate a wave of non-consensual sexualized images of women and minors. - U.S. market share of the tool rose to 17.8% last month from 14% in December, and 1.9% in January 2025, according to data from research firm Apptopia. - Men are still the largest % users of Grok ~ 78% (down from 89% in April 2025) AI Market Share - ChatGPT's share slumped to 52.9% last month from 80.9% in January last year, while Gemini's grew to 29.4% from 17.3% over the same period. AI Market Share Infographic and AI Understanding - Have we gone through this? - At its core, AI is technology that lets machines perform tasks that normally require human intelligence — things like understanding language, recognizing images, making decisions, or solving problems. - Modern AI (especially since ~2022) is dominated by machine learning — systems that learn patterns from huge amounts of data instead of being explicitly programmed rule-by-rule. - Inference is the "using" or "applying" phase of AI — when a trained model takes new input and produces an output / prediction / answer. Contrast with training (the "learning" phase): ------ Training → Like a student studying for years: very compute-heavy, expensive, done once (or rarely) on massive servers/GPUs, adjusts billions of parameters based on examples. ------ Inference → Like the student taking a test or doing their job: much faster, cheaper, runs on your phone/laptop/cloud, uses the fixed knowledge from training to respond instantly. - Agentic AI takes regular AI (like chat models) to the next level: instead of just answering questions or generating text, these systems act autonomously to achieve goals with minimal human help. "Agentic" comes from "agency" — the ability to make decisions, plan, use tools, take actions, adapt, and even learn from results — like a smart digital employee rather than just a smart answer machine. AI Infographic Last AI Item - A shortage of memory chips is hammering profits, derailing corporate plans, and inflating price tags on various products, with the crunch expected to get worse. - The fundamental reason for the squeeze is the buildout of AI data centers, with companies like Alphabet and OpenAI buying up large shares of memory chip production, leaving consumer electronics producers fighting over a dwindling supply. - The resulting price spikes are causing concern, with some warning of "RAMmageddon" and others predicting that memory chip prices will go "parabolic", bringing lavish profits to some companies but painful prices to the rest of the electronics sector. 
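To make the training-versus-inference contrast in these show notes concrete, here is a minimal, illustrative PyTorch-style sketch (the tiny model, random data, and hyperparameters are placeholders, not any real system): training repeatedly adjusts the parameters, while inference simply runs a forward pass with the learned weights held fixed.

```python
# Minimal illustration of the training vs. inference contrast described above.
# The tiny model and random data are placeholders, not any real system.
import torch
import torch.nn as nn

model = nn.Linear(16, 2)                      # stand-in for a model with billions of parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# --- Training: compute-heavy, done up front, updates the weights ---
for _ in range(1000):
    x = torch.randn(32, 16)                   # a batch of training examples
    y = torch.randint(0, 2, (32,))            # their labels
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()                           # gradients adjust the parameters
    optimizer.step()

# --- Inference: cheap per call, weights stay fixed ---
model.eval()
with torch.no_grad():                         # no gradients, no parameter updates
    new_input = torch.randn(1, 16)
    prediction = model(new_input).argmax(dim=1)
print(prediction)
```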
Here is something: - Gallup will no longer track presidential approval ratings after nearly 90 years - Founded by George Gallup in 1935, the Washington, DC-based management company began tracking the president's job performance 88 years ago. - Gallup told USA TODAY it will no longer publish "favorability ratings of political figures," a decision it said "reflects an evolution in how Gallup focuses its public research and thought leadership." - Gallup said the ratings are now "widely produced, aggregated and interpreted, and no longer represent an area where Gallup can make its most distinctive contribution." - "Our commitment is to long-term, methodologically sound research on issues and conditions that shape people's lives," the company wrote, adding that its work will continue through the Gallup Poll Social Series, the Gallup Quarterly Business Review, the World Poll and more. - Seems like they are unable to SHAPE opinion due to social media etc.....? Apple Podcast Update - Big news! - Apple on Monday announced that it will bring a new integrated video podcast experience to Apple Podcasts this spring. - The move comes as video viewership continues to reshape podcasting. About 37% of people over age 12 watch video podcasts monthly, according to Edison Research. - The update brings Apple Podcasts more in line with its competitors Spotify, YouTube and now Netflix, which have increasingly leaned into video podcasting. - “Twenty years ago, Apple helped take podcasting mainstream by adding podcasts to iTunes, and more than a decade ago, we introduced the dedicated Apple Podcasts app,” said Eddy Cue, Apple's senior vice president of Services, in a statement. “By bringing a category-leading video experience to Apple Podcasts, we're putting creators in full control of their content and how they build their businesses, while making it easier than ever for audiences to listen to or watch podcasts.” M&A - Texas Instruments Inc. has reached an agreement to buy Silicon Laboratories Inc. for about $7.5 billion, deepening its exposure to several markets for chips. - Silicon Labs investors will receive $231 in cash for each share of the company's common stock and the transaction is expected to close in the first half of 2027. - The transaction still needs to win approval from Silicon Labs investors, and shares of Silicon Labs surged by 51% to $206.48 after the announcement. Inflation - This helps - PepsiCo will cut prices on core brands such as Lay's and Doritos by up to 15% following a consumer backlash against several previous price hikes, the snacks and beverage maker said on Tuesday after it topped estimates for fourth-quarter results. Miran - Moving - Federal Reserve Governor Stephen Miran is leaving his post as chair of the Council of Economic Advisers, CNBC has confirmed. - He joined the CEA in January 2025, but had been on leave from that post since last September when he filled the unexpired term of former Fed Governor Adriana Kugler. - He remains on the Fed board No Biggie???? - There are some astonishing cases being reported of bad AI in the operating room - JNJ's TruDi Navigation System - Since AI was added to the device, the FDA has received unconfirmed reports of at least 100 malfunctions and adverse events. - At least 10 people were injured between late 2021 and November 2025, according to the reports. Most allegedly involved errors in which the TruDi Navigation System misinformed surgeons about the location of their instruments while they were using them inside patients' heads during operations. 
- Cerebrospinal fluid reportedly leaked from one patient's nose. In another reported case, a surgeon mistakenly punctured the base of a patient's skull. In two other cases, patients each allegedly suffered strokes after a major artery was accidentally injured. Cuba - The main airport has put out a bulletin that they are out of jet fuel - Blackouts and lack of other fuels are creating big problems - No airlines have stopped running at this point, but many will as they cannot refuel - This is a bigger problem for cargo planes (supplies) that may not be able to risk flying to Cuba as they will not be able to get out. Dalio Warning - Legendary investor Ray Dalio said on Tuesday the world was “on the brink” of a capital war. - He said central banks and sovereign wealth funds were already preparing for measures like foreign exchange and capital controls. - "When money is weaponized using measures like trade embargoes, blocking access to capital markets, or using ownership of debt as leverage." - “Capital, money, matters,” Dalio said Tuesday. “We're seeing capital controls … taking place all over the world today, and who will experience that is questionable. So, we are on the brink — that doesn't mean we are in [a capital war now], but it means that it's a logical concern.” - Could this be why gold and silver are being hoarded (physical assets over digital currency)? - Is China's edict to banks to diversify away from US Treasuries a sign? Self Boosted Valuation - Waymo is aiming to raise about $16 billion in a financing round that would value it at nearly $110 billion, Bloomberg News reported, citing people familiar with the matter. - Alphabet would provide about $13 billion to the autonomous driving firm while the rest would come from investors including Sequoia Capital, DST Global and Dragoneer Investment Group, the report added. - Soooooo - Waymo is a unit of Alphabet.... Alphabet providing 80% of the funding that boosts valuations..... Hmmmmmmmm Warner Brothers - Warner Bros Discovery Inc is considering reopening sale talks with Paramount Skydance Corp after receiving its amended offer. - The Warner Bros board is discussing whether Paramount could offer a path to a superior deal, which may ignite a second bidding war with Netflix Inc. - Paramount submitted amended terms that addressed several concerns, including covering a fee owed to Netflix and offering to backstop a Warner Bros debt refinancing. Economics Coming Up - Short Week - plenty of Reports - Wednesday - Durable Goods, Housing Starts, Industrial Production, FOMC Minutes - Thursday - Philly Fed, Initial Claims - Friday: PCE, Personal Income and Spending, GDP for Q4 (3.6%) ----- New Home Sales, UMich Feb Final   Love the Show? Then how about a Donation? ANNOUNCING THE CLOSEST TO THE PIN for CATERPILLAR Winners will be getting great stuff like the new "OFFICIAL" DHUnplugged Shirt!     FED AND CRYPTO LIMERICKS   See this week's stock picks HERE Follow John C. Dvorak on Twitter Follow Andrew Horowitz on Twitter

Scientific Sense ®
Prof. Andrew Jaffe of Imperial College on the Random Universe

Scientific Sense ®

Feb 14, 2026 · 66:04


Scientific Sense ® by Gill Eapen: Prof. Andrew Jaffe is professor of astrophysics and cosmology at Imperial College, London. He is the Director of the Imperial Centre for Inference and Cosmology. He studies the history and evolution of the Universe as a whole. Please subscribe to this channel: https://www.youtube.com/c/ScientificSense?sub_confirmation=1

Azeem Azhar's Exponential View
Inside the economics of OpenAI (exclusive research)

Azeem Azhar's Exponential View

Feb 13, 2026 · 49:46


Welcome to Exponential View, the show where I explore how exponential technologies such as AI are reshaping our future. I've been studying AI and exponential technologies at the frontier for over ten years. Each week, I share some of my analysis or speak with an expert guest to shed light on a particular topic. To keep up with the Exponential transition, subscribe to this channel or to my newsletter: https://www.exponentialview.co/ ---- In this episode, I'm joined by Jaime Sevilla, founder of Epoch AI; Hannah Petrovic from my team at Exponential View; and financial journalist Matt Robinson from AI Street. Together we investigate a fundamental question: do the economics of AI companies actually work? We analysed OpenAI's financials from public data to examine whether their revenues can sustain the staggering R&D costs of frontier models. The findings reveal a picture far more precarious than many assume; we also explore where the real infrastructure bottlenecks lie, why compute demand will dwarf energy constraints, and what the rise of long-running agentic workloads means for the entire industry. Read the study here: https://www.exponentialview.co/p/inside-openais-unit-economics-epoch-exponentialview We covered: (00:00) Do the economics of frontier AI actually work? (02:48) Piecing together OpenAI's finances from public data (05:24) GPT-5's "rapidly depreciating asset" problem (13:25) Why OpenAI is flirting with ads (17:31) If you were Sam Altman, what would you do differently? (22:54) Energy vs. GPUs; where the real infrastructure bottleneck lies (29:15) What surging compute demand actually looks like (33:12) The most surprising finding from the research (38:02) The race to avoid commoditization (43:35) Agents that outlive their models  Where to find me: Exponential View newsletter: https://www.exponentialview.co/ Website: https://www.azeemazhar.com/ LinkedIn: https://www.linkedin.com/in/azhar/ Twitter/X: https://x.com/azeem  Where to find Jaime: https://epoch.ai or https://epochai.substack.com Where to find Matt: https://www.ai-street.co  Production by supermix.io and EPIIPLUS1 Production and research: Chantal Smith and Marija Gavrilov. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

The Dissenter
#1215 Mauricio Suárez - Inference and Representation: A Study in Modeling Science

The Dissenter

Feb 13, 2026 · 52:40


******Support the channel******Patreon: https://www.patreon.com/thedissenterPayPal: paypal.me/thedissenterPayPal Subscription 1 Dollar: https://tinyurl.com/yb3acuuyPayPal Subscription 3 Dollars: https://tinyurl.com/ybn6bg9lPayPal Subscription 5 Dollars: https://tinyurl.com/ycmr9gpzPayPal Subscription 10 Dollars: https://tinyurl.com/y9r3fc9mPayPal Subscription 20 Dollars: https://tinyurl.com/y95uvkao ******Follow me on******Website: https://www.thedissenter.net/The Dissenter Goodreads list: https://shorturl.at/7BMoBFacebook: https://www.facebook.com/thedissenteryt/Twitter: https://x.com/TheDissenterYT This show is sponsored by Enlites, Learning & Development done differently. Check the website here: http://enlites.com/ Dr. Mauricio Suárez is Full Professor (catedrático) in Logic and Philosophy of Science at Universidad Complutense de Madrid. He is also a life member at Clare Hall at Cambridge University. His main research interests lie in the philosophy of probability and causality, the history and philosophy of science (mainly physics, chemistry and biology), modeling and idealization, the aesthetics of scientific representation, and general epistemology and methodology of science. He is the author of Inference and Representation: A Study in Modeling Science. In this episode, we focus on Inference and Representation. We start by talking about modeling in science. We then explore the concept of representation. We talk about the flaws of reductive naturalist theories of scientific representation, and an inferential conception of scientific representation. Finally, we discuss how our exploration of scientific representation connects to debates on artistic representation.--A HUGE THANK YOU TO MY PATRONS/SUPPORTERS: PER HELGE LARSEN, JERRY MULLER, BERNARDO SEIXAS, ADAM KESSEL, MATTHEW WHITINGBIRD, ARNAUD WOLFF, TIM HOLLOSY, HENRIK AHLENIUS, ROBERT WINDHAGER, RUI INACIO, ZOOP, MARCO NEVES, COLIN HOLBROOK, PHIL KAVANAGH, SAMUEL ANDREEFF, FRANCIS FORDE, TIAGO NUNES, FERGAL CUSSEN, HAL HERZOG, NUNO MACHADO, JONATHAN LEIBRANT, JOÃO LINHARES, STANTON T, SAMUEL CORREA, ERIK HAINES, MARK SMITH, JOÃO EIRA, TOM HUMMEL, SARDUS FRANCE, DAVID SLOAN WILSON, YACILA DEZA-ARAUJO, ROMAIN ROCH, YANICK PUNTER, CHARLOTTE BLEASE, NICOLE BARBARO, ADAM HUNT, PAWEL OSTASZEWSKI, NELLEKE BAK, GUY MADISON, GARY G HELLMANN, SAIMA AFZAL, ADRIAN JAEGGI, PAULO TOLENTINO, JOÃO BARBOSA, JULIAN PRICE, HEDIN BRØNNER, FRANCA BORTOLOTTI, GABRIEL PONS CORTÈS, URSULA LITZCKE, SCOTT, ZACHARY FISH, TIM DUFFY, SUNNY SMITH, JON WISMAN, WILLIAM BUCKNER, LUKE GLOWACKI, GEORGIOS THEOPHANOUS, CHRIS WILLIAMSON, PETER WOLOSZYN, DAVID WILLIAMS, DIOGO COSTA, ALEX CHAU, CORALIE CHEVALLIER, BANGALORE ATHEISTS, LARRY D. 
LEE JR., OLD HERRINGBONE, MICHAEL BAILEY, DAN SPERBER, ROBERT GRESSIS, JEFF MCMAHAN, JAKE ZUEHL, MARK CAMPBELL, TOMAS DAUBNER, LUKE NISSEN, KIMBERLY JOHNSON, JESSICA NOWICKI, LINDA BRANDIN, VALENTIN STEINMANN, ALEXANDER HUBBARD, BR, JONAS HERTNER, URSULA GOODENOUGH, DAVID PINSOF, SEAN NELSON, MIKE LAVIGNE, JOS KNECHT, LUCY, MANVIR SINGH, PETRA WEIMANN, CAROLA FEEST, MAURO JÚNIOR, 航 豊川, TONY BARRETT, NIKOLAI VISHNEVSKY, STEVEN GANGESTAD, TED FARRIS, HUGO B., JAMES, JORDAN MANSFIELD, CHARLOTTE ALLEN, PETER STOYKO, DAVID TONNER, LEE BECK, PATRICK DALTON-HOLMES, NICK KRASNEY, RACHEL ZAK, AND DENNIS XAVIER!A SPECIAL THANKS TO MY PRODUCERS, YZAR WEHBE, JIM FRANK, ŁUKASZ STAFINIAK, TOM VANEGDOM, BERNARD HUGUENEY, CURTIS DIXON, BENEDIKT MUELLER, THOMAS TRUMBLE, KATHRINE AND PATRICK TOBIN, JONCARLO MONTENEGRO, NICK GOLDEN, CHRISTINE GLASS, IGOR NIKIFOROVSKI, PER KRAULIS, AND JOSHUA WOOD!AND TO MY EXECUTIVE PRODUCERS, MATTHEW LAVENDER, SERGIU CODREANU, ROSEY, AND GREGORY HASTINGS!

TechCrunch Startups – Spoken Edition
Didero lands $30M to put manufacturing procurement on ‘agentic' autopilot; plus, AI inference startup Modal Labs in talks to raise at $2.5B valuation, sources say

TechCrunch Startups – Spoken Edition

Feb 13, 2026 · 6:37


Didero functions as an agentic AI layer that sits on top of a company's existing ERP, acting as a coordinator that reads incoming communications and automatically executes the necessary updates and tasks. Also, General Catalyst is in talks to lead Modal Labs' next round for the four-year-old startup, according to our sources. Learn more about your ad choices. Visit podcastchoices.com/adchoices

The Data Exchange with Ben Lorica
Breaking the Memory Wall in the Age of Inference

The Data Exchange with Ben Lorica

Feb 12, 2026 · 45:43


In this episode, Sid Sheth, founder and CEO of d-matrix, discusses the company's approach to AI inference hardware with a focus on solving the memory bottleneck problem. Subscribe to the Gradient Flow Newsletter

Learning Bayesian Statistics
151 Diffusion Models in Python, a Live Demo with Jonas Arruda

Learning Bayesian Statistics

Feb 12, 2026 · 95:43


• Support & get perks! • Proudly sponsored by PyMC Labs! Get in touch at alex.andorra@pymc-labs.com • Intro to Bayes and Advanced Regression courses (first 2 lessons free). Our theme music is « Good Bayesian », by Baba Brinkman (feat. MC Lars and Mega Ran). Check out his awesome work! Chapters: 00:00 Exploring Generative AI and Scientific Modeling 10:27 Understanding Simulation-Based Inference (SBI) and Its Applications 15:59 Diffusion Models in Simulation-Based Inference 19:22 Live Coding Session: Implementing BayesFlow for SBI 34:39 Analyzing Results and Diagnostics in Simulation-Based Inference 46:18 Hierarchical Models and Amortized Bayesian Inference 48:14 Understanding Simulation-Based Inference (SBI) and Its Importance 49:14 Diving into Diffusion Models: Basics and Mechanisms 50:38 Forward and Backward Processes in Diffusion Models 53:03 Learning the Score: Training Diffusion Models 54:57 Inference with Diffusion Models: The Reverse Process 57:36 Exploring Variants: Flow Matching and Consistency Models 01:01:43 Benchmarking Different Models for Simulation-Based Inference 01:06:41 Hierarchical Models and Their Applications in Inference 01:14:25 Intervening in the Inference Process: Adding Constraints 01:25:35 Summary of Key Concepts and Future Directions. Thank you to my Patrons for making this episode possible! Links from the show: - Come meet Alex at the Field of Play Conference in Manchester, UK, March 27, 2026! - Jonas's Diffusion for SBI Tutorial & Review (Paper & Code) - The BayesFlow Library - Jonas on LinkedIn - Jonas on GitHub - Further reading for more mathematical details: Holderrieth & Erives - 150 Fast Bayesian Deep Learning, with David Rügamer, Emanuel Sommer & Jakob Robnik - 107 Amortized Bayesian Inference with Deep Neural Networks, with Marvin Schmitt

Code Story
The Gene Simmons of Data Protection - AI Inference-time Guardrails

Code Story

Feb 11, 2026 · 26:44


The Gene Simmons of Data Protection: Protegrity's KISS Method. Today, we are releasing our final FINAL episode from our series, entitled The Gene Simmons of Data Protection - the KISS Method, brought to you by none other than Protegrity. Protegrity is AI-powered data security for data consumption, offering fine-grained data protection solutions, so you can enable your data security, compliance, sharing and analytics. Episode Title: Navigating the Future of Data Management: Type Systems, Quantum Computing, and Protegrity's Innovations. In our final-FINAL episode, we are speaking with Ave Gatton, Director of Generative AI. We talk about how AI safety doesn't end with training; it begins with inference. We explore the overlooked frontier of AI security: prompt injection, data leakage, and model manipulation. Ave helps us understand how you can build guardrails that operate in real time, and adapt to evolving threats. Questions: What are inference-time threats and why are they becoming a critical focus in AI security? How do inference-time risks differ from training-time risks? Why is inference-time protection critical for safe, scalable AI adoption? How do inference-time threats vary across industries? Is there any industry where these attacks are most prevalent? Why are traditional security models insufficient at inference? What is the impact of inference-time breaches on AI adoption? What role does compliance play in shaping inference-time guardrails? What practical steps can organizations take to secure inference today? How can businesses balance performance with security when adding guardrails? Links: https://www.protegrity.com/ https://www.linkedin.com/in/averell-gatton/ Support this podcast at — https://redcircle.com/code-story-insights-from-startup-tech-leaders/donations Advertising Inquiries: https://redcircle.com/brands Privacy & Opt-Out: https://redcircle.com/privacy

The Six Five with Patrick Moorhead and Daniel Newman
The Six Five Pod | EP 291: Davos to Abu Dhabi - Inference, Codex & the So-Called SaaSpocalypse

The Six Five with Patrick Moorhead and Daniel Newman

Feb 10, 2026 · 54:58


The Six Five Pod is back with Episode 291. Daniel Newman and Patrick Moorhead are fresh off trips to Davos and Abu Dhabi, where they've explored the full AI stack up close (models, infrastructure, healthcare/genomics). This episode dives into what really matters right now in the markets and tech. From Microsoft's Maia 200 inference push, to NVIDIA's $2B CoreWeave bet, OpenAI's Codex closing the coding gap, the "SaaSpocalypse" panic, Cisco's AI Summit, and a no-BS debate on whether AI agents are actually enterprise-ready. The handpicked topics for this week are: Inside Abu Dhabi's Full-Stack AI Play: From universities to healthcare to hyperscale infrastructure — Pat shares a firsthand perspective on how the UAE is quietly building an end-to-end AI ecosystem. Optics, Cooling, and the Hidden AI Infrastructure Layer: Why companies like Coherent matter as much as GPUs — and how photonics, co-packaged optics, and rack-level cooling are becoming critical to scaling AI factories. Inference Takes Center Stage: Microsoft's Maia 200 shows real progress — and why hyperscalers are building custom silicon to boost capacity, economics, and control. NVIDIA's $2B CoreWeave Bet: Circular finance or strategic genius? We unpack what NVIDIA's latest investment signals about AI factories, cloud capacity, and long-term infrastructure buildout. Codex vs. Claude: The Coding Wars Heat Up: OpenAI closes the gap fast — and developers start hopping between tools as AI coding becomes a moving target. The "SaaSpocalypse" Narrative: Is software really dead? We separate market panic from reality — and explain why SaaS won't disappear, but will never be valued the same again. Cisco's AI Summit Reality Check: From hype to execution: what stood out from Cisco's AI Summit and why networking, security, and enterprise integration matter more than demos. Are AI Agents Enterprise-Ready? The Flip Debates: real-world workflows vs. reliability, governance, and security — where agents work today, and where they still fall short. Big Tech Earnings Whiplash: AWS, Google, Microsoft, Meta, NVIDIA, AMD, Palantir, and Coherent — massive CapEx, cloud acceleration, and what Wall Street is getting wrong about AI ROI.   Be sure to subscribe to The Six Five Pod so you never miss an episode.

Telugu Bytes
087 - Tsun"AI" Warning

Telugu Bytes

Feb 9, 2026 · 103:42


Dhruv and Ravi are back to talk about the rise of agentic AI — their experience with Claude Code and Cursor, what agents actually are, and why they think a tsunami is coming for software engineers and knowledge workers. The Tsunami Warning The feeling since late 2025 — prapancham roju roju ki maripotundi (the world is changing day by day) The COVID masks analogy — we are those people now Why the folks back home aren't feeling it yet Timeline — How We Got Here GPT-2 (2020) → ChatGPT (2022) → Cursor (2023) → Claude Code & Opus 4.5 (2025) The Evolution of AI Coding Chat interface — copy-paste snippets from ChatGPT Assisted coding — Cursor tab-complete, you drive, model navigates Agentic coding — the agent drives, you're the passenger Cursor vs Claude Code — why Claude Code wins The Autopilot vs FSD analogy WTF is a Model? Giant N-dimensional matrices with weights Text in, everything out Bigger model, better responses WTF is an Agent? Model = brain, Agent = human Agent uses the model to operate tools — like a robot with a task Inference and Context Engineering Sessions, prompting, context windows SWE = Context Engineering + Verification Engineering Memory, Skills, and the Matrix Kung-Fu analogy Agent Harnesses Claude Code, Cursor, Agent SDKs Programming in English It's fun, addictive, and an art Communication skills over coding skills Good taste, strong architecture, trash your prior beliefs My Thesis — And How It Was Wrong Thought it'd hit "IT workers" first, not Big Tech But the tsunami hits the coast first — US and Big Tech have closed loops Tesla car Dharavi slums lo nadavadhu (a Tesla won't run in the Dharavi slums) — we paved 6-lane roads for AI Knowledge Work, Manufacturing and Farming Any work where you can "close the loop" is at risk Manufacturing with QC — robots were always there, programming them was hard Farming — mostly done What is Still Scarce? Ideas, customer acquisition, creative content, land Creating software is no longer scarce Ippudu Em Cheyyamantaru Saar? (So what should we do now, sir?) We don't need SWEs, we need builders Product sense, distributed systems, build-sell-ship quickly The existential dread — we don't have 10 years, or 5, or even 2 Collective mental health crisis and economic reshaping ahead The fire storm is coming
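As a rough illustration of the "model = brain, agent = the thing that operates tools" framing in these notes, here is a hypothetical Python sketch of an agent loop; the model call, tool set, and stopping logic are placeholders for illustration, not the internals of Claude Code or Cursor.

```python
# Hypothetical sketch of an agent loop: the model proposes an action, the harness
# executes the tool, and the result is fed back into the context until done.
import subprocess

def call_model(context: str) -> dict:
    """Placeholder for an LLM call. A real agent would send `context`
    to a model API and parse its proposed next action."""
    return {"tool": "run_shell", "args": "echo hello", "done": False}

TOOLS = {
    "run_shell": lambda cmd: subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout,
    "read_file": lambda path: open(path).read(),
}

def agent(task: str, max_steps: int = 5) -> str:
    context = f"Task: {task}\n"
    for _ in range(max_steps):
        action = call_model(context)                      # the "brain" decides what to do next
        if action.get("done"):
            break
        output = TOOLS[action["tool"]](action["args"])    # the harness operates the tool
        context += f"Ran {action['tool']}: {output}\n"    # feed results back (context engineering)
    return context

print(agent("say hello"))
```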

Telecom Reseller
Blaize and Nokia Target Real-World Edge AI with Hybrid Inference for APAC, Podcast

Telecom Reseller

Feb 9, 2026


Doug Green, Publisher of Technology Reseller News, spoke with Dinakar Munagala, CEO & Co-Founder of Blaize, and Joseph Sulistyo, SVP of Corporate Marketing, about Blaize's push to make AI inference practical outside the data center—and why a new strategic collaboration with Nokia is designed to accelerate that shift, especially across Asia Pacific. Blaize positions itself as an AI computing company built around a purpose-built, fully programmable processor architecture it calls a graph streaming processor, paired with software intended to simplify development of “real-world” AI. Munagala framed the company's focus as practical AI inference for environments like smart factories, smart cities, agriculture, defense, and other edge and hybrid deployments where latency, power, thermal limits, and operating conditions are non-negotiable. A centerpiece of the discussion was Blaize's announcement that Nokia is strengthening edge AI capabilities through a strategic collaboration with Blaize to deliver hybrid inference solutions across APAC. Munagala and Sulistyo described the move as a signal that AI's next phase isn't only about large-scale training in centralized data centers, but about deploying inference where outcomes are realized—near cameras, sensors, machines, and field infrastructure. In their view, Nokia's global reach in networking, automation, and integration creates a path to deliver end-to-end solutions that combine connectivity and compute for real deployments, not demos. Sulistyo emphasized the economics driving hybrid inference: cost-sensitive, power-constrained environments often cannot justify a single “monolithic” compute approach. Instead, he argued, the market is moving toward heterogeneous architectures—mixing different compute types to hit performance targets while controlling total cost of ownership. In APAC, he noted, the scale of deployments makes marginal savings meaningful, and hybrid designs become an operational requirement, not a preference. The conversation also connected edge inference to public-sector and community outcomes. Both executives highlighted smart-city use cases—such as traffic management, tolling, and first-responder automation—where real-time inference can improve accuracy and responsiveness while reducing labor-intensive processes. They extended that point to rural and underserved regions, arguing that “smart city” also includes municipalities and regional governments, where automation and analytics can unlock revenue (e.g., tolls and fines) while improving safety. Doug pushed on definitions and practicality, prompting Munagala to describe edge inference as compute performed as close as possible to the sensor—for example, processing video near a camera mounted on a pole, at a toll booth, or in a factory—so systems can detect events and respond with low latency. He added that some deployments may route inference to nearby on-prem servers or regional data centers, depending on architecture and proximity, and Blaize aims to support these variations with a common hardware/software platform. Blaize also addressed the “AI energy speed bump” impacting communities and operators—particularly where power availability and cost are constrained. Munagala said low power is foundational to Blaize's design goals and argued that purpose-built inference architectures can reduce the burden associated with power-hungry AI approaches. 
Sulistyo added that the broader infrastructure conversation increasingly includes cooling realities (air and liquid) and the need to match the deployment environment to the right compute profile. To ground “real-world AI” in examples, the guests pointed to deployments including license plate recognition in complex, variable conditions and traffic anomaly detection (identifying behavior that deviates from normal flow). They described these as compute-intensive workloads that must run reliably outdoors and under harsh conditions, where latency and endurance matter as much as accuracy. They also discussed retail analytics as another example of edge inference delivering measurable business outcomes by connecting what happens in-store to revenue-driving decisions. Looking ahead, Munagala described the Nokia collaboration as a model for additional partnerships that bring inference solutions into production environments at scale. Sulistyo noted APAC is the initial focus, with other regions expected to follow based on demand, proof points, and the prioritization of specific use cases. To learn more about Blaize and its technology, visit https://www.blaize.com/.

Lucretius Today -  Epicurus and Epicurean Philosophy
Episode 319 - Is the Key To Happiness Found In Supernatural Causes and Geometry?

Lucretius Today - Epicurus and Epicurean Philosophy

Feb 6, 2026 · 46:37 · Transcription Available


Welcome to Episode 319 of Lucretius Today. This is a podcast dedicated to the poet Lucretius, who wrote "On The Nature of Things," the most complete presentation of Epicurean philosophy left to us from the ancient world. Each week we walk you through the Epicurean texts, and we discuss how Epicurean philosophy can apply to you today. If you find the Epicurean worldview attractive, we invite you to join us in the study of Epicurus at EpicureanFriends.com, where we discuss this and all of our podcast episodes. Last week we completed our series on Cicero's "Tusculan Disputations," and this week we start a new series that will help us with canonics / epistemology. We will eventually move to Philodemus' "On Signs" / "On Methods of Inference," and when we do we will refer to David Sedley's article on "On Signs," and the appendix in the translation prepared by Phillip De Lacy, both of which are very good but difficult. To get us acclimated to the issues, we need a little more Cicero from his work "Academic Questions." This is much shorter than On Ends and Tusculan Disputations but gives us an overview of the issues that split Plato's Academy and shows how Aristotle and the Stoics (and Epicurus) responded to those controversies. https://www.epicureanfriends.com/thread/4922-episode-319-is-the-secret-to-happiness-found-in-supernatural-causes-and-geometry/

Infinite Machine Learning
Building a $4 Billion AI Infra Company | Benny Chen, cofounder of Fireworks AI

Infinite Machine Learning

Feb 6, 2026 · 41:45


Benny Chen is the cofounder of Fireworks AI, an AI infrastructure platform. They have raised $327M in funding from Benchmark, Sequoia, Lightspeed, Index, and others. Benny's favorite book: Principles (Author: Ray Dalio) (00:01) Intro and why AI infrastructure is having a moment (00:06) Training vs inference: what's working and where the real bottlenecks are (01:25) Why inference is the hard problem in production (03:30) What breaks at scale when AI systems hit real users (05:29) GPUs, hardware constraints, and why power is now a first-class concern (06:02) What you're actually paying for in inference (07:21) Reliability, compliance, and enterprise expectations (09:49) Training and inference capacity: when they blur together (11:06) How to make inference fast in practice (13:06) System design choices behind modern inference platforms (15:28) Inference economics and cost tradeoffs (18:02) When fine-tuning actually makes sense (21:58) What “best model” really means for real companies (24:25) Production LLM architectures that actually work (27:46) Building an AI infra company customers can trust (29:27) Shipping fast without breaking reliability (31:14) Go-to-market lessons for infra startups (34:17) Where inference platforms are heading next (36:32) Rapid fire round -------- Where to find Benny Chen: LinkedIn: https://www.linkedin.com/in/benny-yufei-chen-2238575a/ -------- Where to find Prateek Joshi: Website: https://prateekj.com Research Column: https://www.infrastartups.com LinkedIn: https://www.linkedin.com/in/prateek-joshi-infinite X: https://x.com/prateekj

Top Traders Unplugged
UGO09: Playing the Players in a Narrative Market ft. Ben Hunt

Top Traders Unplugged

Feb 4, 2026 · 61:00 · Transcription Available


Cem Karsan sits down with Ben Hunt, founder of Epsilon Theory, to explore how narratives shape markets, politics, and decision making itself. Drawing on decades of experience across academia, hedge funds, and applied AI, Ben explains why stories, not data, increasingly drive outcomes in modern markets. The conversation spans unstructured data, inference, common knowledge, and the mechanics of narrative momentum. Together, they examine consumer expectations, inflation silence, geopolitical signaling, and the slow shift away from US dominance. What emerges is a framework for understanding markets as reflexive systems, where perception often matters more than reality. ----- 50 YEARS OF TREND FOLLOWING BOOK AND BEHIND-THE-SCENES VIDEO FOR ACCREDITED INVESTORS - CLICK HERE ----- Follow Niels on Twitter, LinkedIn, YouTube or via the TTU website. IT's TRUE – most CIOs read 50+ books each year – get your FREE copy of the Ultimate Guide to the Best Investment Books ever written here. And you can get a free copy of my latest book “Ten Reasons to Add Trend Following to Your Portfolio” here. Learn more about the Trend Barometer here. Send your questions to info@toptradersunplugged.com And please share this episode with a like-minded friend and leave an honest Rating & Review on iTunes or Spotify so more people can discover the podcast. Follow Cem on Twitter. Episode TimeStamps: 00:00 - Introduction to U Got Options and the trading floor setting 02:18 - Ben Hunt's background and Epsilon Theory origins 04:11 - Markets as the ultimate multiplayer game 06:15 - Inference, unstructured data, and narrative analysis 08:18 - Why sentiment and word counts miss the real signal 11:16 - Mapping meaning and truthy stories 15:00 - LLMs as operating systems, not oracles 18:01 - Giving money back and when models stop working 21:16 - Applying narrative tools beyond markets 24:10 - Consumer weakness versus bullish expectations 30:43 - Inflation, recession, and why markets do not care 33:29 - Dormant stories and volatility discovery 34:26 -

Mind-Body Solution with Dr Tevin Naidu
What is Ultimately Real? Consciousness, Free Energy & Spacetime | Donald Hoffman & Karl Friston

Mind-Body Solution with Dr Tevin Naidu

Feb 4, 2026 · 160:24


In this landmark Mind-Body Solution Colloquia, cognitive scientist Donald Hoffman and neuroscientist Karl Friston engage in a deep, rigorous dialogue on the foundations of reality, perception, and consciousness. Hoffman argues that spacetime and physical objects are not fundamental, but evolved interfaces shaped by fitness rather than truth. Friston presents the Free Energy Principle and Active Inference as a unifying framework for life, mind, and meaning — raising the question of whether inference itself can ground reality. Together, they explore: - Why spacetime may be derived, not fundamental - Whether consciousness must come before physics - Markov blankets, trace logic, and system boundaries - Probability, inference, and non-equilibrium dynamics - The limits of scientific explanation - Implications for AI, evolution, and ontology. This is not a debate — it is a serious attempt to understand reality at its deepest level. TIMESTAMPS: (00:00) - What is Ultimately Real? Consciousness vs Physicalism Debate (00:51) - Why Consciousness is Fundamental Beyond Spacetime (03:06) - High Energy Physics: Spacetime is Doomed Explained (05:06) - Challenges of Physicalist Theories in Explaining Consciousness (07:11) - Ontological Views: Free Energy Principle Integration (08:20) - Background-Free Explanations of Lived Experience (10:06) - Parsimony and Data Compression in Scientific Models (12:21) - Discoveries in Simpler Scattering Amplitude Solutions (14:09) - Free Energy Principle Guiding Beyond Spacetime Physics (16:06) - Why Physicalism Fails to Boot Up Consciousness (19:05) - Probability Theory's Role in Consciousness Frameworks (26:05) - Trace Logic Applied to Markov Chains Dynamics (34:51) - Markov Blankets and Insulation from the Past (39:07) - Minimizing Surprise in Non-Equilibrium Processes (53:32) - Spacetime as a Derived Projection from Fundamentals (1:04:15) - Constructing Simpler Explanations of Reality (1:20:50) - State Spaces and Dimensionality in Consciousness (1:41:30) - Non-Unique Bounds in AI Design Using Trace Logic (2:02:00) - From Classical Probability to Quantum Mechanics Transition (2:10:26) - Inferring Hidden Realities Through Relationships (2:18:54) - Time as a Computational Resource in Inference (2:24:09) - Scope and Limits of Scientific Explanations (2:32:32) - Agreements on Constructed Realities and Perceptions (2:40:01) - Closing Thoughts: Joint Manifesto EPISODE LINKS: - Karl's Round 1: https://youtu.be/Kb5X8xOWgpc - Karl's Round 2: https://youtu.be/mqzyKs2Qvug - Karl's Round 3 (Ft Mark Solms): https://youtu.be/Jtp426wQ-JI - Karl's Lecture 1: https://youtu.be/Gp9Sqvx4H7w - Karl's Lecture 2: https://youtu.be/Sfjw41TBnRM - Karl's Lecture 3: https://youtu.be/dM3YINvDZsY - Don's Round 1: https://youtu.be/M5Hz1giUUT8 - Don's Round 2: https://youtu.be/Toq9YLl49KM - Don's Round 3: https://youtu.be/QRa8r5xOaAA - Don's Round 4: https://youtu.be/Hf1q-bZMEo4 - Don's Lecture 1: https://youtu.be/r_UFm8GbSvU - Don's Lecture 2: https://youtu.be/YBmzqNIlbcI CONNECT: - Website: https://mindbodysolution.org - YouTube: https://youtube.com/@MindBodySolution - Podcast: https://creators.spotify.com/pod/show/mindbodysolution - Twitter: https://twitter.com/drtevinnaidu - Facebook: https://facebook.com/drtevinnaidu - Instagram: https://instagram.com/drtevinnaidu - LinkedIn: https://linkedin.com/in/drtevinnaidu - Website: https://tevinnaidu.com ============================= Disclaimer: The information provided on this channel is for educational purposes only. 
The content is shared in the spirit of open discourse and does not constitute, nor does it substitute, professional or medical advice. We do not accept any liability for any loss or damage incurred from you acting or not acting as a result of listening/watching any of our contents. You acknowledge that you use the information provided at your own risk. Listeners/viewers are advised to conduct their own research and consult with their own experts in the respective fields.

Effective Altruism Forum Podcast
[Linkpost] “Inference Scaling Reshapes AI Governance” by Toby_Ord

Effective Altruism Forum Podcast

Feb 2, 2026 · 34:49


This is a link post. The shift from scaling up the pre-training compute of AI systems to scaling up their inference compute may have profound effects on AI governance. The nature of these effects depends crucially on whether this new inference compute will primarily be used during external deployment or as part of a more complex training programme within the lab. Rapid scaling of inference-at-deployment would: lower the importance of open-weight models (and of securing the weights of closed models), reduce the impact of the first human-level models, change the business model for frontier AI, reduce the need for power-intense data centres, and derail the current paradigm of AI governance via training compute thresholds. Rapid scaling of inference-during-training would have more ambiguous effects that range from a revitalisation of pre-training scaling to a form of recursive self-improvement via iterated distillation and amplification. The end of an era — for both training and governance The intense year-on-year scaling up of AI training runs has been one of the most dramatic and stable markers of the Large Language Model era. Indeed it had been widely taken to be a permanent fixture of the AI landscape and the basis of many approaches to [...] --- Outline: (01:06) The end of an era -- for both training and governance (05:24) Scaling inference-at-deployment (06:42) Reducing the number of simultaneously served copies of each new model (08:45) Reducing the value of securing model weights (09:30) Reducing the benefits and risks of open-weight models (10:05) Unequal performance for different tasks and for different users (12:08) Changing the business model and industry structure (12:50) Reducing the need for monolithic data centres (17:16) Scaling inference-during-training (28:07) Conclusions (30:17) Appendix. Comparing the costs of scaling pre-training vs inference-at-deployment --- First published: February 2nd, 2026 Source: https://forum.effectivealtruism.org/posts/RnsgMzsnXcceFfKip/inference-scaling-reshapes-ai-governance Linkpost URL: https://www.tobyord.com/writing/inference-scaling-reshapes-ai-governance --- Narrated by TYPE III AUDIO. --- Images from the article: Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

Effective Altruism Forum Podcast
[Linkpost] “Inference Scaling and the Log-x Chart” by Toby_Ord

Effective Altruism Forum Podcast

Play Episode Listen Later Feb 2, 2026 16:32


This is a link post. Improving model performance by scaling up inference compute is the next big thing in frontier AI. But the charts being used to trumpet this new paradigm can be misleading. While they initially appear to show steady scaling and impressive performance for models like o1 and o3, they really show poor scaling (characteristic of brute force) and little evidence of improvement between o1 and o3. I explore how to interpret these new charts and what evidence for strong scaling and progress would look like.
From scaling training to scaling inference
The dominant trend in frontier AI over the last few years has been the rapid scale-up of training — using more and more compute to produce smarter and smarter models. Since GPT-4, this kind of scaling has run into challenges, so we haven't yet seen models much larger than GPT-4. But we have seen a recent shift towards scaling up the compute used during deployment (aka 'test-time compute' or 'inference compute'), with more inference compute producing smarter models. You could think of this as a change in strategy from improving the quality of your employees' work via giving them more years of training in which acquire [...]
---
First published: February 2nd, 2026
Source: https://forum.effectivealtruism.org/posts/zNymXezwySidkeRun/inference-scaling-and-the-log-x-chart
Linkpost URL: https://www.tobyord.com/writing/inference-scaling-and-the-log-x-chart
---
Narrated by TYPE III AUDIO.
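As a rough sketch of the log-x point (my illustration with invented numbers, not figures from Ord's article): if benchmark accuracy grows roughly linearly in log10 of inference compute, then every fixed accuracy gain costs another constant multiple of compute, which is exactly the brute-force behaviour a log-x axis can make look like steady progress:

import math

# Hypothetical fit: accuracy = a + b * log10(inference_compute). Coefficients are made up.
a, b = 20.0, 15.0   # percentage points gained per decade of extra compute (illustrative)

def accuracy(compute: float) -> float:
    return a + b * math.log10(compute)

for compute in (1e0, 1e1, 1e2, 1e3):
    print(f"{compute:>8.0f}x compute -> {accuracy(compute):5.1f}% accuracy")
# Each additional +15 points costs another 10x compute: a straight line on a log-x chart,
# but exponential cost in absolute terms.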

Effective Altruism Forum Podcast
[Linkpost] “Evidence that Recent AI Gains are Mostly from Inference-Scaling” by Toby_Ord

Effective Altruism Forum Podcast

Play Episode Listen Later Feb 2, 2026 10:01


This is a link post. In the last year or two, the most important trend in modern AI came to an end. The scaling-up of computational resources used to train ever-larger AI models through next-token prediction (pre-training) stalled out. Since late 2024, we've seen a new trend of using reinforcement learning (RL) in the second stage of training (post-training). Through RL, the AI models learn to do superior chain-of-thought reasoning about the problem they are being asked to solve. This new era involves scaling up two kinds of compute:
- the amount of compute used in RL post-training
- the amount of compute used every time the model answers a question
Industry insiders are excited about the first new kind of scaling, because the amount of compute needed for RL post-training started off being small compared to the tremendous amounts already used in next-token prediction pre-training. Thus, one could scale the RL post-training up by a factor of 10 or 100 before even doubling the total compute used to train the model. But the second new kind of scaling is a problem. Major AI companies were already starting to spend more compute serving their models to customers than in the training [...]
---
First published: February 2nd, 2026
Source: https://forum.effectivealtruism.org/posts/5zfubGrJnBuR5toiK/evidence-that-recent-ai-gains-are-mostly-from-inference
Linkpost URL: https://www.tobyord.com/writing/mostly-inference-scaling
---
Narrated by TYPE III AUDIO.
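The 10x-or-100x claim is easy to sanity-check. A minimal sketch with made-up compute budgets (illustrative figures only, not numbers from the post):

# Illustrative only: hypothetical compute budgets in FLOP, not figures from the article.
pretrain_compute = 1e25          # large next-token-prediction pre-training run
rl_posttrain_compute = 1e22      # initially tiny RL post-training stage

for scale in (1, 10, 100, 1000):
    total = pretrain_compute + scale * rl_posttrain_compute
    growth = total / (pretrain_compute + rl_posttrain_compute)
    print(f"RL scaled {scale:>4}x -> total training compute grows {growth:.2f}x")
# Prints roughly 1.00x, 1.01x, 1.10x, 2.00x: RL post-training can grow 100-fold
# before the overall training budget even comes close to doubling.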

TD Ameritrade Network
Nitin Sacheti's Market Picks: Inference, Fiber, Defense, Energy

TD Ameritrade Network

Play Episode Listen Later Jan 30, 2026 6:17


“Expectations are really high,” Nitin Sacheti says, examining the results from megacap tech earnings this week. It's becoming a stock picker's market as investors need to sort through the AI trade carefully; he likes hardware and inference-related companies, but is beginning to hedge. Another place he's looking is at fiber, as it's needed to move data across the country. He's also long in the defense sector in companies where he sees secular growth.
======== Schwab Network ========
Empowering every investor and trader, every market day.
Options involve risks and are not suitable for all investors. Before trading, read the Options Disclosure Document: http://bit.ly/2v9tH6D
Subscribe to the Market Minute newsletter - https://schwabnetwork.com/subscribe
Download the iOS app - https://apps.apple.com/us/app/schwab-network/id1460719185
Download the Amazon Fire TV app - https://www.amazon.com/TD-Ameritrade-Network/dp/B07KRD76C7
Watch on Sling - https://watch.sling.com/1/asset/191928615bd8d47686f94682aefaa007/watch
Watch on Vizio - https://www.vizio.com/en/watchfreeplus-explore
Watch on DistroTV - https://www.distro.tv/live/schwab-network/
Follow us on X - https://twitter.com/schwabnetwork
Follow us on Facebook - https://www.facebook.com/schwabnetwork
Follow us on LinkedIn - https://www.linkedin.com/company/schwab-network/
About Schwab Network - https://schwabnetwork.com/about

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
20VC: Brex Acquired for $5.15BN | a16z Companies are 2/3 AI Revenues | Anthropic Inference Costs Skyrocket | OpenEvidence Raises at $12BN Valuation | The IPO Market: EquipmentShare, Wealthfront and Ethos Insurance

The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch

Play Episode Listen Later Jan 29, 2026 75:55


AGENDA:
03:36 Brex Acquisition by Capital One for $5.15BN
10:54 Does Brex's Acquisition Help or Hurt Ramp?
16:28 TikTok Deal Completed: Who Won & Who Lost: Analysis
19:30 Anthropic Inference Costs Higher Than Expected
37:50 OpenEvidence Raises at $12BN from Thrive and DST
53:56 Wealthfront IPO Disaster: Is $1.5BN IPO Too Small?
01:07:27 Salesforce Wins $5BN Army Contract: The Last Laugh for SaaS

Data Driven
Synthetic Populations and the Future of Decision Intelligence

Data Driven

Play Episode Listen Later Jan 29, 2026 50:16 Transcription Available


In this episode of Data Driven, Frank and Andy dive into the future of market intelligence with Dr. Jill Axline, co-founder and CEO of Mavera—a company building synthetic populations that simulate real human behaviour, cognition, and emotion. Forget personas: we're talking real-time, AI-driven behavioural modeling that's more predictive than your horoscope and considerably more data-backed.
Dr. Axline shares how Mavera's swarm of AI models situates these synthetic humans within real-world business contexts to forecast decisions, measure emotional resonance, and even test marketing messages before they go live. From governance and model drift to the surprising uses in financial services, political campaigns, and speechwriting—this is one of the most forward-looking conversations we've had yet.
If you've ever wanted a deeper understanding of how AI can augment decision-making—or just want to hear Frank admit asset managers love ice cream—this one's for you.
Links
Learn more about Mavera: https://mavera.io
Connect with Jill Axline on LinkedIn: https://linkedin.com/in/jillaxline
Morningstar: https://www.morningstar.com
Time Stamps
00:00 - Introduction & AI Swarms Explained
03:30 - Forget Personas: Contextual AI Models
07:00 - Evidence vs Inference & AI Governance
10:20 - Simulation Scenarios & Model Drift
14:30 - Synthetic Audiences in Action
18:00 - Evidence Feedback Loops & Small Data Challenges
22:00 - Industry Applications & Use Cases
27:00 - Analyzing Speeches & Emotional Resonance
30:45 - Sentiment, Social Listening, and Real-Time News Reactions
34:00 - Adversarial Models & Strategic Pushback
38:00 - The Cartoon Bank Portal That Failed Spectacularly
41:00 - From Skeptic to CEO: Jill's Journey
45:00 - Data Privacy, Compliance & Synthetic Ethics
48:00 - Reflections on Empathy, Engineers, and Selling Without Selling
Support the Show
If you enjoy Data Driven, leave us a review on Apple Podcasts or your favourite pod platform. It helps more people find the show—and fuels Frank's Monster Energy habit.

The MAD Podcast with Matt Turck
State of LLMs 2026: RLVR, GRPO, Inference Scaling — Sebastian Raschka

The MAD Podcast with Matt Turck

Play Episode Listen Later Jan 29, 2026 68:13


Sebastian Raschka joins the MAD Podcast for a deep, educational tour of what actually changed in LLMs in 2025 — and what matters heading into 2026.
We start with the big architecture question: are transformers still the winning design, and what should we make of world models, small “recursive” reasoning models and text diffusion approaches? Then we get into the real story of the last 12 months: post-training and reasoning. Sebastian breaks down RLVR (reinforcement learning with verifiable rewards) and GRPO, why they pair so well, what makes them cheaper to scale than classic RLHF, and how they “unlock” reasoning already latent in base models.
We also cover why “benchmaxxing” is warping evaluation, why Sebastian increasingly trusts real usage over benchmark scores, and why inference-time scaling and tool use may be the underappreciated drivers of progress. Finally, we zoom out: where moats live now (hint: private data), why more large companies may train models in-house, and why continual learning is still so hard.
If you want the 2025–2026 LLM landscape explained like a masterclass — this is it.
Sources:
The State Of LLMs 2025: Progress, Problems, and Predictions - https://x.com/rasbt/status/2006015301717028989?s=20
The Big LLM Architecture Comparison - https://magazine.sebastianraschka.com/p/the-big-llm-architecture-comparison
Sebastian Raschka
Website - https://sebastianraschka.com
Blog - https://magazine.sebastianraschka.com
LinkedIn - https://www.linkedin.com/in/sebastianraschka/
X/Twitter - https://x.com/rasbt
FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
(00:00) - Intro
(01:05) - Are the days of Transformers numbered?
(14:05) - World models: what they are and why people care
(06:01) - Small “recursive” reasoning models (ARC, iterative refinement)
(09:45) - What is a diffusion model (for text)?
(13:24) - Are we seeing real architecture breakthroughs — or just polishing?
(14:04) - MoE + “efficiency tweaks” that actually move the needle
(17:26) - “Pre-training isn't dead… it's just boring”
(18:03) - 2025's headline shift: RLVR + GRPO (post-training for reasoning)
(20:58) - Why RLHF is expensive (reward model + value model)
(21:43) - Why GRPO makes RLVR cheaper and more scalable
(24:54) - Process Reward Models (PRMs): why grading the steps is hard
(28:20) - Can RLVR expand beyond math & coding?
(30:27) - Why RL feels “finicky” at scale
(32:34) - The practical “tips & tricks” that make GRPO more stable
(35:29) - The meta-lesson of 2025: progress = lots of small improvements
(38:41) - “Benchmaxxing”: why benchmarks are getting less trustworthy
(43:10) - The other big lever: inference-time scaling
(47:36) - Tool use: reducing hallucinations by calling external tools
(49:57) - The “private data edge” + in-house model training
(55:14) - Continual learning: why it's hard (and why it's not 2026)
(59:28) - How Sebastian works: reading, coding, learning “from scratch”
(01:04:55) - LLM burnout + how he uses models (without replacing himself)
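To make the RLVR/GRPO discussion concrete, here is a minimal sketch of the group-relative advantage idea: sample several answers per prompt, score each with a verifiable reward (for example, exact match against a known maths answer), and normalise rewards within the group instead of training a separate value model. This is an illustration of the idea only, not Raschka's code or any lab's implementation; the verifier and prompt group are invented.

from statistics import mean, stdev

def verifiable_reward(answer: str, reference: str) -> float:
    # Toy verifier: 1.0 if the final answer string matches the reference, else 0.0.
    return 1.0 if answer.strip() == reference.strip() else 0.0

def group_relative_advantages(answers, reference):
    # GRPO-style normalisation: advantage = (reward - group mean) / group std,
    # computed over the sampled group rather than predicted by a value network.
    rewards = [verifiable_reward(a, reference) for a in answers]
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + 1e-8) for r in rewards]

# One prompt, a group of sampled completions, and the known correct answer.
group = ["42", "41", "42", "7"]
print(group_relative_advantages(group, "42"))
# Correct answers get positive advantages, wrong ones negative -- no value model needed,
# which is one reason this setup is cheaper to scale than classic RLHF.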

unSILOed with Greg LaBlanc
615. Reclaim Your Life from Digital Overload with Paul Leonardi

unSILOed with Greg LaBlanc

Play Episode Listen Later Jan 26, 2026 60:03


What are practical strategies to avoid overload and exhaustion in today's digital world? What norms can organizations create for tool usage, and how can offline activities that provide a mental contrast to digital work help?
Paul Leonardi is the Duca Family Professor of Technology Management at UC Santa Barbara, a consultant and speaker on digital transformation and the future of work, and an author of several works. His latest book is called Digital Exhaustion: Simple Rules for Reclaiming Your Life.
Greg and Paul discuss the complementary nature of his two most recent books: the first focuses on harnessing digital tools, and the second on mitigating the overwhelm they can cause. They also explore teaching technology management, including the importance of understanding technology's impact on people and organizational processes. Paul explains the 30% rule, emphasizing the need to understand digital tools well enough to use them effectively. They also explore the concept of digital exhaustion, the subject of his most recent book, its symptoms, and how to manage it, both at work and in daily life.
*unSILOed Podcast is produced by University FM.*
Episode Quotes:
How can we reduce exhaustion?
41:29: One easy way of reducing our exhaustion is to match the sort of complexity of the task that we are trying to do with the affordances or the capabilities of the technology. And I say match, not over exceed, because we also have the problem where, like me, I am sure you have been in many, many meetings that should have just been an email, that there is not the need. And so what we have done in that situation is we have overstimulated people, right, in a setting with, you know, 15 other folks, and we have taken an hour out of their day and maybe the travel time to get there. And that has created other avenues for exhaustion when, if we had just perceived this information via email, we could not have had the meeting. So you do not want to overmatch, you just want to like match to the complexity of the task. And that is the key to reducing our exhaustion.
It's not just distraction that exhausts us
18:28: I think we have failed to look at how it is not just being distracted that is a problem, but it is the act of switching itself across all of these different inputs really is a significant source of our exhaustion.
Inference is a big driver of exhaustion
32:45: Inference is really a big driver of exhaustion. And I would say the place that it most shows up, although not exclusively, is in our social media lives. Because, of course, people are curating their lives in terms of what they post, whether that is LinkedIn or TikTok or Instagram, that does not really matter. And we are constantly not only making inferences of them, but what I find is that we are also very often making inferences about ourselves because we see a past record of all the things that we wrote and all of the things that we posted. And then we are also making inferences of what we think other people think about us based on all the things that we post.
Show Links:
Recommended Resources:
Human Multitasking
Task Switching
Fatigue
Unsiloed Podcast Episode 612: Rebecca Hinds
Guest Profile:
Faculty Profile at UC Santa Barbara
PaulLeonardi.com
Wikipedia Profile
LinkedIn Profile
Guest Work:
Amazon Author Page
Digital Exhaustion: Simple Rules for Reclaiming Your Life
The Digital Mindset: What It Really Takes to Thrive in the Age of Data, Algorithms, and AI
Expertise, Communication, and Organizing
Materiality and Organizing: Social Interaction in a Technological World
Car Crashes without Cars: Lessons About Simulation Technology and Organizational Change from Automotive Design
Google Scholar Page
Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

AI Chat: ChatGPT & AI News, Artificial Intelligence, OpenAI, Machine Learning

In this episode, we discuss Microsoft's new Maya 200 AI inference chip, highlighting its capabilities, its importance for efficient AI model deployment, and how it signifies a major shift towards custom silicon in the AI industry. We also touch upon its potential impact on cost savings and Microsoft's strategy to become a leading player in the AI hardware space.
Chapters
00:00 Microsoft's Maya 200 AI Chip
00:29 AI Box.ai Tools
02:03 Power and Performance
04:54 Inference vs. Training
08:21 Efficiency and Competition
14:06 Internal Deployment and Future

Midjourney
Microsoft Reveals Maya 200 AI Inference Chip

Midjourney

Play Episode Listen Later Jan 26, 2026 11:17


In this episode, we discuss Microsoft's new Maya 200 AI inference chip, highlighting its capabilities, its importance for efficient AI model deployment, and how it signifies a major shift towards custom silicon in the AI industry. We also touch upon its potential impact on cost savings and Microsoft's strategy to become a leading player in the AI hardware space.
Chapters
00:00 Microsoft's Maya 200 AI Chip
00:29 AI Box.ai Tools
02:03 Power and Performance
04:54 Inference vs. Training
08:21 Efficiency and Competition
14:06 Internal Deployment and Future
Links
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

UiPath Daily
Microsoft Reveals Maya 200 AI Inference Chip

UiPath Daily

Play Episode Listen Later Jan 26, 2026 11:17


In this episode, we discuss Microsoft's new Maya 200 AI inference chip, highlighting its capabilities, its importance for efficient AI model deployment, and how it signifies a major shift towards custom silicon in the AI industry. We also touch upon its potential impact on cost savings and Microsoft's strategy to become a leading player in the AI hardware space.
Chapters
00:00 Microsoft's Maya 200 AI Chip
00:29 AI Box.ai Tools
02:03 Power and Performance
04:54 Inference vs. Training
08:21 Efficiency and Competition
14:06 Internal Deployment and Future
Links
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

ChatGPT: OpenAI, Sam Altman, AI, Joe Rogan, Artificial Intelligence, Practical AI

In this episode, we discuss Microsoft's new Maya 200 AI inference chip, highlighting its capabilities, its importance for efficient AI model deployment, and how it signifies a major shift towards custom silicon in the AI industry. We also touch upon its potential impact on cost savings and Microsoft's strategy to become a leading player in the AI hardware space.
Chapters
00:00 Microsoft's Maya 200 AI Chip
00:29 AI Box.ai Tools
02:03 Power and Performance
04:54 Inference vs. Training
08:21 Efficiency and Competition
14:06 Internal Deployment and Future

ChatGPT: News on Open AI, MidJourney, NVIDIA, Anthropic, Open Source LLMs, Machine Learning

In this episode, we discuss Microsoft's new Maya 200 AI inference chip, highlighting its capabilities, its importance for efficient AI model deployment, and how it signifies a major shift towards custom silicon in the AI industry. We also touch upon its potential impact on cost savings and Microsoft's strategy to become a leading player in the AI hardware space.
Chapters
00:00 Microsoft's Maya 200 AI Chip
00:29 AI Box.ai Tools
02:03 Power and Performance
04:54 Inference vs. Training
08:21 Efficiency and Competition
14:06 Internal Deployment and Future
Links
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

AI for Non-Profits
Microsoft Reveals Maya 200 AI Inference Chip

AI for Non-Profits

Play Episode Listen Later Jan 26, 2026 11:17


In this episode, we discuss Microsoft's new Maya 200 AI inference chip, highlighting its capabilities, its importance for efficient AI model deployment, and how it signifies a major shift towards custom silicon in the AI industry. We also touch upon its potential impact on cost savings and Microsoft's strategy to become a leading player in the AI hardware space.
Chapters
00:00 Microsoft's Maya 200 AI Chip
00:29 AI Box.ai Tools
02:03 Power and Performance
04:54 Inference vs. Training
08:21 Efficiency and Competition
14:06 Internal Deployment and Future
Links
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Lex Fridman Podcast of AI
Microsoft Reveals Maya 200 AI Inference Chip

Lex Fridman Podcast of AI

Play Episode Listen Later Jan 26, 2026 11:17


In this episode, we discuss Microsoft's new Maya 200 AI inference chip, highlighting its capabilities, its importance for efficient AI model deployment, and how it signifies a major shift towards custom silicon in the AI industry. We also touch upon its potential impact on cost savings and Microsoft's strategy to become a leading player in the AI hardware space.
Chapters
00:00 Microsoft's Maya 200 AI Chip
00:29 AI Box.ai Tools
02:03 Power and Performance
04:54 Inference vs. Training
08:21 Efficiency and Competition
14:06 Internal Deployment and Future
Links
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

The Elon Musk Podcast
Microsoft Reveals Maya 200 AI Inference Chip

The Elon Musk Podcast

Play Episode Listen Later Jan 26, 2026 11:17


In this episode, we discuss Microsoft's new Maya 200 AI inference chip, highlighting its capabilities, its importance for efficient AI model deployment, and how it signifies a major shift towards custom silicon in the AI industry. We also touch upon its potential impact on cost savings and Microsoft's strategy to become a leading player in the AI hardware space.
Chapters
00:00 Microsoft's Maya 200 AI Chip
00:29 AI Box.ai Tools
02:03 Power and Performance
04:54 Inference vs. Training
08:21 Efficiency and Competition
14:06 Internal Deployment and Future
Links
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Crazy Wisdom
Episode #525: The Billion-Dollar Architecture Problem: Why AI's Innovation Loop is Stuck

Crazy Wisdom

Play Episode Listen Later Jan 23, 2026 53:38


In this episode of the Crazy Wisdom podcast, host Stewart Alsop welcomes Roni Burd, a data and AI executive with extensive experience at Amazon and Microsoft, for a deep dive into the evolving landscape of data management and artificial intelligence in enterprise environments. Their conversation explores the longstanding challenges organizations face with knowledge management and data architecture, from the traditional bronze-silver-gold data processing pipeline to how AI agents are revolutionizing how people interact with organizational data without needing SQL or Python expertise. Burd shares insights on the economics of AI implementation at scale, the debate between one-size-fits-all models versus specialized fine-tuned solutions, and the technical constraints that prevent companies like Apple from upgrading services like Siri to modern LLM capabilities, while discussing the future of inference optimization and the hundreds-of-millions-of-dollars cost barrier that makes architectural experimentation in AI uniquely expensive compared to other industries.
Timestamps
00:00 Introduction to Data and AI Challenges
03:08 The Evolution of Data Management
05:54 Understanding Data Quality and Metadata
08:57 The Role of AI in Data Cleaning
11:50 Knowledge Management in Large Organizations
14:55 The Future of AI and LLMs
17:59 Economics of AI Implementation
29:14 The Importance of LLMs for Major Tech Companies
32:00 Open Source: Opportunities and Challenges
35:19 The Future of AI Inference and Hardware
43:24 Optimizing Inference: The Next Frontier
49:23 The Commercial Viability of AI Models
Key Insights
1. Data Architecture Evolution: The industry has evolved through bronze-silver-gold data layers, where bronze is raw data, silver is cleaned/processed data, and gold is business-ready datasets. However, this creates bottlenecks as stakeholders lose access to original data during the cleaning process, making metadata and data cataloging increasingly critical for organizations.
2. AI Democratizing Data Access: LLMs are breaking down technical barriers by allowing business users to query data in plain English without needing SQL, Python, or dashboarding skills. This represents a fundamental shift from requiring intermediaries to direct stakeholder access, though the full implications remain speculative.
3. Economics Drive AI Architecture Decisions: Token costs and latency requirements are major factors determining AI implementation. Companies like Meta likely need their own models because paying per-token for billions of social media interactions would be economically unfeasible, driving the need for self-hosted solutions.
4. One Model Won't Rule Them All: Despite initial hopes for universal models, the reality points toward specialized models for different use cases. This is driven by economics (smaller models for simple tasks), performance requirements (millisecond response times), and industry-specific needs (medical, military terminology).
5. Inference is the Commercial Battleground: The majority of commercial AI value lies in inference rather than training. Current GPUs, while specialized for graphics and matrix operations, may still be too general for optimal inference performance, creating opportunities for even more specialized hardware.
6. Open Source vs Open Weights Distinction: True open source in AI means access to architecture for debugging and modification, while "open weights" enables fine-tuning and customization. This distinction is crucial for enterprise adoption, as open weights provide the flexibility companies need without starting from scratch.
7. Architecture Innovation Faces Expensive Testing Loops: Unlike database optimization where query plans can be easily modified, testing new AI architectures requires expensive retraining cycles costing hundreds of millions of dollars. This creates a potential innovation bottleneck, similar to aerospace industries where testing new designs is prohibitively expensive.
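As a toy illustration of the bronze/silver/gold layering described in the episode (hypothetical column names and a deliberately tiny dataset, nothing from Burd's actual systems):

import pandas as pd

# Bronze: raw events exactly as ingested (messy types, duplicates, nulls).
bronze = pd.DataFrame({
    "user_id": ["u1", "u1", "u2", None],
    "amount":  ["10.5", "10.5", "3.0", "7.2"],
    "ts":      ["2026-01-01", "2026-01-01", "2026-01-02", "2026-01-02"],
})

# Silver: cleaned and typed -- drop nulls and duplicates, cast columns.
silver = (bronze.dropna(subset=["user_id"])
                .drop_duplicates()
                .assign(amount=lambda d: d["amount"].astype(float),
                        ts=lambda d: pd.to_datetime(d["ts"])))

# Gold: business-ready aggregate that stakeholders actually query.
gold = silver.groupby("user_id", as_index=False)["amount"].sum()
print(gold)
# The bottleneck Burd describes: once the raw bronze rows are cleaned away,
# downstream users only ever see silver and gold unless metadata points back to the source.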

a16z
Inferact: Building the Infrastructure That Runs Modern AI

a16z

Play Episode Listen Later Jan 22, 2026 43:37


Inferact is a new AI infrastructure company founded by the creators and core maintainers of vLLM. Its mission is to build a universal, open-source inference layer that makes large AI models faster, cheaper, and more reliable to run across any hardware, model architecture, or deployment environment. Together, they broke down how modern AI models are actually run in production, why “inference” has quietly become one of the hardest problems in AI infrastructure, and how the open-source project vLLM emerged to solve it. The conversation also looked at why the vLLM team started Inferact and their vision for a universal inference layer that can run any model, on any chip, efficiently.
Follow Matt Bornstein on X: https://twitter.com/BornsteinMatt
Follow Simon Mo on X: https://twitter.com/simon_mo_
Follow Woosuk Kwon on X: https://twitter.com/woosuk_k
Follow vLLM on X: https://twitter.com/vllm_project
Stay Updated:
Find a16z on X
Find a16z on LinkedIn
Listen to the a16z Show on Spotify
Listen to the a16z Show on Apple Podcasts
Follow our host: https://twitter.com/eriktorenberg
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
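For readers who have not used vLLM, the project's basic offline-inference pattern looks roughly like this (a sketch following vLLM's public quickstart; the model name is only an example and the API can change between releases):

from vllm import LLM, SamplingParams

# Load a Hugging Face model; vLLM handles batching and KV-cache paging internally.
llm = LLM(model="facebook/opt-125m")

params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
prompts = ["Inference is hard because", "The capital of France is"]

for output in llm.generate(prompts, params):
    print(output.prompt, "->", output.outputs[0].text)

The same engine can also be run as an OpenAI-compatible HTTP server, which is the production-serving mode most of the episode's "inference layer" discussion is about.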

AI + a16z
Inferact: Building the Infrastructure That Runs Modern AI

AI + a16z

Play Episode Listen Later Jan 22, 2026 43:37


Inferact is a new AI infrastructure company founded by the creators and core maintainers of vLLM. Its mission is to build a universal, open-source inference layer that makes large AI models faster, cheaper, and more reliable to run across any hardware, model architecture, or deployment environment. Together, they broke down how modern AI models are actually run in production, why “inference” has quietly become one of the hardest problems in AI infrastructure, and how the open-source project vLLM emerged to solve it. The conversation also looked at why the vLLM team started Inferact and their vision for a universal inference layer that can run any model, on any chip, efficiently.
Follow Matt Bornstein on X: https://twitter.com/BornsteinMatt
Follow Simon Mo on X: https://twitter.com/simon_mo_
Follow Woosuk Kwon on X: https://twitter.com/woosuk_k
Follow vLLM on X: https://twitter.com/vllm_project
Check out everything a16z is doing with artificial intelligence here, including articles, projects, and more podcasts.
Please note that the content here is for informational purposes only; should NOT be taken as legal, business, tax, or investment advice or be used to evaluate any investment or security; and is not directed at any investors or potential investors in any a16z fund. a16z and its affiliates may maintain investments in the companies discussed. For more details please see a16z.com/disclosures.
Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Cloud Security Podcast
Why AI Can't Replace Detection Engineers: Build vs. Buy & The Future of SOC

Cloud Security Podcast

Play Episode Listen Later Jan 21, 2026 52:08


Is the AI SOC a reality, or just vendor hype? In this episode, Antoinette Stevens (Principal Security Engineer at Ramp) joins Ashish to dissect the true state of AI in detection engineering.
Antoinette shares her experience building a detection program from scratch, explaining why she doesn't trust AI to close alerts due to hallucinations and faulty logic. We explore the "engineering-led" approach to detection, moving beyond simple hunting to building rigorous testing suites for detection-as-code.
We discuss the shrinking entry-level job market for security roles, why software engineering skills are becoming non-negotiable, and the critical importance of treating AI as a "force multiplier, not your brain".
Guest Socials - Antoinette's LinkedIn
Podcast Twitter - @CloudSecPod
If you want to watch videos of this LIVE STREAMED episode and past episodes, check out our other Cloud Security social channels:
- Cloud Security Podcast - YouTube
- Cloud Security Newsletter
If you are interested in AI Security, you can check out our sister podcast - AI Security Podcast
Questions asked:
(00:00) Introduction
(02:25) Who is Antoinette Stevens?
(04:10) What is an "Engineering-Led" Approach to Detection?
(06:00) Moving from Hunting to Automated Testing Suites
(09:30) Build vs. Buy: Is AI Making it Easier to Build Your Own Tools?
(11:30) Using AI for Documentation & Playbook Updates
(14:30) Why Software Engineers Still Need to Learn Detection Domain Knowledge
(17:50) The Problem with AI SOC: Why ChatGPT Lies During Triage
(23:30) Defining AI Concepts: Memory, Evals, and Inference
(26:30) Multi-Agent Architectures: Using Specialized "Persona" Agents
(28:40) Advice for Building a Detection Program in 2025 (Back to Basics)
(33:00) Measuring Success: Noise Reduction vs. False Positive Rates
(36:30) Building an Alerting Data Lake for Metrics
(40:00) The Disappearing Entry-Level Security Job & Career Advice
(44:20) Why Junior Roles are Becoming "Personality Hires"
(48:20) Fun Questions: Wine Certification, Side Quests, and Georgian Food
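For listeners new to "detection-as-code with a testing suite", a toy sketch of the pattern (hypothetical rule and event format, not Ramp's actual setup): the rule is plain code, and unit tests run in CI before it ever reaches production alerting.

# detection.py -- a detection rule is just a function over a parsed log event.
def detect_impossible_travel(event: dict) -> bool:
    # Flag a login from a new country within an implausibly short window.
    return (event["minutes_since_last_login"] < 60
            and event["country"] != event["previous_country"])

# test_detection.py -- runnable with pytest as part of the detection pipeline's CI.
def test_flags_fast_country_change():
    assert detect_impossible_travel(
        {"minutes_since_last_login": 10, "country": "US", "previous_country": "BR"})

def test_ignores_same_country():
    assert not detect_impossible_travel(
        {"minutes_since_last_login": 10, "country": "US", "previous_country": "US"})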

Crazy Wisdom
Episode #524: The 500-Year Prophecy: Why Buddhism and AI Are Colliding Right Now

Crazy Wisdom

Play Episode Listen Later Jan 19, 2026 60:49


In this episode of the Crazy Wisdom podcast, host Stewart Alsop sits down with Kelvin Lwin for their second conversation exploring the fascinating intersection of AI and Buddhist cosmology. Lwin brings his unique perspective as both a technologist with deep Silicon Valley experience and a serious meditation practitioner who's spent decades studying Buddhist philosophy. Together, they examine how AI development fits into ancient spiritual prophecies, discuss the dangerous allure of LLMs as potentially "asura weapons" that can mislead users, and explore verification methods for enlightenment claims in our modern digital age. The conversation ranges from technical discussions about the need for better AI compilers and world models to profound questions about humanity's role in what Lwin sees as an inevitable technological crucible that will determine our collective spiritual evolution. For more information about Kelvin's work on attention training and AI, visit his website at alin.ai. You can also join Kelvin for live meditation sessions twice daily on Clubhouse at clubhouse.com/house/neowise.
Timestamps
00:00 Exploring AI and Spirituality
05:56 The Quest for Enlightenment Verification
11:58 AI's Impact on Spirituality and Reality
17:51 The 500-Year Prophecy of Buddhism
23:36 The Future of AI and Business Innovation
32:15 Exploring Language and Communication
34:54 Programming Languages and Human Interaction
36:23 AI and the Crucible of Change
39:20 World Models and Physical AI
41:27 The Role of Ontologies in AI
44:25 The Asura and Deva: A Battle for Supremacy
48:15 The Future of Humanity and AI
51:08 Persuasion and the Power of LLMs
55:29 Navigating the New Age of Technology
Key Insights
1. The Rarity of Polymath AI-Spirituality Perspectives: Kelvin argues that very few people are approaching AI through spiritual frameworks because it requires being a polymath with deep knowledge across multiple domains. Most people specialize in one field, and combining AI expertise with Buddhist cosmology requires significant time, resources, and academic background that few possess.
2. Traditional Enlightenment Verification vs. Modern Claims: There are established methods for verifying enlightenment claims in Buddhist traditions, including adherence to the five precepts and overcoming hell rebirth through karmic resolution. Many modern Western practitioners claiming enlightenment fail these traditional tests, often changing the criteria when they can't meet the original requirements.
3. The 500-Year Buddhist Prophecy and Current Timing: We are approximately 60 years into a prophesied 500-year period where enlightenment becomes possible again. This "startup phase of Buddhism revival" coincides with technological developments like the internet and AI, which are seen as integral to this spiritual renaissance rather than obstacles to it.
4. LLMs as UI Solution, Not Reasoning Engine: While LLMs have solved the user interface problem of capturing human intent, they fundamentally cannot reason or make decisions due to their token-based architecture. The technology works well enough to create the illusion of capability, leading people down an asymptotic path away from true solutions.
5. The Need for New Programming Paradigms: Current AI development caters too much to human cognitive limitations through familiar programming structures. True advancement requires moving beyond human-readable code toward agent-generated languages that prioritize efficiency over human comprehension, similar to how compilers already translate high-level code.
6. AI as Asura Weapon in Spiritual Warfare: From a Buddhist cosmological perspective, AI represents an asura (demon-realm) tool that appears helpful but is fundamentally wasteful and disruptive to human consciousness. Humanity exists as the battleground between divine and demonic forces, with AI serving as a weapon that both sides employ in this cosmic conflict.
7. 2029 as Critical Convergence Point: Multiple technological and spiritual trends point toward 2029 as when various systems will reach breaking points, forcing humanity to either transcend current limitations or be consumed by them. This timing aligns with both technological development curves and spiritual prophecies about transformation periods.

Latin in Layman’s - A Rhetoric Revolution
The Inference Limit - Speed of Thought (A Short Sci-Fi story from me)

Latin in Layman’s - A Rhetoric Revolution

Play Episode Listen Later Jan 17, 2026 28:27


My links:
My Ko-fi: https://ko-fi.com/rhetoricrevolution
Send me a voice message!: https://podcasters.spotify.com/pod/show/liam-connerly
TikTok: https://www.tiktok.com/@mrconnerly?is_from_webapp=1&sender_device=pc
Email: rhetoricrevolution@gmail.com
Instagram: https://www.instagram.com/connerlyliam/
Podcast | Latin in Layman's - A Rhetoric Revolution: https://open.spotify.com/show/0EjiYFx1K4lwfykjf5jApM?si=b871da6367d74d92
YouTube: https://www.youtube.com/@MrConnerly

Eye On A.I.
#313 Nick Pandher: How Inference-First Infrastructure Is Powering the Next Wave of AI

Eye On A.I.

Play Episode Listen Later Jan 17, 2026 56:02


Inference is now the biggest challenge in enterprise AI. In this episode of Eye on AI, Craig Smith speaks with Nick Pandher, VP of Product at Cirrascale, about why AI is shifting from model training to inference at scale. As AI moves into production, enterprises are prioritizing performance, latency, reliability, and cost efficiency over raw compute. The conversation covers the rise of inference-first infrastructure, the limits of hyperscalers, the emergence of neoclouds, and how agentic AI is driving always-on inference workloads. Nick also explains how inference-optimized hardware and serverless AI platforms are shaping the future of enterprise AI deployment.   If you are deploying AI in production, this episode explains why inference is the real frontier.   Stay Updated: Craig Smith on X: https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI (00:00) Preview (00:50) Introduction to Cirrascale and AI inference (03:04) What makes Cirrascale a neocloud (04:42) Why AI shifted from training to inference (06:58) Private inference and enterprise security needs (08:13) Hyperscalers vs neoclouds for AI workloads (10:22) Performance metrics that matter in inference (13:29) Hardware choices and inference accelerators (20:04) Real enterprise AI use cases and automation (23:59) Hybrid AI, regulated industries, and compliance (26:43) Proof of value before AI pilots (31:18) White-glove AI infrastructure vs self-serve cloud (33:32) Qualcomm partnership and inference-first AI (41:52) Edge-to-cloud inference and agentic workflows (49:20) Why AI pilots fail and how enterprises succeed

Startup Project
Inside Story of Building the World's Largest AI Inference Chip | Cerebras CEO & Co-Founder Andrew Feldman

Startup Project

Play Episode Listen Later Jan 16, 2026 63:21


Discover how Cerebras is challenging NVIDIA with a fundamentally different approach to AI hardware and large-scale inference.
In this episode of Startup Project, Nataraj sits down with Andrew Feldman, co-founder and CEO of Cerebras Systems, to discuss how the company built a wafer-scale AI chip from first principles. Andrew shares the origin story of Cerebras, why they chose to rethink chip architecture entirely, and how system-level design decisions unlock new performance for modern AI workloads.
The conversation explores:
- Why inference is becoming the dominant cost and performance bottleneck in AI
- How Cerebras' wafer-scale architecture overcomes GPU memory and communication limits
- What it takes to compete with incumbents like NVIDIA and AMD as a new chip company
- The tradeoffs between training and inference at scale
- Cerebras' product strategy across systems, cloud offerings, and enterprise deployments
This episode is a deep dive into AI infrastructure, semiconductor architecture, and system-level design, and is especially relevant for builders, engineers, and leaders thinking about the future of AI compute.

Reversim Podcast
509 Bumpers 90

Reversim Podcast

Play Episode Listen Later Jan 11, 2026


Episode 509 of Reversim - Bumpers number 90, recorded on January 1, 2026. Happy new year! Ran, Dotan, and Alon in the virtual studio (via Riverside) with a series of short items and news (some of it a little older) from around the internet: the blogs, the GitHub projects, the Rust bits, and the new LLMs from the recent period.

Chip Stock Investor Podcast
Beyond the GPU: Nvidia's Secret Weapon for AI Inference in 2026

Chip Stock Investor Podcast

Play Episode Listen Later Jan 8, 2026 13:18


Nvidia just kicked off 2026 with a full stack announcement at CES. From the new Vera Rubin architecture to the Bluefield-4 DPU, we're breaking down why Nvidia remains our top stock pick for the year.
As AI shifts from training to inference, Nvidia is evolving its hardware to solve the memory wall. Today, we look at the Bluefield-4 storage processor and how it integrates with the Nvidia Dynamo software architecture to boost inference performance by up to 5x. We also share our updated 2026 baseline assumptions for NVDA stock, including profit growth expectations and valuation risks.
How to Invest In Chip Stocks 2026 -- AI Data Center Networking, Optical, and Silicon Photonics: https://youtu.be/RC8Tzr1pXxA
Join us on Discord with Semiconductor Insider, sign up on our website: www.chipstockinvestor.com/membership
Supercharge your analysis with AI! Get 15% off your membership with our special link here: https://fiscal.ai/csi/
Sign Up For Our Newsletter: https://mailchi.mp/b1228c12f284/sign-up-landing-page-short-form
Chapters:
0:00 Our Top Stock Holding
1:00 Why Individual Chips Don't Matter Anymore (Full Stack)
2:45 Vera Rubin, Bluefield-4, and More
4:15 Bluefield-4: The Secret to AI Inference Storage
6:05 Solving the "KV Cache" Problem with Enfabrica
8:10 Nvidia Dynamo & The 5X Inference Breakthrough
10:00 Nvidia Stock Analysis: 2026 Price & Profit Outlook
11:45 Managing Cyclicality: Is the AI Growth Cycle Over?
If you found this video useful, please make sure to like and subscribe!
*********************************************************
Affiliate links are sprinkled in throughout this video. If something catches your eye and you decide to buy it, we might earn a little coffee money. Thanks for helping us (Kasey) fuel our caffeine addiction!
Content in this video is for general information or entertainment only and is not specific or individual investment advice. Forecasts and information presented may not develop as predicted and there is no guarantee any strategies presented will be successful. All investing involves risk, and you could lose some or all of your principal.
#NVIDIA #NVDA #Semiconductors #AI #TechInvesting #ChipStockInvestor #GPU #CES2026 #VeraRubin #Bluefield4 #AIInference #NvidiaDynamo #DataCenter #Networking #FullStackCompute #KVCache #StockMarket #InvestingStrategy #TechStocks #GrowthStocks #PortfolioUpdate #MarketAnalysis #EarningsGrowth #semiconductormanufacturing #semiconductorstocks
Nick and Kasey own shares of Nvidia
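A rough back-of-the-envelope for why the KV cache keeps coming up as an inference and storage bottleneck (hypothetical model dimensions for illustration, not Nvidia's figures):

# Per-token KV cache bytes = 2 (K and V) * layers * kv_heads * head_dim * bytes_per_value
layers, kv_heads, head_dim = 80, 8, 128      # illustrative large-model shape
bytes_per_value = 2                          # fp16/bf16 values

per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
context_len, concurrent_requests = 128_000, 64

total_gb = per_token * context_len * concurrent_requests / 1e9
print(f"{per_token / 1024:.0f} KiB per token, ~{total_gb:,.0f} GB of KV cache at full load")
# With these made-up numbers: 320 KiB per token and roughly 2,700 GB across 64 long-context
# users -- far more than any single GPU's HBM, which is why offloading the KV cache to
# fast storage and smarter cache management show up in inference roadmaps.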

WSJ Tech News Briefing
TNB Tech Minute: Nvidia Licenses Groq's AI-Inference Technology

WSJ Tech News Briefing

Play Episode Listen Later Dec 26, 2025 2:09


Plus: China sanctions U.S. defense companies and executives including Northrop Grumman, Boeing and Palmer Luckey over Taiwan arms sale. And Google will let users change their Gmail address. Julie Chang hosts. Learn more about your ad choices. Visit megaphone.fm/adchoices

Catalyst with Shayle Kann
Will inference move to the edge?

Catalyst with Shayle Kann

Play Episode Listen Later Dec 18, 2025 47:47


Today virtually all AI compute takes place in centralized data centers, driving the demand for massive power infrastructure. But as workloads shift from training to inference, and AI applications become more latency-sensitive (autonomous vehicles, anyone?), there's another pathway: migrating a portion of inference from centralized computing to the edge. Instead of a gigawatt-scale data center in a remote location, we might see a fleet of smaller data centers clustered around an urban core. Some inference might even shift to our devices.
So how likely is a shift like this, and what would need to happen for it to substantially reshape AI power? In this episode, Shayle talks to Dr. Ben Lee, a professor of electrical engineering and computer science at the University of Pennsylvania, as well as a visiting researcher at Google. Shayle and Ben cover topics like:
- The three main categories of compute: hyperscale, edge, and on-device
- Why training is unlikely to move from hyperscale
- The low latency demands of new applications like autonomous vehicles
- How generative AI is training us to tolerate longer latencies
- Why distributed inference doesn't face the same technical challenges as distributed training
- Why consumer devices may limit model capability
Resources:
- ACM SIGMETRICS Performance Evaluation Review: A Case Study of Environmental Footprints for Generative AI Inference: Cloud versus Edge
- Internet of Things and Cyber-Physical Systems: Edge AI: A survey
Credits: Hosted by Shayle Kann. Produced and edited by Daniel Woldorff. Original music and engineering by Sean Marquand. Stephen Lacey is our executive editor.
Catalyst is brought to you by EnergyHub. EnergyHub helps utilities build next-generation virtual power plants that unlock reliable flexibility at every level of the grid. See how EnergyHub helps unlock the power of flexibility at scale, and deliver more value through cross-DER dispatch with their leading Edge DERMS platform, by visiting energyhub.com.
Catalyst is brought to you by Bloom Energy. AI data centers can't wait years for grid power—and with Bloom Energy's fuel cells, they don't have to. Bloom Energy delivers affordable, always-on, ultra-reliable onsite power, built for chipmakers, hyperscalers, and data center leaders looking to power their operations at AI speed. Learn more by visiting BloomEnergy.com.
Catalyst is supported by Third Way. Third Way's new PACE study surveyed over 200 clean energy professionals to pinpoint the non-cost barriers delaying clean energy deployment today and offers practical solutions to help get projects over the finish line. Read Third Way's full report, and learn more about their PACE initiative, at www.thirdway.org/pace.
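A crude latency-budget sketch of the hyperscale / edge / on-device trade-off discussed in the episode (all numbers invented for illustration; real round-trip times and model speeds vary enormously):

# Illustrative round-trip budgets in milliseconds for one inference request.
network_rtt = {"on_device": 0, "metro_edge": 5, "regional_cloud": 30, "remote_hyperscale": 80}
compute_time = {"on_device": 120, "metro_edge": 40, "regional_cloud": 30, "remote_hyperscale": 25}

budget_ms = 100  # e.g. a latency-sensitive control loop
for tier in network_rtt:
    total = network_rtt[tier] + compute_time[tier]
    verdict = "OK" if total <= budget_ms else "too slow"
    print(f"{tier:>17}: {total:3d} ms ({verdict})")
# With these made-up numbers, the metro edge and regional cloud fit the budget while the
# small on-device model is compute-bound and the distant hyperscale site is network-bound --
# the basic argument for moving some latency-sensitive inference closer to users.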