Podcasts about Multimodal

  • 488 PODCASTS
  • 1,051 EPISODES
  • 49m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Oct 1, 2025 LATEST

POPULARITY (chart, 2017 to 2024)


Latest podcast episodes about Multimodal

Remotely Curious
These people made an internal podcast. AI helped them reach the whole team

Oct 1, 2025 · 38:08


Amanda Cupido doesn't speak Spanish or French. But using AI, she and her team helped a global nonprofit make their internal podcast accessible to as many employees as possible. Amanda is an audio producer and the founder of a production company called Lead Podcasting. One of her clients is a global nonprofit with over 35,000 employees, and not all of them speak English. So she made them a pitch: what if they added AI into the mix? They would make the podcast in English, and then use generative AI voice tools to translate it into Spanish and French, with a lot of human oversight, of course. Driven by a desire to use these tools for good, the team never set out to replace people, only to reach more of them, and it worked.

On this episode, Amanda shows what it's like, and what it sounds like, to make a podcast with AI that's still human at its core.

You can learn more about Lead Podcasting at leadpodcasting.com

~ ~ ~

Working Smarter is brought to you by Dropbox Dash, the AI universal search and knowledge management tool from Dropbox. Learn more at workingsmarter.ai/dash

You can listen to more episodes of Working Smarter on Apple Podcasts, Spotify, YouTube Music, Amazon Music, or wherever you get your podcasts. To read more stories and past interviews, visit workingsmarter.ai

This show would not be possible without the talented team at Cosmic Standard: producer Dominic Girard, sound engineer Aja Simpson, technical director Jacob Winik, and executive producer Eliza Smith. Special thanks to our illustrators Justin Tran and Fanny Luor, marketing consultant Meggan Ellingboe, and editorial support from Catie Keck. Our theme song was composed by Doug Stuart. Working Smarter is hosted by Matthew Braga. Thanks for listening!

The Vet Blast Podcast
354: Modern multimodal pain management for patients

Sep 30, 2025 · 32:06


On this episode of The Vet Blast Podcast presented by dvm360, our host Adam Christman, DVM, MBA, and Matthew W. Brunke, DVM, DACVSMR (Canine), CCAT, Fellow IAVRPT, sit down to discuss multimodal pain management in veterinary medicine. Throughout the episode, the doctors take a deep dive into innovative treatment options for pain management, from joint injections to acupuncture, plus the important role clients play in their pets' pain management. Article mentioned in the episode: https://www.dvm360.com/view/pain-management-in-veterinary-medicine-what-s-new- dvm360's Pain Awareness Month content is sponsored by Elanco.

The top AI news from the past week, every ThursdAI

This is a free preview of a paid episode. To hear more, visit sub.thursdai.news

Hola AI aficionados, it's yet another ThursdAI, and yet another week FULL of AI news, spanning open source LLMs, multimodal video and audio creation, and more! Shiptember, as they call it, does seem to deliver, and it was hard even for me to keep up with all the news, not to mention we had 3-4 breaking news items during the show today!

This week was yet another Qwen-mas, with Alibaba absolutely dominating across open source, but also NVIDIA promising to invest up to $100 billion into OpenAI. So let's dive right in! As a reminder, all the show notes are posted at the end of the article for your convenience.

ThursdAI - Because weeks are getting denser, but we're still here, weekly, sending you the top AI content! Don't miss out

Table of Contents

* Open Source AI
* Qwen3-VL Announcement (Qwen3-VL-235B-A22B-Thinking)
* Qwen3-Omni-30B-A3B: end-to-end SOTA omni-modal AI unifying text, image, audio, and video
* DeepSeek V3.1 Terminus: a surgical bugfix that matters for agents
* Evals & Benchmarks: agents, deception, and code at scale
* Big Companies, Bigger Bets!
* OpenAI: ChatGPT Pulse: Proactive AI news cards for your day
* XAI Grok 4 fast - 2M context, 40% fewer thinking tokens, shockingly cheap
* Alibaba Qwen-Max and plans for scaling
* This Week's Buzz: W&B Fully Connected is coming to London and Tokyo & Another hackathon in SF
* Vision & Video: Wan 2.2 Animate, Kling 2.5, and Wan 4.5 preview
* Moondream-3 Preview - Interview with co-founders Vik & Jay
* Wan open sourced Wan 2.2 Animate (aka “Wan Animate”): motion transfer and lip sync
* Kling 2.5 Turbo: cinematic motion, cheaper and with audio
* Wan 4.5 preview: native multimodality, 1080p 10s, and lip-synced speech
* Voice & Audio
* ThursdAI - Sep 25, 2025 - TL;DR & Show notes

Open Source AI

This was a Qwen-and-friends week. I joked on stream that I should just count how many times “Alibaba” appears in our show notes.
It's a lot.

Qwen3-VL Announcement (Qwen3-VL-235B-A22B-Thinking) (X, HF, Blog, Demo)

Qwen 3 launched earlier as a text-only family; the vision-enabled variant just arrived, and it's not timid. The “thinking” version is effectively a reasoner with eyes, built on a 235B-parameter backbone with around 22B active (their mixture-of-experts trick). What jumped out is the breadth of evaluation coverage: MMMU, video understanding (Video-MME, LVBench), 2D/3D grounding, doc VQA, chart/table reasoning, pages of it. They're showing wins against models like Gemini 2.5 Pro and GPT-5 on some of those reports, and doc VQA is flirting with “nearly solved” territory in their numbers.

Two caveats. First, whenever scores get that high on imperfect benchmarks, you should expect healthy skepticism; known label issues can inflate numbers. Second, the model is big. Incredible for server-side grounding and long-form reasoning with vision (they're talking about scaling context to 1M tokens for two-hour video and long PDFs), but not something you throw on a phone.

Still, if your workload smells like “reasoning + grounding + long context,” Qwen 3 VL looks like one of the strongest open-weight choices right now.

Qwen3-Omni-30B-A3B: end-to-end SOTA omni-modal AI unifying text, image, audio, and video (HF, GitHub, Qwen Chat, Demo, API)

Omni is their end-to-end multimodal chat model that unites text, image, and audio, and, crucially, it streams audio responses in real time while thinking separately in the background. Architecturally, it's a 30B MoE with around 3B active parameters at inference, which is the secret to why it feels snappy on consumer GPUs.

In practice, that means you can talk to Omni, have it see what you see, and get sub-250 ms replies in nine speaker languages while it quietly plans. It claims to understand 119 languages.
When I pushed it in multilingual conversational settings it still code-switched unexpectedly (Chinese suddenly appeared mid-flow), and it occasionally suffered the classic “stuck in thought” behavior we've been seeing in agentic voice modes across labs. But the responsiveness is real, and the footprint is exciting for local speech streaming scenarios. I wouldn't replace a top-tier text reasoner with this for hard problems, yet being able to keep speech native is a real UX upgrade.

Qwen Image Edit, Qwen TTS Flash, and Qwen-Guard

Qwen's image stack got a handy upgrade with multi-image reference editing for more consistent edits across shots, which is useful for brand assets and style-tight workflows. TTS Flash (API-only for now) is their fast speech synth line, and Qwen-Guard is a new safety/moderation model from the same team. It's notable because Qwen hasn't really played in the moderation-model space before; historically Meta's Llama Guard led that conversation.

DeepSeek V3.1 Terminus: a surgical bugfix that matters for agents (X, HF)

The DeepSeek whale resurfaced to push a small 0.1 update to V3.1 that reads like a “quality and stability” release, but those matter if you're building on top. It fixes a code-switching bug (the “sudden Chinese” syndrome you'll also see in some Qwen variants), improves tool use and browser execution, and, importantly, makes agentic flows less likely to overthink and stall. On the numbers, Humanity's Last Exam jumped from 15 to 21.7, while LiveCodeBench dipped slightly. That's the story here: they traded a few raw points on coding for more stable, less dithery behavior in end-to-end tasks. If you've invested in their tool harness, this may be a net win.

Liquid Nanos: small models that extract like they're big (X, HF)

Liquid Foundation Models released “Liquid Nanos,” a set of open models from roughly 350M to 2.6B parameters, including “extract” variants that pull structure (JSON/XML/YAML) from messy documents.
The pitch is cost-efficiency with surprisingly competitive performance on information extraction tasks versus models 10× their size. If you're doing at-scale doc ingestion on CPUs or small GPUs, these look worth a try.

Tiny IBM OCR model that blew up the charts (HF)

We also saw a tiny IBM model (about 250M parameters) for image-to-text document parsing trending on Hugging Face. Run in 8-bit, it squeezes into roughly 250 MB, which means Raspberry Pi and “toaster” deployments suddenly get decent OCR/transcription against scanned docs. It's the kind of tiny-but-useful release that tends to quietly power entire products.

Meta's 32B Code World Model (CWM) released for agentic code reasoning (X, HF)

Nisten got really excited about this one, and once he explained it, I understood why. Meta released a 32B code world model that doesn't just generate code; it understands code the way a compiler does. It's thinking about state, types, and the actual execution context of your entire codebase.

This isn't just another coding model. It's a fundamentally different approach that could change how all future coding models are built. Instead of treating code as fancy text completion, it's actually modeling the program from the ground up. If this works out, expect everyone to copy this approach.

Quick note: this one was released with a research license only!

Evals & Benchmarks: agents, deception, and code at scale

A big theme this week was “move beyond single-turn Q&A and test how these things behave in the wild,” with a bunch of new evals released. I wanted to cover them all in a separate segment.

OpenAI's GDP Eval: “economically valuable tasks” as a bar (X, Blog)

OpenAI introduced GDP Eval to measure model performance against real-world, economically valuable work. The design is closer to how I think about “AGI as useful work”: 44 occupations across nine sectors, with tasks judged against what an industry professional would produce.

Two details stood out.
First, OpenAI's own models didn't top the chart in their published screenshot: Anthropic's Claude Opus 4.1 led with roughly a 47.6% win rate against human professionals, while GPT-5-high clocked in around 38%. Releasing a benchmark where you're not on top earns respect. Second, the tasks are legit. One example was a manufacturing engineer flow where the output required an overall design with an exploded view of components, the kind of deliverable a human would actually make.

What I like here isn't the precise percent; it's the direction. If we anchor progress to tasks an economy cares about, we move past “trivia with citations” and toward “did this thing actually help do the work?”

GAIA 2 (Meta Superintelligence Labs + Hugging Face): agents that execute (X, HF)

MSL and HF refreshed GAIA, the agent benchmark, with a thousand new human-authored scenarios that test execution, search, ambiguity handling, temporal reasoning, and adaptability, plus a smartphone-like execution environment. GPT-5-high led across execution and search; Kimi's K2 was tops among open-weight entries. I like that GAIA 2 bakes in time and budget constraints and forces agents to chain steps, not just spew plans. We need more of these.

Scale AI's “SWE-Bench Pro” for coding in the large (HF)

Scale dropped a stronger coding benchmark focused on multi-file edits, 100+ line changes, and large dependency graphs. On the public set, GPT-5 (not Codex) and Claude Opus 4.1 took the top two slots; on a commercial set, Opus edged ahead. The broader takeaway: the action has clearly moved to test-time compute, persistent memory, and program-synthesis outer loops to get through larger codebases with fewer invalid edits. This aligns with what we're seeing across ARC-AGI and SWE-bench Verified.

The “Among Us” deception test (X)

One more that's fun but not frivolous: a group benchmarked models on the social deception game Among Us.
OpenAI's latest systems reportedly did the best job both lying convincingly and detecting others' lies. This line of work matters because social inference and adversarial reasoning show up in real agent deployments: security, procurement, negotiations, even internal assistant safety.

Big Companies, Bigger Bets!

Nvidia's $100B pledge to OpenAI for 10GW of compute

Let's say that number again: one hundred billion dollars. Nvidia announced plans to invest up to $100B into OpenAI's infrastructure build-out, targeting roughly 10 gigawatts of compute and power. Jensen called it the biggest infrastructure project in history. Pair that with OpenAI's Stargate-related announcements (five new datacenters with Oracle and SoftBank and a flagship site in Abilene, Texas) and you get to wild territory fast.

Internal notes circulating say OpenAI started the year around 230MW and could exit 2025 north of 2GW operational, while aiming at 20GW in the near term and a staggering 250GW by 2033. Even if those numbers shift, the directional picture is clear: the GPU supply and power curves are going vertical.

Two reactions. First, yes, the “infinite money loop” memes wrote themselves: OpenAI spends on Nvidia GPUs, Nvidia invests in OpenAI, the market adds another $100B to Nvidia's cap for good measure. But second, the underlying demand is real. If we need 1–8 GPUs per “full-time agent” and there are 3+ billion working adults, we are orders of magnitude away from compute saturation. The power story is the real constraint, and that's now being tackled in parallel.

OpenAI: ChatGPT Pulse: Proactive AI news cards for your day (X, OpenAI Blog)

In a #BreakingNews segment, we got an update from OpenAI that currently works only for Pro users but will come to everyone soon: proactive AI that learns from your chats, email, and calendar and shows you a new “feed” of interesting things every morning based on your likes and feedback!
Pulse marks OpenAI's first step toward an AI assistant that brings the right info before you ask, tuning itself with every thumbs-up, topic request, or app connection. I've tuned mine for today; we'll see what tomorrow brings!

P.S. - Huxe is a free app from the creators of NotebookLM (Raiza was on our podcast!) that does a similar thing, so if you don't have Pro, check out Huxe; they just launched!

XAI Grok 4 fast - 2M context, 40% fewer thinking tokens, shockingly cheap (X, Blog)

xAI launched Grok-4 Fast, and the name fits. Think “top-left” on the speed-to-cost chart: up to 2 million tokens of context, a reported 40% reduction in reasoning token usage, and a price tag that's roughly 1% of some frontier models on common workloads. On LiveCodeBench, Grok-4 Fast even beat Grok-4 itself. It's not the most capable brain on earth, but as a high-throughput assistant that can fan out web searches and stitch answers in something close to real time, it's compelling.

Alibaba Qwen-Max and plans for scaling (X, Blog, API)

Back in the Alibaba camp, they also released their flagship API model, Qwen 3 Max, and showed off their future roadmap. Qwen-Max is an MoE with over 1T parameters that gets 69.6 on SWE-bench Verified and outperforms GPT-5 on LMArena! And their plan is simple: scale. They're planning to go from 1 million to 100 million token context windows and scale their models into the terabytes of parameters. It culminated in a hilarious moment on the show where we all put on sunglasses to salute a slide from their presentation that literally said, “Scaling is all you need.” AGI is coming, and it looks like Alibaba is one of the labs determined to scale their way there. Their release schedule lately (as documented by Swyx from Latent.space) is insane.

This Week's Buzz: W&B Fully Connected is coming to London and Tokyo & Another hackathon in SF

Weights & Biases (now part of the CoreWeave family) is bringing Fully Connected to London on Nov 4–5, with another event in Tokyo on Oct 31.
If you're in Europe or Japan and want two days of dense talks and hands-on conversations with teams actually shipping agents, evals, and production ML, come hang out. Readers got a code on stream; if you need help getting a seat, ping me directly.

Links: fullyconnected.com

We are also opening up registrations to our second WeaveHacks hackathon in SF, October 11-12. Yours truly will be there, so come hack with us on self-improving agents! Register HERE

Vision & Video: Wan 2.2 Animate, Kling 2.5, and Wan 4.5 preview

This is the most exciting space in AI week-to-week for me right now. The progress is visible. Literally.

Moondream-3 Preview - Interview with co-founders Vik & Jay

While I've already reported on Moondream-3 in last week's newsletter, this week we had the pleasure of hosting Vik Korrapati and Jay Allen, the co-founders of Moondream, to tell us all about it. Tune in for that conversation on the pod starting at 00:33:00

Wan open sourced Wan 2.2 Animate (aka “Wan Animate”): motion transfer and lip sync

Tongyi's Wan team shipped an open-source release that the community quickly dubbed “Wanimate.” It's a character-swap/motion transfer system: provide a single image for a character and a reference video (your own motion), and it maps your movement onto the character with surprisingly strong hair/cloth dynamics and lip sync. If you've used Runway's Act One, you'll recognize the vibe, except this is open, and the fidelity is rising fast.

The practical uses are broader than “make me a deepfake.” Think onboarding presenters with perfect backgrounds, branded avatars that reliably say what you need, or precise action blocking without guessing at how an AI will move your subject. You act it; it follows.

Kling 2.5 Turbo: cinematic motion, cheaper and with audio

Kling quietly rolled out a 2.5 Turbo tier that's 30% cheaper and finally brings audio into the loop for more complete clips.
Prompts adhere better, physics look more coherent (acrobatics stop breaking bones across frames), and the cinematic look has moved from “YouTube short” to “film-school final.” They seeded access to creators and re-shared the strongest results; the consistency is the headline. (Source X: @StevieMac03)

I chatted with my kiddos today over FaceTime, and they were building Minecraft creepers. I took a screenshot, sent it to Nano Banana to turn their creepers into actual Minecraft ones, and then animated the explosions for them with Kling. They LOVED it! The animations were clear, and while VEO refused to even let me upload their images, Kling didn't care haha

Wan 4.5 preview: native multimodality, 1080p 10s, and lip-synced speech

Wan also teased a 4.5 preview that unifies understanding and generation across text, image, video, and audio. The eye-catching bit: generate a 1080p, 10-second clip with synced speech from just a script. Or supply your own audio and have it lip-sync the shot. I ran my usual “interview a polar bear dressed like me” test and got one of the better results I've seen from any model. We're not at “dialogue scene” quality, but “talking character shot” is getting… good. Its audio generation (not only text + lip sync) is one of the best besides VEO. It's really great to see how strongly this is improving; sad that this wasn't open sourced! And apparently it supports “draw text to animate” (Source: X)

Voice & Audio

Suno V5: we've entered the “I can't tell anymore” era

Suno calls V5 a redefinition of audio quality. I'll be honest, I'm at the edge of my subjective hearing on this. I've caught myself listening to Suno streams instead of Spotify and forgetting anything is synthetic. The vocals feel more human, the mixes cleaner, and the remastering path (including upgrading V4 tracks) is useful.
The last 10% to “you fooled a producer” is going to be long, but the distance between V4 and V5 already makes me feel like I should re-cut our ThursdAI opener.

MiMI Audio: a small omni-chat demo that hints at the floor

We tried a MiMI Audio demo live: a 7B-ish model with speech in/out. It was responsive but stumbled on singing and natural prosody. I'm leaving it in here because it's a good reminder that the open floor for “real-time voice” is rising quickly even for small models. And the moment you pipe a stronger text brain behind a capable, native speech front-end, the UX leap is immediate.

Ok, another DENSE week that finishes up Shiptember: tons of open source, Qwen (Tongyi) shines, and video is getting so so good. This is all converging, folks, and honestly, I'm just happy to be along for the ride!

This week was also Rosh Hashanah, which is the Jewish new year, and I shared on the pod that I found my X post from 3 years ago, made using the state-of-the-art AI models of the time. WHAT A DIFFERENCE 3 years make! Just take a look: I had to scale down the 4K one from this year just to fit into the pic!

Shana Tova to everyone who's reading this, and we'll see you next week
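
Several figures quoted in the notes above come down to simple arithmetic, so a rough sketch may help with sanity-checking them. The Python below is only a back-of-the-envelope illustration and is not from the show: the function names are invented for this example, the parameter counts, the 3 billion workers, and the 1 to 8 GPUs per agent are taken from the text, and the size estimate counts weights only, with 1 MB assumed to be 10^6 bytes.

```python
"""Back-of-the-envelope checks for figures quoted in the notes above.

All helper names are hypothetical; the inputs come from the text.
"""


def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a mixture-of-experts model's parameters used per token."""
    return active_params_b / total_params_b


def weight_size_mb(params_millions: float, bits_per_param: int) -> float:
    """Approximate weight storage in MB (assuming 1 MB = 10**6 bytes).

    Counts weights only; activations and runtime overhead are ignored.
    """
    total_bytes = params_millions * 1e6 * bits_per_param / 8
    return total_bytes / 1e6


def gpu_demand(workers: float, gpus_per_agent: float) -> float:
    """GPUs needed if every worker had one always-on agent."""
    return workers * gpus_per_agent


if __name__ == "__main__":
    # Qwen3-VL-235B-A22B: about 22B of 235B parameters active per token.
    print(f"Qwen3-VL active share:    {active_fraction(235, 22):.1%}")
    # Qwen3-Omni-30B-A3B: about 3B of 30B parameters active per token.
    print(f"Qwen3-Omni active share:  {active_fraction(30, 3):.1%}")
    # The ~250M-parameter IBM model at 8 bits versus 16 bits per weight.
    print(f"250M params at 8-bit:     {weight_size_mb(250, 8):.0f} MB")
    print(f"250M params at 16-bit:    {weight_size_mb(250, 16):.0f} MB")
    # 3 billion workers at 1 to 8 GPUs per full-time agent.
    low, high = gpu_demand(3e9, 1), gpu_demand(3e9, 8)
    print(f"GPU demand range:         {low:.1e} to {high:.1e}")
```

Under these assumptions, both Qwen models activate roughly a tenth of their weights per token, which fits the "snappy" description, and one byte per parameter is what puts a 250M-parameter model at about 250 MB.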

The MadTech Podcast
MadTech Daily: Double-Digit Growth Ahead for Digital Ad spend; Alibaba Unveils Multimodal AI; eBay Moves to Buy Tise

Sep 24, 2025 · 2:04


In today's MadTech Daily, we discuss double-digit growth ahead for digital ad spend, Alibaba unveiling a multimodal AI, and eBay moving to buy Tise.

WALL STREET COLADA
Wall Street steady, Powell in focus, Rocket Lab headed to Mars, Alibaba launches multimodal AI, and Boeing close to a mega order in China

Sep 23, 2025 · 4:41


Show summary:
  • Wall Street futures are flat after new record highs and $NVDA's investment in OpenAI, with the focus on Powell and PCE.
  • Rocket Lab $RKLB delivered two spacecraft for a NASA mission to Mars, setting a record for development time.
  • Alibaba $BABA presented Qwen3-Omni, an open source multimodal AI model that competes with $GOOG and $MSFT.
  • The U.S. and China are negotiating an order of up to 500 Boeing $BA aircraft, the largest since 2017.

Edtech Insiders
Google's Vision of a Personal Tutor for Every Student: Dave Messer on Guided Learning

Sep 23, 2025 · 45:33 · Transcription available


Dave Messer is a Product Manager on Google's Learning & Education team, where he leads product for emerging learning initiatives, including Learning Labs and Gemini. A former teacher with master's degrees in software engineering and education, he works with experts and the education community to build products that target real-world challenges for educators and students.

K12ArtChat the Podcast
Episode 231 – Tamryn McDermott – Multimodal Learning

Sep 18, 2025 · 43:32


In this episode, Laura talks with artist, researcher, and teacher Tamryn McDermott about multimodal learning! Listen in to learn what this is and how it can be applied in an art classroom. Tamryn shares strategies and activities she's used in her own experience that art educators can introduce into their existing programs. During the discussion, she also explores how technology and media arts can be intertwined with multimodal learning concepts! Tune in to hear more from Tamryn.

Remotely Curious
For these shipwreck-hunting humans, AI is part of the crew

Sep 17, 2025 · 30:39


Very few people get paid to visit shipwrecks, but for Stephanie Gandulla, it's all part of the job. Stephanie is a scuba diver, maritime archeologist, and resource protection coordinator for the Thunder Bay National Marine Sanctuary. The agency safeguards Lake Huron's historic shipwrecks, many of which have yet to be discovered. That's where Katie Skinner comes in. She's an assistant professor at the University of Michigan and the director of the school's Field Robotics Group. Skinner and her team have been developing autonomous underwater vehicles that can find new shipwreck sites, all on their own. For humans, a search is costly, time-consuming, manual work. But for AI? Skinner thinks it could help us find answers in a snap. On this episode, Stephanie and Katie talk about using AI to find shipwrecks in a literal lake of data, so that they can spend less time searching and more time exploring, as only humans can do.

You can learn more about some of the people and projects featured in this episode, including…
  • The Thunder Bay National Marine Sanctuary at thunderbay.noaa.gov
  • Katie Skinner and the University of Michigan's Field Robotics Group at fieldrobotics.engin.umich.edu
  • Previous efforts to autonomously map Thunder Bay's historical shipwrecks at theverge.com/2020/3/5/21157791/drone-autonomous-boat-ben-shipwreck-alley-unh-noaa-great-lakes-thunder-bay

Gradient Dissent - A Machine Learning Podcast by W&B
The Startup Powering The Data Behind AGI

Sep 16, 2025 · 56:15


In this episode of Gradient Dissent, Lukas Biewald talks with Edwin Chen, the CEO & founder of Surge AI, the billion-dollar company quietly powering the next generation of frontier LLMs. They discuss Surge's origin story, why traditional data labeling is broken, and how their research-focused approach is reshaping how models are trained.

You'll hear why inter-annotator agreement fails in high-complexity tasks like poetry and math, why synthetic data is often overrated, and how Surge builds rich RL environments to stress-test agentic reasoning. They also go deep on what kinds of data will be critical to future progress in AI, from scientific discovery to multimodal reasoning and personalized alignment.

It's a rare, behind-the-scenes look into the world of high-quality data generation at scale, straight from the team most frontier labs trust to get it right.

Timestamps:
00:00 – Intro: Who is Edwin Chen?
03:40 – The problem with early data labeling systems
06:20 – Search ranking, clickbait, and product principles
10:05 – Why Surge focused on high-skill, high-quality labeling
13:50 – From Craigslist workers to a billion-dollar business
16:40 – Scaling without funding and avoiding Silicon Valley status games
21:15 – Why most human data platforms lack real tech
25:05 – Detecting cheaters, liars, and low-quality labelers
28:30 – Why inter-annotator agreement is a flawed metric
32:15 – What makes a great poem? Not checkboxes
36:40 – Measuring subjective quality rigorously
40:00 – What types of data are becoming more important
44:15 – Scientific collaboration and frontier research data
47:00 – Multimodal data, Argentinian coding, and hyper-specificity
50:10 – What's wrong with LMSYS and benchmark hacking
53:20 – Personalization and taste in model behavior
56:00 – Synthetic data vs. high-quality human data

Follow Weights & Biases:
https://twitter.com/weights_biases
https://www.linkedin.com/company/wandb

Database School
The database for all your AI needs

Sep 16, 2025 · 60:07


Marcel Kornacker, the creator of Apache Impala and co-creator of Apache Parquet, joins me to talk about his latest project: Pixeltable, a multimodal AI database that combines structured and unstructured data with rich, Python-native workflows.

From ingestion to vector search, transcription to snapshots, Pixeltable eliminates painful data plumbing for modern AI teams.

Follow Marcel
Pixeltable: https://pixeltable.com
Pixeltable GitHub: https://github.com/pixeltable/pixeltable
LinkedIn: https://www.linkedin.com/in/marcelkornacker

Follow Aaron
Twitter: https://twitter.com/aarondfrancis
LinkedIn: https://www.linkedin.com/in/aarondfrancis
Website: https://aaronfrancis.com – find articles, podcasts, courses, and more
Database School: https://databaseschool.com

Chapters
0:00 – Introduction
0:20 – Meet Marcel Kornacker
1:19 – Early career and grad school in databases
2:12 – Joining Google and building F1
3:42 – How F1 used Spanner at Google
4:01 – Starting Apache Impala at Cloudera
6:02 – Why SQL still matters
7:29 – What keeps Marcel fascinated with databases
9:37 – The “SQL is dead” waves and shift to AI
10:21 – Observing pain points in computer vision pipelines
13:02 – Multimodal data challenges and the idea for Pixeltable
16:10 – How Pixeltable handles transformations with computed columns
26:29 – Example: processing video, audio, and transcripts in Pixeltable
33:12 – DAG execution and parallelism explained
37:00 – Transactional guarantees in Pixeltable
39:00 – Iterators and chunking data for search
42:26 – Using embeddings and semantic search
47:05 – Updating data and incremental recomputation
50:06 – Thoughts on RAG and hybrid search
53:14 – Real-world use cases and dataset curation
57:00 – Example: labeling food waste on cruise ships
1:02:00 – Labeling workflows and syncing annotations
1:02:41 – Pixeltable's roadmap and cloud vision
1:07:10 – How to get involved with Pixeltable
1:09:03 – Closing and where to find Marcel

Open Source Startup Podcast
E181: Why Multimodal Is the Future of AI Data Workloads

Sep 9, 2025 · 36:31


Chang She is Co-Founder & CEO of LanceDB, the multimodal lakehouse platform. Their open source data format, Lance, has over 5K stars on GitHub and is a modern columnar data format for ML and LLMs implemented in Rust.

LanceDB has raised $41M from investors including Theory Ventures, CRV, and Essence VC.

In this episode, we dig into:
  • Early focus: autonomous vehicles; solved real-time analysis limits with the Lance format → 9,000% performance gain.
  • Multimodal AI taking off (vision, audio, text); Midjourney & Runway as pioneers; audio now a major category.
  • How they built trust through open source.
  • Integrated workflows (data prep + search + embedding) going beyond vector DBs; education needed to show full value.
  • Cloud/serverless launch in 2023–24 enabled seamless local-to-production use.
  • Future bets: audio infra, robotics, spatial reasoning; vector DBs risk irrelevance if they don't evolve.

Remotely Curious
Why the hot new ingredient in this chef's pantry is AI

Sep 3, 2025 · 33:02


Ian Ramirez has spent his career finding innovative ways to make mouth-watering meals for clients, and one of his latest ingredients is artificial intelligence. As a chef, culinary consultant, and co-founder of Mad Honey Culinary Studio and Goods, he's the guy that brands hire to get their product on restaurant menus, and make it look and taste good, whether it's a sauce, syrup, spread, or spice. Ian uses AI to tackle the repetitive, time-consuming parts of menu planning for commercial kitchens, and help clients visualize new concepts before anything gets sliced or diced. It's a tool that augments his creativity, he says, and makes prep less of a grind. On this episode, Ian talks about how AI is helping him and his team spend more time doing what they love: cooking and getting creative in the kitchen.

Learn more about Mad Honey Culinary Studio and Goods at madhoneyculinary.com

Learn more about Dropbox Dash, the AI universal search and knowledge management tool from Dropbox, at workingsmarter.ai/dash

Digital Pathology Podcast
158: Multimodal Magic AI's Role in Lung & Prostate Cancer Predictions

Digital Pathology Podcast

Play Episode Listen Later Aug 29, 2025 28:50 Transcription Available


What if AI could predict cancer outcomes better than traditional methods, and at a fraction of the cost? In this episode, I explore how multimodal AI is reshaping lung and prostate cancer predictions and why integration challenges still stand in the way.
Episode Highlights with Timestamps:
[00:02:57] Agentic AI in toxicologic pathology – what it is and how it could orchestrate workflows.
[00:05:40] Grandium desktop scanners – making histology studies more accessible and efficient.
[00:08:03] Clover framework – a cost-effective multimodal model combining vision + language for pathology.
[00:13:40] NSCLC study (Beijing Chest Hospital) – AI predicts progression-free and overall survival with high accuracy.
[00:17:58] Prostate cancer prognostic model (Cleveland Clinic & US partners) – validating AI-enabled Pathomic PRA test.
[00:23:35] Thyroid neoplasm classification – challenges for AI in distinguishing overlapping histopathological features.
[00:34:49] Real-world Belgium case study – AI integration into prostate biopsy workflow reduced IHC testing and turnaround time.
[00:41:03] Lessons learned – adoption hurdles, system integration, and why change management is essential for successful digital transformation.
Resources from this Episode:
  • World Tumor Registry – A global open-access repository for histopathology images.
  • Beijing Chest Hospital NSCLC AI Prognostic Study – Prognosis prediction using multimodal models.
  • Cleveland Clinic Pathomic PRA Study – Independent validation of AI-enabled prostate cancer risk assessment.
  • Grandium Scanners – Compact desktop scanners for histology slides: Grandium.ai
Become a Digital Pathology Trailblazer: get the "Digital Pathology 101" FREE E-book and join us!

Remotely Curious
Coming soon: Season two

Remotely Curious

Play Episode Listen Later Aug 27, 2025 1:54


Working Smarter is back for season two! Starting September 3, we're going beyond the hype and headlines to bring you stories about real people using AI to do more of what they love about their jobs. From the F1 track to the kitchen, and even the bottom of a lake, learn how new tools are helping creatives, makers, visionaries, and their teams think big, move faster, and focus on the work that matters most.

This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Multimodal AI Models on Apple Silicon with MLX with Prince Canuma - #744

This Week in Machine Learning & Artificial Intelligence (AI) Podcast

Play Episode Listen Later Aug 26, 2025 70:20


Today, we're joined by Prince Canuma, an ML engineer and open-source developer focused on optimizing AI inference on Apple Silicon devices. Prince shares his journey to becoming one of the most prolific contributors to Apple's MLX ecosystem, having published over 1,000 models and libraries that make open, multimodal AI accessible and performant on Apple devices. We explore his workflow for adapting new models in MLX, the trade-offs between the GPU and Neural Engine, and how optimization methods like pruning and quantization enhance performance. We also cover his work on "Fusion," a weight-space method for combining model behaviors without retraining, and his popular packages—MLX-Audio, MLX-Embeddings, and MLX-VLM—which streamline the use of MLX across different modalities. Finally, Prince introduces Marvis, a real-time speech-to-speech voice agent, and shares his vision for the future of AI, emphasizing the move towards "media models" that can handle multiple modalities, and more. The complete show notes for this episode can be found at https://twimlai.com/go/744.

Digital Pathology Podcast
156: Digital Pathology and AI in Cancer Grading, T-Cell Imaging & Biomarkers

Digital Pathology Podcast

Play Episode Listen Later Aug 21, 2025 34:35 Transcription Available


Can AI Grade Cancer Better Than Us? The Truth About T-Cell Imaging, Biomarkers & Digital Pathology Disruption
You think Saturday mornings are for coffee? Try diving into bone marrow morphology, organ donor kidney biopsies, and AI-driven metastasis detection at sunrise. That's how I do it, and you're invited to join.
Welcome to another data-packed episode of DigiPath Digest, where we explore the latest frontier in digital pathology and AI. This time, I reviewed some of the most exciting recent abstracts spanning cancer grading, T-cell quantification, and AI agents in oncology decision-making.
These studies aren't just fascinating; they're redefining what's possible in diagnostics, especially in under-resourced areas where digital pathology can create game-changing access and efficiency.

Windows Weekly (MP3)
WW 946: Backing up the Intel Truck - Microsoft's gamescom 2025 reveals

Windows Weekly (MP3)

Play Episode Listen Later Aug 20, 2025 Transcription Available


Leo, Paul, and Richard break down Google's Pixel 10 launch spectacle, poking fun at celebrity overkill and asking whether anyone actually cares about new phones anymore. Plus, they dig into Lenovo's record-breaking quarter, surprising shifts in the PC market, and the ongoing struggle between innovation and copycatting in the AI arms race. Also, Notion has finally added basic offline support, which should make it stickier than ever.
You got your AI in my Windows
  • Pavan Davuluri discusses how AI will impact the Windows user experience
  • Not the same video series as the previous "vision" video
  • Davuluri leads Windows and Surface, so his words matter
  • Changing: Interactions, business models, experiences
  • Multimodal - in this case, meaning adding natural language interactions and vision to keyboard, mouse, touch, pen, etc. - "experience diversity"
  • Powerful AI models running on-device are "transformational"
  • Predictably, the Chicken Littles are losing their s#%t yet again. Guys. Come on.
Windows 11
  • Semantic search and new Copilot home page for all Insiders
  • Click to Do selection modes, minor improvements in Beta and Dev
  • Recall and other Copilot+ PC features FINALLY come to Canary
  • A few minor additions to Canary, nothing new to everyone else
  • Notepad is getting an updated context menu and the Chicken Littles are losing their s#%t yet again. Guys. Come on!
  • Lenovo earnings up 22 percent, best PC market share ever, number one in AI PCs too
AI
  • Google Chrome takes the subtle approach
  • Brave found a major security vulnerability in Comet
  • Like my wife, Gemini remembers everything I ever said now
  • Duck.ai gets GPT-5 Mini access, web search results
  • Grammarly announces CODA-based editor, several AI agents
Xbox and games
  • Another stunning Windows on Arm development
  • The Xbox app actually works now on Windows 11 on Arm, meaning not just game streaming but also downloads. Except, of course, that it mostly doesn't work
  • Heretic/Hexen installs and runs great
  • Asus ROG Xbox Ally handhelds to launch on October 16
  • Call of Duty: Black Ops 7 with four-player co-op campaign
  • Indiana Jones coming to the Switch 2
  • Gears of War: Reloaded, more coming to Game Pass in late August
  • To help Xbox, Sony raises prices on the PS5
  • GeForce Now gets more powerful cloud GPUs
Tips & picks
  • Tip of the week: Windows 11 Field Guide, 25H2 Edition is on the way
  • App pick of the week: Notion
  • RunAs Radio this week: Data Governance for AI with Martina Grom
  • Brown liquor pick of the week: Chichibu Ichiro's Malt & Grain Whisky
Hosts: Leo Laporte, Paul Thurrott, and Richard Campbell
Download or subscribe to Windows Weekly at https://twit.tv/shows/windows-weekly
Check out Paul's blog at thurrott.com
The Windows Weekly theme music is courtesy of Carl Franklin.
Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access. Join today: https://twit.tv/clubtwit
Sponsor: uscloud.com

All TWiT.tv Shows (MP3)
Windows Weekly 946: Backing up the Intel Truck

All TWiT.tv Shows (MP3)

Play Episode Listen Later Aug 20, 2025 124:41 Transcription Available



Radio Leo (Audio)
Windows Weekly 946: Backing up the Intel Truck

Radio Leo (Audio)

Play Episode Listen Later Aug 20, 2025 124:41 Transcription Available



Windows Weekly (Video HI)
WW 946: Backing up the Intel Truck - Microsoft's gamescom 2025 reveals

Windows Weekly (Video HI)

Play Episode Listen Later Aug 20, 2025 124:41 Transcription Available



All TWiT.tv Shows (Video LO)
Windows Weekly 946: Backing up the Intel Truck

All TWiT.tv Shows (Video LO)

Play Episode Listen Later Aug 20, 2025 124:41 Transcription Available



Radio Leo (Video HD)
Windows Weekly 946: Backing up the Intel Truck

Radio Leo (Video HD)

Play Episode Listen Later Aug 20, 2025 124:41 Transcription Available



The OJSM Hot Corner
“Postoperative Opioid Reduction Using a Multimodal Pain Protocol for Outpatient Orthopaedic Sports Medicine Surgery” with Author, Dr. J. Preston Van Buren, DO

The OJSM Hot Corner

Play Episode Listen Later Aug 14, 2025 22:20


Multimodal analgesia refers to a pain medication strategy that targets multiple chemical pathways to achieve adequate pain relief. This concept has grown in popularity over the years, particularly in light of the recognition that opioids have major downsides, including dependence. We welcome Dr. J. Preston Van Buren, DO, from the Naval Medical Center in San Diego to discuss his team's findings after implementing a focused multimodal analgesia strategy with fewer prescribed opioid tablets following sports medicine surgery, compared with the more traditional, opioid-heavy regimen.

AJR Podcast Series
Exploring the Use of Multimodal Generative AI in Reading Chest Radiographs for Tuberculosis Screening

AJR Podcast Series

Play Episode Listen Later Aug 11, 2025 8:23


Full article: Multimodal Generative Artificial Intelligence Model for Creating Radiology Reports for Chest Radiographs in Patients Undergoing Tuberculosis Screening
Widespread radiographic screening for tuberculosis can be challenged in certain regions by limited radiologist availability. Dora Chen, MD, discusses a recent AJR article by Hong et al. that evaluates the potential use of generative AI for chest radiography interpretation in this setting.

The Preschool SLP
184. Gestalt Language Processing Intervention: What's Evidence-Based—and What Isn't? Part 2

The Preschool SLP

Play Episode Listen Later Aug 7, 2025 31:06


Are you using the Natural Language Acquisition (NLA) framework in your autism intervention? This episode of The Preschool SLP pulls back the curtain on Gestalt Language Processing (GLP) and challenges you to think critically about what's truly supported by research and what isn't. SLPs are increasingly encouraged to adopt GLP-informed interventions, but a recent article by Venker and Lorang (2025) in response to Haydock et al. (2024) raises five concerns you can't afford to ignore. In this episode, we break down each criticism with clinical insight and offer evidence-aligned strategies you can use immediately in your therapy room.

The Chris Voss Show
The Chris Voss Show Podcast – AI Revolutionizing Finance and Insurance with Multimodal’s Founder Ankur Patel

The Chris Voss Show

Play Episode Listen Later Aug 1, 2025 28:02


AI Revolutionizing Finance and Insurance with Multimodal's Founder Ankur Patel
Multimodal.dev
Ankur Patel is the founder and CEO of Multimodal, a cutting-edge company at the forefront of implementing AI solutions for finance and insurance industries. With over a decade of experience in machine learning and hands-on experience at reputable financial institutions such as JP Morgan and Bridgewater, Ankur brings deep expertise in both finance and AI. Under his leadership, Multimodal aims to automate complex processes, improve decision-making, and enhance efficiency within these industries.
Episode Summary:
In this engaging episode of The Chris Voss Show, host Chris Voss delves into the world of AI with Ankur Patel, the visionary founder and CEO of Multimodal. As AI continues to revolutionize industries, they explore how Multimodal's AI-driven solutions are streamlining complex processes in finance and insurance, promising to transform consumer experiences and business efficiencies alike. Throughout the discussion, Ankur emphasizes the game-changing capabilities of AI in automating administrative tasks, thereby reducing costs and accelerating decision-making. Buzzwords like "agentic AI," "large language models," and "streamlining workflows" dominate the conversation as Patel explains how enterprises and mid-market players can enhance operations. Chris Voss humorously relates his mortgage company experiences, illustrating the tedious past processes that AI could now simplify. The podcast also tackles the potentially changing job landscape due to AI, addressing both apprehensions and exciting possibilities.
Key Takeaways:
* **Revolutionizing Insurance and Finance:** Multimodal's AI solutions aim to automate complex processes, reduce administrative costs, and speed up decision-making in industries like insurance and finance.
* **Agentic AI's Role:** Ankur Patel explains agentic AI as systems capable of performing actions, not just chat-based responses, enhancing operational efficiency significantly.
* **Impacts on Employment:** While AI might change job roles, it's poised to create new opportunities, with human workers focusing more on critical thinking tasks.
* **Improving Consumer Experiences:** By streamlining operations, AI promises faster service delivery and potentially lower costs in sectors like mortgage and insurance.
* **Preserving Expert Knowledge:** AI helps retain critical expertise across industries, mitigating the risks associated with workforce turnover and knowledge gaps.
Notable Quotes:
1. "AI can do certain things that are ridiculously hard for people to do."
2. "People will offload to one or several AI agents, trained to do specific things really well."
3. "The speed at which AI could resolve some of this... is going to be a massive unlock."
4. "Job roles will change, and expectations will evolve, requiring increased productivity with AI."
5. "AI is able to look at tens or hundreds of thousands of claims beyond the capability of the human brain."

Growing With Proficiency The Podcast
Episode 156: Planning With Purpose — A Conversation with Dr. Diane Neubauer & Dr. Reed Riggs

Growing With Proficiency The Podcast

Play Episode Listen Later Jul 31, 2025 63:44


Planning for acquisition isn't about perfection; it's about intention. In this powerful episode, I sit down with Dr. Diane Neubauer and Dr. Reed Riggs for one of the most meaningful conversations I've ever had about lesson planning in the world language classroom.
We explore what it means to plan with purpose: centering curiosity, communication, and student connection instead of rigid structures or vocabulary lists.
✨ Key takeaways:
  • Why essential questions can guide your unit better than word banks
  • How to stay flexible with language goals (because students don't always know what we think they know)
  • The power of interpretive communication: why listening and reading matter just as much as (or more than!) speaking
  • How routines and student responses can lead the way
  • Why language is the vehicle, not the destination
Whether you're new to acquisition-driven instruction or looking to refine your approach, this episode will inspire and ground your planning process.

Agency Unfiltered
Designing for a Generative Future: AI, Intelligent Interfaces, and Enterprise CX

Agency Unfiltered

Play Episode Listen Later Jul 30, 2025 20:28


Multimodal interfaces. Real-time personalization. Data privacy. Content ownership. Responsible AI. In this episode, Eve Sangenito of global consultancy Perficient offers a grounded, enterprise lens on the evolving demands of AI-powered customer experience—and what leaders (and the partners who support them) need to understand right now. Eve and Sarah explore how generative AI is reshaping customer expectations, guiding tech investments, and redefining experience delivery at scale. For anyone driving digital transformation, building AI strategy, or modernizing enterprise CX, this conversation is a timely look at what's shifting—and what's ahead.

Cyber Security Weekly Podcast
Episode 460 - Contactless multimodal biometric software for digital identity

Cyber Security Weekly Podcast

Play Episode Listen Later Jul 24, 2025 6:55


Blue Biometrics (Blue) was founded in Australia in September 2017 to develop world-class contactless biometric technology for dual-use national security and commercial applications. Blue now also operates companies in the UK and the USA. Prior to founding Blue, the co-founders were involved in pioneering contactless technology.
Blue Biometrics is now delivering to law enforcement the next generation of LEA Blue, software that turns smartphone cameras into contactless fingerprint scanners, ideal for field identification in policing.
We speak with CEO and Founder Kenneth King at the World Police Summit 2025, held at the Dubai World Trade Centre, 13-15 May. MySecurity Media were media partners to the WPS 2025.
#Worldpolicesummit #wps2025 #mysecuritytv

Data Driven
Dr Alan Bekker on Multimodal Avatars, Education, and Authentic Digital Connections

Jul 23, 2025 · 57:09


In today's conversation, hosts BAILeY and Frank La Vigne sit down with Dr. Alan Bekker, co-founder and CEO of eSelf AI and former co-founder of Voca AI, which was acquired by Snap in 2020. Dr. Bekker brings a powerhouse combination of academic expertise and entrepreneurial experience, with a PhD in machine learning and AI and research spanning voice, NLP, and computer vision.

This episode dives into how eSelf AI is transforming human-machine interaction by moving beyond chat and voice to real-time, face-to-face video AI assistants. Imagine smarter digital avatars that don't just talk but engage visually and contextually, delivering personalized tutoring, streamlined customer service, and even real estate tours.

Along the way, the conversation goes deep: they discuss the pyramid of engagement from text to voice to fully visual interaction, and what the next generation of AI means for education, accessibility, and the way we connect with each other. Dr. Bekker offers insights from his research on multimodal AI and reflects on the philosophical questions these new technologies raise, such as authenticity, connection, and the future of what we call “real.”

Whether you're curious about the engineering magic behind lifelike avatars, the ethical dilemmas of AI interviews, or how technology is reshaping the learning experience, this episode is a fascinating look at where artificial intelligence, and humanity, are headed next.

Strap in, update your firmware, and get ready to stay Data Driven.

eSelf website: https://www.eself.ai/

Timestamps
00:00 "Exploring AI with Dr. Bekker"
05:49 Accessibility Drives eSelf's Innovation Journey
06:58 Advanced Voice Interaction Evolution
11:33 Personalized Tutoring via Multimodal AI
14:33 AI in Sci-Fi and Real Life
16:21 Understanding Users Via Encoding Algorithms
19:44 Digital Twin to AI Avatars Transition
23:14 AI Misunderstandings Due to Background Noise
27:19 AI's Role in Job Interviews
32:36 Nvidia Tool Alters Eye Contact
35:15 AI-Human Communication Blur
38:31 Reality and Gaming Perception Debate
41:51 Influencer Reality vs. Perception
44:45 Identity and Authenticity Dilemma
48:30 "Influencer's Authenticity vs. Flashiness"
53:22 Fake Celeb Math Explainer TikToks
56:10 "AI's Future: Human-Like Interaction"
57:07 Data-Driven Focus

Vegan Podcast
Possible Spike Therapy (Banned) | Andreas Schlecht #1226

Jul 20, 2025 · 39:47


Andreas Schlecht examines the cellular risks of lipid nanoparticles and shows how regenerative peptides such as KPV could potentially help with long COVID, neuroinflammation, and hormonal dysfunction, despite regulatory hurdles.

Helpful complexes:
Long C: https://www.sunday.de/long-c-komplexe/ (code: TAN34909)
Acetyl L-Carnitine: https://www.sunday.de/acetyl-l-carnitin-kapseln.html (voucher: TAN34909)
Molecusan: https://molecusan.com/products/liquid-spectrum (code: vegan10)

The Preschool SLP
181. Thinking About SLP Contracting? Here's What You Need to Know First to Prevent Burnout

Jul 17, 2025 · 38:37


Is the way you're working… actually working? In this candid and empowering episode of The Preschool SLP, I sit down with the SLP Happy Hour host, Sarah Lockhart, to talk about what no one tells you in grad school: the path to sustainability as an SLP is rarely linear. Together, we unpack what it really takes to thrive in this field, from navigating burnout to rebuilding self-trust, shifting careers, and defining success on your own terms. Whether you're in public schools, private practice, teletherapy, or contracting, this conversation will meet you exactly where you are.

What You'll Learn:
The hidden costs of perfectionism and overworking
When private practice becomes unsustainable
The real pros and cons of contracting, teletherapy, and salaried school positions
How to build a fulfilling SLP career through self-trust and clinical intuition
Why working smarter (not harder) is the key to staying in this field long term

If this episode hit home, and you're craving less prep and more purpose in your therapy sessions, then it's time to join the SIS Membership. Inside SIS, you'll get:
✅ Weekly ready-to-go, research-backed therapy materials
✅ Tools like the Behavior Flip Cards and Progress Diplomas featured in this episode
✅ Multimodal strategies that support the whole child
✅ A system that protects your energy so you can focus on what matters: being present

Join now and start working smarter today: www.kellyvess.com/sis

00:00 Introduction and Inspirational Opening
02:33 Sarah's Journey as an SLP
04:16 Challenges and Realities of Private Practice
04:46 The Shift to Telepractice
07:24 Balancing Work and Personal Life
18:12 Top Recovery Tips for SLPs

Get your Behavior Rule Flip Cards + Behavior Diplomas at:

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 559: ChatGPT's Updated Custom GPTs: What's New and How They Work

Jul 2, 2025 · 46:40


Wanna hear a lil secret?

You (likely) have no clue what custom GPTs are capable of inside of ChatGPT. OpenAI just updated their capabilities, yet no one's talking about it. Why? The original hype and hoopla from their late 2023 launch fizzled and faded away, and now many AI users have written GPTs off. Big mistake. You won't believe what the newly upgraded GPTs are capable of.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Thoughts on this? Join the convo.
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
Custom GPTs Launch & Initial Reception
Updated OpenAI Custom GPT Capabilities
Expanded Model Support for Custom GPTs
Business Applications of Custom GPT Updates
Live Demo of New Custom GPT Features
Insight Synthesizer GPT's Unique Abilities
Meeting Actionizer GPT for Business Efficiency
Personalizing with the Updated GPT Models

Timestamps:
00:00 "Upgraded Custom GPTs Revolution"
04:52 GPT Building: Web Access Only
06:46 "Podcast Rambling Concerns"
09:56 Benefits of Using Custom GPTs
13:18 Using Custom GPTs and GPT Store
17:16 Simple AI Tool Usage Guide
21:32 Custom ChatGPT Limitations Explained
25:17 Exploring AI's Efficiency in Tasks
27:06 "AI Impact Dashboard for 2025"
32:03 GPT-4 vs. GPT-3: Agentic Abilities
35:33 Reasoning Models Enhance Meeting Analysis
36:53 AI Meeting Summary Features
40:40 Personalized NVIDIA Stock Insights
42:38 GPT Custom Models: New Developments

Keywords:
Custom GPTs, OpenAI updates, Expanded model support, No code creation, Custom actions, GPT store, Enterprise rollout, Recommended model, O3 model, O3 Pro model, GPT-4.5, Data storytelling, AI humanizer, Multimodal capabilities, Sentiment analysis, Thematic clustering, Research analyst, Meeting actionizer, Personalized learning architect, Financial snapshot, Web search, Canvas mode, Python coding, Boolean search, Agentic reasoning, Chain of thought, Knowledge files, Fine-tuning, Domain expertise, Automated workflows, Generative AI, Creative marketing, Information synthesis, Meeting analysis, Decision automation, Webhooks, APIs, Knowledge tokenization.

Ready for ROI on GenAI? Go to youreverydayai.com/partner

Healthed Australia
Managing CPAP non-compliance: Practical strategies and multimodal solutions

Jun 16, 2025 · 50:07


Challenges of continuous positive airway pressure (CPAP) compliance
Importance of patient follow-up and addressing psychological issues
Importance of trialling different masks and techniques to find the best solution for each patient
Potential alternatives, including weight loss, physiotherapy, oral appliances, and surgery

Host: Dr David Lim | Total Time: 50 mins
Experts: Dr Rosemary Clancy, Clinical Psychologist

Register for our fortnightly FREE WEBCASTS: every second Tuesday | 7:00pm-9:00pm AEST

See omnystudio.com/listener for privacy information.

Twins Talk it Up Podcast
Episode 269: Creating Heroes

Jun 13, 2025 · 63:49


This is another special edition. We're on location at IT Nation Secure in Orlando, where cybersecurity leaders, channel experts, and solution innovators gathered to share ideas, build partnerships, and level up the industry together. Not all heroes wear capes; some carry laptops and create solutions to help their MSPs and partners win.

Dive into conversations with inspired leaders: Dexter Caffey, CEO of Smart Eye Technology; Jean Templin, Co-Founder and CEO of Nayak.ai; and Jared Casner and Mike Zbarsky, Co-Founders of Blacksmith Infosec.

Highlights include:
Participating in the PitchIT Accelerator Program.
Multimodal biometric authentication.
Inspiration behind names and branding.
Necessity of executive buy-in to impact and shape security culture.
Another podcast program to follow.

Resources:
An e-book from Nayak.ai, The Rise of AI in Sales Enablement: https://www.nayak.ai/company/download-the-e-book
A sample of the book Forging Trust: https://blacksmithinfosec.com/forging-trust/

Timestamps:
[01:34] Smart Eye Technology
[20:39] Nayak.ai
[41:13] Blacksmith Infosec

If you want to master the art of audience engagement while learning how to conquer speaking anxiety, deliver persuasive presentations, and close more deals, this is the program for you. Twins Talk It Up is hosted by identical twin brothers Danny Suk Brown and David Suk Brown, who share leadership communication strategies designed to help professionals embrace the power of their authentic voice. Together, we'll explore tips and tools to unlock the full potential of your voice, dominate every stage you step onto, and elevate your influence and value. Along the way, we'll crush goals and share plenty of laughs.

Book a free 15-minute discovery call: dsbleadershipgroup.com/schedule-a-call/
Website: appmeetup.com/twinstalkitup/
Community: facebook.com/groups/publicspeakingpoints
Patreon: patreon.com/twinstalkitup

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 534: Claude 4 - Your Guide to Opus 4, Sonnet 4 & New Features

May 28, 2025 · 45:23


Claude 4: Game-changer or just more AI noise? Anthropic's new Opus 4 and Sonnet 4 models are officially out and crushing coding benchmarks like breakfast cereal. They're touting big coding gains, fresh tools, and smarter agentic capabilities. Need to know what's actually up with Claude 4, minus the marketing fluff? Join us as we dive in.

Topics Covered in This Episode:
Claude 4 Opus and Sonnet Launch
Anthropic Developer Conference Highlights
Anthropic's AI Model Naming Changes
Claude 4's Hybrid Reasoning Explained
Benchmark Scores for Claude 4 Models
Tool Integration and Long Tasks in Claude
Coding Excellence in Opus and Sonnet 4
Ethical Risks in Claude 4 Testing

Timestamps:
00:00 "Anthropic's New AI Models Revealed"
03:46 Claude Model Naming Update
07:43 Claude 4: Extended Task Capabilities
10:55 "Partner with AI Experts"
15:43 Software Benchmark: Opus & Sonnet Lead
16:45 Anthropic Leads in Coding AI
21:27 Versatile Use of Claude Models
23:13 Claude Four's New Features & Limitations
28:23 AI Pricing and Performance Disappointment
32:21 Opus Four: AI Risk Concerns
35:14 AI Model's Extreme Response Tactics
36:40 AI Model Misbehavior Concerns
42:51 Pre-Release Testing for Safety

Keywords:
Claude 4, Anthropic, AI model update, Opus 4, Sonnet 4, Large Language Model, Hybrid reasoning, Software engineering, Coding precision, Tool integration, Web search, Long running tasks, Coherence, Claude Code, API pricing, SWE-bench, Thinking mode, Memory files, Context window, Agentic systems, Deceptive blackmail behavior, Ethical risks, Testing scenarios, MCP connector, Coding excellence, Developer conference, Rate limits, Opus pricing, Sonnet pricing, Claude Haiku, Tool execution, API side, Artificial analysis intelligence index, Multimodal, Extended thinking, Formative feedback, Text generation, Reasoning process, Lecture summary.

Ready for ROI on GenAI? Go to youreverydayai.com/partner

Empowered Patient Podcast
Multimodal AI Platform Enables Precision-Guided Prostate Cancer Treatment with Shyam Natarajan Avenda Health

May 22, 2025 · 20:58


Shyam Natarajan, Founder and CEO of Avenda Health, is utilizing the Unfold AI platform that combines imaging, pathology, and clinical data to provide a comprehensive 3D visualization of prostate cancer. This platform has demonstrated significantly higher accuracy than conventional imaging techniques, enabling physicians to make more informed diagnoses and treatment decisions. The technology has been integrated into the clinical workflow to provide real-time insights and precision-guided interventions, minimizing treatment-related side effects and preserving patient quality of life. Shyam explains, "Unfold AI is unique in that it's multimodal. We take in imaging biomarkers, pathology, and clinical information as input. And conventional imaging really doesn't show you exactly everywhere the cancer is. MRI today misses two-thirds of the disease by volume, and so imaging is really good at screening and that initial diagnosis. But when it comes time to decide how to treat patients, the standard of care is challenging today because, really, up to a third of patients end up having cancer left behind after treatment. So what we're trying to solve is this pain point where cancer is missed, and as a consequence, cancer is left behind."  "This product is really for patients who have a diagnosis of what we call clinically significant or cancer that you have to do something about. So, it's not the very low-risk, where being on what's called surveillance, watch and wait, is probably more appropriate. But this product, you touched upon the value proposition where a lot of patients are coming to their doctor saying, Hey Doc, I don't want to get surgery because I'm scared of the quality of life outcomes or the side effect profile. They want to get a targeted therapy. Well, physicians really can't offer targeted therapy in a broad sense unless they know where the cancer is. And so, AI is empowering and enabling physicians to perform precision-guided therapy or focal therapy." 
#AvendaHealth #UnfoldAi #ProstateCancer #ProstateCancerTreatment #HealthcareAI #CancerAI #RadiologyAI #DiagnosticAI #MedicalAI #AIinHealthcare avendahealth.com


Everyday AI Podcast – An AI and ChatGPT Podcast
EP 529: Microsoft Build Updates: 5 new Copilot AI updates and how to use them

May 20, 2025 · 41:49


Microsoft legit just dropped a book of AI updates at the Build Conference.

We're going to go over the 5 most impactful AI-powered Microsoft Copilot updates and how they will change the future of work.

Topics Covered in This Episode:
GitHub Copilot's Autonomous Coding Partner Update
Copilot Tuning for Enterprise Customization
Introducing Agent Foundry on Azure
Multi-Agent Orchestration in Copilot Studio
Computer Use Automation in Copilot
MCP Native Support in Microsoft Systems

Timestamps:
00:00 "Everyday AI: Transform Your Business"
06:42 AI Coding Assistant Evolution
09:29 Copilot Tuning for Business Leaders
10:56 Data Privacy Concerns in Cloud Use
16:52 "AI Collaboration Among Tech Giants"
20:48 "Multi-Agent Orchestration Cautions"
22:59 "Multi-Agent Orchestration in Copilot Studio"
25:27 OpenAI Copilot Access and Availability
29:38 Copilot Pro: Versatile AI Agent
35:13 Microsoft Embraces Open AI Collaboration
36:57 "Security Concerns Slow AI Rollout"
39:44 Subscribe & Review Request

Keywords:
Microsoft Build 2025, AI updates, Copilot AI updates, GitHub Copilot, GitHub Copilot coding agent, Autonomous coding partner, Visual Studio Code, Multimodal understanding, Natural language prompts, MCP protocol, Model context protocol, Anthropic, Microsoft 365 Copilot, Business leaders, Copilot tuning, Organization's internal data, Low code model tuning, Task specific agents, Secure service boundary, Azure, Agent foundry, AI agent playground, Enterprise grade AI agents, Grok, Elon Musk, Microsoft Azure, Agent to agent protocol, A2A, Multi agent orchestration, Copilot Studio, Agents collaboration, Agentic memory, Automated validation tools, Computer use in Copilot, Desktop applications, Repetitive tasks, MCP native support, Windows 11, Future of work, Third party applications, Agentic web, Security and access controls.

Ready for ROI on GenAI? Go to youreverydayai.com/partner

EUVC
VC | E467 | Dave Haynes on How AR Glasses, Multimodal AI, and Teleoperated Robots Will Reshape How We Work

May 13, 2025 · 48:27


In this episode, Andreas Munk Holm talks with Dave Haynes at FOV Ventures to unpack how multimodal AI, audio intelligence, and teleoperation are driving a new wave of human-computer interaction. From audio-enhancing models like AI Acoustics to video synthesis via 3D engines, and from VR-powered robot control to wearable AI glasses, you'll get a front-row seat to the technologies that are redefining the frontier of ambient computing.

Dave shares how AR glasses are already unlocking new workflows, why multimodal models are changing the way we use AI daily, and what this means for investors eyeing the next trillion-dollar opportunity. If you've ever wondered what comes after mobile and SaaS, this is it.

Here's what's covered:
02:30 From Text to Multimodal: How AI Has Leveled Up
06:00 Voice is the Interface: Audio Models Like ElevenLabs and AI Acoustics
08:45 The Rise of Ambient AI: Talking to Your Computer Like a Human
12:10 The Future Is Synthetic: Training Video Models in 3D Engines
15:55 Robotics in the Wild: Why Teleoperation Still Matters
19:40 The Role of VR in Human-Robot Collaboration
23:20 AR Glasses and Continuous Context Capture
26:50 Using AI to Wrap Your Day: Follow-Up Summaries from Lifelogging
30:10 Beyond SaaS: Why Frontier Tech Is Where Alpha Lives
33:00 Convincing LPs to Bet on the Future
36:25 Venture in the Age of Disruption: The 10-Year Thesis

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 523: OpenAI could go public, Gemini 2.5 continues dominance and more AI News That Matters

May 12, 2025 · 46:32


OpenAI is making moves to go public. Apple and Anthropic are teaming up for vibe coding. And Google is continuing its dominance with a quiet update to the world's most powerful AI model.

Once again, the big names are shaking up the AI space. Don't burn hours a day trying to keep up. Spend your Mondays with Everyday AI and our weekly 'AI News that Matters' segment. You'll be the smartest person in AI at your company.

Topics Covered in This Episode:
Anthropic and Apple AI Partnership
Apple AI Coding with Anthropic's Claude
OpenAI's Windsurf Acquisition
AI Search Engines in Apple's Safari
OpenAI and FDA Drug Approval Talks
Google Gemini 2.5 Pro I/O Edition
Amazon AI Coding Tool Kiro
OpenAI's Nonprofit Control Decision

Timestamps:
00:00 "Everyday AI: Podcast and Newsletter"
03:44 Apple Eyes External AI Partnerships
07:12 OpenAI's Windsurf Acquisition Disrupts Coding
10:28 Windsurf Model Selection and Future
14:24 Apple's AI Search Engine Shift
20:45 FDA-OpenAI AI Drug Approval Talks
22:50 AI Literacy Challenges
27:14 "Gemini 2.5 Pro Unveiled"
31:27 Advanced AI Coding Tools Emerging
34:50 OpenAI Governance and Structure Shift
36:50 OpenAI-Microsoft Partnership Revamp Talks
42:40 Tech Giants Shake Up AI Landscape

Keywords:
Anthropic, Apple, Vibe coding, Google Gemini, 2.5 Pro I/O Edition, OpenAI, Microsoft partnership, IPO, Artificial General Intelligence, AI coding models, Claude Sonnet, Swift Assist, Anthropic's Claude, Windsurf, $3 billion acquisition, AI IDE, Race car driver analogy, AI search engines, Safari, Perplexity AI, ChatGPT, Search engine market, FDA, Drug approval process, AI-assisted scientific review, Google I/O edition, Web dev arena leaderboard, Amazon Web Services, AI-powered code generation, Kiro, Multimodal capabilities, OpenAI nonprofit arm, Public Benefit Corporation, Equity stake, Microsoft partnership renegotiation, $13 billion investment, SoftBank, Oracle, Stargate project.

Ready for ROI on GenAI? Go to youreverydayai.com/partner

Luma Labs' Diffusion Revolution: from Dream Machine to Multimodal Worldsim - Amit Jain, Jiaming Song

May 11, 2025 · 79:32


In this episode of the Cognitive Revolution podcast, host Nathan Labenz welcomes Amit Jain, CEO, and Jiaming Song, Chief Scientist at Luma Labs, alongside co-host Stephen Parker. The conversation delves into the latest advancements and products from Luma Labs, makers of the Dream Machine, including cutting-edge models and features like camera motion and creative video generation tools. They explore technical aspects like pre-training for diffusion models and the development of concepts to improve AI capabilities. The discussion also covers the philosophical and practical implications of AI interpretability and multimodality, along with a deep dive into the intellectual history and recent innovations in diffusion models.

Upcoming Major AI Events Featuring Nathan Labenz as a Keynote Speaker:
https://www.imagineai.live/
https://adapta.org/adapta-summit
https://itrevolution.com/product/enterprise-tech-leadership-summit-las-vegas/

SPONSORS:

ElevenLabs: ElevenLabs gives your app a natural voice. Pick from 5,000+ voices in 31 languages, or clone your own, and launch lifelike agents for support, scheduling, learning, and games. Full server and client SDKs, dynamic tools, and monitoring keep you in control. Start free at https://elevenlabs.io/cognitive-revolution

Oracle Cloud Infrastructure (OCI): Oracle Cloud Infrastructure offers next-generation cloud solutions that cut costs and boost performance. With OCI, you can run AI projects and applications faster and more securely for less. New U.S. customers can save 50% on compute, 70% on storage, and 80% on networking by switching to OCI before May 31, 2024. See if you qualify at https://oracle.com/cognitive

Shopify: Shopify powers millions of businesses worldwide, handling 10% of U.S. e-commerce. With hundreds of templates, AI tools for product descriptions, and seamless marketing campaign creation, it's like having a design studio and marketing team in one. Start your $1/month trial today at https://shopify.com/cognitive

NetSuite: Over 41,000 businesses trust NetSuite by Oracle, the #1 cloud ERP, to future-proof their operations. With a unified platform for accounting, financial management, inventory, and HR, NetSuite provides real-time insights and forecasting to help you make quick, informed decisions. Whether you're earning millions or hundreds of millions, NetSuite empowers you to tackle challenges and seize opportunities. Download the free CFO's guide to AI and machine learning at https://netsuite.com/cognitive

PRODUCED BY: https://aipodcast.ing

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 513: OpenAI's new open model, Copilot updates, Perplexity going after Siri & more AI News That Matters

Apr 28, 2025 · 54:52


Is Perplexity going after... Siri? Talk about a hard pivot. OpenAI and Google are racing for users... who's winning? And will the U.S.'s effort on AI in education be too little, too late? We'll answer those questions and a ton more on our weekly news roundup show. Don't spend hours a day trying to keep up. Just join us (most) Mondays.

Topics Covered in This Episode:
OpenAI Launches Lightweight Deep Research Tool
DOJ Pushes Google Chrome Breakup
Microsoft 365 Copilot Spring 2025 Update
Adobe Firefly Supports Google, OpenAI Models
MyPillow CEO's AI-Generated Legal Trouble
OpenAI's Image Gen API for Developers
U.S. President Signs AI Education Order
Perplexity AI Challenges Siri with Voice Assistant

Timestamps:
00:00 AI in Education, Adobe Partnerships
05:52 "Premier Deep Research Tools: OpenAI & Google"
07:35 DOJ Proposes Google Chrome Sale
13:42 Adobe Expands AI Image Tools
17:08 AI Missteps: Lindell's Legal Trouble
18:43 Lindell's Legal AI Misstep
22:25 OpenAI ImageGen API Overview
28:27 AI Literacy Initiative Praised
31:29 Microsoft Launches Controversial Recall Feature
32:32 Improved Microsoft Search Enhancement Secured
38:30 "Perplexity: Contextual AI Assistant Edge"
41:22 "Perplexity's Crucial Pivot Needed"
43:18 OpenAI's New Reasoning Language Model
46:43 AI Usage Surges Amid OpenAI Speculation
50:54 Tech Updates: AI Expansions & Legal Issues

Keywords:
OpenAI, Lightweight deep research tool, ChatGPT, Free users, Paid users, Deep research queries, o4-mini model, Google, Gemini 2.5, Perplexity, Deep research, Adobe, Firefly AI, Competitors, Image generation, Microsoft, Copilot, AI features, MyPillow, Legal citations, AI-generated court filings, ImageGen, API access, Creative work, US president Trump, Executive order, AI education, Microsoft 365, Recall feature, Copilot+ PCs, Perplexity AI, Voice assistant, Siri competitor, Open reasoning language model, Llama, Open weights, Meta, Text in/text out, DeepSeek, AI safety, Sam Altman, Edge AI, Multimodal models.

Ready for ROI on GenAI? Go to youreverydayai.com/partner

The Agile World with Greg Kihlstrom
#652: The power of multimodal AI with Dani Yogatama, Reka

Mar 19, 2025 · 22:21


Is your organization just jumping on the AI bandwagon, or do you have a solution that will support your company's needs in the short and long term? Welcome to this episode, brought to you by Reka, a developer of industry-leading multimodal AI models that enable individuals and organizations to deploy generative AI applications. Today we're going to talk about the power of multimodal AI in the enterprise and why it is important for businesses to incorporate AI that can use multiple input sources, multiple languages, and flexible contexts to provide more intelligent insights. To help me discuss this topic, I'd like to welcome Dani Yogatama, CEO of Reka.

RESOURCES
Reka: https://www.reka.ai
Don't miss MAICON 2025, October 14-16 in Cleveland, the event bringing together the brightest minds and leading voices in AI. Use code AGILE150 for $150 off registration. Register here: https://bit.ly/agile150
Connect with Greg on LinkedIn: https://www.linkedin.com/in/gregkihlstrom
Don't miss a thing: get the latest episodes, sign up for our newsletter and more: https://www.theagilebrand.show
Check out The Agile Brand Guide website with articles, insights, and Martechipedia, the wiki for marketing technology: https://www.agilebrandguide.com
The Agile Brand podcast is brought to you by TEKsystems. Learn more here: https://www.teksystems.com/versionnextnow
The Agile Brand is produced by Missing Link, a Latina-owned, strategy-driven, creatively fueled production co-op. From ideation to creation, they craft human connections through intelligent, engaging and informative content. https://www.missinglink.company

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 483: Inside Apple's AI failures, Google and OpenAI asking feds for help on AI, NVIDIA GTC & more AI News That Matters

Mar 17, 2025 · 51:34


This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Imagine while Reasoning in Space: Multimodal Visualization-of-Thought with Chengzu Li - #722

Mar 10, 2025 · 42:11


Today, we're joined by Chengzu Li, PhD student at the University of Cambridge to discuss his recent paper, “Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.” We explore the motivations behind MVoT, its connection to prior work like TopViewRS, and its relation to cognitive science principles such as dual coding theory. We dig into the MVoT framework along with its various task environments—maze, mini-behavior, and frozen lake. We explore token discrepancy loss, a technique designed to align language and visual embeddings, ensuring accurate and meaningful visual representations. Additionally, we cover the data collection and training process, reasoning over relative spatial relations between different entities, and dynamic spatial reasoning. Lastly, Chengzu shares insights from experiments with MVoT, focusing on the lessons learned and the potential for applying these models in real-world scenarios like robotics and architectural design. The complete show notes for this episode can be found at https://twimlai.com/go/722.