Podcasts about CUDA

  • 462 podcasts
  • 941 episodes
  • 48m average duration
  • 5 new episodes weekly
  • Latest: Mar 10, 2026



Best podcasts about CUDA

Latest podcast episodes about CUDA

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
NVIDIA's AI Engineers: Agent Inference at Planetary Scale and "Speed of Light" — Nader Khalil (Brev), Kyle Kranen (Dynamo)

Mar 10, 2026 · 83:37


Join Kyle, Nader, Vibhu, and swyx live at NVIDIA GTC next week! Now that AIE Europe tickets are ~sold out, our attention turns to Miami and World's Fair!

The definitive AI accelerator chip company has more than 10x'ed this AI summer, and is now a $4.4 trillion megacorp that is somehow still moving like a startup. We are blessed to have a unique relationship with our first-ever NVIDIA guests: Kyle Kranen, who gave a great inference keynote at the first World's Fair and is one of the leading architects of NVIDIA Dynamo (a datacenter-scale inference framework supporting SGLang, TRT-LLM, and vLLM), and Nader Khalil, a friend of swyx from our days in Celo in The Arena, who has been drawing developers at GTC since before they were even a glimmer in the eye of NVIDIA.

Nader discusses how NVIDIA Brev has drastically reduced the barriers to entry for developers to get a top-of-the-line GPU up and running, and Kyle explains NVIDIA Dynamo as a datacenter-scale inference engine that optimizes serving by scaling out, leveraging techniques like prefill/decode disaggregation, scheduling, and Kubernetes-based orchestration, framed around cost, latency, and quality tradeoffs.
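The prefill/decode disaggregation mentioned above can be sketched very roughly: one worker pool handles the compute-bound prefill phase, another handles the memory-bandwidth-bound decode phase. This is a minimal toy model with invented names, not Dynamo's actual scheduler.

```python
# Toy sketch of prefill/decode disaggregation. All class and worker names
# are hypothetical; Dynamo's real routing and KV-cache transfer are far richer.

from dataclasses import dataclass, field

@dataclass
class Worker:
    name: str
    queue: list = field(default_factory=list)

class DisaggRouter:
    def __init__(self, prefill_workers, decode_workers):
        self.prefill = [Worker(n) for n in prefill_workers]
        self.decode = [Worker(n) for n in decode_workers]

    def _least_loaded(self, pool):
        # Pick the worker with the shortest queue (ties go to the first).
        return min(pool, key=lambda w: len(w.queue))

    def submit(self, request_id, prompt_tokens):
        # Prefill runs once over the whole prompt (compute-bound)...
        pw = self._least_loaded(self.prefill)
        pw.queue.append((request_id, prompt_tokens))
        # ...then the KV cache is handed to a decode worker, which
        # generates tokens one at a time (memory-bandwidth-bound).
        dw = self._least_loaded(self.decode)
        dw.queue.append(request_id)
        return pw.name, dw.name

router = DisaggRouter(["prefill-0", "prefill-1"], ["decode-0"])
print(router.submit("req-1", 4096))  # ('prefill-0', 'decode-0')
```

The point of the split, as discussed in the episode, is that the two phases have different bottlenecks, so scaling them independently improves the cost/latency/quality tradeoff.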
We also dive into Jensen's "SOL" (Speed of Light) first-principles urgency concept, long-context limits and model/hardware co-design, internal model APIs (https://build.nvidia.com), and upcoming Dynamo and agent sessions at GTC.

Full video pod on YouTube

Timestamps
00:00 Agent Security Basics
00:39 Podcast Welcome and Guests
07:19 Acquisition and DevEx Shift
13:48 SOL Culture and Dynamo Setup
27:38 Why Scale Out Wins
29:02 Scale Up Limits Explained
30:24 From Laptop to Multi Node
33:07 Cost Quality Latency Tradeoffs
38:42 Disaggregation Prefill vs Decode
41:05 Kubernetes Scaling with Grove
43:20 Context Length and Co Design
57:34 Security Meets Agents
58:01 Agent Permissions Model
59:10 Build Nvidia Inference Gateway
01:01:52 Hackathons and Autonomy Dreams
01:10:26 Local GPUs and Scaling Inference
01:15:31 Long Running Agents and SF Reflections

Transcript

Agent Security Basics

Nader: Agents can do three things: they can access your files, they can access the internet, and now they can write custom code and execute it. You really only want to let an agent do two of those three things. If it can access your files and write custom code, you don't want it to have internet access, because that's a full vulnerability, right? If it has access to the internet and your file system, you should know the full scope of what that agent is capable of doing; otherwise it can get injected, or something like that can happen. So that's a lot of what we've been thinking about: how do we enable this, because it's clearly the future, but also, what are the enforcement points we can start to use to protect it?

swyx: All right.

Podcast Welcome and Guests

swyx: Welcome to the Latent Space podcast in the Chromo studio. Welcome to all the guests here. We are back with our guest host Vibhu. Welcome, good to have you back. And our friends Nader and Kyle from NVIDIA. Welcome.

Kyle: Yeah, thanks for having us.

swyx: Yeah, thank you.
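The "two of three" rule Nader opens the transcript with can be sketched as a simple policy check. This is an illustrative toy, not any real agent framework's API; the capability names are invented.

```python
# Toy sketch of Nader's agent-permission rule: an agent may hold at most
# two of {file access, internet access, code execution}. Holding all three
# creates an exfiltration / prompt-injection path.

DANGEROUS = {"files", "internet", "exec"}

def allowed(capabilities: set[str]) -> bool:
    """Reject any capability set that grants all three dangerous powers."""
    return not DANGEROUS.issubset(capabilities)

print(allowed({"files", "exec"}))              # True: no exfiltration path
print(allowed({"files", "internet", "exec"}))  # False: full vulnerability
```

An enforcement point like this would sit wherever capabilities are granted to the agent, which is the kind of choke point Nader says they are looking for.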
Actually, I don't even know your titles. I know you're, like, architect-something of Dynamo.

Kyle: Yeah, I'm one of the engineering leaders [00:01:00] and an architect of Dynamo.

swyx: And you're director of something... developers, developer tech.

Nader: Yeah.

swyx: You're the developers-developers-developers guy at NVIDIA.

Nader: Open-source and agent marketing, Brev, DevRel tools and stuff. That's been the focus.

swyx: And we're recording this ahead of NVIDIA GTC, which is coming to town again, or taking over town, which we'll all be at. We'll talk a little bit about your sessions and stuff.

Nader: We're super excited for it.

GTC Booth Stunt Stories

swyx: One of my favorite memories of Nader: you always do marketing stunts. While you were at Brev, you had this surfboard that you went down to GTC with, and NVIDIA apparently liked it so much that they bought you. What was that like?

Nader: Yeah. Our logo was a shaka. We were always just trying to keep true to who we were. With some startups, you're trying to pretend that you're a bigger, more mature company than you are. And it was actually Evan Conrad from SF Compute...

swyx: Previous guest, yeah.

Nader: Amazing. Oh, really? Amazing. He was just like, guys, you're two dudes in a room, why are you [00:02:00] pretending that you're not? So then we were like, okay, let's make the logo a shaka. We brought surfboards to our booth at GTC, and the energy was great. Some palm trees too.

Kyle: They actually poked out over the walls, so you could see the Brev booth...

Nader: And no one else's.

Kyle: ...just from very far away.

Nader: Oh, so you remember it back then?

Kyle: Yeah, I remember it pre-acquisition.
I was like, oh, those guys look cool.

Nader: That makes sense, dude. Because we signed up really last minute, so we had the last booth, all the way in the corner, and I was worried that no one was gonna come. That's why we had the palm trees, and we really came in with the surfboards. We even had one of our investors bring her dog, and she was just walking the dog around to try to bring energy towards our booth.

swyx: Steph.

Kyle: Yeah, she's the best.

swyx: You know, as a conference organizer, I love that. Everyone who sponsors a conference comes and does their booth like "we are changing the future of AI" or some generic b******t. No: actually try to stand out, make it fun, right? And people still remember it after three years.

Nader: Yeah. You know what's so funny? I'll give you this clip if you want to add it [00:03:00] in, but my wife, at the time my fiancée, was in medical school and she came to help us, because it was a big moment for us. We bought this Cricut, it's like a vinyl printer, because how else are we gonna label the surfboard? So we got a surfboard, luckily I was able to purchase that on the company card, we got a Cricut, and it said something like "fine-tuning for enterprises" that we put on the surfboard. And it's 1:00 AM the day before we go to GTC, she's helping me put these vinyl stickers on, and she goes, if you pull this off, you son of a b***h. So right after the acquisition, I stitched that clip together with music and sent it to our family group chat.

swyx: Yeah, no, well, she made a good choice there. Was that basically the origin story for Launchable? And maybe we should explain what Brev is.

Nader: Yeah.
Brev is just a developer tool that makes it really easy to get a GPU. We connect a bunch of different GPU sources. The basics of it is: how quickly can we SSH you into a GPU? Whenever we would talk to users, they wanted a GPU, they wanted an A100. And if you go to any cloud [00:04:00] provisioning page, usually it's three pages of forms, or somewhere in the forms there's a dropdown, and in the dropdown there's some weird code that you have to know to translate to an A100. And I remember just thinking: every time someone says they want an A100, the piece of text that they're telling me they want is stuffed away in a corner. So we were like, what if the biggest piece of text was what the user's asking for? So when you go to Brev, it's just big GPU chips with the type that you want.

swyx: With beautiful animations that you worked on. Now you can just prompt it, but back in the day, those were handcrafted, artisanal code.

Nader: Yeah, I was actually really proud of that, because I made it in Figma, and then I was really struggling to figure out how to turn it from Figma to React. So what it actually is, is just an SVG. I have all the styles, and when you change the chip, whether it's active or not, it changes the SVG code, and that somehow renders like it's animating, but we just had the transition slow. It's just a JavaScript function to change the underlying SVG. That was how I ended up figuring out how to move it over from Figma. But yeah, that's artisanal. [00:05:00]

Kyle: Speaking of marketing stunts, though, he actually used those SVGs to make these cards.

Nader: Oh yeah.

Kyle: A GPU gift card, yes, that he handed out everywhere.
That was actually my first impression of that one.

swyx: I think I still have one of them.

Nader: They look great. I have a ton of them still in our garage, actually, but they don't have labels. We should honestly bring them back. I found this old printing press here, actually, just around the corner on Van Ness. It's a third-generation San Francisco shop. So I come in, an excited startup founder, and they have this crazy old machinery, and I'm in awe, because the whole building is so physical. You're seeing these machines; they have pedals to move these saws and whatever. I don't know what this machinery is, but I saw all three generations: the grandpa, the father, and the son, and the son was around my age.

swyx: It's like a holy trinity.

Nader: It's funny, because I just took the same SVG and we printed it. It's foil printing, so they make a mold that's an inverse of the A100, and then they put the foil on it [00:06:00] and press it into the paper. And I remember once we got them, he was like, hey, don't forget about us. I guess early Apple's and Cisco's first business cards were all made there. He was like, yeah, we get the startup businesses, but then as they mature, they kind of go somewhere else. I think we were talking with marketing about using them for something; we should go back and make some cards.

swyx: Yeah. You know, I remember, as a very, very small Brev investor, I was like, why are we spending time doing these stunts for GPUs?
As a typical cloud hardware person, you go into AWS, you pick, like, a T5 XL or whatever from a list, and you look at the specs. Why animate this GPU? But I do think it just shows the level of care that goes throughout Brev...

Nader: And NVIDIA. That's the thing that struck me most when we first came in: the amount of passion that everyone has. You talk to Kyle, you talk to... every VP that I've met at NVIDIA goes so close to the metal. I remember, almost a year ago, my VP asked me, hey, [00:07:00] what's Cursor? Are you using it, and if so, why? And he downloaded Cursor and asked me to help him use it, or just show him why we were using it. So: the amount of care that everyone has, and the passion and appreciation for the moment. This is a very unique time, so it's really cool to see everyone really appreciate that.

swyx: Yeah.

Acquisition and DevEx Shift

swyx: One thing I wanted to do before we move over to research topics and the stuff Kyle's working on is just tell the story of the acquisition. Not many people have been through an acquisition with NVIDIA. What's it like? Anything you'd like to say?

Nader: It's a crazy experience. The thing that was most exciting for us: our goal was just to make it easier for developers. We wanted to make it easier to find access to GPUs. Oh, actually, your question about Launchable: Launchable was just one-click deploys for any software on top of the GPU.
And what we really liked about NVIDIA was that it felt like we just got a lot more resources to do all of that. [00:08:00] NVIDIA's goal is to make things as easy for developers as possible, so there was a really nice synergy there. When it comes to an acquisition, I think the degree to which the souls of the products align is going to speak to the success of the acquisition. So in many ways it feels like we're home. This is a really great outcome for us. I love brev.nvidia.com; you should use it.

Kyle: It's the front page for GPUs.

Nader: Yeah. If you want GPUs, you go there.

swyx: And internally it's growing very quickly. I don't remember, you said some stats there.

Nader: Yeah, I wish I had the exact numbers, but internally and externally it's been growing really quickly. We've been working with a bunch of partners, customers, and ISVs. If you have a solution that runs on the GPU and you want people to use it quickly, we can bundle it up in a Launchable and make it a one-click run. And if you're doing things and you want just a sandbox to run on... like OpenClaw: huge moment, super exciting. We'll talk about it more, but internally, people wanna run this, and we know we have to be really careful about the security implications. Do we let this run on the corporate network? Security's guidance was: hey, [00:09:00] run it on Brev. It's a VM, it's sitting in the cloud, it's off the corporate network, it's isolated.
And so that's been our stance, internally and externally, about how to even run something like OpenClaw while we figure out how to run these things securely.

swyx: I think you were also almost the right team at the right time, when NVIDIA is starting to invest a lot more in developer experience, or whatever you call it. UX, or, I don't know, software... obviously NVIDIA has always invested in software, but this is a different audience.

Nader: It's a wider...

Kyle: Developer base.

swyx: Yeah, right. So what is it called internally? What is it that people should be aware is going on there?

Nader: What, like developer experience?

swyx: Yeah. Is it just called developer experience, or is there a broader strategy here at NVIDIA?

Nader: NVIDIA always wants to make a good developer experience. The thing is, a lot of the technology is just really complicated. The reason AI is having a huge moment is not because, say, data scientists who were quiet in 2018 [00:10:00] are much louder now. The pie is growing, right? There's a whole bunch of new audiences. My mom's wondering what she can do with it; my sister taught herself how to code. I actually think AI is generally a big equalizer, and you're seeing a more technologically literate society. Everyone's learning how to code; there isn't really an excuse not to. And building a good UX means that you really understand who your end user is. When your end users become such a wide variety of people, you have to almost reinvent the practice, right?
Kyle: You have to, and actually build more developer UX, right? Because there are tiers of developer base that were added. The hackers building on top of OpenClaw, for example, have never used a GPU. They don't know what CUDA is. They just want to run something.

Nader: Yeah.

Kyle: You need new UX that is not just, hey, how do you program something in CUDA and run it? When deep learning was getting big, we built Torch, but recently the number of layers [00:11:00] added to that developer stack has just exploded, because AI has become ubiquitous. Everyone's using it in different ways.

Nader: It's moving fast in every direction. Vertical, horizontal.

Vibhu: Yeah, and you even take it down to hardware, like the DGX Spark. It's basically the same system as throwing it up on a big GPU cluster.

Nader: Yeah, it's amazing. Blackwell.

swyx: We saw the preview at last year's GTC, and that was one of our better-performing videos and NVIDIA coverage so far. This will beat it.

Nader: Fingers crossed.

DGX Spark and Remote Access

Nader: Even when Grace Blackwell, or when DGX Spark, was first coming out, getting to be involved in that from the beginning of the developer experience...

swyx: You were involved?

Nader: Yeah. It was actually really funny, because I'm still pretty fresh from the acquisition, and I'm getting an email from a bunch of the engineering VPs about the new hardware, the GPU system that we're putting out. And I'm like, okay, cool, Nader's now involved with this for the UX. And I'm like,
what am I gonna do [00:12:00] here? I remember the first meeting, I was just kind of quiet as I was hearing engineering VPs talk about what this box could be, what it could do, how we should use it. One of the first ideas, I think a quote was: the first thing someone's gonna wanna do with this is get two of them and run a Kubernetes cluster on top of them. And I was like, oh, I think I know why I'm here. I was like, the first thing we're doing is easy SSH into the machine. And then just scoping it down: once you can do that, everything else follows. The person who wants to run a Kubernetes cluster on two Sparks has a higher propensity for pain than someone who buys it and wants to run OpenClaw right now, right? If you make sure that's as effortless as possible, the rest becomes easy. So there's a tool called NVIDIA Sync; it just makes the SSH connection really simple. If you have a Mac or a PC, if you have a laptop and you buy this GPU and you want to use it, you should be able to use it like it's a GPU in the cloud, right? But there's all this friction around how you actually get into it. That's part of [00:13:00] Brev's value proposition: there's a CLI that wraps SSH and makes it simple. So our goal is just to get you into that machine really easily. One thing we just launched at CES, still in early access while we iron out some kinks, but it should be ready by GTC: you can register your Spark on Brev.

swyx: So, remote-managed local hardware. Single pane of glass. Because Brev can already manage other clouds anyway, right?

Vibhu: Yeah, and you use the Spark on Brev as well, right?

Nader: Yeah, exactly.
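The "CLI that wraps SSH" idea Nader describes can be sketched in a few lines. Everything here is hypothetical: the registry, the field names, and the hosts are invented for illustration, and bear no relation to Brev's or NVIDIA Sync's actual internals.

```python
# Hypothetical sketch of a CLI that wraps SSH: resolve a friendly instance
# name (as a provisioning API might return it) into a ready-to-run ssh
# command, so the user never touches hosts, ports, or usernames directly.

import shlex

REGISTRY = {
    # friendly name -> (user, host, port); invented example data
    "my-spark": ("ubuntu", "203.0.113.7", 22),
    "a100-box": ("ubuntu", "203.0.113.9", 2222),
}

def ssh_command(name: str) -> str:
    """Build the ssh invocation for a registered instance."""
    user, host, port = REGISTRY[name]
    return shlex.join(["ssh", "-p", str(port), f"{user}@{host}"])

print(ssh_command("my-spark"))  # ssh -p 22 ubuntu@203.0.113.7
```

The value proposition is exactly this indirection: the registry can be backed by any cloud, or by a Spark sitting in your home, and the user experience stays a single command.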
So you set it up at home, you run the command on it, and it'll essentially appear in your Brev account. Then you can take your laptop to a Starbucks or a cafe, and you can continue to use your Spark just like any other cloud node on Brev.

swyx: And it's just like a pre-provisioned data center in your home.

Nader: Yeah, exactly.

Vibhu: Tiny little data center.

Nader: Tiny little, the size of your phone.

SOL Culture and Dynamo Setup

swyx: One more thing before we move on to Kyle. You have so many Jensen stories, and I love mining Jensen stories. My favorite so far is SOL. What is SOL?

Nader: SOL is actually... I think [00:14:00] of all the lessons I've learned, that one's definitely my favorite.

Kyle: It'll always stick with you.

Nader: Yeah. In your startup, everything's existential, right? We've run out of money; we were at risk of missing payroll; we've had to contract our team because we ran outta money. Because of that, you're always forcing yourself to understand the root cause of everything. If you get a date, if you get a timeline, you know exactly why that date or timeline is there. You're pushing every boundary, and you're not just accepting a no, just because. As you start to introduce more layers, as you become a much larger organization, SOL is essentially: what is the physics? Light moves at a certain speed, so if light's moving slower than that, something's in the way. So before trying to layer reality back in, of why this can't be delivered by some date, let's just understand the physics. What is the theoretical limit to how fast this can go? And then start to tell me why.
Because otherwise people will start telling you why something can't be done. But actually, I think any great leader's goal is just to create urgency. [00:15:00]

Kyle: Create compelling events, right? SOL is a term NVIDIA uses to instigate a compelling event. You say, this is done; how do we get there? What is the minimum, as-much-as-necessary, as-little-as-possible thing it takes for us to get exactly here? It helps you break through a bunch of noise, instantly.

swyx: One thing I'm unclear about is: can only Jensen use the SOL card? Obviously Jensen can say "get the b******t out," but can someone else?

Kyle: Oh, no, no. Frontline engineers use it.

Nader: Yeah, everyone. I think it's not so much about "get the b******t out." It's: give me the root understanding, right? If you tell me something takes three weeks, well, what are the first principles? Why is it three weeks? What is the actual limit on why this is gonna take three weeks? Let's say you wanted to buy a new computer, and someone told you it's gonna be here in five days. What's the SOL? Well, the SOL is, I could walk into a Best Buy and pick it up for you, right? So anything beyond that... and is that practical? Is that how we're gonna, say, give everyone in the [00:16:00] company a laptop? Obviously not. So that's the SOL, and then, okay, if we have to get more than ten, suddenly there might be some delay. And now we can piece the reality back together.

swyx: So this is the Paul Graham "do things that don't scale," and it's also what people would now call high agency.
Kyle: Yeah. It's actually really interesting, because there's a second, hardware angle to SOL that doesn't come up for all the org. SOL is used culturally at NVIDIA for everything.

swyx: I'm also mining for... I think that can be annoying sometimes, when someone keeps pulling SOL on you, and you're like, guys, we have to be stable, we have to f*****g plan.

Kyle: It's an interesting balance.

Nader: Yeah, I encounter that actually with Alec, because we have a new conference, so we have goals of what we wanna launch by the conference.

swyx: Where is this, GTC?

Nader: Well, we did it for CES, we did it for GTC DC before that, and we're doing it for GTC San Jose. So every time we have a new moment, we want to launch something, and we want to do so at SOL. That does mean there's some level of prioritization that needs [00:17:00] to happen, so it is difficult, right? You have to be careful with what you're pushing. Stability is important, and that should be factored into SOL. SOL isn't just "build everything and let it break"; that's part of the conversation. As you're layering in all the details, one of them might be: hey, we could build this, but then it's not gonna be stable for X, Y, Z reasons. One of our conversations for CES was: hey, we can get registering your Spark with Brev into early access, but there are a lot of things we need to do to feel really comfortable from a security perspective; there's a lot of networking involved before we deliver that to users. So it's like, okay, let's get this to a point where we can at least let people experiment with it.
We had it in a booth, we had it in Jensen's keynote, and then let's go iron out all the networking kinks. That's not easy, so it can come later. That was the way we layered that back in.

Kyle: It's not really about saying you don't have to do the maintenance or operational work. It's more that it [00:18:00] highlights how progress is incremental, right? What is the minimum thing we can get to? There's an SOL for every component after that, but there's the SOL to get you to the starting line. That's usually how it's asked. On the other side, SOL came out of hardware at NVIDIA. SOL is literally: if we ran the accelerator, the GPU, at basically full speed, with no other constraints, how fast would we be able to make a program go?

swyx: Right. So in training, you then work back to some percentage of, say, MFU.

Kyle: Yeah, that's a great example. There's an SOL MFU, and then there's what's practically achievable.

swyx: Cool. Should we move on to Kyle's side? Kyle, you're coming more from the data science world. Whenever I meet someone who's worked on tabular stuff, graph neural networks, time series... when I go to NeurIPS or ICML, I walk the back halls, and there's always a small group of graph people, [00:19:00] an absolutely small group of tabular people, and no one else there. It's important, interesting work if you care about solving the problems they solve.

Kyle: Yeah.

swyx: But everyone else is just LLMs all the time.

Kyle: Yeah.
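The SOL/MFU framing Kyle and swyx sketch above is just a ratio: achieved throughput divided by the hardware's theoretical peak. A minimal worked example, with made-up numbers rather than real chip specs:

```python
# "Speed of light" applied to MFU (Model FLOPs Utilization):
# the SOL is the chip's theoretical peak with zero constraints,
# and MFU is the fraction of that peak a real job sustains.

def mfu(achieved_tflops: float, peak_tflops: float) -> float:
    """Fraction of theoretical peak throughput actually achieved."""
    return achieved_tflops / peak_tflops

# Hypothetical accelerator peak and a measured training run:
peak = 1000.0      # TFLOP/s the chip could do with no constraints (the SOL)
achieved = 420.0   # TFLOP/s the training job actually sustains

print(f"MFU = {mfu(achieved, peak):.1%}")  # MFU = 42.0%
```

Working back from SOL, as swyx describes, means starting from `peak` and then accounting for each constraint (memory bandwidth, communication, kernel overhead) that explains the gap down to `achieved`.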
I mean, it's like the black hole, right? Has the event horizon reached NeurIPS yet?

swyx: But those are transformers too, and those are also interesting things. Anyway, I just wanted to spend a little bit of time on that background before we go into Dynamo proper.

Kyle: Yeah, sure. I took a different path to NVIDIA: I joined six years ago, seven if you count when I was an intern. I joined NVIDIA right outta college, and the first thing I jumped into was not what I'd done during the internship, which was stuff for autonomous vehicles, heavyweight object detection. I jumped into recommenders, because that was popular.

swyx: Yeah, he did RecSys as well.

Kyle: Yeah, RecSys. That was the tabular data at the time, right? You have tables of audience qualities and item qualities, and you're trying to figure out which member of [00:20:00] the audience matches which item, or, more practically, which item matches which member of the audience. At the time, we were trying to turn recommenders, which had historically been a bit of a CPU-based workflow, into something that ran really well on GPUs. And it's since been done: there are a bunch of recommender libraries that run on GPUs. The common models, like the Deep Learning Recommendation Model (DLRM), which came out of Meta, and the Wide & Deep model released by Google, were very accelerated by GPUs, especially using the fast HBM on the chips to do vector lookups. It was very interesting at the time, and super relevant, because we were starting to get this explosion of feeds and things that required recommenders to just actively be on all the time.
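The "vector lookups" Kyle mentions are embedding-table gathers: categorical IDs index into learned vectors, which is why HBM bandwidth matters so much for models like DLRM. A pure-Python stand-in (real systems use GPU tensors, and the vectors here are random placeholders, not learned embeddings):

```python
# Toy sketch of the embedding lookups at the heart of recommender models:
# each categorical ID (user, item) maps to a row of an embedding table,
# and relevance is scored by a dot product between the gathered vectors.

import random

random.seed(0)

EMBED_DIM = 4
NUM_ITEMS = 1000

# Embedding table: one vector per item ID (random stand-ins for learned weights).
table = [[random.random() for _ in range(EMBED_DIM)] for _ in range(NUM_ITEMS)]

def lookup(ids):
    """Gather embedding vectors for a batch of categorical IDs."""
    return [table[i] for i in ids]

def score(user_vec, item_vec):
    """Dot-product relevance score between two embeddings."""
    return sum(u * v for u, v in zip(user_vec, item_vec))

user = lookup([42])[0]
candidates = lookup([7, 99, 512])
ranked = sorted(range(len(candidates)), key=lambda i: -score(user, candidates[i]))
print(ranked)  # candidate indices ordered by relevance
```

On a GPU, the `lookup` gather is the memory-bandwidth-bound step Kyle is pointing at: HBM makes fetching thousands of these rows per request fast.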
And I sort of transitioned a little bit towards graph neural networks when I discovered them, because I realized you can use graph neural networks to represent relationships between people, items, and concepts, and that interested me. So I jumped into that at [00:21:00] NVIDIA and got really involved for about two years.

swyx: Something I learned from Bryan Catanzaro is that you can just kind of choose your own path at NVIDIA.

Kyle: Oh my God, yeah.

swyx: Which is not a normal big-corp thing, where you have a lane and you stay in your lane.

Nader: I think that's probably the reason why I enjoy being in a big company, coming from a startup guy: the mission is the boss.

swyx: The mission is the boss.

Nader: Yeah. It feels like a big game of pickup basketball. If you wanna play basketball, you just go up to the court and you're like, hey look, we're gonna play this game and we need three. And you just find your three. Honestly, for every new initiative, that's what it feels like.

Vibhu: It also shows, right? NVIDIA is just releasing state-of-the-art stuff in every domain. You expect foundation models with Nemotron, and then Parakeet, a voice model, just randomly comes out, then another one.

Kyle: The NVIDIA voice team has always been producing.

Vibhu: Yeah, in every other domain there's a paper that comes out, a dataset that comes out. I mean, it also stems back to what NVIDIA has to do, right? You have to make chips years before they're actually produced, so you need to really [00:22:00] focus.

Kyle: The design process starts...

Vibhu: Exactly.

Kyle: ...three to five years before the chip gets to market.

Vibhu: Yeah, I'm curious more about what that's like. You have specialist teams.
Is it just like, you know, people find an interest, you go in, you go deep on whatever, and that kind of feeds back into, you know, okay, we expect predictions. Like, the internals at Nvidia must be crazy, right? You know? Yeah. Yeah. You must, even without selling to people, have your own predictions of where things are going. Yeah. And they're very based, very grounded, right?
Kyle: Yeah, it's really interesting. So there are two things that I think Nvidia does which are quite interesting. One is, we really index into passion. There's a big sort of organizational top-down push to ensure that people are working on the things that they're passionate about. So if someone proposes something that's interesting, many times they can just email someone way up the chain who would find it relevant and say, hey, can I go work on this?
Nader: It's actually, like, I worked at a big company for a couple years before, uh, starting on my startup journey, and it felt very weird if you were to email out of chain, if that makes [00:23:00] sense. Yeah. The emails at Nvidia are like mosh pits,
swyx: Shoot,
Nader: and it's just, like, 60 people, just, whatever. And, like, there's this,
swyx: They get messy, like, reply-all, you,
Nader: Oh, it's insane. It's insane. They just
Kyle: help, you know, maximize
Nader: the context. But that's actually, like, so this is a weird thing, where I used to be like, why would we send emails? We have Slack. I'm the exact opposite now. I feel so bad for anyone who's messaging me on Slack, ‘cause I'm so unresponsive.
swyx: Your email,
Nader: I'm email-maxxing now. Email is different. Email is perfect, email is great, right? Because important threads get bumped back up, right? Yeah, yeah. Um, and so Slack doesn't do that.
So I just have, like, this casino going off on the right or on the left, and, like, I don't know which thread was from where or what, but the threads get lost. And then also, just, like, the subject, so you can have working threads. I think what's difficult is, like, when you're small, if you're just not 40,000 people, I think Slack will work fine, but there's, I don't know what the inflection point is. There is gonna be a point where that becomes really messy and you'll actually prefer having email, ‘cause you can have working threads. You can cc more than nine people in a thread.
Kyle: You can fork stuff.
Nader: You can [00:24:00] fork stuff, which is super nice, and just, like, yeah. And so, but that is part of where you can propose a plan. You can also just start. Honestly, momentum's the only authority, right? So, like, if you can just start, start to make a little bit of progress and show someone something, then they can try it. That's, I think, the most effective way to push anything forward. And that's both at Nvidia and, I think, just generally.
Kyle: Yeah, there's the other concept that is explored a lot at Nvidia, which is this idea of a zero-billion-dollar business. Like, market creation is a big thing at Nvidia. Like,
swyx: Oh, you want to go and start a zero-billion-dollar business?
Kyle: Jensen says, we are completely happy investing in zero-billion-dollar markets. We don't care if this creates revenue. It's important for us to know about this market. We think it will be important in the future. It can be zero billion dollars for a while. I'm probably mangling his words here, but, you know, like, I'll give an example. Nvidia's been working on autonomous driving for a long time,
swyx: Like an Nvidia car.
Kyle: No, they, they've
Vibhu: used the Mercedes, right? They're around the HQ, and I think it finally just got licensed out.
Now they're starting to be used quite a [00:25:00] bit. For 10 years you've been seeing Mercedes with Nvidia logos driving.
Kyle: If you're in, like, south Santa Clara, yeah. So, um, zero-billion-dollar markets are a thing, like, you know, Jensen,
swyx: I mean, okay, look, cars are not a zero-billion-dollar market. But yeah, that's a bad example.
Nader: I think, I think he's messaging, uh, zero today. Or even, like, internally, right? It's like, uh, an org doesn't have to ruthlessly find revenue very quickly to justify its existence, right? Like, a lot of the important research, a lot of the important technology being developed, that's kind of
Kyle: where research, research is very ideologically free at Nvidia. Yeah. Like, they can pursue things that they,
swyx: Were you research, officially?
Kyle: I was never in research, officially. I was always in engineering. Yeah. I'm in an org called Deep Learning Algorithms, which is basically just: how do we make things that are relevant to deep learning go fast?
swyx: That sounds freaking cool.
Vibhu: And I think a lot of that is underappreciated, right? Like, time series. This week Google put out the TimesFM [00:26:00] paper. Yeah. A new time series paper. Uh, semantic IDs, people started applying Transformers, LLMs, to
Kyle: Yes.
Vibhu: rec systems.
Kyle: Yes.
Vibhu: And when you think of the scale of companies deploying these, right, Amazon recommendations, Google web search, it's like, it's huge scale, and
Kyle: Yeah.
Vibhu: You want fast?
Kyle: Yeah. Yeah. Actually, there's a fun moment that brought me, like, full circle. Uh, Amazon Ads recently gave a talk where they talked about using Dynamo for generative recommendation, which was, like, weirdly cathartic for me. I'm like, oh my God, I've, I've supplanted what I was working on. You're using LLMs now to do what I was doing five years ago.
swyx: Yeah. Amazing. And let's go right into Dynamo.
Uh, maybe introduce it, top down.
Kyle: Yeah, sure. I think at this point a lot of people are familiar with the term inference. Funnily enough, I went from inference being a really niche topic to it being something that's discussed on normal people's Twitter feeds.
Nader: It's on billboards
Kyle: here now. Yeah. Very, very strange, driving, seeing just an inference ad on the 101. Inference at scale is becoming a lot more important. We have these moments, like, you know, OpenClaw, where you have these [00:27:00] agents that take lots and lots of tokens but produce incredible results. There are many different aspects of test-time scaling, so that, you know, you can use more inference to generate a better result than if you were to use a short amount of inference. There's reasoning, there's querying, there's adding agency to the model, allowing it to call tools and use skills. Dynamo sort of came about at Nvidia because myself and a couple others were sort of talking about these concepts: you have inference engines like vLLM, SGLang, TensorRT-LLM, and they sort of think about things as one single copy, like one replica, right?

Why Scale Out Wins
Kyle: Like, one version of the model. But when you're actually serving things at scale, you can't just scale up that replica, because you end up with performance problems. There's a scaling limit to scaling up replicas. So you actually have to scale out, to use, maybe, some Kubernetes-type terminology. We kind of realized that there was a lot of potential optimization that we could do in scaling out and building systems for data-[00:28:00]center-scale inference.
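Scale-out, in its simplest form, is N replicas behind a router; one reason routing gets interesting is keeping each replica's KV cache warm for a given conversation. A toy sketch of sticky session routing in plain Python, not Dynamo's actual router:

```python
import hashlib

# Hypothetical pool of model replicas (scale-out: N copies behind a router).
replicas = ["replica-0", "replica-1", "replica-2"]

def route(session_id):
    # Sticky routing: hash the session so a conversation keeps hitting the
    # same replica, which keeps that replica's KV cache warm for the session.
    h = int(hashlib.sha256(session_id.encode()).hexdigest(), 16)
    return replicas[h % len(replicas)]

# The same session always lands on the same replica:
assert route("user-42") == route("user-42")
print(route("user-42"))
```

Real schedulers go well beyond hashing, routing on measured cache overlap and load, but the shape of the problem is the same.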
So Dynamo is this data-center-scale inference engine that sits on top of frameworks like vLLM, SGLang, and TensorRT-LLM and just makes things go faster, because you can leverage the economy of scale. The fact that you have KV cache, which we can define a little bit later, in all these machines, that is unique, and you wanna figure out ways to maximize your cache hits, or you wanna employ new techniques in inference like disaggregation, which Dynamo, well, not introduced to the world, there were academic talks beforehand, but we were, you know, one of the first frameworks to start supporting it in March. And we wanna combine all these techniques into a modular framework that allows you to accelerate your inference at scale.
Nader: By the way, Kyle and I became friends on my first day at Nvidia, and I always loved it, ‘cause he always teaches me
swyx: new things. Yeah. By the way, this is why I wanted to put the two of you together. I was like, yeah, this is, this is gonna be
Kyle: good. It's very, it's very different, you know. We've talked to each other a bunch, [00:29:00] actually. You asked, like, why, why can't we scale up?
Nader: Yeah.

Scale Up Limits Explained
Nader: Model, you said model replicas?
Kyle: Yeah. So scale up means assigning more,
swyx: Heavier?
Kyle: Yeah, heavier. Like, making things heavier. Yeah, adding more GPUs, adding more CPUs. Scale out is just, like, having a barrier saying: I'm gonna duplicate my representation of the model, or a representation of this microservice or something, and I'm gonna replicate it many times to handle load. And the reason that you can't scale up past some point is, you know, there are sort of hardware bounds and algorithmic bounds on that type of scaling. So I'll give you a good example that's very trivial. Let's say you're on an H100.
The maximum NVLink domain for H100, for most DGX H100s, is eight GPUs, right? So if you scaled up past that, you're gonna have to figure out ways to handle the fact that now, for the GPUs to communicate, you have to do it over InfiniBand, which is still very fast, but is not as fast as NVLink.
swyx: Is it like one order of magnitude, like hundreds, or,
Kyle: It's about an order of magnitude, yeah.
swyx: Okay, so not terrible.
Kyle: [00:30:00] Yeah. I, I need to remember the data sheet here. I think it's about 500 gigabytes a second unidirectional for NVLink, and about 50 gigabytes a second unidirectional for InfiniBand. It depends on the generation.
swyx: I just wanna set this up for people who are not familiar with these kinds of layers and the transfer speeds
Vibhu: and all that. Of course.

From Laptop to Multi Node
Vibhu: Also, maybe even just going a few steps back before that. Most people are very familiar with what you can run on your laptop, whatever these tools, vLLM, you can just run inference there.
Kyle: You can run it on that laptop.
Vibhu: You can run it on a laptop. Then you get to, okay, models got pretty big, right? GLM-5, they doubled the size, so, mm-hmm, what do you do when you have to go beyond, okay, I can get 128 gigs of memory, I can run it on a Spark? Then you have to go multi-GPU. Yeah. Okay, multi-GPU, there's some support there. Now, if I'm a company, and I'm not hiring the best researchers for this, right, but I need to go [00:31:00] multi-node, right? I have a lot of servers. Okay, now there's efficiency problems, right? You can have multiple eight-H100 nodes, but, you know, how do you do that efficiently?
Kyle: Yeah. How do you represent them? How do you choose how to represent the model? Yeah, exactly, right. That's a hard question.
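A first-order version of that sizing question, with rough numbers. The model size here is hypothetical, and real sizing also has to budget for KV cache, activations, and the eight-GPU NVLink domain mentioned above:

```python
import math

def gpus_needed(params_b, bytes_per_param=2, gpu_mem_gb=80, overhead=1.2):
    """Minimum GPUs to hold the weights: params x precision, plus a very
    hand-wavy 20% overhead for KV cache and activations."""
    weights_gb = params_b * bytes_per_param  # 1B params @ fp16 = 2 GB
    return math.ceil(weights_gb * overhead / gpu_mem_gb)

# Hypothetical 400B-parameter dense model on 80 GB GPUs:
n = gpus_needed(400)
print(n)  # 12 GPUs: already bigger than one 8-GPU NVLink domain,
          # so you are forced into multi-node (scale out), not just scale up.
```

Once the answer exceeds the NVLink domain, the parallelism layout (tensor vs. pipeline vs. expert parallel) starts to matter, which is exactly the "how do you represent the model" question.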
Everyone asks, how do you size, oh, I wanna run GLM-5, which just came out, a new model. There have been, like, four of them in the past week, by the way, a bunch of new models.
swyx: You know why, right? DeepSeek.
Kyle: No comment. Yeah, but GLM-5, right? We have this new model. It's a large size, and you have to figure out how to both scale up and scale out, right? Because you have to find the right representation that you care about. Everyone does this differently. Let's be very clear: everyone figures this out in their own path.
Nader: I feel like a lot of AI, or ML even, is like this. I think people think, you know, there was some tweet a few months ago that was like, why hasn't fine-tuning as a service taken off? You know, that might be me. It might have been you. Yeah. But people want it to be such an easy recipe to follow. But even, like, if you look at an ML model, it's specific
Kyle: to you. Yeah,
Nader: yeah.
Kyle: And the [00:32:00] model,
Nader: the situation, and there's just so much tinkering, right? Like, when you see a model that has however many experts in the MoE model, it's like, why that many experts? You know, they tried a bunch of things and that one seemed to do better. I think when it comes to how you're serving inference, you have a bunch of decisions to make, and you can always argue that you can take something and make it more optimal. But I think it's this internal calibration and appetite for continued calibration.
Vibhu: Yeah. And that doesn't mean, like, you know, people aren't taking a shot at this. Like Tinker, from Thinking Machines, you know? Yeah. RL as a service. Yeah, totally. It also gets even harder when you try to do big model training, right? We're not the best at training MoEs, uh, when they're pre-trained. Like we saw this with Llama 3, right?
They're trained in such a sparse way that Meta knows there's gonna be a bunch of inference done on these, right? They'll open-source it, but it's very trained for what Meta's infrastructure wants, right? They wanna inference it a lot. Now the question to basically think about is, okay, say you wanna serve a chat application, a coding copilot, right? You're doing a layer of RL, you're serving a model for X amount of people. Is it a chat model, a coding model? Dynamo, you know, back to that,
Kyle: It's [00:33:00] like, yeah, sorry, we sort of jumped off of that topic. Everyone has their own, own journey.

Cost Quality Latency Tradeoffs
Kyle: And I like to think of it as defined by, like, what is the model you need? What is the accuracy you need? Actually, I talked to Nader about this earlier. There's three axes you care about. There's the quality that you're able to produce: are you accurate enough, or can you complete the task with high enough performance? Yeah, yeah. There's cost: can you serve the model, or serve your workflow, because it's not just the model anymore, it's the workflow, it's the multi-turn with an agent, cheaply enough? And then, can you serve it fast enough? And we're seeing all three of these play out. Like, we saw new models from OpenAI that, you know, are faster. You have these new fast versions of models. You can change the amount of thinking to change the amount of quality, right? Produce more tokens, but at a higher cost and a higher latency. And really, when you start this journey of trying to figure out how you wanna host a model, you think about three things. What is the model I need to serve? How many times do I need to call it? What is the input sequence length? [00:34:00] What does the workflow look like on top of it? What is the SLA, the latency SLA, that I need to achieve?
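Those questions reduce deployment to a constrained search: fix the latency SLA, then pick the cheapest configuration that still meets it. A toy sketch with made-up benchmark numbers, not NVIDIA tooling:

```python
# Hypothetical benchmark results for one model at different tensor-parallel
# sizes: (config name, cost in $/1M tokens, p99 inter-token latency in ms).
configs = [
    ("TP1", 0.40, 95),
    ("TP2", 0.55, 48),
    ("TP4", 0.90, 30),
    ("TP8", 1.60, 22),
]

def cheapest_meeting_sla(configs, latency_sla_ms):
    # Keep only configs that satisfy the latency constraint...
    feasible = [c for c in configs if c[2] <= latency_sla_ms]
    if not feasible:
        return None  # SLA unachievable with this model/hardware combo
    # ...then take the lowest-cost one.
    return min(feasible, key=lambda c: c[1])

print(cheapest_meeting_sla(configs, latency_sla_ms=50))
```

With a 50 ms SLA this picks TP2: bigger parallel configs are faster but cost more, and the SLA acts as the constant Kyle describes while cost is minimized.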
Because this is usually a constant: you know the SLA that you need to hit, and then you try and find the lowest-cost version that hits all of these constraints. Usually, you know, you start with those things and you do a bit of experimentation across some common configurations. You change the tensor-parallel size, which is a form of parallelism.
Vibhu: I take it, it goes even deeper. First you gotta think, what model?
Kyle: Yes, of course, of course. It's a multi-step design process, because, as you said, you can choose a smaller model and then do more test-time scaling, and it'll equate to the quality of a larger model, because you're doing the test-time scaling, or you're adding a harness or something. So yes, it goes way deeper than that. But from the performance perspective, once you get to the model you need to host, you look at that and you say, hey, I have this model, I need to serve it at this speed. What is the right configuration for that?
Nader: You guys see the recent, uh, there was a paper I just saw, like, a few days ago, that if you run [00:35:00] the same prompt twice, you're getting, like, double.
swyx: Just try it again.
Nader: Yeah, exactly.
Vibhu: And you get a lot. Yeah. But the key thing there is you give the context of the failed try, right? Yeah. So it takes a shot. And this has been, like, you know, basic guidance for quite a while. Just try again. ‘Cause, you know, just try again. Did you try again?
Nader: All advice in life.
Vibhu: Just, it's a paper from Google, if I'm not mistaken, right? Yeah, yeah. I think it's, like, a seven-page little short paper. Yeah. Yeah. The title's very cute. And it's just like, yeah, just try again, give it as context.
Kyle: Multi-shot. You just say, like, hey, take a little bit more information, try and fail.
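The "just try again, with the failed attempt fed back as context" pattern being discussed is a small loop. A hedged sketch; `call_model` and `is_good` here are stand-ins, not a real API:

```python
def call_model(prompt):
    # Placeholder for a real LLM call; this stub "succeeds" only once the
    # prompt carries at least one earlier failed attempt as context.
    return "good answer" if "Previous failed attempt" in prompt else "bad answer"

def is_good(answer):
    return answer == "good answer"  # stand-in for a verifier or test suite

def retry_with_context(task, max_tries=3):
    prompt = task
    for attempt in range(max_tries):
        answer = call_model(prompt)
        if is_good(answer):
            return answer, attempt + 1
        # Key step: feed the failure back in, rather than retrying blind.
        prompt = f"{task}\n\nPrevious failed attempt:\n{answer}\nPlease fix it."
    return answer, max_tries

print(retry_with_context("Write the function."))
```

The difference from naive resampling is that each retry sees the earlier failure, so the model is correcting rather than rerolling.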
Vibhu: And that basic concept has gone pretty deep. There's, like, self-distillation RL, where you do self-distillation, you do RL, and you have past failures, and, you know, that gives some signal. So people take, try it again, not strong enough.
swyx: Uh, for listeners who've made it to here: Vibhu and I run a second YouTube channel for our paper club, where,
Kyle: Oh, that's awesome.
swyx: Vibhu just covered this. Yeah. Awesome. Self-distillation and all that. That's why he's up to [00:36:00] speed on it.
Nader: I'll have to check it out.
swyx: Yeah. It's just a good practice. Like, everyone needs a paper club, where you just read papers together and the social pressure kind of forces you to,
Nader: There's, like, a big inference reading group at Nvidia. I feel so bad every time. He put it on, like, on our, he shared it.
swyx: One of your guys is big in that, I forget, Ishan? Yeah, yeah,
Kyle: Ishan's on my team, actually. Funny, there's an employee transfer between us. Ishan worked for Nader at Brev, and now he's on my team. He was
Nader: our head of AI. And then, yeah, once we got in, and
swyx: Because I'm always looking for, like, okay, can I start another podcast that only does that thing? Yeah. And, uh, I was trying to nudge Ishan into, like, is there something here? I mean, I don't think there's new inference techniques every day. So it's like,
Kyle: You would actually be surprised, um, the amount of blog posts you see.
swyx: And there was a period where it was like, Medusa, Hydra, Eagle, like, you know,
Kyle: Now we have new forms of decode, uh, we have new forms of speculative decoding, or new,
swyx: What,
Kyle: what are you
Vibhu: excited?
And it's exciting when you guys put out something like Nemotron, ‘cause I remember the paper on Nemotron 3, [00:37:00] the amount of post-training tokens that the GPU-rich can just train on. And it was a hybrid state-space model, right? Yeah.
Kyle: It's co-designed for the hardware.
Vibhu: Yeah, co-designed for the hardware. And one of the things was always, you know, the state-space models don't scale as well when you do a conversion, or whatever, the performance. And you guys were like, no, just keep training. And Nemotron shows a lot of that. Yeah.
Nader: Also, something cool about Nemotron: it was released in layers, if you will, very similar to Dynamo. Essentially, it was released as: the pre-training and post-training datasets are released. Yeah. The recipes on how to do it are released. The model itself is released. It's a full model. You just benefit from us turning on the GPUs. But there are companies, like, uh, ServiceNow took the dataset and trained their own model, and we were super excited and, like, you know, celebrated that work.
Vibhu: Zoom's different. Zoom is, Zoom is CGI, I think. Uh, you know, also, just to add, a lot of models don't put out base models, and if there's that, why is fine-tuning not taking off? You know, you can do your own training. Yeah,
Kyle: Sure.
Vibhu: You guys put out base models. I think you put out everything.
Nader: I believe, I know [00:38:00]
swyx: about base. Basically,
Vibhu: without base,
swyx: base can be cancelable.
Vibhu: Yeah. Base can be cancelable.
swyx: Yeah.
Vibhu: Safety training.
swyx: Did we get a full picture of Dynamo? I don't know if we,
Nader: What I'd love is, you mentioned the three axes. Break it down: what's prefill and decode, and what are the optimizations that we can get with Dynamo?
Kyle: Yeah. That's, that's a great point.
So to summarize on that three-axes problem, right, there are three things that determine whether or not something can be done with inference: cost, quality, latency, right? Dynamo is supposed to be there to provide you the runtime that allows you to pull levers to, you know, mix it up and move around the Pareto frontier, or the Pareto surface, that determines: is this actually possible with inference and AI today?
Nader: It gives you the knobs.
Kyle: Yeah, exactly. It gives you the knobs.

Disaggregation Prefill vs Decode
Kyle: And one thing that we use a lot in contemporary inference, and that is, you know, starting to pick up in general knowledge, is this concept of disaggregation. So, historically, models would be hosted with a single inference engine, and that inference engine [00:39:00] would ping-pong between two phases. There's prefill, where you're reading the sequence and generating KV cache, which is basically just a set of vectors that represent the sequence, and then using that KV cache to generate new tokens, which is called decode. And some brilliant researchers, across multiple different papers, essentially made the realization that if you separate these two phases, you actually gain some benefits. Those benefits are, basically: A, you don't have to worry about step-synchronous scheduling. The way that an inference engine works is, you do one step, then you finish it, and then you start scheduling the next step. It's not fully asynchronous. And the problem with that is, prefill and decode are actually very different, in terms of both their resource requirements and, sometimes, their runtime. So you would have prefill that would block decode steps, because you'd still be pre-filling and you couldn't schedule, because, you know, the step has to end.
So you remove that scheduling issue, and then you also allow yourself to [00:40:00] split the work into two different types of pools. So prefill, typically, and this changes as model architecture changes, prefill is, right now, compute-bound most of the time; when the sequence is sufficiently long, it's compute-bound. The decode side, because you're doing a full pass over all the weights and the entire sequence every time you do a decode step, and you don't have the quadratic computation of KV cache, is usually memory-bound, because you're retrieving a linear amount of memory and doing a linear amount of compute, as opposed to prefill, where you retrieve a linear amount of memory and then do a quadratic amount of compute.
Nader: You know, it's funny, someone at Exo Labs did a really cool demo where, with the DGX Spark, which has a lot more compute, you can do the compute-hungry prefill on a DGX Spark and then do the decode on a, on a Mac. Yeah. And so
Vibhu: that's faster.
Nader: Yeah. Yeah.
Kyle: So you can do that. You can do machine stratification. And with our future generations of hardware, we actually announced, with Rubin, this [00:41:00] new accelerator that is prefill-specific. It's called Rubin CPX.

Kubernetes Scaling with Grove
Nader: So I have a question. When you do the scale-out, yeah, is scaling out easier with Dynamo? Because when you need a new node, you can dedicate it to either the prefill or, uh, decode.
Kyle: Yeah. So Dynamo actually has a Kubernetes component in it called Grove that allows you to do this crazy scaling specialization. I don't wanna go too deep into Kubernetes here, but there was a previous way that you would launch multi-node work. It's called LeaderWorkerSet. It's in the Kubernetes standard, and LeaderWorkerSet is great.
It served a lot of people super well for a long period of time. But one of the things that it struggles with is representing a set of cases where you have a multi-node replica that has a pair, right, you know, prefill and decode, or, it's not paired, but it has a second stage with a ratio that changes over time. And prefill and decode are two different things. As your workload changes, right, the amount of prefill you'll need to do may change. [00:42:00] The amount of decode that you'll need to do might change, right? Like, let's say you start getting insanely long queries, right? That probably means your prefill scales harder, because you're hitting this quadratic scaling growth.
swyx: Yeah. And then, for listeners, prefill would be long input; decode would be long output, for example, right?
Kyle: Yeah. So decode, decode is funny, because the amount of tokens that you produce scales with the output length, but the amount of work that you do per step scales with the amount of tokens in the context.
swyx: Yes.
Kyle: So it scales with both the input and the output.
swyx: That's true.
Kyle: But on the prefill-versus-decode side, if, suddenly, the amount of work you're doing on the decode side stays about the same, or scales a little bit, and then the prefill side jumps up a lot, you actually don't want that ratio to stay the same. You want it to change over time. So Dynamo has a set of components that, A, tell you how to scale, it tells you how many prefill workers and decode workers it thinks you should have, and also provides a scheduling API for Kubernetes that allows you to actually represent and effect this scheduling on your actual [00:43:00] hardware, on your compute infrastructure.
Nader: Not gonna lie, I feel a little embarrassed for being proud of my SVG function earlier.
swyx: No, it
Nader: was really
Kyle: cute.
swyx: I, I like,
Nader: it's all,
swyx: it's all engineering.
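The prefill/decode asymmetry underneath all of this scaling shows up in rough arithmetic-intensity numbers. A toy sketch assuming a hypothetical dense fp16 model, ignoring attention FLOPs, KV-cache reads, and batching:

```python
P = 70e9   # hypothetical dense model: 70B parameters
S = 8192   # prompt length in tokens
BYTES = 2  # fp16 weights

# Prefill: every weight is read once, but used for all S tokens at once.
prefill_flops = 2 * P * S               # ~2 FLOPs per param per token
prefill_bytes = P * BYTES
prefill_intensity = prefill_flops / prefill_bytes  # FLOPs per byte

# Decode: every weight is read again to produce ONE new token.
decode_flops = 2 * P
decode_bytes = P * BYTES
decode_intensity = decode_flops / decode_bytes

print(f"prefill: {prefill_intensity:.0f} FLOPs/byte, decode: {decode_intensity:.0f} FLOPs/byte")
```

With a GPU roofline of a few hundred FLOPs per byte, prefill's thousands of FLOPs per byte saturate compute while decode's ~1 FLOP per byte waits on memory, which is why splitting them into separately sized pools pays off.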
It's all engineering. Um, that's where I'm,
Kyle: technical.
swyx: One thing I'm kind of just curious about, with what you see at a systems level, everything going on here. Mm-hmm. And, you know, we're scaling it up in distributed systems.

Context Length and Co Design
swyx: Um, I think one thing that's kind of of-the-moment right now is people asking: is there any SOL, sort of upper bound, in terms of, let's just call it context length, for want of a better word, but you can break it down however you like.
Nader: Yeah.
swyx: I just think, like, well, yeah, I mean, clearly you can engage in hybrid architectures and throw in some state-space models in there all you want, but it still looks very attention-heavy.
Kyle: Yes. Uh, yeah. Long context is attention-heavy. I mean, we have these hybrid models, um,
swyx: And most models cap out at a million tokens of context, and that's it. Yeah. Like, for the last two years, that's been it.
Kyle: Yeah. The model-hardware-context co-design thing that we're seeing these days is actually super [00:44:00] interesting. It's, like, my secret side passion. We see models like Kimi or GPT-OSS. I use these because I know specific things about these models. So Kimi K2 comes out, right? And it's an interesting model. It's a DeepSeek-style architecture. It's MLA. It's basically DeepSeek, scaled a little bit differently, um, and obviously trained differently as well. But they talked about why they made the design choices for context. Kimi has more experts but fewer attention heads, and, I believe, a slightly smaller attention dimension, but I need to check that. Uh, it doesn't matter. But they actually discussed this at length in a blog post on Zhihu, which is, like, Chinese Quora,
swyx: Yeah.
Kyle: um, in China. Chinese Reddit.
swyx: Yeah.
Kyle: It's, yeah.
So it's, it's actually an incredible blog post. Like, all the ML people I've seen on there are very brilliant, and the creators of Kimi K2 [00:45:00] actually talked about it there, in the blog post. And they say: we actually did an experiment, right? Attention scales with the number of heads, obviously. If you have 64 heads versus 32 heads, you do half the work of attention. You still scale quadratically, but you do half the work. And they made a very specific sort of barter in their architecture. They basically said, hey, what if we gave it more experts, so we're gonna use more memory capacity, but we keep the number of activated experts the same? We increase the expert sparsity, so the ratio of experts activated to number of experts is smaller, and we decrease the number of attention heads.
Vibhu: And, kind of for context, what we had been seeing was, you make models sparser instead. So no one was really touching heads. You're just having, uh,
Kyle: Well, they did implicitly make it sparser.
Vibhu: Yeah, yeah. For, for Kimi, they did.
Kyle: Yes.
Vibhu: They also made it sparser. But basically, what we were seeing was, people were at the level of, okay, there's a sparsity ratio. You want more total parameters, less active, and that's sparsity. [00:46:00] But what you see from the labs, like Moonshot, DeepSeek, is they go to the level of, okay, beyond just the number of experts, you can also change how many attention heads, fewer attention layers, more attention layers.
Kyle: Layers, yeah. Yes, yes.
Vibhu: So, and that's all basically coming back to, just to tie it together, hardware-model co-design, which is,
Kyle: Hardware-model-context co-design.
Vibhu: Yeah.
Kyle: Right. Like, if you were training a model that was, like,
really, really short context, or, like, really good at super-short-context tasks, you may design it in a way such that you don't care about attention scaling, because it hasn't hit the turning point where the quadratic curve takes over.
Nader: How do you consider attention, or context, as a separate part of the co-design? Like, how I would've thought of it is, hardware-model co-design would just be hardware-model-context co-design,
Kyle: Because the harness, and the context that is produced by the harness, is a part of the model, once it's trained in.
Vibhu: Like, even though towards the end you'll do long context, you're not changing architecture through, I see, training. Yeah.
Kyle: I mean, you can try.
swyx: You're saying [00:47:00] everyone's training the harness into the model.
Kyle: I would say, to some degree, or,
swyx: There's co-design for harness. I know there's a small amount, but I feel like not everyone has, like, gone full send on this.
Kyle: I think it's important to internalize the harness that you think the model will be running, running it into the model.
swyx: Yeah. Interesting. Okay. Bash is, like, the universal harness,
Kyle: Right? Like, I'll give an example here. I mean, it's, like, an easy proof, right? If you can train against a harness, and you're using that harness for everything, wouldn't you just train with the harness, to ensure that you get the best possible quality out of it?
swyx: Well, I can provide a counter-argument.
Kyle: Yeah, sure.
swyx: Which is, you wanna provide a generally useful model for other people to plug into their harnesses, right? So if you,
Kyle: Yeah. Harnesses can be open, open source, right?
swyx: Yeah.
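Circling back to the Kimi K2 trade Kyle described, rough bookkeeping shows the effect of halving attention heads while adding non-activated experts. All numbers here are hypothetical, not Kimi's actual config:

```python
def attn_flops(seq_len, n_heads, head_dim):
    # Full self-attention for one layer: QK^T scores plus value aggregation,
    # i.e. two (seq x seq x head_dim) matmuls per head, 2 FLOPs per MAC.
    return 2 * 2 * n_heads * seq_len * seq_len * head_dim

S, D = 128_000, 128  # long context, per-head dimension

baseline = attn_flops(S, 64, D)  # 64 attention heads
slimmed = attn_flops(S, 32, D)   # 32 heads: half the attention work

print(f"attention FLOPs ratio: {baseline / slimmed:.1f}x")

# The MoE side then grows capacity without growing per-token compute:
# more total experts, but the same number activated per token.
total_experts, active = 384, 8  # hypothetical "more experts" config
old_total, old_active = 256, 8  # hypothetical baseline
assert active == old_active     # per-token FFN compute unchanged
```

The quadratic term still dominates at long context, but it is scaled down by the head count, while the extra experts spend memory capacity rather than per-token FLOPs, which is the barter the blog post describes.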
So I mean, that's, that's effectively what's happening with Codex.
Kyle: Yeah.
swyx: And, but like you may want like a different search tool, and then you may have to name it differently or,
Nader: I don't know how much people have pushed on this, but can you train a model, would it be, have people compared training a model for the, for the harness versus [00:48:00] like post training for
swyx: I think it's the same thing. It's the same thing. It's okay, just extra post training.
Nader: I see.
swyx: And so, I mean, Cognition does this, of course, it does this where you, you just have to like, if your tool is slightly different, um, either force your tool to be like the tool that they train for. Hmm. Or undo their training for their tool and then, oh, that's, re, retrain. Yeah. It's, it's really annoying and like,
Kyle: I would hope that eventually we hit like a certain level of generality with respect to training new
swyx: tools. This is not AGI, like, it's, this is a really stupid like, learn my tool b***h. Like, I don't know if, I don't know if I can say that, but like, you know, um, I think what my point kind of is, is that there's, like, I look at slopes of the scaling laws and like, this slope is not working, man. We, we are at a million token con
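The head-count trade-off Kyle describes from the Kimi K2 blog post can be sketched as back-of-envelope arithmetic: halving attention heads halves per-token attention work (still quadratic in sequence length), while adding total experts at a fixed activated count adds capacity without adding per-token compute. The numbers below are illustrative, not Kimi K2's actual configuration:

```python
# Illustrative sketch of the trade described above. All parameter
# counts and dimensions are made up for the example.

def attention_flops(seq_len: int, n_heads: int, head_dim: int) -> int:
    # Attention score work per layer: linear in the number of heads,
    # quadratic in sequence length.
    return n_heads * seq_len * seq_len * head_dim

def sparsity_ratio(n_active: int, n_experts: int) -> float:
    # Fraction of experts activated per token; lower means sparser.
    return n_active / n_experts

def active_params(n_active: int, expert_params: int) -> int:
    # Per-token MoE compute depends only on the *activated* experts,
    # not on the total expert count sitting in memory.
    return n_active * expert_params

# Halving heads (64 -> 32) halves attention work at the same context.
base = attention_flops(seq_len=4096, n_heads=64, head_dim=128)
halved = attention_flops(seq_len=4096, n_heads=32, head_dim=128)
assert 2 * halved == base

# More total experts with the same activated count: the activation
# ratio shrinks (sparser), while per-token compute stays the same.
assert sparsity_ratio(8, 384) < sparsity_ratio(8, 256)
assert active_params(8, 10**7) == active_params(8, 10**7)
```

The point of the trade: memory capacity (total experts) is spent where it buys quality cheaply, and the savings are taken from attention, whose cost still grows quadratically with context length.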

Startup Project
Inside the Battle for AI Cloud Dominance — Why Cloud Builders like TensorWave are Rethinking NVIDIA's Monopoly | Jeff Tatarchuk, Co-Founder of TensorWave


Play Episode Listen Later Mar 8, 2026 42:18


Rethinking AI Compute Infrastructure: The TensorWave Approach

In this episode, Jeff Tatarchuk, co-founder of TensorWave, shares how his deep industry experience and innovative mindset are transforming AI compute infrastructure. We explore how building specialized data centers, focusing on AMD GPUs, and creating flexible ecosystems are shaping the future of scalable AI.

In this episode:
The evolution of cloud companies and the rise of neoclouds focused on AI compute
TensorWave's strategy of deploying AMD GPUs in custom data centers
Lessons learned from the FPGA cloud business and the transition into GPU infrastructure
The technical challenges of scaling data centers quickly amid power and supply-chain constraints
The importance of software ecosystems, interoperability, and supporting AMD's software stack
How TensorWave differentiates itself from purely financial arbitrage models and pure NVIDIA-centric clouds
AMD's advantages in memory capacity, chiplet architecture, and software support
The technical intricacies of CUDA versus ROCm, and efforts to build an open ecosystem
Future vision: democratized, reliable, and flexible AI compute options for enterprises and labs

Timestamps:
00:00 – Introduction to TensorWave and the AI compute landscape
02:30 – The rise of neoclouds and innovation waves in cloud infrastructure
06:00 – How TensorWave's FPGA cloud background shaped its GPU strategy
10:00 – Challenges in deploying large data centers: power, supply chain, and permitting
14:00 – Building and scaling AMD GPU data centers quickly and efficiently
19:00 – Software ecosystems: the CUDA moat and TensorWave's 'Beyond CUDA' summit
23:00 – Market differentiation: technical and operational challenges in the neocloud space
27:00 – Supporting enterprise fine-tuning and large-scale training demands
32:00 – AMD's technical advantages: VRAM, chiplet architecture, and software support
36:00 – Building an open, heterogeneous AI ecosystem beyond CUDA
40:00 – What success looks like: a resilient, accessible AI compute future

Resources & Links:
TensorWave
Beyond CUDA Summit
ScalarLM by Greg Diamos
AMD MI300X Data Center Chip
NVIDIA H100
ROCm Software Stack
LinkedIn
Twitter

This conversation offers a strategic look at how focused infrastructure development, software ecosystem support, and hardware differentiation are critical in shaping the future of accessible, scalable AI compute. Whether you're building data centers, developing AI hardware, or just interested in industry shifts, this episode provides valuable insight into how companies like TensorWave are reshaping the landscape.

OneDigital
Podcast ONE: 6 de marzo de 2026


Play Episode Listen Later Mar 7, 2026 124:04


Podcast ONE: March 6, 2026. CoPaw (local AI without the cloud), GPT‑5.4 with a million tokens, the new "budget" MacBook Neo, the Iran‑Israel war amplified by AI disinformation, and everything #MWC2026 left behind. Listen to the new episode of #PodcastONE on One Digital.

Facebook Live One Digital: CoPaw, GPT‑5.4, MacBook Neo, and the geopolitical chaos of March 2026

In this episode from Friday, March 6, 2026, broadcast live from São Paulo (Brazil) and Mexico City, Vincent Quezada and Pablo Berruecos break down an explosive week: local artificial intelligence tools (CoPaw), the launch of GPT‑5.4 with a one-million-token context, the MacBook Neo (the cheapest Apple laptop in its history), the Iran‑Israel geopolitical conflict amplified by AI disinformation on social media, and Mobile World Congress 2026, which redefined privacy, security, and mobile connectivity. An episode that sums up the state of technology, geopolitics, and digital ethics in 2026.

What is CoPaw? A fully local AI agent with no cloud dependencies

Vincent opens the episode by introducing CoPaw (Co‑Personal Agent Workstation), an artificial intelligence agent that runs entirely on your local machine, without processing data on external servers the way ChatGPT or Gemini do. The architecture is a direct evolution of the COD agents (Alibaba's multi-agent framework). The critical difference: all information stays inside your machine, guaranteeing total privacy and offline operation once the project is installed.

"CoPaw is not simply a chat client for local models. It is a task orchestrator that can browse the internet, read PDFs, generate Word documents, send messages over Telegram, and run scheduled actions automatically with no human intervention." — Vincent Quezada

CoPaw technical requirements: hardware and software

Minimum RAM: 8 GB (16 GB ideal for multitasking).
Storage: 10 GB minimum (20 GB recommended for large models).
Software: Python 3.10, Node.js v18.
GPU optional but recommended: an NVIDIA card with CUDA cuts response times from 15–40 seconds to 3–8 seconds.
Compatibility: Windows, macOS, and Linux; the automated installer handles every dependency.
Model engine: Ollama (downloadable from ollama.com), available for Windows, macOS, Ubuntu, and Debian.

Local language models by need and available RAM

The choice of model depends on your hardware and your use case. Vincent explains that the number at the end of a model's name (3B, 7B, 8B, 14B) is how many billions of parameters it handles; the higher the number, the better the accuracy, but the more RAM it requires.

Phi 3 Mini (4 GB RAM): short answers, basic machines, introductory use.
Llama 2 8B (8 GB RAM): medium speed (15–40 seconds), good for general writing, text analysis, and summaries.
Mistral 7B (8 GB RAM): specialized in creative writing and summarizing long content.
DeepSeek 8B (8 GB RAM): logical reasoning, code analysis, and debugging.
Qwen 3 (14B) (16 GB RAM): complex tasks and extensive data analysis; slow without a GPU.

"Don't use a 20-gigabyte model for a simple translation. That's like driving a cargo truck to the corner store. Choose based on your actual task." — Vincent Quezada

Specialized modules that take CoPaw beyond basic chat

CoPaw includes independent modules that activate automatically based on the context of your task. Each one requires some specific configuration.

Browser Reissable: an autonomous web browser that searches for information in real time; requires installing Playwright.
News Module: automatic news search and summarization; requires a Tavily API key (free tier with 1,000 searches per month).
File Reader: reads local files (.txt, .csv, .json) with no extra configuration.
PDF Module: extracts, analyzes, and summarizes complex PDFs.
DOCX Module: creates and edits Word documents automatically.
XLSX Module: manipulates spreadsheets and computes column averages, maxima, and minima.
PPTX Module: generates PowerPoint presentations automatically.
Cron Jobs (automation): schedules tasks to run at set intervals (daily, weekly, every N hours) with no user intervention.
Email Manager (Himalaya): automatic email handling; Vincent recommends it only for advanced users.

Practical use cases by experience level

Beginner:
"Find today's most important artificial intelligence news."
"Explain the difference between machine learning and deep learning with practical examples."
"Draft a formal email requesting a meeting with an important client."

Intermediate:
"Read the file C:\Usuarios\Documentos\reporte.pdf and generate an executive summary of at most 500 words."
"Open ventas_2025.xlsx, identify the three months with the highest growth between January and March, and show the percentages."
"Go to Amazon.com.mx, search for wireless headphones under 1,500 pesos, and list the five best options with price and link."

Advanced:
"Find today's five most important technology news stories, write a 150-word paragraph on each, and save the result to noticiashoy.docx."
"Read every .csv file in C:\datos, combine them into one, and compute the average, maximum, and minimum of every numeric column."
"Go to LinkedIn, search for content-writer openings posted this week in Mexico City, extract titles, companies, and links, and save everything to empleos.xlsx."

Scheduled tasks: CoPaw's real differentiator

Its most powerful feature is the ability to schedule automatic runs without the user being present.
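The episode says these schedules and heartbeat values live in config.json but does not show the format; the sketch below is a hypothetical take on how such a schedule could be parsed and turned into next-run times. Field names like "heartbeat_seconds" and "every_hours" are guesses for illustration, not CoPaw's real schema.

```python
# Hypothetical config-driven scheduler sketch; the JSON layout is an
# assumption, not CoPaw's documented format.
import json
from datetime import datetime, timedelta

config_text = """
{
  "heartbeat_seconds": 60,
  "tasks": [
    {"name": "daily_news", "prompt": "Summarize today's top AI news", "every_hours": 24},
    {"name": "btc_price",  "prompt": "Log the current Bitcoin price", "every_hours": 6}
  ]
}
"""

def next_runs(config: dict, now: datetime) -> dict:
    """Compute the next execution time for each scheduled task."""
    return {
        task["name"]: now + timedelta(hours=task["every_hours"])
        for task in config["tasks"]
    }

config = json.loads(config_text)
runs = next_runs(config, datetime(2026, 3, 6, 8, 0))
print(runs["btc_price"])  # six hours after the reference time
```

A real agent would loop on the heartbeat interval, compare each task's next-run time against the clock, and hand the prompt to the local model when it comes due.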
This turns CoPaw from a simple chat tool into a genuine productivity assistant.

Daily news digest: "Set up a task that runs every day at 8:00 a.m.: find the top technology and AI news and save the result to noticiasdiarias.txt."
Cryptocurrency price monitoring: "Create a task every six hours: record the current Bitcoin price with date and time in precio.txt."
Consolidated weekly report: "Schedule a task every Monday at 9:00 a.m.: read every .txt file in C:\reportes, generate an executive summary, and save the document as reportesemanal.docx."
Automatic file cleanup: "Set up a task every Friday at 11:00 p.m.: move every .log file older than 30 days to the archivos_antiguos folder."

These variables (frequency, schedules, heartbeat intervals) are controlled in the config.json file. Vincent stresses the importance of careful testing before automating critical processes.

Does CoPaw require internet? Troubleshooting common errors

CoPaw works entirely offline once installed with its model downloaded. It only needs internet for web searches through Tavily and if you configure external APIs (OpenAI, Anthropic). The most frequent errors Vincent hit during his tests:

"Cannot connect to CoPaw server": check that you ran copaw start and that port 8088 is available.
"copaw command not recognized": the executable's directory is not on the system PATH; set the path manually or use the full script.
"Ollama unavailable": the address must be exactly localhost:11434 with no suffixes; check the configuration file.

CoPaw vs. OpenCloud: which is better?

"CoPaw was more useful than OpenCloud in my tests. While OpenCloud is very powerful, CoPaw offers faster installation, a more approachable interface, and clearer documentation. Both are open source under the Apache 2.0 license. CoPaw is completely free; only the Tavily key carries an optional cost (around 10 dollars a month)." — Vincent Quezada

MacBook Neo: the first truly affordable Apple laptop (599 dollars)

Apple launched the MacBook Neo, a historic break in its pricing strategy. For the first time in Macintosh history there is a genuinely affordable Apple laptop: 599 dollars (499 dollars for education). Aimed at students and first-time users, it marks a radical shift in democratizing the Apple ecosystem.

MacBook Neo technical specifications

Processor: A18 Pro chip; six cores (two performance, four efficiency); five-core GPU; six-core Neural Engine for AI tasks.
AI performance: up to three times faster on AI workloads than the competition; full access to Apple Intelligence while keeping data private.
Liquid Retina display: 13 inches, 2,408 × 1,506 pixels, 510 nits of brightness, support for one billion colors; one of the brightest displays in its price range.
Battery: 36.5 Wh, up to 16 hours of mixed use; two USB‑C ports with fast charging.
Design and build: sturdy aluminum chassis weighing just 1.23 kg; available colors: Blush, Indigo, Silver, and Electric.
Connectivity: Wi‑Fi 6E, Bluetooth 6, a 3.5 mm audio jack (rare these days), a 1080p FaceTime HD camera, dual microphones, and Dolby Atmos spatial audio.
Storage: 256 GB base (Vincent questions this spec at that price, since Windows alternatives offer 512 GB for less).
Software: macOS preinstalled with full Apple Intelligence integration.
Availability: shipping starts March 11, 2026.

"The display is genuinely exceptional. It is one of the best I have seen compared with iPads and traditional monitors. On that point alone the MacBook Neo justifies itself." — Vincent Quezada

Who is the MacBook Neo for?

Students: they need a capable, light machine with all-day battery; the education price (499 dollars) is especially attractive.
New Mac users: anyone looking for an affordable entry into the Apple ecosystem without spending more than 1,200 dollars.
Everyday professionals: web browsing, document editing, video calls, and basic productivity.
Sustainability-minded buyers: it is built with 60% recycled material.

Vincent adds a warning: 256 GB of base storage at 599 dollars is questionable, since at the same price you can find Windows laptops with 512 GB that offer better short-term value. Even so, the MacBook Neo's design, display, and battery life compete favorably.

OpenAI's GPT‑5.4: a million tokens, automation, and 33% fewer errors

OpenAI released GPT‑5.4 on March 5, 2026, just one day before this episode. During the conversation, ChatGPT (taking part in a dialogue with Vincent) explained the key updates that set it apart in the market: a context window of up to one million tokens, a 33% reduction in errors versus the previous version, deeper automation tooling, and tighter integration with professional workflows. (The full technical details get a calmer treatment on the show, but the episode's focus is the practical and geopolitical impact.)

Iran strikes critical infrastructure: AI disinformation amplifies the geopolitical chaos

Midway through the episode, the conversation turns to the conflict exploding across the planet: Iran launched attacks on US military bases, data centers (including Microsoft Azure facilities in the Persian Gulf), and desalination systems in the Middle East.
Vincent and Pablo frame this escalation within a broader story: the United States, in barely 250 years of existence, has been at peace for only 16; the rest has been constant armed conflict. Iran, over four decades, has built up an immense national defensive capacity. When million-dollar missiles are launched to destroy 20,000-dollar drones, the economics of war reveal their inherent irrationality.

"We are watching a surgical operation by a country that has spent decades preparing for a moment like this. It is not improvised; it is strategic calculation. The problem is that it breeds extreme nationalism, not internal revolution." — Vincent Quezada

How many countries are really involved? The conflict expands beyond Iran and Israel

What initially looked like a bilateral Iran‑Israel conflict has expanded to somewhere between 16 and 17 countries. It is not just nation-on-nation strikes, but also:

Attacks on US military bases across multiple Persian Gulf nations.
Compromised critical civilian infrastructure, such as desalination plants supplying water to millions of people.
Microsoft Azure data centers that run NATO systems, US defense, and major financial institutions.
GPS degraded or jammed across the conflict zones.

Pablo stresses that one compromised desalination plant in the Persian Gulf affects millions of civilians. This is not just a military conflict; it is a systemic attack on civilian survival.

"The initial strategy I read was that, after killing the leader, there would be internal revolution and regime change. It does not work that way. You cannot change 40 years of dominance, popular belief, and culture with a bombing run. It produced extreme nationalism, the exact opposite." — Pablo Berruecos

Daily economic cost: more than a billion dollars of active conflict

The figure for daily military spending is almost incomprehensible. According to X (Twitter) accounts tracking military spending in real time, the conflict costs more than a billion dollars a day. Set against simultaneous US stock-market losses (Nvidia −1.55%, Google in the red, Apple −1.42%, Visa −0.69%, Amazon −0.48%, Tesla −2.33%), the global economic cost is catastrophic.

Breakdown of the first days of attacks

Day 1 (Iran's first strike): 500 missiles launched at Israel and US bases.
Day 2: 200 missiles.
Day 3: 100 missiles.
Day 4: 50 missiles.
Day 5 onward: 15–20 missiles, but with intensified use of drones and more sophisticated systems.

On munitions: to intercept each incoming missile, the United States used between 10 and 20 Tomahawk missiles, at roughly 4–5 million dollars apiece. The math is devastating: defending against 500 missiles cost between 5 and 10 billion dollars on defense alone. Iran, with a smaller military budget, amplifies its impact by using low-cost drones that replicate the capability of far more expensive missiles.

Why is Dubai panicking? A crisis of confidence in tax havens

Pablo tells an unsettling anecdote: a Spanish influencer moved to Dubai explicitly to avoid paying taxes. When the bombing started, she asked the Spanish government to rescue her. Social media reacted harshly: "You left to avoid taxes, but you expect our taxes to save you."

Beyond the media drama, this reveals a deeper crisis of confidence. Dubai represents extreme opulence (pools on every floor, lavish spending). At the same time it is a vulnerable city: built in the middle of the desert with no natural resources, it depends on desalinated water and imported oil. One compromised desalination plant leaves millions of people without drinking water. Embassies cannot evacuate everyone; airport capacity is limited. The gold deposits of Gulf states raise questions: who controls them if there is an invasion? Does that currency lose its credibility?

"Dubai gives you an illusion of safety. Then you discover you are as vulnerable as anywhere else. If you lose access to water, money, and energy, the opulence vanishes in a matter of hours." — Pablo Berruecos

Is this a third world war? Vincent and Pablo's complicated answer

The big question: is this World War III? Vincent and Pablo answer no, but it is a multinational conflict without recent precedent.

Factors pushing toward total war: multiple fronts (technological, energy, cyber), incalculable escalation risk, and nuclear powers in unstable balance.
Limiting factors: China does not want to get involved (if it does, it is planetary "game over"); Russia comments from the sidelines; diplomacy exists, but feels like fiction.
Current reality: a war with no formal declaration, no clear limits, and no visible end. It is a major conflict that could become a world war if someone makes the wrong decision.

Censorship on social platforms: TikTok, Grok, and ChatGPT selectively erase reality

Vincent levels a central accusation: social platforms are censoring the real conflict while amplifying AI-generated disinformation, forming a dual control mechanism.

Selective censorship. TikTok, Grok, and ChatGPT have censored terms like "Free Palestine," blocked verifiable attack footage, and silenced reporting of real bombings. The result is that users never see the conflict's true scale.

Disinformation amplification. At the same time, fake AI-generated videos replicate massively. One documented example is a video of a missile striking an aircraft carrier, with lifeboats flying off in physically impossible ways. International media replayed it as if it were a real event.
"A lot of people left ChatGPT this week not over technical problems, but because OpenAI said 'yes' to participating in the war while Anthropic said 'no.' Around 1.5 million users migrated on ethical grounds." — Vincent Quezada

Tehran's "Police" park: how AI commits atrocities without intent

One detail sums up the tragedy: Tehran has a public park called Police Park. US AI systems classified it as a "police military base" and bombed it. There were no police, only civilians. Public infrastructure with no military value was destroyed.

This illustrates an existential crisis: if AI systems are used to identify targets and those systems make classification errors, who is responsible? The usual legal answer is no one, because "a machine did it." The pattern repeats: hospitals destroyed, schools destroyed, churches destroyed. Every error, intentional or not, translates into more civilian victims.

What share of what you see is real, and what share is AI-generated?

This is the question that haunts Pablo at the end of the section. On social media, the feed is contaminated: old videos from last year, recent videos manipulated with AI, legitimate real-time analysis, coordinated disinformation campaigns, and selective censorship, all mixed together. Pablo cites a report from a European channel (available via Roku) analyzing the massive volume of fake videos in circulation. The conclusion is terrifying: you do not know what to believe.

"Between seeing nothing (because it is censored) and seeing everything fake (because it is AI), you end up paralyzed. Truth stops mattering once you can no longer identify it." — Pablo Berruecos

Real technological impact: Microsoft Azure and the conflict's digital backbone

One detail deserves its own analysis: Iran attacked Microsoft data centers in the Persian Gulf. These are not commercial services like AWS, but Azure infrastructure supporting:

NATO's operational backbone.
The US Department of Defense.
Major Western financial institutions.
Military 5G infrastructure.
Azure availability zones with FedRAMP High classification, the highest a commercial provider can obtain.

If those data centers were to go down (not yet officially confirmed), the impact on Western defense structures and finance would be catastrophic. Pablo stresses that this is not a commercial attack but an attack on the digital connective tissue binding defense architecture to sovereign AI ambitions in the Persian Gulf.

Partial conclusion: the Iran‑US‑Israel conflict is no longer only military; it is digital, economic, and technological. AI-generated disinformation amplifies the chaos while selective censorship paralyzes public understanding. The result is a lawless planet where truth is as scarce as peace.

Mobile World Congress 2026: privacy, security, and satellite connectivity

After the geopolitical analysis, Vincent and Pablo steer the conversation to Mobile World Congress 2026 in Barcelona, the global mobile industry's most important event. This year marks an inflection point: privacy and security stop being optional features and become competitive pillars. Motorola abandons traditional Android for GrapheneOS; multiple manufacturers launch Linux-only phones exclusive to Europe; MediaTek integrates 5G satellite connectivity; Nothing unveils the Phone 4 with its transparent Glyph Matrix design. Pablo and Vincent dissect each launch in technical detail.
Nothing Phone 4: transparent Glyph Matrix design

Nothing launched the Phone 4 with a radical proposition: keep the iconic transparent design and add Glyph Matrix, a grid of 137,000 mini-LEDs covering 57% of the device's back that shine 100% brighter than previous generations. The LEDs render customizable icons (battery, timer, digital clock, Glyph mirror, sun path) that turn the rear camera area into a unique haptic and visual interface.

Nothing Phone 4 technical specifications

Glyph Lift Matrix design: a metal unibody fused with light refractions, seamless soft finishes, and a retrofuturist look inspired by vintage cinema cameras and classic consoles.
Colors: silver, black, and metallic pink (rare in 2026 and instantly recognizable).
Main rear camera: a large Sony Exmor 700c sensor, 50 megapixels, 3.5x optical zoom.
Ultra-wide camera: a 32-megapixel Sony sensor for wide-context capture.
Lens Engine 4: supports 4K Ultra HDR photos and video, HDR Flex effects, and built-in Dolby Vision.
6.83-inch AMOLED display: 1.5K resolution (2,408 × 1,506 pixels), 450 ppi, a 144 Hz refresh rate (great for gaming), and 5,000 nits of peak brightness.
Protection: Corning Gorilla Glass 7i with improved drop and scratch resistance.
Processor: Snapdragon 7 Series Gen 4; CPU 27% faster and GPU 30% more powerful than the previous generation; AI capabilities up 65%.
Memory and storage: LPDDR5X RAM and UFS 3.1 storage with high read and write speeds.
Battery: 5,080 mAh, 50 W fast charging, and more than 17 documented hours of mixed use.
Software: Nothing OS 4.1 on Android 16, with AI Dashboard for controlling AI features, Essential AI for calendar and daily-life organization, Essential Search (instant cross-platform access), Essential Memory (personalization based on your activity), Playground (no-code app creation), and Essential Space (cross-platform cloud sync).
Price and availability: the official reveal is scheduled for March 18, 2026. Vincent confirms he was invited to the event but has a scheduling conflict; he hopes to receive review units.

"Nothing's transparent design is not just aesthetics; it is philosophy. They show what every other brand hides. It is a statement about privacy and accessibility." — Vincent Quezada

Camera tests with the Honor Magic 8 Lite

Vincent shares his camera tests with the Honor Magic 8 Lite from a weekend in Chapultepec (Mexico City). His conclusions are clear: photography is excellent; video is acceptable but shows stabilization limits at maximum zoom. The Honor's battery lasted from Sunday to Friday with 82% remaining at recording time, something Vincent calls a "marvel" against the competition. Fast charging also impresses: from 15% to 80% in under 30 minutes.

MediaTek M90: the first 5G chip with built-in satellite connectivity

MediaTek unveiled the M90, the first 5G mobile chip with satellite connectivity integrated from the factory. It lets devices reach networks like Starlink Mobile even without terrestrial cellular infrastructure. In critical contexts — earthquakes, armed conflicts, remote rural areas — this hybrid 5G‑satellite connectivity is survival infrastructure, not a technological luxury.

Why is satellite connectivity critical?
Vincent shares direct evidence: during 2026 seismic-alert drills and real earthquakes in Mexico, only two of his four phones received the emergency alert. The ones with always-on Wi‑Fi and satellite-capable chips caught the signal; the others did not. The conclusion is unambiguous: connectivity redundancy can literally save lives.

Strategic use cases: military communications independent of compromised civilian carriers, precise navigation in regions without cell towers, data transmission for autonomous vehicles on remote highways, and emergency alerts in seismic zones or areas under attack.
Geopolitical implication: governments and security forces can operate independently of national connectivity monopolies, and citizens in conflict zones can communicate without carrier censorship.
Speed: not the fastest (latency is higher than terrestrial 5G), but it guarantees connectivity where no viable alternative exists.

"Satellite connectivity is not a luxury; it is critical survival infrastructure. If you didn't get the earthquake alert because your phone lacked redundancy, the technology failed." — Vincent Quezada

Motorola abandons traditional Android: betting on GrapheneOS

Motorola officially announced the end of its standard-Android device line and its migration to GrapheneOS, a privacy-obsessed operating system. GrapheneOS enforces extreme, granular isolation: a messaging app cannot access the microphone, camera, or location unless the user explicitly authorizes it in each session. The decision answers growing corporate demand for phones resistant to mass surveillance, cyberattacks, and data exfiltration. The target market is multinational corporations, governments, journalists in risky contexts, and highly privacy-conscious users.

GrapheneOS advantages: strict per-app isolation, granular permissions that expire per session, resistance to corporate or government backdoors, and faster security updates than AOSP Android.
Drawbacks: app fragmentation, limited Google Play Services compatibility, a less mature ecosystem, and a steeper learning curve for non-technical users.
Estimated price: not officially revealed, but a premium of 15–20% over standard Android models is expected.

"Open Android is powerful but vulnerable. GrapheneOS is locked-down, paranoid, privacy-first Android. The choice depends on whether you value convenience or absolute control of your data more." — Pablo Berruecos

Linux phones: verifiable open source and audited security

Several manufacturers showed prototypes of fully Linux-based phones, initially launching only in Europe. Linux offers full source-code transparency, constant community auditing, and natural resistance to corporate or government backdoors. Although the market is, for now, limited to Europe by strict GDPR regulation, projections point to global expansion around 2027.

Key advantages: 100% verifiable open source, permanent community security auditing, no hidden corporate telemetry, and user-controlled updates.
Main challenges: enormous app fragmentation, near-zero Google Play Store compatibility, a less mature app ecosystem, and interfaces less polished than Android or iOS.
Público objetivo: gobiernos europeos con requisitos de soberanía digital, periodistas de investigación, disidentes políticos y profesionales de sectores de seguridad crítica (finanzas, defensa, salud). Otros lanzamientos destacados del Mobile World Congress 2026 Smartphones con innovación radical en diseño y modularidad Honor Robot Phone: cámara de 200 megapíxeles montada en un brazo gimbal motorizado que se despliega desde el chasis, permitiendo ángulos de captura profesionales imposibles en teléfonos convencionales (autorretratos sin distorsión, videografía con estabilización tipo cine, panorámicas sin cortes digitales). Motorola Razr y Edge (FIFA World Cup 26 Collection): ediciones especiales con logotipo oficial del torneo, interfaz personalizada del evento y colores temáticos. Xiaomi 17 Ultra: presentación europea con especificaciones de gama alta, precio por anunciar pero competitivo frente al Samsung Galaxy S26 Ultra. Nothing Phone 4A: versión más accesible del Phone 4 con colores llamativos (destaca el rosa metálico) y un Glyph Matrix reducido pero funcional. Unihertz Titan Elite 2: teclado físico completo (nostalgia BlackBerry) en un formato moderno con Android 16. Vivo X300 Ultra: cámara de 200 megapíxeles y lanzamiento global fuera de China, la primera vez que Vivo lleva un buque insignia de este tipo a mercados occidentales. Tecno Atom (modular magnético): sistema de accesorios magnéticos intercambiables inspirado en los antiguos Moto Mods (proyectores, cámaras adicionales, baterías extendidas) sin sacrificar portabilidad diaria. Tecno Power Neon: incorpora iluminación neón real usando tecnología de gas inerte de baja tensión; diseño retrofuturista cyberpunk; primer teléfono con neón físico desde 2003. Legion Gold Fold (concepto): teléfono plegable centrado en videojuegos, con pantalla de 240 Hz y gatillos ultrasónicos integrados. 
Laptops y tablets con pantallas modulares e IA integrada Lenovo ThinkBook módulo IPC: puertos intercambiables magnéticos para conectar una segunda pantalla portátil; extensión dinámica del espacio de trabajo sin cables. Lenovo Yoga Book Pro D: doble pantalla con visualización 3D sin necesidad de gafas de realidad virtual, productividad multitarea reforzada y reconocimiento de gestos en el aire. Asus VivoBook Pad XPS: tablet estilo laptop con pantalla OLED más grande (15,6 pulgadas) y teclado mecánico desmontable mejorado. Chips y conectividad avanzada: preparación para 6G Qualcomm FastConnect 8800: módulo Wi‑Fi 7 con IA integrada para optimizar el ancho de banda automáticamente según el tipo de contenido. Qualcomm X105 5G: módem un 15% más rápido, un 20% más pequeño y un 30% más eficiente que el X100, pensado como puente hacia 5G Advanced (5G‑A). Snapdragon Wear Elite: chip orientado a wearables y robótica, con procesamiento de baja latencia (por debajo de 10 ms), ideal para relojes inteligentes, audífonos con IA y robots de servicio. Samsung y la pantalla anti‑espionaje Samsung presentó una tecnología de pantalla que impide que las personas situadas a los lados del usuario vean el contenido. La innovación cambia la forma en que los píxeles emiten luz: se coloca un “aro óptico” alrededor de cada píxel que nubla la imagen cuando se observa desde ángulos laterales. Desde el frente, la imagen es perfectamente clara; desde cualquier otro ángulo, se ve borrosa e ilegible. “Esto resuelve el problema de privacidad en transporte público, oficinas compartidas y aeropuertos. Finalmente puedes trabajar con información sensible sin preocuparte de quién mira por encima de tu hombro”. — Pablo Berruecos Conclusión parcial. El Mobile World Congress 2026 consolidó privacidad, seguridad y conectividad satelital como pilares no negociables de la telefonía móvil. 
Nothing Phone 4 democratiza el diseño transparente; MediaTek integra satelital en chips 5G; Motorola apuesta por GrapheneOS; Europa lidera con teléfonos Linux. La pregunta ya no es “qué tan rápido es tu teléfono”, sino “qué tan privado y resiliente es”. Robots humanoides y audífonos inteligentes: la IA se vuelve física El Mobile World Congress 2026 no giró solo en torno a teléfonos. La inteligencia artificial se materializó en hardware físico: robots humanoides capaces de bailar moonwalk, audífonos que analizan la geometría del canal auditivo para prevenir pérdida de audición, dispositivos para mascotas con llamadas bidireccionales mediante gestos y gafas de realidad extendida con traducción en tiempo real. Vincent y Pablo exploran estas innovaciones con mirada crítica. Honor Robot Humanoid: bípedo capaz de bailar y servir Honor presentó un robot humanoide bípedo completamente funcional, capaz de bailar (incluyendo un moonwalk que se volvió viral), mantener el equilibrio en superficies irregulares y ejecutar tareas de servicio básicas. Pablo recuerda un momento particularmente comentado: un robot humanoide propinando un “golpe bajo” a un boxeador durante una demostración, probablemente por un error de calibración, que generó memes instantáneos. Capacidades motoras: caminar de forma estable, correr a baja velocidad, subir escaleras y bailar coreografías preprogramadas. Casos de uso previstos: servicio hotelero, asistencia en hospitales, limpieza industrial y entretenimiento en eventos. Limitaciones actuales: velocidad de procesamiento de IA para decisiones complejas, autonomía de batería de entre cuatro y seis horas en operación continua y costo prohibitivo para el consumidor final (por encima de 50,000 dólares). PetFoam: comunicación bidireccional para mascotas PetFoam es un dispositivo que permite a las mascotas “llamar” a sus dueños mediante gestos reconocidos por IA. Por ejemplo, un perro que rasca un sensor específico puede activar una videollamada al dueño. 
Este, a su vez, puede responder con voz, mientras la mascota ve la imagen en una pequeña pantalla integrada. El caso de uso central es claro: mascotas en una posible emergencia (heridas, atrapadas) pueden alertar sin que haya intervención directa de otra persona. Google Iris XR: gafas de realidad extendida con traducción simultánea Google presentó el prototipo Iris XR, unas gafas de realidad extendida —no realidad virtual completa— con traducción en tiempo real integrada mediante IA. Sus casos de uso incluyen viajes internacionales, reuniones multilingües y accesibilidad para personas sordas (con subtítulos en tiempo real de las conversaciones). De momento no tienen fecha de lanzamiento comercial y solo están disponibles en demos controladas del MWC. Audífonos inteligentes que analizan tu oído: riesgos y beneficios Los audífonos evolucionan de meros accesorios pasivos a dispositivos de bioacústica avanzada. En el MWC 2026 se mostraron modelos capaces de analizar la geometría única del canal auditivo del usuario para ajustar de forma dinámica la cancelación de ruido, la ecualización personalizada y la exposición a decibeles. Esto crea un perfil acústico único por oído, minimizando la fatiga auditiva acumulativa y el riesgo de pérdida de audición permanente. Características técnicas de estos audífonos Cancelación de ruido adaptativa: detecta frecuencias específicas del entorno (motor de autobús, viento, multitudes, maquinaria industrial) y las atenúa selectivamente sin aislar por completo. Medición de decibeles en tiempo real: emite alertas visuales o hápticas si el volumen excede los 85 dB durante más de 30 minutos, siguiendo el límite seguro sugerido por la OMS. Análisis de la forma del oído: ajusta la presión en el canal auditivo y modifica el ancho de banda según la morfología individual, reduciendo la fatiga en usos prolongados de más de ocho horas diarias. 
Ecualización personalizada: compensa las deficiencias auditivas naturales de cada usuario en determinadas frecuencias. Riesgos para la salud auditiva: la presión en el tubo de Eustaquio Vincent advierte sobre un riesgo poco mencionado por los fabricantes: la cancelación de ruido total crea un sello hermético que genera presión en el canal auditivo. Esta presión activa el tubo de Eustaquio, responsable de regular la presión en el oído medio. El uso prolongado con sellado hermético puede: Comprometer la capacidad natural del oído para regular la presión (similar a lo que ocurre en un avión). Crear dependencia de una presión artificial para “escuchar correctamente”. Generar fatiga auditiva acumulativa por exceso de vibraciones internas. Aumentar el riesgo de infecciones de oído medio por retención de humedad. “La cancelación de ruido total te aísla del mundo. Una cancelación inteligente te mantiene conectado a tu entorno mientras disfrutas la música. La diferencia es literal entre la vida y un accidente”. — Vincent Quezada Caso práctico en Chapultepec: ceguera auditiva y casi choque Pablo cuenta una experiencia personal: caminaba en Chapultepec, en Ciudad de México, con audífonos con cancelación activa total. No escuchó a una persona que le gritaba para evitar un choque. Cuando finalmente la vio, ya era tarde y terminaron chocando. Reflexiona que, si hubiera estado en bicicleta y no escuchara la campanilla del trenecito turístico —que avisa su paso—, podría haber frenado de golpe y causar un accidente. Su recomendación es clara: nunca uses cancelación de ruido total en espacios públicos como calles, ciclovías o transporte. Actívala solo en entornos controlados y seguros (oficina, casa, avión). Mantén siempre un nivel medio de cancelación que permita escuchar alertas críticas del entorno (claxon, sirenas, gritos de advertencia). “Tengan cuidado. Si vas en el camión o en transporte público y te toca sentarte atrás del motor, el ruido se vuelve insoportable. 
Los filtros te dejan solo con la música y con el entorno realmente importante. Pero si te aíslas por completo, no sabes si alguien te está alertando de un peligro real”. — Pablo Berruecos Alianzas estratégicas hacia 6G: Nokia, NTT, Vodafone y más El MWC 2026 no solo presentó dispositivos, sino alianzas estratégicas que definen la ruta hacia un 6G nativo en inteligencia artificial. Nokia, NVIDIA, NTT, NTT Docomo, Vodafone, BT, Elisa y otros operadores anunciaron colaboraciones para adoptar tecnologías AI‑RAN (inteligencia artificial en redes de acceso radio) que mejoran el rendimiento de la red y soportan el crecimiento exponencial de la IA móvil. ¿Qué es 6G y cuándo llegará? Vincent y Pablo aclaran una confusión común: 5G Advanced (5G‑A) no es una nueva generación, sino un refinamiento del 5G existente con más velocidad, menor latencia y mejor eficiencia energética. El verdadero salto generacional será 6G, proyectado para 2030‑2032 según el consenso de los operadores presentes en el MWC. Características esperadas de 6G: velocidades teóricas 100 veces más rápidas que 5G (hasta 1 Tbps), latencias de menos de 0,1 ms (frente a 1 ms en 5G), conectividad híbrida 5G‑satelital como estándar, orquestación de IA de forma nativa en la red y uso de fotónica óptica para reducir el consumo energético. Infraestructura necesaria: inversión estimada de 100,000 millones de euros a nivel global, renovación completa de torres celulares e integración de computación cuántica en los núcleos de red. Casos de uso diferenciales: vehículos autónomos de nivel 5 (sin intervención humana), cirugías remotas en tiempo real con robótica, realidad extendida persistente (un metaverso funcional) y ciudades inteligentes con millones de sensores de IoT sincronizados. “6G no será mejor solo por ser 6G. Será mejor porque será inteligente, consciente del contexto y capaz de auto‑optimizarse en tiempo real sin intervención humana”. 
— Vincent Quezada Financiamiento y fotónica óptica: la apuesta de NTT Group AWS anunció la expansión de su infraestructura en mercados emergentes (India, Indonesia, Nigeria). Vodafone, la GSMA y otros organismos de telecomunicaciones aseguraron financiamiento de hasta 100 millones de euros específicamente para el desarrollo de estándares 6G con IA integrada desde el diseño. Esta inversión señala un cambio: actores privados financian estándares que antes estaban bajo control casi exclusivo de gobiernos. Por su parte, NTT Group (Japón) presentó sus avances en fotónica óptica y redes ópticas inalámbricas (ION: Innovative Optical and Wireless Network). El objetivo es reducir el consumo energético de los centros de datos, disparado por el uso intensivo de inteligencia artificial. Entre los proyectos destacados se encuentran: Convergencia fotónico‑electrónica: mejora la eficiencia energética de los centros de datos hasta un 60% respecto a la electrónica tradicional. Computación cuántica óptica: cálculos a gran escala con menor espacio físico, más velocidad y menores costes a largo plazo. Infraestructura resiliente con IA: redes autorreparables que detectan y resuelven fallos sin intervención humana. Ya no se trata solo de lanzar productos, sino de redefinir cómo se integran telecomunicaciones, movilidad y tecnología para sostener la explosión de la IA sin colapsar redes eléctricas a nivel global. Conclusión general: hacia una tecnología más consciente El episodio del 6 de marzo de 2026 captura un momento bisagra. 
La inteligencia artificial local (CoPaw) permite privacidad sin sacrificar productividad; GPT‑5.4 amplía el contexto a niveles impensables hace apenas un año; la MacBook Neo democratiza el acceso a macOS; el conflicto Irán‑Israel muestra cómo la desinformación generada por IA paraliza la comprensión pública mientras la censura selectiva oculta la realidad; y el Mobile World Congress 2026 consagra la privacidad, la seguridad satelital y el 6G como pilares del futuro móvil. Motorola abandona Android por GrapheneOS. Llegan teléfonos con Linux a Europa. MediaTek integra la conectividad satelital en chips 5G. Audífonos inteligentes analizan la geometría auditiva. Robots humanoides bailan moonwalk. Nokia y NVIDIA sientan las bases para 6G. De forma simultánea, la geopolítica y la desinformación revelan que una IA sin restricciones éticas se convierte en arma de control masivo. El desafío de 2026 no es tecnológico, sino humano: elegir entre la conveniencia monitoreada y la privacidad consciente. Las alianzas hacia 6G establecerán quién controla la infraestructura digital del planeta. La censura en redes sociales demuestra que la verdad es tan escasa como la paz. Y herramientas como CoPaw ofrecen una alternativa: control total de tus datos sin depender de corporaciones dispuestas a negociar su ética a cambio de contratos militares. Escucha el episodio completo en One Digital y únete a la conversación con los hashtags #PodcastONE, #OneDigital y #MWC2026. El cargo Podcast ONE: 6 de marzo de 2026 apareció primero en OneDigital.

Technovation with Peter High (CIO, CTO, CDO, CXO Interviews)
The Thinking Machine: How Jensen Huang Won the GPU War for NVIDIA

Technovation with Peter High (CIO, CTO, CDO, CXO Interviews)

Play Episode Listen Later Mar 5, 2026 55:24


In this episode of Technovation, Peter High speaks with Stephen Witt, award-winning journalist and author of The Thinking Machine, which has been named Business Book of the Year by the Financial Times. Witt writes about Jensen Huang's improbable journey from near-bankruptcy in the 1990s GPU wars to leading NVIDIA at the center of the AI revolution. Witt unpacks how NVIDIA defeated nearly 70 competitors, why Huang began targeting "zero-billion-dollar markets," and how CUDA became the backbone of modern AI.
Key highlights from the episode:
How investing in zero-billion-dollar markets created durable platform advantage
The emerging bull and bear cases for NVIDIA in robotics, edge computing, and global competition
The strategic lessons NVIDIA extracted from surviving a 70-competitor GPU market
Why operating with a constant "near-death" mindset shaped long-term execution discipline

Apple Coding Daily
Revolución con los M5 Pro y M5 Max, Apple reinventa su arquitectura de chips

Apple Coding Daily

Play Episode Listen Later Mar 4, 2026 33:00


Apple has done it again. But this time it isn't "more of the same with a better grade." On March 3, 2026 it introduced the M5 Pro and M5 Max chips in the new MacBook Pros, and what's inside is the most important architecture change since the M1 arrived. We're not talking about more cores or a finer fabrication process. We're talking about rethinking from scratch how a chip is built. In this episode we take the Fusion Architecture apart piece by piece: what a die is, why splitting it in two changes the rules of the game, and what that implies for thermal dissipation, for manufacturing, and for the future of Apple Silicon. We talk about the Neural Accelerators built into each GPU core, the increased Neural Engine bandwidth, the M5 Max's 614 GB/s of memory bandwidth, and why that matters more than GHz when we're talking about local AI. And we make the NVIDIA comparison everyone makes but almost nobody makes well: CUDA vs MLX, H100 vs M5 Max, datacenter vs backpack. No flag-waving. With real numbers.

Reportaże | Radio Katowice
Reportaż. Wiara czyni cuda czyli Góral musi tańczyć

Reportaże | Radio Katowice

Play Episode Listen Later Mar 3, 2026 13:44


Mr. Paweł from Milówka is a highlander game for anything ("for the dance and the rosary"): he dances, sings, runs his own business, and looks after his family and highlander culture. A year ago, during a home renovation, he had what at first seemed to be a harmless accident. We invite you to listen to Agnieszka Loch's radio feature, produced by Jacek Kurkowski, titled "Wiara czyni cuda czyli Góral musi tańczyć" ("Faith works miracles, or the Highlander must dance").

Le Vieux Sage
Brahman est l'unique réalité

Le Vieux Sage

Play Episode Listen Later Mar 1, 2026 11:48


Chapter 14 of the Viveka Cuda Mani (The Crest-Jewel of Discrimination), titled "Brahman is the one reality." Bibliography: https://www.babelio.com/livres/Sankara-Le-plus-Beau-Fleuron-de-la-Discrimination--Viveka/165555 Music: Bruno Léger. Narration and production: Bruno Léger. Production: Les mécènes du Vieux Sage. May peace and love reign among all beings of the universe. OM Shanti, Shanti, Shanti.

MLOps.community
Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs

MLOps.community

Play Episode Listen Later Feb 24, 2026 85:49


March 3rd, Computer History Museum CODING AGENTS CONFERENCE, come join us while there are still tickets left. https://luma.com/codingagents
Chris Fregly is currently focused on building and scaling high-performance AI systems, writing and teaching about AI infrastructure, helping organizations adopt generative AI and performance engineering principles on AWS, and fostering large developer communities around these topics.
Performance Optimization and Software/Hardware Co-design across PyTorch, CUDA, and NVIDIA GPUs // MLOps Podcast #363 with Chris Fregly, Founder, AI Performance Engineer, and Investor
Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter
MLOps GPU Guide: https://go.mlops.community/gpuguide
// Abstract
In today's era of massive generative models, it's important to understand the full scope of AI systems' performance engineering. This talk discusses the new O'Reilly book, AI Systems Performance Engineering, and the accompanying GitHub repo (https://github.com/cfregly/ai-performance-engineering). This talk provides engineers, researchers, and developers with a set of actionable optimization strategies. You'll learn techniques to co-design and co-optimize hardware, software, and algorithms to build resilient, scalable, and cost-effective AI systems for both training and inference.
// Bio
Chris Fregly is an AI performance engineer and startup founder with experience at AWS, Databricks, and Netflix. He's the author of three (3) O'Reilly books, including Data Science on AWS (2021), Generative AI on AWS (2023), and AI Systems Performance Engineering (2025). He also runs the global AI Performance Engineering meetup and speaks at many AI-related conferences, including Nvidia GTC, ODSC, Big Data London, and more.
// Related Links
AI Systems Performance Engineering: Optimizing Model Training and Inference Workloads with GPUs, CUDA, and PyTorch, 1st Edition, by Chris Fregly: https://www.amazon.com/Systems-Performance-Engineering-Optimizing-Algorithms/dp/B0F47689K8/
Coding Agents Conference: https://luma.com/codingagents
~~~~~~~~ ✌️ Connect With Us ✌️ ~~~~~~~
Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore
Join our Slack community: https://go.mlops.community/slack
Follow us on X/Twitter @mlopscommunity (https://x.com/mlopscommunity) or LinkedIn (https://go.mlops.community/linkedin)
Sign up for the next meetup: https://go.mlops.community/register
MLOps Swag/Merch: https://shop.mlops.community/
Connect with Demetrios on LinkedIn: /dpbrinkm
Connect with Chris on LinkedIn: /cfregly
Timestamps:
[00:00] SageMaker HyperPod Resilience
[00:27] Book Creation and Software Engineering
[04:57] Software Engineers and Maintenance
[11:49] AI Systems Performance Engineering
[22:03] Cognitive Biases and Optimization / "Mechanical Sympathy"
[29:36] GPU Rack-Scale Architecture
[33:58] Data Center Reliability Issues
[43:52] AI Compute Platforms
[49:05] Hardware vs Ecosystem Choice
[1:00:05] Claude vs Codex vs Gemini
[1:14:53] Kernel Budget Allocation
[1:18:49] Steerable Reasoning Challenges
[1:24:18] Data Chain Value Awareness

Q-Media's On Demand
Desi Cuda from ECE 02.24.26

Q-Media's On Demand

Play Episode Listen Later Feb 24, 2026 4:53


Desi Cuda from East Central Energy stops by the Front Porch to discuss the Youth Tour.

Sharks Hockey Digest
Brodie Brazil's Teal Talk - John McCarthy

Sharks Hockey Digest

Play Episode Listen Later Feb 20, 2026 8:00


Barracuda Head Coach John McCarthy talks with Brodie Brazil about the Cuda season, impactful players, and looking ahead to the rest of the AHL season.

Sharks Hockey Digest
Cuda Confidential Alumni Check Up: Alex True

Sharks Hockey Digest

Play Episode Listen Later Feb 18, 2026 30:00


In this special alumni edition of Cuda Confidential, Barracuda voice catches up with Olympian and former Barracuda and Sharks' forward Alex True.

Cuda Confidential
Cuda Confidential Alumni Check Up: Alex True

Cuda Confidential

Play Episode Listen Later Feb 18, 2026 30:00


In this special alumni edition of Cuda Confidential, Barracuda voice catches up with Olympian and former Barracuda and Sharks' forward Alex True.

Common Denominator
What Made NVIDIA a $4.5T Company? Jensen Huang's Leadership | NVIDIA, AI & Long-Term Thinking

Common Denominator

Play Episode Listen Later Feb 16, 2026 5:06


In this episode of Common Denominator, I break down one of the most extraordinary leadership stories of our time: Jensen Huang and NVIDIA.
Over the last 36 months, NVIDIA has added roughly $100 billion in market cap per month, growing from a $300 billion company to nearly $4.5 trillion. But numbers like that don't happen by accident. They're the result of leadership.
In this episode, I explore what kind of leadership it actually takes to build a company like NVIDIA — and what we can all learn from Jensen Huang's 32-year tenure as CEO.
Here's what I dive into:
- Why leadership compounds over time
- The power of thinking in decades, not quarters
- Why betting early on AI, GPUs, and CUDA looked irrational — but wasn't
- How staying technically fluent at scale protects standards and speed
- Why calm is one of the most underrated leadership traits
- The difference between managing outcomes and managing direction
- How great companies become infrastructure the world can't function without
On Common Denominator, I always ask: what's the real force behind extraordinary outcomes? More often than not, it's leadership. Not the title — the substance.
Whether you're building a startup, leading a team, investing, or simply trying to lead yourself better, the lessons are the same: Think longer. Stay close to the work. Build for where the world is going. Don't let success dilute conviction.
Jensen Huang didn't just build NVIDIA. He demonstrated what leadership looks like in an era of exponential change. And to me, that's the real common denominator.
Like this episode? Leave a review here: https://ratethispodcast.com/commondenominator

The MAD Podcast with Matt Turck
Dylan Patel: NVIDIA's New Moat & Why China is "Semiconductor Pilled”

The MAD Podcast with Matt Turck

Play Episode Listen Later Feb 5, 2026 76:44


Dylan Patel (SemiAnalysis) joins Matt Turck for a deep dive into the AI chip wars — why NVIDIA is shifting from a "one chip can do it all" worldview to a portfolio strategy, how inference is getting specialized, and what that means for CUDA, AMD, and the next wave of specialized silicon startups.
Then we take the fun tangents: why China is effectively "semiconductor pilled," how provinces push domestic chips, what Huawei means as a long-term threat vector, and why so much "AI is killing the grid / AI is drinking all the water" discourse misses the point.
We also tackle the big macro question: capex bubble or inevitable buildout? Dylan's view is that the entire answer hinges on one variable — continued model progress — and we unpack the second-order effects across data centers, power, and the circular-looking financings (CoreWeave/Oracle/backstops).
Dylan Patel
LinkedIn - https://www.linkedin.com/in/dylanpatelsa/
X/Twitter - https://x.com/dylan522p
SemiAnalysis
Website - https://semianalysis.com
X/Twitter - https://x.com/SemiAnalysis_
Matt Turck (Managing Director)
Blog - https://mattturck.com
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck
FirstMark
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap
(00:00) - Intro
(01:16) - Nvidia acquires Groq: A pivot to specialization
(07:09) - Why AI models might need "wide" compute, not just fast
(10:06) - Is the CUDA moat dead? (Open source vs. Nvidia)
(17:49) - The startup landscape: Etched, Cerebras, and 1% odds
(22:51) - Geopolitics: China's "semiconductor-pilled" culture
(35:46) - Huawei's vertical integration is terrifying
(39:28) - The $100B AI revenue reality check
(41:12) - US Onshoring: Why total self-sufficiency is a fantasy
(44:55) - Can the US actually build fabs? (The delay problem)
(48:33) - The CapEx Bubble: Is $500B spending irrational?
(54:53) - Energy Crisis: Why gas turbines will power AI, not nuclear
(57:06) - The "AI uses all the water" myth (Hamburger comparison)
(1:03:40) - Circular Debt? Debunking the Nvidia-CoreWeave risk
(1:07:24) - Claude Code & the software singularity
(1:10:23) - The death of the Junior Analyst role
(1:11:14) - Model predictions: Opus 4.5 and the RL gap
(1:14:37) - San Francisco Lore: Roommates (Dwarkesh Patel & Sholto Douglas)

Sharks Hockey Digest
Cuda Confidential: Fil the Thrill

Sharks Hockey Digest

Play Episode Listen Later Feb 4, 2026 25:00


In the latest episode of Cuda Confidential, Barracuda voice Nick Nollenberger catches up with second-year center Filip Bystedt to discuss his AHL All-Star nod, breakout sophomore season, and more.

Cuda Confidential
Cuda Confidential: Fil the Thrill

Cuda Confidential

Play Episode Listen Later Feb 4, 2026 25:00


In the latest episode of Cuda Confidential, Barracuda voice Nick Nollenberger catches up with second-year center Filip Bystedt to discuss his AHL All-Star nod, breakout sophomore season, and more.

Get Out N Drive Podcast
Is AI Ruining The Automotive Industry and Buying Cars Online?

Get Out N Drive Podcast

Play Episode Listen Later Feb 3, 2026 26:02


Is AI ruining the automotive industry? The guys discuss the current climate of AI and its effects on what we see and can trust online.
Buy the guys some guzzoline! https://buymeacoffee.com/getoutndrive
The Get Out N Drive Podcast is Fueled By AMD ~ AMD: More Than Metal https://www.autometaldirect.com/
Visit the AMD Garage ~ Your one-stop source for high-quality body panels for your restoration: https://www.autometaldirect.com/amdgarage
For all things Get Out N Drive, cruise on over to the Get Out N Drive website: https://getoutndrive.com/
Be sure to follow GOND on social media!
GOND Website: https://getoutndrive.com/
IG: https://www.instagram.com/getoutndrivepodcast/
X: https://x.com/getoutndrivepod
FB: https://www.facebook.com/Get.Out.N.Drive.podcast
YouTube: https://www.youtube.com/@getoutndrive
Recording Engineer: Paul Meyer
Subscribe to the Str8sixfan YouTube Channel: @Str8sixfan
Join our fb group to share pics of how you Get Out N Drive: https://www.facebook.com/groups/getoutndrivepodcast/
Follow Jason on IG: https://www.instagram.com/oldecarrguy/
Follow Jason on fb: https://www.facebook.com/oldecarrguy
Subscribe to the OldeCarrGuy YouTube Channel: @OldeCarrGuy
Follow John on IG: https://www.instagram.com/customcarnerd/
Sign Up and Learn more about National Get Out N Drive Day: https://nationalgetoutndriveday.com/
Music Credit: Item Title: The Rockabilly; Author: LoopsLab; Item URL: https://audiojungle.net; Item ID: 25802696; Licensee: Get Out N Drive Podcast; Purchase Date: 2022-09-07 22:37:20 UTC
Support the show: https://buymeacoffee.com/getoutndrive

Sharks Hockey Digest
Brodie Brazil's Teal Talk - Cam Lund

Sharks Hockey Digest

Play Episode Listen Later Feb 3, 2026 8:57


Brodie Brazil sits down with San Jose Barracuda forward Cam Lund to talk about his season with the Cuda, his teammates, and more.

Dev Interrupted
A constitution for AI, breaking dark flow, and open source as a moat?

Dev Interrupted

Play Episode Listen Later Jan 30, 2026 23:03


In this Friday Deploy, Andrew and Ben dive into the viral Moltbot (now OpenClaw) phenomenon and Steve Yegge's Software Survival 3.0 essay, debating how SaaS companies can build moats in an era of token-constrained engineering. They also explore the concept of "Dark Flow" - a deceptive state where vibe coding feels productive but hides accumulated tech debt - and break down Anthropic's newly released constitution for Claude. Finally, the team discusses a Reddit user's claim to have ported CUDA to AMD in 30 minutes and shares a fascinating breakdown of podcast listening data.
LinearB: The AI productivity platform for engineering leaders
Follow the show:
Subscribe to our Substack
Follow us on LinkedIn
Subscribe to our YouTube Channel
Leave us a Review
Follow the hosts: Follow Andrew, Follow Ben, Follow Dan
Follow today's stories:
OpenClaw
Software Survival 3.0
Breaking the Spell of Vibe Coding
Claude's new constitution
Claude Code Has Managed to Port NVIDIA's CUDA Backend to ROCm
My Top 25 Podcast Episodes & Interviews from 2025 by IPM (Insights Per Minute)
OFFERS
Start Free Trial: Get started with LinearB's AI productivity platform for free.
Book a Demo: Learn how you can ship faster, improve DevEx, and lead with confidence in the AI era.
LEARN ABOUT LINEARB
AI Code Reviews: Automate reviews to catch bugs, security risks, and performance issues before they hit production.
AI & Productivity Insights: Go beyond DORA with AI-powered recommendations and dashboards to measure and improve performance.
AI-Powered Workflow Automations: Use AI-generated PR descriptions, smart routing, and other automations to reduce developer toil.
MCP Server: Interact with your engineering data using natural language to build custom reports and get answers on the fly.

The Lost Debate
Pretti Killing, ICE Impunity, NVIDIA Dominance

The Lost Debate

Play Episode Listen Later Jan 28, 2026 79:37


Ravi begins by explaining why this conversation matters to him: although he's often skeptical of tech leaders, he sees Nvidia CEO Jensen Huang as a rare and genuinely consequential figure. He briefly reflects on the Pretti shooting in Minnesota and the broader questions it raises about accountability and transparency before turning to the interview. Ravi then speaks with author Stephen Witt about the forces that shaped Jensen's leadership—from a punishing childhood and Nvidia's early brush with failure to the high-risk CUDA bet that helped make modern AI possible. It's an absorbing portrait of leadership, obsession, and the building of one of the most important tech companies of our time.
Stephen Witt's Jensen Huang, Nvidia, and the World's Most Coveted Microchip
Ravi's Garbage Town
–––––
Leave us a voicemail with your thoughts on the show! 201-305-0084
Follow Ravi at @RaviMGupta
Notes from this episode are also available on Substack: https://thelostdebate.substack.com/
Read more from Ravi on Substack: https://realravigupta.substack.com
Follow The Branch at @thebranchmedia
Listen to more episodes of Lost Debate on Apple: https://podcasts.apple.com/us/podcast/the-lost-debate/id1591300785
Listen to more episodes of Lost Debate on Spotify: https://open.spotify.com/show/7xR9pch9DrQDiZfGB5oF0F
Listen to Where the Schools Went: https://thebranchmedia.org/show/where-the-schools-went/

Teal Town USA
Sherwood, Goalie Fight, Strong Push - The Pucknologists 263

Teal Town USA

Play Episode Listen Later Jan 26, 2026 167:01


The Sharks continue to make moves on and off the ice. On the ice, they win two of three games as their push for the playoffs continues, and off the ice, they made a trade to acquire Vancouver Canucks forward Keifer Sherwood. The Barracuda also split two games this week. Other topics include: The roster pinch continues with Chernyshov being sent to the Barracuda. The Sharks have one roster spot as Mukhamadullin and Kurashev near their return. A week away for trade flexibility for Skinner and Klingberg. State of the Sharks recap. Barracuda injuries continue to pile up. More creepy fans. Have your say in the YouTube Superchat on the Sharks, the Cuda, and everything hockey! Teal Town USA - A San Jose Sharks' post-game podcast, for the fans, by the fans! Subscribe to catch us after every Sharks game and our weekly wrap-up show, The Pucknologists! Check us out on YouTube and remember to Like, Subscribe, and hit that Notification bell to be alerted every time we go live!

TOK FM Select
Ukrainian power engineers are working miracles, and their experience will be invaluable for the EU [crisis summary]

TOK FM Select

Play Episode Listen Later Jan 22, 2026 11:26


Since January 9, around 600,000 people have left Kyiv because of the lack of electricity caused by Russian attacks on energy infrastructure. "A nationwide blackout is unlikely, but further severe episodes of power outages are realistic," comments Danylo Moiseienko, analyst for energy cooperation with Ukraine at Forum Energii.

Teal Town USA
Teal Tinted Rollercoaster - The Pucknologists 262

Teal Town USA

Play Episode Listen Later Jan 19, 2026 85:24


The Sharks' eastern roadswing continues to have its ups and downs. A win in Washington, followed by being handled by the Detroit Red Wings. The Barracuda were on a rollercoaster of their own as they split a weekend set in Tucson. Other topics include: Nick Leddy is finally on waivers. Sharks having an A+ season. Sharks players/prospects are plentiful in Pronman's midseason U23 list. Michael Misa can stay after a minor league trade. And more! Have your say in the YouTube Superchat on the Sharks, the Cuda, and everything hockey! Teal Town USA - A San Jose Sharks' post-game podcast, for the fans, by the fans! Subscribe to catch us after every Sharks game and our weekly wrap-up show, The Pucknologists! Check us out on YouTube and remember to Like, Subscribe, and hit that Notification bell to be alerted every time we go live!

Get Out N Drive Podcast
The history and evolution of 70s Boogie Vans

Get Out N Drive Podcast

Play Episode Listen Later Jan 19, 2026 35:03


The Get Out N Drive Podcast is fueled by AMD ~ AMD: More Than Metal. In this episode the guys talk about the birth, evolution, and resurgence of 70s-era Boogie Vans. Buy the guys some gas!
Visit the AMD Garage ~ your one-stop source for high-quality body panels.
For all things Get Out N Drive, cruise on over to the Get Out N Drive website, and be sure to follow GOND on social media: GOND Website | IG | X | FB | YouTube.
Join our fb group to share pics of how you Get Out N Drive. Follow Jason on IG and fb, and subscribe to the OldeCarrGuy YouTube Channel. Follow John on IG. Subscribe to the Str8sixfan YouTube Channel.
Recording Engineer, Paul Meyer
Sign up and learn more about National Get Out N Drive Day.
Music Credit: Licensor's Author Username: LoopsLab | Licensee: Get Out N Drive Podcast | Item Title: The Rockabilly | Item URL: https://audiojungle.ne... | Item ID: 25802696 | Purchase Date: 2022-09-07 22:37:20 UTC
Support the show

Algorithms + Data Structures = Programs
Episode 269: 2025 Double Retro

Algorithms + Data Structures = Programs

Play Episode Listen Later Jan 16, 2026 37:37


In this episode, Conor and Bryce conduct the annual 2025 retro!
Link to Episode 269 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Socials: ADSP: The Podcast: Twitter | Conor Hoekstra: LinkTree / Bio | Bryce Adelstein Lelbach: Twitter
Show Notes
Date Recorded: 2026-01-13
Date Released: 2026-01-16
VOTE FOR YOUR FAVORITE ADSP EPISODES OF 2025
ADSP Episode 259:

UiPath Daily
Nvidia Empire Expansion: $1B+ AI Startup Investments

UiPath Daily

Play Episode Listen Later Jan 7, 2026 13:48


Nvidia is expanding its empire with more than $1B in strategic AI startup investments, tightening its end-to-end control of the silicon stack worldwide. The portfolio fuels agentic platforms, video generation, and biotech, reshaping trillion-dollar sectors. Each bet reinforces the CUDA moat.
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Silicon Valley Tech And AI With Gary Fowler
Ending GPU Vendor Lock-In: True CUDA Portability for the AI Era with Michael Søndergaard

Silicon Valley Tech And AI With Gary Fowler

Play Episode Listen Later Jan 7, 2026 29:50


Join Michael Søndergaard, CEO and Founder of Spectral Compute, for a deep-dive conversation with Gary Fowler on one of the most critical infrastructure challenges in AI and high-performance computing: hardware lock-in within the CUDA ecosystem. Michael explains how Spectral Compute's SCALE toolchain — a compiler and CUDA-compatible libraries — allows a superset of CUDA code to compile directly to both NVIDIA and AMD GPUs with no intrinsic performance overhead. This breakthrough enables developers and enterprises to choose the best hardware for their workloads without rewriting code or sacrificing efficiency.

ChatGPT: OpenAI, Sam Altman, AI, Joe Rogan, Artificial Intelligence, Practical AI

Nvidia's $1B+ investments in AI startups power a network that energizes AI infrastructure worldwide. The massive portfolio spans agentic workflows, video AI, and enterprise reasoning, transforming entire industries. These strategic bets firewall the CUDA moat.
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

AI for Non-Profits
Nvidia Bets Big: $1B+ AI Unicorns Portfolio

AI for Non-Profits

Play Episode Listen Later Jan 7, 2026 13:48


Nvidia's aggressive $1B+ investments in AI unicorns firewall the CUDA ecosystem against open challengers. Spanning LLMs, autonomous driving, and biotech AI, the portfolio secures inference revenue streams globally and cements Nvidia's lead in trillion-parameter deployment.
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

The Elon Musk Podcast
Billion-Dollar AI Strategic Empire Builder: Nvidia

The Elon Musk Podcast

Play Episode Listen Later Jan 7, 2026 13:48


Nvidia is a strategic empire builder, with $1B+ invested in AI startups to dominate infrastructure worldwide. The massive portfolio covers autonomous driving, video AI, and enterprise copilots, reshaping markets across the board. Each strategic bet firewalls the CUDA moat.
Get the top 40+ AI Models for $20 at AI Box: https://aibox.ai
AI Chat YouTube Channel: https://www.youtube.com/@JaedenSchafer
Join my AI Hustle Community: https://www.skool.com/aihustle
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Teal Town USA
Streak Snap, Olympic Bound, Hometown Remix - The Pucknologists 260

Teal Town USA

Play Episode Listen Later Jan 5, 2026 153:31


With the holidays in the rearview, hockey has returned to a full schedule. Ian and AJ break down the week that was for the Sharks and Barracuda, Wennberg's new contract, and look ahead to the Olympics. We'll also cover these topics: The WJC continues, with updates on who's still in, who went home, and, most importantly, how the Sharks' prospects fared. The Spengler Cup is in the books. How did Pohlkamp and Muldowney do with the US Collegiate Selects team? Another round of new NHL jerseys. Wennberg extension. And more! Have your say in the YouTube Superchat on the Sharks, the Cuda, and everything hockey! Teal Town USA - A San Jose Sharks' post-game podcast, for the fans, by the fans! Subscribe to catch us after every Sharks game and our weekly wrap-up show, The Pucknologists! Check us out on YouTube and remember to Like, Subscribe, and hit that Notification bell to be alerted every time we go live!

Le Vieux Sage
The Atman Is Beyond the Five Sheaths

Le Vieux Sage

Play Episode Listen Later Jan 4, 2026 10:41


Chapter 13 of the Viveka Cuda Mani (The Crest-Jewel of Discrimination), entitled "The Atman is beyond the five sheaths." Bibliography: https://www.babelio.com/livres/Sankara-Le-plus-Beau-Fleuron-de-la-Discrimination--Viveka/165555 Music: Calm Whale (whaleloryb.bandcamp.com/track/waves-of-inspiration-new-moon-january-2024) Narration and direction: Bruno Léger. Production: the patrons of Le Vieux Sage. May peace and love reign among all beings of the universe. OM Shanti, Shanti, Shanti.

WFO Radio Podcast
J.R. Gray - NHRA Congruity Pro Mod World Champion, plus Alan Reinhart

WFO Radio Podcast

Play Episode Listen Later Dec 23, 2025 90:01


#nhra #promods #champion NHRA Congruity Pro Mod World Champion J.R. Gray joins WFO Radio to discuss his 2025 NHRA championship campaign. J.R. gives the details of the pressure-packed final race in Las Vegas that made him champion, and looks ahead to 2026. Alan Reinhart returns to WFO for a holiday check-in! He gives the details of his trip to explore drag racing in Australia. What's up with the Cuda? We'll find out...on the final WFO of 2025. 🚨 Don't miss out! Subscribe to WFO Radio for weekly NHRA updates, driver interviews, and exclusive motorsport content. Hit the bell 🔔 for notifications! MERCH: https://www.teepublic.com/stores/wfo-radio?ref_id=24678 PATREON: https://www.patreon.com/WFORadio APPLE: https://podcasts.apple.com/us/podcast/wfo-radio-podcast/id449870843?ls=1 SPOTIFY: https://open.spotify.com/show/0oo5mn0E3VmfhRCTHyLQIS

Sharks Hockey Digest
Cuda Confidential: On the Hunt

Sharks Hockey Digest

Play Episode Listen Later Dec 20, 2025 15:00


Barracuda voice Nick Nollenberger is joined by forward Jimmy Huntington to discuss joining the Sharks organization, nearly signing with San Jose out of junior, winning a Calder Cup, and more.


MIKE'D UP! with Mike DiCioccio
#280: Chuck Cuda — Betting On Myself: A True Story of Risk, Resilience, and Massive Success

MIKE'D UP! with Mike DiCioccio

Play Episode Listen Later Dec 15, 2025 50:22


This week Mike sits down with entrepreneur Chuck Cuda—a man who transformed a life-altering setback into a multimillion-dollar comeback. Chuck's journey begins with a single decision that landed him in prison after he refused to testify against friends in an illegal sports-betting case. But it was in a prison cell on Thanksgiving Day where he experienced the wake-up call that reshaped his entire life. From that moment, accountability became his superpower. Chuck rebuilt everything from the ground up. He went on to close over $200 million in commercial real estate deals, mastering the fundamentals of discipline, prospecting, and relationship-driven success. His ambition then carried him into the fast-evolving cannabis industry, where he expanded an operation from 5 to 22 licenses across three states—turning major financial challenges into profitability. Driven by purpose, Chuck also shares his passion for philanthropy, inspired by his father's battle with non-Hodgkin's lymphoma. Through the OPES Charitable Foundation, he has helped raise more than $3 million for cancer research. His daily affirmations, leadership philosophy, and belief in limitless potential offer a blueprint for anyone looking to rebuild their life or elevate their mindset.   IN THIS EPISODE:

Sharks Hockey Digest
Brodie Brazil's Teal Talk - Kasper Halttunen

Sharks Hockey Digest

Play Episode Listen Later Dec 15, 2025 11:25


Brodie Brazil talks with San Jose Barracuda forward Kasper Halttunen about his season with the Cuda.

雪球·财经有深度
3059. Google vs. NVIDIA: The Ultimate Showdown of AI's Second Half

雪球·财经有深度

Play Episode Listen Later Nov 29, 2025 7:43


Welcome to In-Depth Finance, produced by Xueqiu, China's leading integrated wealth-management platform for investment discussion and trading, where smart investors gather. Today's piece, "Google vs. NVIDIA: The Ultimate Showdown of AI's Second Half," comes from 闷得而蜜.

A look back: the internet era made Google and Facebook while Cisco fell from its pedestal. The cloud-computing era made Microsoft and Amazon while Intel fell. The mobile-internet era made Apple while Qualcomm fell. The IT industry has an iron law: every dollar of hardware must generate ten dollars of software and services revenue. This pattern implies that once the information industry enters its stable growth phase, software and service vendors' valuations far exceed those of hardware makers.

Today the AI wave is sweeping the globe: compute is the new era's "oil," and large models are infrastructure. Standing at the crossroads of the AI era, we are again witnessing a dramatic reshuffling among the giants. NVIDIA, riding the GPU's natural advantage in parallel computing, became the "shovel seller" of AI training, at one point surpassing Apple and Microsoft to become the world's most valuable company. Its success seems to break the law that "one dollar of hardware can't beat ten dollars of software." Google, a long-time AI evangelist, from acquiring DeepMind in 2014 to proposing the Transformer architecture in 2017 to sustained investment in large models and AI-native products, has looked cautious in commercialization and market valuation, and has even been accused of "rising early but arriving late." This raises the question: in AI's second half, will the hardware hegemon that controls the underlying compute keep charging ahead, or will the software giant with data, algorithms, and a closed-loop ecosystem come from behind? The Google-NVIDIA contest may be the key clue to the answer.

NVIDIA: the golden age of a compute monopoly. NVIDIA's rise was no accident. Ever since AlexNet used GPUs to accelerate deep learning in 2012, Jensen Huang has bet that AI is a fundamental shift in the computing paradigm. Over the past three years that judgment has been validated to the extreme: over 95% share of the training GPU market, a de facto monopoly; gross margins above 75%, far beyond traditional chip companies and approaching software levels; the CUDA ecosystem moat, with a million developers, thousands of optimized libraries, and tens of thousands of enterprises locked in at high switching cost; and orders booked into 2026, with Blackwell chips in short supply and customers including Microsoft, Meta, Amazon, Oracle, and every other cloud giant. More importantly, NVIDIA is no longer just selling chips: through the AI Enterprise software suite, NIM microservices, and DGX Cloud, it is evolving toward an "AI operating system," converting hardware advantage into platform-level control.

Google: an underrated full-stack AI capability. If NVIDIA is the AI era's supplier of water, power, and coal, Google is the one who designed the grid, set the standards, and generates and consumes its own electricity. Technical origins: the Transformer architecture (2017) has become the foundation of all large models, and the LaMDA, PaLM, and Gemini model families remain at the frontier. Custom silicon: TPUs have iterated to v5e/v5p/v6/v7, matching or even beating the B200 in internal training efficiency. Data loop: Search, YouTube, Gmail, Android, and Workspace generate massive volumes of real user interactions every day, fuel no outside company can replicate. Product integration: AI is deeply embedded in Search, Workspace, Android, and Cloud. More importantly, Google's business model is naturally suited to monetizing AI. Advertising remains the cash cow (roughly $65 billion of revenue in Q3 2024), supplying unlimited ammunition for AI investment. AI is not a cost but an efficiency tool: using Gemini to improve search results, auto-generate ad copy, and raise customer-service efficiency each directly saves billions in operating costs. And the cloud business has reached an inflection point: Google Cloud posted its first full-year profit, with AI services as the growth engine. The market underrates Google because it lacks NVIDIA's "sexy" 200% single-quarter growth, but Google's AI strategy is organic, steady, and scalable: it doesn't need to sell chips for a living; it makes AI the operating system of its entire ecosystem.

The second half: from selling shovels to mining gold. The theme of AI's first half was an infrastructure arms race: whoever had more GPUs could train bigger models, and NVIDIA benefited most. The theme of the second half is shifting to: who can use AI to create real value? Who can turn models into products, services, and profit? Several key turning signals: model homogenization is intensifying, as the performance gap between closed and open models narrows and merely stacking parameters stops working; inference cost becomes the bottleneck, since training happens once but inference runs hundreds of millions of times a day, making energy efficiency, edge deployment, and custom chips more important; and users want results, not technology, since enterprises care whether AI lifts customer-service conversion, not how many B200s were used. In this new phase Google's advantages begin to show: it owns a complete loop from chip to framework to model to application to user; it doesn't need to convince customers why they need AI, because its AI already serves billions of people daily; and its moat is not CUDA but user habit plus a data flywheel plus product integration. NVIDIA, by contrast, faces enormous pressure on its lofty valuation if it cannot evolve from "hardware supplier" to "AI platform operator." After all, no pure hardware company in history has sustained a price-to-sales ratio above 30 for long.

Conclusion: not a zero-sum game but a paradigm migration. Google and NVIDIA are not simply locked in a fight to the death. In fact, they represent two key links in the AI value chain: the infrastructure layer versus the application layer. But in AI's second half the boundary is blurring: NVIDIA is building software; Google is building chips; Microsoft buys NVIDIA chips, develops its own Maia, and integrates OpenAI; Amazon buys H100s while promoting Trainium. The real decider is not who sells more chips but who can build the flywheel of integrated hardware and software, cloud-edge coordination, and data-driven products. From this angle, Google's long-term certainty may be higher, because it wove AI into its genes long ago, while NVIDIA's glory still depends on the durability of AI capital spending and the impregnability of its ecosystem barriers. Investors might think of it this way: if you believe AI still has a frenzied round of infrastructure investment ahead, NVIDIA remains the first choice; if you believe AI is entering its value-realization phase, then companies like Google, with scenarios, data, and monetization power, stand at the true starting point of compounding. History tells us that the winners of an era are never the sharpest shovels but the people who strike gold and build the city.

Sharks Hockey Digest
The Buildup Game 19 at Seattle

Sharks Hockey Digest

Play Episode Listen Later Nov 15, 2025 20:00


On today's episode of The Buildup, we catch you up on some roster moves, and chat with the voice of The Barracuda, Nick Nollenberger, to get you caught up on all things Cuda.

The Acid Capitalist podcasts
Investing in the Blind Lap

The Acid Capitalist podcasts

Play Episode Listen Later Nov 12, 2025 75:59 Transcription Available


A faster lap by going blind sounds reckless until you hear Lando Norris say he drives better with the delta display switched off. That's the spark for a bigger idea we explore: acid capitalism, where imagination and shared beliefs move markets more than the neatest spreadsheet ever can. We start with the critique that more frequent shows dilute intrigue and use it to sharpen the mission: reduce noise, focus on decision design. From there we test how narrative beats decimals in places you wouldn't expect. An F1 franchise marked at six billion becomes a case study in brand economics. Nvidia stops looking like "just chips" and reveals its platform moat through CUDA and TSMC's world-class execution, while hyperscalers quietly stretch asset lives to boost reported earnings. Tesla's 20-quarter coil is not dead money; it's stored energy that can compress a future rerate, positive or cataclysmic, into a single year. Meanwhile, China's 10-year yield hovering below 2 percent acts as a simple, powerful tell for local equities. We also dig into mispriced complexity. Spirits makers face a brutal cobweb: whiskey needs a decade, tequila seven years, and changing demand punishes inventory mistakes for an age. That's why Diageo and peers trade near decade lows; not because the category is broken, but because time is. Pain today sets up tomorrow's scarcity. We map one pragmatic approach: harvest option income against depressed, range-bound leaders to grind down cost basis while you wait for pricing power to return. Along the way, we examine Bitcoin vs MicroStrategy premiums, joke about longevity supplements, and acknowledge the temptation to obsess over every decimal point. The takeaway is consistent: decide what to ignore. Turn off the dashboard that steals your attention, then do the simple, hard work: respect cash over optics, find moats that scale, and back visions that mobilise real capital. Enjoyed the ride?
Follow, share with a friend who loves markets with edge, and leave a review telling us what you'd switch off to see better.
Support the show
⬇️ Subscribe on Patreon or Substack for full episodes ⬇️
https://www.patreon.com/HughHendry
https://hughhendry.substack.com
https://www.instagram.com/hughhendryofficial
https://blancbleustbarts.com
https://www.instagram.com/blancbleuofficial
⭐⭐⭐⭐⭐ Leave a five star review and comment on Apple Podcasts!

In Wheel Time - Cartalk Radio
Mopar Muscle Cars under the Neon Lights

In Wheel Time - Cartalk Radio

Play Episode Listen Later Nov 11, 2025 30:02


Neon glows, steel shines, and the floor hums with V8 heartbeat as we step inside the Hemi Hideout for a guided tour through one of the most captivating Mopar collections in Texas. We move car by car, blending hard facts with honest reactions—why an aero-wild Superbird reshaped NASCAR, how a numbers-matching Super Bee earns reverence, and where a properly built restomod Cuda proves that drivability and drama can live under the same Shaker hood.We share the stories that make these machines feel alive. A violet 1970 Road Runner convertible exits a riverbed after 25 years and returns as a numbers-matching gem. A 1947 Dodge Power Wagon scores 996 out of 1000 and closes its doors with refrigerator solidity. A 1969 Coronet R/T convertible gets a 528 Hemi and triple carbs, while a Pink Panther Challenger and a chameleon '37 Plymouth coupe show how attitude and engineering come together in Pro Street form. Production numbers, engine codes, and rare color combos turn into small epiphanies—one Go Mango Super Bee is one of only 32 with its configuration, and just four exist in the registry today.Between spark and story, we widen the lens to community and the market. Charity open houses turn $10 tickets into doubled donations. Cruise nights, coffee meets, and holiday toy drives invite everyone to pull up and talk cars. And when talk shifts to tariffs, we unpack why a 25% headline rarely becomes a 25% price hike, explaining how transaction values, content rules, and dealer variability shape what buyers actually pay. It's a full-spectrum ride: history, hardware, community, and clarity.Love muscle car history, NASCAR lore, and the art of a great restomod? Tap play, then tell us your pick from the Hemi Hideout lineup. 
If this tour revs your engine, follow the show, share with a friend, and drop a review so more enthusiasts can find the ride.
Be sure to subscribe for more In Wheel Time Car Talk!
The Lupe Tortilla Restaurants - Lupe Tortilla in Katy, Texas
Gulf Coast Auto Shield - Paint protection, tint, and more!
Disclaimer: This post contains affiliate links. If you make a purchase, I may receive a commission at no extra cost to you.
---- -----
Want more In Wheel Time car talk any time? In Wheel Time is now available on Audacy! Just go to Audacy.com/InWheelTime wherever you are.
----- -----
Be sure to subscribe on your favorite podcast provider for the next episode of In Wheel Time Podcast, and check out our live multiplatform broadcast every Saturday, 10a - 12n CT, simulcasting on Audacy, YouTube, Facebook, Twitter, Twitch and InWheelTime.com.
In Wheel Time Podcast can be heard on your mobile device from providers such as: Apple Podcasts, Amazon Music Podcast, Spotify, SiriusXM Podcast, iHeartRadio podcast, TuneIn + Alexa, Podcast Addict, Castro, Castbox, YouTube Podcast and more.
Follow InWheelTime.com for the latest updates!
Twitter: https://twitter.com/InWheelTime
Instagram: https://www.instagram.com/inwheeltime/
https://www.youtube.com/inwheeltime
https://www.Facebook.com/InWheelTime
For more information about In Wheel Time Podcast, email us at info@inwheeltime.com

All TWiT.tv Shows (MP3)
Untitled Linux Show 227: Ancient Stack Tax

All TWiT.tv Shows (MP3)

Play Episode Listen Later Nov 3, 2025 105:55 Transcription Available


This week SUSE's SLES and Red Hat's RHEL are embracing AI in the form of MCP and CUDA support. FFMPEG scores a $100k donation, Pop_OS and Cosmic finally have a release date, and Unity is in need of help. Kodi 22 has an Alpha, Debian has a Systemd dustup, and Krita has landed HDR support. And there's a port of Linux to WASM, so you can run the kernel in your browser. Handy! For tips we have doxx for opening .docx in the terminal, a primer on absolute vs relative paths, whoami for grabbing the current username, and btrfs's scrub command for checking the local disk. You can find the show notes at https://bit.ly/4ovhsLG and have a great week! Host: Jonathan Bennett Co-Hosts: Rob Campbell, Jeff Massie, and Ken McDonald Download or subscribe to Untitled Linux Show at https://twit.tv/shows/untitled-linux-show Want access to the ad-free video and exclusive features? Become a member of Club TWiT today! https://twit.tv/clubtwit Club TWiT members can discuss this episode and leave feedback in the Club TWiT Discord.


The Road to Autonomy
Episode 342 | Autonomy Markets: Tesla's Robotaxi Scale Plan, NVIDIA's Autonomy Ambitions

The Road to Autonomy

Play Episode Listen Later Oct 25, 2025 40:04


This week on Autonomy Markets, Grayson Brulte and Walter Piecyk discuss Tesla's Q3 earnings call, NVIDIA's strategic partnership with Uber, and GM's surprising return to autonomy under Sterling Anderson's leadership.The conversation opens with Walt's firsthand insights from Tesla's Q3 2025 earnings call, where the company confirmed plans to remove safety attendants across “large parts” of Austin by year-end after accumulating 250,000 robotaxi miles.Tesla also announced 8–10 additional markets coming online by year-end, including Florida, Arizona, and Nevada, following the company's phased rollout playbook: safety-attended operations first, followed by fully autonomous service. Grayson projects more than 300 Model Y Robotaxis operating in Austin by mid-2026, potentially joined by 25–50 Cybercabs pending NHTSA exemptions.The discussion then turns to NVIDIA's newly announced partnership with Uber, which Grayson sees as signaling something much bigger than data sharing. He suggests NVIDIA could be positioning to acquire a leading autonomous driving developer such as Wayve, mirroring its CUDA strategy where software dominance separated it from competitors.His thesis: NVIDIA will ultimately own and license an autonomy stack across the industry, creating existential risk for startups dependent on its compute. Walt explores the market dynamics and potential conflicts that arise when your chip vendor becomes your competitor, while noting that NVIDIA's brand power could simultaneously validate the entire autonomy market.The week also brought news from GM, which re-entered the autonomy race by announcing a 2028 “hands-off, eyes-off” system debuting in the Cadillac Escalade IQ. 
Sterling Anderson confirmed GM's staged rollout plan: highways first, urban next, then full urban autonomy. Closing out the episode, Grayson and Walt debut the Foreign Autonomy Desk, covering Baidu Apollo's partnership with Swiss PostBus, WeRide and Uber's shuttle launch in Saudi Arabia, May Mobility's strategic investment from Grab for Southeast Asia expansion, and Waymo's effort to bring UK safety advocates to California for test rides ahead of its potential London launch.
Episode Chapters
0:00 Tesla Q3 2025 Earnings
14:58 NVIDIA's Autonomy Ambitions
20:59 Avride & Uber's Autonomy Investment Strategy
23:09 GM is Back in Autonomy
28:46 Waymo Begins Manually Testing at EWR (Newark Airport)
32:14 Foreign Autonomy Desk
Recorded on Friday October 24, 2025
--------
About The Road to Autonomy: The Road to Autonomy provides market intelligence and strategic advisory services to institutional investors and companies, delivering the insights needed to stay ahead of emerging trends in the autonomy economy™. To learn more, say hello (at) roadtoautonomy.com.
Sign up for the This Week in The Autonomy Economy newsletter: https://www.roadtoautonomy.com/ae/
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Sharks Hockey Digest
Cuda Confidential (10.24.25): Admirals Preview

Sharks Hockey Digest

Play Episode Listen Later Oct 24, 2025 30:00


In this episode of Cuda Confidential, Nick Nollenberger recaps the Barracuda's first three games and previews its two-game series with the Milwaukee Admirals by catching up with Ads voice Aaron Sims.

I Am Refocused Podcast Show
From Real Estate to Impact: How Chuck Cuda Built a Legacy of Innovation and Community

I Am Refocused Podcast Show

Play Episode Listen Later Oct 11, 2025 24:53


In this episode of I Am Refocused Radio, host Shemaiah Reed sits down with entrepreneur and commercial real estate powerhouse Chuck Cuda, founder of OPES Commercial Real Estate and CEO of Elevation Cannabis. As CEO of Elevation Cannabis, he expanded operations from five to 22 licenses across three states, proving his ability to grow and disrupt industries. But Chuck's success doesn't stop at business. In 2018, he founded the OPES Charitable Foundation, which has raised more than $3 million for causes like cancer research and autism awareness. He also serves on local boards — including the Leukemia and Lymphoma Society — and mentors young athletes through youth baseball, helping shape the next generation of leaders. This conversation dives deep into leadership, innovation, giving back, and what it really takes to build something that lasts. Become a supporter of this podcast: https://www.spreaker.com/podcast/i-am-refocused-radio--2671113/support. Thank you for tuning in to I Am Refocused Radio. For more inspiring conversations, visit IAmRefocusedRadio.com and stay connected with our community. Don't miss new episodes: subscribe now at YouTube.com/@RefocusedRadio

This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Recurrence and Attention for Long-Context Transformers with Jacob Buckman - #750

This Week in Machine Learning & Artificial Intelligence (AI) Podcast

Play Episode Listen Later Oct 7, 2025 57:23


Today, we're joined by Jacob Buckman, co-founder and CEO of Manifest AI to discuss achieving long context in transformers. We discuss the bottlenecks of scaling context length and recent techniques to overcome them, including windowed attention, grouped query attention, and latent space attention. We explore the idea of weight-state balance and the weight-state FLOP ratio as a way of reasoning about the optimality of compute architectures, and we dig into the Power Retention architecture, which blends the parallelization of attention with the linear scaling of recurrence and promises speedups of >10x during training and >100x during inference. We review Manifest AI's recent open source projects as well: Vidrial—a custom CUDA framework for building highly optimized GPU kernels in Python, and PowerCoder—a 3B-parameter coding model fine-tuned from StarCoder to use power retention. Our chat also covers the use of metrics like in-context learning curves and negative log likelihood to measure context utility, the implications of scaling laws, and the future of long context lengths in AI applications. The complete show notes for this episode can be found at https://twimlai.com/go/750.
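The windowed-attention idea the episode opens with is simple enough to sketch. Below is a toy, plain-Python single-head version (purely illustrative; the function name and shapes are this sketch's own, not Manifest AI's Vidrial or Power Retention code) in which each query position attends only to a fixed causal window of the most recent keys:

```python
import math

def windowed_attention(q, k, v, window):
    """Toy single-head attention: position i attends only to the `window`
    most recent key positions (itself included).
    q, k, v: lists of equal-length float vectors; returns a list of vectors."""
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        lo = max(0, i - window + 1)  # causal sliding window start
        # scaled dot-product scores against the in-window keys
        scores = [sum(qc * kc for qc, kc in zip(q[i], k[j])) / math.sqrt(d)
                  for j in range(lo, i + 1)]
        m = max(scores)
        w = [math.exp(s - m) for s in scores]  # numerically stable softmax
        z = sum(w)
        # weighted average of the in-window values
        out.append([sum(w[j - lo] * v[j][c] for j in range(lo, i + 1)) / z
                    for c in range(d)])
    return out
```

With `window` equal to the sequence length this reduces to ordinary causal attention; shrinking the window is what trades context reach for compute that grows linearly in sequence length rather than quadratically.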

We Study Billionaires - The Investor’s Podcast Network
TECH002: Jensen Huang & NVIDIA w/ Seb Bunney - Review of The Thinking Machine by Stephen Witt

We Study Billionaires - The Investor’s Podcast Network

Play Episode Listen Later Sep 24, 2025 67:38


Preston and Seb launch their tech book review series with a deep dive into The Thinking Machine, a book about NVIDIA and its CEO Jensen Huang. They explore NVIDIA's transformation from a gaming hardware company to a key player in AI, discussing CUDA, leadership strategy, robotics, and the speed of innovation. The episode ends with a preview of their next review, Empire of AI.

IN THIS EPISODE YOU'LL LEARN:
00:00 - Intro
05:29 - How NVIDIA transitioned from gaming GPUs to leading AI infrastructure
09:26 - Why CUDA was a turning point in GPU development for AI research
15:37 - The role of NVIDIA in enabling modern AI models, including transformers
19:55 - Jensen Huang's leadership style and strategic market thinking
20:14 - The significance of creating new markets versus competing in existing ones
24:44 - How NVIDIA trains robots in hyper-realistic digital environments
27:47 - The impact of LiDAR and simulation on robotics advancement
38:53 - Whether Jensen's success is due to luck, skill, or strategic foresight
50:30 - The meaning behind Jensen's "speed of light" principle
01:01:00 - What's coming next in the book review series, starting with Empire of AI

BOOKS AND RESOURCES
Related Book: The Thinking Machine: Jensen Huang, Nvidia, and the World's Most Coveted Microchip.
Seb's website and book: The Hidden Cost of Money.
Related books mentioned in the podcast.
Ad-free episodes on our Premium Feed.

NEW TO THE SHOW?
Join the exclusive TIP Mastermind Community to engage in meaningful stock investing discussions with Stig, Clay, Kyle, and the other community members.
Follow our official social media accounts: X (Twitter) | LinkedIn | Instagram | Facebook | TikTok.
Check out our Bitcoin Fundamentals Starter Packs.
Browse through all our episodes (complete with transcripts) here.
Try our tool for picking stock winners and managing our portfolios: TIP Finance Tool.
Enjoy exclusive perks from our favorite Apps and Services.
Get smarter about valuing businesses in just a few minutes each week through our newsletter, The Intrinsic Value Newsletter.
Learn how to better start, manage, and grow your business with the best business podcasts.

SPONSORS
Support our free podcast by supporting our sponsors: Simple Mining, HardBlock, AnchorWatch, Human Rights Foundation, LinkedIn Talent Solutions, Vanta, Unchained, Onramp, Netsuite, Shopify, Abundant Mines.

Learn more about your ad choices. Visit megaphone.fm/adchoices
Support our show by becoming a premium member! https://theinvestorspodcastnetwork.supportingcast.fm

All TWiT.tv Shows (MP3)
Untitled Linux Show 221: Cooperative Socialist Paradise

All TWiT.tv Shows (MP3)

Play Episode Listen Later Sep 21, 2025 70:28 Transcription Available


Redox is embracing Wayland, Ubuntu is supporting CUDA, and Fedora is introducing Fedora Forge. The eBPF Foundation has $100,000 worth of grant money to award, bcachefs works out DKMS packaging, and Mesa moves towards guidelines for AI code. Fedora 43 and Plasma 6.5 both hit beta this week, with releases coming soon. For tips, we have Semaphore UI for managing Ansible and other DevOps tools, wpctl set-profile for more WirePlumber management, and Terminus for gamifying command line learning. You can catch the show notes at https://bit.ly/3KdSukS and enjoy! Host: Jonathan Bennett Co-Hosts: Ken McDonald and Rob Campbell Download or subscribe to Untitled Linux Show at https://twit.tv/shows/untitled-linux-show Want access to the ad-free video and exclusive features? Become a member of Club TWiT today! https://twit.tv/clubtwit Club TWiT members can discuss this episode and leave feedback in the Club TWiT Discord.

Real Estate Espresso
Kansas City Infill with Chuck Cuda

Real Estate Espresso

Play Episode Listen Later Sep 20, 2025 11:21


Chuck Cuda is focused on redevelopment of deeply distressed inner city retail projects in the Kansas City area. The focus is on value creation and removal of blight. This is a strategy that could be viable in most US cities. To connect with Chuck and to learn more, visit https://opescre.com/ or email him directly at cuda@opescre.com. Also check out his new book "The Ego Strength" available on Amazon.

-----------
**Real Estate Espresso Podcast:**
Spotify: [The Real Estate Espresso Podcast](https://open.spotify.com/show/3GvtwRmTq4r3es8cbw8jW0?si=c75ea506a6694ef1)
iTunes: [The Real Estate Espresso Podcast](https://podcasts.apple.com/ca/podcast/the-real-estate-espresso-podcast/id1340482613)
Website: [www.victorjm.com](http://www.victorjm.com)
LinkedIn: [Victor Menasce](http://www.linkedin.com/in/vmenasce)
YouTube: [The Real Estate Espresso Podcast](http://www.youtube.com/@victorjmenasce6734)
Facebook: [www.facebook.com/realestateespresso](http://www.facebook.com/realestateespresso)
Email: [podcast@victorjm.com](mailto:podcast@victorjm.com)

**Y Street Capital:**
Website: [www.ystreetcapital.com](http://www.ystreetcapital.com)
Facebook: [www.facebook.com/YStreetCapital](https://www.facebook.com/YStreetCapital)
Instagram: [@ystreetcapital](http://www.instagram.com/ystreetcapital)