Podcasts about gtc

399 PODCASTS
990 EPISODES
44m AVG DURATION
5 WEEKLY NEW EPISODES
Jul 19, 2026 LATEST

POPULARITY (chart by year, 2019 to 2026)

Best podcasts about gtc

Simply Trade

45 episodes with gtc

Acting Up with GTC

39 episodes with gtc

Everyday AI Podcast – An AI and ChatGPT Podcast

14 episodes with gtc

TD Ameritrade Network

15 episodes with gtc

Laura's List

12 episodes with gtc

WGTD's The Morning Show with Greg Berg

16 episodes with gtc

The Insider Travel Report Podcast

13 episodes with gtc

Killander & Björk

21 episodes with gtc

GTC Podcasts

25 episodes with gtc

Girls Talk Comics

16 episodes with gtc

The AI Podcast

6 episodes with gtc

This Week in Machine Learning & Artificial Intelligence (AI) Podcast

5 episodes with gtc

The AI Breakdown: Daily Artificial Intelligence News and Discussions

5 episodes with gtc

The tastytrade network

10 episodes with gtc

All TWiT.tv Shows (MP3)

5 episodes with gtc

On The Tape

4 episodes with gtc

Good Luck High Five

4 episodes with gtc

Squawk on the Street

4 episodes with gtc

This Week in HPC

6 episodes with gtc

The SharePickers Podcast with Justin Waite

7 episodes with gtc

All CNET Video Podcasts (HD)

4 episodes with gtc

M?? | ??X??X??

4 episodes with gtc

All TWiT.tv Shows (Video LO)

5 episodes with gtc

Golf Talk Canada

8 episodes with gtc

Washington AI Network with Tammy Haddad

4 episodes with gtc

Radio Leo (Audio)

3 episodes with gtc

CNET News (HD)

4 episodes with gtc

PC Perspective Podcast

3 episodes with gtc

Tech Café

4 episodes with gtc

Trader Merlin

3 episodes with gtc

Algorithms + Data Structures = Programs

4 episodes with gtc

The Six Five with Patrick Moorhead and Daniel Newman

4 episodes with gtc

You Can't Make This Up Podcast

3 episodes with gtc

Hablando con Científicos - Cienciaes.com

3 episodes with gtc

????

5 episodes with gtc

Broken Silicon

2 episodes with gtc

Greater Than Code

5 episodes with gtc

The Full Nerd

2 episodes with gtc

Meadowbrook Magic

6 episodes with gtc

Tips For Guitar Playing Success

2 episodes with gtc

Auto Sausage

2 episodes with gtc

GEROS Health - Physical Therapy | Fitness | Geriatrics

2 episodes with gtc

NL Rallysport | BENE Servicepark | Rally Podcast

6 episodes with gtc

The Circuit

2 episodes with gtc

Will Brocker

3 episodes with gtc

Latest podcast episodes about gtc

304 – Building Successful Multi-Product Solutions with Hyperscalers and GSI’s

Ultimate Guide to Partnering™

Jul 19, 2026 47:12

Don’t Fade and Die in AI
Subscribe to our Newsletter: https://theultimatepartner.com/ebook-subscribe/
Check Out UPX: https://theultimatepartner.com/experience/
Matt Yanchyshyn, VP AWS Marketplace; Rekha Thangelapalita, Elastic GSI Leaders; Allison McFadden, Accenture AWS Leader; and James Kang of Nvidia join Ultimate Partner. In this panel discussion, leaders from Elastic, Accenture, Nvidia, and AWS dissect the urgent shifts in the ecosystem, emphasizing that partners must adapt to AI and agentic co-selling or risk fading away completely. The conversation explores the necessity of deep co-engineering, the power of multi-product solutions in the AWS marketplace, and how automated agents are now replacing traditional human sales pipeline progression. By embracing data readiness and strategic collaboration, organizations can survive the “token maxing” era, effectively scale their enterprise opportunities, and align with NVIDIA’s five-layer strategy to dominate the new cloud landscape.
https://youtu.be/zUkL4Wqsa68
Key Takeaways
AI agents will automate the majority of AWS partner co-selling attachments and opportunity progressions this year.
Partners who fail to embrace agentic workflows and automated governance face the existential risk of fading into obsolescence.
Successful multi-product offerings require a “blood to all organs” approach that benefits the client, the ISV, the GSI, and the hyperscaler simultaneously.
Nvidia’s “five-layer cake” model emphasizes that successful outcomes at the application layer automatically drive growth for all underlying infrastructure.
The “token maxing” phenomenon is forcing enterprises to seek cost-effective, open-model alternatives to scale their generative AI securely.
Integrating GSIs and ISVs on the AWS marketplace significantly increases enterprise deal sizes and long-term customer renewal rates.
If you're ready to lead through change, elevate your business, and achieve extraordinary outcomes through the power of partnership, this is your community. At Ultimate Partner® we want leaders like you to join us in the Ultimate Partner Experience – where transformation begins.
Key Tags: strategic collaboration agreement, data readiness engine, agentic co-sell, semantic layer, token maxing, five layer cake, accelerated computing platform, open models, cloud consumption, multi-product solutions, partner central agents, propensity data, automated opportunity progression, generative AI governance
Transcript: Matt Y and Panel Audio Podcast
[00:00:00] Vince Menzione: You have a choice. You can embrace them and figure it out and get governance and, and make your data available. Um, use the Partner Central agent, move to agentic co-sell, or you can fade and die.
[00:00:11] Vince Menzione: You can feel it happening. The ecosystem is shifting beneath us, the way hyperscalers are partnering, how AI is remaking the channel and what it means to win in 2026.
[00:00:22] Vince Menzione: Welcome to the Ultimate Partner Podcast. I’m Vince Menzione, your host. And each week I sit down with leaders at the intersection of technology, partnerships and outcomes. The voices shaping how ecosystems actually work. We talk about what’s real, what’s changing, and what it takes to lead in this era where the partner channel isn’t just part of the strategy.
[00:00:44] Vince Menzione: It is the strategy because
[00:00:46] Vince Menzione: being in the room changes everything. Let’s start.
[00:00:51] Vince Menzione: We’ve got some amazing leaders joining us. So I think probably for a little bit of context, maybe just start with Rekha. You can introduce yourself, your role and, uh, what, what you’ve been doing at Elastic. Yeah.
[00:01:03] Rekha Thangellapalli: Yeah, sounds great.
[00:01:04] Rekha Thangellapalli: Hi everyone. I’m Rekha and I lead GSI Alliances at Elastic.
Um, for the past 14 years, I’ve had the pleasure of building different kinds of partner ecosystems across companies such as SAP, MuleSoft, Salesforce, Coupa, and now Elastic. Um, I wanna thank Ultimate Partner and Vince for having us here today. Thank you and the panel of these incredible speakers for joining me on stage.
[00:01:31] Rekha Thangellapalli: Um, very excited for the conversation today.
[00:01:33] Vince Menzione: We love Elastic, and you’ve had some of your other leaders on stage at other events. As such, the quality of your leadership team is amazing. Thank you.
[00:01:42] Rekha Thangellapalli: I wholeheartedly agree.
[00:01:45] Allison McFadden: Excellent. Um, hello everyone. Allison McFadden. I lead our North America AWS practice at Accenture.
[00:01:52] Allison McFadden: Uh, I’ve been there for five years, and truth be told, it was my first partnership role, my first formal partnership role. Uh, so I can take some tips from all of you in the room here today. Prior to that, I was 21 years with IBM, and I got into partnerships because my last role at IBM was actually trying to build
[00:02:14] Allison McFadden: Linux business on the mainframe, and I had to have partners. I had to have partners to help me with workloads to run there. So I kind of learned, uh, trial by fire. But I’m excited for the conversation today. Excited to be in this room and excited to talk about what we’re doing with, uh, Elastic. Thank you.
[00:02:34] James Kang: Uh, my name is James Kang. Nice to see and meet everyone here. Vince, thank you for the opportunity. Thank you
[00:02:38] Vince Menzione: for being here.
[00:02:39] James Kang: Um, I’m with Nvidia, so I help manage the AWS partnership at Nvidia all up. Um, I guess fun fact, I’m former AWS and so I see a lot of very familiar faces here in the front row. Uh, former colleagues and then current friends.
[00:02:56] James Kang: And so, uh, looking forward to the conversation.
[00:02:59] Vince Menzione: Great.
Well, we’ll start with an easy tee-up. Matt, this is not directed to you, directed to the others. So what does a successful AWS partnership look like from your seat? So we’ll start with Rekha.
[00:03:09] Rekha Thangellapalli: Sure. So from an ISV perspective, I think we really are looking at three things.
[00:03:15] Rekha Thangellapalli: Uh, mutual investment, building together, and scaling together. So when we talk about mutual investment, Elastic recently signed a five-year SCA, or strategic collaboration agreement, with AWS. And while that is a significant milestone in our partnership, for us, what matters more is what it represents, and that is really a long-term commitment from both companies
[00:03:39] Rekha Thangellapalli: towards product engineering, um, and joint go-to-market initiatives to deliver value to customers over time. And that’s what we see is that the best partnerships really compound and they build upon each other every year. Um, they don’t necessarily kind of reset every year. Um, next we talk about building together.
[00:03:59] Rekha Thangellapalli: So, um, when we talk about joint solutions, we want to deliver solutions that are better together and the customers have to see us that way. And so whether it’s search, observability, or security, we’re looking at taking to market solutions that we can’t or necessarily don’t wanna take on our own. And finally we talk about scaling together.
[00:04:22] Rekha Thangellapalli: And this is where marketplace, for instance, plays a big role, um, when customers can draw down on their cloud commitments, transact online and go from, you know, pilot to enterprise scale adoption in hours, not days. Um, this is when really everyone wins. Um, and this is also where partners like Accenture play a critical role.
[00:04:47] Rekha Thangellapalli: Um, you know, the incredible amount of expertise that they bring, uh, the managed services capabilities and, um, their data assets actually play a huge role in having our customers realize that value faster. And, um, like Vince mentioned, at the end of the day, best partnerships are all about creating kind of that
[00:05:07] Rekha Thangellapalli: self-sustaining flywheel. And so it starts with investing together, building something unique, and having the customers realize that success faster because that success is really the only thing that’s gonna keep that flywheel going for everyone involved.
[00:05:26] Vince Menzione: I, absolutely.
[00:05:26] Allison McFadden: Okay, amazing. I’m gonna riff off a few things Rekha said, but from a GSI perspective.
[00:05:32] Allison McFadden: A relationship with AWS, a successful relationship with AWS, looks slightly different. Um, so I think the first thing that we think of in the GSI community, the common thread, is that the client outcome and delivering value for clients is what we, what we’re striving for. Um, and so the partnership with AWS in that case, um, um, it has to, it has to
[00:06:01] Allison McFadden: look like one team in front of our clients. So we have to show up indistinguishable, and that’s with AWS and with an ISV partner, it has to look like one solution in front of the client, especially moments that matter. So board meetings, um, you know, the time we’re gonna sign a deal, like we have to look like one team, uh, and keep our client outcome, um, first and foremost in mind.
[00:06:24] Allison McFadden: The second thing, and this is I think where the magic of all the people in this room comes into play. We can have as many discussions at a CEO level as we want. And if our client teams on the ground are not working together, it falls apart. Falls apart directly in front of the client. Yes. And that is a really hard thing to do.
[00:06:45] Allison McFadden: So I’m passionate about the alliance work because that work is what makes it happen at the corporate level.
[00:06:53] James Kang: Cool. Um, I’ll start here. So, Nvidia is an accelerated computing platform company. Um, if you asked anyone on the, on the street about a year ago, what is AI? A lot of times they would say AI is, is OpenAI, or it’s Anthropic.
[00:07:12] James Kang: Um, Jensen, and I’ll, I’ll reference Jensen a lot today, um, because he is our leader, um, but he also sets the strategy and the direction for Nvidia. He talks a lot about AI in the metaphor of a five layer cake. And in terms of the five layer cake, you start off with the foundational bottom layer being power and energy, which sustains
[00:07:32] James Kang: all of our data centers. You move up the stack in terms of chips. So think of Foxconn, think of TSMC. Next you have the infrastructure layer. So obvious choice is AWS, and then you get to the models where you do have the Anthropics and the OpenAIs. But finally, at the precipice, you have the application layer.
[00:07:53] James Kang: Ultimately, the reason why I mentioned all different stacks of the layers, the five layer cake, is the fact that the application layer is the most important. And so when you think about partners like Elastic or ServiceNow, Trend AI, CrowdStrike, every time you pull from the application layer and you see a success, it pulls all five different components of that layer up.
[00:08:13] James Kang: And so ultimately, as I think about success, it’s, it’s being able to develop these co-sell wins at the application layer and really demonstrating that through extreme co-engineering and co-design with all the different application, infrastructure, power and energy layers in mind. Um, Jensen also likes to think of himself not only as the CEO and founder, but also as the, the Chief Marketing Officer.
[00:08:35] James Kang: We are a very event driven company, and so at our big events like GTC or at big industry events like CES or Computex, he likes to show up on the biggest stage, biggest stages and showcase the partnerships with not only ISVs and GSIs, but also with end customers. And so that’s what I think about when I think of success.
[00:08:56] Vince Menzione: That’s a really good point. You talked about, Allison, you talked about having an alliance strategy, or at least you teed it up, so I thought maybe we would go there for a second. Right? Like, what does a great alliance strategy look like and why is it important to the success of the partnership?
[00:09:11] Allison McFadden: Man, I, uh, I have so many opinions on this.
[00:09:13] Allison McFadden: We could probably be up here all day. That’s
[00:09:15] Vince Menzione: okay.
[00:09:16] Allison McFadden: Um, no, I think, uh, there, there are a couple things, and the first one that comes to mind is focus. We cannot be all things to all people. Um, so when it comes to think about some of the, the work we’re doing with Elastic, we have a very, very clear point of view on what client problem we’re solving, what clients we want to talk to.
[00:09:38] Allison McFadden: It helps if, um, from an ISV perspective, if there’s a very clear fit in the Accenture portfolio or whatever, you know, SI consulting partner you’re working with, a very clear fit in the portfolio, and we know what we’re not gonna go after, what we’re not gonna spend our time on because we have, we have this tendency, there’s millions of people.
[00:10:00] Allison McFadden: The ecosystem chart that, you know, Vince, you showed up there, there’s so many connections. There’s probably more connections there than there are atoms in the universe, right? So, um, defining what we do together and what we don’t do together is the first thing that pops to my mind.
[00:10:19] Vince Menzione: Rekha, do you have a perspective on it since we’re gonna, we’re gonna talk next about what you’ve done together, but, and I also wanna get Matt’s perspective as a hyperscaler partner here as well.
[00:10:29] Rekha Thangellapalli: Yeah, I mean from my perspective, I, I’m gonna, you know, kinda echo what Allison said is to be just maniacally focused. Yep. Um, because, especially from my perspective, so Elastic has three different solutions, right? We’ve got search, we’ve got observability, we’ve got security that map to completely different business units within Accenture.
[00:10:47] Rekha Thangellapalli: And of course Accenture does a lot of things. And so, you know, when we first came together it was like, okay, what are we gonna focus on? What industries are we gonna go after? Which segments are we gonna go after? Which customers, you know, um, outcomes are we trying to solve? And I think that sort of maniacal focus is the number one contributing factor to, to the fact that I’m like, up here on stage today.
[00:11:12] Rekha Thangellapalli: Great.
[00:11:14] Vince Menzione: Matt? Perspective?
[00:11:16] Matt Yanchyshyn: Yeah, I, I, I guess I was trying to, to add something, uh, additional from an AWS perspective, uh, when it comes to, you know, what does a great alliance look like? Uh, AWS is obsessed with data, you know, in data we trust. And, and so the best, um, and, and this goes sales business problem, and it’s not just the engineering teams.
[00:11:34] Matt Yanchyshyn: And so, uh, you know, Accenture does a good job of this, Elastic definitely. And if you can come to the table with, um, quantifiable proof of the value of customer outcomes and partnerships, um, you’ll win all the time and it’ll be a durable relationship with AWS ’cause we really are this data obsessed company and, and even the most senior sales leaders.
[00:11:54] Matt Yanchyshyn: Uh, and so what I mean by that specifically is like if you, if you can show like your ARR to land an ARR conversion ratio, like in, in numerical format, it’ll light up our sales leaders and, and they’ll be all, and they will co-sell with you all day long. If you can show the, I mentioned this earlier, like the AWS service, uh, whether you’re consulting company or, um, Elastic and, and how the shape of customer accounts change positively when we work together.
[00:12:15] Matt Yanchyshyn: That type of sort of quantifiable data works particularly well from an alliance perspective. With AWS as a partner, we, we really are like this data in, sort of results out company. Um, so I, yeah, that’s just adding to the great points that were already made. I would say specific to AWS that that’s key.
[00:12:30] Matt Yanchyshyn: Yeah. And I’m gonna bring up one more thing. I want to dive in on the, the joint value proposition, but you mentioned something that made a lot of sense and resonated to me about the organizations once you get out of partner, the partner world that we all know and love. Mm-hmm. Once you get down into a field organization or account management organization.
[00:12:49] Matt Yanchyshyn: Not as much understanding and really organizations do a bad job here, honestly, in terms of enabling the field organizations. Do you agree?
[00:12:58] Allison McFadden: I agree because I, I agree. And, um, you know, I think that’s one of the things, and, and I, I, when I joined Accenture, what we had was a lot of wicked smart architects delivering programs to clients in the field.
[00:13:15] Allison McFadden: Very smart, very deep in AWS knowledge. Um, and that was awesome for the 10 clients they were staffed on and to get that understanding of how AWS works and I dream about lar, right? Like, this is a good, you know, but that takes real effort and real work. Yeah. And it’s, it’s um, almost like being a language translator.
[00:13:37] Allison McFadden: Yes. For me. Yeah. So, you know, I had to deeply learn AWS so that I could
[00:13:42] Rekha Thangellapalli: Sure.
[00:13:42] Allison McFadden: teach my account teams. My account teams are really smart. They know who they’re selling to. They know their customers. They know what their customers need. They do not know what AWS has to offer always because they’ve got 20 partners lining up to try to tell their stories.
[00:13:57] Allison McFadden: Um, they don’t know how to ask of the AWS team or the Elastic team or the Nvidia team. Yeah. What they need
[00:14:02] Vince Menzione: this co-selling piece. Yeah.
[00:14:04] Allison McFadden: And so that is where, um, we had to build that muscle even around our AWS practice, which was a huge practice at Accenture, but we didn’t necessarily surround it with that kind of enablement and, um, almost deal coaching layer.
[00:14:21] Vince Menzione: So Elastic and Accenture came together. I dunno which one of you wants to lead this part of the conversation, but you will, right? Yeah. So tell us about the genesis of this and why. And a lot of people dunno what Elastic does, but you do some really incredible work. Like I, somebody told me one day was like, oh, you know, Uber, like, that’s Elastic powering all that.
[00:14:41] Vince Menzione: Like, we don’t think about that. That the engines that you have and the, the backend to the customers, huge customers.
[00:14:48] Rekha Thangellapalli: Yeah, absolutely. Um, so when AWS launched this feature last, um, re:Invent where basically it allowed, you know, channel partners such as Accenture to be able to bundle up their services, their data assets with an ISV solution and put it on marketplace, um, you know, Accenture and Elastic immediately saw an opportunity.
[00:15:09] Rekha Thangellapalli: Um, at the time most customers were doing gen AI, but they were running into the same challenge, which was that their data just was not ready.
And by the way, this is a problem we were solving outside of marketplace. I think the, the feature that you guys launched just gave us a way to package it up and to be able to create this repeatable solution, which we call Data Readiness Engine for gen AI, and put it on marketplace.
[00:15:40] Rekha Thangellapalli: And, um, this to me was a success because each company had a clear reason to invest. Um, so for Accenture, they were able to, you know, create a very differentiated services led offering. Uh, for Elastic, we were able to expand on our AI story. And for AWS, um, you know, it drives marketplace adoption, increases cloud consumption, all of that great stuff.
[00:16:07] Rekha Thangellapalli: And customers, of course, get a solution to a very real problem that, that they were having. Um, and you know, the surprising part for me going through that journey was that, um, the pitching, the idea, getting the budget, getting the executive sponsorship was actually the easy part. The hard part was getting all three companies to come together, uh, to go from idea to launch in a very ambitious timeline of six weeks.
[00:16:37] Rekha Thangellapalli: Nice. And so, you know, this was very much like, doesn’t matter your title. We’re rolling up our sleeves and we are on this outcome together. Um, and so we literally built a RACI matrix, a project plan, and you know, we had daily standup calls for six weeks where literally at least one person from each of these three companies called in, you know, got rid of any blockers and we made sure we were on target for that timeline.
[00:17:07] Rekha Thangellapalli: Um, and you know, at the end we had a successful launch. But I think my favorite part about the story is the impact that we’re having and, um, my favorite story comes from a global pharmaceutical company that, you know, had basically nine petabytes of data spread across six different continents. Wow.
And by working with Accenture and Elastic, they were able to build that trusted foundation that their AI and their agents can, you know, kind of safely tap into and be accessible at scale.
[00:17:41] Rekha Thangellapalli: Um, so that’s my version. Allison.
[00:17:44] Allison McFadden: Yeah. Well, I don’t have a lot to add. I just, I would say this is a good example of a couple of principles, right? One is having a forcing function is never a bad idea. Sign up for a big event, sign up. I’m like, I’m here with my, you know, Nvidia guys saying, sign up for the event.
[00:17:58] Allison McFadden: It’ll make you move quick, right?
[00:18:00] Audience Member: Yes.
[00:18:00] Allison McFadden: Um, so that is one, but two, one of my mentors once told me, when you’re designing any kind of, you know, offering, go-to-market motion, it has to get blood to all organs. If it does not get blood to all organs, it does not go
[00:18:14] Vince Menzione: nice.
[00:18:14] Allison McFadden: Um,
[00:18:14] Vince Menzione: I love that analogy.
[00:18:15] Allison McFadden: Oh, I love it. And I can talk all day.
[00:18:17] Allison McFadden: That guy was brilliant. I love him. But, um, no, and, and so Elastic did a really nice job of bringing the tech to the table. Um, our team has to trust in that technology and its ability to scale, right? Um, because at Accenture we have to be able to deploy across 700,000 consultants. Um, and yeah, so I think those are the two, two things that really worked well here is we had, uh, trust in the technology, solved a customer need.
[00:18:50] Allison McFadden: Um, it drives, we don’t even talk about, like, yes, it drives marketplace revenue, but it unlocks work that we do that drives even more revenue to our AWS friends. Right. So this is a, this is a, um, product that’s getting your data ready for agentic. It’s a messy problem that everyone’s dealing with, and it removes blockers for clients and it unlocks more, you know, agentic work on top of that.
[00:19:15] Allison McFadden: So, blood to all organs.
[00:19:17] Vince Menzione: So, was that the proposal going forward to say we need to have, we need to have trust in the solution. We need to drive significant revenue. It needs to be something all of our, you know, seven, 700,000 people can be a part of and help drive? Is that how you think about?
[00:19:32] Allison McFadden: Yeah, and for us right now, um, it’s an interesting time for Accenture.
[00:19:36] Allison McFadden: Our clients are asking a lot of us, and what it does is, having some of these accelerators helps us deliver cheaper, better, faster to our clients, which is what they’re demanding of us right now. Um, so it’s an accelerator to client outcomes.
[00:19:55] Vince Menzione: James, what is NVIDIA’s role and how do, how do you enter the equation here?
[00:20:00] James Kang: Yeah, it’s, um, it’s a good question. Um, I, I would say that Nvidia is probably one of the most misunderstood organizations in the world. Um, despite the, uh, the market capitalization and the valuation of the company, we have a very tiny organization. Um, what I mean by that is, um, if you think about
[00:20:20] James Kang: sales forces and field sales organizations. Um, we’ll take Salesforce as the account or the customer. As an example, we have one account manager at NVIDIA that not only covers and is responsible for the relationship with Salesforce, um, but also manages Automation Anywhere as well as DocuSign. Whereas at AWS, in contrast, like there are full armies and teams. Yeah.
[00:20:45] James Kang: That are supporting the Salesforce relationship. And so as you think about partnering and working with Nvidia, the focus has to be on really extreme co-design, but also being very prescriptive in terms of what are the very specific customer outcomes that we are solving for.
And the guidance that I would give is bring in Nvidia into that equation and that conversation as early as possible because that
[00:21:10] James Kang: co-engineering and co-design needs to be part of the foundational building blocks in order for you to come out with an end solution that checks all those different requirements.
[00:21:20] James Kang: And so I think, again, like going back to Nvidia, um, we like to talk about two different types of brains. A brain one and a brain two. Uh, brain one, you think about the next quarter and making sure that you’re hitting the revenue targets for the next quarter. Brain two, you think about long-term goals and potentials, looking around corners and being very strategic.
[00:21:41] James Kang: The saying internally is without brain one, there is no oxygen, but without brain two, there is no future. And everyone at NVIDIA is trained to think in that brain two mentality.
[00:21:52] Vince Menzione: Wow, Matt.
[00:21:54] Matt Yanchyshyn: Yeah, I, I was just thinking I love the blood to all organs. Uh, and so just on, on that note, um, and, and, you know, the multi-product solutions that, that you, you built together, uh, that is a really good example of blood to all organs because like we all know, that’s how customers buy.
[00:22:07] Matt Yanchyshyn: They, they buy solutions and increasingly they’re looking for combinations of ISV, sometimes multiple products from multiple ISVs with services. Uh, often they’re buying it through a resell motion. You know, and they, and, and so that from a customer perspective, they want a single place to go. And so that’s the multi-product solution.
[00:22:24] Matt Yanchyshyn: They wanna find everything they need, they need Accenture, they need Elastic to solve a specific solution. And I think where that’s headed is even more specific listings, like with AI powered listing experience, like, you know, Elastic plus Accenture for, I’ll make something up, like a manufacturing workload.
[00:22:37] Matt Yanchyshyn: And so this solution-based, uh, sort of buying is, is very customer centric. It’s what customers want. We all know that. But that’s, that’s the customer sort of organ, I guess. Um, but then, you know, you all have SCAs and those SCAs have marketplace commits. It helps if that gets transacted through marketplace, helps the AWS relationship, you know that that’s an organ.
[00:22:55] Matt Yanchyshyn: It’s the relationship. It’s, it’s the commercial construct and that you have, uh, that that’s another organ. Your marketing people, they, that’s another organ. They don’t wanna land, uh, leads on a static marketing page. They wanna land a lead on a, a storefront with a multi-product solution that can actually convert and that you can actually buy it through that.
[00:23:12] Matt Yanchyshyn: So the marketing person’s happy because they, they have less churn. Uh, and then, you know, our reps are happy ’cause guess how they get paid? They retire quota when they sell Marketplace. And they, we also, Jay McMain will tell you, that’s another organ called Jay or on, on you now. Um,
[00:23:27] Matt Yanchyshyn: he’ll like that. I’ll call him up and tell him that.
[00:23:29] Matt Yanchyshyn: Yeah,
[00:23:30] Matt Yanchyshyn: but he, he’ll tell you, you know, don’t believe me. Obviously, never believe Matt, believe, believe the, the data and, and his data shows that those deals will close faster and larger if you use marketplace. So that’s, that’s a lot of organs. That’s the whole body. Um, but you know, when you have your customer happy ’cause that’s how they wanna buy, your field happy.
[00:23:45] Matt Yanchyshyn: Um, and, you know, the relationship happy and you know, your marketing team happy. Uh, and, and Jay happy. Um, and, and you know, I think that multi-product construct and, and the way you kind of use it to model a partnership and the way buyers ultimately wanna buy is, is really powerful.
And so I, I think it’s, you know, it’s really a manifestation of how
[00:24:04] Matt Yanchyshyn: we kind of intend to go to market anyway. Uh, so I think, you know, and thanks for leading the way, by the way. You’re, you’re amongst the very first, so that’s great to see.
[00:24:11] Matt Yanchyshyn: So these storefronts are really helping this drive, drive this. Well,
[00:24:13] Matt Yanchyshyn: that’s the next evolution. Like we’re talking about the multi-product solution.
[00:24:16] Allison McFadden: I’m JJ Accenture storefront.
[00:24:17] Vince Menzione: Yeah. Oh, there you go. I mean, j and j Accenture storefront.
[00:24:20] Allison McFadden: We’re gonna talk about that.
[00:24:20] Matt Yanchyshyn: Yeah. I mean,
[00:24:21] Matt Yanchyshyn: Accenture also leading the way yet again with storefronts. And so I think the combination of, you know, again, I was talking a lot about conversion. Yeah. And you know, buyers know sometimes they know what they wanna buy and, but if you really wanna convert that lead, you wanna land them again, something that combines, you know, Elastic, Accenture’s services plus software, but in a storefront that is, you know, surrounding with just the solutions they want so they don’t need to kind of go searching.
[00:24:42] Matt Yanchyshyn: So, you know, ultimately reducing that time to close, I guess, really ’cause meeting the customer where they are with what they need.
[00:24:51] Matt Yanchyshyn: So we talk about co-selling a little bit. We, Jay and I talk about this all the time. We gotta keep looping Jay in here, even though he is not even in town this week, but Rekha, um, what does co-sell look like inside Elastic?
[00:25:02] Matt Yanchyshyn: You’ve got, we talked about an incredible leadership team. I’ve gotten to meet some of your leaders. Seems like you drive, you do a good job internally driving that. Let’s talk a little bit about it.
[00:25:11] Rekha Thangellapalli: Yeah, and this is something I’m personally very passionate about. Um, co-sell is very much a journey, not a destination. [00:25:20] Rekha Thangellapalli: And I think step one for us is recognizing the different partner types that we have. Because at Elastic we work with, you know, OEMs, MSPs, resale distributors, GSIs, um, and they all bring something very unique to the customer lifecycle and they all contribute very differently within, you know, our own sales cycle and sales process. [00:25:45] Rekha Thangellapalli: And so, you know, figuring out what is the unique benefit they bring, how do we enable them? So training and enablement is a huge piece of it, and so is making sure we’ve got the right metrics to measure success. Um, I know a lot of companies look at partner sourced as the north star, and that’s great, right? [00:26:06] Rekha Thangellapalli: Because that is undeniable. You can say, Hey, that would not exist if it wasn’t for my partner team. Um, but we’ve also noticed that when we bring in GSIs, it actually increases renewal rates. It significantly increases, um, ARR over time. Um, it expands deal sizes and so these are very real metrics that we can point to, um, beyond just the co-sell and the partner sourced number. [00:26:32] Rekha Thangellapalli: Um, so for us it’s looking at it from a very holistic perspective, but also catering it towards that unique partner and making sure we’re doing everything we can to set them up for success and setting up the partnership for success. [00:26:47] Vince Menzione: So close win ratios, deal size and renewal rates? [00:26:52] Rekha Thangellapalli: Yes. Specifically for GSIs. [00:26:54] Rekha Thangellapalli: Yeah. [00:26:55] Vince Menzione: Very interesting. Allison, uh, what had to change internally to produce these co-selling? 
We talked a little bit about the field organization and enabling a, a group of, and, you know, account sellers that are very customer focused and enabling them on the co-sell side. What had to change internally to drive that? [00:27:13] Vince Menzione: Yeah. [00:27:14] Allison McFadden: I, I might have already alluded to this a little bit in a previous answer, but, um, creating the capacity to develop, build, and sell these solutions, um, inside of a large GSI, where billable hours is kind of the number one metric on the table. Um. Is part of the investment that we had to make within Accenture to get this done? [00:27:36] Audience Member: Yeah, [00:27:36] Allison McFadden: so expert technology time. So we have technologists that understand the elastic technology. We do similar with Nvidia, by the way, we. We released some of their time to go co-develop the solution because it has to hold technical water, right? It can’t just be a marketing pitch. It can’t just be, it has to be a real, um, what’s the there, there. [00:27:59] Allison McFadden: So in order to actually do proper co-sell, we had to release some of that time. Um, to invest in those partnerships. Um, we’ve also done similar with some industry aligned business development leaders recently, so we have freed their time up to go. Uh. Open new conversations, educate client, account teams, go to clients, have conversations. [00:28:26] Allison McFadden: Um, so that, that’s a new motion that we, uh, have just kind of recently made, um, to allow them, I love this brain one, brain two also, right? So to allow them to focus on brain two, because a lot of our time. Typically spent delivery issues, you know, getting my hours, where am I charging my time? And so just freeing up a little of that capacity to do this work, um, helps get us in this brain two mode where we’re not just living to survive. [00:28:56] Vince Menzione: I. So, Matt, you’ve removed a lot. 
I mean, one of the things I admire, I admire AWS for being first to market and removing the most friction in marketplace of any of the vendors. Really, truly that. You talked about some of the announcements. How does some of this tie, Partner Central agents, propensity, sales plays, MCP, how does some of this tie to how you’re thinking about the future? [00:29:18] Vince Menzione: And how to enable more motions like this. [00:29:20] Matt Yanchyshyn: Yeah. Well, I think if you know my boss, Ruba Borno, uh, you’ll know that she has a maniacal focus on automation. Yeah. Um, and, uh, co-sell is increasingly automated. You know, you were asking earlier about propensity data. You can get that propensity data in addition to sales plays and, uh, opportunity scores through the Partner Central agents. [00:29:38] Matt Yanchyshyn: So things that used to require multiple calls to a PDM, if you’re lucky to have one. Yeah. Or a PSM. Uh, you can now get through these agents, you know, uh, TEKsystems, TGS, they manage what, over 5,500 customer opportunities with agents that they built on top of our Partner Central APIs. [00:29:55] Matt Yanchyshyn: Um, and WorkSpan has built a whole product and business that’s right on leveraging, uh, our APIs, our capabilities to sort of tie into your CRM. So, a majority of all opportunities will be progressed and managed by agents. This year at AWS, we already have a majority of all customer opportunities, all up, have a partner attached and I took a personal goal for a majority of those partner attachments not to happen from a human [00:30:22] Matt Yanchyshyn: but from our solution matching engine. And how do you get recommended by that solution matching engine? Having a healthy ACE pipeline, thanks to Partner Central agents and the integrations you’re doing. 
And in addition to being in the specializations and doing things like multi-product solutions and ultimately closing opportunities, you dream of LAR and so LAR will help that. [00:30:40] Allison McFadden: It’s more like a nightmare. [00:30:41] Matt Yanchyshyn: And so, you know, [00:30:42] Allison McFadden: it’s more like a nightmare, but [00:30:44] Matt Yanchyshyn: nightmare. Well, it’s, yeah. Nightmares of LAR and nice dreams of PRM, but, um, that’s the loop, right? I think, uh, increasingly co-sell for us, and in my mind, is largely a hundred percent automated. Yeah. Except for what matters most, those largest, most strategic, most complex deals [00:31:01] Matt Yanchyshyn: where our highly paid and very skilled salespeople are most effectively used. [00:31:05] Vince Menzione: Yeah. [00:31:05] Matt Yanchyshyn: You know, the days of, you know, this person with 20 years experience selling, clicking, progressing opportunities through a pipeline, uh, should be over. Uh, and we need those people out selling and co-selling. And so that for me. [00:31:19] Matt Yanchyshyn: Yeah. That, you know, we talk a lot about co-sell, but I’m obsessed with automating as much of the co-sell as possible. [00:31:24] Vince Menzione: I remember going back to the Excel spreadsheets, and they became spreadsheet jockeys. [00:31:31] Matt Yanchyshyn: Yeah. [00:31:32] Vince Menzione: And they stopped selling. They forgot how to sell. [00:31:34] Matt Yanchyshyn: Yeah. And people spend all this time doing lunch and learns and things like that. [00:31:36] Matt Yanchyshyn: And then, you know, then the salespeople rotate out after 18 months and that’s the old days. Uh, you know, the new days are AI-powered matching algorithms, uh, agentic co-sell, using the Partner Central agents to get your data and putting that data to use automatically and what sounded like magic 
[00:31:51] Matt Yanchyshyn: 12 months ago is being done, you know, by partners at massive scale across thousands of opportunities. You can do it today. And you know, there’s a guy named, another Mike, right? Another Mike who they have, there’s like a guy who’s doing all this and I’m picking on Mike ’cause I know their system really well and I know the guy Mike grew easily built it for them. [00:32:08] Matt Yanchyshyn: Um, but, you know, I think, yeah, again, the days of having 10 people sort of doing lunch and learns could be replaced by one or two people building agents, uh, managing a massive pipeline. And that’s the future. [00:32:18] Vince Menzione: Exactly. James, your perspective on what breaks with co-selling? [00:32:22] James Kang: Oh, what breaks co-sell? Um, I would say [00:32:25] James Kang: it starts and finishes with just misalignment and a loss of trust with the customer, especially when you have multiple partners or stakeholders involved. If you’re trying to do a three-way deal with an end customer and you’re not on the same page, you’re not gonna get to a successful outcome on the backend. [00:32:44] James Kang: Uh, the fix is a much more complicated story. I would say that to take a step back, um, we’ve talked about the five layer cake. We’ve talked about where NVIDIA kind of fits within the equation. We are invested in the ecosystem and so as different players and application organizations win and see these outcomes for end customers, we celebrate that success. [00:33:07] James Kang: Um, and as part of that kind of ethos of where NVIDIA fits within the ecosystem, we wanna make sure that not only our customers, but our partners like ISVs and GSIs are set up for success. Um, we do not as NVIDIA sell hardware or GPUs directly to customers. We use hyperscalers like AWS as kind of our force multiplier. 
[00:33:31] James Kang: And similarly we think of ISVs and GSIs as the force multipliers in terms of our extensions of how we kind of leverage the relationships and build the trust with our end customers. And so going back to kind of the question, Vince, I would say that it all comes back to trust and being able to build that mutual trust. [00:33:48] James Kang: Um, a lot of what we do when we co-sell with AWS is really on the software layer. Um, we actually have more software engineers at NVIDIA than we have hardware engineers, which is a weird thing to say, um, because everyone knows us for our GPUs. But because of that fact, we are heavily invested in CUDA and making sure that CUDA becomes the foundational layer for how not only our ISVs and GSIs, but also our end customers are building. [00:34:12] Vince Menzione: Very cool. So Rekha, you and James together on this, production versus pilot with agentic AI. Tell us a little bit more about that. Where are you in the process? [00:34:24] Rekha Thangellapalli: Yeah. So I mean, in general, what we’re seeing out in the market in relation to sort of AI and customers’ journeys is that, um, at least from an Elastic perspective, um, we’re seeing people very much in production when it comes to, you know, kind of AI assistant, copilot use cases. [00:34:42] Rekha Thangellapalli: So, you know, things like, um, software development, customer support is a big one. Um, any sort of employee productivity use cases where there’s still a human in the loop somewhere. Um, and there’s a very, like, clear path to value. And so we see the customers being in production, excelling there. Um, no problem. [00:35:01] Rekha Thangellapalli: Where we’re seeing people still kind of in the pilot phase is those fully autonomous workflows where there is no human involved. The agent is reasoning on its own, um, accessing multiple systems and taking an action on the user’s behalf. 
And what we’re seeing is that it’s not the intelligence of the agent that’s holding it back. [00:35:26] Rekha Thangellapalli: It’s more about giving the right context to the agent and having the right security, kind of governance controls in place for the company to feel comfortable in putting these fully autonomous workflows into production. And that’s really the conversation we’re having is all right, what are the controls you need in place [00:35:47] Rekha Thangellapalli: for you to release this to your business unit? Um, and what is the context that the agent needs before we can comfortably let the agent make the decision on the user’s behalf? Um, James, I’d be interested to hear what you’re seeing in the market. [00:36:03] James Kang: Plus one on all things context. I would even go so far as to say, um, [00:36:09] James Kang: how many folks in the audience have heard of token maxing? Like this new term? [00:36:13] Rekha Thangellapalli: Yeah. Yeah. [00:36:14] James Kang: Um, I’ll give a very specific example of Uber, that went public with the example of Claude. Like, they allowed all of their employees to use as many tokens as possible, and within the span of four months, they exhausted their full budget for the year, and so they had to pull back, and now there’s a cap on every employee. [00:36:33] James Kang: I think the number that’s circulating is $1,500 per month per employee, and so I think that is, at least in this multi-phase evolution of where we’re going to be and where we are today, cost has become kind of the prohibitive force in terms of agentic AI at scale. Um, I think we are working on some very creative solutions in-house at NVIDIA. [00:36:55] James Kang: Um, and we saw some really dynamic announcements this week when it comes to all things AgentCore, um, where we want to focus on very nimble ways for customers to be able to execute and go to market. 
And one extreme example of that is our investment within our open model strategy. So NVIDIA, not only, again, providing GPUs, we actually offer our own open models, which we call our Nemotron models. [00:37:21] James Kang: And through our Nemotron models, we are allowing customers to really develop and fine-tune their own proprietary models in a cost-effective manner. So right alongside the frontier models like OpenAI and Anthropic. It’s not an if-then, it’s not an either-or statement. It’s a permutation, it’s an “and.” So we’re giving you a cost-effective alternative to not only bring your agentic applications at scale by training on Nemotron, which is open source, but then once you’ve kind of finished and fine-tuned that specific training job, to be able to [00:37:53] James Kang: go ahead and utilize your frontier models, whether it be OpenAI or Claude. And I know there’s other partners here that are providing those kind of different model capabilities. And so I think for us it’s a matter of choice. We know that this market is dynamic. It’s gonna be evolving over the next coming months as well as the next coming years. [00:38:10] James Kang: Uh, but we believe that we are positioned for a really unique, dynamic expansion of agentic use cases over at least the next three to six months. [00:38:20] Vince Menzione: Allison, for the partners in the room who are glazed over right now going, what do I do over the next 12 months? [00:38:26] Allison McFadden: Should I wake everybody up by saying, yeah, please. [00:38:27] Allison McFadden: Say Go Hurricanes. [00:38:28] Vince Menzione: Yes. [00:38:29] Allison McFadden: Is there anyone, anybody? Everyone’s like, boo. I get to leave the parade today to go home to parade. I live in Raleigh, so we’ve got our parade on Saturday. Nice. [00:38:39] Vince Menzione: Nice. [00:38:40] Allison McFadden: All right. Wake up. Um, all right. So for the $50 million partners in the room, um, $50 million is not small. 
You have something that works. [00:38:50] Allison McFadden: Right. This is great. What I would be thinking about is, you know, we’ve talked about focus before, but really doubling down on, you know, what is your industry, what is your ideal client that you serve. And build, um, almost that kind of community. You know, the clients we have move from firm to firm to firm. [00:39:17] Allison McFadden: And if you’ve done good work at one, you’re gonna follow ’em to the next. Um, so build that client demand in a specific place or specific client profile that is just really knocking it out of the park for you. Um, scale with marketplace, right? So, I love some of the data that you were sharing in your talk earlier, um, because it’s like a no-overhead scaling mechanism. [00:39:45] Allison McFadden: I mean, it’s fantastic. Um, Accenture, other GSIs like us, we are investing in marketplace. So we’re investing in resources, um, to help us use marketplace more with our clients and we’re gonna capture, right, those storefronts. And if you’re present on marketplace, you’re gonna be able to catch, uh, yourself in that wheel. [00:40:09] Allison McFadden: So I think those are the kind of couple of things I would say is focus, focus, focus to drive that client demand and use scaling mechanisms like marketplace to really kind of, uh, accelerate. [00:40:24] Vince Menzione: Matt, anything to add there on the [00:40:26] Matt Yanchyshyn: Well just, you know, James, I love the token maxing reference in Uber and it reminds me, you remember when cloud came out and everyone was like, oh, all these people are gonna use the cloud and costs are outta control and, [00:40:39] Matt Yanchyshyn: um, a lot of people pulled back from the cloud and a lot of those companies no longer exist. And it’s similar with, uh, token maxing, like, oh, these agents are outta control. You have a choice. 
You can embrace them and figure it out and get governance and make your data available, um, use the Partner Central agents, move to agentic co-sell, or you can fade and die. [00:40:58] Matt Yanchyshyn: And that’s where we’re at. Uh, the companies sitting here today embraced the cloud years ago and won. Uh, and there’s a set of companies here today who are gonna embrace agents, for both buyers and sellers, and will win. And there are those who won’t and they won’t win. And so for me, it’s like we’re at a crossroads. [00:41:18] Matt Yanchyshyn: And if you’re gonna win, you gotta leap into that, you know? I love it. And, uh, it means the cost of experimentation is so much lower now. Development, and even business development or software development, is agent enabled. And so you can take risks, you can experiment and you have to. It’s an existential moment. [00:41:37] Vince Menzione: Agreed. We’ve got a couple minutes left over for any questions. What do you think? Sure. Are there any here? I think there are a couple. Yeah, we’ve got a co-sell question I’m sure coming up here. [00:41:51] Audience Member: Um, I’m Cassandra, I’m the CEO of PartnerTap. And one of the questions I had was, I think, you know, the co-selling between the sellers is where things get really, really hard when you’re multi-partner. And so when I was listening, um, with, you know, the Accenture and Elastic together, you talked about how you had to get these BD, business development people. [00:42:22] Audience Member: Um, is this a new team that is over the client team? And how do these teams interact, like, with the Elastic sellers? Are you doing a lot of coaching to the field? And then if AWS sellers are involved, like, what is that whole picture? What does that look like, [00:42:43] Allison McFadden: like [00:42:44] Audience Member: on the ground? 
I mean, that is the hardest part, I think, and that’s what we hear. [00:42:48] Allison McFadden: It’s so tough. Um, and I’ll just say, so our business development leaders, we have now kind of expanded their capacity. They have always been there. Um, but they have not been well resourced. They haven’t had a very clear kind of job description. [00:43:12] Allison McFadden: I’m gonna say, in the past they have been kind of focused on partner relationship. And so more like an alliance manager and maybe working on some of the data. Right? So when I say I have nightmares about LAR, it’s because we’re always trying to increase the LAR for Accenture, and they were focused, like, in those detailed weeds of trying to pass ACE and trying to call the PDM and all this stuff. [00:43:39] Allison McFadden: What we are doing is really pivoting them to be proper sales, business development focused on client outcomes and focused on technical skills to be able to describe what this solution is to the field. So, um, and because we need, I have many, many questions about, I gotta get agents to work with agentic co-sell so that that part somehow goes away. [00:44:05] Allison McFadden: So that’s the thing we gotta solve still, but, um, so we’re pivoting them to be kind of driving more of that co-sell enablement with the field, um, and taking that message to the field rather than being there, waiting for questions to come in from the field, waiting for, like, our field teams to discover, oh, I saw something that we’re doing with Elastic, like on a press release on LinkedIn. [00:44:30] Allison McFadden: Right. So we’re kind of trying to pivot them to be more proactive. [00:44:33] Vince Menzione: Very cool. [00:44:34] Rekha Thangellapalli: Yeah. And uh, Cassandra, that’s an excellent question because I think multi-party, you know, sort of tri-party offerings. 
The hardest part is operationalizing it at scale, right? Yeah. And so for this particular offering, we are basically having three routes to market. [00:44:51] Rekha Thangellapalli: So one is seeing how this offering fits into our existing Elastic go-to-market. And so I am constantly enabling our field sellers to say, okay, within our three field sales plays, here’s exactly where this fits in. Here are, you know, uh, keywords that you hear in customer conversations where you bring up this offering and here’s a process of how it works. [00:45:14] Rekha Thangellapalli: Um, exactly at what sales stage do I bring in Accenture, you know, what are the roles and expectations? Right? So that’s on the Elastic side. We’re doing the same thing on the Accenture side. So we’re doing a ton of training, enablement and lunch and learns, and we’re also looking at how do we fit into [00:45:31] Rekha Thangellapalli: uh, Accenture’s AI transformation projects. We are the semantic layer, right, of their enterprise brain. And so it’s a whole different sales motion, um, and, you know, having the right assets, having the right process again to make sure that that goes smoothly. And then finally, we’re going directly to the customer. [00:45:49] Rekha Thangellapalli: So we are launching multiple external campaigns where, you know, if the customer raises their hand, we will line up immediately. Right. Um, and so, [00:46:01] Allison McFadden: I mean, I can’t say how important that third leg of the stool is. ’Cause the second part she talked about, getting into our catalog, is the first thing. [00:46:09] Allison McFadden: ’Cause my business development leaders have the catalog. Right. And that’s what they’re selling. So what Elastic has done is gotten into one of those offerings, and then if we have a customer that asks for it, that is the fastest way to alignment. 
That is like the number one thing that we respond to. [00:46:26] Vince Menzione: Customer at the center. [00:46:27] Vince Menzione: This is great. Well, I think we’re up to time. This was a great session. I want to thank you. What a great group. [00:46:34] Vince Menzione: Thanks for listening to the Ultimate Partner Podcast. If today’s conversation resonated, share it with a partner leader in your network. Subscribe where [00:46:43] Vince Menzione: you listen, and head over to theultimatepartner.com [00:46:47] Vince Menzione: for show notes, related content, and the resources for this episode. And if you haven’t already, now’s the time to register for the Ultimate Partner Live Event in Reston, Virginia, October 26th through October 28th. Until next time, keep showing up in the rooms that matter because being in the room changes everything.


157. The Different Types of Orders (Part 3)

D*Trading - Le podcast

Jul 17, 2026 (28:36)

Final episode of the series. We continue to share our passion for stock market orders: more complex models that let us do more things. What are the risks associated with executing the different order types? Fuji shares plenty of subtleties about how they are executed. We chat about OCO, MIT, IOC, FOK, GTC, GTD, Iceberg orders… Find out what these acronyms and expressions mean and take your order management to another level!

Link discussed in this episode: https://dtrading.net/blog/les-types-d-ordres

-----------------------------------------------------

RISK DISCLOSURE

Trading stocks, Forex or any other financial product carries a high level of risk and may not be suitable for all investors. Past performance does not represent future results. A high degree of leverage can be profitable and can also work against you. Before deciding to invest, you should carefully consider your investment objectives, your level of experience and your risk appetite.

The possibility exists that you could lose all or a significant part of your initial investment, so you should not invest money that you cannot afford to lose. You should be aware of all the risks associated with stock market trading and seek advice from an independent financial advisor if you have any doubts.

The content of this podcast is for informational and educational purposes only and is not, and should not be interpreted as, professional, financial, investment, tax or legal advice. D*TRADING cannot be held responsible for financial losses resulting from the client's personal decisions.

#bourse #investissement #trading #finance #argent #investir #libertefinanciere #investisseur #trader #actions #business #stockmarket #motivation #bitcoin #independancefinanciere #stocks #forex #titre #epargne #formation #formationboursiere #marchesfinanciers #daytrading #swingtrading #tradingview #investissementlongterme #apprendre #Patrick Gaulin #François Joly-Dubois #Michel Villa

Hosted by Ausha. Visit ausha.co/politique-de-confidentialite for more information.


Why AI Infrastructure must evolve for Agent Experience — Akshat Bubna, Modal CTO

Latent Space: The AI Engineer Podcast - CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Jul 8, 2026 (57:55)

We've been running a bit of an Agent Cloud series surveying all the top inference/compute/cloud providers, from Databricks to Daytona to Railway and, even further back, E2B, but we're excited to conclude this series returning to Modal, which has just raised a monster $355M Series C.

The cloud was built for developers. But agents are now changing that.

The old infra stack was designed for a human who could read docs, reason through YAML, and understand dashboards to figure out what they need when something broke. While this was painful for developers, it worked since they could fill in missing context in their heads.

However, agents don't have that luxury. Now in this new era of agents, everything has to be tighter.

They need a place to write code, run it, inspect the output, change the environment, debug failures, and try again. Fast iteration and feedback loops with all the necessary context are crucial for agents to operate properly. Furthermore, sandboxes are a clear representation of this shift as agents can easily spin up isolated environments. This programmatic infra even extends to research.

Two years ago, we were one of the first to cover Modal with CEO Erik Bernhardsson, and Alessio designed our favorite LS thumbnail of all time.

At the time, Modal was just a teeny little company with a $17M Series A.

Today, fresh off their $355M Series C, Modal is one of the clearest examples of the agent cloud future being built in real time: a cloud platform moving past traditional web app assumptions toward the workloads AI actually creates such as elastic inference, sandboxes, GPU burst, post-training, background agents, and infrastructure that agents themselves can operate.

In this episode, Modal CTO Akshat Bubna joins swyx and Vibhu to unpack why AI applications don't fit traditional cloud assumptions, why Kubernetes was never designed for bursty compute-heavy workloads, and why Modal is now shifting from developer experience to agent experience.

We go deep on Modal's AI infra stack: serverless functions, decorator-based infrastructure, elastic inference for custom models, GPU snapshotting, DeFlash, speculative decoding, Auto Endpoints, sandboxes, persistent storage, networked containers, private IPv6, RDMA, multi-node training, and Modal's capacity pool across 17 cloud providers.
Akshat also explains why RL rollouts can require 100,000 sandboxes, why production agents need hard guardrails, why observability may matter more than reading code, and why AI has made infrastructure exciting again.

We discuss:
* Why Kubernetes wasn't built for bursty AI workloads
* How Modal started as a better runtime before becoming an AI cloud
* Why Modal added GPUs before ChatGPT
* The shift from developer experience to agent experience
* Why observability matters when agents are writing the code
* Elastic inference for custom models across audio, video, robotics, and comp bio
* GPU snapshotting, cold starts, and why inference workloads are so bursty
* Why RL rollouts can require 100,000 sandboxes
* DeFlash, speculative decoding, and frontier-level inference performance
* Auto Endpoints and making optimized inference easier to deploy
* What Modal adds beyond vLLM, SGLang, and raw GPU rental
* Modal's 17-cloud capacity pool and supercloud strategy
* Networked sandboxes, sidecars, private IPv6, and RDMA
* Serverless multi-node training for post-training and research workloads
* Auto-research, model-guided sweeps, and agents launching GPU experiments
* Compute strategy, capacity planning, and batch tiers
* Why production agents need specialized sandboxes and hard guardrails
* Modal's take on managed agents, CI, Gitpod/Ona, Python, TypeScript, and Modal Bench

Akshat Bubna
* LinkedIn: https://www.linkedin.com/in/akshat-bubna-188885103
* X: https://x.com/akshat_b

Modal
* Website: https://modal.com

Timestamps
00:00:00 Introduction
00:00:39 Modal's origin and why Kubernetes wasn't enough
00:04:32 Developer Experience → Agent Experience
00:06:21 Modal's AI cloud primitives
00:09:14 Sandboxes, agent loops, and proto-Cognition
00:12:12 Elastic inference, GPU snapshotting, and 100,000 sandboxes
00:15:24 DeFlash, speculative decoding, and Auto Endpoints
00:19:59 Production-grade inference beyond raw GPUs
00:22:00 Background agents, Ramp Inspect, and the agent lifecycle
00:24:08 Modal's 17-cloud supercloud strategy
00:26:40 Networked sandboxes, private IPv6, and RDMA
00:32:48 Multi-node training, post-training, and auto research
00:37:36 Compute strategy, capacity planning, and batch tiers
00:40:55 Open models, real-time AI, and production agent infra
00:43:06 Hard guardrails, managed agents, and specialized sandboxes
00:46:06 Why AI made infrastructure exciting again
00:48:30 Model APIs, differentiated products, and agentic video
00:51:50 CI, coding-agent infra, SDKs, and Modal Bench
00:57:28 Closing Thoughts

Transcript

Introduction: Modal, Series C, and the Art Party

Swyx [00:00:00]: We're here with Akshat, CTO of Modal, together with Vibhu. Congrats on your Series C.

Akshat [00:00:10]: Thank you.

Swyx [00:00:11]: Your party yesterday was amazing.

Akshat [00:00:15]: Yeah.

Swyx [00:00:15]: From all the photos and all the swag.

Akshat [00:00:17]: We had a bunch of art installations, which was fun, seeing, like, our products on pedestals next to, like, Rodin.

Swyx [00:00:25]: Very nice. Very nice. When you started, it was not the GPU inference company. Maybe it was in your mind. Take us back to the origin story.

Modal's Origin: A New Runtime Beyond Kubernetes

Akshat [00:00:39]: I first met Erik, who's the CEO, through an investor. Back then Erik was already thinking about building a new runtime, and he got there thinking through why workflow orchestration products are so hard to use. It's because you have to run them on Kubernetes. Kubernetes is hard to manage.
It's not built for burstiness and custom images.
Swyx [00:01:03]: Yeah
Akshat [00:01:03]: It has a terrible developer experience.
Swyx [00:01:05]: And I'll interject
Akshat [00:01:06]: Yeah
Swyx [00:01:07]: For listeners who are new, we interviewed Eric two years ago, and there's a bit more of the story there from Spotify and all those things.
Swyx [00:01:14]: And I came across Eric through Data Council because he did that talk on the serverless container stack that you guys did, which was my first “Okay, I need to take Modal very seriously” moment.
Akshat [00:01:26]: Yeah.
Swyx [00:01:26]: But it was still very unclear, like, do I need all this for just my data pipelines?
Akshat [00:01:33]: Yeah. Initially what we were thinking about was, if we build a better runtime, it's a very useful primitive in itself. There's a lot of things that get solved by serverless functions: you can do ETL stuff, you can do job queues, you can do all this bursty processing, which it turns out every company had needs for. But then we were also thinking about this as a primitive that we can build a whole collection of products on, which are very verticalized. So perhaps data engineering would've been the first one, but we were thinking about inference. Back then it was more classical inference, like computer vision stuff and running XGBoost and whatnot. But we added GPUs to the product a year before ChatGPT came out.

From Serverless Containers to GPU Workloads

Swyx [00:02:19]: Nice.
Akshat [00:02:19]: We just didn't think it would be that big of a deal.
Swyx [00:02:22]: Yeah, just, like, add A100.
Vibhu [00:02:23]: Was there any, like, early key problem that really sparked off why you built it?
Akshat [00:02:28]: Yeah. Primarily it's just that none of the tooling that was out there was built for, one, a really great developer experience, and also there's a general trend: a lot of the workloads that we were seeing were very.
I wish there was a better word for it, but compute-heavy. Like, one, they need a lot more resources, so you need to burst up and down a lot, versus Kubernetes, which is designed for slow scaling and more for web server use cases. And also there's just a lot more specialization in what kinds of environments these workloads run in. Sometimes they need accelerators, sometimes they need different kinds of images, and this is a consistent thing that we saw across a lot of companies. That would be the next step.

Software-Defined Infrastructure and Decorator-Based DX

Swyx [00:03:13]: Yeah. Yeah. Be nice. I don't know how much this factored into the early story, but I wrote a post when I was at Temporal about infrastructure, software-defined infrastructure or something like that.
Akshat [00:03:22]: Yeah, the self-provisioning
Swyx [00:03:23]: Self-provisioning.
Akshat [00:03:24]: Yeah.
Swyx [00:03:24]: Yeah. I can't even remember my own post.
Swyx [00:03:26]: And then you put me on the landing page.
Akshat [00:03:28]: Yeah. We really like the term, and so we stole it.
Swyx [00:03:32]: Because you had the insight that everything can just be in decorators co-located with the code, right?
Akshat [00:03:37]: Yeah.
Swyx [00:03:37]: Was that a big part of the original
Akshat [00:03:39]: Yes
Swyx [00:03:39]: Story, or was it just, like, a DX layer?
Akshat [00:03:41]: That was really important because we really didn't want people to spend so much time writing YAML, and it seemed like you could really condense the surface area of what you're doing, put it in code so you can operate on it just like you operate on other code, and build stuff that's more expressive and dynamic. And so yeah, that was always a very important part.
Swyx [00:04:04]: Then the pushback is this is a DSL.
Akshat [00:04:07]: Yeah.
Swyx [00:04:07]: You're closed source. I am locked into Modal.
Akshat [00:04:11]: Yeah.
We never really got pushback for that because the nice thing about Modal is you can bring whatever code you have, and sure, the DSL is at the configuration layer for what hardware you're using, how you're scaling things up, but you still own the code.
Akshat [00:04:27]: And that's been an important part of our story, even as we do inference now.
Swyx [00:04:32]: Yeah.
Vibhu [00:04:32]: How much of this do you think still stays the same today? Like if you were to build something today, DevX is very important, but I feel like a lot of this has been changed with just hook it up to an agent, have Claude Code, have Codex implement a tool. There are very agent-native primitives that are different than if I'm doing this myself, right?

Developer Experience → Agent Experience

Akshat [00:04:54]: We've changed our SDK team to think about agent experience instead of developer experience, and we think that the same benefits that apply for DX also apply for AX, which is: why would you have an agent read through hundreds of Kubernetes files and write YAML that's not even typed when it can make a couple of changes in a decorator and get this self-provisioning runtime, being able to see its changes live in action? Yeah, from the customers we talk to, they find Modal is much faster for agents to use versus operating on a different substrate.
Swyx [00:05:34]: Yeah, because, again, you co-locate the infrastructure requirements with the code that runs it.
Akshat [00:05:38]: Yeah.
Swyx [00:05:38]: Well, the negative thesis now is that nobody's looking at their code anymore, so there's no point.
Akshat [00:05:44]: Yeah, people aren't looking at code. One thing we still see as really important is observability.
Swyx [00:05:51]: Yeah.
Akshat [00:05:51]: Like how good is your dashboard?
And of course, we push a lot of it to the CLI so the agents can do their own investigation, but you still need humans to go interpret what's going on and make judgment calls and whatnot. And that's, I feel like, maybe more important now than looking at the code itself.
Swyx [00:06:11]: Yes, because you can try to treat the code as a black box and then see the observable action that comes out of it, and then just prompt a change.

What Modal Is For: AI Cloud Primitives

Akshat [00:06:21]: Yeah.
Swyx [00:06:22]: So I think it takes a bit of restraint to not specialize, to say, “I want to ship a new primitive,” and then just be general purpose.
Swyx [00:06:31]: People ask you, “What are you for?” You're like, “I don't know. We can do this, we can do that.”
Vibhu [00:06:36]: Well, I'd be curious to see, like, okay, if we were to ask you, what is Modal for, even at a high level? There's a lot you guys do: sandboxes, GPUs, everything. How do you answer?
Akshat [00:06:46]: Modal is a cloud platform where we've built the primitives from scratch for AI applications. And right now it covers inference, training, batch processing, and sandbox workloads.
Akshat [00:07:00]: But we're building a lot more
Swyx [00:07:02]: I noticed you didn't say web server, so there is still a role for, like, the always-on large-scale Kubernetes type things.
Akshat [00:07:09]: Yeah, absolutely. We're not trying to compete with the Renders of the world, because we think the differentiator for us is the workloads that need specialized compute and need to scale up and down a lot. Yeah, they're just shaped differently.

Working Alongside Frontier Startups

Vibhu [00:07:26]: I think you're building a lot of it alongside the startups, right? They're innovating quite a bit, even in your, like, latest blog post.
Like, even in the Series C, the customers that you mention here, the Cognitions, technical ones, Ramps and whatnot, they're innovating with you, right? And that's not something AWS is doing directly.
Akshat [00:07:45]: Yeah, absolutely. I think this is again classic. We're a small team. We can move really fast. Our engineers are working with our customers and figuring it out. Yeah.
Swyx [00:07:54]: So my first week at Cognition, I walked in, there was someone wearing a Modal shirt. I was like, “What are you doing here?” They're like, “Yeah, I am embedded inside of Cog.”
Akshat [00:08:05]: Yeah, I think that was Peyton. We sent him over
Swyx [00:08:07]: Yeah.
Akshat [00:08:07]: Because the latency of communication was too high otherwise.
Swyx [00:08:12]: Yeah, distributed node, you have to place one and co-locate.
Vibhu [00:08:16]: Yeah.
Swyx [00:08:16]: So I had direct personal experience, right? I worked on smol developer three years ago. It was inspired by Claude 1. I think you onboarded me at some point, like, just before, and I was like, “Oh, I need some bursty compute. I was just gonna try using Modal.” And it was a pretty pleasant experience. Apparently, I showed up in the board meeting, like the analytics.

smol developer, Sandboxes, and Proto-Cognition

Akshat [00:08:39]: Yeah, you blew up on Hacker News and
Swyx [00:08:41]: Yeah
Akshat [00:08:41]: We got a big traffic spike. I think the way you used smol developer was Modal functions for running stuff, which was a good use case. But then, yeah.
Swyx [00:08:53]: Yeah. So to me, that was proto-Cognition.
Akshat [00:08:55]: Right.
Swyx [00:08:56]: If only I had, like, stuck to it.
Swyx [00:08:58]: Like, that was like, if - did you say draw the tech tree
Akshat [00:09:00]: Absolutely
Swyx [00:09:00]: You're just like, “Yeah, like, probably this will happen.”
Akshat [00:09:02]: Yeah. Like, he was so close.
You were just rebuilding upon us
Swyx [00:09:04]: I just didn't realize.
Akshat [00:09:05]: But the funny story there is at the same time, we were talking to a bunch of customers who needed something like sandboxing.
Swyx [00:09:14]: Yeah.
Akshat [00:09:14]: This is, like, 2023.
Swyx [00:09:15]: Yeah.
Akshat [00:09:16]: So we built
Swyx [00:09:17]: You introduced a new API right after that.
Akshat [00:09:18]: Yeah.
Swyx [00:09:19]: Yes.
Akshat [00:09:19]: Like, we built sandboxes in May of 2023, before anyone even knew this was gonna be a thing. And the first example we published was, we took smol developer
Swyx [00:09:28]: Smol developer
Akshat [00:09:28]: And put it in a loop, so the agent can iterate on itself.
Swyx [00:09:33]: Loops are hot these days.
Vibhu [00:09:34]: It's the looper.
Akshat [00:09:34]: Yeah.
Vibhu [00:09:35]: Loops in. When was this, 2023?
Akshat [00:09:38]: Yeah.
Vibhu [00:09:39]: A small check.
Akshat [00:09:39]: Yeah.
Swyx [00:09:39]: It's like 2023. So, for listeners, the problem was the models are not built for any of this, right?
Swyx [00:09:46]: They're not post-trained to understand looping and self-correction, and tool calling was there, but also not that great.
Akshat [00:09:55]: Yeah.
Akshat [00:09:55]: I don't remember if you used tool calling in this one, but yeah, the models would just diverge after like ten iterations and not produce anything meaningful.
Swyx [00:10:03]: Yeah. But then, so okay, now talking to myself three years ago, the answer
Vibhu [00:10:08]: Of course they will get better
Swyx [00:10:09]: Collect all the failures, build a benchmark, and then collect all the examples, build the RL environment
Akshat [00:10:15]: Right
Swyx [00:10:15]: Sell it for like ten billion dollars to Meta.
Swyx [00:10:17]: And then also train a model and then sell that for sixty billion dollars to Elon. And this is
And this is
Akshat [00:10:23]: Yeah, of course
Swyx [00:10:23]: The funny machine. Like, it's like, it's about the hardware.
Akshat [00:10:28]: It's hard to have that inherent conviction that the stuff will get that much better.
Swyx [00:10:33]: In retrospect, it's so f*****g obvious.
Akshat [00:10:36]: Fair enough.
Swyx [00:10:37]: Like, what else were we doing back then? I don't know. Anyway. Yeah. So that was the start of your sandboxing journey, right? I feel like it didn't blow up until, like, last year.
Akshat [00:10:49]: Yeah.
Swyx [00:10:50]: So there was like a couple years of quietness.
Akshat [00:10:52]: Exactly, yeah. We were
Vibhu [00:10:53]: I think very underrated product value. Like, my experience with Modal: Charles, before he had joined Modal, met this guy at a hackathon, and he really insisted, we wanted to run some small model, not hosted anywhere, and he's like, “There's this cool company, Modal. They'll spin up a GPU sandbox, we can throw it on there. They'll take a Hugging Face link.” And there's so much value just right there, right? Instant hosting, spin it up, spin it down. It'll stay cold, but when we run the demo a few days later, it'll come back up, and all this stuff, in retrospect, is still what we need today.
Akshat [00:11:27]: Yeah, it's still needed today. Workload shapes have changed a lot as we run stuff for people at really massive production scale, and there it's not about scaling from zero to one, but how do we scale really elastically, from like a thousand to fifteen hundred GPUs very quickly in a given region. It's the same shape of problem.

Elastic Inference, GPU Autoscaling, and Custom Models

Vibhu [00:11:50]: Okay. So you look at, say, Cursor Composer, right?
Akshat [00:11:53]: Yeah.
Vibhu [00:11:53]: They had a.
“We'll do RL on a model every couple hours.” You guys have a whole version of RL inference gym and whatnot.
Vibhu [00:12:01]: When you look at workloads like that, you're doing training runs where you need to scale thousands of GPUs up and down every hour, right? That's the example for why we do need it, right?
Akshat [00:12:12]: Yeah. Well, I'll take a step back and maybe talk about how people use Modal today, because our biggest use case is elastic inference. And the thing we first found product-market fit with was inference for custom models. So we stayed away from the LLM space, and we were serving companies like Suno for audio, Runway for video, robotics, comp bio companies that train their own models elsewhere. But Modal is the best black box for deployment, scaling to however many GPUs you need as your traffic pattern changes. And we saw all of them have a very predictable traffic pattern. It's, like, diurnal. Some days the company will do a launch and they'll need way more. And it's not just one model that they deploy. All these companies deploy lots of different models in different regions, and so the autoscaling problem becomes even harder, because then you have to scale within a certain region, and those cycles are offset. So you scale up at different times in different regions.
Akshat [00:13:20]: So that's like our sort
Vibhu [00:13:22]: And that
Akshat [00:13:22]: Yeah
Vibhu [00:13:22]: That in and of itself is a huge category. There's a bunch of inference providers which provide this. Fireworks does this as a service, Together, whatnot, Baseten. That's carved into its own niche for language models, at least right now.
Akshat [00:13:36]: Yeah.
The thing that we have specialized in is the autoscaling aspect.
Vibhu [00:13:41]: Yeah.
Akshat [00:13:41]: Because we found that it's not universally true that everyone else can autoscale, and we've gone deeper into it on the tech side: we've incorporated GPU snapshotting into the product so we can take the GPU state, like your torch.compile model, snapshot it, and the next cold start is way faster. And so going back to your question, that's why you need a lot of burstiness for inference. But then people also do a lot of demand training, like for RL stuff, your rollouts are bursty, as you said. People also do a lot of batch jobs. So we'll see a lot of companies, before they have a training run, they'll need thousands of GPUs to run encoding or something like that. And I think those things are much more bursty than. I agree that agents are not that bursty. Sandboxes are, except when you're doing RL. RL is just

RL, Batch Jobs, and 100,000 Sandboxes

Vibhu [00:14:28]: Or commerce
Akshat [00:14:28]: Insanely bursty.
Vibhu [00:14:29]: Yeah.
Akshat [00:14:30]: Yeah. Like when you're doing rollouts, you sometimes need a hundred thousand sandboxes.
Vibhu [00:14:37]: Yeah. I'm curious if you've seen early sparks of continual learning. There are some people, like our friends, ngram, recently announced this
Akshat [00:14:45]: Yeah
Vibhu [00:14:45]: They're trying to do training. That also seems like a different workload, right? If you're doing training twenty-four/seven per se, there's a very weird dynamic of how you're using GPUs between people and whatnot, but it seems like something you guys would work for.
Akshat [00:15:00]: As you said, we're fortunate to work with a number of customers at the frontier and grab some of our customers. And they are taking the primitives we have and trying to use them in very interesting ways, like continual learning.
It's possible as the stuff gets better, some of that will be part of our offering as well if more people need it. But we're just waiting to see
Vibhu [00:15:23]: Yeah
Akshat [00:15:23]: How it shakes out.
Vibhu [00:15:24]: Is there a primitive that you added after sandboxing that was the next step in the story?

LLM Inference, DeFlash, and Speculative Decoding

Akshat [00:15:32]: I guess we've been going much deeper into LLM inference
Vibhu [00:15:35]: Yeah
Akshat [00:15:35]: Because we realized that some of the advantages we have with autoscaling, again, especially in different regions and whatnot, are not present elsewhere. And the place where we had a gap was we weren't working on the model layer itself. Like we were a black box. And we realized that we can get to frontier-level model performance by having great people who work on this. And we've been open sourcing a lot of our work. Recently, we shared our work on DeFlash, which is a block-based speculator, and we've open sourced all of it. So by using open source DeFlash, you can get the same performance as you would with one of the proprietary providers. And the next thing we're thinking about here
Vibhu [00:16:23]: I thought this was
Akshat [00:16:24]: Yeah
Vibhu [00:16:24]: An interesting blog post as well, right? Like, I think in here you make a claim that. Not a claim, just how effective speculative decoding really gets.
Akshat [00:16:33]: Yeah.
Vibhu [00:16:33]: Anything you wanna point out from this around what people should know?
Akshat [00:16:39]: Yeah, absolutely.
The high-level summary is, it would help to describe what speculative decoding is.
Vibhu [00:16:44]: Yes.
Akshat [00:16:44]: I will, yes.
Vibhu [00:16:45]: I think, like
Akshat [00:16:46]: Yeah
Vibhu [00:16:46]: So we've covered like Eagle and all this
Akshat [00:16:47]: Yeah
Vibhu [00:16:47]: Like Hydra and all those things, but it was like two years ago.
Akshat [00:16:51]: Yeah.
Vibhu [00:16:51]: I think it doesn't hurt, right?
Akshat [00:16:52]: Yeah. Speculative decoding is: you have a smaller model, called a draft model, predict tokens ahead of the bigger model, and then you have the bigger model verify all the tokens that are predicted. And the reason it's faster is that if you're predicting one token at a time, you're bound by memory bandwidth. But if you can batch the verification of the draft model's tokens, then you're using compute much more efficiently, and it's faster. And as long as your draft model is producing a lot of tokens that get accepted, which is called the accept length, you can get a speedup that's multiple times the original model speed. And, well, that's what we highlight here. People talk a lot about “we made these kernels faster” and whatnot, but improving a kernel will only give you a few percentage points of improvement, and increasing accept length literally is a multiplicative decrease
Vibhu [00:17:47]: Like two to four X.
Akshat [00:17:48]: Yeah, exactly.
Vibhu [00:17:48]: Without much hit on performance.
Akshat [00:17:50]: Yeah. I think, you are running a second model, right? So it may be somewhat more expensive in compute,
Vibhu [00:17:57]: I meant quality performance
Akshat [00:17:58]: Probably not by much
Vibhu [00:17:58]: But yeah. I think
Akshat [00:17:59]: So there's no drop in quality performance
Vibhu [00:18:01]: Yeah
Akshat [00:18:01]: Because you're always.
You're never accepting a token that the big model
Vibhu [00:18:04]: It's strictly better
Akshat [00:18:05]: Yeah
Vibhu [00:18:05]: Or it's same.
Akshat [00:18:06]: Exactly.
Vibhu [00:18:07]: Right. Yeah.
Akshat [00:18:08]: And so we've been working a bunch on DeFlash, which is a block-based speculator. So instead of predicting one token at a time, it's predicting a block. And we've been open sourcing our work with it. The next thing for us here is helping people train speculators and custom models. It's something that traditionally is very forward-deployed-engineer driven: you work with customers and help them do that. And our vision, and this is why we launched Auto Endpoints, is we want to make frontier-level performance available to everyone. And so, we mentioned this in the announcement, we teased it. The next thing we're launching is, as you run an auto endpoint, we shadow traffic

Auto Endpoints and Frontier-Level Performance

Vibhu [00:18:54]: Do you want to explain what auto endpoints are?
Akshat [00:18:57]: Yeah.
Vibhu [00:18:57]: I lovely, yeah.
Akshat [00:18:58]: Yeah. So this is, I guess, going back to your point: with Modal, you touch the code, but sometimes people don't wanna touch the code, and they wanna get started with an endpoint that works and has all the great performance and scalability that Modal has. So we've made that easier with a way to create an endpoint from our UI, from the CLI, that has all of the optimizations that we talked about, like the DeFlash stuff, already baked in, and there's full transparency. So we give you the code, you can go run it yourself, and if you want, you can eject out into the full Modal experience, which we see as people get sophisticated: they do wanna tweak the models, they wanna fine-tune stuff. You can still do all of that. It's not a black box.
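Editor's sketch: the speculative decoding loop described in this part of the conversation (a draft model proposes a block of tokens, the bigger model verifies them, and only tokens the bigger model agrees with are kept) can be illustrated with toy models. Everything below is an assumption made for illustration: `target_next`, `draft_next`, and `speculative_decode` are hypothetical names, the "models" are trivial arithmetic rules, and this is not DeFlash or any Modal API. It only shows why the output matches plain greedy decoding while needing fewer passes of the big model.

```python
from typing import List, Tuple


def target_next(prefix: List[int]) -> int:
    """Toy 'big' model (hypothetical): greedy next token is last token + 1, mod 10."""
    return (prefix[-1] + 1) % 10


def draft_next(prefix: List[int]) -> int:
    """Toy draft model (hypothetical): agrees with the target except after token 2."""
    return 0 if prefix[-1] == 2 else (prefix[-1] + 1) % 10


def greedy_decode(prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one target-model pass per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(target_next(tokens))
    return tokens


def speculative_decode(prompt: List[int], n_new: int, block: int) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, number of target passes)."""
    tokens = list(prompt)
    target_passes = 0
    while len(tokens) - len(prompt) < n_new:
        # 1. The draft model proposes `block` tokens autoregressively.
        ctx = list(tokens)
        proposal = []
        for _ in range(block):
            proposal.append(draft_next(ctx))
            ctx.append(proposal[-1])
        # 2. The target checks every proposed position. A real system does this
        #    in one batched forward pass; here it is counted as a single pass.
        target_passes += 1
        accepted = 0
        for i, tok in enumerate(proposal):
            if tok != target_next(tokens + proposal[:i]):
                break
            accepted += 1
        tokens.extend(proposal[:accepted])
        # 3. The same pass also yields the target's own token at the first
        #    mismatch (or after the block), so progress is always at least one token.
        tokens.append(target_next(tokens))
    return tokens[: len(prompt) + n_new], target_passes


if __name__ == "__main__":
    out, passes = speculative_decode([0], n_new=12, block=4)
    print(out == greedy_decode([0], 12), passes)
```

In this toy setup the number of tokens accepted per pass plays the role of the accept length mentioned above: the more often the draft agrees with the target, the fewer target passes are needed for the same output.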
And yeah, the next thing, as we teased later in the post, is how do we give you value even beyond this in terms of having your draft models evolve as your data distribution evolves, again, without having to talk to a person and, yeah.
Vibhu [00:19:59]: I guess just to understand it directly, you have the GPUs, you have an endpoint that's compatible, you serve an open model. If someone was to do this themselves, what's the delta that you guys provide? So you do a lot of great open source work on effective inference. How does it compare to, say, I take the same model, 5.2 FP8, take an off-the-shelf inference engine, vLLM, SGLang, get compute of similar capacity, similar cost. What's the delta that plugging into something like this offers outside of the benefit of scaling?

Production Inference Beyond Raw GPUs

Akshat [00:20:34]: It's interesting because we've taken the approach of open sourcing our contributions and upstreaming them. We work closely with the SGLang team. We want the improvements that our team comes up with to be there in open source for others to use, even outside of Modal. The benefit to us is we have a team that has significant expertise, in terms of if you do have something that is not there, our team can help you get that performance first. The other thing is with these endpoints, we are way more elastic, as you said, than anyone else, and you have true scaling to zero. You have true burstiness, and in practice, that matters a lot more to people than just finding the GPU and running Modal code on something.
Vibhu [00:21:20]: Yeah. And I will say it's not that straightforward. Like, what I said is easier said than done, right?
Akshat [00:21:26]: Yeah.
Vibhu [00:21:27]: I think for the average person, it's still hard to just gut check using different. There's quite a bit of combinations you can make there. The trade-offs aren't really known at face value.
Akshat [00:21:40]: Yeah. It's not just that.
I think it's that running production-grade inference is a hard infra problem.
Vibhu [00:21:49]: Yeah
Akshat [00:21:49]: Even if you subtract out the autoscaling
Vibhu [00:21:50]: Yeah
Akshat [00:21:51]: It's controlling things like tail latency and making sure every request is delivered at least once and whatnot.

The Model and Agent Lifecycle

Vibhu [00:22:00]: There's a lot of innovation that you can do here. I think it's very interesting that, as you become a full cloud, you're starting to encroach on other people's turf.
Vibhu [00:22:09]: What will you not do?
Akshat [00:22:13]: Well, we wanna follow our users and make sure they get a platform that has everything that works well together. So right now we're focused on the model lifecycle and the agent lifecycle. So both going from data prep to training to inference, and then also, if I want to deploy a background agent, let's say: sandbox, persistent storage, a whole bunch of other stuff.
Vibhu [00:22:38]: We talked to Cole, who did OpenInspect. Yeah.
Akshat [00:22:42]: Yeah.
Vibhu [00:22:42]: And Ramp Inspect also is on Modal.
Akshat [00:22:44]: Yeah. So Ramp Inspect was a great example of a background agent that was really successful because they were able to use some of the primitives like snapshotting and fast scaling to just have something that feels really reactive and works well.

Ramp Inspect and Background Agents

Vibhu [00:23:02]: Yeah. That's the new CTO of Ramp right there.
Akshat [00:23:05]: Yeah, Rahul.
Vibhu [00:23:08]: It was really fun. Yeah, okay, I think, all very bullish. Like, one of my reflections was also I did not originally. So when I met you guys

The Inference Inflection: CPU, GPU, and Co-Location

Vibhu [00:23:19]: You weren't that much in the GPU game, and now you're all about inference. And one of the points that I hinged on for Jensen's keynote at GTC this year was what we're calling, like, the inference inflection, right?
That, let's say, in AI workloads or machine learning workloads, it used to be, let's call it, eight to one GPU to CPU, and now it's more like one to one, which is interesting. Because of how much agents are blocked on or call out to CPU-heavy stuff, the actual limiting factor swings back and forth between GPU and CPU a lot more, where it used to be all GPU and then occasional CPU.
Akshat [00:24:01]: Yeah.
Vibhu [00:24:02]: GPU, CPU. And now it's like just constantly, and you just have to co-locate everything.

Seventeen Clouds and the Supercloud Strategy

Akshat [00:24:08]: Yeah. And that's one of the things that, again, we see as something appealing about Modal, which is we've built this capacity pool that spans 17 cloud providers, so we're very good at running on various kinds of cloud capacity across the world
Swyx [00:24:24]: You don't have your own data centers?
Akshat [00:24:25]: We don't have our own data centers. We just run across a lot of neo clouds
Swyx [00:24:29]: Yeah. Are
Akshat [00:24:30]: Metal providers.
Swyx [00:24:30]: Yeah. Question mark.
Swyx [00:24:31]: Yeah. You're running the math, and you're like, “What's the cutover point where you're like.”
Akshat [00:24:36]: Yeah, it's a good question. Part of it is we see our differentiator in the software layer, and being capital light and focusing on the software helps us move really fast. So far it's worked out well because there are so many other people building data centers that we're able to work effectively with them, and again, focus on what makes us special.
Swyx [00:24:55]: Yeah.
Swyx [00:24:56]: 17 gets you into, like, the local providers sometimes. Like
Akshat [00:25:00]: The,
Swyx [00:25:01]: Which was the most interesting one?
Akshat [00:25:02]: There are a lot more neo clouds than you expect, and they all have various levels of reliability.
And that's why it's something we've invested a lot of time in, building our own reliability layer on top. So if the GPU falls off the bus or something happens, user workloads are not affected, and that lets us use a lot more capacity than
Swyx [00:25:30]: Yeah
Akshat [00:25:30]: You as a user would be able to.
Swyx [00:25:32]: It's a useful thing to have because now everyone knows what layer you are, and you optimize for being the supercloud of all clouds.
Akshat [00:25:41]: Yeah. That's the idea. And so I guess when you mentioned co-location, that's another interesting thing where, one thing we've seen is people come to us when they want very specifically located CPUs or GPUs, like they want
Swyx [00:25:57]: Oh, they pin it in like
Akshat [00:25:58]: Yeah
Swyx [00:25:58]: EU?
Akshat [00:25:59]: Exactly. Or EU, US.
Swyx [00:26:01]: Right. Data resiliency
Akshat [00:26:02]: Australia
Swyx [00:26:02]: Locality thing or performance or what?
Akshat [00:26:04]: It's either data locality or latency, yeah.
Swyx [00:26:07]: Yeah.
Akshat [00:26:07]: Like, you want your. They're running sandboxes and model. They want them to be right next to a
Swyx [00:26:10]: Yeah, it's easy then
Akshat [00:26:11]: Yeah
Swyx [00:26:12]: To. That is important in all those things. And so, like, you've accidentally, I don't know if it's accident, but you've built the perfect primitive for agents to express themselves. And then it's almost very funny how every extra development just involves more file system, just involves more CPU.
Akshat [00:26:30]: Yeah.
Swyx [00:26:31]: Just like the things that you already have. I don't know much about, if there's any networking usages that are interesting, but you've also done some good work on networking.

Networking, Sidecars, Private IPv6, and Sandboxes

Akshat [00:26:40]: Yeah, that's exactly right.
Like, we're just taking compute, storage, and networking and building stuff on that layer for, again, the stuff people need.
Swyx [00:26:49]: Yeah
Akshat [00:26:50]: We see a few interesting networking things coming up. One is people want networked sandboxes. So we have
Swyx [00:26:57]: For like a Docker cluster type thing.
Akshat [00:26:59]: Yeah.
Swyx [00:26:59]: Sorry, Docker Swarm. Oh, f**k. What is it called?
Akshat [00:27:02]: Compose.
Swyx [00:27:03]: Compose type thing.
Akshat [00:27:04]: Yeah. So if you want Docker Compose, our sandboxes now support this thing called sidecars. A sandbox is a pod of containers, and you can run multiple containers in a sandbox. It's also useful because, going back to networking, people want a lot of control over outbound networking from a sandbox.
Swyx [00:27:23]: Yeah.
Akshat [00:27:23]: Like, they might wanna run a middle proxy for, like, maybe logging stuff for RL, or controlling how egress can happen to a domain, injecting credentials. And yeah. So we've had to build a lot of that stuff ourselves.
Swyx [00:27:38]: Yeah.
Akshat [00:27:39]: But then also sometimes people want sandboxes spanning multiple nodes to talk to each other, which is an emerging thing we're seeing. We have support for that for a different reason, and yeah, we'll see if that becomes stable.
Swyx [00:27:52]: Like, just an open socket. This is directly like mTLS.
Akshat [00:27:56]: We do support that, which is you can expose a tunnel inside a sandbox.
Swyx [00:28:01]: Yeah.
Akshat [00:28:01]: And then you can either expose it to the public internet or you can add, like, an HTTP auth layer above it. But we have this thing called I6PN, which we haven't talked about, which is this overlay network using IPv6 addresses. So Modal containers within the same workspace, when this is enabled, can address each other using this private IPv6 address, and no one else can.
Akshat [00:28:28]: So it's like private networking for containers.
We built it because we needed it as a primitive for our distributed training product. So we have this other feature, which is you can add a decorator to a function, and you get a cluster of GPUs, and they have RDMA networking. So you can run a distributed training job that's truly serverless, and we did the overlay network for that. But then we've seen that people are using it for other reasons, and I'm intrigued to see what people would do with it.
Swyx [00:28:59]: Build primitives and let people figure it out, right?
Akshat [00:29:01]: Yeah, exactly.
Swyx [00:29:02]: You put out a pretty interesting
Akshat [00:29:03]: They're like, they read the docs webpage. Let me use that
Swyx [00:29:06]: Yeah
Akshat [00:29:06]: Something they never intended to work. This is literally not even in our docs page. People somehow found it, and they're using it.

RDMA, Memory Movement, and Distributed Training

Swyx [00:29:12]: Huh.
Swyx [00:29:14]: The way you portrayed it with, like, RDMA versus TCP, like, very well laid out, but just the transfer speed change at scale for RL, like yeah, you have it built in. I'm sure someone found it to be a lot more efficient before you made a thing out of it, right?
Akshat [00:29:32]: Yeah. And not to split hairs, I guess the overlay network is the TCP overlay network.
Akshat [00:29:39]: The reason we have that is you need that to do the key exchange for RDMA before you set up the RDMA network on top of that. But then people found the TCP part.
Swyx [00:29:48]: Can I tell you, this is like a big aha moment for me because
Akshat [00:29:51]: Yeah
Swyx [00:29:51]: So I review 2,200 submissions for the World's Fair.
Akshat [00:29:56]: Yeah.
Swyx [00:29:57]: And then I got this from John Ousterhout
Akshat [00:29:58]: Huh
Swyx [00:29:59]: Do you know John Ousterhout by name?
Akshat [00:30:01]: The name sounds familiar.
Swyx [00:30:02]:
He's a well-known professor who published a lot of interesting software design books, and the talk he chose to submit is on RDMA at inference. And I'm like, you wouldn't think that this guy, who is like an operating systems guy, would care about RDMA.
Akshat [00:30:20]: It makes sense to me because
Swyx [00:30:24]: This is the cloud, right? Yeah
Akshat [00:30:25]: Like, the way you move around your KV cache and how efficiently you can do it, how efficiently you move your weights from your training GPUs to your inference GPUs in RL, there's a lot of degrees of freedom, and it is a systems problem
Swyx [00:30:41]: Yeah
Akshat [00:30:41]: Moving memory around
Swyx [00:30:42]: Yeah
Akshat [00:30:43]: Scheduling.
Swyx [00:30:44]: This shows you how primitive my understanding of networking stuff is.
Swyx [00:30:46]: Is this like the domain of WireGuard as well?
Akshat [00:30:50]: Not quite.
Swyx [00:30:51]: It's adjacent?
Swyx [00:30:53]: Explain everything.
Akshat [00:30:54]: Sure.
Swyx [00:30:56]: How do we move memory around GPUs?
Akshat [00:30:58]: Well, so sorry. Yeah, that is memory. Sorry, I was talking more, and maybe I was talking like five minutes back, about the private IPv6 addressing that you've set up.
Swyx [00:31:09]: Yeah.
Akshat [00:31:09]: Is it like a VPN?
Swyx [00:31:10]: Yeah, it is like a VPN, and yeah, WireGuard is, yeah, you're right. It is
Akshat [00:31:16]: Right. Yeah, you already moved on to new topics
Swyx [00:31:17]: A similar
Akshat [00:31:18]: Okay
Swyx [00:31:19]: In the same space. WireGuard is encrypted and this is
Akshat [00:31:23]: And you don't need encryption.
Swyx [00:31:23]: Yeah.
Akshat [00:31:24]: Yeah.
Swyx [00:31:24]: This is not encrypted. That's the main difference. This is TCP, and we have eBPF programs that will reject or allow the TCP connection based on whether you're allowed to do it.
Akshat [00:31:35]: Used to involve a full sidecar, but now you have eBPF in the Linux kernel.
Swyx [00:31:39]: Yeah.
Akshat [00:31:40]: Yeah.
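A rough sketch of the allow-or-reject idea from the exchange above, written as plain Python and not as an eBPF program. It assumes, purely for illustration, that each workspace maps to one private IPv6 prefix; the prefixes, workspace names, and function names are hypothetical and are not taken from Modal's implementation.

```python
import ipaddress
from typing import Optional

# Hypothetical mapping: one private (unique-local) IPv6 prefix per workspace.
WORKSPACE_PREFIXES = {
    "team-a": ipaddress.ip_network("fd00:aaaa::/32"),
    "team-b": ipaddress.ip_network("fd00:bbbb::/32"),
}


def workspace_of(addr: str) -> Optional[str]:
    """Return the workspace whose prefix contains addr, or None."""
    ip = ipaddress.ip_address(addr)
    for name, prefix in WORKSPACE_PREFIXES.items():
        if ip in prefix:
            return name
    return None


def allow_connection(src: str, dst: str) -> bool:
    """Allow a connection only when both ends sit in the same known workspace."""
    src_workspace = workspace_of(src)
    return src_workspace is not None and src_workspace == workspace_of(dst)
```

In this toy model, `allow_connection("fd00:aaaa::1", "fd00:aaaa::2")` is allowed and a connection from a `team-a` address to a `team-b` address is rejected. A kernel-level filter would make the same kind of decision per connection, without a user-space proxy in the path.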
I don't know if this is a natural follow-on, but my skepticism on distributed training is that people spend a lot of money on, like, cables to hook up GPUs, and even that is not fast enough, and that's the bottleneck. Is your networking fast enough?
Swyx [00:31:59]: Yeah. So I guess you're talking about fully distributed training like DiLoCo or something, which is like cross data center
Akshat [00:32:06]: That would be, yes.
Swyx [00:32:07]: That's the extreme.
Akshat [00:32:08]: Yeah.
Swyx [00:32:08]: You're in the middle, and then other people would have like the Mellanox cables up in, like, their actual data center.
Akshat [00:32:14]: When you run multi-node training on Modal, that's RDMA. I think Mellanox, or InfiniBand, is all seen as RDMA, but it's a way to bypass the TCP networking stack and transfer stuff much faster from one node to the other. And we have, I think, like 3 terabits per second of internal networking
Swyx [00:32:40]: Okay
Akshat [00:32:40]: Which is the standard that's needed.
Swyx [00:32:42]: Okay. So I misunderstood what
Akshat [00:32:43]: 50
Swyx [00:32:43]: What part of the stack you were
Akshat [00:32:44]: 50 gigs over
Swyx [00:32:45]: Yeah
Akshat [00:32:45]: If you went
Swyx [00:32:45]: Yeah
Akshat [00:32:46]: RDMA.
Swyx [00:32:46]: Okay.
Swyx [00:32:48]: Yeah. Very impressive work.

Multi-Node Training, Post-Training, and Auto Research

Swyx [00:32:52]: So effectively you're extending, like, the Modal philosophy to the training cluster, like, yeah.
Akshat [00:32:59]: Yeah. And we're not going for, like, large-scale training runs. The thing that we've built multi-node training for is, we see a lot of smaller-scale post-training. Like, people are post-training medium-sized foundation models so they can get higher quality on inference. This is a perfect fit for something like that.
Swyx [00:33:21]: Yeah.
That is my impression of how a lot of these labs explore branches in post-training and then eventually merge whatever they find in.
Akshat [00:33:31]: Yeah. The other use case we've seen for multi-node training is even if you have a big cluster, your researchers are still doing small runs
Swyx [00:33:38]: Yes
Akshat [00:33:39]: Having elasticity there
Swyx [00:33:40]: Right, sure
Akshat [00:33:40]: Matters a lot more.
Swyx [00:33:41]: Yeah. Like, this is the current limiting factor for auto research, which is that you need to give your model some GPUs in order for it to completely run.
Akshat [00:33:51]: We have a blog post on auto research, and Modal
Swyx [00:33:55]: Yeah
Akshat [00:33:56]: Yeah, like, turns out to be a pretty good substrate for that.
Swyx [00:33:59]: So my impression is auto research means many things, like
Akshat [00:34:01]: Yeah
Swyx [00:34:01]: Anything that Andrej coins. Right now it's still science fair, right? Like, I don't know how many people are doing this.
Akshat [00:34:08]: We're having a golf.
Swyx [00:34:08]: Yeah.
Akshat [00:34:09]: I thought the same thing.
Swyx [00:34:11]: Yeah, you would know.
Akshat [00:34:12]: Our internal training and inference teams both use the general shape of this quite a bit. Like, we have this one internal repo called auto inference, where essentially we've automated our own forward-deployed engineering efforts using this harness: the agent will just spin up a sweep of different things. It'll even run, like, the NVIDIA Nsight profiler, and it'll tweak configs, and it'll arrive at the right thing. It'll change your GPUs from H200 to B200, and it works really well.
Swyx [00:34:47]: Nice.
Akshat [00:34:47]: So yeah.
Swyx [00:34:48]: By the way, I enjoy that your forward-deployed engineering is so technical that you have to do these things.
Swyx [00:34:52]: It's very different from forward-deployed engineering from other people.
Akshat [00:34:54]: Yeah.
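To make "spin up a sweep of different things" concrete, here is a minimal sketch of a configuration sweep. It is a plain exhaustive grid search, not the agent-guided harness described in the conversation, and the scoring function is a made-up stand-in for a real benchmark run; its numbers say nothing about actual hardware.

```python
from itertools import product


def sweep(benchmark, grid):
    """Score every combination in grid and return (best_config, best_score)."""
    keys = list(grid)
    best_config, best_score = None, float("-inf")
    for values in product(*(grid[key] for key in keys)):
        config = dict(zip(keys, values))
        score = benchmark(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Invented throughput model, used only so the sketch is runnable."""
    gpu_factor = {"H200": 1.0, "B200": 2.0}[config["gpu"]]  # hypothetical values
    batch = config["batch_size"]
    return gpu_factor * batch / (1 + batch / 64)


best, score = sweep(toy_score, {"gpu": ["H200", "B200"], "batch_size": [32, 64, 128]})
print(best, round(score, 2))
```

An agent-driven version would replace the exhaustive loop with a policy that decides which configuration to try next based on earlier results and profiler output.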
For our forward-deployed engineering team is, essentially they're like applied inference researchers or applied training researchers.Swyx [00:35:02]: Someone told me like they have to be able to build, but they also have to be able to sell. do they have to sell or are they like they're good, they're just like post-sale type of thing?Akshat [00:35:09]: It does, being able to talk to a customer and engage effectively with themSwyx [00:35:13]: YeahAkshat [00:35:13]: Matters a lot.Swyx [00:35:14]: They want the same thing.Akshat [00:35:15]: Yeah.Swyx [00:35:15]: ?Akshat [00:35:15]: But it's it's not really a sales, thing. We pair them with-- We have solution architects as well that are more on the sales side.Swyx [00:35:23]: Okay. Let's spend a bit more time on auto research. This is a big focus for for this year. Where does this go? like, have people explored enough? Like, there's all these beautiful charts of like improve and then level off a bit and then you find the next thing. Is this one abstraction up from normal training? Is that how we think about it, or do you think about it differently? Like model level training versus high, like driven hyperparameter search.Auto Inference and Modal BenchAkshat [00:35:51]: Yeah, like,Swyx [00:35:51]: Someone, some people call it like neural architecture search or whatever, right? Like.Akshat [00:35:54]: Yeah, - So the stuff I've seen people do with it is nowhere on the architecture level. It's pretty much tweaking parameters, but it's it's a hyperparameter sweep that's guided by some model intuition, so it's like much more efficient than, whatever other, sweep you would have.Swyx [00:36:12]: Yeah, it's just, it's just a question of where you want to spend your compute?Akshat [00:36:16]: Right.Swyx [00:36:16]: ‘Cause yeah, you can just throw infinite amounts of money on this and somehow you'll bang out Shakespeare?Akshat [00:36:22]: Yeah, infinite monkey.Swyx [00:36:24]: Yeah, so like the very good for model. 
And I think it's also very important that agents can spin up other agents, can spin up their infrastructure. Like, very good for you. How good are LLMs at generating Modal code? Like, the benefit of existing LLMs is that you are in the data.
Akshat [00:36:42]: Yeah. They're surprisingly good. I think, like, pre-Claude 4 they were not, and now they're able to one-shot stuff out of the box. But we're playing around with releasing, like, a Modal Bench for, like, the harder
Swyx [00:36:55]: Yeah
Akshat [00:36:55]: Things that the LLMs cannot do yet, and maybe
Swyx [00:36:59]: What's an example of that?
Akshat [00:37:01]: I think the thing that agents sometimes struggle with, without the right guidance and a skill, is how to use the rest of our observability. Like, something is failing: how do you look at the logs and then update the right thing? It's reasoning about that. But they're able to one-shot, like
Swyx [00:37:23]: Yeah. You can just add a skill to it?

Compute Strategy and Capacity Planning

Akshat [00:37:26]: Yeah. So we have a Modal skill now, which is why we built this Modal Bench. It's to find things like that, so we can address them in our tool.
Swyx [00:37:35]: Tune a skill. Yeah.
Akshat [00:37:36]: Yeah.
Swyx [00:37:36]: No, it's good. Are you facing any shortages? Like, we talk a lot about GPU shortages, but also CPU, also memory.
Swyx [00:37:44]: Yeah.
Akshat [00:37:45]: We have had a lot of growth, which means that we've had to be much better about
Swyx [00:37:53]: Planning
Akshat [00:37:54]: Proactive capacity planning.
Swyx [00:37:55]: Yeah.
Akshat [00:37:55]: So we have
Swyx [00:37:57]: Which, by the way, is like an MBA's dream
Akshat [00:38:00]: Yes
Swyx [00:38:00]: Is like just planning this stuff. I think last time you and I talked about something maybe about this.
Akshat [00:38:03]: Yeah. We have a really competent team of people. The role is called compute strategy.
so yeah, if anyone listening here or wants to work on thatSwyx [00:38:13]: Compute strategy?Akshat [00:38:13]: Yeah.Swyx [00:38:14]: I think,Akshat [00:38:14]: I feel like,Swyx [00:38:15]: I think the normies call it FP&A or something.Akshat [00:38:18]: Well, it's more It's it's not FP&A. It's it's There's a lot of interesting financial questions of like what is the blend between one year and three-year reservations? how do we forecast our own capacity? how do we. especially since our capacity is very fungible across different GPU types and different regions, like you have to model a lot of it. and you also have to have an opinion on how the supply chain is gonna evolve, and then you have to like, take bets,Swyx [00:38:49]: YeahAkshat [00:38:49]: Based on that.Swyx [00:38:50]: Tokenomics.Akshat [00:38:50]: Yeah.Swyx [00:38:51]: This is like probably a not a real point, but, I was trying to think about like what other industries. I was trying to think about like, we cannot be first to like these kinds of problems.Akshat [00:38:59]: Yeah.Swyx [00:39:00]: And what other industries have had this? And I was like, airlines with fuel and like they have to hedge their fuel and like, I think for a long time Southwest because they made like a hero fuel bet, they like were like super low cost becauseAkshat [00:39:12]: OhSwyx [00:39:12]: Compared to everyone else.Akshat [00:39:14]: Yeah. I hadn't thought about that.Vibhu [00:39:16]: We're at a fun time too?Akshat [00:39:18]: Yeah. It's. A lot of the compute business in general, for us is also about being very good about capacity management. That is how you have great unit, economics. but also over time it's how you can unlock more value for customers. 
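The "blend between one-year and three-year reservations" question can be framed as simple arithmetic. The sketch below is an illustration only: the function names are hypothetical, and any rates or utilization figures plugged into it are the caller's assumptions, not numbers from the conversation.

```python
def blended_rate(share_three_year, rate_one_year, rate_three_year):
    """Average hourly rate when a share of capacity is on three-year terms."""
    if not 0.0 <= share_three_year <= 1.0:
        raise ValueError("share_three_year must be between 0 and 1")
    return share_three_year * rate_three_year + (1.0 - share_three_year) * rate_one_year


def cost_per_used_hour(hourly_rate, utilization):
    """Effective cost of each hour actually used, given average utilization."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_rate / utilization


# Illustrative inputs only: half the fleet on three-year terms, 50% utilization.
rate = blended_rate(0.5, rate_one_year=3.0, rate_three_year=2.0)
print(rate, cost_per_used_hour(rate, utilization=0.5))
```

Longer commitments lower the blended rate but raise the risk of paying for idle capacity, which is why the forecast of utilization matters as much as the price.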
Like, one of the things we're building now is like a way for customers to get, If they don't care about latency, like get much cheaper pricing and they'll get results back in like next 24 hours or something, like a batch tier essentially.Batch Tiers and Latency-Insensitive WorkloadsSwyx [00:39:47]: Yeah.Akshat [00:39:47]: And those are levers we have because we control the whole stack and scheduling and whatnot to give people a sufficientSwyx [00:39:53]: Yeah. I feel like they're not as popular. Like those, like the Frontier Labs have all those APIs. They're not as popular as they should be.Akshat [00:40:00]: The demand that we see for something like that is not for LLMs. although sometimes people wanna run evals andSwyx [00:40:08]: OkayAkshat [00:40:08]: Synthetic data prep and there it makes sense.Swyx [00:40:10]: Okay.Akshat [00:40:11]: But it's from a lot of LLM companies, like people who are doing computational bio, like they have to run really big batch jobs and they don't care about when they get it back.Swyx [00:40:22]: Yeah. And like they have a reasonable. It's it's also like a cousin to the stopping problem of like, will this finish in time?Akshat [00:40:30]: Yeah. You can bound it.Swyx [00:40:33]: Yeah.Akshat [00:40:33]: Like you can give peopleSwyx [00:40:34]: YeahAkshat [00:40:34]: SLAs on it.Swyx [00:40:35]: Yeah. I think what's, what's interesting is like the next phase of model.Swyx [00:40:38]: Like what, do people expect from you, now that you're established and you're like well-known compute player among all these leading companies. You had an inference launch week, and we talked a little bit about the launches. like what else? Like what else should people know?What Modal Builds NextAkshat [00:40:55]: We are building primitives that make our users' lives much easier. So, I think for example, with LLM inference, thousands more companies are gonna post-train their own models and, deploy open source models for inference. 
so we're thinking a lot about what is the best product shape for that. And, that involves everything from our training gym to, then, endpoints that get frontier-level performance. again, but I haven't talked to anyone. It looks somewhat different on other verticals. Like, we're also seeing a lot of real-time, audio-video stuff in there, which is why like, we're working on things like regional routing, with fallbacks. So you can get GPUs that are as close to users as possible. so you get like low latency for video streaming and whatnot. And then on the agent side, it's,Akshat [00:41:52]: We're still working very closely with our customers because stuff is changing so fast in terms of what they need. And, I think beyond sandboxes and persistent file systems, there's a lot of other things people will need from this agent stack as they build production agents. So yeah, we're thinking about those other things that fit in there.Swyx [00:42:13]: I want to ask what the other things are.Akshat [00:42:15]: Yeah. I probably should share right now.Swyx [00:42:17]: I think-- I think, okay, so, I do think a lot about the principal components of cloud, and you do talk about compute storage networking.Akshat [00:42:25]: Yeah.Swyx [00:42:25]: Because so far for me, it's fine. so far for the. the first couple generations of cloud, it's fine. What's different, qualitatively different about agents that you need some new permission level? Like a lot of people, okay, and I'll just kinda spew tokens at you until it like hopefully sparks something.Akshat [00:42:43]: Yeah.Swyx [00:42:44]: Like the new level now is whatever Claude Code does, which is dangerously scope permissions or like allow list by command or like whatever, right? And sometimes they're like, “Well, okay, we have like this adaptive thinking mode where like, just trust me, bro. I will make the calls for you.” Is that it? like mediated permissions.Hard Guardrails vs. 
LLM-Mediated PermissionsVibhu [00:43:03]: Now you're looping it with a goal and letting it roll.Akshat [00:43:06]: Yeah, I'm, I'm skeptical of LLM media permission for stuff that is at the sandbox level because you do want hard boundaries.Swyx [00:43:16]: Yeah.Akshat [00:43:16]: Otherwise, someone can exfiltrate stuff.Swyx [00:43:20]: But likeAkshat [00:43:20]: YeahSwyx [00:43:20]: Maybe that's old school thinking. Maybe we're the dinosaurs.Swyx [00:43:23]: Maybe the AI OS or the LLM OS is really the kernel is a goddamn LLM.Swyx [00:43:30]: Like it makes you feel uncomfortable.Akshat [00:43:31]: Yeah, I'm, I'm toldSwyx [00:43:32]: But that's what trusting the LLM is. Like imagine a spherical cow perfect LLM.Akshat [00:43:36]: Right.Swyx [00:43:37]: That it.Akshat [00:43:39]: Maybe.Swyx [00:43:41]: I wanna test the boundaries, right?Akshat [00:43:42]: Yeah.Swyx [00:43:42]: Like, and I don't believe that, but I wanna see where I'm wrong ‘cause that's, that's the consensus.Akshat [00:43:49]: Yeah. I think you always need hard guardrails when you want, And you can pair those with softer guardrails, right? And that's gonna be a lot of mediated.Managed Agents and Specialized SandboxesSwyx [00:44:00]: There. I'll also get you a end with a couple of your commentary on like the ecosystem outside of Modal. Manage agents. Everyone has one. Gemini, OpenAI, Claude, very useful for you, but also like it is their way of starting to edge into your space.Akshat [00:44:17]: Yeah.Swyx [00:44:17]: What's going on?Akshat [00:44:19]: Yeah, we're, very excited to partner with Anthropic and some of the other foundation labs, will not name who we're also working with. the way we see it is the manage agent thing is a great place to start if you're starting out building an agent and, But then when you get to, building something more production grade, like you're a company that's like Ramp that's building their own, Ramp also runs their accounting agent on us, so their external-facing agent. 
You need a lot more control over, your compute primitive on things like, what sort - how do you persist different files that the agent has access to, and how do you snapshot and restore? How do you control the networking? maybe you want GPUs. When you get to that point, you kinda want, a specialized sandbox provider, that gives you those things, and that's the role that we are trying to play.Swyx [00:45:15]: YeahAkshat [00:45:16]: We don't really have an opinion on the harness, whether it runs - it's a cloud-managed agent, and you hook it up to Model Sandbox, or you run the harness in Model Sandbox. We'll see where people converge with that.Swyx [00:45:26]: Yeah. Do you any opinions on like the meta harnesses, or just another layer on top of these things?Akshat [00:45:31]: You mean like the OpenPipeSwyx [00:45:33]: OpenPipe is one. I think Vercel had one, which I can't remember the name of right now. Fredshot had one. and then, to me, most recently was Data Databricks that had Omnigen. All these are meta harness. Like it's kinda pseudo agent cloud type things.Akshat [00:45:50]: I personally have not played around with them.Swyx [00:45:53]: Yeah.Akshat [00:45:53]: Build agents with them.Swyx [00:45:54]: Everything's bullish Modal, as long as it consumes more infra.Akshat [00:45:57]: That's why we're focusing on the infra layer. It's somewhere where our, relative competence is and, also it's a hard problem to solve.Swyx [00:46:06]: Yeah. I will say like just generally reflecting on that, I don't know if - if there's other topics on Modal, but like just generally reflecting as an infra person, not as intense as you, but in that field, this has like been the most exciting time in infra. Like it was boring for a while, and you couldn't really get people excited about data infrastructure. 
Like Eric would get on Data Console, everyone just watched the video and like say, “Look at how many sandboxes I can spin up,” and no one gave a crap.Why Infrastructure Became Exciting AgainAkshat [00:46:39]: Yeah.Swyx [00:46:40]: And like now everyone gives a crap.Akshat [00:46:42]: That's true. It is a very exciting time, and I think a lot of that's driven by just the amount of scale all of this stuff needs.Swyx [00:46:50]: I think the, like a lot of your initiatives or a lot of your like product directions make sense in retrospect, which is like the best kind, but I wouldn't necessarily have thought about it myself, which.Akshat [00:47:00]: We need the predictions.Swyx [00:47:02]: I think there's a lot that you just don't even see, right? Like you have the batch, you have the voice, you have the multimodal, but what else?Akshat [00:47:10]: What else is coming up for usSwyx [00:47:11]: Yeah. Where do you see things going?Akshat [00:47:13]: Yeah. I, in generalBiotech, Robotics, and Non-LLM AI WorkloadsAkshat [00:47:15]: It's it's clear that there's there's a huge shift happening. I think one thing that's not as obvious to people because LLM inference gets talked about so much and is also we work a lot of companies that are, doing things like drug discovery and computational bio, like the Chai Discoveries of the world. Big things are probably gonna happen there. we work a lot of robotics companies that are putting robots in like active deployments and getting good results out of them.Swyx [00:47:45]: Is there Air Gap Modal? Is there a version that is like prem air gapped whatever?Akshat [00:47:50]: No. We,Swyx [00:47:51]: You should cloud only.Akshat [00:47:51]: Yeah.Swyx [00:47:52]: Yeah. Okay. 
But yeah, so what you're saying is like because you're focused on primitives and they're good primitives, you find use cases in all these kinds of things.Akshat [00:48:01]: Yeah.Swyx [00:48:01]: Probably diversifies you a little bit away from LMS all the time.Akshat [00:48:05]: Yeah, absolutely. We're, we'- our goal isn't to only serve the LLM inference market.Swyx [00:48:10]: There are a lot just on the website, the audio,Akshat [00:48:12]: Yeah. We said both onSwyx [00:48:14]: Computational bio images. Yeah, there's a lot here. There's QTA TTS, customizing. Oh, Chatterbox. there was customizing Whisper.Akshat [00:48:24]: Okay. Yeah.Swyx [00:48:25]: This screen reminds me of a fallen competitor, which Replicate.Model APIs vs. Differentiated AI ProductsSwyx [00:48:31]: What's your postmortem on what happened?Akshat [00:48:34]: This is one thing we've stayed away from is providing an API for models because I think providing model APIs is some of it ends up serving like a really hobbyist market, which is much less sticky.Swyx [00:48:50]: Yeah.Akshat [00:48:50]: And we've always wanted to build for companies that are building products and need more flexibility that's not just an API.Swyx [00:48:57]: Which you can build an API for a model and this is clearly what it is. But you - but what you're saying, you can wrap it into a more fully functioning back end that you run.Akshat [00:49:06]: Yeah. So all of our examples, it's not that spin up this model, here's an API token, use it. They're all code.Swyx [00:49:13]: Okay.Akshat [00:49:13]: And so the point is that this is just an example.Swyx [00:49:16]: Starter code.Akshat [00:49:17]: Yeah. 
But you can tweak it however you want.Swyx [00:49:20]: Yeah.Akshat [00:49:21]: And if you're like a company building a product, like, computational bio whatnot, yeah.Swyx [00:49:26]: I guess I'm trying to tease out for listenersAkshat [00:49:28]: YeahSwyx [00:49:28]: When does it stop becoming, oh, you're just an API call and you're just a wrapper on API to becoming what you call a product, right?Swyx [00:49:36]: Like, what is that layer? Like what-- Like, more lines of code, but like beyond that, what is the substance that people add that qualifies it to be something more?Akshat [00:49:46]: I think there's a little bit of like a selection effect of like a lot of the companies who do wanna get deeper into that level are probably building something that's more differentiated. And, I think, an example is like - with LLM inference, originally we, worked with companies that were building their own post-training frameworks or they were, - Ramp early in the day was training their own tokenizer and like swapping out the tokenizer in Llama and whatnot. I'm not saying that's, that successful, in that case. But a better example is like, let's say Suno. because Suno, does not use Modal for training.Swyx [00:50:26]: Mikey on the pod. 
Yeah.Akshat [00:50:27]: But they use Modal for all their inference and that's because they have like a custom-- They have completely custom model architecture and that means that they have to be at the code level and tweak things that are not, just an API.Swyx [00:50:41]: It's interesting as well, like we had, Ethan, most recently on the xAI Groq team make a prediction that like the next tier in video gen is not a better video model, it's a better model or agent that orchestrates video models.Video Agents and Production WorkflowsAkshat [00:50:56]: Oh, interesting.Vibhu [00:50:56]: Language model backbone that can use toolsAkshat [00:50:58]: RightVibhu [00:50:59]: And write code.Akshat [00:51:00]: Like, yes, I can make my second video or my second video from Groq, but I want my minute video.Akshat [00:51:06]: And I'm not going there through normal video gen.Swyx [00:51:10]: Yeah, that's interesting. I - So we have GPU sandboxes and recently have seen a few companies doing agents that do video manipulation or,Akshat [00:51:22]: Yeah. Give it FFmpeg and just do it.Swyx [00:51:23]: Run FFmpeg. But likeAkshat [00:51:25]: That's not enough.Swyx [00:51:25]: Yeah.Akshat [00:51:26]: You need to give it Adobe.Swyx [00:51:27]: Yeah, I hadn't put it together with like it would be a video production thing. in my mind these things were going more towards editingAkshat [00:51:36]: Yeah.Vibhu [00:51:36]: Well, shout out Mantis.Akshat [00:51:37]: I think about this a lot.Swyx [00:51:38]: .Akshat [00:51:41]: Yeah. Sorry.Vibhu [00:51:41]: Luma. Luma Agent is a version of this for video production, but it's a off.Swyx [00:51:46]: I was gonna get your quick takes, on some other stuff that happensGitpod/Ona, CI, and Runtime SandboxesSwyx [00:51:50]: In recent news and just-just see if you have anything interesting. Gitpod, very li

ceo spotify world ai english moving future running story simple elon musk european union preparing open model language chatgpt mba networking production manage auto cloud shakespeare agent metal infrastructure explain cto whispers eagle southwest gemini openai evolve nvidia robotics adobe rust scheduling api proactive collect starter dominant python ui aws dialog ml linux daytona llama iso apis vpn runway llm synthetic anthropic temporal cognition astral railways gpu cpu qualcomm ramp loops modular ls rahul codex docker kv kubernetes gpus dx mantis sdks ona insanely ax alessio elastic speculative lms suno compute andrej rl computational ci cd rodin modal sidecar cpus luma replicate cli series c typescript inference databricks dsl compose tcp etl ipv6 cog gtc chatterbox networked hacker news slas tokenomics yaml groq locality ebpf sandboxes a100 wireguard akshat ffmpeg brett taylor mellanox cosine infiniband

When the answer is never ‘no’: The world of entertainment travel (Global Travel Collection, part 2)

Trade Secrets Podcast

Jul 6, 2026 46:37

The world of entertainment travel can be similar to both luxury leisure and corporate travel, but don’t be fooled: it’s really a beast of its own. This week on Trade Secrets, delve into the world of entertainment travel and everything it entails, from multimillion-dollar contracts and high security clearances to catering to multi-page hotel riders and booking transfers for dogs. Co-hosts Emma Weissmann and Jamie Biesiada are joined by MD Travel Services’ Michael Dovey, Travel Princess Inc.’s Wendy Nyi-Baylock and ALTx Collective’s Corey Slee, all advisors under Global Travel Collection’s entertainment division. This episode was recorded during GTC’s Arrive conference in Austin in June. This episode was sponsored by National Geographic Lindblad Expeditions. Further resources: ALTx Collective on Instagram; Wendy Nyi-Baylock and Travel Princess Inc. on Instagram; MD Travel Services on the web and Instagram. Need advice? Call our hotline and leave a message: 201-902-2098. Email us: tradesecrets@travelweekly.com. Theme song: Sock Hop by Kevin MacLeod. Link: https://incompetech.filmmusic.io/song/4387-sock-hop License: https://filmmusic.io/standard-license See omnystudio.com/listener for privacy information.

collection kevin macleod arrive trade secrets gtc global travel

How the GTC Host Agency Is Seeing Incredible Growth

The Insider Travel Report Podcast

Jul 3, 2026 10:38

Josh Stevens, senior vice president of strategy and growth for Global Travel Collection, talks with James Shillinglaw of Insider Travel Report at last month's GTC ARRIVE conference in Austin, Texas. Stevens says GTC is seeing incredible growth in average daily rate across all GTC hotel bookings, with sales up nearly 10 percent in the luxury corporate, luxury leisure and entertainment segments. Demand for luxury travel keeps going up in all segments, and Stevens notes cruise sales are up nearly 20 percent for luxury ocean, river and yacht categories. For more information, visit www.globaltravelcollection.com. All our Insider Travel Report video interviews are archived and available on our Youtube channel (youtube.com/insidertravelreport), and as podcasts with the same title on: Spotify, Pandora, Stitcher, PlayerFM, Listen Notes, Podchaser, TuneIn + Alexa, Podbean, iHeartRadio, Google, Amazon Music/Audible, Deezer, Podcast Addict, and iTunes Apple Podcasts, which supports Overcast, Pocket Cast, Castro and Castbox.

spotify texas google growth incredible agency stitcher castro stevens podbean deezer podchaser podcast addict player fm pocketcast listen notes gtc amazon music audible josh stevens insider travel report

Where Global Travel Collection Grows Great Luxury Advisors

The Insider Travel Report Podcast

Jul 1, 2026 10:26

Ragan Stone, senior vice president of In the Know Experiences, part of the Global Travel Collection (GTC), talks with James Shillinglaw of Insider Travel Report, at GTC's ARRIVE conference in Austin, Texas, last month about how her unit fits in with the rest of the GTC luxury host agency. In the Know, which offers bespoke vacation planning, VIP event access (like sold-out concerts and fashion shows), and custom corporate entertainment worldwide, is now where many advisors go to train to become great GTC luxury travel sellers. For more information, visit www.globaltravelcollection.com.

Who Helps GTC Travel Advisors Stay Successful in Selling Travel

The Insider Travel Report Podcast

Play Episode Listen Later Jul 1, 2026 8:06 Transcription Available

Simon Brooks, senior vice president-advisor success for Global Travel Collection (GTC), talks with James Shillinglaw of Insider Travel Report at last month's GTC ARRIVE conference in Austin, Texas, about his role with the luxury host agency. Brooks oversees the relationship GTC's 1,500 independent contractors have with the host agency, helping them work with preferred suppliers and sell more travel. Essentially, his job is to make every advisor successful in selling luxury leisure, corporate and entertainment travel. For more information, visit www.globaltravelcollection.com.

spotify texas google travel selling helps stitcher castro advisors podbean deezer podchaser podcast addict player fm pocketcast listen notes gtc amazon music audible insider travel report simon brooks

How Global Travel Collection Came Together in Austin

The Insider Travel Report Podcast

Play Episode Listen Later Jun 30, 2026 11:55 Transcription Available

Angie Licea, president of Global Travel Collection (GTC), talks with James Shillinglaw of Insider Travel Report at this month's second annual ARRIVE conference in Austin. Licea tells us how GTC has successfully unified all its former agency brands—Protravel International, Tzell Travel and Altour—under GTC, which is now effectively the largest luxury travel host agency in the world. She also tells us about the significance of ARRIVE in helping to create a new community of luxury travel advisors. For more information, visit www.globaltravelcollection.com.

spotify google stitcher collection castro podbean arrive deezer podchaser podcast addict player fm pocketcast listen notes gtc global travel amazon music audible insider travel report

An inside look at Global Travel Collection, the $2.4 billion powerhouse host agency (part 1, feat. Angie Licea)

Trade Secrets Podcast

Play Episode Listen Later Jun 22, 2026 56:55

In this episode of Trade Secrets, recorded in person at Global Travel Collection’s Arrive conference at the Fairmont Austin in June, get an update on all things GTC from president Angie Licea. From the labor of (mostly) love that was the GTC brand unification, uniting several agencies that sell $2.4 billion annually, to the Legacy in Motion program, which helps advisors both grow and sell their businesses, Licea offers an inside look at everything going on at the New York-based host agency. Then, she takes on several listener questions, ranging from breaking into ultra-high-net-worth travel to top holiday destinations. And for those who haven’t seen the ending of “Survivor 50,” beware of a spoiler at the 54:25 mark. This is the first part in a two-part series of episodes recorded at Arrive. In two weeks, tune in for an episode all about entertainment travel. This episode was sponsored by Brendan Vacations. Further resources Global Travel Collection on the web Mentioned in this episode: Austin’s Congress Avenue Bridge Bats The latest from GTC: Brand unification TV show “1st Look Presents – Extra Mile Club” Legacy in Motion program Circle program New-to-industry training program Need advice? Call our hotline and leave a message: 201-902-2098 Email us: tradesecrets@travelweekly.com Theme song: Sock Hop by Kevin MacLeod Link: https://incompetech.filmmusic.io/song/4387-sock-hop License: https://filmmusic.io/standard-license See omnystudio.com/listener for privacy information.

tv new york survivors circle billion agency collection motion kevin macleod powerhouses arrive inside look trade secrets gtc global travel

Why Social Media Lost in Court and AI Agents Demand Total Surveillance ft. Shelley Palmer

This Week in XR Podcast

Play Episode Listen Later Jun 19, 2026 53:47

Shelley Palmer, media technologist, advisor, and author with over 700,000 daily newsletter subscribers, returns to the show. He's one of the sharpest thinkers writing about AI today, and this conversation covers the full arc: from social media liability to the trust collapse coming for all of us, and into the real productivity gains and surveillance trade-offs of living inside an AI-first workflow.
The episode opens with the Google and Meta lawsuit verdict and quickly moves past the legal question. Shelley's position is precise: you can't legislate parenting, but you can legislate transparency, and the tech industry has failed on that front entirely. The $6 million judgment against Meta and Google is a rounding error — not a deterrent. What matters is what platforms actually engineered: engagement above all else, backed by neuroscience, probabilistic math, and dopamine feedback loops optimized for shareholders, not users.
AI XR News You Should Know: OpenAI is ending Sora and pivoting hard to Codex and enterprise. Ben Affleck secured $900 million from Netflix for a custom AI filmmaking tool. Epic Games cut 1,000 jobs as Fortnite loses audience. NVIDIA's Jensen Huang introduced Nemo Claw and Open Shell at GTC — a corporatized framework for personal AI agents.
Key Moments
[00:01:15] – Charlie opens noting the show missed one episode in nearly 300 — his daughter's wedding
[00:01:55] – OpenAI kills Sora; the Critters director goes dark before the episode
[00:04:45] – Google and Meta lose their social media addiction lawsuit; Meta also loses in New Mexico
[00:08:07] – Shelley on what can actually be legislated: not parenting, but transparency
[00:11:42] – Shelley on Zuckerberg: he genuinely believed connection would be net positive; ask him today
[00:13:31] – "Planetarily net negative. No matter what good it does, it does more harm."
[00:18:16] – Rony on dopamine engineering: neuroscientists studying pixel size, color, sound to refine addiction
[00:19:40] – Shelley reframes it: engagement maximization for shareholders, no more insidious than that
[00:23:19] – The physiological change argument: humans evolved to default to trust; AI-generated everything breaks that
[00:31:50] – Rony's counterpoint: trust will reset local; the software ecosystem will follow
[00:36:53] – Shelley: "Our business increased last year. Everyone on my staff is doing 400 times the work."
[00:44:42] – AI-first means automating every workflow you can honestly automate — and knowing what isn't ready
[00:45:06] – Jensen's Nemo Claw and Open Shell: the safer path to personal AI agents, and what it actually costs
[00:49:42] – The surveillance trade-off: an effective AI agent requires more personal data exposure than anything before it
[00:51:24] – Apple's Secure Enclave play: why Tim Cook may win the AI trust war in the end
The productivity gains are real, but so is the privacy exposure, and the systems that earn trust — at every level — are the ones that will survive.
This episode is brought to you by Zappar, the company behind Mattercraft — the leading visual development environment for building immersive 3D web experiences across mobile, headsets, and desktop. Mattercraft now features an AI assistant that helps you design, code, and debug in real time, right in your browser. Start building at mattercraft.io. Subscribe to the AI XR Podcast wherever you listen. Watch the full episode for the full breakdown. Available where podcasts are. Full videos available on YouTube. https://youtu.be/S_AECjELYyo Hosted on Acast. See acast.com/privacy for more information.

netflix ai social media google apple lost 3d court new mexico acast mark zuckerberg fortnite openai ben affleck nvidia surveillance epic games tim cook sora critters codex key moments jensen huang rony gtc zappar shelley palmer

Ferrari 330GTC: The Story of a Thoroughbred, with Maurice Khawam (Encore)

Horsepower Heritage

Play Episode Listen Later Jun 17, 2026 73:53

If the Ferrari Luce isn't your thing, then you'll appreciate the story behind one of Ferrari's greatest creations, the 330GTC. For many years, Enzo Ferrari's road cars were mostly thinly disguised racing machines. Motor racing, after all, was the company's raison d'être. But there came a time when its patrons wanted more refinement and sophistication. That car came in 1966 as the Ferrari 330 GTC. Soon it was proclaimed the finest Ferrari road car ever built. In this encore episode, author Maurice Khawam discusses his book, "Ferrari 330 GTC: Elegance and Pedigree". It's an exhaustive and painstaking work that will long serve as a reference for restorers, collectors and concours judges, as well as those who seek to understand the history of the Ferrari brand and the talented individuals behind it.
SUPPORT THE PODCAST
SUBSCRIBE to Horsepower Heritage on YouTube
FIND US ON THE WEB
Support the show
HELP us grow the audience! SHARE the Podcast with your friends!

ferrari motor thoroughbreds pedigree gtc enzo ferrari showhelp

Perseverance in the Face of Adversity: From Geopolitics to the Gut Check with George Tagg Jr. | Ep 56

The Unleashing Leaders Podcast

Play Episode Listen Later Jun 10, 2026 47:12

Setbacks don't send a calendar invite. They show up at 6 a.m. on a Teams call. In a hospital room. On a battlefield. George Tagg Jr. has met adversity at every one of those addresses. And every single time, he found a way forward. Attorney. Former DOJ Nazi hunter. A State Department and DOD official who negotiated in the Caucasus, was on the ground in Kyiv when Russia's invasion began, and managed the deconfliction line between U.S. and Russian forces in Syria. George has served across multiple presidential administrations. He is now founder and CEO of GTC 360 Advisors. And writing a book on resilience, including a chapter on dealing with bullies. Starting with Putin. But if you ask George, his hardest moments were never on a battlefield. Losing his grandparents young. Then his father, one of the most magnetic lobbyists in DC during the 90s, gone suddenly to a heart attack six months after 9/11. George was 20. He could have gone the other direction. Instead, he picked up the legacy, held onto the lessons, and kept going. That is what this episode is really about. Not the geopolitics. The process. A repeatable, human way of turning your worst moments into the thing that moves you forward. If you have ever faced a moment that felt unsurvivable, this one is for you. Additional Resources: Connect with George on LinkedIn Learn more about GTC 360 Advisors Attend Unleashing Leaders University! Sign up for our newsletter! 
Learn more about Unleashing Leaders Follow Unleashing Leaders on LinkedIn Connect with Lee on LinkedIn Follow Unleashing Leaders on Facebook Follow Unleashing Leaders on Instagram Key Takeaways: Lick Your Wounds, Then Move: why giving yourself grace after a setback is not weakness, it is the first step The Blamestorm Trap: how blaming others feels good for a minute and keeps you stuck indefinitely The 180 Turn: George's three-phase framework from retrospective grief to forward-facing momentum The Rabbi Rule: why your results will never speak for themselves and the connective tissue that keeps you standing The Rule of Threes: why trying something once tells you almost nothing and how to find what reignites you Compounding Works Both Ways: small investments in action compound powerfully

News: autumn for the LLM market?; how not to build metrics; the local inference of the future

Pi Tech

Play Episode Listen Later Jun 10, 2026 64:37

In this episode we discuss the changing mood in the AI industry. Companies talk less and less about fully replacing people with AI and increasingly position it as a tool for raising productivity. We talk about NVIDIA RTX Spark and the growing interest in local inference, the problems with using tokens as a measure of efficiency, the real cost of adopting AI for business, and a possible cooling of the market. We also look at claims that bots dominate web traffic, the problems with AI support at Meta, news from Google, and Yann LeCun's JEPA concept as one potential alternative to today's LLMs.
00:40 - overview of GTC and new NVIDIA products
04:00 - comparing NVIDIA Spark with Mac Studio
11:57 - trends in AI and local inference
18:29 - how component prices affect the technology market
22:12 - prospects for local inference and progress in large models
28:45 - court cases and copyright in AI
32:29 - bots overtake humans in network traffic
35:34 - Google's presentation and new assistant technologies
39:19 - buying code and new opportunities for developers
42:53 - architecture and new concepts in ML
48:09 - abstraction in modeling and prediction
52:11 - the potential of LLMs and how to use them properly
56:13 - adapting to new realities in programming
01:02:25 - the cult of supermen in IT

ai google mac llm tech news gtc tech talks

PCS Season Payoff: How Military Families Turn Moving Expenses Into Free Vacations #232

The Military Money Manual Podcast

Play Episode Listen Later Jun 8, 2026 47:28

20,000 Dollars, 4 Cards, 400,000 Points. Spencer Reese of the Military Money Manual podcast sits down with Taryn from The Military Travelers to break down how military families can turn PCS moving expenses into free travel — without spending a single extra dollar. From sign-up bonuses to Hyatt Globalist status, this episode is packed with practical strategies for the upcoming PCS season. Topics Covered PCS Season Spending Strategy — Why PCS moves generate $10,000–$20,000+ in spendable expenses and how to redirect that spend to hit credit card sign-up bonuses GTC vs. Personal Cards — The frustrating reality of the Government Travel Card and why (for most branches) personal cards make more sense Branch-Specific Rules — Important caveats for Navy families and other branches regarding on-post lodging forms and card usage policies The Math of Points — Why putting PCS spend on one existing card (like the Amex Platinum at 1X) leaves massive value on the table compared to opening new cards Business Cards for Military Families — How a PPM qualifies you as a sole proprietor, why business cards don't hit your personal credit report, and how the Chase 5/24 rule makes them essential for bigger families Current Elevated Offers — Hot cards highlighted: Chase Sapphire Reserve (150K points), Atmos Summit (100K points + 3X on foreign transactions), United Club, Marriott Brilliant, Hilton Aspire, and Delta Reserve TLE Maximization — Using all 21 days of Temporary Lodging Expense, loyalty numbers, and stacking hotel promos during your move Hyatt Globalist Corporate Challenge — How a .mil email address unlocks Globalist status after 20 nights in 90 days (valid through February 2028) Multiple Cards Strategy — Holding multiple Hilton Aspires and Chase Sapphire Reserves, and the tip to align annual free night certificate expiration dates Military Spouse Eligibility — A reminder that spouses are fully eligible for annual fee waivers under MLA/SCRA Resources Mentioned The Military Travelers 
— themilitarytravelers.com | Facebook Group: "The Military Travelers" | Instagram: @themilitarytravelers Travel Freely (travelfreely.net) — Zach's beginner's guide to business cards and sole proprietor applications IRS EIN Application — Free Employer Identification Number at irs.gov (2-minute online form) Hyatt Globalist Corporate Challenge — Search "Hyatt Globalist Corporate Challenge" and enroll with a .mil email Spencer and Jamie offer one-on-one Military Money Mentor sessions. Get your personal military money and personal finance questions answered in a confidential coaching call. militarymoneymanual.com/mentor Over 22,000 military servicemembers and military spouses have graduated from the 100% free, Ultimate Military Credit Cards Course available at militarymoneymanual.com/umc3 Military Money Manual may receive compensation from JPMC. Opinions expressed here are author's alone, not those of any bank, credit card issuer, airlines or hotel chain. If you want to maximize your military paycheck, check out Spencer's 5 star rated book The Military Money Manual: A Practical Guide to Financial Freedom on Amazon or at shop.militarymoneymanual.com. If you have a question you would like us to answer on the podcast, please reach out on instagram.com/militarymoneymanual.

amazon moving vacation navy math points cards dollars financial freedom payoff pcs expenses globalists 3x business cards military families ppm gtc 1x amex platinum jpmc travel freely

AUGTC S3 E6: What, Like It's Hard? Bringing Legally Blonde to Life

Acting Up with GTC

Play Episode Listen Later Jun 3, 2026 55:49

tickets legally blonde gtc acting up legally blonde the musical

Taiwanese workers' pay is poor, and Korean media call them "beggar supermen"? (2026/06/01)

有話好說

Play Episode Listen Later Jun 1, 2026 29:10

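Jensen Huang arrives in Taiwan like a whirlwind to attend Computex, bringing new AI technology and orders, and a supply chain of 150 Taiwanese companies has become his strongest backing. The AI economy is booming and Taiwan's economic growth has overtaken South Korea's, yet South Korean commentators say Taiwan has plenty of "beggar supermen" who watch for discounted items at convenience stores, with per-capita income below South Korea's. Is the gap really that large? The AI whirlwind has brought economic and export growth, but has it also given Taiwan a K-shaped economy and widened the gap between rich and poor? Join 《尖鋒對話》 for an in-depth discussion.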

gtc computex soundon

Nvidia now brings out chips for laptops and desktop PCs under the name RTX Spark

Tech Update | BNR

Play Episode Listen Later Jun 1, 2026 3:58

During its own good-news show GTC, held around the Computex tech fair in Taiwan, Nvidia announced new chips that are meant for end users' own devices. The chip in question is the RTX Spark, which has both a CPU and a GPU, to compete with AMD and Intel. Joe van Burik talks about it in this Tech Update. Also in this Tech Update: SoftBank, known for its investments in OpenAI, is going to put tens of billions into AI data centers in France. See omnystudio.com/listener for privacy information.

ai taiwan spark intel chips openai nvidia laptops voor amd verder gpu cpu desktops onder komt naam juist frankrijk gtc computex tech update desktop pc burik

SP2. Nvidia GTC livestream graphics card giveaway + software stocks bounce back | M觀點 special episode

M觀點 | 科技X商業X投資

Play Episode Listen Later May 31, 2026 60:28

SP2. Nvidia GTC livestream graphics card giveaway + software stocks bounce back | M觀點 special episode --- (00:00) Topic one: Nvidia GTC livestream graphics card giveaway (20:29) Topic two: software stocks bounce back --- M觀點 information --- 科技巨頭解碼: https://bit.ly/2XupBZa M觀點 Telegram - https://t.me/miulaviewpoint M觀點 IG - https://www.instagram.com/miulaviewpoint/ M觀點 Podcast - https://bit.ly/34fV7so M報: https://bit.ly/345gBbA Subscribe to the M觀點 YouTube channel: https://bit.ly/2nxHnp9 M觀點 fan page: https://www.facebook.com/miulaperspective/ For any collaboration inquiries, contact miula@outlook.com -- Hosting provided by SoundOn

hosting gtc soundon m podcast sp2

[天下零時差 06.01.26] Nvidia GTC and Computex open; international oil prices rise, so why is Taiwan's inflation low?; the Fed releases its latest Beige Book

聽天下：天下雜誌Podcast

Play Episode Listen Later May 31, 2026 6:11

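This Monday, 天下零時差 follows these financial stories: 1. What upside will Nvidia CEO Jensen Huang bring to Taiwanese manufacturers at GTC and Computex? 2. The Fed releases its latest Beige Book: how is the US-Iran war affecting the US economy? 3. International oil prices are soaring, so why is Taiwan's inflation so low? Text: 郭家宏. Production team: 錢玉紘, 鄭子鴻. ＊To read 零時差, click here for the full article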

hosting gtc computex soundon

[阿榕伯胡說科技 Ep.76] May's big tech stories explained: Jensen Huang visits Taiwan again, MediaTek's share price surges, countdown to the SpaceX IPO

聽天下：天下雜誌Podcast

Play Episode Listen Later May 28, 2026 59:55

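This episode is the content of the 胡說科技 YouTube livestream from 5/25. For the video version, follow the link: https://cww.psee.ly/959e8p Which of May's big tech stories do you care about most? 1. Nvidia reports earnings and Jensen Huang visits Taiwan again. 2. MediaTek's share price surges, with its market value approaching 7 trillion. 3. Countdown to the SpaceX listing, the largest IPO on Earth. 阿榕伯 covers them for you live. Missed the first in-person, closed-door conversation between 阿榕伯 and Phison Electronics' 潘健成? A replay is available: https://bit.ly/4ntlMLY About 《胡說科技》: 天下 exclusively presents 《胡說科技》, the first technology column channel built for readers who follow industry trends. It is written personally by 天下 chief writer 陳良榕, a senior semiconductor reporter with many years in technology reporting, and is published weekly. 《胡說科技》 combines a feel for the industry with exclusive, in-depth viewpoints and analysis, without flashy technology talk or piled-up jargon, in one article a week, so that readers can quickly grasp the technology developments that affect decisions. ＊See the latest articles on the official site and subscribe now: https://bit.ly/3TuL8eb ＊Feedback mailbox: bill@cw.com.tw -- Hosting provided by SoundOn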

spacex hosting gtc soundon

[CLIP] World Models, Real-Time Video and the Decade Ahead | Jamie Umpherson (Runway)

Entreprendre dans la mode

Play Episode Listen Later May 22, 2026 12:24

ai decade models assurance nvidia real time visitez runway squarespace entreprendre audiomeans gtc maif orso media adrien garcia

#528 The AI Video Revolution Reshaping Cinema, Advertising and Fashion | Jamie Umpherson (CCO at Runway)

Entreprendre dans la mode

Play Episode Listen Later May 19, 2026 67:50

new york ai hollywood los angeles toronto revolution fashion cinema advertising fortnite twelve nyu paramount assurance cgi visitez reshaping a24 runway chief creative officer squarespace ithow telescope mers entreprendre farid audiomeans gtc itp siv gaumont mediawhy matterswhat maif gwm orso media adrien garcia

Lumentum: 90% Revenue Growth, a $2 Billion Nvidia Investment, Triple Digits Coming — and the Dilution Story Nobody Is Covering

Chip Stock Investor Podcast

Play Episode Listen Later May 13, 2026 10:01

Over a year ago, CSI did a three-part deep dive on co-packaged optics after Nvidia dedicated an entire segment of its GTC keynote to the technology — naming Lumentum and Coherent as the primary beneficiaries. The analysis was right. They did not buy.
That mistake is now worth talking about directly.
Lumentum just reported fiscal Q3 2026 revenue up 90% year over year. Q4 guidance implies triple-digit year-over-year growth. Nvidia made a $2 billion investment in both Lumentum and Coherent, and separately announced a major fiber optic cable manufacturing expansion with Corning. Co-packaged optics products have not even begun shipping in volume yet — that catalyst hits in December 2026. The case for Lumentum continues to build.
But this is CSI, and true conviction in a business means covering what could go wrong as well as what is going right. There is a significant dilution story unfolding that every Lumentum shareholder needs to understand before adding to a position.
When the stock was trading at roughly one-tenth of its current price, Lumentum raised cash by issuing convertible notes — a type of debt that converts to equity when the stock reaches certain price milestones. The stock has now blown through those milestones. All of that convertible debt is now eligible to convert into stock at terms that are extremely favorable for the debt holders and extremely expensive for existing shareholders. The result: shares outstanding are expected to increase by approximately 20% over the next two quarters. Nick and Kasey explain the full mechanics clearly — why it happened, what it costs, and whether the revenue acceleration can outrun the dilution.
Also covered: the Qorvo fab acquisition in North Carolina that adds indium phosphide manufacturing capacity in two to three years, and what operating leverage looks like when a company goes from negative margins to all-time highs in the span of a few quarters.
What we cover:
— Why CSI did the deep dive on co-packaged optics and still did not buy — the honest lesson
— Lumentum fiscal Q3 2026: 90% revenue growth — what drove it and what comes next
— Q4 guidance: triple-digit YoY growth before CPO products even ramp
— Nvidia's $2B investment in Lumentum and Coherent — the supply chain signal
— Nvidia and Corning fiber optic expansion — Nvidia's hands all over the supply chain
— Co-packaged optics — the December 2026 catalyst that has not landed yet
— Operating leverage in action: from negative margins to all-time highs
— Convertible notes explained: why ~20% share dilution is coming in 2026
— Qorvo North Carolina fab acquisition — InP capacity coming in two to three years
— The bottleneck in laser module manufacturing and why Lumentum dominates it
Sponsored by fiscal.ai — 25% off any paid plan through May 14 only. Use our link: fiscal.ai/csi
Disclosure: Nick and Kasey hold positions in Lumentum and Coherent. This content is for general information only and is not individual investment advice. All investing involves risk.
chipstockinvestor.com

north carolina investment billion covering triple nvidia operating csi 2b cpo revenue growth yoy digits coherent corning convertible gtc dilution lumentum qorvo

EP3-37 | What Makes AI Leaders Different? The Taiwan Connection Behind Jensen Huang and the AI Boom

Startup Island TAIWAN Podcast

Play Episode Listen Later May 12, 2026 22:51

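NVIDIA GTC has outgrown the label of "tech conference" — it's now the annual convergence point of the entire AI ecosystem, where chips, software, robotics, and autonomous systems share the same floor, with an energy so dense that finding a parking spot takes two hours. In this episode, Silicon Valley-based veteran tech journalist and consultant Michelle Cheng joins host Uly fresh from GTC to give us a firsthand read on what the conference signals for the industry ahead.
Michelle traces the deep ties between Jensen Huang and TSMC founder Morris Chang, and how Jensen's visits to Taiwan consistently bring the entire AI supply chain together at what the media has dubbed the "trillion-dollar banquet." Drawing on her experience covering Silicon Valley through the optical networking era, she maps today's AI infrastructure boom against the longer arc of industry cycles — demand never disappears, it just takes time to mature. She captures the mood of GTC with a drink called "Innovation Cocktail" she picked up at an after party: a blend of excitement and anticipation that defines where Silicon Valley stands right now.
The episode closes on Jensen's own framing from the keynote — intelligence is becoming infrastructure — and Michelle's clear-eyed take: the window into the AI ecosystem is open, and the companies that find their position early tend to benefit the most when the cycle matures.
NVIDIA GTC is no longer just a technical conference: it is the annual gathering point of the whole AI industry, with chips, hardware, software, robots, and self-driving cars all under one roof, and an energy so strong that even a parking space takes a two-hour fight. This episode invites Michelle Cheng, a veteran journalist and consultant long based in Silicon Valley who once founded a tech media outlet's Silicon Valley office, to start from the floor of the just-concluded GTC and give us a feel for the event's real temperature. In the show, Michelle goes from the early ties between Jensen Huang and TSMC founder Morris Chang to the "trillion-dollar banquet" at which Huang gathers Taiwan's supply chain partners on each visit, explaining concretely Taiwan's structural role in the global AI supply chain. Starting from the fiber-optic networking era she covered firsthand, she sets it against today's AI infrastructure boom and points out the pattern in industry cycles: demand never disappears, it just needs time to mature. Michelle uses a cocktail called "Innovation Cocktail," which she drank at a GTC after party, to describe the mix of excitement and anticipation in Silicon Valley right now, and closes by quoting Jensen Huang's core argument at GTC, "Intelligence is becoming infrastructure": the entry window into the AI ecosystem is opening, and the companies that find their position early are often the biggest beneficiaries of the next cycle.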

ai drawing leaders silicon valley taiwan powered tsmc jensen huang ai boom gtc nvidia gtc uly morris chang

NVIDIA GTC 2026: AI, Robotics and Future Trends in Tech

Get IT: Cybersecurity insights for the foreseeable future.

Play Episode Listen Later May 12, 2026 21:09

Join KJ Burke as he dives into the key highlights from NVIDIA's GTC 2026 conference. Discover the latest in AI advancements, robotics, space computing and enterprise strategies shaping the future of technology. To learn more, visit cdw.ca Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

CCA Accelerator Special Feature: Transforming Advising at Greenville Technical College

CCA On the Air

Play Episode Listen Later May 5, 2026 23:06

In this episode of CCA on the Air: Transforming Advising at Greenville Technical, we're joined by Brett Barclay (Brett.Barclay@gvltec.edu), the Dean of Academic Advising at Greenville Technical College in South Carolina. GTC has participated in CCA's Accelerator, a part of the Gates Foundation Intermediaries for Scale Investment, for nearly three years. Through this project, GTC has focused intently on advising transformation—listen to this episode to find out what they've done!

south carolina transforming greenville accelerator barclay advising cca special features gtc technical college academic advising

AUGTC: S3 E5: The Power of Empathy: Why "To Kill a Mockingbird" Still Matters

Acting Up with GTC

Play Episode Listen Later May 4, 2026 42:02

Step behind the curtain with us on this powerful episode of Acting Up with GTC!

empathy mockingbird to kill gtc acting up

Shopify's AI Phase Transition: 2026 Usage Explosion, Unlimited Opus-4.6 Token Budget, Tangle, Tangent, SimGym — with Mikhail Parakhin, Shopify CTO

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Apr 22, 2026 72:25

Early bird discounts for the San Francisco World's Fair, the biggest AIE gathering of the year, end today - prices will go up by ~$500 tonight so do please lock in ASAP!

From near-universal AI tool adoption inside Shopify to internal systems for ML experimentation, auto-research, customer simulation, and ultra-low-latency search, Mikhail Parakhin joins us for a deep dive into what it actually looks like when a 20-year-old, $200B software company goes all-in on AI. We cover why Shopify has become much more vocal about its internal stack, what changed after the December model-quality inflection, and why the real bottleneck in AI coding is no longer generation, but review, CI/CD, and deployment stability.

We also go inside Tangle, Tangent, SimGym, which are three major AI initiatives that Shopify is doing to make experimentation reproducible, optimization automatic, customer behavior simulatable, and search and catalog intelligence faster and cheaper at scale. Along the way, Mikhail explains UCP, Liquid AI, and why token budgets are directionally right but often measured badly, why AI-written code can still increase bugs in production, what makes Shopify's customer simulation defensible, and what he learned from the Sydney era at Bing.

We discuss:

* Mikhail's path from running a major Microsoft business unit spanning Windows, Edge, Bing, and ads to becoming CTO of Shopify
* Why Shopify is talking more publicly about AI now, and why staying at the frontier has become necessary for the company
* Shopify's internal AI adoption curve, the December inflection, and why CLI-style tools are rising faster than traditional IDE-based tools
* Why Jensen Huang is directionally right on token budgets, but raw token count is still the wrong way to evaluate engineering output
* Why the real unlock is not more agents in parallel, but better critique loops, stronger models, and spending more on review than generation
* Why AI coding can still lead to more bugs in production even if models write cleaner code on average than humans
* Why Shopify built its own PR review flow, and why Mikhail thinks most off-the-shelf review tools miss the point
* How PR volume, test failures, and deployment rollback are becoming the real bottlenecks in the agent era
* Why Git, pull requests, and CI/CD may need a new metaphor once code is written at machine speed
* What Tangle is, and how Shopify uses it to make ML and data workflows reproducible, collaborative, and production-ready from the start
* Why Tangle is different from Airflow, and why content-addressed caching creates network effects across teams
* What Tangent is, and how Shopify is using auto-research loops to optimize search, themes, prompt compression, storage, and more
* Why Tangent is becoming a democratizing tool for PMs and domain experts, not just ML engineers
* Why AutoML finally feels real in the LLM era, and where auto-research still falls short today
* Why Tangle, Tangent, and SimGym become much more powerful when combined into one system
* What SimGym is, why simulated customers only work if you have real historical behavior, and why Shopify's data gives it a moat
* How SimGym evolved from comparing A/B variants to telling merchants what to change on a single live storefront to raise conversions
* Why customer simulation is so expensive, from multimodal models to browser farms to serving and distillation costs
* How Shopify models merchant and buyer trajectories, runs counterfactuals, and thinks about interventions like discounts, campaigns, and notifications
* Why category-level behavior is so different across commerce, and why ideas like Chinese Restaurant Processes are showing up again in practice
* Shopify's new UCP and catalog work, including runtime product search, bulk lookups, and identity linking
* Why Shopify is using Liquid AI, and why Mikhail sees it as the first genuinely competitive non-transformer architecture he has used in practice
* Where Liquid already works inside Shopify today, from low-latency query understanding to large-scale catalog and Sidekick Pulse workloads
* Whether Liquid could become frontier-scale with enough compute, and why Shopify remains pragmatic and merit-based about model choice
* Who Shopify is hiring right now across ML, data science, and distributed databases
* The Sydney story at Bing, why its personality was not an accident, and what Mikhail learned from deliberately shaping AI character early on

Mikhail Parakhin

* LinkedIn: https://www.linkedin.com/in/mikhail-parakhin/
* X: https://x.com/MParakhin

Timestamps

00:00:00 Introduction: Mikhail Parakhin, Microsoft, and Shopify
00:01:16 Why Shopify Is Talking More About AI
00:02:29 Internal AI Adoption at Shopify and the December Inflection
00:06:54 Token Budgets, Jensen Huang, and Why Usage Metrics Can Mislead
00:10:55 Why Shopify Built Its Own AI PR Review System
00:12:38 AI Coding, More Bugs, and the Real Deployment Bottleneck
00:14:11 Why Git, PRs, and CI/CD May Need to Change for Agents
00:18:24 Tangle: Shopify's Reproducible ML and Data Workflow Engine
00:21:19 Why Tangle Is Different from Airflow
00:26:14 Tangent: Auto Research for Optimization and Experimentation
00:30:07 How Tangent Democratizes Experimentation Beyond ML Engineers
00:33:06 The Limits of Auto Research
00:36:36 Why Tangle, Tangent, and SimGym Compound Together
00:37:20 SimGym: Simulating Customers with Shopify's Historical Data
00:42:47 The Infra Behind SimGym
00:46:00 Why SimGym Gets Better with Real Customer History
00:47:30 Counterfactuals, HSTU, and Modeling Merchant Trajectories
00:51:55 CRPs, Clustering, and Category-Level Customer Behavior
00:53:30 UCP, Shopify Catalog, and Identity Linking
00:55:07 Liquid AI: Why Shopify Uses Non-Transformer Models
00:59:13 Real Shopify Use Cases for Liquid
01:03:00 Can Liquid Scale into a Frontier Model?
01:09:49 Hiring at Shopify: ML, Data Science, and Databases
01:10:43 Sydney at Bing: Personality Shaping and AI Character
01:13:32 Closing Thoughts

Transcript

[00:00:00] swyx: Okay.
We're here in the studio, a remote studio, with Mikhail Parakhin, CTO of Shopify. Welcome.

[00:00:08] Mikhail Parakhin: Thank you. Welcome.

[00:00:10] swyx: I don't even know if I should introduce you as CTO of Shopify. I feel like you have many identities. Uh, you led sort of the, the Bing ML team, I guess, uh, uh, or ads team. I, I don't know, I don't know, uh, you know, it's, uh, people va-variously refer to you as like CEO or, or, uh, I don't know what that, that, that said previous role at Microsoft was.

[00:00:29] Mikhail Parakhin: Uh, that was... Yeah, my previous role w- at Microsoft was the-- I actually was the CEO of one of Microsoft's business units, which included, as I, you know, as we discussed, all the things that people like to laugh about, uh, including Windows and Edge and Bing and ads and everything.

[00:00:47] swyx: Yeah, yeah. What a, what a, what a wild time.

You've obviously, uh, done a lot since you landed at Shopify. Uh, one of the reasons I reached out was because you started promoting more sort of internal tooling, uh, primarily Tangle, but also a lot of people have seen and adopted Tobi's QMD, uh, and obviously, I think, uh, Shopify has always been sort of leading in terms of, uh, engineering.

I think more-- it's just more recent that you guys have been more vocal about your sort of AI adoption. Is that, is that true?

[00:01:16] Mikhail Parakhin: Well, I think AI tools in general are a fairly recent development, uh, and we've-- Shopify, you know, at this stage of its development, we're developing AI in-in-house and other, uh, building tools that use AI and, you know, interfacing with the wider AI community, uh, you know, are on the sort of the, uh, runaway trajectory.

So it's just sort of a natural byproduct. We, we talk about it more also.
We just, uh, just even yesterday, Andrej Karpathy was famously tweeting about, oh, are there some, uh, ways, uh, that, that you can organize your agents to store the data and then, uh, look up the data so that you don't have to research or, or lose context every time. And a little bit tongue in cheek, I tweeted that, “Hey, we've, we've done it much earlier, and we even have different approaches, Tobi and I.” Tobi, of course, is a big fan of QMD, and I'm more of a SQL, SQLite fan. But, uh, yeah, very similar things that we've already done here. The point is, yeah, we're very dynamic, you know, explosively growing company, and we have to be at the forefront of AI adoption, obviously.

[00:02:29] swyx: Yeah. Yeah. Um, you, your team kindly prepared some slides actually that we were gonna bring up on to, uh, the screen. I think I can, I can screen share, and then we can kind of go through some of the shocking stats that maybe, maybe put some numbers to what exactly is going on. So here we have, uh, an internal AI tool adoption chart.

What are we looking at here?

[00:02:54] Mikhail Parakhin: Yeah, this is very interesting statistics. Uh, this is number of daily active workers, you know, think of, uh, DAU, basically the active users of-

[00:03:05] swyx: Yeah ...

[00:03:05] Mikhail Parakhin: AI tool as a percentage of all the people in the company, right? And then- Yeah ... different AI tools. And, uh, you could see two things here is that one is the green is total.

Uh, green is just total. So you could see that it approaches really % by now. It's hard to do your job now without interacting deeply, at least with one tool.
You could see another interesting thing is just as many people commented in December was the phase transition when suddenly models got good enough that, that everything took off and started growing.

Uh, it, it was many people noticed that the thing is that small improvements accumulated into this big change in Sep- December roughly timeframe.

[00:03:52] swyx: Yeah.

[00:03:52] Mikhail Parakhin: The other thing I would claim you could see is that, uh, CLI-based tools and tools that don't require you to look at the code are becoming more popular, and you could see, yeah, various versions of, uh, Claude Code and Codex and Pi and internal development tools taking off.

Uh, exactly, yeah, uh, and blue is our River, just internal agent for coding, where tools, uh, that require IDEs such as, uh, GitHub Copilot or Cursor, they're not exactly shrinking, but they're not growing as fast. Like, uh, red, red line is, is the IDE kind of tools. So you could see that they're, they're not experiencing as, as fast of a growth.

[00:04:37] swyx: As I understand it, basically, every employee has their choice, right? Of choose whatever tool you use, and then you're just kind of doing a, a daily sur-survey or something.

[00:04:47] Mikhail Parakhin: Exactly. And, uh, we- Yeah ... the, the push is to get your job done, you can use any tool, and we effectively fund unlimited tokens for everybody.

Uh, we, we do, we do try to control the models that, uh, people use, but from the bottom, not from top. Like we basically say, “Hey, please don't use anything less than Opus four point six.”

[00:05:09] swyx: Oh.

[00:05:10] Mikhail Parakhin: Some people, some people end up using GPT five point four extra high. Some people use Opus four point six. Um, uh, you know, uh, there are pluses and minuses in going for full one million context window versus not.

But, uh, we try to discourage people from using anything less than that.

[00:05:28] swyx: Yeah, yeah. Got it, got it.
Uh, I mean, uh, that's, you know... The, the next chart here, it really kind of shows the expansion and the sort of December twenty twenty-five inflection, right? That, uh, people are using a lot of tokens. I think it's also really interesting that no one was kind of abusing it in twenty twenty-five.

Like it was- Had comparatively, uh, to this year, there was almost no growth. I mean, it's still like, you know, probably, probably gave fifty percent.

[00:05:56] Mikhail Parakhin: Yeah. This is just a different scale. It's still exponential- Yeah, yeah ... growth at just a different- ... rate of expansion. Uh, there was inflection point, and Shawn, I would claim the, the super interesting part here is that you could see that the distribution becoming more and more skewed.

Yes. The top percentiles grow faster. So that means- Yeah ... the people in the top ten percentile, they, their consumption grows faster than seventy-five and so forth. So, uh, the distribution skews more and more towards the highest users, which is... I don't know what it tells me. It's like it feels not ideal, to be honest.

Or maybe it's okay. We'll see.

[00:06:36] swyx: Why does it feel not ideal? Is, is it because of, um, quantity over quality, or what's the concern?

[00:06:42] Mikhail Parakhin: Because take it to the limit. That means, you know, if, if this rate of separation continued- Ah, yes ... a year, there will be one person consuming all the tokens. So it's just, it's kinda strange.

[00:06:54] swyx: Yeah, I mean, um, uh, I, I think internal like teaching and all that, uh, will, will help sort of distribute things more widely. But in, in the early days, of course, the people who are sort of more AI-pilled will obviously find more ways to use it than the people who are less AI-pilled. Maybe let's, let's call it that.

I'll just, I'll just kinda quickly, uh, pause from the, the...
You know, we will go back to the rest of the slides, but I just wanna, um, review, you know, there are a lot of CTOs of, of large companies like yourself where they're all considering some kind of token budget, right? Like I think it's something, something that Jensen Huang has been talking about, where like if your 200K engineer is not using 100K of tokens every year, like they're, they're underutilizing coding agents.

Of course, Jensen Huang would say that, but like it seems a very quantity over quality approach and like some, some people are basically saying like, well, is this comparable to judging engineer quality by lines of code, right? Which we also know is like kind of flawed, but better than nothing. So I, I don't know if you have like a sort of management take here on, on how to view this kind of, uh, metrics.

[00:08:02] Mikhail Parakhin: Well, I mean, you're, you're baiting me. I, I like... This is my favorite topic. Uh, if you let me, I'll probably talk for two hours on just this. I have a lot of things to say. Like I do think Jensen has gotten a lot of bad press saying, “Oh, of course you're, you know, this, uh, the- ... the cake seller says you don't eat enough cakes.”

You know? Like, of course. Uh, but, uh, I actually, uh, think that's undeserved. I think he, he's actually right. Uh, I do think-

[00:08:33] swyx: He, he's directionally correct.

[00:08:35] Mikhail Parakhin: Yeah. Yeah. He's directionally correct for sure. Uh-

[00:08:37] swyx: Who knows what the right number is? Yeah.

[00:08:39] Mikhail Parakhin: The thing that I do, uh, want to say, and this is something that we learned through trial and error and very important is like two things.

One is that it's not about just consuming tokens. Uh, you can consume tokens and, and in fact, the anti-pattern is running multiple agents, too many agents in parallel that don't communicate with each other. That's almost useless, uh, compared to just fewer agents and burns tokens very efficiently.
Uh, setting up the right critique loop, especially with the high quality models, where one agent does something, the other one, ideally with a different model, critiques it, uh, suggests ways to improve it, the agent redoes it with this critique and, and so it takes much longer.

So people don't like it because latency goes up. You know, they, they have to wait until this debate is happening. But, uh, the quality of the code is much higher. And another thing, just since you mentioned like, look, uh, uh, yeah, the overall budget is just like, uh, lines of code. Lines of code are exploding for everybody right now, or partially because AI is really more verbose, but partially just because AI can write a lot more code, you know, doesn't get tired.

And so you have to have to have a very strong narrow waist during PR review. Otherwise, just the number of bugs will go through the roof. It's, uh, it's this unexpected consequence of the just volume trumping everything. I would claim by now a good model writes code on average with fewer bugs than, than the average human.

But since they write so much more of it, like more of it will make it into production. So you have to-

[00:10:26] swyx: You still have more bugs.

[00:10:26] Mikhail Parakhin: Yeah. Have to have very rigorous PR reviews, also automated of course. But, uh, yeah, you have to spend a lot of budget there. Like this, this for me, for me, actually, the important metric is the ratio of budget spent during code generation versus, uh, spent, uh, expensive tokens like GPT, uh, five point four Pro or, uh, uh, Deep Think from Gemini, you know, checking on PR reviews.

[00:10:55] swyx: Yeah, totally. Uh, I noticed in your chart you didn't have any review tools. Do you just use like, like let's say a Claude Code to review tools? Or do you have another set of review tools like the Greptiles, the CodeRabbits, uh, Devin Reviews has a review tool.
I don't know if you've had those specialist review tools.

[00:11:13] Mikhail Parakhin: You are a little bit jumping on my sore toe right now because in the graphs I was only showing public tools. Uh, uh, the-- I haven't found a good PR review tool that, that does what I think should be done. And, uh, partially my, my thinking is because it's so... It just goes against both what people feel like emotionally they prefer and, uh, some of the, uh, you know, frankly even business models that, that the companies run.

At PR review, uh, time, you want to run the largest models. That means, I don't know, Codex or, or, uh, Claude Code is not gonna cut it. You need to have pro-level models if you really want to, uh, stem the tide of bugs from going into production. And you need to spend a lot of time, the models taking turns, but you don't want, like, a big swarm of, uh, of, uh, agents.

So in fact, you end up in a different dual-dualistic world where you generate not that many tokens. You, in fact, generate few tokens, but it takes f-a long time because these are expensive models taking turns rather than many, many agents trying to do many things in parallel. So that's, that's why I feel like I haven't found good tools, so we are using our own for PR review for now.

[00:12:33] swyx: Yeah. Yeah. I mean, uh, I think a lot of companies are building their own, uh, especially to their needs, right?

[00:12:38] Mikhail Parakhin: Mm-hmm.

[00:12:38] swyx: Um, I, uh, you also have a chart here going back to the slides on, uh, PR merge growth, where we're now at thirty percent, uh, month on month rather than ten percent. Uh, and also the, the estimated complexity is going up.

You know, this is productivity, right? ‘Cause y- presumably there's more stuff going into the code base and more, more features getting worked on. I'm curious about the backlog, right?
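As a rough illustration of the critique loop described above (one agent drafts, a second one critiques, the first redoes the work with that critique), here is a minimal sketch. It is not Shopify's review tool: `critique_loop`, `toy_generate`, and `toy_critic` are hypothetical names, and the two toy functions stand in for real model calls, ideally to two different models.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces: in practice each callable would wrap a model call.
Generator = Callable[[str, Optional[str]], str]   # (task, critique) -> draft
Critic = Callable[[str, str], Optional[str]]      # (task, draft) -> critique, or None if acceptable


def critique_loop(task: str, generate: Generator, critique: Critic,
                  max_rounds: int = 3) -> Tuple[str, int]:
    """Alternate generation and critique sequentially, one turn at a time.

    Returns the final draft and the number of critique rounds that ran.
    If max_rounds is exhausted, the last draft is returned without a final review.
    """
    draft = generate(task, None)
    for round_no in range(1, max_rounds + 1):
        feedback = critique(task, draft)
        if feedback is None:            # the critic has no further objections
            return draft, round_no
        draft = generate(task, feedback)  # redo the work with the critique in hand
    return draft, max_rounds


# Toy stand-ins so the sketch runs without any model access.
def toy_generate(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "def add(a, b): return a - b"   # first draft contains a bug
    return "def add(a, b): return a + b"       # revised draft after critique


def toy_critic(task: str, draft: str) -> Optional[str]:
    return None if "a + b" in draft else "add() subtracts; it should return a + b"


final_draft, rounds = critique_loop("write add()", toy_generate, toy_critic)
print(rounds, final_draft)
```

The loop is deliberately sequential: few tokens, higher latency, which matches the trade-off described in the conversation rather than a swarm of parallel agents.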
Like the, the, the-- I actually don't mind a pro-level model taking an hour or two hours to review my PR, because I've dealt with humans who take a week to review my PR, right?And I keep pinging them on Slack, “Hey, hey, review my PR.” So, you know, I think there's some trade-off here where, like, it still doesn't make sense.[00:13:18] Mikhail Parakhin: Exactly. That, that's exactly m-my point. Uh, that on one hand, you can tolerate longer latencies at, uh, PR. On the other hand, like right now, the real problem is not in spending time waiting for PR.It's real problem is since there's so much more code than- Yeah ... uh, probability of at least some tests failing going up, and then you, like, keep de-failing, then you have to find the offending PR, evict it, retest it without that PR, and so deployment cycle becomes much longer. Uh, so it actually, in terms of the overall time to deploy, it's total time savings if you spend more time on a longer model, like thinking for an hour, because then, then you, you don't have to spend all that time during testing and rolling, you know, rolling back the deployment.[00:14:03] swyx: Yeah, totally. That's still worth it. You know, you don't look at the individual, look at the aggregate, and look at the, the, the change in the aggregate system.[00:14:11] Mikhail Parakhin: Exactly.[00:14:11] swyx: I'm kind of curious if, like, there's this PR mentality and, like, c-- the, the, the CICD paradigm will be changed eventually. Some people are like, obviously a lot of people want new GitHub, but I even wonder if, like, Git is the problem, right?Like, is that the bottleneck? Is the concept of a PR a bottleneck? Do you guys use stack diffs? I don't know if, uh, that's a, like, a merge queue stack diff type of thing.[00:14:34] Mikhail Parakhin: We, we use, we use Stacks, we u- we use Graphite. We worked with, uh, Graphite a lot. Uh, so we use Stack, uh, PRs. 
I think, uh, like that's clearly the overall CI/CD in general, and the interaction with the code repository right now is the, clearly the sort of the, the main issue and the bottleneck for us, uh, and highest top of mind.

I would say we probably need a different metaphor or different whole design of how to process it in new agentic world. I haven't seen anything dramatically better yet. I, I think everybody right now is just trying to keep their head above the water ‘cause, ‘cause there, there's so many PRs and then everybody's CI/CD pipelines start creaking, the, the times are increasing, the number of bugs slipping by increasing, and you have to, have to clamp down.

And so we are a little bit in this situation when we need to first stabilize that story and then start thinking, hey, what, what it could be a completely different and new world, which I haven't... I know some people working on it. I haven't seen something, like anything super compelling yet, but clearly the old things that were designed for humans will need to be morphed into something new.

[00:15:53] swyx: One of the things that I, I think about is kind of like the merge conflict is basically a global mutex on the whole system, right? And in, in hu- in human organizations, we do have something like that. It's the company standup. But like, other than that, it's like it's actually fitting for us to be somewhat decentralized, somewhat plugged into one stream of information source, but somewhat lossy.

Like it's okay, you know, that, that not every delivery is like atomic consistency. Like we're not dealing with a database sometimes.

[00:16:27] Mikhail Parakhin: This is a very good point, uh, because since humans don't write code too fast, you know that global mutex is not too bad. Once you-

[00:16:36] swyx: Yes ...

[00:16:37] Mikhail Parakhin: start writing code at the speed of machine, it becomes the, you know, the bottleneck.

Then what do you do?
Maybe, and I can't believe I'm saying this because I, I'm long-- lifelong opponent of, uh, microservices, and I always thought that was, like, a really bad idea. And now that you're saying it, like, maybe in a new guise, like, microservices will make a comeback, you know, because then you, you can ship things independently in tiny things and, and the managing all that complexity automatically will be much easier.

I don't know. Like, we'll s-- we'll have to see.

[00:17:10] swyx: Yeah. I mean, I don't know what the Microsoft or, or Shopify thing is, but I, I read this paper from Google where they have a monorepo that deploys into microservices, right? And then, uh, the other concept that I think about a lot is the Chaos Monkey concept from, from Netflix.

Being able to create, like, this robust system where, um, uh, you know, you, you have the service discovery, you have the, uh, the independent, independent microservices discovery and, and, uh, you know, probably going to be a fair amount of duplication. That's how an organic system sort of scales, uh, that, that you have that...

I don't know how you call it. Slack? Robustness? Depend-- uh, d-duplication. I, I, I forget the-- I, I'm-- And this-- those-- these are not exactly the terms- Hmm ... I'm looking for, but I c-can't really think of the words. Okay. I was gonna go into Tangent and Tangle. Uh, so, uh, we, we sort of discussed the overall stats that, uh, Shopify has.

Uh, but, you know, I, I think some, some pretty cool stuff that you guys are working on is your ML experimentation, uh, and your, your sort of auto tr-research training pipeline. Presumably you're much closer to this one because it's, it's a sort of personal hobby of yours. How, how would you explain them in, together?

I thought we have a slide that, like, uh, has the s- the system diagram.

[00:18:24] Mikhail Parakhin: Yeah. Tangle first and then Tangent as a-

[00:18:27] swyx: Yeah ...

[00:18:28] Mikhail Parakhin: as a thing on top of Tangle.
And, uh, Tangle is the third generation, I claim, of, uh, systems of, uh, running any data processing, but a bit with a skew for ML experiments, but not necessarily. Any sort of data processing tasks where you need to iterate, share, and you have scale so that you want maximum efficiency.

You know how, like, normally you would work, you would-- Imagine you're a data scientist or an ML practitioner, you would get Jupyter notebooks or, or maybe you would get, uh, you know, Pyth- your Python scripts, and you would manage the data, and you produce those TSV files, and you put them in some JFS or something.

Then you would notice that, oh, it has this, uh, weird missing values. You go and write another script that, uh, goes and replaces them with, uh-

[00:19:20] swyx: Ah ...

[00:19:21] Mikhail Parakhin: dash S. And then, then you, then you run some, some, uh, “Oh, I need to filter bots.” And so you run some LightGBM model that, uh, removes the bots. And then, then you like-- And then you, you kind of like get into shape, and then you start experimenting, and you run multiple experiments, and then you're like, “Oh my God,” like, “this experiment is worse.”

You undo, and you cannot get to previous result. And like, “Ah, what did I do?” Like that. Again, then, then you finally like get everything working. Then you like start throwing it over the fence to production. You, you replicate it, those things don't work, and then sometimes you like don't notice that you forgot some feature naming and the, the features don't match.

But then, like imagine you, you did everything, and then six months later you're like, have to repeat it because now there's more data, or you wanted to do another pass, and you're like, “What, what did I do?” Or like, or like, “This script crashes now,” or the, “the path has changed.” And then, then you're trying to, like you spend another month just doing ar- digital archeology on your own, you know, history, right?

Now multiply that by many, many teams.
Now imagine you got an intern that you wanna ramp up. Now you have to show that intern, “Oh, you know, look, here's the folder, there's the scripts, you know, ask your Claude agent to do, and then, uh, to, to figure it out.” And then Claude agent does something, and then you're, “Ah, yeah, right, right, it was the wrong folder. I forgot to tell you, I actually have this other thing I forgot myself.” And, and that's, that's the, like, the daily life we all, uh, all know it, uh, if, if you're a data scientist, machine practitioner, ma- machine learning practitioner or, uh, or even like any data managing, uh, person.

[00:21:00] swyx: Yeah. So I, I used to do this, uh, f- uh, on the quant finance side, uh, in, in my hedge fund.

So we did this before Airflow, and then, uh, obviously Airflow came along and, uh, then more recently Dagster, uh, I would say is like, in my mind, what I would use for that shape of problem, uh, where you had to materialize assets and create a pipeline.

[00:21:19] Mikhail Parakhin: And that's, that's very good segue because... So Airflow is great, but Airflow is more about you, you have something and you wanna repeatedly run it in production on schedule.

It's less about you as a team developing things and being able to share, and you grabbing the standard pipeline and saying, “Hey, I wanna change this tiny little component in the huge sea of data processing, and I don't wanna-- I wanna run ten experiments on this, and I wanna do hyperparameter optimization.”

All that is very hard to do with Airflow. It's very easy to do with Tangle. Tangle is m- more about, it's everything about group of people running experiments, it might be agents too nowadays. Uh, running experiments cheaply, collaborating, sharing results. Uh, you don't need to understand fully. You, you grab-- you clone somebody else's experiment or somebody else's pipeline, uh, run, uh, change small piece, run it, be, like, get it to production state, and then ship in one click.

So then the...
You don't have to port it into any other system to, to run in production. You can just run the same experiment. It's, it's fully production ready. And, and it's, uh, it has lots of... Again, as I said, it's third generation system. The original one was, I would claim there was Aether and then, uh, at least in my career, Aether was the first, first, uh, that pioneered this type of approach.

And then there was, uh, Nirvana, which, uh, uh, at Yandex, which did kind of sec-second take on this. And now this one aggregates the, the learnings from all of those and, and Airflow as well to, to get to the state where you try it, it, it feels kind of magical. Uh, ‘cause now everything is based on content, uh, hashes.

So even if the version changed, but if the output didn't change, nothing is being rerun. It's very efficient. If you... Multiple people start experiment that needs the same sort of data preprocessing, it's not repeated multiple times. It's automatically done only once. If you start ten experiments that all require, you know, some, some data preparation first as the first step, and you don't have to coordinate for that.

Like, you don't have to know that other people are starting it. You know, it's very easy compos-, uh, composability, any language you can u- uh, you wanna use, and it's very visual. So you can see immediately, you can edit it easily, you can assemble small things with just even mouse clicks if you want to, and, uh, share, clone.

And everybody knows also it's fully kind of static in the sense that if you rerun it a second time, it will exactly have the same results. Like, you will never have to do digital archeology. So full versioning and everything is also there.

[00:24:06] swyx: Uh, so, so people can, uh... It's open source. Go to the GitHub repo and, and, uh, check it out.

Uh, and there is also a really good, uh, blog post about it. I think all of this is, like, really appealing.
The, the, the, the thing that I think sells me the most about it is that, um, sort of development to production transition, right? Which I think, um, a lot of people haven't really solved that, uh, strictly, right?

Like, we develop really, really well in, in Python notebooks, but then, you know, that's obviously not a sort of production ready process. I think that, like, any way in which that is solved, I think is, is very appealing. Then the other thing that you mentioned, which also raised my eyebrows, was content-based caching, which you mentioned is, is, um, you know, is ve-very much, uh, um, a sort of efficiency measure about, uh, you know, just like recalculation only on, on sort of content addressing, which I think makes sense.

Uh, it surprised me that the savings could be this much, but maybe I just haven't worked at your scale where there's so much duplication, uh, that people just rerun because they change a single ID upstream.

[00:25:10] Mikhail Parakhin: It does, yeah. But it's not only you rerun. The, the main savings are coming from the fact that you ran it, you got your job done, and you moved on.

Then- Yeah ... somebody else in some department you don't know existed runs the same task, but on a newer version.

[00:25:27] swyx: Yeah.

[00:25:27] Mikhail Parakhin: Like right now, you can't, in, in most of the organizations, you can't even find out about it so that you can't even measure that you're spending that time twice, right? Here- Yeah ... if everybody's on Tangle, that's detected automatically and detected that the output is the same.

And then for that person, all it looks like is like experiment just suddenly moved, jumped forward, right? Uh, uh- Yeah ... so that's because, because the, there's network effect of multiple people helping each other.

[00:25:51] swyx: Yeah.
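To make the content-hash idea above concrete, here is a toy sketch of content-addressed caching. It is an illustration under stated assumptions, not Tangle's implementation or API: `ContentCache`, `clean_v1`, `clean_v2`, and `count_rows` are hypothetical names. Each step's result is keyed by the step name plus the hashes of its inputs, and outputs are stored under the hash of their bytes, so a changed upstream step that produces identical output does not trigger downstream reruns.

```python
import hashlib
import json
from typing import Callable, Dict, List


class ContentCache:
    """Toy content-addressed store shared by everyone running steps through it."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}    # content hash -> output bytes
        self.results: Dict[str, str] = {}    # step key -> content hash of its output
        self.executions = 0                  # number of steps that actually ran

    @staticmethod
    def digest(data: bytes) -> str:
        return hashlib.sha256(data).hexdigest()

    def put(self, data: bytes) -> str:
        """Register raw input data and return its content hash."""
        h = self.digest(data)
        self.blobs[h] = data
        return h

    def run(self, step: str, fn: Callable[..., bytes], input_hashes: List[str]) -> str:
        """Run fn on the given inputs unless an identical (step, inputs) pair already ran."""
        key = self.digest(json.dumps([step, input_hashes]).encode())
        if key in self.results:              # someone already did this exact work
            return self.results[key]
        output = fn(*[self.blobs[h] for h in input_hashes])
        self.executions += 1
        out_hash = self.put(output)
        self.results[key] = out_hash
        return out_hash


def clean_v1(data: bytes) -> bytes:
    """Fill empty trailing fields with 0."""
    return data.replace(b",\n", b",0\n")


def clean_v2(data: bytes) -> bytes:
    """A rewritten cleaning step that happens to produce the same bytes."""
    return b"\n".join(line + b"0" if line.endswith(b",") else line
                      for line in data.split(b"\n"))


def count_rows(data: bytes) -> bytes:
    return str(data.count(b"\n")).encode()


cache = ContentCache()
raw = cache.put(b"a,1\nb,\nc,3\n")

cleaned_a = cache.run("clean_v1", clean_v1, [raw])          # runs
cleaned_b = cache.run("clean_v1", clean_v1, [raw])          # second user: cache hit
cleaned_c = cache.run("clean_v2", clean_v2, [raw])          # new version runs, same output hash
rows_1 = cache.run("count_rows", count_rows, [cleaned_a])   # runs
rows_2 = cache.run("count_rows", count_rows, [cleaned_c])   # hit: its input content is unchanged
print(cache.executions)
```

In this sketch only three of the five requested steps execute; the duplicate request and the downstream step behind the rewritten cleaner are served from the shared store, which is the kind of cross-team saving described above.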
This is one of those things where it's designed to be a platform from the beginning rather than an individual developer's tool from the beginning, right?

And, and everything's gonna stream down from there. That is the sort of Tangle, uh, orchestrator, and it's, it manages jobs. We've seen a few versions of this, and this is obviously, uh, uh, the sort of, uh, unique approaches that you guys have, have, uh, figured out. And then there's Tangent.

[00:26:14] Mikhail Parakhin: Yeah. And Tangent is basically an automatic auto research loop that can help and kind of do your work for you.

Uh- ... you know, uh, effectively, effectively, Andrej Karpathy recently popularized it with auto research. Yes. Remember he said like he was, uh, speed running this, uh... Yeah, uh, you know the story. The, here we're basically bringing the same capability into Tangle so that, uh, the, uh, Tangent can analyze it. It's just an agent that can run multiple experiments, figure out what can be changed, and keep on rerunning it, keep on modifying until, uh, maximizing some goal, some loss function, whatever you need to, to achieve.

And in general, I would say if you're not using auto research-like approach in whatever you do, like literally whatever you do, then you're missing out. We saw it at Shopify taking off like a wildfire. Anything where you can put measurements can be done dramatically better. Our-

[00:27:19] swyx: Mm-hmm ...

[00:27:20] Mikhail Parakhin: uh, speed of, uh, templatization HTML, uh, completely new UX tem- uh, templatization of, uh, reducing latency for Liquid themes.

Uh, we-- Our, uh, search, uh, recently we moved from... It's hard even, uh, quote from eight hundred QPS to forty-two hundred QPS with the same quality just by pure optimizations and an auto research loop that kept running and changing code in our index serve on the same number of machines, just increasing the throughput.

We, we managed to improve the quality of gisting and machine learning process.
Uh, you know, gisting is the prompt compression technique that-

[00:27:59] swyx: allows for-

[00:28:00] Mikhail Parakhin: lower latency and, and lower and, uh, actually higher quality slightly. So like literally whatever different walks of life, and it doesn't have to be AI related.

Uh, we, we had a reduction in, uh, storage because the agents would go and find data sets that clearly are derivative, uh, and then you don't need to store things twice. You know, we, we, we found somewhat embarrassingly that it was one of the largest tables was hashing random IDs into another random ID, and we literally- Oof ... put only one. So it was translating, yeah, two random IDs hashed-

[00:28:36] swyx: into-

[00:28:37] Mikhail Parakhin: each. So, so-

[00:28:37] swyx: it has access to the code as well, so it can, it can check the, like what, what the hell is it doing?

[00:28:42] Mikhail Parakhin: So there, there cou- it could be run in two levels. You, uh, you know, at the superficial level, it could just use ex-existing components and, uh, reshuffle them.

Uh, you know, like you can grab- Yeah ... uh, XGBoost, and you can grab some, some Py- PyTorch module, and then can grab some, you know, grab another tools and, and combine them. At a deeper level, since Tangle is all sort of CLI based underneath you, every, every component is a wrapped really CLI, uh, call and a YAML file, it can analyze code and create new components and, and, uh, keep on iterating as well.

So, so you can, you can both have quick modifications of existing t- uh, pipelines with the, with components that are already there pre-baked, or you can create new components, uh, and-

[00:29:29] swyx: Yeah ...

[00:29:29] Mikhail Parakhin: keep iterating on those. So auto research is, again, this is probably the, the thing I was excited the most in the last two months happening, and we see it taking off like, like totally like a wildfire.

Just, uh, everybody, every day, every...
well, every day, every minute, I would, uh, have somebody Slack message saying, “Oh, look how much better I made it.” And, uh, it's all throughout the research.

[00:29:53] swyx: Is this democratized in some way in, in the sense that like is it your ML, uh, engineers and researchers doing this, or is it your regular PMs and software engineers also have the ability to auto-- to use Tangent?

[00:30:07] Mikhail Parakhin: This is an awesome question. Like, Tangle in general and Tangent in particular are extremely democratizing. Like they- Yeah ... they are the main tools for- ‘Cause I don't

[00:30:15] swyx: need the details.

[00:30:16] Mikhail Parakhin: Yeah. Exactly. Initially used by ML and AI engineers, but then literally, as you said, PMs are like... the highest user right now is one of the PMs in our org, uh, Sartak, and he was, he was number one by, by usage of, of this ‘cause they're just, uh, energetic and knowledgeable, and now it, it unlocks a lot of capability where you don't have to change code manually.

[00:30:39] swyx: I mean, I mean, because it kind of cuts out the ML, ML engineer from the process because the, the, the PMs have the domain knowledge and the ability to think about, uh, from first principles about, okay, what, what results do I want? And they can-- they even have the access to the data that, that needs to go in. So it's like in some ways, like this is the magic black box that we've always wanted for, for training and, and for, uh, I guess, uh, uh, hill climbing, whatever.

[00:31:04] Mikhail Parakhin: It's basically Claude Code for your AI development- ... uh, situation, right? Like now, now you don't have to know exactly how algorithms work. You can just, uh, bring your domain knowledge and expertise and product knowledge and iterate within Tangent until you've gotten the results that you need.

[00:31:21] swyx: In my previous roles, every time that someone has pitched AutoML, you know, I've always been like, “Uh, this is not, this is not gonna work. 
It's, you know, it's, it's always gonna be a flop.” Somehow it's working now. I mean, presumably the answer is now we have LLMs and it's good enough, right? It's, it's an emergent property that we can do auto research, but like, it doesn't feel that satisfying that how come we didn't do this before, right?Like we just did like parameter search and like, I don't know. That's maybe that's it.[00:31:48] Mikhail Parakhin: Yeah. Bayesian optimization and hyperparameter optimization was, was the one that, or facet of AutoML that was used very actively, which incidentally also built into, uh, Tango. But, you know, I know Patrice Simard very well, and, uh, he was such a, uh, such a proponent of AutoML, and he put, like literally spent careers trying to democratize it.Without LLMs, it just turned out to be very hard. Like it, you, you would have flexibility within certain narrow domain, but it was hard to wider scale, and now with LLMs suddenly it's like magic wand, and so suddenly everybody- ... is an AutoML expert.[00:32:28] swyx: Yeah, I, I think it's multiple things, right? Like I'm, I'm just gonna bring up the, the, the chart again, right?Like LLMs can do the monitoring very well. That is the very potentially unbounded, super unstructured. It can do the analysis very well, it can do the... Uh, and basically it is much more intelligence poured into every single step. Uh, there's maybe nothing structurally changed about AutoML, but this is just m-more intelligent and more unstructured.[00:32:53] Mikhail Parakhin: Exactly.[00:32:54] swyx: Any flaws that you've run into? Like everyone is like drinking the Kool-Aid, oh my God, time savings, uh, you know, performance improvements. Like what, what, uh, issues have you have, uh, come up?[00:33:06] Mikhail Parakhin: This is really cool. It's not a solution to all the world's problems for sure. 
The limitations are usually the ones I-- And this is where we get into a bit of a subjective territory. Uh, I can only share what I've, I've seen so far, and I'm sure the situation, uh, is changing, and, you know, maybe after I say it, like many people will reach out and say, “Hey, what about this?” And you don't know that, and then, then they'll probably be right. But what I've seen is auto research is very good at doing kind of obvious things that you don't have bandwidth to do or you didn't notice or maybe you're not aware of like the-- some standard practices. It is not good at doing something completely out of distribution, something that, you know, you have to think for, for multiple days, uh, and, and do something like none of this. So, so it's, uh, I, uh, set an experiment once, uh, on, on my sort of, uh, hobby thing, and I let it run for, uh, ended up, uh, several weeks run, uh, you know, it's like full production kind of scale, so it, you know, slow runs and, and it ex-- it performed in the end, uh, over four hundred experiments, and only one was successful. I'm like, “Okay, that's, that's good.” But-

[00:34:18] swyx: But it saved time.

[00:34:19] Mikhail Parakhin: Yeah, I saved time. Like it, it was the, that thing. Yeah, if I, if I were doing four hundred experiments myself, my batting average, as I said, would have been much higher, I'm sure. But also, first of all, it would take me like three years to do four hundred experiments. And, uh, I didn't have to do them. Like the machines were just, uh, the price of electricity did that. So, and I got one improvement, uh, that in, uh, my, my-- Honestly, when I was starting that experiment, my thinking was to go and show that, “Hey, Andrej, maybe you just don't know how to optimize.” And I was super smart because in, in my pro-problem, it was optimized for many years, and it was like fully improved. Uh, and I didn't expect it, you know, auto research to find anything at all. Yet it did. 
So instead of making fun of Andrej, I ended up, uh, a big, big supporter. Yeah, that's exactly the tweet. Yes.

[00:35:10] swyx: You and Toby really, really go back and forth on-online a lot, which is really funny. Uh, think of it as, as an eval for the optimalness of the code it's running on. Uh, it's almost like it reminds me of like a Kolmogorov complexity thing, but, uh, I guess it's-- there's some optimal thing that you're trying to sort of reduce down to, I guess. Um, and so, so you, you, you know, you should congratulate yourself that you had, uh, you know, uh, ninety-nine percent, uh, optimality.

[00:35:36] Mikhail Parakhin: Exactly, yeah. I think Andrej really deserves a lot of credit for popularizing this approach. This is, uh, this is incredibly, I think, powerful and cool and, you know, the, uh, even him, him just mentioning it led to a lot of gains in a lot of places in the industry, so we should be thankful.

[00:35:56] swyx: Yeah. I think he also has a just... I don't know what it is. Like, um, you know, it, it is a simple self-contained project that people can take and apply to other things, which is, is, is one thing, but also just the name. Just like somehow no one, no one managed to call their thing auto research. It's just naming things is very important. I think that that is mostly, uh, our coverage of Tangle and, and, uh, Tangent. I think obviously, you know, there's a lot of, uh, ML infra at, at Shopify that people can, uh, dive into. We're about to go into SimGym, but before I do that, any, any other sort of broader comments around this whole effort? Like where is it, where is it leading to?

[00:36:36] Mikhail Parakhin: As a segue to SimGym, like all those things start composing strongly. And, uh, you could see a huge unlock when you can look at each one of the tools and, and you see, oh, they're extremely useful. Uh, Tangle is useful by itself. Auto Research is useful by itself. SimGym is useful by itself. 
If you combine all three, you create like synergetic effect. I think that's why we wanted to even, uh, cover them today is because this is something that if you go back even, you know, five years ago, would've been unthinkable.Uh, replicating that, uh, would, would be either incredibly costly or impossible, right? With probably thousands of people are required.[00:37:20] swyx: Well, we have serverless human, uh, serverless intelligence, right? Like, uh, so yes, you do have thousands of hu-- of, of intelligences, not just, not humans. And that's, that's close enough, right?Even if they're not AGI, they're, they're close enough to do the, the task that you need them to do. And, and, you know, that's, there's plenty for, for a lot of routine work, knowledge work. Okay, let's get into SimGym. Um, this is one of those things I, I was surprised to see actually it's apparently your, uh, one of your most popular launches, and I think something that, uh, I think Sim AI, I think Yunjun Park, who did the Smallville thing, there's a very small cottage industry of people trying to do like the simulate customer thing.I think a lot of people maybe don't super trust this yet because they're like, well, obviously they would just do what you prompt them to do, right? But maybe just think, uh, tell us about the sort of inspiration or origin story.[00:38:10] Mikhail Parakhin: That's exactly actually the thing I wanted to cover, because if you don't have the historical data, all you can do is prompt a-agents in a vacuum, and they will do exactly what you prompt them to do.In fact, when I first proposed it, and this is a bit of, um, my brainchild initially, if I, I can boast, even Toby said like, “But wouldn't they, they just repeat what, what you tell them?” And, uh, but I'm like, “Yes, except Shopify has decades of history of how people made changes and what there is, uh, there, what it resulted in terms of sales.”So now what we can do is we can-- we have this... 
It's not, it's a noisy data. There's a small, usually websites, uh, you know, like things, things are never in isolation. It's almost never AB experiment. It's always AA experiment when there's has two meanings, but basically, you know, in different time you run two different things.But if you aggregate in general, uh, like everything together, and you apply, uh, denoising and collaborative filtering like approach, you can extract a very clear signal. And then you can optimize your agents. And that's why it took so long. It took almost a year of that optimization of just us sitting and fiddling, and, and we had this internal goals of correlation of hitting-- internal goal was to hit zero point seven correlation with, uh, add to cart events, for example.Like that, that if we run real AB test experiment, that it should, it should go and, and rep-uh, replicate, uh, same sort of success that, that humans had or lack thereof. And it, it took forever, and I don't think that's easily replicatable because, uh, like who else would have that data? You have to have this historic, you know, decades, uh, worth of data.And now, now the, like the other thing you need is in-infrastructure and the scale, right? Because, uh, w- again, what we found, uh, stat sig results, you need to run a lot of simulations, a lot of agents, and, and it's-- Those are expensive things. Like you're, you're making actions in the browser because you want a real friction.You want to, to be able to get the image like of what humans will see because you wanna, uh, detect effects like, “Hey, if I make my images larger, will I have more sales or l- uh, fewer sales?” And like usually people's intuition here, by the way, is that I increase my images, I will have more because they look nicer.You know, designers all look sparse and big images. Like usually your sales tank, right? But, but, uh, you know, from HTML, all the characters look the same only the, the size tag looks different, right? So it's very hard. 
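The correlation target mentioned above (simulated outcomes tracking real add-to-cart results at around zero point seven) can be made concrete with a plain Pearson correlation. This is only an illustration of the metric, not Shopify's evaluation pipeline. The function name `pearson` is hypothetical, and the two short lists are made-up placeholder numbers standing in for per-experiment lifts, not real data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only: lift predicted by simulated shoppers vs. lift seen in live tests.
simulated_lift = [0.02, -0.01, 0.05, 0.00, -0.03, 0.04]
observed_lift = [0.03, -0.02, 0.03, 0.01, -0.01, 0.05]
r = pearson(simulated_lift, observed_lift)
print(f"correlation: {r:.2f}")
```

A score near one would mean the simulated shoppers rank store changes the same way real shoppers did, and a score near zero would mean the simulation is only echoing its prompt.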
So you have to take visual information, you have to run this in simulated browser environment on the big farm and, and of course, you have to have, uh, like very, very expensive model, good model, a multimodal model. So all this it's-- is what's taken so long and, uh, to share my personal fail a little bit there, Shawn, is like, you know, we always had this bias to-- for like large company bias. You know, we always, uh, whenever you-- we do, we're like, “Hey, we'll run an experiment,” right? We make, make a change, and we will run an experiment and then, uh, see, uh, see which one's better or like, “No, this is worse,” and most of them are worse, so you discard it and keep iterating, hill climbing. And we're like, “Oh, like smaller merchants, they cannot get stat sig results. They cannot really run experiments simply because, you know, in a week there would be not enough data for them.” So we thought from this perspective. What we didn't realize is that most people don't have A and B, they just have one thing, and they need suggestions of what A and B should be. So, uh, we first build this, hey, we run simulation on two separate themes and, and, uh, say, “Hey, which one is better?” We then morphed it into, and very recently just released it, when you have just your site, your theme, we run over it and we say, “Hey, here's what predicted values of, of, uh, uh, conversions are, and here's how we think you should modify it to increase your conversions.” And then circling back to what you started with, the proof is in the pudding. Like, if we are not correlating with reality, like, people will not be using it. And, uh, thankfully, we see literally every day more users than the previous day. So, so right now, uh, right now- It's working. Yeah. 
I'm-- Right now my problem is how to pay for it all because, so, our major thing is how to optimize the LLMs, do distillation, how to run the headless browsers, uh, and headful browsers, uh, uh, cheaper so that we can accommodate the increase in traffic.

[00:42:47] swyx: Yeah. I, I understand that you, uh, you published a lot of technical detail at GTC, so I was just gonna bring it up a little bit. I think s- was this in, in con-conjunction with some kind of GTC presentation? Or something like that, right?

[00:42:59] Mikhail Parakhin: Well, we, yeah, we, we did it in several places, but yeah, we had the engineering- Yeah ... blog, uh, as well. Yeah.

[00:43:05] swyx: Yeah. So you're running, uh, GPT OSS. Uh,

[00:43:08] Mikhail Parakhin: the, this is an older version. You know, now we run multimodal model. But yeah- Yeah ... GPT OSS, we still run GPT OSS as well for

[00:43:15] swyx: And then you have the VMs, and you also have browser-based. I really like this one where you said, “It violates almost every assumption that standard LLM serving is designed for.” And then you had like, basically orders of magnitude differences between everything.

[00:43:29] Mikhail Parakhin: Exactly. Which is, which, uh, which was, you know, a bit of a challenge to implement, like when, like even simple things. Uh, be- since it violates all the assumptions, for example, multi-instance GPUs, like MIGs don't work as well. But we needed, uh, to get MIG to work because, ‘cause otherwise it's way too expensive. And so we had to deal with the, yeah, with, uh, lots of infrastructure and, and, uh, work with, uh, uh, Fireworks and CentML, uh, you know, to help with optimizations and browser-based, as you mentioned. Yeah, like, takes a village.

[00:44:04] swyx: Okay. So there's a lot of like, I guess, experimentation in the infrastructure so far, and you've published more or less what you have here. I guess I'm, I'm less familiar with CentML. 
I, I don't do, uh, that much work in this, this part of the stack. But why was it the sort of preferred instance platform?

[00:44:22] Mikhail Parakhin: There are really three probably top companies. There used to be, uh, uh- Three top companies, uh, at least I was aware of that did, uh, LLM optimization. You know, Together, Fireworks, and CentML, not necessarily in that order. CentML recently got acquired by NVIDIA. Uh, what they did is if you have a model and you want to optimize it to a specific prof-- uh, profile of usage, uh, they would go and do it. And, uh, we work with, with those companies, uh, this was work particularly with CentML and NVIDIA to get the best possible results out of it. And, and sometimes you, you have to retune depending on, like sometimes you want the maximum throughput, sometimes you want minimal latency, sometimes you want like the cheapest, right? And, yeah, or some combination. And so yeah, these are people who would come and help you.

[00:45:14] swyx: I see. I see. Yeah, yeah. I'm familiar with these people for the LLM, you know, autoregressive stack. But the other interesting category of these optimizers is also the diffusion people, whereas like Fal and, you know, uh, Pruna recently has come up a lot as well, which I think is like really underappreciated, uh, at least by myself, because I, I thought, oh, all the workload would be LLMs, but actually there's a lot of diffusion as well.

[00:45:38] Mikhail Parakhin: Exactly.

[00:45:38] swyx: There's a lot here, so I, I, I... it's, it's, uh, it's, it's, it's hard to cover. But I, I do think like people underappreciate the importance of customer simulation, basically. I think this is something that I'm candidly still coming to terms with. 
Uh, you know, uh, you also-- your team also like prepared this, like, really nice diagram.Uh, I, I assume this is AI generated.[00:46:00] Mikhail Parakhin: Yeah, it looks-[00:46:01] swyx: Maybe it's not.[00:46:01] Mikhail Parakhin: Yeah, it looks, uh, Gemini-ish. Yeah, but, uh, uh, honestly, I, I don't know where, where the hell they generated. It looks, look, uh, looks like it's, uh, Google. But the interesting part, John, that, that, uh, we haven't covered, but I, I wanted to mention is if your store had previous customers, rather than it's a new store, you're like new merchant just launching things, it helps tremendously in just correlation and forecast.Yeah, we take your previous, uh, customer's behavior, and we create agents that replicate those specific distribution of, of customers that you get, and then we a- we apply those to your changes, and then that, that raised raw, you know, the re-- uh, just correlation with the add to cart events or to-- with conversion or whatever it, it, it may be, uh, quite dramatically.So, uh, replicating humans in general seems like an interesting, cool challenge.[00:46:58] swyx: As a shareholder, I think this is the-- like if people are Shopify shareholders, they should really deeply understand this because this is basically the moat. The, the more you use Shopify, the more it will just automatically improve, right?Like you're, you're doing the job for them.[00:47:13] Mikhail Parakhin: Yeah, that's what we started with. Like, uh- ... uh, otherwise, if you're just a startup, I wouldn't do it if, uh, you know, if it was my startup because Without the data, it, yeah, as, as you said, it's, it's exactly the case that, uh, whatever you say in prompt, that's, that's what the agents will be doing.[00:47:30] swyx: The statistician in me wants to like really satisfy the sort of, um, statistical intuition, I guess. Um, to me it's kind of, uh, the, the word that comes to mind is, um, ergodicity. 
Uh, so let's say a, a customer takes this path, customer takes this path, customer takes this path, right? Um, the... In my mind, the way I explain it is like, okay, here, here's the ninety-five percentile, here's the five percentile, and here's the median, right?Um, but to me, what SimGym is potentially doing is that it can, uh, modify... It can sort of model the sort of in-between sort of journeys as well, that, that maybe are dependent on the previous states. This may be like a very RL-type conclusion where like basically the summary statistics, if you only did naive AB testing, you only have the, the statistics at, at, at a certain point, and you only judge based on the sort of overall summary statistics.But here you can actually model trajectories. Does that make sense? Or-[00:48:31] Mikhail Parakhin: That makes total sense because like, well, that, that makes even more sense that maybe even you realize bec- because-[00:48:38] swyx: Okay. Please,[00:48:38] Mikhail Parakhin: please. Yes ... we do-- Yeah. The, so internally, uh, we have this system, we talked about it briefly once at NeurIPS.We have a huge HSTU-based system that models the whole companies, uh, and their possible paths. And like- Yeah ... what you are, what you are showing, like actually at any point of time, you can either model the user's behavior or you mo- can also think about, uh, the whole merchant as a company, as the entity that acts in the world.You can model that as well. And then you can do, can do counterfactuals. In your graph, like in your blue graph, uh, if you're... Imagine in the center there, uh, somewhere in the middle, you would have an intervention. I give that person a coupon, or I don't know, I send a personal thank you card, or give a discount in some- somewhere.And then you can, uh, then you can do forward rollouts from that counterfactual. So what would have happened with that intervention or without the intervention? 
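To make the idea of forward rollouts from a counterfactual concrete, here is a toy sketch. It is not the HSTU-based system described above. It assumes a hypothetical three-outcome shopper model (browse, cart, then purchase or leave) with made-up transition probabilities, and a hypothetical coupon that raises the purchase probability from a chosen step onward. Each simulated shopper is rolled out twice from the same random seed, once with and once without the intervention, so the two arms differ only in the intervention.

```python
import random


def rollout(seed, steps, coupon_at=None):
    """Simulate one shopper; return True if they purchase within `steps` steps."""
    rng = random.Random(seed)
    state = "browse"
    for t in range(steps):
        # Hypothetical effect: the coupon adds 0.15 to purchase probability once issued.
        boost = 0.15 if (coupon_at is not None and t >= coupon_at) else 0.0
        r = rng.random()
        if state == "browse":
            if r < 0.30:
                state = "cart"
            elif r < 0.50:
                return False  # shopper leaves the site
        else:  # state == "cart"
            if r < 0.20 + boost:
                return True  # purchase
            if r < 0.45 + boost:
                return False  # cart abandoned
    return False


def purchase_rate(n_shoppers, steps, coupon_at):
    """Average purchase rate over paired seeds 0..n_shoppers-1."""
    wins = sum(rollout(seed, steps, coupon_at) for seed in range(n_shoppers))
    return wins / n_shoppers


baseline_rate = purchase_rate(20000, steps=10, coupon_at=None)
early_coupon_rate = purchase_rate(20000, steps=10, coupon_at=0)
late_coupon_rate = purchase_rate(20000, steps=10, coupon_at=5)
print(baseline_rate, early_coupon_rate, late_coupon_rate)
```

Sweeping `coupon_at` over the journey is the toy version of asking when an intervention should happen, not only whether it should.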
And you can even ch- change where that intervention, uh, in time can happen, right? Like some- where, where in this journey. So we, we do this at the Shopify scale for our merchants, and then if we notice that something that they can be fixing, like there's a strong counterfactual, like we have Shopify policy, they basically get a notification like, “Hey, we think your...something is wrong with your-” I don't know, Canadian sales. Like, uh, it looks like it's misconfigured. Here's what you need to do. Or do you think like, uh, you have to set up this campaign with these parameters? And we do that at the buyer level to literally offer discounts or cashback or, or things to buyers.So this is-- I'm getting very excited. Like this is my sort of area of, uh, interest, I guess, and, and hobby. But being able to m-model something complex as human beings or companies and model counterfactuals on it, where you can have interventions in the future and optimize when to make intervention, what kind inter-- uh, what kind of intervention to make.It's such an unlock that previously was completely impossible. Like the-- it was, it was always dreamed of, but never... Like how would you even simulate it without LLMs or HTUs? I think very, very exciting times.[00:50:59] swyx: I just wanted to, uh, to maybe illustrate this. I, I'm not the best illustrator, but I, I am a conceptual statistics guy.And y-you know, you cannot just do this. Like this is a dimensionality AB test doesn't do, right? Like, uh, because it doesn't have the, the, the change over time, uh, stochastic nature, uh, and it doesn't have the sort of contextual like... Here's all the context to this point. Um, okay, cool. Um, that's SimGym.You're, you're gonna burn a lot of tokens on this thing. But you're, you're one of the, the only scale platforms in the world that can, uh, that can do this across a huge variety of workloads, right? 
I'm even curious on a sort of human, uh, research level of like, well, do, does retail behave d-differently from like clothing sales? D-does that behave differently from electronic sales? I, I don't know. I don't know what else you guys... The Kardashian shoppers, do they differ from like people who buy, uh, I don't know, cars and, uh, whatever.

[00:51:55] Mikhail Parakhin: Well, very different, and different sensitivities and different modes of, uh, shopping and, and different levels of what's important. Now, to-totally, you can do aggregations at, uh, at a store level. You can do aggregations at a different, uh, category level. I don't know if, uh, you know, for our statisticians among us, I couldn't believe, but we-- recently we're looking at it, and we had to bring back, uh, CRPs, you know, Chinese restaurant process. It's a, like, way of aggregating and, like, naturally growing clustering. So across... Specifically to answer questions that, uh, like you were just posing on how, how if, if buyers behave differently across categories. And I'm like, “I haven't seen CRP since two thousand and one.” It's

[00:52:37] swyx: so... What? It's so- What is... No, I haven't, I haven't seen this. No. This is not in my training. Uh,

[00:52:44] Mikhail Parakhin: but, but yeah, it, uh, uh, it actually, like the, the-- it was a very popular kind of theory, popular in NeurIPS and ICML circles in the early two thousands, uh, kind of nice. And now, now it has practical applications, uh- Yeah ... that we were resurrecting.

[00:53:03] swyx: Yeah, amazing. Uh, I, I can see, I can see how this is like a, uh, a fun job for you where you get to apply all these things. Um, yeah, yeah, so super cool. Super cool. So, okay, so, so anyone who, who knows what CRPs are and has always wanted to use them at work, uh, they should, they should definitely join Shopify. Okay, so w-we have a lot and but I, I'm, I'm being mindful of the time. 
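For anyone who, like swyx, has not run into the Chinese restaurant process before, a minimal sampler shows why it gives clusters that grow naturally. Each new item joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. This is the textbook prior only, not how Shopify applies it to buyer categories. The function name `crp_assignments` and the parameter name `alpha` are choices made here for illustration.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster assignments from a Chinese restaurant process prior.

    With i items already seated, the next item joins cluster k with probability
    size_k / (i + alpha) and starts a new cluster with probability alpha / (i + alpha).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    sizes = []        # sizes[k] = number of items in cluster k
    assignments = []  # assignments[i] = cluster index of item i
    for i in range(n_items):
        r = rng.random() * (i + alpha)
        cumulative = 0.0
        for k, size in enumerate(sizes):
            cumulative += size
            if r < cumulative:
                sizes[k] += 1
                assignments.append(k)
                break
        else:
            # r landed in the final alpha-wide slice: open a new cluster.
            sizes.append(1)
            assignments.append(len(sizes) - 1)
    return assignments, sizes


assignments, sizes = crp_assignments(n_items=1000, alpha=2.0)
print(len(sizes), sorted(sizes, reverse=True)[:5])
```

The number of clusters is not fixed in advance. It grows slowly as more items arrive, and a larger `alpha` makes new clusters more likely.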
I, I do wanted to, to sort of cover some other things.Um, I-I'll give you a choice, UCP or Liquid?[00:53:30] Mikhail Parakhin: Liquid. I think, I think on UCP, you know, like UCP is very important for us and, and it just we are-- UCP, we have a structured, uh, discussions, and you can read about them, and we have, uh, blog posts, and we have a big release this week, in fact, like with our catalog.Oh,[00:53:46] swyx: okay.[00:53:46] Mikhail Parakhin: Uh, yeah,[00:53:46] swyx: but- Le-I mean, we, we can, we can discuss the, the, the release briefly because we'll release this after the-- after it's already announced so whatever. There's a catalog that you guys are doing?[00:53:55] Mikhail Parakhin: Yeah. So we are, we are- Okay ... we are bringing in capabilities of a whole, uh, Shopify catalog.Basically, you now you can search for products, you can do lookups by specific ID, you can do bulk lookups when you need to bring m-multiple products. You don't need to know in ad-in advance what you're trying to show or to sell or check out. Like, you can now, you can now have this decided at, at runtime, and this big area for investment for us for both non-personalized and personalized searches, trying to provide basically a win-window into whole universe of products that are being sold everywhere in the world.And Shopify is really not exactly, but almost like a super set of any-anything being sold. Now we are bringing it into UCP and, uh, and, uh, identity linking is another big thing for us, uh, so that you, you can use, uh, like Google or whatever, whatever identity you have, uh, they're minimizing friction.[00:54:56] swyx: Yeah. So[00:54:57] Mikhail Parakhin: yeah, big release for us.But Liquid AI of course we never talk about, and the problem might be more, more aligned with what we d-discussed previously on this chat.[00:55:07] swyx: Sure. The main thing that everyone understands about Liquid is that it is inspired by Worm, and I still don't know why. 
I'm curious on your explanation. I think you, you, uh, you can make things very approachable. And also I think like what is the potential of like the, the level of efficiency that you get out of Liquid?

[00:55:23] Mikhail Parakhin: You- we're all familiar with transformer architectures. And, uh, for the longest time, there was a competing architecture, it's called the state space models. So, so SSMs, uh, you know, Chris, Chris Ré, one of the pioneers and, and lots of startups, uh, trying to make those realities. They have, uh, significant benefits, the main being, uh, being much faster and, uh, lower footprint and not quadratic in length, you know, sort of, uh, linear in, in, uh, in your context length. But with state space models- They never quite made it. Like they're used-- They have, uh, certain niches when they thrive, their hybrid architectures are useful, but they never quite made it. And liquid neural networks are, you can think of them as a next step, like, uh, sort of, uh, state-space model squared. It's non-transformer architecture that's more complicated than sta-state space and really difficult to code if you-- if I'm being honest. But it's, um, very efficient. It's, uh, subline-- sub, uh, quadratic in, in length of your context. Uh, it's very compact way to represent things, and that's Liquid AI, the company. They... Their goal is to productize it, and very often you have this need, uh, when you need to have long context and small model, and you want to have low latency. Like in general, it's basically on par with transformers, and if you do hybrids with transformers, it's, it's even better. That's why we at Shopify, when we tried multiple and we constantly try multiple models, multiple companies, we found that for small, particularly with low latency applications, when you have low latency and/or if you need longer context lengths, Liquid was the best. 
And so we still use the whole zoo and always like obviously test and use everything, uh, every open source model and, you know, it feels l

god ceo netflix ai google pr running change canadian chinese microsoft transition budget hiring phase id honestly ab windows lines limits pi kardashians cto jupiter gemini slack openai nirvana fireworks aa explosion wordpress nvidia pulse shopify ux chatbots stack pms liquid tango gpt bing python data science mm optimization ml worm github 1b token kool aid usage llm depend 200k anthropic ids copilot opus html stacks tangents ether dao agi mamba sha kimi smallville ides tangent ide codex sidekick sql sams prs git gpus mig xai cursor bayesian rl ci cd crp lm differential fel tangle cli jensen huang ctos vms graphite yandex cuda liquids gtc crps migs ucp 200b gbm airflow clustering sqlite aie yaml counterfactuals robustness tsv ssm andrej karpathy automl how pr chaos monkeys neurips rnn qps jfs pyth xgboost cloud code dagster kolmogorov yugabyte ssms shopfind

349: Gmail Finally Lets You Ditch xXDragonSlayer2004Xx

The Cloud Pod

Play Episode Listen Later Mar 31, 2026 64:27

Welcome to episode 349 of The Cloud Pod, where the weather is always cloudy! Justin and Jonathan managed to make it into the studio this week, and they brought a guest! Dave Garaway has joined us, and brought some on-the-ground knowledge from GTC, plus a slew of supply chain attacks, Gmail username changes and Claude's code debacle. We've got all this and more – so let's get started!

Titles we almost went with this week
AWS Console Gets a Makeover Nobody Asked For
From Eight Hours to 22 Seconds, Hackers Got Fast
AWS Spring Cleaning Hits Nine Services Hard
Trivy Pursuit Turns Into a 500K Credential Heist
Skip the Consultant, AWS Security Now Hacks Itself
AWS Pen Testing Agent Pokes Your Cloud Around the Clock
Your Cringey Gmail Address Gets a Second Chance
Stop Babysitting Servers, Let Google Handle MCP
AI Agent Untangles Your Kubernetes Networking Spaghetti
One Bad Actor Poisons a Hundred Million Downloads
Lambda Finally Hits the Gym with 32 GB
From GPU Hype to Production Inference Without the Hyperscaler Headache

Follow Up
01:28 Hegseth, Trump had no authority to order Anthropic to be blacklisted, judge says
A US District Judge granted Anthropic a preliminary injunction blocking the Department of War’s blacklisting, ruling the designation was First Amendment retaliation rather than a legitimate national security action. The court found officials lacked authority to blacklist Anthropic without considering less restrictive alternatives or providing evidence of an urgent security risk, noting the designation was triggered by Anthropic’s “hostile manner through the press.” The practical business impact was already substantial before the ruling, with three trade deals cancelled and other potential partners delaying negotiations, representing potentially billions in lost contracts over five years. 
Anthropic continues to balance the legal fight with maintaining its government relationships, publicly emphasizing alignment with the Department of War’s mission around safe AI deployment even while litigating against it. For cloud and AI vendors, this case establishes a notable precedent around government procurement decisions and First Amendment protections, with implications for how companies publicly challenge federal contracting positions. 02:35 Jonathan – “I'm guessing Anthropic is super busy with all the people coming to them for deals right now, because it seems to me that Anthropic is getting all the business customers and OpenAI are getting the personal customers.” 04:08 Delve Announces Changes and New Customer Support Measures Delve has


Why Social Media Lost in Court and AI Agents Demand Total Surveillance – Shelley Palmer's 5th Visit

This Week in XR Podcast

Play Episode Listen Later Mar 31, 2026 53:47

Shelley Palmer, media technologist, advisor, and author with over 700,000 daily newsletter subscribers, returns to the show. He's one of the sharpest thinkers writing about AI today, and this conversation covers the full arc: from social media liability to the trust collapse coming for all of us, and into the real productivity gains and surveillance trade-offs of living inside an AI-first workflow.

The episode opens with the Google and Meta lawsuit verdict and quickly moves past the legal question. Shelley's position is precise: you can't legislate parenting, but you can legislate transparency, and the tech industry has failed on that front entirely. The $6 million judgment against Meta and Google is a rounding error — not a deterrent. What matters is what platforms actually engineered: engagement above all else, backed by neuroscience, probabilistic math, and dopamine feedback loops optimized for shareholders, not users.

AI XR News You Should Know: OpenAI is ending Sora and pivoting hard to Codex and enterprise. Ben Affleck secured $900 million from Netflix for a custom AI filmmaking tool. Epic Games cut 1,000 jobs as Fortnite loses audience. NVIDIA's Jensen Huang introduced Nemo Claw and Open Shell at GTC — a corporatized framework for personal AI agents.

Key Moments
[00:01:15] – Charlie opens noting the show missed one episode in nearly 300 — his daughter's wedding
[00:01:55] – OpenAI kills Sora; the Critters director goes dark before the episode
[00:04:45] – Google and Meta lose their social media addiction lawsuit; Meta also loses in New Mexico
[00:08:07] – Shelley on what can actually be legislated: not parenting, but transparency
[00:11:42] – Shelley on Zuckerberg: he genuinely believed connection would be net positive; ask him today
[00:13:31] – "Planetarily net negative. No matter what good it does, it does more harm."
[00:18:16] – Rony on dopamine engineering: neuroscientists studying pixel size, color, sound to refine addiction
[00:19:40] – Shelley reframes it: engagement maximization for shareholders, no more insidious than that
[00:23:19] – The physiological change argument: humans evolved to default to trust; AI-generated everything breaks that
[00:31:50] – Rony's counterpoint: trust will reset local; the software ecosystem will follow
[00:36:53] – Shelley: "Our business increased last year. Everyone on my staff is doing 400 times the work."
[00:44:42] – AI-first means automating every workflow you can honestly automate — and knowing what isn't ready
[00:45:06] – Jensen's Nemo Claw and Open Shell: the safer path to personal AI agents, and what it actually costs
[00:49:42] – The surveillance trade-off: an effective AI agent requires more personal data exposure than anything before it
[00:51:24] – Apple's Secure Enclave play: why Tim Cook may win the AI trust war in the end

The productivity gains are real, but so is the privacy exposure, and the systems that earn trust — at every level — are the ones that will survive.

This episode is brought to you by Zappar, the company behind Mattercraft — the leading visual development environment for building immersive 3D web experiences across mobile, headsets, and desktop. Mattercraft now features an AI assistant that helps you design, code, and debug in real time, right in your browser. Start building at mattercraft.io. Subscribe to the AI XR Podcast wherever you listen.

Watch the full episode for the full breakdown. Available where podcasts are. Full videos available on YouTube. https://youtu.be/S_AECjELYyo

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Mistral: Voxtral TTS, Forge, Leanstral, & what's next for Mistral 4 — w/ Pavan Kumar Reddy & Guillaume Lample

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Mar 30, 2026 48:48

Mistral has been on an absolute tear - with frequent successful model launches it is easy to forget that they raised the largest European AI round in history last year. We were long overdue for a Mistral episode, and we were very fortunate to work with Sophia and Howard to catch up with Pavan (Voxtral lead) and Guillaume (Chief Scientist, Co-founder) on the occasion of this week's Voxtral TTS launch:

Mistral can't directly say it, but the benchmarks do imply that this is basically an open-weights ElevenLabs-level TTS model (technically, it is a 4B Ministral-based multilingual low-latency TTS open-weights model that has a 68.4% win rate vs ElevenLabs Flash v2.5). The contributions are not just in the open weights but also in open research: we also spend a decent amount of the pod talking about their architecture that combines auto-regressive generation of semantic speech tokens with flow-matching for acoustic tokens (typically only applied in the image generation space, as seen in the Flow Matching NeurIPS workshop from the principal authors that we reference in the pod).

You can catch up on the paper here and the full episode is live on YouTube!

Timestamps
00:00 Welcome and Guests
00:22 Announcing Voxtral TTS
01:41 Architecture and Codec
02:53 Understanding vs Generation
05:39 Flow Matching for Audio
07:27 Real Time Voice Agents
13:40 Efficiency and Model Strategy
14:53 Voice Agents Vision
17:56 Enterprise Deployment and Privacy
23:39 Fine Tuning and Personalization
25:22 Enterprise Voice Personalization
26:09 Long-Form Speech Models
26:58 Real-Time Encoder Advances
27:45 Scaling Context for TTS
28:53 What Makes Small Models
30:37 Merging Modalities Tradeoffs
33:05 Open Source Mission
35:51 Lean and Formal Proofs
38:40 Reasoning Transfer and Agents
40:25 Next Frontiers in Training
42:20 Hiring and AI for Science
44:19 Forward Deployed Engineering
46:22 Customer Feedback Loop
48:29 Wrap Up and Thanks

Transcript

swyx: Okay, welcome to Latent Space.
We're here in the studio with our guest co-host Vibhu. Welcome. Thanks. Excited for this one, as well as Guillaume and Pavan from Mistral. Welcome. Excited to be here.

Guillaume: Thank you.

swyx: Pavan, you are leading audio research at Mistral and Guillaume, you're Chief Scientist,

Announcing Voxtral TTS

swyx (Host): (00:05) Okay. Welcome to Latent Space. (00:06) We're here in the studio with our trusty co-host, Vibhu. (00:09) Welcome.

Vibhu (Host): (00:11) Very excited for this one.

swyx (Host): (00:12) As well as Guillaume and Pavan from Mistral. (00:15) Welcome. (00:16) Excited to be here. (00:17) Thank you for having us. (00:18) Pavan, you are leading audio research at Mistral and Guillaume, you're a chief scientist. (00:23) What are we announcing today where we're coordinating this release with you guys?

Guillaume (Guest): (00:26) Yeah, so we are releasing Voxtral TTS. It's our first audio model that generates speech. It's not our first audio model. We had a couple of releases before. (00:35) We had one in the summer that was Voxtral, our first audio model, but it was a transcription model, ASR. A few months later, we released an update on top of this, supporting more languages, and also a lot of table-stakes features for our customers: context biasing, precision, timestamping and transcription. We also have a real-time model that can transcribe not just at the end of the audio. (00:56) You don't need to feed your entire audio file; it can also transcribe in real time. And here, this is a natural extension in audio, so basically speech generation. So yeah, we support nine languages, and this is a pretty small model, a 3B model, so very fast, and also state of the art.
It performs at the same level as the best models, but it's much more efficient in terms of cost. It's also much cheaper, only a fraction of the cost of our competitors. (01:22) And we are also releasing the weights of this model.

swyx: What's the decision factor?

Guillaume: It's a good question.

swyx: There will be more. Yeah, Pavan, any sort of research notes to add on?

Architecture and Codec

Pavan: It's a novel architecture that we developed in-house. We iterated on several internal architectures and ended up with an autoregressive flow-matching architecture. We also have a new in-house neural audio codec, which converts audio into 12.5 Hz latent [00:02:00] tokens: semantic and acoustic tokens. And yeah, that's the new part about this model, and we're pretty excited that it came out with such good quality. And as Guillaume was mentioning, it's a 3B model. It's based off of the Ministral model that we released just a few months back [unclear], and it's mainly meant for the TTS stuff, but the native text capabilities are also there. Yeah.

swyx: So there's a lot to cover. I love anything to do with novel encodings and all those things, because I think that obviously creates a lot of efficiency, but also maybe bugs that sometimes happen. You were previously at Gemini and you worked on post-training for language models, and maybe a lot of people will have less experience with audio models just in general compared to pure language. What did you find that you had to revisit from scratch as you joined Mistral and started doing this?

Understanding vs Generation

Pavan: At least when it comes to audio, I think there are two buckets, I guess: audio understanding and audio [00:03:00] generation.
The audio understanding, like the Voxtral models that Guillaume was mentioning that we released earlier, the Voxtral chat that we released I think in July last year, and the follow-up transcription-only model family that we released in January, that would be one bucket, and the generation is another bucket. I think you can also treat them as a unified set of models, but currently the approaches are a little different between these two. To your question on how audio is fed to the model: in the understanding model, it's actually very similar to the Pixtral models that we also released.

swyx: Yes.

Pavan: That's

swyx: amazing.

Pavan: That was the first project I worked on after joining Mistral. It was pretty, pretty nice. And Voxtral was very similar in spirit, I guess. So we feed audio through an audio encoder, similar to images through a vision encoder, and it produces continuous embeddings, which are fed as tokens to the main transformer decoder model. Yeah. And the model output is just text. So on the output side, there is nothing that needs to be done in these kinds of models. I [00:04:00] guess the interesting part of the generation stuff is that the output now has to produce audio. The approach that we have is this neural audio codec, which converts audio into these latent tokens. There is a lot of existing literature and a lot of models which are based off of this kind of approach, and we took slightly different design decisions around this. But at the end of the day, the neural audio codec converts audio into a 12.5 Hz set of latents, and each latent has a semantic token and a set of acoustic tokens. And the idea is that you take these discrete tokens and then feed them in on the input side. There are several ways to use this at each frame, but we just sum the embeddings. So it's like having K different vocabularies; you combine all of them because they all correspond to one audio frame on the input side. The output side is the interesting part. On the output side, I don't know if it's the most popular, but one
The output side is the interesting part on the output side, the, it's not the, I don't know if it's the most popular, but one.Popular technique is to have a depth transformer [00:05:00] because you have K tokens at each time step, like with a text, you just have one token at each time step. So you just do predict the token from the vocabulary with, yeah, with just, you get probabilityswyx: This's a very straightforward text. VeryPavan: straightforward.swyx: Yeah.Pavan: But if you have K tokens, then the name thing would be to predict all of them in paddle. That doesn't work. At least that doesn't work that well because audio has more entropy. And the, one of the techniques people use is this depth transformer where you you almost have a small transformer, or it can be L-S-T-M-R in as well, but people use transformers and you predict the K tokens in auto aggressive fashion in that.So you have two auto reive things going on.Flow Matching for AudioPavan: So the thing we did differently is in, instead of having this auto aggressive K step prediction, we have a flow matching model. Instead of modeling this as a discrete token set we trained the codec to be both discrete and continuous to have this flexibility.So we did try the discrete stuff too, and which it works well, but the continuous stuff works just better. So yeah, we took this flow matching, so the, it's a flow [00:06:00] matching head, which takes the latent from the main transformer and like kind in fusion, it's denoising, but in this flow matching itself, velocity estimate.So you go from this noise t all the way to there. Audio latent, which corresponds to the 80 millisecond audio and then, which is sent through the work order to get back the 80 millisecond audio frame.swyx: Yeah. Is this the first application of flow matching in audio? Because usually I come across this in the image.Pavan: Yeah. 
Actually, in some sense there are flow-matching models in audio, but I think this specific combination... I could be wrong, there could be some, but I haven't seen much work in this, so I think it's novel. And vision is just a way bigger community, so I think they pioneer a lot of this diffusion and flow-matching work, and it's interesting to adopt some of the ideas there into audio.

swyx: Yeah.

Pavan: Yeah, personally, that's the part which I'm excited about. One more meta point is that, unlike text (even in vision, I think this is true), in [00:07:00] the audio literature there is no winner model yet. There is no "okay, this is the way you do things." I think people are still iterating and figuring out what's the best overall recipe. I'm pretty sure there are models which are also completely end-to-end, like native audio, but it has still not come to a convergence point where this is the right way to do things. That also makes the space pretty exciting to explore.

Real Time Voice Agents

Vibhu: What are some of the ways to look at it? There are ways where you can do diffusion for audio generation, but if you want real-time generation, that's a big thing with the approach I'm assuming that you took. And also, how do you go about evaluating different axes of what you care about?

Pavan: Yeah, good point. So you can do just flow matching or diffusion for the whole audio. We didn't even go down that path, because one of the main applications is voice agents and we want real-time streaming, and that's the use case. That's not the only use case, but that's one of the primary use cases we want to get to. So we [00:08:00] picked the autoregressive approach for that. And within the autoregressive space, again, you can do chunk by chunk, or you can do... so we picked the
I think I, at least personally, prefer the approaches which are the simplest, and so we tried to see: can we just add audio as just another head to our regular transformer decoder model? Because that kind of makes it easier for eventual end-to-end, native modeling of audio and text. Yeah. And it works pretty well. So I guess we went with that. And we tried a little bit with the flow-matching head itself; we had a discrete diffusion kind of approach, which also works well, but the flow matching worked better.

swyx: I was just curious about how you also think about this overall direction of research. When you work with the audio team, do you set some high-level parameters and then let them explore whatever, or how does it work between you guys?

Guillaume: No, I think the way it works is that we are prioritizing together what are the most important features, because there are many things we can do [00:09:00] in audio. And we decide how we should do things. For instance, ultimately what we want to do is to build this full-duplex model, but we are not going to start there directly, which I think is what some other people are doing, but

swyx: Just to confirm, full duplex means it can speak while I'm speaking, or

Guillaume: Yeah. So ultimately we're going to get there, but for us, we decided to take it step by step. So we start with whatever is the most important for our customers, I think, which is transcription; it's the most popular use case. Then speech generation, and real time just a bit before that. And then try combining everything all together. But yeah, we thought it was also important to separate things and optimize each capability one by one before we

swyx: merge all of that together. And the super omni model.
But

Guillaume: It's very interesting because, as Pavan said, when you work on some other domains, [unclear] and everything, there are many areas where I think it's not as interesting. For instance, in many places it's essentially just around data, or creating new environments, and a lot of kind [00:10:00] of easy things, where I think the research is maybe not as interesting. Whereas in audio, there are so many ways to actually build this model, so many ways to go around it. In that sense I think it is really interesting. And for speech generation we tried multiple approaches. What was interesting is that even though they were extremely different, [unclear], but the flow matching turned out to be quite a bit more natural. So we are happy with this.

swyx: Is there intuition why flow matching just models speech better in some natural, fundamental latent dimension?

Pavan: No, I think the main thing is, even at a particular time step, there is a distribution of things

swyx: Yes.

Pavan: to be predicted, like the way you inflect. You already know the word that you're speaking. In text space, let's say the word maps to a single token for simplicity; in most cases it does. So you just pick the word. But within audio, even the same word, even with your own voice, could be inflected in so many different ways. And I think [00:11:00] you want any approach which models this distribution, and flow matching is one of the techniques. It's not the only one at all, but it's one which works reasonably well. The intuition I have is that there are several different clusters, each corresponding to some specific way you would inflect or pronounce that thing. And you can't predict the mean of it, because that corresponds to some blurred-out speech or something like that.
But you have to pick one. And then, like, sharp

swyx: conditional inference.

Pavan: Yeah, exactly.

swyx: Is that all covered under disfluencies, which is, I think, the normal term of art? Pauses, intonations. By the way, I have to thank Sophia for setting all this up, including some of these really good notes, because

Pavan: Yeah.

swyx: I'm less familiar with audio.

Pavan: No, I think disfluencies are definitely one such. Disfluencies are more like

swyx: which is ums and ahs.

Pavan: Yeah, ums. And also repeats.

swyx: Yeah.

Pavan: You use these filler words while you're thinking, so you repeat the word.

swyx: Okay. Whereas intonation is different; it's up- [00:12:00] speak and all this. Okay.

Pavan: Yeah. So I think there is a lot of entropy, and you are modeling it as a distribution, with any technique which helps with it. The depth transformer is a conditional way of modeling this, and transformers are actually really good at it, even though it's a mini transformer. So that worked pretty well for us too. It's just that the main consideration is, when you have a depth transformer, if you have K tokens, you need to do K autoregressive steps, right? Even though it's a small thing, it's K steps, which is very latency-heavy. But with flow matching, we were able to cut it down significantly. So we are able to do the inference in four steps or 16 steps and it works pretty well. And there are more techniques to bring it down even further, in the extreme case to one step. We're not doing it yet, but at least the framework lends itself to being more efficient.

swyx: Yes. And the image guys have done

Pavan: Yeah.

swyx: incredible work. Yeah.

Pavan: Now you just send a prompt and you get an image.

swyx: Yeah. Surprisingly, not enough image model labs, I think, use those techniques in production.
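Pavan's earlier remark that "you can't predict the mean" and "you have to pick one" can be made concrete with a small numeric illustration. Everything here is hypothetical: the one-dimensional "pitch offset" values, the two clusters, and the helper name `nearest_rendition_distance` are invented for the example and are not drawn from any real speech data or from Mistral's model.

```python
from statistics import mean

# Hypothetical 1-D "pitch offset" values for the same word spoken two ways:
# a falling inflection (around -1.0) and a rising one (around +1.0).
falling = [-1.1, -1.0, -0.9]
rising = [0.9, 1.0, 1.1]
observed = falling + rising


def nearest_rendition_distance(value):
    """Distance from `value` to the closest observed rendition."""
    return min(abs(value - o) for o in observed)


# A point predictor trained with squared error converges toward the mean,
# which sits between the two clusters and resembles neither inflection.
mean_prediction = mean(observed)

# A distribution-aware sampler commits to one mode; as a stand-in, we simply
# pick one observed rising rendition.
sampled_prediction = rising[1]

print(mean_prediction, nearest_rendition_distance(mean_prediction))
print(sampled_prediction, nearest_rendition_distance(sampled_prediction))
```

The mean lands far from every rendition a speaker would actually produce, which is the "blurred-out speech" failure mode; committing to one cluster avoids it, and both the depth transformer and the flow-matching head discussed in the episode are ways of making that commitment.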
I think it's, I feel like it's a lot of research demos, but [00:13:00] nothing I can use on my phone today.Guillaume: The thing, there's a thing that would be interesting here is that since, indeed I've been so much sure that has been done in the vision community compared to radio dys, stomach, I think there are so many long infra Yeah.And there are so many things we can do to actually improve this further. So it's our first version, but we have so many ways to exist, much better and much more efficient, cost efficient, soswyx: yeah.Guillaume: So really it's not a new field at all, of course, but there are still so many things that can be done.Perfect. It'sswyx: nice. I should also mention for those who are newer to flow matching, I think the creator, this guy's name is Alex, he's done I think in Europe's maybe two Europes as ago. There was, there's a very good workshop. There's one hour on like this matching is I would recommend people look that up.That's the other thing, right?Efficiency and Model Strategyswyx: The efficiency wise, like I, I imagine like the reason is open weights the reason you pick 3.6 B backbone it you are 3.4 B you are, try to fit to some kinda hardware constraints. You kinda fits some kinda basic constraints. What are they?Guillaume: Not necessarily, I think something we care about in our model that they're efficient.So we have a [00:14:00] lot of separate model, for instance. So we have this that is very small, very efficient. We also have a small OCR model that is available. Good, highly efficient as well. And I think on a project maybe there, I think companies are going to take is to have a coverage general model that will do a bit of everything.But that is also going to be expensive. On here. What want say is if you care about this specific use case, if you can actually use this model, it just does that. It's extremely good at it. Survey, very efficient. That's why we can actually add. 
We do, but also OCR that are like really good at that.And that would be much more cost effective factors and the general model that will contain a lot of capabilities you don't really need. So yeah. So we're doing like general model, but also like more customized model. This,Open Weights and BenchmarksVibhu: how does it compare to other TTS models? It's, we are going follow open wave.We're just dropping it. I think it's pretty good.Pavan: Yeah, I think it's pretty good. Like it, it's definitely one of the best. For sure. It's probably I would say it's the best open source model, butVibhu: decipher themselves.swyx: Yeah.Voice Agents VisionVibhu: Why now? How does it fit into broader ral vision? How do you see voice agents?How do you see voice? I think every year I've heard, okay, you're a [00:15:00] voice. You're a voice. There's a lot of architectural stuff. There's a lot of end time that see it, your solving, but where do you see voice setting?Guillaume: We had so many customers asking for voice. That's also why we wanted to build it.What's interesting in this domain is that. In a sense, if you take something simple like transcription it doesn't seem like something that should be very hard to do for a model. It's essentially, it's pattern recognition. It's classification on this. Models are very good at classifying, right?Or nonetheless, when you talk to them it's not there yet, right? It's not, you don't talk to them the same way you talk to a person. On something, maybe people don't realize it. It's in English it's still much better than in any user language, even compared to French instance. If you talk to this million in French, when you see people talking to this they'll talk very slow.They'll articulate as much as they can. So it's not natural, right? We're not yet to this. And I think, yeah, maybe the next generation will not know this, but yeah, I think people that. 
But our edge will actually always keep this bias speaking very slowly when they talk to this model. Even if maybe, probably in a couple of years, maybe next year it'll not be necessary anymore.But yeah. But what's interesting is to see that yeah, even for like languages [00:16:00] like yeah, French and Spanish Germans that are not no, no resource on religion. You have a lot of audios there on still it's not as good. And I think a consequence. Because then for this, I suppose just is not as much energy, as much effort that has been put done in some other mod that for some vision or like coding.But but yeah, there's still a lot of progress to be done. I think it's just a question of doing the work and it's clear path I think to get there.Pavan: It's a little fascinating because I worked on Google Assistant I think while back at this point, but it's, I think it's, it like when you take a step back, it's fascinating.It's not that long ago. It was like four years ago or five years ago, and it's now it's completely audio in, audio out and the function calling and the whole thing happens completely end to end. And in a very natural,swyx: yeah,Pavan: natural way and still ways to go. Kim was telling, even despite all the previous, it's not like you're speaking to a person.When you talk to any of these agents, bots, or voice mode kind of situation, it's still like a gap. I think that's the great part and I feel like with even the existing [00:17:00] stack, we should be able to get to this very natural speech conversational abilities soon enough I guess.And we'll also hope. I get thatGuillaume: on this kind of the next step, right? Because when you talk to these agents, like usually people are just writing to them and sometimes they'll this very clear, for instance, you are, you want to write code, but you are, you have a very clear idea of how you want the model to implement what you in mind.But so here you are able to spend a lot of time writing. 
So it's not really efficient on audio is really like a natural interface that is just not there yet, but I think it's just gonna be the place.Vibhu: How's it like building, serving, inferencing, like we see a lot about, it's very easy to take LMS off the shelf, serve them.Fine tuning, deploying. I know you guys have a whole you have Ford, you have a whole stack of customizing, deploying. Is there a lag in getting that. Like distribution channel. Are you helping? There is. So like prompting, lms, you can have them be concise, verbose, all that.They're built on LM backbones, these models. How do you see all that?Enterprise Deployment and PrivacyGuillaume: Yeah, I think this is a lot of what we're doing with our own customers. Very [00:18:00] often they come to us, so it's for different reasons. I think one reason is sometimes they have this lot of privacy concerns.They have this data that it's very sensitive. They don't want data to leave. The companies, they wanted to stay. Inside the company. So we have them deploy model in-house. So either on a, either on premise or on private cloud. So they're not worried that it's given to a third party on the there some leakage.Sometimes they have this kind of many companies have this different, sensitivity of data they have like sometimes channel chat can send it to the cloud has to stay there. So then it creates some kind of heterogeneous workflows where it's annoying. You cannot send some data to the cloud.This one you can, so here, when we actually deploy the model for them, they don't have this consideration. They are like not worried that, this is going to leak. Everything is much easier. So we help them basically do this on the, so it's one of the very proposition. But but the other is very often, when customers use this off the shelf close model, but very sad is that they are not leveraging, these data that have been collecting for four years or something for decades.So much data. 
Sometimes it's trillions of tokens of [00:19:00] data in a very specific domain. Their domain, which is data that you'll not find in the public, on the public internet. So data on which, like close model, we actually not have access to one, which that's going to be really good. So if they're using like closed source models are basically not benefiting from all these insights.All these data they have collected three years, they can always give it into the context that in France, but is never as good as if you actually train the modern analysis. So yes, that's basically what we help them to do. We actually provide them some purchase, basically what we announced at GTC this week.So we provide them with this, it's basically like a platform with a lot of tools to actually help them process data. Trained on that. Yeah, it's actually the same thing that we're using in the science team. So it's actually very better tested infrastructure, like a lot of efficient training cut base.For a quality pre-training like a fine tuning, even doing S-F-T-I-L. So we help them do this using the same tools as what our science team is building is using. So since it's tools that we've been using for two years now, it's really better tested. It's really sophisticated.So it's the same thing. We are giving to them, giving the company the same thing [00:20:00] that what are same still using internally actually build their own ai and it makes a really big difference. I think sometimes customers. And many in general don't realize how much better the model becomes when you fine tune it on your own data.And you can have a, your model is here. You start from there. You have a cross source model, which is sort here, but if you actually fine tune it can actually really go much further than this. And then you have a very big advantage. The model is trained on your entire company knowledge, so it knows everything.You don't have to feed like 10 K tokens of contact at every query. 
So it's it's much easier. It's a bit, I think using a closed source model is really sad because it basically puts. You are not leveraging all this data and you are going to be using the same model as all your old competitors when you're actually using, everything you have been collected for years, which is really valuable.So yeah. So we help basically customers do this. We have a lot of solution I mean deployed for engineers that go in the company that basically look at the problem customers are facing to look at what they're struggling to do what we should do to solve it. So we help them solve them together.So it's I think our approach is a bit different, but here. [00:21:00] Some of their companies and competitors, it's, we don't just release an endpoint on sale, do some stuff on top of that, or we don't just give a checkpoint. We really look very closely with customers. We look at the issues they have, we had them solve them.We really make some tailored solution for the client are facing. Some example are also going to be, sometime we have some customers. They really wanted to have a really good model, really performance on some, like Asian languages on the, if you take some of the shelf models, they can speak it, they can write in this language, but it's not amazing.This language would be like maybe zero 1% of the mixture. So it has been included during training, but very little. So what we did here is upgrade. We trained a new model for them, but so this language was 50% of the mix, so it's much, much stronger. It knows of the dialects, it knows the, so it's yeah.So it's some example of things we can do and it's really arbitrary, custom. I think you had some of their customers, for instance, they wanted some. They wanted some 3D model that can do audio with a very good function cable. So something you wanted to put in the car in particular, they wanted this to be offline because in a car you don't necessarily have access to internet.So [00:22:00] yeah. 
So here we can actually build the solutions. There is no model out of the box for this. On the internet you have these very general, generalist, strong models. But for things like this, customers want specific solutions.

Another reason they sometimes come to us is that they experimented with a closed-source model and got a prototype. They're happy with what they built, it works well, they're happy with the performance. Then they want to go to production and they realize it's extremely expensive. You cannot push this. So they come to us and ask whether we can help them build the same thing using something much cheaper. And here we can sometimes deliver something 10x cheaper just by fine-tuning a model, and it will be better, on-prem on their own servers, and much cheaper as well. So yeah.

swyx: That's the pitch right there. Take all the money.

Vibhu: And outside of that, you do put out open-weight models, so people can do this themselves. I feel like not enough people go out of their way.

swyx: They're not going to. They're going to ask them to do it as the experts.

Guillaume: I think initially we didn't know [00:23:00] at the beginning of the company, because I think our strategy was not exactly the same as it is today. What we underestimated initially is the complexity of deploying these models and connecting them to everything, making sure they have access to the company knowledge. We were seeing customers struggling with this, and that was three years ago. Now things are much more complicated, because you don't just have text and SFT on simple instruction following. You have reasoning, you have agents, you have tools, you have multimodal, audio, so it's much more complicated than before. And even back then it was hard for customers.
So they really need some support, and this is why we also provide forward deployed engineers alongside all of this.

Fine Tuning and Personalization

swyx: I'm curious, is there also voice fine-tuning that people do?

Pavan: So in Forge we also have, say, a unified framework. It covers the Voxtral speech-to-text model that we released earlier this year, and even the Voxtral chat model that we released last year. I think there's a big, rich ecosystem [00:24:00] of people fine-tuning Whisper, and people want the same thing with Voxtral, since it's much stronger than Whisper. And the platform offers that kind of fine-tuning, which could be any kind of fine-tuning. For instance, sometimes people want to add support for new languages, tail languages, which we hope to cover natively eventually. But if there is a language where you have data and you want to fine-tune, I think this is a good use case. The other use case is the same language, even English, but in a very domain-specific setting.

swyx: Yeah. Terminology, jargon, medical stuff.

Pavan: Exactly. And there are also specific acoustic conditions, like when there's a lot of noise. The model will do decently in most conditions, but you can always make it better. Those are some of the use cases where you can improve it even further, and that's one good use case for this. As for text to speech, we're just releasing it, so we'll have support for that soon too. I think it's a similar use case.

Voice Personalization

Pavan: The kind of things you want to extend a [00:25:00] text-to-speech model to are a little different. It could be voice personalization or voice adaptation for enterprises. Many enterprises need a very specific kind of tone, a very specific kind of personality for the voice. All of those are good use cases for fine-tuning.

swyx: This is one I was going to ask you about. We never talked about cloning, voice cloning, here.
How important is it, right? Like, I can clone a famous person's voice. Okay. But...

Pavan: The main use case would be enterprise personalization. Enterprises need a lot of customization. You don't want the same voice for all enterprises. Each enterprise wants something customized and specialized that is representative of their brand and also of their, I guess, safety considerations and use case. The kind of thing you would deploy as an empathetic assistant in a healthcare context would be very different from what you would put in a customer support bot, and different again from more conversational uses. I think those are the [00:26:00] customizations you would expect from enterprises. And that's the main use case, at least from our side.

Vibhu: My basic example is that you don't want to call two customer services and get the exact same voice. It's going to be weird.

Long-Form Speech Models

Vibhu: But also on the technical side of this, there are a few things in Voxtral that I thought were pretty interesting. He's a big fan of this paper. He said it's a very good paper, the best ASR paper he's ever read. I've hyped up this voice paper enough. We covered it somewhere. But one big thing: Whisper is known for 30-second processing, and you extended this to 40 minutes. There was a lot of good detail in the paper about how this was done, even small details like how the padding works, that it's very much needed, that you need to have that padding in there, and the synthetic data generation around this. I'm wondering if you can share the same about the new text-to-speech model. How do you generate long-form, coherent speech? And any gems? Is there going to be a paper?

Pavan: Yeah, there will be a technical report.
I think it will have a lot of details.

Real-Time Encoder Advances

Pavan: But I think the [00:27:00] summary of it is that some of the considerations in that paper arose because we started with the Whisper encoder as the starting point. Now we have in-house encoders, like the real-time model, for instance, which we released in January. We also released a technical report for that real-time model, which has this dual-stream architecture. It's an interesting architecture. You should check it out. There we have a causal encoder, and I don't think there's any strong multilingual causal encoder out in the community, so we thought it was a good contribution. So that's one nice encoder that other people may want to adapt. And we trained it from scratch. I think our stack is now mature enough that we are able to train super strong encoders. Some of these considerations, like the padding, are a function of the Whisper encoder. Now that we train encoders in-house, the design considerations are different.

Scaling Context for TTS

Pavan: As for the question on text to speech, I think that also leans on the original autoregressive decoder backbone, with almost identical considerations. The context here is not even that long. [00:28:00] The model processes audio at 12.5 hertz, so one second maps to 12.5 tokens, and one minute is about 750 tokens. You can fit up to 10 minutes in an 8K context window and half an hour in a 32K context window. And 32K context is something we are very comfortable training on. We can extend it much further, to 128K, so you can naturally see how it extends even to hour-long generations. We need the data recipe and the whole algorithm to work coherently enough through such long contexts, but the techniques are in some ways very similar to long-context modeling for text.
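The context arithmetic in this answer can be checked with a short sketch. It assumes exactly one token per 12.5 Hz audio frame, reads "8K", "32K" and "128K" as 8,192, 32,768 and 131,072 tokens, and ignores any tokens spent on the text prompt; the function names are illustrative and not taken from any Mistral tooling.

```python
import math

FRAME_RATE_HZ = 12.5  # audio frames (tokens) per second, as stated in the interview


def audio_tokens(seconds: float, frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Number of audio tokens needed for a clip of the given duration."""
    return math.ceil(seconds * frame_rate_hz)


def max_audio_minutes(context_tokens: int, frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Longest clip, in minutes, that fits if the whole context holds audio tokens."""
    return context_tokens / frame_rate_hz / 60


if __name__ == "__main__":
    for label, minutes, window in [("8K", 10, 8_192), ("32K", 30, 32_768)]:
        needed = audio_tokens(minutes * 60)
        print(f"{minutes} min -> {needed} tokens, fits in {label}: {needed <= window}")
    print(f"128K holds about {max_audio_minutes(131_072):.0f} minutes of audio")
```

Under these assumptions one minute is 750 tokens, ten minutes is 7,500 tokens and thirty minutes is 22,500 tokens, which is consistent with the window sizes quoted in the conversation.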
And the key difference is that it's doing flow matching autoregressively instead of text token prediction.

swyx: Okay. I think that was most of the voice questions that we had.

What Makes a Model Small

Vibhu: I have a big question on Mistral, Mistral Small. So what is small? How do we define [00:29:00] small? I remember the days of Mistral 7B on my laptop. This is not fitting on my laptop. I could run it on the big laptop, but...

Guillaume: It's just a question of terminology. Here, "small" is based on the number of active parameters, but it's true that we could have given it another name. We could have called it medium. It's a mixture-of-experts model that we released, and it's a model that combines different models. Before, we had one general Mistral model doing instruction following, a separate model, Devstral, that was specific to code, and another model for reasoning, Magistral. These were separate artifacts built by different teams at Mistral, and what we're doing is basically merging all of this. You also had Pixtral, which was the first vision model and was a separate model. The way we do things internally is that we have one team focus on one capability and build one model. When it's mature enough, we decide to merge it into the [00:30:00] main model. Here it was the first time we basically merged all of this into one. But there are some other things as well, for instance more capabilities like function calling, which I think is going to be much, much better in this Mistral Small. But yeah, it's our latest model.

Vibhu: And the key thing is that it's very sparse: 6B active, pretty efficient to serve, 256K context.
Yeah.

Merging Capabilities vs Specialists

swyx: I think what's interesting is just this general theory of developing individual capabilities in different teams and then merging them. Where is this going to end up?

Vibhu: We've seen the five things put together in this. What are the next five teams?

swyx: I think OpenAI has actually gone away from the original 4o vision of the omni model. This was what they were selling: all modalities in, all modalities out. But I feel like you might do it.

Guillaume: I think there are some modalities where it's not competitive, for instance audio. If you just want to do transcription, I think it makes no sense to use a large general model. It will be very inefficient. You probably just want the [00:31:00] Voxtral 3B model. The performance is essentially the same, and it's going to be incredibly cheaper. So that's why we want to have a separate model that just does this. I think the question is: if you talk to your model by speech and you're asking very complex questions, how do you do this? Do you just cascade things, or do you want to put audio into a much larger model? It's not a settled discussion, I think. I'm not sure we're going in that direction, but it's possible, of course.

But for us, the next capabilities we want to try to integrate into these models are going to be, yes, better reasoning, and also capabilities that people don't talk much about but that are highly important for our customers in different industries, for instance things around legal, or computer-aided design. These are all things that models out of the box are not good at, because people don't prioritize them and there are not many benchmarks on them. But we know how to make this good, and you just have to do the work.
Extracting some data, processing it, [00:32:00] and so on. So yeah. [unclear]

swyx: I think for voice, the key thing over maybe the last year or so, with Veo and Grok Imagine and all these things, is joining voice with video, right? People don't understand spatial audio, because most TTS is just: I'm speaking into a microphone in perfect studio quality. But when you have video, the voice moves around.

Pavan: That's true. The consideration is a little different, in the sense that there it's a standalone artifact where you get the whole thing and you consume it. But in a conversational setting, you need extremely low latency.

swyx: Yeah.

Pavan: Streaming would be one of the primary considerations.

swyx: You can build a giant company just doing that, right? So you don't need to do the video. But on the theme of merging modalities, that is something where I am like, wow. Everyone up until, let's say, mid last year was just doing these pipelines: okay, we'll stitch a TTS model with a voice thing and a lip sync [00:33:00] thing and what have you. Nope. Just one giant model.

Open Source Mission

Vibhu: I have a two-part question. One is that open source seems to still be very core to what you guys do, and I just have to plug your paper from Jan 2024. This is the Mixtral of Experts one: very fundamental research on how to do good MoEs. A very good paper for anyone. That's just a side tangent.

swyx: This thing... 8x22B was like the nuclear bomb for open source.

Vibhu: I think it should be 8x7B.

swyx: Yeah. [crosstalk] I don't think it was January, right? It dropped during NeurIPS, and everyone at NeurIPS was... It was December, I think.
The model was, as well.

Vibhu: It's just a little update, probably.

swyx: Yeah. No, but you have a point to make.

Vibhu: No, you've got to check that. But I just want to hear more broadly about open source for you guys. And you had asked earlier [00:34:00] about what's next, what the other side teams are working on. You put out Leanstral.

swyx: It's not necessarily a surprise, but I was like, this doesn't fit my mental model of Mistral.

Guillaume: Yeah. First, on open source in general, I think it's really something that is in the DNA of the company. We have been open-sourcing since the beginning, and even before this. Before this, Tim and I were at Meta, where we released LLaMA, and what was really nice to see was this: before it, for most researchers, like those at universities, it was impossible to work on LLMs. There was no LLM available outside. And if you look at many of the techniques that were developed afterwards, all these post-training approaches, even DPO, direct preference optimization, all of these were done by people who had access to these models. It would have been impossible without this. So it really makes science move faster, and we really want to contribute to this ecosystem. I think DeepSeek also had a lot of impact. All these papers in the open-source community are really helping the science community as a whole move faster.

So [00:35:00] we want to contribute to this ecosystem. That's why we're releasing very detailed technical reports. For Magistral, our first reasoning model, we shared a lot of results, things that worked and things that did not work as well, which I think is helpful. For the audio model we also shared a lot of details, and the same for the real-time model. So we really want to continue this and basically belong to this community of people who share science.
I think we really don't want to be living in a world where the smartest models, the best models, are only behind closed doors, accessible only to a few companies that would have the power to decide who can use them. I think it's a scary future that we don't want to live in. We really want these models to be accessible to anyone who wants them, and intelligence to be usable and accessible by anyone. So that's why we are pushing this mission and open-source models. [unclear]

Lean and Formal Proofs

Guillaume: Leanstral, I think, is also one step in this direction. It's a bit different from what we usually release, but we have a small team internally [00:36:00] working on formal proving, formal math. It's a subject we care about in general, and we had worked on it before. I think we started too early: doing reasoning without LLMs is very hard, especially when you work with formal systems, because the amount of data you have is negligible. It's a very small community of people writing formal proofs.

But the reason why we like it is this. If you look at what people are doing with reasoning, the problems you can use are usually going to be problems where you can verify the output. For instance, all these AIME problems, where the solution is a number between zero and a thousand, so you can verify it by comparing with a reference. Or the answer is an expression, and you can compare the generated expression with the reference. But for most math problems, and most reasoning problems, there is no way to easily verify the solution. If the question is "show that f is continuous", you cannot compare with a reference, right? If it's "prove that this is true" or "prove this property", you cannot simply verify the correctness of your proof.
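The reference-comparison reward described here can be sketched in a few lines of Python. This is a minimal illustration, assuming the answer is the last integer that appears in the model output; the helper names are hypothetical and do not come from any training framework.

```python
import re
from typing import Optional


def extract_final_integer(text: str) -> Optional[int]:
    """Return the last integer mentioned in the text, or None if there is none."""
    matches = re.findall(r"-?\d+", text)
    return int(matches[-1]) if matches else None


def answer_reward(model_output: str, reference: int) -> float:
    """Binary reward: 1.0 if the final integer equals the reference, else 0.0."""
    predicted = extract_final_integer(model_output)
    return 1.0 if predicted == reference else 0.0


if __name__ == "__main__":
    print(answer_reward("Summing the cases gives a total of 204", 204))
    print(answer_reward("The function is continuous by the epsilon-delta argument", 204))
```

The second call shows the limitation raised in the conversation: a free-form proof contains no final value to compare against, so this kind of check has nothing to grade.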
So it's hard to apply, because there is no verifiable reward here. [00:37:00] What you could provide is, of course, an LLM judge that looks at your proof. But it's very hard, and you will surely get some reward hacking happening there, so it's difficult. You could provide a reference proof, but there are also many ways to prove the same thing. So if you give a negative reward because it's a different proof, maybe it was still a valid proof, just a different one. So it's not going to work well. What's nice with Lean and with formal proving is that you don't have to worry about any of this.

swyx: Anything that compiles in Lean is functionally the same.

Guillaume: Exactly. If it compiles, it's correct. It's very easy, and you can apply this.

swyx: It's just way too small, so no human will actually go and do it.

Guillaume: Yeah, exactly. Very few people can do it. It's a very small community of people doing a PhD on that, so it's super small. And it's sad, because it's actually very useful, not just in math but also in software verification. Software verification today is a tiny market. Very few industries work on this. It's usually going to be companies building airplanes, aeronautics, things [00:38:00] where they absolutely want to be sure, where lives depend on it. But it's very rare that people formally verify the correctness of their software. I think one of the reasons for this is simply that it's hard to do.

swyx: Are you thinking of TLA+? It's the language that some people use for software verification.

Guillaume: No. [unclear] But yeah, I think the reason why people don't use it more, and why this industry is not as big as it could be, is that it's very hard.
But now, with the coding agents that are out there, it's going to be very different. We're going to see much more of this. So I think this industry is going to be much larger in the future with these models. Here we're also anticipating this a little bit. We wanted to work on it because proving a math theorem and verifying software use essentially the same tools.

swyx: Yeah.

Reasoning Transfer and Agents

swyx: One of my theories is that because the proofs take so long, it's actually just a proxy for long-horizon reasoning, coherence and planning. Maybe a lot of people will say, okay, it's for people who like math, it's a niche math language, who cares? But actually, if you use this as part of your data mixture for [00:39:00] post-training and reasoning, it might spike everywhere else. And I think that's under-explored. No one has really put out a definitive paper on how this generalizes.

Guillaume: Yeah, absolutely. And that's what we're seeing already. For instance, if you do some reasoning training on math, the model also gets better at reasoning elsewhere, even in the early stages. So there is some transfer, some sort of emergence that happens. And it's interesting not just as a topic in general: there is also a lot of connection with agents. Sometimes the model sees a theorem it has to prove that is very complex, but it can take the initiative to say: I'm going to propose three lemmas, and I'm going to prove each lemma in parallel, three of them in parallel with sub-agents, and I'm also going to prove that the theorem follows from the three. So you can do this as well. Pretty interesting.
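The lemma decomposition described here can be illustrated with a toy Lean 4 example. The statements below are deliberately trivial and are chosen only for illustration; they are not taken from Leanstral or its training data. Each lemma is checked independently, the main theorem is then proved from the lemmas, and the file compiles only if every step is correct.

```lean
-- Toy decomposition: two independent lemmas, then a main theorem built from them.
theorem lemma_one (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

theorem lemma_two (a : Nat) : a + 0 = a :=
  Nat.add_zero a

-- The main goal follows by chaining the two lemmas.
theorem main_goal (a b : Nat) : (a + b) + 0 = b + a :=
  (lemma_two (a + b)).trans (lemma_one a b)
```

Because each lemma is checked on its own, a checker can report which of them went through even when the main theorem does not.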
Even if you fail to prove one of the lemmas, maybe you succeed in proving lemma two, so you get some partial reward here. So it's a bit less sparse than if you just get zero or one for the entire thing. [00:40:00] So it's pretty interesting.

Vibhu: Yeah, it's also an interesting case just for specialized models in general, right? The cost chart you show is pretty interesting: similar score-wise, and it's thirty, seventy, a hundred fifty, three hundred bucks. Smaller.

swyx: I think cost is a bit unfair, right? Because this one is at inference cost, and the others have their margins on top of it. But we don't know anything else, so we've got to figure it out.

Vibhu: Okay.

Next Frontiers in Training

Vibhu: I did want to push on that more. Not on cost, but you mentioned that it's a great way to have verifiable long-context reasoning. What are the other frontiers that, I'm sure, you guys are working on internally? There's a lot of people pushing back on pre-training scaling, pushing RL, pushing compute towards having more than half of your training budget on RL. Where are you guys seeing the frontier of research in that?

Guillaume: You mean the...

Vibhu: Just in foundation model training. One thing you guys do is fundamental research from the ground up, right? So you probably have a really good view of where you can [00:41:00] forecast this out.

Guillaume: Yeah. I think for us, we're still working a lot on the pre-training side. I think we are very far from saturation on pre-training. I think the ML4 pre-training will be a big step compared to everything we have done before, so we are pretty excited about this. On the other side, we now have to think more and more about algorithms that will actually support these very long trajectories. GRPO, for instance, doesn't really work once you are even a bit off-policy.
Which was okay initially, because you were solving math problems that can be solved in a few thousand tokens. The model can generate them pretty quickly, so when you do your update, the model is never too far off. But now, when you move towards problems where it takes hours, like six hours, to get a reward, your model is completely off-policy. So you have to build new infrastructure that supports this, but also new algorithms. So with everything we're doing internally now, we're trying to build infra that anticipates what we will have in six months or a year from now, which is these extremely long scenarios.

When we started Mistral, part of what [00:42:00] we wanted was a very nice environment where people can do research with a lot of resources. So it was nice. I think things changed a lot when ChatGPT came out. After that, it was never the same again. But yeah, it was nice, and I think we also want to recreate part of this.

swyx: We're coming to the end.

Hiring and Team Footprint

swyx: Obviously, I think you guys are doing incredible work. You have a very impressive vision for open source and for voice. What are you hiring for? Who are you looking for to join the company?

Guillaume: Yeah, so we are hiring a lot of people in our science team. We're hiring in all our offices. Our HQ is in Paris, in France. We have a small team in London. We have a team in Palo Alto as well. We opened some offices in [unclear], in Poland, and one in Zurich. We also have some presence in New York, and soon one in San Francisco. We're also hiring remotely. So we're growing the team, trying to hire very strong people. It's still a fairly small team, [00:43:00] but I think we want to keep it that way, because we find it quite efficient.
A small team is agile, so yeah.

swyx: Okay.

AI for Science Partnerships

swyx: Let's focus on science and forward deployed. We are actually strong believers in science. We started our new science pod that focuses specifically on AI for science. What areas do you think are the most promising?

Guillaume: What we're pretty excited about right now, and something we have already started doing and will probably be able to share more about in a couple of months, is that we are exploring AI for science. There are a lot of areas where we think you could get extremely promising results if you were to apply AI. There is a lot of low-hanging fruit. You just have to find the domains where AI has not yet been applied, and that is usually hard to do, because the people working in those domains don't necessarily know the capabilities of these models. You have to pair them with AI researchers, which is actually hard to do. But this matching, we're doing it naturally with our customers. We have some companies we work very closely with. For instance, ASML is one of our partners, so we're doing some research with them, and they have tons of extremely interesting problems, problems in physics, in [00:44:00] materials science, that they're essentially the only ones to work on, because they're doing something no one else is doing. So there are many domains where AI can actually revolutionize things. You just have to think about it and be familiar with what AI can do and how to apply it. So it's something we're doing more and more with our partners, with our customers: AI for science.

swyx: Yeah. Okay.

Forward Deployed Skills

swyx: And then for forward deployed: what makes a good forward deployed engineer? What do they need?
Where do people fail?

Guillaume: I think you usually need people who are very familiar with the tech, not necessarily with a lot of research expertise, but who are actually pretty good at using these models, who know how to do fine-tuning, who know how to start an RL pipeline. And it's not easy. The vast majority of companies will not be able to do this on their own. So here I think we need people who like to solve problems, who are excited about solving complex, very concrete problems. It's applied science, basically. I think it's not too different from the skills you need in research, because essentially you are trying to find solutions to problems that [00:45:00] customers have not yet solved. Sometimes it's easy. Sometimes you have to do the work: you have to create synthetic data, find some edge cases. It depends on the problem. You also need a bit of patience, and to be creative. I think the skills are very similar to research.

Pavan: The diversity of the work they do always surprises me. It goes all the way across the kind of stuff they encounter in industries. It's just very interesting, I think.

swyx: Any fun success anecdotes?

Guillaume: Yeah, it can be training a small model for the edge that does just one specific thing. It can be training some very large model with some specific languages as well. Or making models really good at some tool use, for instance computer-aided design, this kind of thing. Is that pairing with vision as well? Yeah.

Pavan: And defect detection for chips, or identifying things in factories. The diversity is huge: it could be anything where you can deploy these foundation models. So it's the work to make it work in that specific setting, basically whatever it takes to make it add value in that workflow.

Vibhu: Yeah.
[00:46:00] And it goes across the stack, right? Even just pulling up the website.

swyx: It's so broad. On compute, it is so broad.

Vibhu: We didn't even touch on the fact that you have a coding CLI tool. One thing you guys had, I think the first tool, was agents, Mistral agents. You had the agent builder, you can serve it via API and all that. And I'm guessing forward deployed people...

Guillaume: Yeah.

Vibhu: ...help build that out and stuff.

Customer Feedback Loop

Guillaume: It's also why we're doing so many things. I think that's part of the value proposition. Customers are always extremely careful about their data, and they don't like trusting so many partners: trusting one partner for code, giving the data to another third party for audio, and to yet another one. What they really like about our approach is that we can help them with anything, so they don't have to send their data to so many clouds.

swyx: I think there can be many orders of magnitude more FDEs than research scientists. They don't need your full experience, but they're still super valuable to customers.

Guillaume: In practice, these two teams [00:47:00] are still quite intertwined. First of all, they're using the same tools, the same data pipelines and everything. And it's very helpful for the science team to get the feedback from the solutions team, because they can say: look, these customers are trying to do this, and it's not working. Then it can really be addressed in the next version. Yeah. But this is basically a real-world eval. Yeah, it's a real-world eval. If you're just a lab that ships models but doesn't do this work for customers, you have no idea whether your model is good at this edge case. For instance, you even found this, right?
So yeah, there is a very big gap between the public benchmarks, which are very academic, and...

Pavan: The real cases are just very diverse, and in the specific context of a customer you can first evaluate, create a solid eval benchmark, then fine-tune and measure in the context of their kind of audio. For instance, one use case is literally just this: there's a word shown to kids and they have to say it out loud. It's a very specific thing. You're just saying one word, and then you have to grade whether the kid said it right or not. [00:48:00] So there are very diverse use cases, and the idea is that the applied scientist or engineer will go and make it better. Then we incorporate the learnings into the base model itself, so it's just better out of the box.

Vibhu: Yeah. It's a good full-circle system. The foundation model evals are all just proxies for what you really want. You're never going to have one that covers this. It doesn't make sense for there to be a one-word transcription eval like that. It's not something you want to fit on. Perfect.

Wrap Up and Thanks

swyx: Everyone should go check out everything that Mistral has to offer and try the TTS model, which we'll link in the show notes. But thank you so much for coming.

Thanks.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.latent.space/subscribe

The Next Wave - AI Tool Better Than OpenClaw? + NVIDIA'S $1T Prediction & AI Image Wars

Hustle And Flowchart - Tactical Marketing Podcast

Play Episode Listen Later Mar 26, 2026 82:00

This episode is from The Next Wave Podcast. Check out more episodes here: https://www.thenextwave.show
In this engaging and forward-thinking episode, Joe Fier and Matt Wolfe dive deep into the current and future landscape of AI tools, the staggering impact of NVIDIA on the tech world, and the fierce competition in AI image generation. The conversation covers exclusive insights from NVIDIA's GTC conference, the evolution of agentic AI, industry disruptions, and recent advancements in robotics and job automation. Whether you're an entrepreneur, tech enthusiast, or curious about where AI is leading us, this episode delivers valuable perspectives and hands-on tests of cutting-edge tools.
Links and Mentions:
The Next Wave Podcast: https://www.thenextwave.show
Matt Wolfe: https://www.youtube.com/@mreflow
NVIDIA: https://www.nvidia.com/en-us/
Jensen Huang: https://www.linkedin.com/in/jenhsunhuang/
OpenClaw: https://openclaw.ai/
NemoClaw: https://www.nvidia.com/en-us/ai/nemoclaw/
Future Tools: https://futuretools.io/

ai predictions wars nvidia next wave jensen huang ai tool gtc matt wolfe joe fier

GTC 2026: AI's Next Battleground Is Not Models but "Inference Systems" | S10E04

What's Next｜科技早知道

Play Episode Listen Later Mar 25, 2026 58:37

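In March 2026, NVIDIA's annual developer conference, GTC, opened in San Jose in the United States. The atmosphere at this year's GTC was noticeably different from previous years: Jensen Huang no longer needs to prove the value of AI to the market, because the agent boom and the rise of open-source models have made demand for compute an industry consensus, and token consumption is growing a hundredfold. In this episode, Diane brings first-hand observations from the GTC floor and interviews a co-founder of the inference-optimization startup Eigen AI. Eigen AI was founded in mid-2024 by three founders with MIT backgrounds and focuses on inference acceleration and customized enterprise deployment for open-source large models. At this GTC, their inference-speed benchmark results appeared on the big screen during Jensen Huang's keynote, making them one of the fastest inference teams today. In the episode we dig into why the inference layer is becoming the AI industry's most important competitive battleground, what roles GPUs and LPUs each play during inference, the strategic logic behind NVIDIA's roughly $20 billion acquisition of Groq, and why the business models of current AI applications face systemic challenges.
Guests: 丁教 Diane, co-founder of 「声动活泼」 and host of 「科技早知道」; Di Jin, Co-founder at Eigen AI
Main topics:
[00:11] What is the biggest difference at this year's GTC? Jensen Huang is visibly more relaxed and no longer needs to "sell" the value of AI to the market; the agent boom is driving exponential growth in token consumption, and demand for compute has become an industry consensus; the rise of open-source models has opened up commercial space at the inference layer, which is starting to matter.
[09:13] What kind of company is Eigen, and what does it do? Three founders with MIT backgrounds, specializing in model compression and inference acceleration; post-training helps enterprises customize models, while inference acceleration makes models run faster and cheaper; a technical breakthrough completed two days before GTC opened put their inference speed on the big screen of Jensen Huang's keynote.
[13:24] What was the biggest structural change in the AI industry over the past year? The model-training layer is highly concentrated: GPU costs are 10 to 100 times talent costs, and small and mid-sized companies are essentially out of the game; reasoning (inference-time scaling) has become a new path to better performance, letting a fixed model produce better results by spending more compute; agent workflows consume far more tokens than chat scenarios, so the value of optimizing the inference layer has surged.
[23:34] Why is NVIDIA spending about $20 billion to acquire Groq? What are GPUs and LPUs each good at? An AI generates an answer in two phases: understanding the question (prefill) suits parallel processing on GPUs, while generating the answer token by token (decoding) suits serial speed-up on LPUs; the fastest models today run at about 1,000 tokens per second, agent scenarios may need 10,000 per second in the future, and GPUs alone will struggle to cross that threshold; having GPUs handle the first stage and LPUs take over the second is currently the best solution for long-sequence inference.
[34:04] What are the technical paths for inference optimization, and how many layers are there? The bottom layer is CUDA kernel optimization, finely tuned to the matrix-computation characteristics of each model; the middle layer includes quantization (lowering numeric precision), pruning (removing redundant expert modules), and speculative decoding (a small model predicts, a large model verifies); the top layer is scheduling and routing, whose core is sending each request to the GPU that holds the corresponding KV cache, to avoid recomputation.
[44:05] How does inference optimization trade off speed, accuracy, and cost? Three classes of solutions (no accuracy loss at all, a small accuracy loss, and accuracy that must be recovered through post-training) map to different customer needs; chat scenarios care most about TTFT (the response time of the first token), while agent scenarios care more about total task completion time; voice interaction has a ceiling: however fast the model is, it cannot exceed the speed at which people can understand speech, so beyond a certain point extra speed is meaningless.
[47:28] Why are systemic problems emerging in AI application business models? SaaS subscriptions are a historical legacy: software used to have near-zero marginal cost, but now every use of AI burns real money; heavy users easily exhaust their monthly plans, forcing companies to rate-limit, which triggers strong user backlash; a more reasonable direction is charging per completed task, but users' psychological anchor has not yet shifted, and the industry is still in a period of turbulence.
[53:52] Can open-source models catch up with closed-source models? Where is the biggest future opportunity at the inference layer? The industry's biggest points of non-consensus: whether open-source models can truly catch up with closed-source ones, and whether AGI has already arrived; the inference layer can serve almost only open-source models, so the inflection point in open-source capability directly determines this sector's ceiling; once open-source models reach that inflection point, tokens will permeate every industry like electricity, and the inference-layer market will open up completely; Jensen Huang's "five-layer AI cake" model.
Glossary:
LPU (Language Processing Unit): A dedicated chip developed by Groq, optimized for the text-generation (decoding) stage of large language models. By integrating high-bandwidth memory directly on the chip, it greatly speeds up token-by-token generation, at the expense of generality.
TPU (Tensor Processing Unit): A chip Google custom-built for its own AI needs. It is powerful and relatively inexpensive, but is currently supplied only to a few large customers such as OpenAI and Anthropic, and lacks an open developer ecosystem.
Quantization: A technique that lowers the numeric precision inside a model to save storage and computation. It is like rewriting a number accurate to 10 decimal places as one accurate to 2 places: computation drops sharply, while the effect on the final output is limited. From highest to lowest precision the formats are FP32, BF16, INT8, and INT4; the lower the precision, the higher the efficiency, but also the greater the risk of accuracy loss.
Pruning: A technique that identifies and removes redundant parameters or modules in a model. Taking the MoE architecture as an example, a model contains many "pseudo-experts" that were never effectively trained; removing them leaves accuracy almost unaffected while significantly improving inference speed and efficiency.
Speculative Decoding: An acceleration technique in which a small model first quickly "drafts" several tokens, and the large model then verifies them in a batch and decides whether to accept them. When the probability that the draft is accepted is high enough, overall inference speed can improve by more than 50%.
KV Cache (key-value cache): While generating an answer, the AI caches its "understanding" of the preceding text so that it does not have to re-read and recompute the entire history every time. Scheduling the KV cache well is one of the key techniques for lowering latency and cost in agent scenarios.
MoE (Mixture of Experts): The model is made up of multiple "expert" sub-modules, and each inference activates only the few that best match the current task. Mainstream open-source models such as DeepSeek and Qwen use this architecture, which keeps a large parameter count while significantly reducing the actual compute cost.
SLA (Service Level Agreement): A quantified commitment on service quality, for example "time to first token of no more than 300 milliseconds" or "at least a given number of tokens output per second". Most technical decisions at the inference layer revolve around meeting customer SLA requirements under cost constraints.
TTFT (Time to First Token): The interval between the user sending a request and receiving the first output character. This metric matters most in chat products, because it directly shapes the user's subjective sense of how responsive the system is.
「Knock Knock 世界」: Last week 「Knock Knock 世界」 released a new "Digital Collections" topic: why can a video clip or an emoji also become part of a museum's collection?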
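The quantization entry in the glossary above lends itself to a small worked example. What follows is a minimal Python sketch of symmetric per-tensor INT8 quantization, which is one common scheme and not necessarily what Eigen AI or any guest on the episode uses. The function names `quantize_int8` and `dequantize` and the sample weights are illustrative assumptions, not taken from the episode.

```python
def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization (illustrative sketch).

    Maps floats onto integer codes in [-127, 127] using one shared scale.
    Returns (codes, scale).
    """
    max_abs = max(abs(v) for v in values)
    # One quantization step; fall back to 1.0 if every value is zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical "high precision" weights, written to 10 decimal places.
weights = [0.8123456789, -0.0312345678, 0.4501234567, -1.2700000000]

codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Largest round-trip error introduced by storing 8-bit codes.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("codes:", codes)
print("scale:", scale)
print("max round-trip error:", max_error)
```

In this sketch each weight is stored as one small integer plus a single shared scale, and the round-trip error stays within half a quantization step (`scale / 2`). That is the sense in which lower precision saves storage and compute while changing the values only slightly; real deployments typically quantize per channel or per block and calibrate on data.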

token gtc

AI Tool Better Than OpenClaw? + NVIDIA'S $1T Prediction & AI Image Wars

The Next Wave - Your Chief A.I. Officer

Play Episode Listen Later Mar 25, 2026 43:56

Get Matt's AI playbook: https://clickhubspot.com/kfcr
Episode 102: Is NVIDIA really the “sun” at the center of the AI universe? Host Matt Wolfe (https://x.com/mreflow) and Joe Fier (https://www.youtube.com/@joefier) break down everything you need to know from NVIDIA's recent GTC conference, the hottest new AI tools for business and marketing, and the changing landscape of AI data, agents, and robotics. This episode dives deep into the explosive potential of NVIDIA's AI roadmap, why Jensen Huang thinks chip sales will hit $1 trillion, and how accessible agent tools like OpenClaw and NemoClaw could change everything for everyday users and enterprises. Plus, Matt Wolfe and Joe Fier explore the rise of data-for-hire side hustles like DoorDash Tasks, where humans help train AI in the real world, and the jaw-dropping athletic skills of the newest generation of robotics. Whether you're wondering where the money and innovation are flowing next, or concerned about the privacy, data, and future job market in the age of AI, it's all here in this packed, must-hear “special” episode.
Check out The Next Wave YouTube Channel if you want to see Matt and Nathan on screen: https://lnk.to/thenextwavepd
Show Notes:
(00:00) NVIDIA's Future and Growth
(06:31) OpenClaw: AI Accessible to All
(08:56) AI Compute Shifting to Inference
(12:39) Accelerating AI Thinking Time
(15:20) Nemo Claw: AI Assistant Revolution
(19:00) Stock Buybacks Boost Value
(24:11) NVIDIA's Expansive Industry Influence
(28:34) Future Economy: UBI or Data Payment?
(29:38) Data Privacy and AI Advertising
(32:45) CAPTCHA Origins and Duolingo
(38:18) Robots: Cool, But Not Smart
(40:18) Farewell and Thank You
Mentions:
Joe Fier: https://www.youtube.com/@joefier
NVIDIA: https://www.nvidia.com/en-us/
Jensen Huang: https://www.linkedin.com/in/jenhsunhuang/
OpenClaw: https://openclaw.ai/
NemoClaw: https://www.nvidia.com/en-us/ai/nemoclaw/
Future Tools: https://futuretools.io/
Get the guide to build your own Custom GPT: https://clickhubspot.com/tnw
Check Out Matt's Stuff:
• Future Tools - https://futuretools.beehiiv.com/
• Blog - https://www.mattwolfe.com/
• YouTube - https://www.youtube.com/@mreflow
Check Out Nathan's Stuff:
• Newsletter - https://news.lore.com/
• Blog - https://lore.com/
The Next Wave is a HubSpot Original Podcast // Brought to you by Hubspot Media // Production by Darren Clarke // Editing by Ezra Bakker Trupiano

ai growth future predictions blog wars farewell nvidia duolingo data privacy next wave jensen huang inference ai tool gtc matt wolfe get matt joe fier

Podcast #860 - DLSS 5 Reaction, Arrow Lake Refresh, New Ryzen Rumor, MSI's DDR4 Mobo + MORE!

PC Perspective Podcast

Play Episode Listen Later Mar 23, 2026 76:10

Back in the podcast booth this week! Enjoy the DLSS takes, noticing that SSD vendors are having a great year, there are some actual good news on Copilot, and fake ram kits are a thing, AMD has CPU refresh rumors, and a Windows "degraded" security state is coming soon!
0:00 Intro
00:37 Patreon
01:41 Food with Josh
03:01 DLSS 5
17:18 Intel Arrow Lake refresh
20:36 AMD Ryzen refresh rumor
21:31 NVIDIA dGPU market share dominance
23:52 GTC in spaaaaace
29:03 SSD vendors had an amazing quarter for some reason
32:02 Some good news for Copilot 365 victims
36:09 Antec and Noctua collaborate on a case
37:46 Impress your friends with TWO sticks of DDR5
39:06 MSI plans more DDR4 motherboards
40:33 Podcast sponsor - Zapier
42:12 (In)Security Corner
55:42 Gaming Quick Hits
1:03:32 Picks of the Week
1:13:03 Outro
★ Support this podcast on Patreon ★

food rumors lake windows arrow refresh impress amd copilot cpu ssd msi dlss ryzen gtc amd ryzen mobo ddr4 noctua antec

In the AI Era, What We Should Do Most Is Distill and Summarize

FView Friday

Play Episode Listen Later Mar 21, 2026 166:24

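Guests this episode: 彭林, 十天, 森森, 蓝白
Main topics in this episode:
· 00:01:03 -- Apple releases AirPods Max 2 without warning
· 00:09:30 -- After a four-year wait, AFib History officially goes live on mainland-China Apple Watch models
· 00:14:22 -- Apple's games showcase
· 00:37:11 -- Starting at 219,900 yuan, the new-generation Xiaomi SU7 is officially launched
· 00:53:41 -- Starting at 9,999 yuan, the OPPO Find N6 is officially launched
· 01:08:35 -- The Nothing Phone (4a) series goes on sale in China
· 01:21:39 -- Xiaomi releases three models late at night
· 01:37:03 -- Hands-on with the Xiaomi Notebook Pro 14
· 01:46:03 -- Sony releases its first LOFIC sensor, the IMX908
· 01:52:09 -- NVIDIA GTC keynote
· 02:10:53 -- The "3·15 Gala" exposes poisoning of large AI models
· 02:15:25 -- A Nintendo Switch 2 update adds an "official overclocking" mode
· 02:17:25 -- Dreame (追觅) 芯际穿越's 「瑶台」 compute base station is launched successfully
· 02:17:55 -- ModelBest (面壁智能) releases the "Lobster Box" EdgeClaw Box: giving the "lobster" a security brain
· 02:20:00 -- Microsoft PC Manager adds a "one-click uninstall lobster" feature
· 02:22:22 -- Rakuten's "Japan's strongest AI" falls apart: the code turns out to be all DeepSeek, with the open-source license deleted
· 02:24:53 -- Chat segment (the episode title comes at the end)
Every Friday at 8 p.m. in the 爱否 livestream room, we have a good time chatting together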

ai switch apple watches airpods max gtc nothing phone

TNW 429: DarkSword Puts Hundreds of Millions at Risk - The Internet's Reaction to NVIDIA's DLSS 5

Tech News Weekly (MP3)

Play Episode Listen Later Mar 19, 2026 68:24

Jennifer Pattison Tuohy of The Verge joins Mikah Sargent this week on Tech News Weekly! IKEA's smart home products are not quite there yet. NVIDIA unveiled DLSS 5 at GTC 2026 and faced backlash from within the gaming community. A powerful iPhone-hacking technique has been discovered to take over devices running iOS 18. And some updates on the probes into Tesla's and Ford's hands-free driving systems. Jennifer shares her experience using IKEA's newest smart home products utilizing the Matter-over-Thread protocol, expressing her concerns and frustrations with trying to connect them to any smart home platform. Mikah talks about Nvidia's DLSS 5 that was unveiled at GTC 2026, and the criticism & memes it has faced since its unveiling. Andy Greenberg of WIRED joins the show to talk about a tool that has been discovered, called DarkSword, that can hack into millions of iPhones running iOS 18. And Mikah chats about hands-free driving technology in vehicles and NHTSA's probe into Tesla and Ford's hands-free driving systems. Hosts: Mikah Sargent and Jennifer Pattison Tuohy Guest: Andy Greenberg Download or subscribe to Tech News Weekly at https://twit.tv/shows/tech-news-weekly. Join Club TWiT for Ad-Free Podcasts! Support what you love and get ad-free audio and video feeds, a members-only Discord, and exclusive content. Join today: https://twit.tv/clubtwit Sponsors: outsystems.com/twit hipebl.ai

ai internet risk iphone tesla discord millions ios ikea hundreds wired puts nvidia verge thread twit dlss nhtsa gtc mikah andy greenberg mikah sargent tech news weekly tech news today jennifer pattison tuohy

NVIDIA CEO Says Leaders Lack Imagination, Cognizant's $4.5T Warning, & The Case Against the AI Apocalypse

The Future of Work With Jacob Morgan

Play Episode Listen Later Mar 19, 2026 44:09

March 19, 2026: Jensen Huang had one of the biggest weeks in tech at Nvidia's GTC — but his sharpest line wasn't about chips. When asked why companies are laying off workers, he said simply: because they're out of imagination. We unpack what that means, plus his surprise take on compensation from the All-In podcast. Then Cognizant drops a bombshell update to its 2023 workforce study: 93% of jobs impacted by AI, $4.5 trillion in labor shifting to machines, six years ahead of schedule. Their own words: "We underestimated the technology." But two CEOs are pushing back on the doom narrative — Uber co-founder Travis Kalanick says humans will be "super fine" until AGI arrives, and Tech Mahindra CEO Mohit Joshi argues the demand for human labor isn't going anywhere, and has the data to back it up. We close with JPMorgan Chase's 2026 tech trends report and the concept quietly reshaping what leaders actually do: context engineering. Watch the full episode on YouTube ---------- Start your day with the world's top leaders by joining thousands of others at Great Leadership on Substack. Just enter your email: ⁠⁠https://greatleadership.substack.com/ Stop patching problems and start designing an intentional workplace. The 8 Laws of Employee Experience gives you the how. Order your copy: 8EXlaws.com

ai leaders uber lack ceos laws apocalypse substack imagination nvidia all in jp morgan chase agi employee experience cognizant jensen huang great leadership gtc travis kalanick

TNW 429: DarkSword Puts Hundreds of Millions at Risk - The Internet's Reaction to NVIDIA's DLSS 5

Tech News Weekly (Video HI)

Play Episode Listen Later Mar 19, 2026 68:24

Tech News Weekly 429: DarkSword Puts Hundreds of Millions at Risk

All TWiT.tv Shows (MP3)

Play Episode Listen Later Mar 19, 2026 68:24 Transcription Available

ai risk iphone tesla discord millions ios ikea hundreds wired puts nvidia verge thread twit dlss nhtsa gtc mikah andy greenberg mikah sargent tech news weekly tech news today jennifer pattison tuohy

EP287. Nvidia GTC Announcement Highlights, OpenAI Cuts Its Side Projects, the Agent Shell Wars | M觀點

M觀點 | 科技X商業X投資

Play Episode Listen Later Mar 19, 2026 86:29

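The "AI 百用百科 | Productivity Series" is designed for working professionals: what you learn can be applied directly at work. Becoming a true AI expert means not only knowing how to use AI yourself, but also building a "sustainably running" work system for your team.
Course features:
✓ Scenario-driven | Maps directly to work situations and teaches you to solve real work tasks with AI
✓ Step-by-step instruction | Full demonstrations of the workflow, walking you from the approach to the actual result
✓ Ready to use after class | Complete automation workflow templates you can apply as soon as you are back at work
The Productivity Series has 8 lessons that start from work tasks to build an all-round application guide. It covers 8 common workplace scenarios, so you can save time on repetitive work and get back to what really matters: meeting-notes consolidation | list filtering and sending | large-scale data organization | professional presentation creation | marketing weekly-report generation | email intent triage | automated shift scheduling | contract risk identification
⏰ The pre-order discount ends on 4/1, so now is the best time to buy!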

hosting openai gtc soundon m podcast agent m

What the hot PPI means for the market

Wall Street Unplugged - What's Really Moving These Markets

Play Episode Listen Later Mar 18, 2026 61:44

The PPI came in hot: What it means for interest rates… Why the market NEEDS a rate cut… Key takeaways from Nvidia's (NVDA) GTC conference… Is NVDA a buy at current levels? … And the latest massive tech layoffs. In this episode: Happy late St. Patrick's Day! (And my Irish car bomb story) [0:15] The PPI came in hot: What it means for interest rates [5:30] Why the market NEEDS a rate cut [7:57] Key takeaways from Nvidia's GTC conference [22:02] Is NVDA a buy at current levels? [24:47] Is AI to blame for the massive tech layoffs? [38:14] My pick for the March Madness champion [48:56] Did you like this episode? Get more Wall Street Unplugged FREE each week in your inbox. Sign up here: https://curzio.me/syn_wsu Find Wall Street Unplugged podcast… --Curzio Research App: https://curzio.me/syn_app --iTunes: https://curzio.me/syn_wsu_i --Stitcher: https://curzio.me/syn_wsu_s --Website: https://curzio.me/syn_wsu_cat Follow Frank… X: https://curzio.me/syn_twt Facebook: https://curzio.me/syn_fb LinkedIn: https://curzio.me/syn_li

ai market irish march madness stitcher nvidia ppi gtc

MM #301: The Next Market Crash? Nvidia's Future, Oil To $100? & Best Assets To Buy Now

Market Mondays

Play Episode Listen Later Mar 17, 2026 107:59

Time Stamps ⏰
00:07 - Investing Fact Of The Week
19:00 - Nvidia's GTC Outlook
28:00 - Time To Sell Nvidia?
36:00 - Is The Crash Coming?
46:00 - Top Assets For The Next 12 Months?
52:00 - Commodities
54:00 - Robinhood Stock Outlook
59:00 - Invest Fest Sneak Peek
1:05:00 - 4 Stocks You Hate
1:16:00 - Target Boycott Fiasco
1:30:00 - Will Oil Go Above $100?
1:38:00 - FICO
1:40:00 - 2026 Market Cycle
In this episode of Market Mondays, we break down the biggest stories shaping the market right now. From Nvidia's GTC conference and the future of AI, to the possibility of a market crash, oil potentially hitting $100, and the assets investors should be watching over the next 12 months.
We also discuss the outlook for Robinhood, the growing conversation around commodities, and four stocks we are not fans of right now. Plus, we address the Target boycott controversy, take a look at the upcoming 2026 market cycle, and give a sneak peek at what's coming for Invest Fest.
If you want to stay ahead of the market and understand where the opportunities are in the current economic environment, this episode is packed with insights for investors and entrepreneurs alike.
#MarketMondays #StockMarket #Investing #Nvidia #AIStocks #StockMarketNews #FinancialEducation #InvestingTips #Robinhood #Commodities #OilPrices #InvestFest #EarnYourLeisure #WealthBuilding
Support this podcast at: https://redcircle.com/marketmondays/donations
Advertising Inquiries: https://redcircle.com/brands
Privacy & Opt-Out: https://redcircle.com/privacy

ai target robin hood assets nvidia market crash gtc investfest

SOTS 2nd Hour: Cramer Interviews Nvidia CEO, Apollo's Chief Economist, & Iran's Fed Impact 3/17/26

Squawk on the Street

Play Episode Listen Later Mar 17, 2026 43:48

Carl Quintanilla, Sara Eisen, and David Faber kicked off the show with fresh data, tensions within the Fed's dual mandate ahead of tomorrow's rate decision, and more on Iran's market impact before a wide-ranging interview you don't want to miss - Nvidia CEO Jensen Huang, alongside Jim Cramer at the company's GTC conference out in California. Hear the man himself break down Nvidia's staggering $1T forecast, AI demand, and more. Elsewhere in the hour: what to expect out of the Fed tomorrow - according to CNBC's exclusive Fed Survey results, and Apollo Global's Chief Economist Torsten Slok (who says there'll be no cuts tomorrow - or this year). Squawk on the Street Disclaimer Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

california ai iran apollo fed cnbc nvidia chief economists simplecast cramer jim cramer 1t gtc squawk sots david faber carl quintanilla

NVIDIA's Jensen Huang Wants It All (GTC 2026)

AI For Humans

Play Episode Listen Later Mar 17, 2026 22:23

Jensen Huang just stood on stage and said $1 trillion. He wasn't joking. NVIDIA's GTC 2026 keynote was a masterclass in flexing, and we're breaking down every layer of the cake. We walk through Jensen Huang's massive GTC 2026 keynote, from NVIDIA's $1 trillion business projection to the inference inflection point that's reshaping the entire AI industry. We dig into DLSS 5 and why AI-powered neural rendering is about to change gaming forever (sorry, gamers), NVIDIA's deep integration with OpenClaw and the launch of NemoClaw for enterprise agents, chips in space, and what it all means when every company becomes an agentic-as-a-service company. Plus the Dwarkesh podcast with Dylan Patel on the real bottlenecks in compute that nobody's talking about. JENSEN HUANG SAID ONE TRILLION DOLLARS AND DIDN'T BLINK. WE BLINKED. PS, we're now coming to you TWICE a week (both a little shorter). Come to our Discord: https://discord.gg/muD2TYgC8f Join our Patreon: https://www.patreon.com/AIForHumansShow AI For Humans Newsletter: https://aiforhumans.beehiiv.com/ Follow us for more on X @AIForHumansShow Join our TikTok @aiforhumansshow To book us for speaking, please visit our website: https://www.aiforhumans.show/ // Show Links // NVIDIA GTC 2026 Full Keynote with Jensen Huang https://www.youtube.com/live/jw_o0xr8MWU?si=VZAIG3E7vuUCwz6N DLSS 5: Breakthrough in Visual Fidelity for Games https://www.nvidia.com/en-us/geforce/news/dlss5-breakthrough-in-visual-fidelityfor-games/ DLSS 5 Official Trailer https://youtu.be/dJACkKbN-Eo?si=fIJvsV52---bOyTr Digital Foundry Deep Dive on DLSS 5 https://youtu.be/4ZlwTtgbgVA?si=g8TMgNlOWknKnqHo Good Til' Cancelled: The GTC Game https://x.com/SAlexashenko/status/2033585849586331985?s=20 Dwarkesh Podcast: Dylan Patel on Compute Bottlenecks and Chips https://youtu.be/mDG_Hx3BSUE?si=YnLEIVhsaCpdVQgi

tiktok ai games discord ps breakthrough chips nvidia official trailer jensen huang dlss gtc

Nvidia's GTC Conference Underway… And When To Buy The Dip 3/16/26

CNBC's "Fast Money"

Play Episode Listen Later Mar 16, 2026 43:26

All eyes on Nvidia, as the chip giant kicks off its GTC conference. The next-gen GPUs investors are waiting to hear about, and what it means for the stock's next move. Plus, stocks rallying to start the week, as oil pulls back on the latest developments out of Iran. What a top market strategist needs to see before buying the dip, and the areas of the market where he'd put new money to work. Fast Money Disclaimer Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

conference iran nvidia underway gpu simplecast gtc

A Guy Used AI to Cure His Dog's Cancer*

The AI Breakdown: Daily Artificial Intelligence News and Discussions

Play Episode Listen Later Mar 16, 2026 28:27

The AI discourse is absolutely frenetic right now: everything from Karpathy's misinterpreted jobs visualization to a viral dog cancer cure story that's both less and more than it seems. NLW's argument: we're in AI's Second Moment, the agentic equivalent of the original ChatGPT shock, but with bigger capabilities, billions more people in the conversation, higher economic stakes, and an industry that's had three years to get worse at explaining itself. In the headlines: a preview of NVIDIA's GTC, SEC filings quietly listing AI agents as a material risk, and ByteDance shelving its video model over copyright disputes.
Learn more about AGENT MADNESS: Our 64-Bracket tournament to find the coolest Agent of 2026 https://www.agentmadness.ai/
Brought to you by:
KPMG – Agentic AI is powering a potential $3 trillion productivity shift, and KPMG's new paper, Agentic AI Untangled, gives leaders a clear framework to decide whether to build, buy, or borrow; download it at www.kpmg.us/Navigate
Mercury - Modern banking for business and now personal accounts. Learn more at https://mercury.com/personal-banking
AIUC-1 - Get your agents certified to communicate trust to enterprise buyers - https://www.aiuc-1.com/
Blitzy - Want to accelerate enterprise software development velocity by 5x? https://blitzy.com/
AssemblyAI - The best way to build Voice AI apps - https://www.assemblyai.com/brief
Robots & Pencils - Cloud-native AI solutions that power results https://robotsandpencils.com/
The Agent Readiness Audit from Superintelligent - Go to https://besuper.ai/ to request your company's agent readiness score.
The AI Daily Brief helps you understand the most important news and discussions in AI. Subscribe to the podcast version of The AI Daily Brief wherever you listen: https://pod.link/1680633614
Our Newsletter is BACK: https://aidailybrief.beehiiv.com/
Interested in sponsoring the show? sponsors@aidailybrief.ai

ai dogs cancer chatgpt cure agent sec navigate nvidia bracket kpmg bytedance gtc nlw

Navigating the Energy Market Turmoil 3/16/26

Halftime Report

Play Episode Listen Later Mar 16, 2026 45:32

David Faber and the Investment Committee debate how to trade oil and the market as turmoil in the energy sector grows. CNBC's Brian Sullivan joins us with the latest comments from Treasury Secretary Scott Bessent. Plus, CNBC's Kristina Partsinevelos joins us to discuss the latest news out of San Jose, California, where Nvidia is set to kick off its annual GTC event. The Committee debate how to trade the company ahead of Jensen Huang's keynote speech. And later, the desk debate retail investors abandoning private credit and what it means for the sector. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

california energy navigating market committee cnbc nvidia san jose turmoil simplecast jensen huang investment committee gtc brian sullivan david faber

Federal Judge Blocks Subpoena In Fed Reserve Probe… And Counting Down to Nvidia's GTC 3/13/26

CNBC's "Fast Money"

Play Episode Listen Later Mar 13, 2026 43:35

A fresh wave of semi catalysts lining up for next week, from Nvidia's GTC event to Micron earnings and an AWS–Cerebras tie-up. What Fast Money Friend Gene Munster is watching, and what to expect from Nvidia's CEO Jensen Huang when he takes the stage. Plus Jefferies' David Zervos joins us with a simple message for investors: “Don't panic,” as traders weigh inflation risks, Meta's reported AI delay, and Boeing's push to fix wiring issues. Fast Money Disclaimer Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

ai boeing blocks nvidia probe simplecast micron subpoenas federal judges jensen huang gtc fed reserve

Latest podcast episodes about gtc

304 – Building Successful Multi-Product Solutions with Hyperscalers and GSI’s

157. The Different Types of Orders (Part 3)

Why AI Infrastructure must evolve for Agent Experience — Akshat Bubna, Modal CTO

When the answer is never ‘no': The world of entertainment travel (Global Travel Collection, part 2)

How the GTC Host Agency Is Seeing Incredible Growth

Where Global Travel Collection Grows Great Luxury Advisors

Who Helps GTC Travel Advisors Stay Successful in Selling Travel

How Global Travel Collection Came Together in Austin

An inside look at Global Travel Collection, the $2.4 billion powerhouse host agency (part 1, feat. Angie Licea)

Why Social Media Lost in Court and AI Agents Demand Total Surveillance ft. Shelley Palmer

Ferrari 330GTC: The Story of a Thoroughbred, with Maurice Khawam (Encore)

Perseverance in the Face of Adversity: From Geopolitics to the Gut Check with George Tagg Jr. | Ep 56

News: Autumn in the LLM Market?; How Not to Do Metrics; the Local Inference of the Future

PCS Season Payoff: How Military Families Turn Moving Expenses Into Free Vacations #232

AUGTC S3 E6: What, Like It's Hard? Bringing Legally Blonde to Life

Taiwanese Workers' Pay Is Poor; Korean Media Call Them "Beggar Supermen"? (2026/06/01)

Nvidia Now Brings Out Chips for Laptops and Desktop PCs Under the Name RTX Spark

SP2. Nvidia GTC Livestream Graphics Card Giveaway + Software Stocks Bounce Back | M觀點 Special Episode

[天下零時差 06.01.26] Nvidia GTC and Computex Kick Off; International Oil Prices Rise, So Why Is Taiwan's Inflation Low?; the Fed Releases Its Latest Beige Book

[阿榕伯胡說科技 Ep.76] May Tech News Breakdown: Jensen Huang Visits Taiwan Again, MediaTek's Share Price Surges, SpaceX IPO Countdown

[CLIP] World Models, Real-Time Video and the Decade Ahead | Jamie Umpherson (Runway)

#528 The AI Video Revolution Reshaping Cinema, Advertising and Fashion | Jamie Umpherson (CCO at Runway)

Lumentum: 90% Revenue Growth, a $2 Billion Nvidia Investment, Triple Digits Coming — and the Dilution Story Nobody Is Covering

EP3-37 | What Makes AI Leaders Different? The Taiwan Connection Behind Jensen Huang and the AI Boom

NVIDIA GTC 2026: AI, Robotics and Future Trends in Tech

CCA Accelerator Special Feature: Transforming Advising at Greenville Technical College

AUGTC: S3 E5: The Power of Empathy: Why "To Kill a Mockingbird" Still Matters

Shopify's AI Phase Transition: 2026 Usage Explosion, Unlimited Opus-4.6 Token Budget, Tangle, Tangent, SimGym — with Mikhail Parakhin, Shopify CTO

349: Gmail Finally Lets You Ditch xXDragonSlayer2004Xx

Why Social Media Lost in Court and AI Agents Demand Total Surveillance – Shelley Palmer's 5th Visit
