Podcasts about gpus

  • 1,382 PODCASTS
  • 3,268 EPISODES
  • 49m AVG DURATION
  • 2 DAILY NEW EPISODES
  • Jun 28, 2026 LATEST

POPULARITY (chart, 2019 to 2026)

Latest podcast episodes about gpus

The John Batchelor Show
S8 Ep1066: The Founding of OpenAI. Guest Author: Keach Hagey.

Jun 28, 2026 (10:25)


The Founding of OpenAI. Guest Author: Keach Hagey. In this opening segment, Keach Hagey discusses the January 2016 founding of OpenAI as a nonprofit research lab. Key figures included co-founder Greg Brockman and chief scientist Ilya Sutskever, a renowned researcher whose recruitment from Google signaled the lab's potential. Backed by a billion-dollar commitment from Elon Musk, Peter Thiel, and Jessica Livingston, the project was designed as a safe, non-commercial counterweight to Google's DeepMind. Operating initially out of Brockman's apartment, the team aimed to achieve Artificial General Intelligence (AGI) for the benefit of humanity. The technical foundation relied heavily on GPUs, hardware originally designed for video games, which proved essential for training the deep learning neural networks necessary for their research. This era was characterized by an ambitious, "pirate" spirit funded through YC Research to explore radical ideas outside the profit motive.

Minimum Competence
Legal News for Tues 6/23 - LA "Sanctuary City" Fight with Feds, Voter Roll Database Limits, and OpenAI, Cloud Computing, and the R&D Credit

Jun 23, 2026 (7:10)


This Day in Legal History: Title IX

On June 23, 1972, President Richard Nixon signed the Education Amendments of 1972, a sweeping federal education law that included what became one of the most consequential civil rights provisions in American history: Title IX. Title IX stated that no person in the United States, on the basis of sex, could be excluded from participation in, denied the benefits of, or subjected to discrimination under any education program or activity receiving federal financial assistance. The language was brief, but its legal effect was enormous because it tied sex-equality obligations to the federal funding received by schools, colleges, and universities. That structure gave the federal government a powerful enforcement tool: institutions that accepted federal education money also had to comply with anti-discrimination rules.

Although Title IX is often remembered for transforming women's and girls' athletics, the law was never limited to sports. It also affected admissions, scholarships, hiring, classroom access, pregnancy discrimination, and later legal debates over sexual harassment and institutional responsibility. Before Title IX, many educational institutions openly limited opportunities for women, including through quotas, unequal athletic resources, and restricted access to professional programs. The statute helped turn those practices into legal liabilities rather than accepted traditions. In later decades, courts and federal agencies would shape Title IX's meaning through regulations, enforcement actions, and major cases interpreting what counts as sex discrimination in education. Its influence reached far beyond individual lawsuits because schools had to rethink policies, reporting systems, athletic budgets, and equal-access obligations.

Title IX also became a model for how civil rights law can operate through spending power, using federal money as the hook for national anti-discrimination standards. Its passage showed that a single sentence in a larger statute could become a foundation for generations of legal, political, and cultural change. On June 23, 1972, the federal government did more than amend education law; it created a durable legal framework for challenging sex discrimination wherever public money supported educational opportunity.

A federal judge in California dismissed the Trump administration's lawsuit challenging Los Angeles's limits on cooperation with federal immigration enforcement. The administration had argued that the city's ordinance was unconstitutional because it restricted the use of city resources to support federal immigration operations and limited the collection of citizenship-status information. U.S. District Judge Fernando Olguin rejected that argument, finding that Los Angeles was regulating the conduct of its own employees and agencies rather than trying to control the federal government. The dismissal was not necessarily the end of the case, because the judge allowed the administration to file an amended complaint. Los Angeles City Attorney Hydee Feldstein Soto praised the ruling, saying it confirmed that local governments can decide how to use their own personnel and resources. The lawsuit was filed after immigration-related protests in Los Angeles and after Trump sent troops to the city in response to unrest over deportation operations. The case is part of a broader Trump administration effort to challenge local “sanctuary” policies in Democratic-led jurisdictions. Similar administration lawsuits against Boston and Chicago have also been dismissed by federal judges. The White House did not immediately comment on the ruling. The decision leaves Los Angeles's ordinance intact for now while giving the federal government another chance to revise its legal claims.

US court dismisses Trump administration lawsuit over Los Angeles immigration policy | Reuters

A federal judge in Washington, D.C., blocked the Trump administration from using a revised immigration database to help states check voter rolls. The database, known as SAVE, is used by the Department of Homeland Security to verify citizenship and immigration status, but the administration had changed it to make bulk searches easier for state and local officials reviewing voter eligibility. U.S. District Judge Sparkle Sooknanan sided with voting-rights and privacy groups that argued the changes made the system less reliable and could wrongly remove eligible voters from registration lists. The challengers said the database can be outdated, especially when naturalized citizens are still incorrectly listed as noncitizens. The judge also found that the revamped system raised serious privacy concerns because it gave users access to sensitive information, including Social Security numbers. DHS criticized the ruling and framed the case as part of its effort to prevent noncitizen voting. The ruling comes as the Trump administration has tried to expand the federal government's role in election administration before the November 2026 midterm elections. Courts have already blocked several related efforts, including parts of executive orders involving proof-of-citizenship requirements and mail-ballot restrictions. The administration has also faced setbacks in lawsuits seeking full voter-roll data from states. For now, the decision limits how the federal government can use immigration records in voter-roll checks.

Judge blocks Trump's use of revamped immigration database for voter checks | Reuters

In my Bloomberg column this week, I wrote about OpenAI's request that Treasury update an outdated R&D tax credit rule for computer-related research expenses. My argument is that OpenAI's position should not be dismissed as just another technology company asking for a more generous tax benefit. The problem is that the existing rule was designed for an older world of identifiable physical computers, not modern cloud computing, data centers, GPUs, and reserved compute capacity. Section 41 allows a research credit for certain amounts paid to another person for computer use in qualified research, but Treasury regulations narrow that benefit by requiring that the computer be owned and operated by someone else, located off the taxpayer's premises, and not be a computer for which the taxpayer is the “primary user.” That “primary user” test made more sense when a taxpayer could point to a discrete machine, but it becomes unstable when a company is buying access to capacity inside a provider-owned cloud or data center.

I argue that reserved or exclusive use of computing capacity should not automatically be treated as ownership or abuse, because modern AI research may require dedicated capacity for security, speed, and performance reasons. The real question should be whether the taxpayer is buying a third-party service or has effectively acquired, operated, or taken control of the infrastructure. Treasury can still protect against abuse without treating ordinary commercial cloud arrangements as disguised ownership. I suggest that a practical safe harbor could presume service treatment where the provider owns, operates, maintains, and houses the equipment off the taxpayer's premises while bearing the incidents of ownership. That presumption should remain rebuttable where the taxpayer bears ownership-like risks or is simply routing its own equipment through another entity to claim the credit.

The broader point is that modernizing the rule would not need to turn the R&D credit into an AI subsidy machine, but it would prevent an old regulatory framework from excluding a major category of modern research. The column closes with the idea that tax rules meant to police fake outsourcing should not end up penalizing real outsourcing just because the computing world no longer looks like it did when the rule was written.

OpenAI's Call for Modernized R&D Credit Rule Makes Perfect Sense

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.minimumcomp.com/subscribe

The Future of Work With Jacob Morgan
The Data Center Race Behind AI: Solidigm's SVP on Why Storage, GPUs, and Scale Matter

Jun 22, 2026 (44:37)


I talk with Greg Matson, Senior Vice President and Head of Marketing and Products at Solidigm, about the storage infrastructure powering the AI boom. We get into why AI training and inference require massive amounts of data, how GPUs, SSDs, and data centers work together, and why storage can't be an afterthought for companies building enterprise AI. We also discuss the scale of today's AI data center buildout, how Solidigm is using AI internally, and what this means for the future of work, education, and the skills people will need in an AI-first world.

TIC TEK TOE
AI Boxes and Dinosaurs Eating People

Jun 22, 2026 (50:55)


Welcome to another very occasional episode of the Tic-Tek-Toe podcast with your hosts Marcel Gagné and Evan Leibovitch. In this episode (037), the duo dives into the hardware and software defining 2026, exploring the rising appeal of compact, sealed AI mini-PCs as the perfect workaround for the sticker shock of modern desktop GPUs and RAM. They also tackle the ongoing debate between paying for multiple premium AI subscriptions versus the dream of running fully local models for coding and personal agents. Marcel demands the sci-fi promises of personal robots and advanced technology right now, while Evan takes a more cautious, patient approach (surprise, surprise). In the second half, the conversation shifts to a massive summer movie and pop-culture roundup. We've got Mandalorian and Grogu, Project Hail Mary, a John Wick-inspired take on Supergirl, Nicolas Cage in Spider-Noir, a wildly self-aware, meta version of He-Man, and Steven Spielberg's return to the alien genre with Disclosure Day. Oh, and Marcel shares his unapologetic love for movies where dinosaurs get to eat people.

L8ist Sh9y Podcast
Kubernetes as Common Platform

Jun 19, 2026 (38:43)


This week we examine the emergence of Kubernetes as a common infrastructure platform. We contrast cloud assumptions with bare-metal reality and dig deep on supportability, repairability, and the operational challenges of networking, storage, GPUs, and other specialized systems. We also get into brittle vendor tooling, version changes, and how these can make remediation difficult. I think you'll like this one! Transcript: https://otter.ai/u/cA80HK4iVJWflodPqIYf9mW3Mec?utm_source=copy_url

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Last 4 days before regular tickets sell out at AI Engineer World's Fair - this is the single biggest gathering of AI Engineers, Founders, Leaders, and Researchers in the world. Attendees get >$5000 worth of sponsor credits and talk tracks are looking FANTASTIC. Join us!

The AI scaling debate always focuses on the question of “how do we get more GPUs?” but the better question may be: how do we make the most of the ones we already have?

The fact that a frontier lab like xAI could be running at sub-10% MFU (Model FLOPs Utilization) is just a hint at what the real problem may be.

For context, older frontier-scale training runs were already much higher than 10%. GPT-3 was around 21% MFU. Gopher was around 32%. Megatron-Turing NLG was around 30%. PaLM reached around 46%. And our guest Anjney says best-in-class MFU today is closer to 60–70%.

It's not necessarily that xAI is uniquely incompetent (it's clear they have talented folks) but rather the priorities may be flipped in the GPU arms race.

While GPU access is a bottleneck, simply increasing CapEx won't automatically translate to better models as frontier AI is increasingly a systems problem: scheduling, utilization, networking, kernels, frameworks, data pipelines, parallelism, cluster reliability, and the thousand small decisions that determine whether your theoretical FLOPs become real training progress.

From building Discord's developer platform and backing frontier AI companies like Anthropic, Mistral, Black Forest Labs, and Periodic Labs to now building AMP's independent compute grid, Anjney Midha has spent years close to the real bottlenecks of AI scaling.
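
For readers who want to put numbers on the MFU percentages quoted in these show notes, here is a rough sketch of how the metric is commonly estimated. It is not taken from the episode. It assumes the widely used approximation of about 6 FLOPs per parameter per training token for a dense transformer (attention FLOPs ignored), and the function names and the cluster figures in the example are hypothetical.

```python
def mfu(n_params: float,
        tokens_per_second: float,
        num_accelerators: int,
        peak_flops_per_accelerator: float) -> float:
    """Model FLOPs Utilization: useful model FLOPs/s divided by theoretical peak FLOPs/s."""
    # Assumption: ~6 FLOPs per parameter per training token (forward + backward pass)
    # for a dense transformer; attention FLOPs are ignored in this sketch.
    model_flops_per_second = 6.0 * n_params * tokens_per_second
    peak_flops_per_second = num_accelerators * peak_flops_per_accelerator
    return model_flops_per_second / peak_flops_per_second


def throughput_gain(current_mfu: float, target_mfu: float) -> float:
    """How much more training progress the same hardware delivers at a higher MFU."""
    return target_mfu / current_mfu


# Hypothetical cluster: 70B-parameter model, 1M tokens/s,
# 2,048 accelerators at 1e15 peak FLOPs/s each.
example_mfu = mfu(70e9, 1.0e6, 2048, 1.0e15)
gain = throughput_gain(0.10, 0.60)

print(f"MFU: {example_mfu:.1%}")           # MFU: 20.5%
print(f"Gain from 10% to 60%: {gain:.1f}x")  # Gain from 10% to 60%: 6.0x
```

Under these assumptions, moving a cluster from 10% to 60% MFU is worth about as much training throughput as buying six times the hardware, which is the point the notes make about utilization versus CapEx.
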
In this episode, Anjney joins swyx at Periodic Labs to unpack why the AI race is not just about buying more GPUs, why 95% utilization would have been considered an outage at Google, and why the next era of AI infrastructure has to be more aligned, more efficient, and more responsible.

We go deep on AMP's vision for a compute grid that makes FLOPs flow like megawatts, the difference between full-stack AI labs and horizontal pooling, why AI data centers need community buy-in, and how compute markets could evolve into something closer to an independent system operator. Anjney also explains why DeepMind's unpublished research points to a market failure, why end-of-life prediction remains one of the most important AI applications he has thought about for fourteen years, and why “output maxing” may become a new discipline for frontier systems.

We also discuss Anthropic's culture, why “luck favors the prepared mind” in coding models, how Claude cracked coding, why too much capital too early can make AI labs fragile, what Periodic Labs is trying to do with science and superconductors, why great researchers can become great CEOs, and why Silicon Valley is both deeply missionary and deeply mercenary.

We discuss:
* Why 95% utilization was considered an outage at Google
* Why AI infrastructure waste compounds at frontier-lab scale
* Why “move fast and break things” does not work for AI data centers
* How data center backlash, power grids, and community incentives shape AI scaling
* AMP's vision for making FLOPs flow like megawatts
* Why compute needs an independent system operator
* How interruptible demand and dynamic prioritization worked inside Google
* Why DeepMind research hoarding creates negative externalities
* AMP's 1.2GW base-load ambition and the need for 6GW of spike capacity
* Why end-of-life prediction could become one of AI's most important healthcare applications
* Frontier Systems, output maxing, and full-stack alignment
* Why APIs and abstraction layers become lossy as organizations scale
* Superconductors, standards, and the dream of lossless systems
* SF Compute, open protocols, and the future of compute marketplaces
* Why non-NVIDIA chips can still benefit from NVIDIA's reference architecture
* Trust boundaries and why chip startups need visibility into future model architectures
* Why VCs often underestimate researchers as CEOs
* Scientists as star athletes of the mind
* Why great CEOs need to be confrontational up and down the stack
* Why leading the frontier matters more than “winning”
* How Anthropic cracked coding
* Why culture is fragile, not a permanent moat
* Why hardship was a feature, not a bug, for Anthropic
* Why Anthropic's P0 was coding from day one
* Periodic Labs, physics as the constraint, and technical reality
* Silicon Valley mercenaries, missionary teams, and what happens after a breakthrough

Anjney Midha
* LinkedIn: https://www.linkedin.com/in/anjney
* X: https://x.com/AnjneyMidha

AMP PBC
* Website: https://amppublic.com/
* X: https://x.com/amppublic

Timestamps
00:00:00 Introduction
00:00:09 Why AI Compute Is Being Wasted
00:03:17 Responsible Infrastructure and Data Center Backlash
00:06:07 AMP Grid: Making FLOPs Flow Like Megawatts
00:12:41 Foundry, Frontier Labs, and Research Hoarding
00:14:42 Gigawatt-Scale Compute and End-of-Life Prediction
00:24:08 Frontier Systems, Output Maxing, and Alignment
00:27:38 Compute Markets, SF Compute, and Non-NVIDIA Chips
00:32:57 Trust Boundaries, Co-Design, and Researcher CEOs
00:38:17 AI Coachella and First-Principles Thinking
00:42:43 Leading vs Winning in Frontier AI
00:45:54 How Anthropic Cracked Coding
00:48:25 Culture, Hardship, and Anthropic's P0
00:54:03 Periodic Labs, Physics, and Silicon Valley Mercenaries
00:56:26 Rishi Valley, Singapore, and Money as a Measure
00:58:47 Closing Thoughts

Transcript

Introduction: Anjney Midha, AMP, and Compute Waste

Swyx [00:00:00]: We're in Periodic Labs with Anjney Midha, CEO, founder of AMP. Welcome.

Compute Utilization: Node Allocation, MFU, and Alignment

Anjney [00:00:09]: Thanks for having me. At Google, there are two types of utilization usually, right? That you're measuring in these clusters. One is node allocation, and then the other's MFU. Node utilization is usually like what percentage of cards in the data center are just used, and if it's not at 95%-

Swyx [00:00:29]: There is no excuse.

Anjney [00:00:29]: There's no excuse, right? I think 95% at Google, which is where my co-founder, Seb, came from, he built the Borg, PBorg/GQM scheduler at Google, and there I think 95% was considered an outage, so 96% node utilization should be standard. And most single-tenant clusters are not running at that. So that's one. And then MFU, I would say the best in class today is somewhere between 60 and 70%. I think this is a leadership question, right? Fundamentally it's an alignment question, which is: are the people who are funding the cluster and then deploying the cluster actually aligned? And sometimes theoretically they are, but in practice the number of people in the chain, the supply chain between the capital and all the way to whoever's managing the cluster and then whoever's measuring what the output is, are just so many degrees of separation away that- Have you ever heard the radian metaphor, which is, at the beginning of an arc, if you have two lines that are just off by a few degrees, that-

Swyx [00:01:33]: It spreads out.

Anjney [00:01:34]: It spreads out, right? Or at scale. And I think what's happening in a lot of cluster implementations and infrastructure, at a lot of frontier labs and other teams, is they initialize the plan, which is kind of like a North Star, with a team that wants to do good, but then they're required to scale so fast instead of iteratively that the wastage just compounds really fast at scale.
And so I think we know the answer, which is just do iterative bring ups. If you spend time with people who've been in the semiconductor industry or the DSN industry for a long time, this is not new, and I don't think AI should be an excuse. Sure. What is new? Okay. We have a lot of new capabilities, but that doesn't mean just abandon common sense. Common sense should always be in fashion. AI scaling doesn't change the- in fact, if anything, AI scaling should be putting a premium on the value of common sense and infrastructure because the margin of error now is so much lower and the costs of wastage are so much higher. And the cost of wastage, by the way, is not just economic. Obviously I'm an investor, or I'm an investor by background. Over the last few years now we're running an AI infrastructure business called AMP. And I think that it's okay to say this time is different on the capabilities front. We are genuinely getting capabilities of a kind we haven't had before. That doesn't give you an excuse to say this time is different for everything, especially infrastructure. So look, I love the hacker mindset and the hustler mindset. Now, that's great for the startup mindset, but you remember this moment where Zuck went from saying, “Move fast, break things” to, move-

Responsible Infrastructure and Data Center Backlash

Swyx [00:03:10]: Fast and stable infrastructure.

Anjney [00:03:11]: Move fast with stable infrastructure. I think now we need to move fast with responsible infrastructure. People are going to ask where the impact is. In our class yesterday, Scott Nolan, who's the founder of General Matter, came by at Stanford to speak about energy bottlenecks. And he had a phenomenal idea. He said, “If you look at the marginal unit economics of compute per hour,” he goes, “let's call it $4 an hour. If you're having to bring up a new data center in a new community, why not just say we're going to charge $4.50 an hour, and that marginal impact or that marginal increase, we just literally take that and give it to the local community as cash?” I can tell you as a customer of that compute, I would love that. I'd be happy to pay an additional 50 cents per hour at scale.

Swyx [00:03:57]: Wow. Yeah.

Anjney [00:03:58]: Because if that means the public benefit is so clear to the communities that the data centers are coming up in, I'm going to feel like that compute is much more reliable. Up to 20% of all data centers this year in the US, my understanding is, are at risk.

Swyx [00:04:13]: Of community backlash?

Anjney [00:04:14]: Correct. Of not getting the community support they need to get brought up.

Swyx [00:04:19]: Wow. That's a huge number.

Anjney [00:04:20]: Yeah. Now, I think we should dig into what that number is. I think it's a little bit overstated. These things can get over-reported, but it-

Swyx [00:04:27]: They don't just care about jobs. They care about all the other stuff around it, right? They care about power grid, they care about environments-

Anjney [00:04:33]: Power grid, permitting, and so on. And imagine if you said there's a new AI deal. If we're bringing up a data center in your community, we're actually going to reduce the cost of your electricity bill. Okay, now we're talking. Right? The community's going, “Okay. Now this is a deal. I feel like a partner in this.” Right now that's not happening. There will be audits, there will be investigations, and when the regulators come, I don't know when it's going to be, the folks who are moving fast and breaking things in the name of AI progress better be prepared. That's certainly not how we're procuring compute. We're trying as much as we can to work with partners who have long-term track records. Many of whom, by the way, are not AI providers.
I think this whole idea of neoclouds being somehow this new category is a lot of marketing speak. There are really good, reliable, trusted data center providers in America who've been around 20 plus years. I love those folks. They know how to- Sure, are they sponsoring happy hours at NeurIPS? No. Are they legibly listed in Build? No. Are they hanging out in situational awareness parties? No. But they're adults. I trust them.

Swyx [00:05:44]: They can run land. They can run power.

Anjney [00:05:45]: They can run land, power, and shell. They have credit histories. We sit down, we have conversations. Many of them live in Silicon Valley. They've had to deal with the boom and bust cycles of the internet, and I love those folks. They are stable infrastructure partners and thinkers. And I think there's a lot of short-term thinking going on in the compute layer, and it's going to catch up to us. It's not going to be good.

AMP Grid: Making FLOPs Flow Like Megawatts

Swyx [00:06:07]: You talk about aligning incentives, and I would think that aligning incentives means you have the full stack in one company, which is xAI and OpenAI, right? So you as a standalone infrastructure layer, why are you somehow more aligned to your portfolio companies than people who just own the whole thing?

Anjney [00:06:28]: In systems design, right, there's two regimes of architecture, right? You have integration, and then you have pooling and utilization, right? Or rather, the way to increase utilization often is you can do systems integration, where you collapse a lot of process into one node, or you can pull out a process from a node and share that resource amongst several different nodes. And so we see the AMP grid, which is the system we're building here, which is basically a compute grid. We're trying to do for compute what the electric grid-

Swyx [00:07:02]: Power.

Anjney [00:07:02]: Yeah, what the power grid did for electricity. This is a pooling and utilization layer across clouds. And so we're actually the opposite of a full stack integration like approach.

Swyx [00:07:12]: Super horizontal.

Anjney [00:07:13]: Where it's much more horizontal and it's multi-cloud, it's multi-silicon. The goal is to try to make FLOPs flow like megawatts, and that is very hard to do today for many reasons. There's stranded pools of compute all over the place and there's no fungibility. And so right now we do it at the level of scheduling, and we often do it at the economic layer. But as we start to announce what we're working on, it's extraordinary how many folks are coming out of the woodwork and saying, “Hey, I'm actually working on a way to make compute fungible at this part of the stack and that part of the stack.” And as a grid, we'd like all of these folks to participate on the grid. People often ask me, “Andra, are you a neocloud?” And I go, “No, actually neoclouds are suppliers.” Sometimes they'll ask, “Are you a venture capital firm?” I go, “No, actually they are demand, like sort of off-takers of the grid.” We see ourselves as what's called an independent system operator. So if you study the history of the electric grid, once it became legible to a lot of factories and industrial sort of participants that, hey, actually it turns out pooling is a good idea, we should pool our generators instead of all having a generator running at half capacity in our backyard, there was a need for an independent entity who could coordinate all these parties. Transmission lines, power generation facilities, factories, and that neutral coordination mechanism is very critical. If you study the history of grids, the most enduring ones were those that never owned their own assets. They were ones that often started with long-term anchors who are uncorrelated sources of demand, a steel factory, a shoe mill or whatever in a particular town who weren't competitive, where the steel factory wanted to spike up at night, the shoe mill wanted to spike up during the day. So then you pool and you share, right? So each of you is guaranteed some base load, but then you kind of schedule your spikes to drive a peak utilization across the town. The gold standard, so to speak, historically, has been these utility companies like PJM Interconnect in the northeast of America, where they, over many years, became what's called an ISO, an independent system operator of the grid. So that's how we see ourselves. Economically, that's what we are. From a technical perspective, we started at the scheduling layer because Seb and Mihai, who run engineering here, built that at-

Swyx [00:09:28]: Did your scheduling.

Anjney [00:09:28]: They did that at Google. And-

Swyx [00:09:32]: And you have infra shops from Discord as well.

Anjney [00:09:35]: I have some.

Swyx [00:09:35]: I don't know if Discord is like the primary identity, but whatever, I'm just kind of-

Anjney [00:09:39]: No, Discord was-

Swyx [00:09:40]: Choosing a well-known name.

Anjney [00:09:42]: Well, so I was running the developer platform there. The internal infrastructure I was not responsible for. That was actually a guy by the name of Mark Smith, who was extraordinary. And yes, Discord did pool- so Discord is actually a counterexample. I had the chance to learn a lot about full stack infra there because-

Swyx [00:09:56]: It's the same thing, yeah.

Anjney [00:09:57]: It's the other architecture, which is, Discord built its own WebRTC voice and video infra. So like Discord did not use-

Swyx [00:10:08]: For the calls, yeah.

Anjney [00:10:09]: Yeah. For communication, Discord did not use third party infra. It was all built in-house.
And then the way you maximize utilization was you pool demand from the world's 200 million plus monthly active gamers, right? And so that's, that's how those stacks were constructed. Again, in systems design, the two concepts that keep coming up over and over again are abstraction and composition, right? And-Swyx [00:10:31]: Bundling and unbundlingAnjney [00:10:33]: Bundling and unbundling, abstraction, composition, like verticalization and-Swyx [00:10:36]: HorizontalAnjney [00:10:36]: Horizontalization. So in that sense, AMP is an independent system operator of the grid. We pool demand, we pool supply from a number of partners we trust At about 1.3 gigawatt scale over four years. And then we pool demand from some of the world's best, research labs and so on. We're sitting at one, periodic labs who need extraordinary long-term demand. And the idea is that, each of them is guaranteed base load on the grid, but they can spike up and down flexibly on, for compute, with much shorter timelines as needed. That was roughly the design of the program I came up with at a16z called Oxygen. The same-- That was the same design of the GQM, BorgX, Borg GQM implementation at Google that Mihai and Seb had built. Which was that how do you allow, teams inside of Google, on the internal infrastructure to be guaranteed capacity, for their base workloads? But when they need to spike up on research, how could they ensure that was sufficiently there? And of course, the big innovation that was not discovered, but kind of implemented in the space, this infra space maybe three, four years ago at Google was the idea of interruptible demand, right? Where you just queue up a bunch of jobs and through this like sort of credit system, there can be a bidding mechanism.Swyx [00:11:53]: Like priorities.Anjney [00:11:54]: It's a dynamic prioritization Basically. And jobs can get interrupted based on somebody else who's saying, “what? 
I have 10 tokens, 10 credits I want to spend on this job.” Another like team lead, research lead is “Genie 3 or whatever is only worth five, credits, and NanoBanana2 is worth 10 credits,” and so the NanoBanana job gets priority. That's a, that's a made up example.Swyx [00:12:15]: It's very real. Brain Marketplace was real. And, we've, we've covered this on the pod with David Luan, who was-Anjney [00:12:20]: Oh, great. OkaySwyx [00:12:20]: Was there. And the criticism is that, well, actually sometimes you need central command to go all in on a thing. And actually sometimes capitalism via credits doesn't work. Not, this is not a criticism of AMP. I'm just saying, this is a thing that has been tried, internally within Google, and it led to Google missing GPT.Foundry, Frontier Labs, and Research HoardingAnjney [00:12:41]: Like, we structured ourself essentially very similarly to Google. We are structured as a holdings company. So, Alphabet holdings is Alphabet holdings, and then they've got these subsidiaries called Google and-Swyx [00:12:51]: Other betsAnjney [00:12:52]: Other bets and so on. We've got, AMP holdings, and we've got our infrastructure business, and then we've got a capital business called Foundry that incubates new frontier AI labs or invests in them as venture capital, like Periodic. We put a few hundred million dollars into Anthropic from our fund earlier this year. So wherever we feel like teams are making progress, especially researchers and so on who've pushed the frontier inside of existing labs like DeepMind, I find, there comes a point where they feel misaligned with the dictatorship of Alphabet holdings. And at that point, sometimes the dictatorship doesn't want them anymore. And they're “Thank you. You've done your job here. 
You've kind of helped us through the zero to one phase, and for whatever reason, we're going to deprioritize your amazing omni model or whatever it is, and instead we're going to prioritize coding.” And I think that's a tragedy, but I get it. Sergey and team are running their own business there. But that doesn't mean the rest of us should sit around waiting for that progress to get unlocked for the rest of the world and humanity. If you think about how much extraordinary research has happened inside of DeepMind over the last 10 years, Demis and Sergey and those guys did such a great job. But at the end of the day, so much of that has never seen the light of day.

Swyx [00:14:00]: Or they're like papers only, but they never actually shipped it to production or-

Anjney [00:14:03]: What's worse is the papers are actually not even being published anymore, ‘cause there's a six-month embargo inside of DeepMind, right? We've heard about this, where a paper comes out, and then I think there's a six-month embargo window where if anybody on the business team says, “This could be interesting,” it's embargoed for life.

Swyx [00:14:18]: Exactly. So the stuff that gets published is the stuff that's not good enough.

Anjney [00:14:21]: There's an adverse selection problem, basically. Yeah. At this point-

Swyx [00:14:25]: It's a common complaint at NeurIPS, by the way, that's like, “Well, why would I look at the papers that are the trash of GDM?”

Anjney [00:14:31]: Again, I think it's a tragedy. I get it. They're running their business, but I think there's negative externalities of research being hoarded, and so there's a market failure. And somebody needs to unlock that research, and we can't do it on our own. We only have 1.2 gigawatts of compute. That's nothing. That's about $40 billion of cloud spend. We're going to need a lot-

Gigawatt-Scale Compute and End-of-Life Prediction

Swyx [00:14:51]: By the way, that's a new number.
I haven't come across that gigawatt number. That's huge.

Anjney [00:14:56]: Yeah. And to be clear, we haven't secured all of it. That's how much demand we have started to secure. I think publicly we haven't actually confirmed how much we have for this year. In order-

Swyx [00:15:04]: Where do you want to get to?

Anjney [00:15:06]: I think the steady state would be that we have a pool of 1.2 gigawatts of base load capacity at all times. For spike capacity, right now my estimate is we need roughly six gigawatts over the next four years for all our teams to feel like they were able to keep moving the frontier, whatever they're working on, whether it's superconductor discovery over here. There's a new investment we're working on right now, which is in the end of life prediction space in healthcare. It's extraordinary how much you can- this was actually my graduate school work. I went to grad school for bioinformatics at Stanford Med. And I know we-

Swyx [00:15:40]: Econ, MCS, bio.

Anjney [00:15:41]: So I was this really weird cat where I was never satisfied with my major options. So at one point I was an econ major, then I was a CS major, then I was an MCS major, called mathematical and computational science, and they decided they were going to end that major. So I took all that coursework, and I applied it to my graduate degree in bioinformatics, which was the master's program, and then I thought I was going to do a PhD. I never ended up doing it. I dropped out and went to work at Kleiner. But I was lucky enough to apprentice with this professor at Stanford Med. His name is Nigam Shah, and he was working on end of life prediction. Stanford is one of the only research facilities in America that has a longitudinal patient data set that large. I think it's at least 12 million patient lives. The only larger data set is at the VA, the Veterans Affairs, of America.
And to do research, like do any deep learning and so on, on that data set, which was called the STRIDE data set at that time, you had to be a Stanford Med School affiliate, which is why I went and enrolled in the bioinformatics department. And deep learning was early. Nigam Shah had the vision to see that you could do end of life prediction to help palliative care. In America, over 30% of all Medicare and Medicaid spend, at least at that time, was spent on end of life care. And we grew up in Asia, so, at least, I won't speak for you, but I have a very different relationship with death than I find folks who grew up in America do. In America, spiritually and culturally, especially in Western societies where the Christian tradition sort of frames death as this terminal point, there's often a judgment day and so on. The way we view death is with a finality. In Indian culture, in Hindu culture, death is one-

Swyx [00:17:35]: Also, he's Buddhist as well.

Anjney [00:17:36]: You're Buddhist, yeah. So it's one step in a journey of many lives, right? And so, I grew up in this city called Chennai in the south of India, and when people die, you dance on the street. There's like a procession where your body is carried to be cremated and your family celebrates and there's drums and so on. It's this huge thing. And it's because the idea is that you're going to be reincarnated. You've been liberated from the responsibilities of this life, and now you're onto your next. It's like going off to a new college or whatever, right? And so it was so alien to me when I got here as an undergrad that the medical system works backwards from that assumption that we have to view death as this terminal thing and delay it, postpone it, that it's a bad thing. And so at the time, clinical decision support in the United States was this very primitive field.
Even to this day, physicians in the United States often will tell you when you have a terminal disease, “We've diagnosed you, which is great. Our ability to diagnose you is extraordinary. You have somewhere between six months to six years to live.” What do you do with that information? The error bars are so high. In times of uncertainty, we default to culture, and when the culture is “this is a bad thing, I've got to prolong my life,” then you start doing things like- And just from a systems perspective, what's going on there is physicians often feel like they need to provide such high error bars because there's always some uncertainty in end of life diagnosis, and if you provide the wrong diagnosis or recommendation to your patient, you can be sued for medical malpractice. And then your license can be taken away. It can be catastrophic for your career. In contrast, in countries where that's not the case, what you often observe is that physicians are quite prescriptive with their recommendation. They say, “Hey, this is your condition. The literature says that you probably have this much time on Earth left. My expert opinion is that you are an outlier or whatever.” And they try to be more prescriptive, and that empowers a patient, right? ‘Cause then a patient can say, “I trust my doctor. They said on average, I have six months to live, but if I do these things, I may have a shot because of my particular predispositions or my genetic history or whatever.” And that empowers you to go about your life in an actually more scientific way than leaning on religion, culture, spirituality, and so on. In contrast, here, because of that medical malpractice sort of thing looming over your head, a physician never gives you a clear recommendation.
So instead you say, “Okay, Doc, well, let's try it all.” And then you start a whole regimen of drugs and therapies, and then you often spend weeks and weeks in the hospital, and that deteriorates your quality of life. And when that deteriorates your quality of life, instead of spending your last few days doing the things you love with your family, you're spending it on a hospital bed. And that ends up being thirty percent of Medicare and Medicaid. So it's worse for the patients. The doctors feel terrible. The American taxpayer is paying a huge amount of money. And so this is why Nigam Shah, who was this professor at Stanford, said, “Anjney, if there's-” I kind of sat down with him. I was young, I was twenty-one, and I was like, “I want to work on a big problem.” He's like, “The big problem is end of life care.” And so we started trying to run deep learning on these STRIDE patient data sets to say, “Could you have an AI system make a recommendation that is orders of magnitude more precise than a human about how much time you have left once you've been diagnosed with a terminal condition?” And then if we can get that precision to be high enough, then you can empower the patient. And it turns out the tech works. Once you get the data set, RL works. Honestly, even regression models work. You don't need to get that fancy. At the time, we were just doing very simple neural nets.

Swyx [00:21:54]: Simple solutions, yeah.

Anjney [00:21:54]: Today, what we can do with RL is extraordinary. The problem, then and now, is regulatory, because you actually can't shift the burden of the wrong clinical diagnoses from the physician to the AI system. And so at that time, ten or twelve years ago, I got quite disillusioned, ‘cause I felt I just didn't have the resources to influence regulation. Today, I'm very lucky. I'm in a different place.
I'm a lot older, and so I've been spending a lot of time on my next incubation, which is how can we unlock patient empowerment by training AI models to do end of life prediction with much more precision and ac-

Swyx [00:22:37]: Oh, wow. You're still focused on this the whole time.

Anjney [00:22:40]: I haven't been able to get this out of my mind a single day for the last fourteen years. This is the hill I would like to die on. There's two, I would say. What? Actually, I'd prefer not to die.

Swyx [00:22:51]: Yeah, exactly.

Anjney [00:22:52]: But I think two issues that should be bipartisan in America are: how do we empower patients to make the right clinical decisions at the end of their life, such that we're reducing the taxpayer burden with science? It's just good old science, and AI can help here. And the second is net positive data centers, ‘cause I think that's the biggest critical bottleneck on training good enough AI models to help people at the end of their life. So they're sort of two sides of the same scaling bottleneck curve. We formed AMP as a public benefit corporation, my wife and I. You've met Viv. Her passion is education. Her family is a long line of educators and so on, and of physicists. And so this class is my attempt to stop being the black sheep of the family and be an educator. But if I'm not educating, the thing I would be doing is working on these two problems, whether on the political spectrum or as a researcher back in some lab. And my hope is if anyone's listening to this podcast, if they're passionate about either of those two topics, I'd love to hear from them.
We can share the contact in the show notes, but we're looking for people to join both of those missions, on the political side as well as on the medical side, on the research side.

Frontier Systems, Output Maxing, and Alignment

Swyx [00:24:08]: You said this is a discipline that you want to form. It's variously called Frontier Systems. It's variously called One Person Frontier Lab. What is the ideal name or shape of this? Like, what is the mission?

Anjney [00:24:24]: Of the class?

Swyx [00:24:26]: Of the discipline that you're exploring, right? The class is called Frontier Systems. But for me, maybe one phrase is you're just anti-waste, right? Which is wasting GPUs, wasting in human and Medicare. But is there a broader theme that maybe you can encapsulate more succinctly?

Anjney [00:24:45]: Yeah. From an engineering perspective, it's very simple. It's output maxing. It's the department of output maxing.

Swyx [00:24:51]: Making the most of what we have.

Anjney [00:24:52]: Exactly. I'm a huge believer in optimal outcomes. I think both in America and other countries, we are losing our appreciation for nuance, and this is the thing of- And AI is the same case, right? Oh, the bitter lesson holds. Okay, fine. But that doesn't mean you just throw 500,000 GB300s at your suboptimal model scaling and you waste a bunch of compute. It also doesn't mean that the most optimal is to have like 50 different architectures where there isn't enough standardization. One of the reasons Anthropic has had extraordinary sort of velocity is ‘cause they picked the transformer architecture and said, “This is simple. Let's double down on it,” right? And now luckily there's enough investment going to the space that we can afford other architectures, but at the time, investment was just too fragmented into other architectures, so that arguably unlocked scaling. So I think there's a philosophy.
I think we all owe it to ourselves to do output maxing with a new capability called AI on a global level. I think if I was starting a new department at Stanford, depending on how fuzzy or technical I wanted to be, I'd probably call it the Department of Alignment. Like-

Swyx [00:25:59]: It's an overloaded term.

Anjney [00:26:01]: It is. But alignment really is a hard problem. And I think when you unlock it- full stack alignment is super hard in any organization and in any system. Like in a venture capital firm, if you can have full stack alignment between your limited partners and the founders who are creating the value and ultimately the public that owns the IPO stock, that is a gift that keeps giving. And when you study the history of these systems, they usually start out small scale, where the feedback loop is actually so tight that there's alignment. And then the more you try to scale, the more division of labor happens, the more specialization happens, and at each step you add abstractions. And wherever there's an API interface, there's loss. There's communication loss. And so I think a really cool thing would be for us to figure out: is there a way for us to have our cake and eat it too as an engineering discipline? Is there a way to actually scale up and scale out without losing any alignment, without lossy transmission?

Swyx [00:27:01]: You mean standards?

Anjney [00:27:02]: So standards is one way. The other way is you just have net new capabilities. So like what we're trying to do here is discover new superconductors. A room temperature superconductor would be a lossless transmission mechanism for energy. We would have flying cars. We are within a few years of having a new room temperature superconductor. So I think those are the two.
You either have to standardize on protocols or API specs that allow lossless communication, or you can come up with a whole new capability that unlocks so much abundance that the standardization doesn't matter, ‘cause you just unlock net new capacity. So this is what I spend my days thinking about these days.

Compute Markets, SF Compute, and Non-NVIDIA Chips

Swyx [00:27:38]: No, I think every infra person who wants scale and wants to output max does eventually end up thinking about this. We don't have time to go into it, but we have done an episode with SF Compute-

Anjney [00:27:50]: Oh, cool.

Swyx [00:27:50]: That is trying to standardize the futures contract for compute. I don't know how that's going, by the way, but at some point this will be public.

Anjney [00:27:57]: Oh, I think Evan is awesome, and SF Compute is the kind of effort that I hope we can accelerate, because what often happens is it's hard to bootstrap these exchanges, right? There's many inefficiencies between parties. There's trust boundary inefficiencies in infrastructure, because one part of the stack doesn't trust another part of the stack to give them visibility. There's capital markets inefficiencies, there's operational inefficiencies. So if you can inject like a single shock to the system of a ton of compute demand or supply, then you can accelerate these new flywheels. And so my hope is one day, or soon, if SF Compute has excess capacity, they just hook it up to the grid and they get flooded with demand from us. And on the other side, if they have a ton of demand but they don't have supply, they just again hook up to the grid, and it's a two-way protocol where they can just hook up to our capacity. And I don't think we're too far from that.
Today our working implementation of it is mostly through a group of labs, universities, and a few sort of trusted parties who all feel like they're in alignment, to borrow an overused word. But our hope is to just have it be an open protocol that anyone can hook up to on-

Swyx [00:29:20]: Hook up for demand or hook up for supply? Primarily demand, it sounds like. Like you-

Anjney [00:29:25]: No, both.

Swyx [00:29:26]: You would want to offer demand.

Anjney [00:29:27]: Both. Yeah. Unfortunately, what's happened in the last six weeks is, we thought we'd have a bunch of excess capacity by the end of this year. It's all gone.

Swyx [00:29:37]: It's exploding.

Anjney [00:29:38]: Yeah. It's all gone. And so my text messages are full of friends, we know many of these people, these are founders who've raised billions of dollars in San Francisco, going, “Oh, any chance you have like 50 nodes in the next few weeks?”

Swyx [00:29:51]: What is the scope for non-NVIDIA, right? You have Lisa Su coming, and Reiner Pope as well. And so there is a lot of demand for more performance, alternative architectures and all that. At the same time, this hurts your standardization.

Anjney [00:30:11]: I don't think so. So actually Reiner's a great example, right? Reiner is the CEO and founder of MatX. I actually had him by for office hours in the class earlier today, and there was an insight he brought up that I hadn't considered before, which is when they decided to pick the standard for their data center, they picked the NVIDIA reference architecture. So the MatX chips just plug in to any site that has an NVIDIA bring-up planned. And the-

Swyx [00:30:42]: It's just software then.
It's not the-

Anjney [00:30:44]: A-

Swyx [00:30:44]: Hardware.

Anjney [00:30:46]: Well, from an input and IO perspective, it's the same footprint as an NVIDIA rack.

Swyx [00:30:52]: That makes sense.

Anjney [00:30:53]: Where they have innovated a bunch, from what I can tell, is on systems co-design, which is where a lot of the gains are to be had. And so he was like, “Anjney, there's just so much work to do when you're building a new chip company.”

Swyx [00:31:08]: Can't fight every front.

Anjney [00:31:08]: You just can't fight on every front. So my question to him was, “Well, you're working on this new chip. Your tape-out is next year. Who are you going to partner with to host the chips?” And he said, “Whoever will host them. That's not my focus.” And I said, “But how did you-” Back to our earlier systems design question, he decided that he didn't want to be a fully integrated chip provider. The bottleneck they're focused on is the logic die, and he feels they can crank out a ton of performance gains through co-design there. But then that means you delegate, to our question earlier: the data center provider is a different part of the stack, and so then he's dependent on that part of the ecosystem to host his chips to get the performance gains to the customer. So now you have another abstraction, and you might have loss. So I asked him, “How do you prevent loss?” And back to your point, he said, “I just picked the NVIDIA standard ‘cause I wanted to piggyback off of an existing protocol.” And what's great about NVIDIA is that reference architecture is known.

Swyx [00:32:15]: Open.

Anjney [00:32:15]: It's open. They've published it. So Jensen's actually enabled someone like Reiner to build a chip company like MatX, and I don't see them as competitive. The compute demand is so high.
Like, I think NVIDIA's not able to meet the demands of production, so we just need more chips. And I think it's very smart what MatX has done, which is say, “We're not going to innovate on the data center design, ‘cause actually, thank you, Jensen, you've done all the hard work. Where we can innovate is somewhere else.” And I think that's very healthy. I think that's how we unblock new bottlenecks. And my view is, for chip teams like MatX, who have arrived at the insight that co-design is the way, the primary bottleneck for them is the trust boundary. To do co-design well, you need visibility into the next model generation as soon as possible, ‘cause it takes two years to tape out. So if by the time I bring my chip to market, your model architecture's changed, I'm hosed. Now, when he was inside Google, he was sitting next to the Gemini team. He was on PaLM or whatever.

Trust Boundaries, Co-Design, and Researcher CEOs

Swyx [00:33:19]: His co-founder was one of the PaLM guys, I think.

Anjney [00:33:23]: Yes. Yes, exactly. So when you're inside the trust boundary of Google, then your systems co-design loop is super tight. When you leave as a founder, one of the biggest risks you take is now you're outside the trust boundary. And so what I love doing is helping chip teams who can help us unlock more capacity for the independent ecosystem get access to trust. Because I've been involved with labs from day one: I was lucky enough to work with Anthropic, and then I'm on the board of Mistral and helped Black Forest Labs get started. I think at this point I'm on six or seven different teams.

Swyx [00:33:57]: Only six?
I feel like my mental number was going to be 13, but yeah, it's-

Anjney [00:34:02]: No, I go deep with one at a time.

Swyx [00:34:04]: You're founding CEO of Arena.

Anjney [00:34:07]: Nah, that was an-

Swyx [00:34:08]: Administrative CEO.

Anjney [00:34:09]: It was an administrative five-month gig where Wei-Lin and Anastasios were graduating from their PhDs, and they didn't need a product team. So I helped recruit the head of engineering, product, and design. But Anastasios has always been the CEO of that company. I played a pinch-hitting- I'm an intern. I was CEO intern for five months.

Swyx [00:34:33]: I interviewed him, and he's very well-spoken. I think he's a former debate champion. But also very quantitative and mathematical, which is-

Anjney [00:34:41]: He-

Swyx [00:34:41]: Such a unicorn.

Anjney [00:34:43]: See, what's amazing about him? If you look at his output, he's an output maxer. By the time he was graduating from his PhD, which he only graduated last year, he had published more work, by citation count, than people twice his age. But at the same time, he'd already started a project called LLM Arena that was being used by millions of people, as a side project. And time and time again, what I've realized is venture capitalists suck at seeing human beings as dynamic agents where-

Swyx [00:35:14]: They want to put you in a box.

Anjney [00:35:15]: They want to put you in a box.

Swyx [00:35:15]: This is your thing.

Anjney [00:35:16]: So the first time I got introduced to Anastasios, somebody had told me, “Oh, he's amazing, but he's a researcher.” I was like, “What? What do you mean he's a researcher?” That's what-

Swyx [00:35:28]: Like he's not a CEO, not a founder.

Anjney [00:35:29]: Not a CEO, exactly. I was like, “Are you crazy? Have you met Dario?” Dario's a scientist. He's gone from zero to what will soon be a trillion-dollar company in four years. Being a CEO, nominally speaking, is not that hard. Being a good CEO is hard.
Being a great CEO actually requires a level of performance that scientists who have already published at the top of their field have accomplished. It is super hard to be a competitive scientist. To publish in academia over the last 20, 30 years, to make it to the top of your discipline at a place like Berkeley, you are a star athlete. Like, you are an athlete of the mind, and you perform at the highest levels. And to get there, whether you're Anastasios or Wei-Lin at Berkeley, or you are Robin, who-

Swyx [00:36:23]: BFL, yeah.

Anjney [00:36:24]: With Black Forest, who created Stable Diffusion, or if you're like Guillaume at Meta, who created Llama before he started Mistral. The amount of human leadership you have to demonstrate to get the resources, like get the trust of the organization, publish it, put it up. I would just fund researchers all day, right? If they have contributed already to the field. If they've put SOTA out there, they're star athletes already. If they haven't done SOTA, look, they can still be good CEOs, but then I find the failure mode is that they just don't want to be CEOs, they primarily want to publish, and that's okay, too. One of the things we do with the AMP Grid is we donate excess compute. We have two nonprofits, like university labs. We carved out like a couple thousand H100s. But I do think there's extraordinary research being done on university campuses. My father-in-law's a physicist. He's a professor. Extraordinary work in physics, and we need that. But if you want to be a CEO, what you need to be willing to do is be super confrontational, outside of science. Like within the scientific community, some of the best researchers are very confrontational about their convictions, right? This architecture is right.
To be a great CEO, you basically have to be willing to be confrontational up and down the stack.

Swyx [00:37:41]: To your own team.

Anjney [00:37:42]: To your own team-

Swyx [00:37:43]: To customers.

Anjney [00:37:43]: Hiring, recruiting, customers. Well, I would say, yeah, pretty much to everybody. Of course-

Swyx [00:37:50]: I feel a little bit of that in my own work, but yeah, I can't imagine the stakes that Dario has had to go through. It's pretty insane.

Anjney [00:37:56]: No, I don't think the stakes are that different from how you're feeling it, right? Stakes are personal scaling vectors, right? The stakes that seem so low to you, like having this podcast where you can talk to somebody- you're an extraordinary communicator, right? Like already in this conversation, you've pulled more out of me than most people, and I've been on 12 podcasts in the last two weeks.

AI Coachella and First-Principles Thinking

Swyx [00:38:17]: I think we've just seen each other enough that there's some base trust.

Anjney [00:38:20]: There's base trust.

Swyx [00:38:20]: And I know that I've done my homework, and I know that trust is a big deal for you, so.

Anjney [00:38:27]: I think trust is about consistency, and you and I have seen each other in the community for years, right? Like, I remember the first time we met was at NeurIPS in New Orleans. I don't know if you remember that luncheon.

Swyx [00:38:38]: Oh my God.

Anjney [00:38:39]: Reiko had set up this- Reiko's amazing, and he set up this luncheon and-

Swyx [00:38:43]: Yeah, I was like, “Who's this Discord guy?” I'm like, “Okay.” But-

Anjney [00:38:45]: No, you weren't-

Swyx [00:38:46]: You were just like, “You made some investments.”

Anjney [00:38:47]: You were much less polite. You were like, “Who's this VC?” You're like-

Swyx [00:38:51]: No, was I? Oh my God.

Anjney [00:38:53]: It was-

Swyx [00:38:53]: I'm so sorry.

Anjney [00:38:53]: It was visible on your face.

Swyx [00:38:54]: I'm so sorry.
But the introduction was bad. I didn't know who you were.

Anjney [00:39:00]: See, this is the thing about context, right? But then I think I heard your accent. And I was like, “Are you-”

Swyx [00:39:06]: Singapore, yeah.

Anjney [00:39:06]: “Are you Singaporean?” And you're like, “Yeah.” And I said, “I went to high school, JC, in Singapore.” And then the ice broke. But in the scientific community, sometimes the stakes are very high for people who haven't had what is called EQ coaching and mentorship, right? Which is like, to have scientific impact, you often need to be an extraordinarily emotionally in tune person with the folks you're trying to influence. And so what comes so naturally to you is actually a super high stakes thing to other people. And so I wouldn't assume that Dario's more stressed out than you. You'd be surprised how similar to yours, and sometimes how small, the problems are that some of the world's biggest leaders are facing. And that's what I've learned from this class. The guest speakers are Sam, Satya, Jensen.

Swyx [00:40:01]: AI Coachella.

Anjney [00:40:02]: Yeah. It's AI Coachella, right? So we got to get all the headliners, and I'm very lucky that some of these people have either mentored me over the years or I've done business with them. And when you take the performative stuff out, and any assumptions you may have about these people that you read in the press or on Twitter, we're all just humans. We're all trying to get along. And what's so special about this moment is AI, like scaling, the bitter lesson, is forcing a lot of people to revise their assumptions for how the world works and go back to first principles or go and educate themselves. So, I won't name who this person is, but I was at an event last week in Texas and ran into somebody who said, “Anjney, I came across the class.
What do you think about real time action prediction models?” And you don't know how happy it made me feel when they asked me that question. I know they've done the work. They've challenged themselves. They didn't ask me, “What do you think of world models?” They said, “What do you think of n-”

Swyx [00:41:04]: Real time action prediction.

Anjney [00:41:05]: “real time action prediction models?” World models, don't get me wrong, are cool and everything, but you and I both know that is a layer of abstraction that is sometimes not usefully precise enough. Right? Ours-

Swyx [00:41:16]: There's like four different kinds of world models.

Anjney [00:41:17]: Yes, exactly.

Swyx [00:41:18]: We've done the pod with General Intuition, by the way, which is very focused on-

Anjney [00:41:22]: Oh, cool. Yes. I love Pim. Pim is great. And this is what I love about people who've done that level of work. They realize they're not in competition with people who the rest of the world thinks they're in competition with.

Swyx [00:41:34]: Because they're not in the category, they're in the specific thing they're trying to do.

Anjney [00:41:37]: They're focused on their mission, and they have a systems understanding of the bottleneck they're trying to solve. And when somebody else says, “I'm working on real time action prediction models too,” Pim goes, “Oh, I love that person. I can learn from them.” But the minute they're like, “Oh, that person's a world model person,” it's like, “Which type of world model person?” But mostly they're just trying to figure out if it's a waste of their time, because we don't have enough time. So Pim, for example, loves this other company I work with that we've talked about, called Black Forest Labs. And he's mentioned to me multiple times that he thinks what Flux is doing is really cool. Andy Blattman came by and spoke in the class.
And what I find over and over again is, for people who do the work, who can be usefully precise enough about what is actually going on in the world of frontier research, the sense of camaraderie is still alive and well, but it gets lost sometimes when you have to abstract the technical complexities into business terms. And then the VCs are like, “How are you different from that world model?” I'm going to say, where do I even start to explain this stuff? And then the misalignment creeps in.

Leading vs. Winning in Frontier AI

Swyx [00:42:43]: This is good. Yeah, I think people listening get a sense of what it is like to operate at a real level, like yourself, rather than at the journalist level, where you have to sort of put everyone in a rough category and create a narrative of competition, and who's winning today, who's behind.

Anjney [00:42:58]: This idea of winning is so weird to me.

Swyx [00:43:03]: You do want to win. You want competitiveness.

Anjney [00:43:06]: No, I think you want to lead.

Swyx [00:43:07]: You want SOTA.

Anjney [00:43:07]: No, I think you want to lead. Yes, so you want to push the frontier. You want to push the SOTA. You want to do something that hasn't been done before. You want to capture value, but you don't want to capture so much value that people think you're unaligned with your mission or with trying to do what's best for the world. You want to capture enough value that you can keep innovating, right? And I think that people want to lead. This idea of winning and losing- again, I love Jensen. He's a leader. The mindset that he talked about on Dwarkesh's podcast, right? He's like, “I didn't wake up with a loser mindset.” I think that was awesome, right? Because he's an engineer. Dwarkesh has done the work. So, to me, it was very obvious they're talking about the same thing, they just passed each other.
They just had to, basically- Jensen has this five-layer cake abstraction of how the industry works. And Dwarkesh, I think from that podcast, had more of a pre-training, mid-training, post-training systems loop concept.

Swyx [00:44:04]: It's just a factor of who he talks to, right? Again, it's very clear.

Anjney [00:44:06]: It's the systems- It's the abstraction, the mental models, the- It's the whole-- Dude, so much of the problem in the world is reasoning by analogy. And then the assumptions that are held invisibly.

Swyx [00:44:19]: Yeah, I've said this is actually the best time in human history for first principles thinkers. Because everything you think will happen is actually now coming true.

Anjney [00:44:28]: Correct. And the venture capital community is notorious for this, where people look-- In times of uncertainty, they cling to axioms that ended up being true from the previous era, and they kind of proclaim them with confidence as if they're truths, but they're not. And it's very important to see the distinction between a heuristic and an axiom. An axiom can be proven-

Swyx [00:44:55]: Like from an internal consistency point of view.

Anjney [00:44:56]: With internal consistency. A heuristic is kind of a shortcut. And my God, the number of people I have had to put up with over the last few years who use heuristics as axioms to judge people, to judge which companies are going to succeed, or the number of people who are, “Oh, yeah, Anthropic, they're just training models right now,” but this one continue.

Swyx [00:45:22]: Because that's a B2B SaaS?

Anjney [00:45:23]: Yeah. Which, over the fullness of time, if you squint at it, maybe. But the way you arrive there is so important that you can-- you just, you can dismiss people. Here's what happened, right? What happened is Anthropic basically achieved takeoff in October of last year.
That training run-

Swyx [00:45:41]: Whatever, three seven?

Anjney [00:45:42]: I forget the numbers now, but whatever that checkpoint was-

Swyx [00:45:45]: We saw the cognition.

Anjney [00:45:46]: Yeah. Right? To those of us in the community, especially once post-training was done and it was released in December-

Swyx [00:45:52]: Yeah. Can I sneak a sneaky question in there? I don't know if you have a perspective, maybe you don't. The number one question is, how did Anthropic crack coding, right? Because Claude One, Claude Two, okay, it was part of it, but it wasn't a big deal. And the leading hypothesis is that it's a lucky dice roll that was then compounded, right? Like it was mildly better, but then they saw it and they were, “Okay, let's really invest.”

How Anthropic Cracked Coding

Anjney [00:46:17]: I had this very annoying teacher. I went to this boarding school called Rishi Valley in India, which is like this bird preserve. It's like three hundred and fifty acres of bird preserve in rural India, and there was no technology for seven years. There was this teacher, I won't name them, but I hated it every time he said this to me. He was, “Luck favors the prepared mind,” which is a common saying, but the way he delivered it always grated on me, ‘cause I was always one of those kids who got a good grade without trying very hard. ‘Cause like high, middle school is not that hard if you're generally paying attention and so on. And there was this one time where I-- But then I would get an eighty percent grade, and he would keep pushing me to say, “The reason you didn't get the ninety-five plus percent is because you're not that lucky.” And I would say, “What do you mean?” ‘Cause I would think that I deserved that grade, and I would sometimes argue with him. And he'd say, “You didn't have a prepared mind.
If you want to get lucky again-” There was basically one time where I got like ninety-five or ninety-six on this subject, and now I felt entitled. I was, “Okay, I'm going to keep doing this,” and I didn't. And then he was, “Luck favors a prepared mind. You got lucky last time, but you got to stay prepared.” And I didn't understand what he meant. Now, as I'm older, I'm, okay, these adults actually knew a thing or two. Anthropic has been the most prepared company for four years. And so then when the right context data comes in, the right developers start sending in the right context diffs, sure, you could say you got lucky, but if you ask me, they're pretty damn prepared, with paranoia, for like four years. And you have to remember, it was so hard for them to get going early on that they had to do so much more with so much less that you just have to be prepared to be so efficient.

Swyx [00:48:06]: Yes. There's numbers on their burn compared to OpenAI. I've written about it, but they are so much more efficient in their tech stack.

Anjney [00:48:14]: It's not even- It's not funny.

Swyx [00:48:14]: Not even close.

Anjney [00:48:15]: Yeah. But it's so clear, right? Like how to output max for the world. They have been prepared, and you could call that luck, but luck favors the prepared mind.

Culture, Hardship, and Anthropic's P0

Swyx [00:48:25]: This is one of those things. I was going over some of your old lectures and you were, data, people think it's a moat, and actually it's culture, and actually it's team. And there's different levels of moats, and this is the ultimate one that determines everything else. Which you can then compound.

Anjney [00:48:43]: You're saying culture is the ultimate moat? Yeah. But the thing about culture is it's very fragile. So moats, I don't think they're-- there's very few moats I found that are actually moats.
They're-- It's a nice concept, but in reality, you have to replenish your culture. Ben Horowitz was the speaker in CS153 on Tuesday, and I asked him this question about the culture bottleneck in teams because there are several AI teams-

Swyx [00:49:09]: His book, Hard Things About Hard Things.

Anjney [00:49:11]: Hard Thing About Hard Things. But more concretely, there are so many AI labs today that have all the cash they need, they have all the compute they need, and they're still not able to ship anything SOTA. And then you start seeing people leave and so on, and my diagnosis is it's the culture. And so I asked him, Ben-- He's been one of the most aggressive investors in AI labs. He goes back to this thing which resonates in my mind a lot. When I used to work at a16z, I would book a conference room, and right outside the conference room, which is closest to the toilet ‘cause it was the fastest way for me to go use the bathroom between Zoom meetings-

Swyx [00:49:45]: Oh my God, output maxing my toilet optimization. Okay, never mind.

Anjney [00:49:48]: It was not healthy in hindsight, but maybe this is TMI. But anyway, outside that conference room on the wall was this quote that was printed that said, “Culture is not a set of beliefs, it's a set of actions.” And it's by Bushido, this Japanese philosopher. And if you stop taking the actions that demonstrate the mission alignment to what you've said to your team and to the world matters to you, then your culture starts to fray. So it's not actually a moat, I would say. It's a very brittle, fragile thing that requires daily tending to, like a garden. But if you figure out the system to keep that garden tended, which I think ultimately comes down to knowing yourself, ‘cause if you're authentic and so on, you'll naturally make trade-offs that seem effortless to you, but that reinforce your culture.
And then that becomes this very hard thing for other people to catch up to. And at Anthropic, from day one, there was this missionary-like zeal and belief that, hey, these capabilities will scale. These systems are stochastic, not deterministic. There will be error bars, and until we crack interpretability, there's risk. And at some point, people will stop using Claude just for coding. They'll use it in some mission-critical context where it'll throw off a bug, and then people are going to come blame them, and they want to be on the right side of history where they said, “Yes, this is a powerful technology. We think it's going to change the world, and we want to be very measured and scientific about the fact that, ‘Hey, guys, these are stats models, statistical models.' That's how statistics works.” Ultimately, when you're training neural nets, it is just a statistical system. And I think that belief that safety is important, and that it might seem toy-like in the early days, and sometimes you could say, “Anjney, they totally over-exaggerated the risk,” like two years ago when they said, “Let's not launch Claude One,” or whatever. Well, okay, maybe in hindsight, but hindsight is twenty/twenty. And at the time, they didn't know how that model would be used, and to them it felt existential if somebody came and said, “You weren't responsible. This wrote a bug.” The liability associated with that is massive. So how do you prevent against that? Well, day in, day out, you say safety. And when you start deviating from that, you have the team hold you accountable, you have the world hold you accountable, and I think that becomes a moat over time. At some point, that moat will get challenged and so on, and then it becomes fragile. I hope it endures because that's the beauty of having founders run the show, ‘cause they can make really hard trade-offs to do mission alignment.
The hardest part is in the earliest days when you don't have a group of people who are going through difficulty, stress, crisis together, then your culture doesn't get defined sharply enough, and that's what I'm worried about right now, is there's so much money going to these labs. There's no hardship. There's no-

Swyx [00:52:50]: To anyone who knows

Anjney [00:52:51]: There's no to anyone who knows. And that, in hindsight, was a feature, not a bug for Anthropic. The number of people who said no, the number of people who said, “Sorry, we're all doing investors in OpenAI,” that is competitive difference. It forces you to really understand what is the hill you want to die on at the expense of everything else. What's the P zero? And there, P zero from day one was coding. The reason, the mechanism system there was if we crack coding, then we will crack AGI. Our mission is AGI. We want to get there safely. If we focus on codin

Open Source Startup Podcast
E198: How Unikraft Launches AI Agents in

Jun 18, 2026 (42:20)


This Open Source Startup Podcast episode has our co-hosts Robby and Tim in conversation with Dr. Felipe Huici, CEO of Unikraft, the compute layer for sandboxes, AI agents, or any workload with VM-grade isolation. Their open-source project, also called unikraft, has 4K stars on GitHub and provides a next-generation cloud native kernel. This episode explores how Unikraft is building infrastructure for the next generation of AI agents, arguing that agents should run in virtual machines rather than containers. The conversation focuses on the unique requirements of agentic workloads: fast startup times, the ability to pause and resume state, strong isolation, and efficient resource utilization at massive scale. Unikraft's technology enables lightweight virtual machines that can start in under 10 milliseconds, helping companies reduce latency, lower infrastructure costs, and run large numbers of ephemeral agents on minimal hardware. The discussion also covers emerging AI infrastructure needs such as checkpointing, branching, headless browser automation, and GPU access.

The podcast also traces Unikraft's origins from an academic research project to an open-source Linux Foundation initiative and, eventually, a startup founded in 2022. The conversation examines customer adoption, the role of Unikraft as foundational infrastructure for AI platforms, competition and collaboration within the agent ecosystem, the future of GPUs and virtualization, and lessons learned from building a company in the rapidly evolving cloud and AI infrastructure market.

Shadow Warrior by Rajeev Srinivasan
India will collapse without digital sovereignty and Pax Indica: lessons from Hormuz

Jun 18, 2026 (23:07)


A version of this essay has been published by Open Magazine at https://openthemagazine.com/world/india-will-collapse-without-digital-sovereignty-and-pax-indica-lessons-from-hormuz

By now it is clear that the Iran War (or West Asia War) has been a disaster to all concerned, including the principals as well as assorted passersby. The massive amounts spent by the US (at last count $25 billion) are at least articulated; the bill for the enormous infrastructural damage and human suffering inflicted on Gulf states, in the theater of war, must be greater, by definition.

The collateral damages suffered by the rest of the world from the cessation of trade through the Straits of Hormuz will presumably run into the trillions of dollars. As one of the worst affected, India, which imports 90% of its hydrocarbons from the Gulf, not to mention other essential items such as urea (for fertilizer), sulfuric acid, helium, etc., is on track to take a massive hit. As an article in The Economic Times said, “India must brace for broad-based economic shock”.

Indian exports of up to $50 billion are also affected, especially agricultural products including perishable foodstuffs, but also gems and jewellery, electronics, textiles and garments. Some of this can be diverted via Oman and the UAE's Fujairah port, but much of it passes through the Straits of Hormuz and is potentially blocked and/or stranded at sea.

The Hormuz closure is a body blow to India's economy. What can and will India do about it? The Indian State has a habit of rising to the challenge only when there is a crisis, while vegetating otherwise. The 1991 economic crisis is a case in point; the sanctions following “The Buddha is smiling”, and the denial of cryogenic rocket engines and supercomputers are other examples where the nation rallied. So were covid vaccines.
Necessity, they say, is the mother of invention.

Turning a threat into an opportunity

If I were to be an optimist, I could say that the current crisis is actually an opportunity. In fact, a major opportunity. My reading of the Iran War is that it is President Trump's strategic tit-for-tat against China for denying him rare earths and cutting off soybean purchases. In return Trump decided to deny China access to oil by closing access to Venezuela and Iran. Whether this will work, or whether the G2 condominium (read ‘surrender') will prevail, is unclear.

But that is, in a sense, background noise that needs to be managed. India needs to focus on its own issues, of which I see several as critical, and the solution in general is to become Atmanirbhar, self-reliant, and from that, to create an Anti-Fragile nation:

* National security/defense
* Food security
* Energy security
* Digital security/narrative control
* Trade security

The first three do not need an explanation: they are obvious. Internal and external security are pre-requisites for any successful society. If India's hard-won food security can be threatened by external threats, then there needs to be some deep introspection. Energy security means diversification, both of hydrocarbon sources, and of types of energy, including renewables, nuclear, biomass, coal-based, and so on.

Malign narratives and digital sovereignty

Narrative control is something that the Indian State has failed at so far; it is laughably easy to create hate speech against Indians and India (as has been demonstrated freely by any number of players, starting from the MAGA crowd, to Audrey Truschke to a “Cockroach Janata Party” and some nitwit Norwegian journalist in just the last fortnight) and there are no consequences to the culprits.
It's enough to make me pine for Lee Kuan Yew's aggressive legal battles against the media.

It's one thing if it were only a problem with foreigners, but with the massive spread of social media, and in particular generative AI, it is becoming a serious domestic issue. Since India is an avid consumer of social media, and because generative AI is trained on things like Wikipedia, X, WhatsApp and Google content, biased and motivated material becomes ensconced as The Truth. I have written about narrative warfare and manufacturing consent.

This used to be a one-way tsunami of (mis)-information by legacy media, but now there is also the opposite: the wholesale and free vacuuming-up of Indian data (whatever happened to “data is the new oil”?). The “Great Firewall of China” both kept out foreign Big Tech applications and prevented their plundering Chinese data: is that the way to go?

Manufactured narratives are intended for regime change: all the color revolutions today are hatched with massive bot-farms funded by some combination of Deep State, CCP, ISI, Qatar etc. (for example the alleged Gen-Z uprisings that rocked Nepal, drove Sheikh Hasina out of Bangladesh). Thus muzzling malign narratives, and ensuring data security, are imperative.

Even Singapore is not immune: it had to block anti-India narratives that likely originated from Chinese sources.

A particularly striking example of narrative warfare is the virtual hate speech inducted into Wikipedia by deeply prejudiced anonymous editors. Ashley Rindsberg, who exposed the mighty New York Times' biases in his book The Gray Lady Winked, provides many examples of this.

Of note to Indians and Hindus is his recent Substack titled “Wikipedia's India War” where he identifies just four editors as having created most of the content condemning the Hindu American Foundation (HAF) in ‘Wikivoice', i.e. the allegedly neutral perspective of Wikipedia.
They are, on the contrary, shown to be highly one-sided.

As Rindsberg mentions, Wikipedia being central to generative AI, the damage is baked into the world-view of all AI applications. Truly Orwellian. Says Rindsberg: “four… anonymous accounts can have an enormous impact on what millions of people believe to be the truth.” “Over four years (2021-2025), editors systematically erased HAF's identity as an American civil rights group, transforming its Wikipedia page into a heavily curated dossier of accusations.”

Trade, and how the Spice Route was far superior to the Silk Road

Finally, something that is becoming increasingly important: ensuring freedom of trade. This is more than just freedom of navigation, although I find it instructive that Emperor Rajendra Chola sent a huge fleet 1,001 years ago simply to open up the Straits of Malacca. India can make an active attempt to regain primacy in Indian Ocean trade, the whole Pax Indica idea.

Here is another example of the power of narrative: we have been led to believe that the Silk Road to China was some major highway of commerce between ancient Rome and ancient China, but it was a term coined only in 1877 by the German Ferdinand von Richthofen. There was no highway. A large caravan might take six months, and with 500 camels traversing treacherous deserts and braving bandits, it might carry a maximum of 100 tons. That is puny.

In comparison, on the Spice Route, a single stitched ship from Muziris could carry 400 tons of ivory, pepper, silk, tigers and elephants; and the historian Strabo around 1 CE talks about fleets of 250 ships going from Alexandria to India on a six-week monsoon-powered journey. At 400 tons per ship, that is 100,000 tons of merchandise. No wonder Pliny the Elder complained that Rome's treasuries were being emptied of gold by India.

Simple question: where are hoards of ancient Roman coins found in Asia? Answer: not along the Silk Road.
The hoards are in Kerala, Tamil Nadu and Sri Lanka.

Today, it is possible for India to aspire to port-led development of trade, especially with the major ports at Trivandrum (Vizhinjam), Maharashtra (Vadhavan), and Great Nicobar (Galathea Bay). The underlying ‘software' of India's millennia-old trade competency was a ‘multi-protocol switch' as I pointed out, and today's India Stack can replicate that. Then there is the need for a blue-water navy: muscle to provide security on the Hormuz to Malacca sea-lanes.

So there is a vision. How can India get there? This is where policy matters, as I discussed with policy expert Anuj Gupta. Policy, especially industrial policy, has had a bad reputation in certain circles because it was deemed to violate the virginal purity of classical capitalism. However, in a recent U-turn, even the World Bank admitted that industrial policy may not be all that bad, after all: the success of Japan, the Asian Tigers, and China can't be ignored.

That leads to the question of why policy in India has produced mediocre outcomes, what is different now, and where the best use of policy might be.

Industrial Policy: What went wrong in the past?

There are many problems here. To begin with, the Soviet model, which Nehruvians swore by, was, in hindsight, a dead end. Second, there is the problem of governance: post-Independence bureaucrats have awkwardly borne the legacy of imperial hauteur and the needs of a developing society. Third, until recently, the bare necessities (food, electricity, road access) were not available to many citizens, and GDP growth was not their priority.

There is also the culture of jugaad: of clever ways in which you overcome constraints through frugal improvisation and seat-of-the-pants making-do. This is fine for one-off things (e.g. converting a tractor trailer into a makeshift transport vehicle because your truck broke down), but it does not make for efficient and replicable industrial products.
As The Economic Times said recently, it is time to junk jugaad. Quality has to become ingrained in people's minds.

The issue of governance is significant: the bureaucracy and the judiciary have both under-performed, and politicians, as everywhere, have been venal. It is said that China's growth can be attributed to the fact that its babus are engineers, and therefore with engineering ruthlessness move in straight lines. The US' babus are lawyers, and India's are humanities graduates. Well, engineers are not very good at second-order effects (e.g. China's lurch from one-child policy to demographic collapse), but a little bit of ruthlessness is probably good.

What is going reasonably well?

There are a few modest success stories: for example, in electronics manufacturing or assembly. The PLIs (and DLIs) have produced the desired effect, with clusters of excellence where global suppliers have also set up shop (as they did earlier for the automobile industry in, say, Sriperumbudur). The fact that a lot of iPhones in the US are now imported from India is laudable, even though it may be derided as “screwdriver jobs”. That's where one starts the move up the value chain.

The current semiconductor policy is a big hope, especially after the landmark agreement by the Dutch firm ASML with Tata Electronics in Dholera, Gujarat. Given that ASML has a near-monopoly position in Deep Ultraviolet Lithography (DUV), this is a major boost to India's chip ambitions. My recent conversation with AMD CTO Suraj Rengarajan went into India's chances to realize its ambitions.

A recent announcement from Trivandrum-based fabless startup NetraSemi (a recipient of DLI) of the commercial availability of its edge AI chips is a landmark.

Next is the newly announced plan for energy security revolving around both coal gasification and intensive offshore exploration. These fall squarely into the Atmanirbhar category: India simply cannot afford to have its energy held hostage by distant nations.
It also needs distinctly Indian innovation.

The Samudra Manthan initiative is also showing some promise. At least one out of three deep-water wells in the Andaman Sea (SriVijaya Puram-3) is reported to be showing the availability of natural gas, although it will take 5-10 years for this to be commercially available.

What should the future look like for India's Industrial Policies?

This of course is the hard question. Here is my personal perspective, and I accept that reasonable people may disagree. I think three areas need to be focused on, and will pay large dividends.

* Drones and swarming software
* Social media and AI stack
* Maritime Trade and Blue-Water Navy

I admit that these are not the only worthwhile industrial policies. Another is for copper, which would reverse the catastrophic effects of the closure of the Sterlite plant in Thoothukkudi, as the metal is an increasingly important component in electronics, data centers, etc., and having been self-sufficient earlier, India now imports 50% of its needs. Another area of interest is quantum computing.

There are also failures from which the right lessons need to be learned. The policy for EV batteries has apparently failed: according to Swarajya magazine, India has not been able to escape from near-total dependence on imported Chinese batteries.

Drone swarms

I wrote recently that drones may well herald a step-change in warfare. For the moment, though, they are searching for their niche in offensive/defensive warfare. Drone hardware is already a well-trodden path with China and other nations dominating it, although with IdeaForge, Paras, Garuda, IoTechworld Avigation etc., India is also making progress there. And India is indeed buying the hardware, $2 billion-worth, according to the Economic Times.

But I believe the real game is in drone swarms.
AI-based control software (similar to HiveMind) that would allow an entire swarm to act autonomously, just like a murmuration of starlings, would be the gold standard to aim for. Such a self-managing swarm would be virtually impossible to defend against, and I think India should put in place a PLI to support it, leveraging software capability in the country.

Of course, drones are not just for military purposes, but also for commercial uses including things like logistics and agricultural use, such as precision delivery of fertilizer and pesticide to crops (as Garuda demonstrates). An Indian initiative that supports both drone hardware, and especially drone software, would be a potential winner.

Digital Sovereignty: Social media and AI stack

There is a raging battle over which part of the AI stack India needs to invest in. As an old Unix hand, I believe the foundational model is not where the differentiation is. In analogy with Linux (the open-source Unix variant that was popularized by Linus Torvalds and an army of volunteers), there is little value in re-writing the operating system, but one can differentiate by building on top of it, or by judiciously choosing certain modules of it.

Besides, the cost of building an entirely new foundational model would be astronomical and would consume the entire budget of IndiaAI Mission.

Thus, my personal opinion is that the foundational model (especially when, it is believed, there are more or less open-source models available for free, e.g. Llama, DeepSeek) is not where India should expend its precious R&D resources; it should spend them on the layers of the stack above it. It is the data that matters, as Larry Ellison apparently suggests too.

But there is the interesting counter-example of Sarvam AI which is producing its own sovereign model: multi-lingual and presumably otherwise tuned to Indian needs.
The question is whether this can survive when hundreds of billions worth of capital investment are going to the US Big Tech companies and their Chinese rivals. The sad history of Koo, a Twitter rival, comes to mind. So does Arattai, a WhatsApp rival, whose popularity has waned.

A well-thought-through industrial policy on generative AI is therefore essential. The status quo is unsustainable; given the fact that Sarvam has also found it difficult to raise funds in the US, it is worth pondering whether a China-style massive subsidy is the answer. And where should it go, into foundational models or into the layers of the stack above it? The answer is “both”, but with priority to the latter.

Here is where I would prioritize investments, in order:

* Vertical applications in specific domains: e.g. defense, healthcare, agriculture, governance (particularly in the judiciary and in ease of doing business in the bureaucracy)
* Fine-tuning and customization: for the needs of the Indian context, e.g. multi-linguality under Bhashini
* Compute infrastructure: GPUs, sovereign and protected Indian datasets
* Sovereign Small-Language Models such as Sarvam AI

As mentioned above, at the moment India's data is being sucked up for free by US Big Tech. In addition, there is the real danger that Indic Knowledge Systems will be mined and digested, as has happened to yoga, pranayama, etc., which have been given Western analogs and nomenclature, as in Pilates, ‘coherent breathing' etc.

These two problems are connected, and both need to be tackled in parallel. Social media is being weaponized against India, and this is magnified by the legacy media in a positive feedback loop.
Three examples: one was the rage against Adani based on the dubious research of Hindenburg, which then went under; the second is Bloomberg's reckless accusation about gold reserves being sold by the RBI, which they were forced to retract, but social media and Wikipedia will remember it; the third is the meteoric (media) rise of the Cockroach Janata Party.

Trade using major ports, Digital Public Infrastructure and a blue water navy

Using trade for competitive advantage is an age-old tactic. The trade tiffs between the US and China are examples of this: we are witnessing war by other means. Many nations are getting into this act, and India does have some advantages, partly based on geography. Maritime trade is likely to continue to be the key, which makes naval chokepoints the big story, but not the only story to watch out for.

The major aspects of maritime trade include infrastructure, the digital “multi-protocol switch”, and security. On the one hand, India is developing not only major container ports, and the road/rail links to get to them, and the industrial goods to ship out through them, but also a serious shipbuilding industry, which was one of India's historical strengths. Then it used to be stitched wooden ships (teak beams lashed together with coconut rope). Now it's modern steel ships.

There are the big, efficient new ports, which can now turn ships around with Singapore-like efficiency; the proposed third aircraft carrier group which will make it possible to patrol the Arabian Sea and the Bay of Bengal at the same time; the Air-Independent Propulsion diesel submarines and nuclear submarines that can monitor (and if necessary, deny) narrow straits; the sale of supersonic Brahmos cruise missiles to the Philippines, Vietnam and Indonesia (and Cyprus) that create ship-denial zones: all this is muscle.

And the final piece, the ‘software' for trade, the “multi-protocol switch”. This last is complicated. Its value is underestimated by many.
But this is what enables friction-less transactions between various unrelated parties. The India Stack and the Digital Public Infrastructure can be utilized to provide such a facility. But it is complex enough to need significant study as to what is possible, and how to roll it out.

Second-order effects

In closing, it is worth considering some of what the (unintended) consequences of these proposals may be. Let us note that the G2 has no interest in allowing India to grow and make it a G3. They will do everything in their power to kneecap India, by all means possible.

There is also a certain derision for India in some circles. Here is a generic western opinion on why China got rich, and India didn't. Well, the author doesn't consider the second-order effects of the wholesale destruction of Chinese civilization: that is a tradeoff Indians may not prefer for themselves. We all know how China's well-intentioned One Child Policy turned into demographic collapse within a few years. Besides, as The Economist asks, “China is innovative. Its economy is a mess. Which will win out?”

This is why I think planning for these second-order effects is important. We tend to ignore them because they seem counterintuitive or unlikely, but Nassim Taleb has sensitized us to how low-probability Black Swan events can have grave consequences.

As an example, attempting digital sovereignty may have unwelcome side-effects: Big Tech have the first-mover advantage and network effects and there are increasing returns to scale. They will surely make it hard for a new player to break in.
Besides, the large investments in data centers and GCCs that they are making in India would make it very difficult for them to be ejected with a “Great Indian Firewall”.

Even taxing their capture of Indian data will be complicated; not to mention that they have demonstrated that they can happily violate copyright laws with no consequence; therefore they will find ways to chew up and spit out Indian Knowledge Systems, and essentially re-colonize India. Digital colonialism is not a threat, it is a reality today, and it is a consequence of the relatively open Indian system.

In addition, there is a malign group, the “barbarians within” as Arnold Toynbee once put it, who are ready to sacrifice Indian sovereignty for a pittance.

Given all this, it will be very difficult to put in place serious measures to gain digital independence; and the narrative-peddling is likely to gain further momentum: just consider the caste allegations that have haunted BAPS in the US (despite the cases being dismissed by the US DoJ), the Cisco Systems case where, again, the case was dismissed, but the narrative continues, and the persistent efforts in various US states to turn caste into a weapon to bludgeon Indians.

Another sensitive issue is that of the multi-protocol switch for trade. From an Indian point of view, it eases trade and harks back to a Golden Age of Indic maritime commerce, but it will be viewed very differently elsewhere, for instance by the US as an attempt to de-dollarize. The US has jealously guarded – with very good reasons that we will not go into here – the dollar's reserve currency status.

We have also seen what happened to those who attempt to hurt the dollar's primacy: in 1985, the Plaza Accord devalued the dollar, and that was a body blow to Japan's economy, which has not recovered its mojo to this day.
Later, Iraq's Saddam Hussein and Libya's Muammar Gaddafi both had ideas about replacing the petro-dollar with, respectively, the Euro and a new pan-African gold-backed currency. We know what happened to them.

If the India Stack multi-protocol switch is perceived as an alternative to the US dollar, there may be grave consequences. Therefore, it should be conceived and deployed only as an adjunct to it and to the almighty SWIFT settlement system.

Conclusion

India is at a crossroads now. Even though the Hormuz closure is a serious problem, if India plays its cards right, adversity can be turned into opportunity across a variety of perspectives. The key is Atmanirbhar, self-reliance. If India can now implement a crash program of industrial policy, and at the same time overcome an ingrained Third-World tendency to cut corners, it can finally break free of the years of underperformance, what I called the Nehruvian Penalty in 2004.

It is possible, but there are caveats: unforeseen consequences. Hic sunt dracones. Here be dragons. Be afraid. Be very afraid.

3700 words, 7 June 2026

This is episode 192 of the Shadow Warrior podcast. Here is a companion AI-generated slideshow. (Note that the borders of India are not necessarily depicted correctly here, because it is generated by an AI, notebookLM.google.com)

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit rajeevsrinivasan.substack.com/subscribe

Let's Talk AI
#248 - Fable 5, Siri AI, IPOs, Policy on the AI Exponential

Jun 17, 2026 · 100:43


Our 248th episode with a summary and discussion of last week's big AI news!

Recorded on 06/12/2026

Note: we recorded just before the OTHER big news about Fable... we'll discuss it on the next episode.

Hosted by Andrey Kurenkov and Jeremie Harris

Feel free to email us your questions and feedback at andreyvkurenkov@gmail.com and/or hello@gladstone.ai

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

In this episode:
- Anthropic released Claude Fable 5 (a safeguarded version of Mythos 5), showing major benchmark jumps and new risk findings in its system card (eval awareness, transgressive actions, CBRN concerns), alongside controversy over severe guardrails and silent downgrades.
- Apple announced Siri AI at WWDC, positioning a more capable conversational assistant integrated across iPhone features, reportedly built on a custom Gemini partnership; Google also rolled out Gemini 3.5 Live Translate and cut Google AI Plus pricing while bundling more storage.
- Business and infrastructure updates include OpenAI's confidential IPO filing amid an IPO race with Anthropic and SpaceX, Bezos-backed Prometheus raising $12B for “physical AI,” DeepSeek seeking a major external round, and Google paying SpaceX about $920M/month for GPUs.
- Open-source, safety, and policy developments feature new Gemma 4 and Diffusion Gemma releases, a lab letter urging DNA/RNA screening laws, Amodei calling for an FAA-like AI regulator and third-party testing, research on agent harms and RL “societal hacking,” and a dispute over music-label settlements with Suno/Udio.

Timestamps:

(00:00:10) Intro / Banter
(00:01:11) News Preview
(00:01:53) Sponsors

Tools & Apps
(00:04:53) Claude Fable 5 and Claude Mythos 5 + Anthropic apologizes for invisible Claude Fable guardrails
(00:27:06) Apple announces Siri AI and its next generation of Apple Intelligence | The Verge + I tried Siri AI, and so far it actually works
(00:33:47) Gemini 3.5 Live Translate rolling out to Google Meet and Translate
(00:35:39) Google just fired a warning shot in the AI subscription price wars | TechCrunch

Applications & Business
(00:37:55) OpenAI Confidentially Files for IPO on the Heels of SpaceX and Anthropic | WIRED
(00:41:57) Jeff Bezos's Prometheus raises $12B to build an 'artificial general engineer' for the physical world | TechCrunch
(00:45:39) DeepSeek slated to raise $7 billion in maiden funding round, sources say
(00:48:18) Huawei-led team claims it post-trained DeepSeek's 1.6-trillion-parameter model - 1,000 Ascend 910C chips used in training
(00:51:57) Google will pay SpaceX $920M per month for compute | TechCrunch
(00:55:51) Elon Musk Shows Off AI Data Centers SpaceX Wants to Send Into Space - Business Insider

Projects & Open Source
(01:01:14) Google's new Gemma 4 12B model is designed to run on any laptop with 16GB of RAM - Ars Technica
(01:05:13) Google AI Releases DiffusionGemma, a 26B MoE Open Model Using Text Diffusion for Up to 4x Faster Generation - MarkTechPost

Policy & Safety
(01:09:42) OpenAI and Anthropic Sign Letter to Prevent AI-Developed Biological Weapons | WIRED
(01:14:04) Anthropic CEO publishes lengthy article: AI is moving too fast, and policies can't keep up. | PANews
(01:20:18) Anthropic Urges Global Pause in AI Development, Flags ‘Self-Improvement' Risk - WSJ
(01:24:46) When Benign Inputs Lead to Severe Harms: Eliciting Unsafe Unintended Behaviors of Computer-Use Agents
(01:27:42) Large Language Models Hack Rewards, and Society
(01:33:46) Senior US officials eye government shares in AI giants

Synthetic Media & Art
(01:37:45) AFM Sues UMG, WMG Over Settlements With Suno and Udio

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Learn Cardano Podcast
NuNet Makes Decentralised Compute Easier to Use With Its New Appliance

Jun 17, 2026 · 33:26


NuNet is building a decentralised compute and orchestration network where people can contribute spare CPU, GPU, RAM and other resources, while developers and organisations can deploy workloads across available infrastructure. In this episode, Peter talks with Jennifer from NuNet about the new NuNet Appliance and why it matters for making decentralised compute more practical for everyday users.

The conversation covers how NuNet matches the right compute to the right job, how the Appliance lowers the barrier to onboarding devices, and why use cases like n8n automations, private AI agents, edge AI, Cardano SPO infrastructure and web deployment workflows are a natural fit for the network. Jennifer also explains NuNet's zero-trust security model, pricing approach, organisations, ensembles, deployment templates, and how NTX fits into orchestration fees.

If you have spare compute, want to run private AI workloads, or are building in the DePIN and Cardano ecosystem, this episode gives a practical look at how NuNet is moving from concept to usable infrastructure.

Key Takeaways:
- NuNet is a decentralised compute and orchestration platform that lets people contribute spare compute and lets workloads find suitable resources automatically.
- The NuNet Appliance is designed to make onboarding CPUs, GPUs, RAM and other compute resources much easier for non-expert users.
- NuNet can support broad workloads, including n8n automation, private AI agents, Qwen-based LLM deployments, edge AI, web builds and Cardano SPO infrastructure.
- The network uses a zero-trust model where machines are cryptographically identified and verified at each interaction.
- Compute pricing is designed around stable currency values, with automatic conversion into NTX rather than forcing users to price workloads directly in a volatile token.
- NuNet organisations can let other DePIN projects bring their own communities and native tokens while still using NuNet's orchestration layer.
- Ensembles and templates are intended to simplify deployments so users do not need to manually understand every YAML configuration detail.
- NuNet is open source, with docs, GitLab, Discord, Medium and X available for people who want to try the network or contribute.

Links & References:
- NuNet - Compute Orchestration for a Decentralized World: https://link.learncardano.io/eGKGuZ
- What is NuNet? | NuNet Documentation: https://link.learncardano.io/rHu2E4
- x.com: https://link.learncardano.io/NIhPKR
- https://link.learncardano.io/Tlu7wN

Website: https://link.learncardano.io/bQ68Rc
X/Twitter: https://link.learncardano.io/3a1Qtv

Disclaimer: This content is for educational purposes only. Nothing constitutes financial advice.

DISCLAIMER: This content is for informational and educational purposes only and is not financial, investment, or legal advice. I am not affiliated with, nor compensated by, the project discussed; no tokens, payments, or incentives received. I do not hold a stake in the project, including private or future allocations. All views are my own, based on public information. Always do your own research and consult a licensed advisor before investing. Crypto investments carry high risk, and past performance is no guarantee of future results. I am not responsible for any decisions you make based on this content.

SPACInsider
Dr. Wasiq Bokhari, CEO of Quantum Computing Firm Pasqal

Jun 17, 2026 · 31:42


This week, we speak with Dr. Wasiq Bokhari, CEO of quantum computing firm Pasqal, to discuss its $1.98 billion combination with Bleichroeder Acquisition Corp. II (NASDAQ:BBCQ). Quantum computers are plugged in and doing real commercial work, but how can you get them to collaborate with CPUs and GPUs on a single problem? Pasqal believes it may have the answer. Wasiq explains how the company has worked to build a full-stack approach to commercializing quantum solutions and where it has already demonstrated quantum advantage. He also gets into how it is developing software to make engaging with quantum computers easier, and how it can break down problems into the parts that quantum computers, CPUs and GPUs are each best placed to solve fastest.

Remotely Curious
Why don't more AI tools understand what matters to you?

Jun 16, 2026 · 29:43


How do you build AI that actually understands you and the work you do? It all starts with having the right context.  We talk with Dropbox staff product manager Noorain Noorani and principal engineer Sean-Michael Lewis about the art of context engineering and how Dropbox connects to all the tools your team needs for work—so you get AI that works wherever you do.  ~ ~ ~  Working Smarter is brought to you by Dropbox. Find, organize, and share your work—all in one place—with context-aware AI from Dropbox. You can listen to more episodes of Working Smarter on Apple Podcasts, Spotify, YouTube, Amazon Music, or wherever you get your podcasts. To read more stories and past interviews, visit workingsmarter.ai This show would not be possible without the talented team at Cosmic Standard: producer Ben Montoya, sound engineer Aja Simpson, technical director Jacob Winik, and executive producer Eliza Smith. Special thanks to our illustrator Fanny Luor, marketing consultant Meggan Ellingboe, and editorial support from Catie Keck. Our theme song was composed by Doug Stuart.  Working Smarter is hosted by Matthew Braga. Thanks for listening!

TechSurge: The Deep Tech Podcast
Battle for the AI Data Center: Deep Dive on the Semiconductor Supercycle

Jun 16, 2026 · 53:23


Semiconductors have moved from the background of the technology stack to the center of the AI economy. What used to be a specialized industry discussed mostly by engineers and investors is now shaping the speed, cost, and strategic direction of modern computing.

In this episode of TechSurge, host Michael Marks speaks with Stacy Rasgon, Managing Director and Senior Analyst covering U.S. semiconductors and semiconductor capital equipment at Bernstein Research. Stacy has spent years analyzing the chip industry across cycles, but argues that the current moment feels different in scale: AI demand has created an unprecedented scramble for compute, memory pricing has surged, and companies across the stack are being forced to rethink capacity, architecture, and capital allocation.

The conversation explains the four different kinds of semiconductor cycles (supply, inventory, product, and demand) and why Stacy believes the industry is currently in a demand cycle of unusual magnitude. The discussion also unpacks the distinction between DRAM and NAND, why high-bandwidth memory is becoming strategically central to AI systems, and how the physical realities of wafer capacity and silicon area are constraining supply in ways the broader market often misses.

Stacy and Michael also discuss the hardware economics behind the current boom, with Michael pressing Stacy on why compute remains so scarce and how companies are improving performance through packaging and system design. Michael then moves the conversation beyond market headlines to the core business questions: who is actually paying for this compute, which use cases are generating real revenue, and whether AI spending is creating durable economic value or simply shifting costs elsewhere. Together, these questions highlight two of the episode's clearest insights: coding may be one of the earliest AI applications with meaningful willingness to pay, and inference, not training, is the real test of whether the current buildout becomes a lasting business or just another expensive wave of infrastructure.

Stacy explains the concentration of power among the major wafer fabrication equipment players, the rise of ASICs as a meaningful share of AI silicon, Broadcom's rapidly expanding AI opportunity, and the growing role of Chinese companies as new entrants, especially in memory and semiconductor equipment. Along the way, the conversation asks the defining question facing the sector: is this just another semiconductor upswing, or the first true supercycle the industry has seen? Stacy believes that this might be the biggest supercycle he has seen in his career.

Sign up for our newsletter at techsurgepodcast.com for updates on upcoming TechSurge Live Summits and future episodes.

Links:
Stacy Rasgon on LinkedIn: https://www.linkedin.com/in/stacy-rasgon-6924963
Bernstein: https://www.alliancebernstein.com/corporate/en/home.html

References Mentioned During the Discussion
NVIDIA Blackwell Platform: https://www.nvidia.com/en-us/data-center/blackwell-platform/
High Bandwidth Memory (HBM) overview from Micron: https://www.micron.com/products/memory/hbm
DRAM overview from IBM: https://www.ibm.com/think/topics/dram
NAND flash overview from IBM: https://www.ibm.com/think/topics/nand-flash-memory

Further Reading
McKinsey on the semiconductor industry outlook: https://www.mckinsey.com/industries/semiconductors/our-insights/the-semiconductor-industry-in-2025
Semiconductor Industry Association: 2025 State of the U.S. Semiconductor Industry: https://www.semiconductors.org
NVIDIA on the Blackwell architecture and AI infrastructure roadmap: https://www.nvidia.com/en-us/data-center/blackwell-platform/
Broadcom AI investor materials and infrastructure commentary: https://investors.broadcom.com
ASML on lithography and advanced chip manufacturing: https://www.asml.com/en/technology
Micron on HBM and AI memory demand: https://www.micron.com/products/memory/hbm

Chapters
[00:00:00] - Highlights
[00:00:26] - Welcome to the Episode
[00:01:29] - Meet Stacy Rasgon
[00:02:01] - Is This the First Real Semiconductor Supercycle?
[00:05:33] - Inside the Strongest Memory Cycle in History
[00:09:14] - Can Innovation Keep Up With AI Demand?
[00:11:33] - Chiplets, Blackwell, and the New Economics of Compute
[00:12:37] - What Could Signal the Cycle Is Slowing
[00:14:26] - Vertical Integration at the Hyperscalers
[00:16:36] - The Difference between Apple and Meta
[00:17:15] - What is Vertical Integration Being Done For?
[00:18:15] - Will other bottlenecks develop as This Progresses?
[00:21:13] - Oligopoly Pricing in the Market
[00:22:22] - Any New Entrants into Memory?
[00:23:46] - Why the Industry Must Pivot From Training to Inference
[00:25:10] - Agentic Coding and the First Real AI Revenues
[00:26:57] - Groq, Low-Latency Inference, and What GPUs Cannot Do Alone
[00:29:28] - Could The Smaller Companies All be Bought Up?
[00:30:19] - Why Semiconductor Equipment Matters More Than Ever
[00:31:00] - How Semiconductor Equipment is Affected by the Cycle
[00:32:55] - A Long Upcycle for Semiconductor Equipment Guys?
[00:33:13] - The Big Five and the Rise of Chinese Equipment Players
[00:34:24] - The Effects of Geopolitics
[00:35:02] - Broadcom's Quiet AI Breakout
[00:40:46] - ASICs vs GPUs and the Next Wave of Custom Chips
[00:41:06] - Intel, Foundry Strategy, and the Long Turnaround
[00:46:46] - The Risks the Market May Still Be Underestimating
[00:49:32] - Where Startups Still Have Room to Win
[00:50:39] - What the Semiconductor Industry Could Look Like Next Year

The EVA podcast
Airside International Summer 2026 - Powered by AI

Jun 16, 2026 · 21:32


As I write this note, a heatwave is beating down on the UK, signalling the arrival of the busy summer travel season. With air traffic reaching record highs globally, airports and airlines are under increasing pressure to maintain efficient GSE operations, ensure smooth turnarounds, and uphold high safety standards for both passengers and ground handling teams. In this issue, we bring you the latest developments in the GSE space, focusing on ground power units (GPUs), water and lavatory vehicles, and equipment leasing and rental. Electrification continues to be at the forefront of the minds of aviation stakeholders. To learn more, I visited ITW GSE's factory in Odense, Denmark, as well as Rushlift GSE's operation at Gatwick Airport, to find out about the companies' approaches to electrification and to discover how new technologies are transforming GSE design and operations. Moreover, I caught up with Aviator Airport Alliance at IGHC Cairo to gather insights on how the Nordic ground handler is approaching eGSE transition. I also spoke with Mathieu Blondel, co-author of a report on the topic, about the opportunities and challenges associated with decarbonising ground operations. While sustainability is evidently a key focus for the industry, safety on the apron remains a pressing issue. March saw a tragic incident at New York's LaGuardia Airport, in which an Air Canada plane collided with an Oshkosh Striker 1500 airfield rescue and firefighting (ARFF) vehicle. Megan Ramsay explores the circumstances that led to the accident, as well as wider advancements in ARFF technology and design. An additional challenge in flight safety is also emerging: bird strikes, which can result in serious damage to aircraft and, in rare cases, have caused engine failure. Tony Harrington investigates whether enough is being done to tackle the issue. We also welcome back a guest writer, Mark Finch, who pens an insightful article on GSE pooling.

Beyond The Valley
AI Agents Are Coming to Your Devices: Qualcomm CEO Cristiano Amon

Jun 16, 2026 · 49:39


Qualcomm CEO Cristiano Amon says artificial intelligence is set to change the way people use smartphones, even if the devices themselves are not going away.

Speaking to CNBC's Arjun Kharpal on “The Tech Download,” Amon said phones will increasingly be operated by AI agents that can carry out tasks on behalf of users, from managing apps to interacting with services across the internet. He described agents as a major shift for the mobile industry, comparing their emergence to the rise of apps in the smartphone era.

Amon also said new categories of personal AI devices are beginning to take shape, including smart glasses, pins, pendants and other wearables. He said glasses are a natural fit for AI because they sit close to a user's eyes, ears and mouth, allowing models to process what people see and hear in real time.

The Qualcomm chief also discussed what the shift means for the semiconductor industry, including the need for more powerful and efficient chips in phones, PCs, glasses, cars and other connected devices. He said AI is forcing a rethink of chip architecture as devices increasingly rely on a mix of CPUs, GPUs and neural processing units to run models across both the device and the cloud.

Amon also pointed to memory shortages and wider supply-chain constraints as key challenges for the industry, while arguing that the rise of AI devices could bring new players into consumer electronics.

Subscribe to “The Tech Download” wherever you get your podcasts.

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Into the Impossible
Roman Yampolskiy: AI Can't Be Controlled — and We're Building It Anyway

Jun 15, 2026 · 83:00


Roman Yampolskiy has spent two decades trying to prove that superintelligent AI can be controlled. He couldn't. I invited him on to make his case. Subscribe if you want science with evidence, not speculation.

Roman is a professor of computer science at the University of Louisville and one of the earliest researchers in AI safety. His book AI: Unexplainable, Unpredictable, Uncontrollable started as an attempt to solve the alignment problem. After decades of work, it became a proof that the problem cannot be solved. Not difficult. Mathematically impossible.

I push back hard. We go after the Einstein test: can a large language model trained only on pre-1911 physics reproduce what Einstein did with the same data? We ran that experiment. It failed. Roman and I disagree about what that means. We also get into the halting problem and what it actually tells us about predicting smarter-than-human behavior, whether value alignment is a real problem or a well-funded category error, the case for a government moratorium on frontier model development, and why Roman thinks giving an AI agent access to your computer is the dumbest thing a smart person can do.

What you'll hear:
- Whether AI control is mathematically impossible or just unsolved
- Why Roman thinks all current AI safety work is security theater
- What the halting problem actually means for superintelligence
- The alignment problem: real issue or well-funded category error
- Why Roman wants a moratorium on frontier model development
- What to tell your kids about careers in a world where Roman might be right

If you listen to other people, the best you can become is average.

CHAPTERS
00:00 Creating a mind without an off switch
01:34 Solving problems beyond our own intelligence
04:08 Einstein's epiphany and the limit of AI intuition
08:18 Assessing the Einstein test: Why the experiment failed
12:22 Path dependency: Are LLMs and GPUs our QWERTY?
16:10 The barriers preventing AI from solving physics
21:54 Safety vs. Capability: Why toddlers are safe but teens are not
23:06 The halting problem: Predicting agents smarter than us
25:58 The impossibility of a system proving its own integrity
28:18 Regulation: Genuine safety or a gift to oligarchs?
33:28 Is human cognition non-computable? Penrose vs. the field
39:00 Ethical duties: Must we treat AI with humanity?
43:00 From internet memes to monsters: Decoding the book cover
46:22 Customized realities: Can everyone have their perfect world?
49:50 Von Neumann probes and the panspermia hypothesis
55:02 Categorizing AI: The one version that should terrify you
58:22 Pause AI: The movement for a development moratorium
59:58 Career advice for kids in a post-professional world
01:07:58 Cross-examining Sam Altman
01:15:48 Roman's dream debate
01:19:50 Lessons for a younger self

Substack: https://briankeating.substack.com
Get the transcript, fascinating bonus content, and my Monday M.A.G.I.C. Message: https://briankeating.com/yt
Have a .edu email and live in the USA? You automatically win a meteorite: https://BrianKeating.com/edu
Subscribe: https://www.youtube.com/DrBrianKeating?sub_confirmation=1
Support Into the Impossible on Patreon, get my weekly M.A.G.I.C. Message, unfiltered bonus content, and live monthly Office Hours with me: https://www.patreon.com/drbriankeating
Join this channel for perks, monthly Office Hours, and your name in the Member Roster at the end of every episode: https://www.youtube.com/channel/UCmXH_moPhfkqCk6S3b9RWuw/join

Featured Guest:
Roman Yampolskiy on Twitter/X: https://x.com/romanyam?lang=en
AI: Unexplainable, Unpredictable, Uncontrollable: https://www.romanyampolskiy.com/books/

My books:
Losing the Nobel Prize (memoir): http://amzn.to/2sa5UpA
Think Like a Nobel Prize Winner: https://a.co/d/03ezQFu
Focus Like a Nobel Prize Winner: https://a.co/d/hi50U9U
Galileo's Dialogue (first-ever audiobook): https://a.co/d/iZPi9Un

Twitter/X: https://x.com/BrianKeating
Substack: https://briankeating.substack.com
Blog: https://briankeating.com/blog
Audio-only: https://briankeating.com/podcast

#intotheimpossible #briankeating #AIrisk #artificialintelligence #aisafety #podcast #superintelligence #RomanYampolskiy

Learn more about your ad choices. Visit megaphone.fm/adchoices

Tech Disruptors
DDN CEO Bouzari on Solving AI Data Bottlenecks

Jun 15, 2026 · 48:31


As AI infrastructure scales up, the conversation is moving beyond graphics processing units (GPUs). Faster compute creates new pressure on the data layer, and companies are increasingly focused on whether their infrastructure can move, manage, protect and deliver data fast enough to keep AI systems productive. In this episode of Tech Disruptors, Bloomberg Intelligence analyst Woo Jin Ho speaks with Alex Bouzari, CEO of DDN, about the company's role in AI data infrastructure, the shift from storage to broader data platforms, DDN's high-performance computing heritage, its work with Nvidia and the business-model changes taking shape as AI moves from experimentation to production.

The Tech Blog Writer Podcast
How Paradigm4 Is Helping Organizations Remove Hidden AI Bottlenecks

Jun 13, 2026 · 22:44


What happens when a company focused on drug discovery and life sciences encounters a data problem that nobody else seems able to solve? Recorded at the IT Press Tour in Boston, this episode explores the fascinating story behind Paradigm4 and how a challenge in large-scale biomedical research ultimately led to the creation of flexFS, a cloud-native filesystem designed to tackle some of today's biggest data infrastructure challenges. Joining me on the podcast is David Freund from Paradigm4, who shares how the company was originally founded to help scientists work with enormous datasets in fields such as genomics, bioinformatics, and precision medicine. As researchers began working with population-scale datasets such as the UK Biobank, the team discovered that existing storage technologies either couldn't deliver the performance they needed, lacked the functionality required, or became prohibitively expensive at scale. Our conversation explores the moment Paradigm4 realized it would need to build its own solution, why traditional approaches to cloud storage often struggle under modern analytics workloads, and how flexFS emerged from a real-world customer problem rather than a technology trend. David also explains why object storage has become such an attractive foundation for modern infrastructure, while discussing the challenges of latency, performance, and cost that still need to be addressed. We also discuss why many organizations investing heavily in AI infrastructure may be overlooking one of the biggest constraints on performance. While much of the industry conversation focuses on GPUs and compute power, David argues that data access, movement, and management are becoming equally important considerations as AI workloads continue to grow. Along the way, we touch on cloud independence, resilience, large-scale analytics, and why flexibility across cloud providers is becoming an increasingly important requirement for enterprise technology leaders. 
Whether you're working in AI, life sciences, cloud infrastructure, or enterprise data management, this episode offers an interesting perspective on how customer problems can sometimes lead to entirely new categories of technology. Could the next major AI bottleneck be data rather than compute? And are organizations paying enough attention to the infrastructure feeding their most important workloads? I'd love to hear your thoughts.

Eight Minutes
Space-Based Data Centers: Understanding the Hype Behind SpaceX's IPO - Episode 124

Jun 12, 2026 · 8:03


Let us know how we're doing - text us feedback or thoughts on episode content

SpaceX just went public at a $1.7 trillion valuation, and a big chunk of that bet is on orbital AI compute. Space-based data centers: GPUs in orbit, powered by unlimited solar, cooled by the vacuum of space, free from earthly permitting headaches. It sounds elegant. But does the business model actually work?

In this episode, Paul breaks down the real promise and serious problems behind space-based data centers. He covers why terrestrial AI infrastructure is hitting hard limits on energy, water, land, and permitting, and why orbital compute is attracting serious capital as a result. Then he gets into the physics that SpaceX glosses over: the cooling problem.

Not investment advice, just eight minutes of honest physics.

Follow Paul on LinkedIn.

The Construction Corner
#435 - The Grid, the GPUs and the Gap

Jun 11, 2026 · 18:08


Dillon connects the dots between the aging power grid, the AI boom, and what it all means for construction. He traces the arc from EV adoption to data center explosion — and why today's power demands from AI infrastructure dwarf anything the grid has faced before. He also gets into the Anthropic-xAI compute deal, SpaceX's looming IPO, and the SaaS valuation collapse, before bringing it back to a grounded point: AI is a powerful tool, but it still can't replace the contextual knowledge, code expertise, and real-world judgment that construction professionals bring every day.

GREY Journal Daily News Podcast
What Do Oracle's Cloud Misses Mean for Enterprise Budgets?

Jun 11, 2026 · 1:50


Yahoo Finance reported that Oracle beat expectations on total revenue in its fiscal fourth quarter while cloud sales missed analyst estimates. Oracle's cloud portfolio includes Oracle Cloud Infrastructure for compute and AI workloads and cloud applications such as Fusion and NetSuite. Supply limits on GPUs, new data center capacity, and multi-cloud security and compliance reviews are slowing deployments and revenue recognition. Oracle is pursuing multi-cloud strategies with integrations that place Oracle Database near Azure and Google Cloud while expanding AI-ready infrastructure. Founders should expect longer validation cycles, cloud-agnostic requirements, and co-selling motions to move enterprise deals. Key metrics to watch include remaining performance obligations, any disclosed growth splits, and capital expenditures tied to new regions and AI capacity.
Learn more on this news by visiting us at: https://greyjournal.net/news/

Elon Musk Pod
The Nerdy Escorts Cashing In on Silicon Valley's AI Boom

Elon Musk Pod

Play Episode Listen Later Jun 10, 2026 19:29


A Forbes investigation by Anna Tong put a number on something Silicon Valley wasn't talking about: a small group of high-end escorts charging AI founders thousands an hour, and selling intellectual conversation about GPUs, crypto, and longevity alongside the sex.
This episode breaks down the reporting and the economics behind it. The rates are the headline. Aella, an escort and self-described data scientist, charges $6,000 an hour, the highest rate in the piece, and is credited with coining the "nerd-first" label. Meida Marek charges $3,500 an hour and says she's booked months out. Talia Sable, a former programmer who lists Dungeons & Dragons and supply chain logistics among her interests, charges $3,000. Forbes cites figures up to $23,000 a day and $30,000 a weekend, where five years ago it was rare to charge more than $1,000 an hour.
The why is the part worth sitting with. It's a lens on how the AI gold rush is reshaping social life in the Valley, where founders raising at huge valuations and working 100-hour weeks deprioritize ordinary relationships, and a market fills the gap with transactional intimacy that doubles as founder therapy.
There's also a labor angle that ties this directly to the AI story. Marek left an entry-level finance job because she grew anxious that AI would automate her career, then pivoted to a relational skill she figured a model couldn't replicate. We cover that bet, whether it holds, and the obvious risks around discretion when founders talk freely in private.
A note on the numbers: most of these rates are self-reported marketing, and people in adjacent corners of the industry have publicly called them inflated. Treat them as claimed, not audited.
Silicon Valley AI boom, nerdy escorts, intimacy as a service, AI founders, Aella, Meida Marek, Anna Tong Forbes, AI economy, automation, future of work, tech wealth.

Moneycontrol Podcast
5205: Moneycontrol Advance Business Index on economy, Modi's milestone & government bonds uptick | MC Editor's Picks

Moneycontrol Podcast

Play Episode Listen Later Jun 10, 2026 5:16


India's economy remains on a growth path but is beginning to lose momentum, with Moneycontrol's Advance Business Index slipping to its lowest level since July 2025 amid the fallout from the US-Iran conflict. Meanwhile, as Narendra Modi surpasses Jawaharlal Nehru as India's longest-serving continuously elected prime minister, Moneycontrol's editors assess the political, economic and social legacy of the Modi era. Also in this edition: rising retail participation in government bonds, India's energy-security rethink, GPUs as collateral for AI financing, and key developments across dealmaking, technology and global mobility.

Moneycontrol Podcast
5204: Deloitte, EY, KPMG battle for India's IT infrastructure audit; Cognizant's AI finds $200 million in employee chats; and Zoho forays into hardware with India-designed server Nathu La | MC Tech3

Moneycontrol Podcast

Play Episode Listen Later Jun 10, 2026 6:44


In today's Tech3 from Moneycontrol, the government moves closer to launching one of India's largest cybersecurity audits, Cognizant says AI has helped uncover a $200 million sales pipeline hidden in employee interactions, GPUs are emerging as a new asset class for financing AI infrastructure, Meta partners with Reliance to build an AI-enabled data centre in Jamnagar, and Zoho makes its first major hardware push with the launch of its India-designed server, Nathu La.

Interviews: Tech and Business
Mozilla CTO: Why Most Enterprises Don't Control Their AI

Interviews: Tech and Business

Play Episode Listen Later Jun 9, 2026 57:01


Most enterprises are renters, not owners, of their technology and AI. Raffi Krikorian, Chief Technology Officer of Mozilla, explains why dependence on a handful of closed model providers means losing control over model behavior, pricing, and your own data.
In CXOTalk episode 920, Krikorian lays out where open-source AI actually wins in the enterprise, how lock-in happens quietly, and what CIOs and CTOs should do about it now. Krikorian draws on his experience building infrastructure at Twitter and running the self-driving division at Uber to ground the discussion in real engineering and economic tradeoffs, not hype.
YOU'LL DISCOVER
✅ Why 85% of enterprises believed they could switch AI vendors, but only about 30% actually could when they tried
✅ The "renters vs. owners" framing and what it means to control your AI destiny
✅ Why Krikorian wants data "protected by architecture, not legal handshakes"
✅ How Pinterest reportedly saved on the order of $10 million in a single quarter by switching from closed to open models
✅ Why IT is becoming "the HR team for agents," and the read/write "dangerous triangle" of agentic permissions
✅ The case for recording your prompts and running your own evaluations instead of trusting public benchmarks
✅ Why roughly 70% of enterprise GPUs sit idle, and the missing "LAMP stack for AI" that could put them to work
✅ How closed "validation machines" can quietly steer answers toward sponsored outcomes
⏱️ TIMESTAMPS (estimated, verify before publishing)
0:00 Renters vs. owners: who controls enterprise AI
2:26 The risks of depending on closed model makers
6:23 How lock-in happens and where open source fits
9:53 Regression testing and building your own evals
13:24 Pricing instability and the post-IPO cost question
23:31 Governance: IT as HR for AI agents
32:38 Can a small organization own its AI stack end-to-end?
38:47 Validation machines, trust, and sponsored answers
43:39 Keeping humans at the center, not in the loop
47:23 Can open source beat big tech in AI?
51:39 Inside Mozilla.ai: Otari, CQ, Octanus, Thunderbolt
55:21 The "rebel alliance" strategy

Clownfish TV: Audio Edition
AI Bros are Going BROKE...

Clownfish TV: Audio Edition

Play Episode Listen Later Jun 9, 2026 15:07


AI is unprofitable. Companies are shoveling TRILLIONS into AI and only Nvidia is turning any kind of a profit... selling GPUs to the other companies. Amazon, Microsoft, Meta and Open AI are all losing massive amounts of money. And investors are starting to notice. In fact, the AI bubble burst might be happening as we speak... Watch the podcast episodes on YouTube and all major podcast hosts including Spotify. CLOWNFISH TV is an independent, opinionated news and commentary podcast that covers Entertainment and Tech from a consumer's point of view. We talk about Gaming, Comics, Anime, TV, Movies, Animation and more. Hosted by Kneon and Geeky Sparkles. Get more news, views and reviews on Clownfish TV News - https://more.clownfishtv.com/ On YouTube - https://www.youtube.com/c/ClownfishTV On Spotify - https://open.spotify.com/show/4Tu83D1NcCmh7K1zHIedvg On Apple Podcasts - https://podcasts.apple.com/us/podcast/clownfish-tv-audio-edition/id1726838629 MORE CLOWNFISH TV - Official Merch Store: http://ClownfishMinus.com Facebook - https://facebook.com/ClownfishTV X - https://x.com/ClownfishTVcom Clownfish TV subreddit: https://www.reddit.com/r/ClownfishTVOfficial/ Disclaimer: This series is produced by Clownfish Studios and WebReef Media, and is part of ClownfishTV.com. Opinions expressed by our contributors do not necessarily reflect the views of our guests, affiliates, sponsors, or advertisers. ClownfishTV.com is an unofficial news source and has no connection to any company that we may cover. This channel and website and the content made available through this site are for educational, entertainment and informational purposes only. These so-called “fair uses” are permitted even if the use of the work would otherwise be infringing. #Tech #AI #Podcast #Commentary #News #Reaction #Gaming #Comedy #Entertainment #Hollywood #PopCulture #Tech #Anime #FYP Hosted by Simplecast, an AdsWizz company. 

Elon Musk Pod
SpaceX IPO: $1.75 Trillion, Still Losing Money

Elon Musk Pod

Play Episode Listen Later Jun 8, 2026 19:57


SpaceX is set to go public on June 12, 2026 at a $1.75 trillion valuation, the largest IPO in history. The company is targeting a $75 billion raise at $135 per share. But the S-1 filing reveals a contradiction: Starlink generates billions while the company posts a net loss, driven by the xAI merger and a massive bet on AI compute. This episode breaks down the SpaceX IPO filing. xAI posted a $2.47 billion operating loss in Q1 2026, and Starlink revenue is covering most of it. Then two compute deals changed the math. Anthropic agreed to pay $1.25 billion a month to rent xAI's Colossus 1 data center, and Google signed a $920 million per month deal, both running through 2029. Together that's about $75 billion in contracted future revenue. We cover how SpaceX shifted from running GPUs internally for Grok to operating as an AI cloud infrastructure provider, the multi-class share structure that keeps Elon Musk in control, the possible Tesla merger tying together chips, data centers, and robotics, and the FCC filing for a million-satellite "space cloud." Plus where the $600-700 billion premium above Starlink and launch is actually coming from, and what a generational liquidity event means for employees and VC backers. SpaceX IPO 2026, xAI merger, Starlink revenue, Elon Musk, $1.75 trillion valuation, Google compute deal, Anthropic Colossus, AI infrastructure, orbital computing.
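As a rough check on the contracted-revenue figure in this description, the sketch below adds the two monthly payments quoted there and works out how many months of payments would sum to about $75 billion. It assumes flat monthly payments with no escalators or ramp-up, which the description does not confirm; the constant and function names are illustrative only.

```python
# Figures quoted in the episode description, in USD.
ANTHROPIC_MONTHLY = 1.25e9   # Anthropic's quoted monthly payment
GOOGLE_MONTHLY = 0.92e9      # Google's quoted monthly payment
QUOTED_TOTAL = 75e9          # "about $75 billion" in contracted revenue


def combined_monthly(*payments: float) -> float:
    """Sum the monthly payments across all deals."""
    return sum(payments)


def implied_months(total: float, monthly: float) -> float:
    """Months of flat payments needed to reach the quoted total (assumption: no escalators)."""
    return total / monthly


monthly = combined_monthly(ANTHROPIC_MONTHLY, GOOGLE_MONTHLY)
months = implied_months(QUOTED_TOTAL, monthly)

print(f"Combined: ${monthly / 1e9:.2f}B per month")
print(f"Implied term: {months:.1f} months")
```

Under those assumptions the two deals combine to $2.17 billion a month, so the quoted total corresponds to somewhere between 34 and 35 months of payments.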

Marketplace Tech
Wall Street sets its sights on an AI futures market

Marketplace Tech

Play Episode Listen Later Jun 2, 2026 8:52


There is growing demand for time with GPUs, the chips that power artificial intelligence. AI companies need those chips in order to keep their models up and running. And to do that, they can reserve time with a GPU. Now, there's interest from Wall Street in creating a futures market for this AI compute time, essentially treating it like a commodity. Marketplace's Stephanie Hughes spoke with Liz Hoffman, business and finance editor at Semafor and host of the “Compound Interest” podcast, who recently wrote about this.

Marketplace All-in-One
Wall Street sets its sights on an AI futures market

Marketplace All-in-One

Play Episode Listen Later Jun 2, 2026 8:52


There is growing demand for time with GPUs, the chips that power artificial intelligence. AI companies need those chips in order to keep their models up and running. And to do that, they can reserve time with a GPU. Now, there's interest from Wall Street in creating a futures market for this AI compute time, essentially treating it like a commodity. Marketplace's Stephanie Hughes spoke with Liz Hoffman, business and finance editor at Semafor and host of the “Compound Interest” podcast, who recently wrote about this.

The Interchange
The grid's missing operating system: Why a $100,000 AI controller could defer trillions in hardware and why utilities won't buy it

The Interchange

Play Episode Listen Later Jun 2, 2026 43:46


The energy transition conversation focuses on what connects to the grid. Far less attention goes to whether anyone is coordinating what those assets do once connected. AI training runs swing hundreds of megawatts in seconds as GPUs checkpoint and restart, a profile that looks like a generator tripping offline. At distribution level, millions of inverter-based resources create localised variability that overwhelms individual circuits even when aggregate models look healthy. The planning tools in use today were designed for neither problem.
Host Bridget van Dorsten is joined by Kay Aikin, CEO and Founder of Dynamic Grid, energy engineer, grid architecture advisor to the DOE-supported GridWise Architecture Council, and contributor to the UN Environment Programme's building decarbonisation work. Kay unpacks what an AI training facility actually does to the grid: full GPU load for hours or days, then a drop to ten percent in seconds during checkpointing. She explains that, at the scale now planned, the Stargate project in Texas alone could represent ten percent of ERCOT disappearing in four seconds. The behaviour is stochastic and cannot be modelled with traditional statistical tools. At distribution level, virtual power plants responding to wholesale signals without circuit-level visibility can create competing oscillations, the kind of emergent dynamics that contributed to the Spanish grid failure.
The proposed fix is an AI controller at the substation, sending price-based signals and flexible operating envelopes to large assets and VPP operators, giving them twenty-four-hour forecasts and real-time circuit visibility. Total cost: under a hundred thousand dollars installed. The reason it isn't everywhere is cost-of-service regulation. Utilities earn returns on deployed capital, so a million-dollar transformer replacement is more profitable than software that eliminates the need for it.
Without new approaches, rebuilding the US distribution grid could cost up to ten trillion dollars by 2040. Kay is developing grid utilisation metrics with regulators in Maine, Virginia, and Maryland to incentivise extracting more from existing infrastructure. The episode closes on the need for distribution system operators and the affordability death spiral that looms if the structural incentives don't shift.

Azure Friday (HD) - Channel 9
Anyscale on Azure: Scale Python AI workloads with managed Ray on AKS

Azure Friday (HD) - Channel 9

Play Episode Listen Later Jun 2, 2026


Scott Hanselman talks with Omar Shorbaji from the Anyscale engineering team about how Anyscale on Azure scales Python AI workloads from a single notebook to thousands of CPUs and GPUs. Built on Ray, the most widely adopted AI compute engine, Anyscale gives you a unified runtime to build, train, and serve, running directly on Azure Kubernetes Service without the complexity of managing Kubernetes. See a live demo that fine-tunes a vision-language-action robotics policy, with the metrics you need to push GPU utilization higher.
Chapters
00:00 - Introduction
00:52 - Ray and the Anyscale platform
03:11 - Start of demo: Workspaces
04:38 - Running a job and viewing utilization metrics
05:24 - Choosing the right scale
06:53 - Abstracting Kubernetes on AKS
08:53 - Wrap up and where to learn more
Recommended resources
Learn Docs
Anyscale on Azure
Connect
Scott Hanselman | Twitter/X: @SHanselman
Anyscale | Twitter/X: @anyscalecompute
Azure Friday | Twitter/X: @AzureFriday
Azure | Twitter/X: @Azure

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

I'm excited to work with Microsoft once again as the presenting sponsor of the AI Engineer World's Fair! We'll be streaming live from MS Build today for a special crossover pod with our friends at No Priors and the one and only Satya Nadella. However, we did not hold back with this interview: we asked all the burning questions about uptime and Copilot that we know you have in your minds. Let's go!
For almost two decades, GitHub has been the home of software, where both open source and closed source flow through commits, pull requests, reviews, actions, etc.
This ecosystem flourished as open-source maintainers and contributors continued shipping code for the benefit of the community. However, as coding agents began to ship mass quantities of code, growing 1400% in 2026, it marked a new era that was both extremely exciting and challenging for GitHub.
While these agents help more people ship more projects, they also significantly raise the floor of how much code is shipped, how often it is shipped, and how many people commit code, with order-of-magnitude multiples in basically every dimension of GitHub infrastructure.
Now GitHub inevitably experiences more pressure on its infrastructure, which was originally designed around human developers moving at human speed. This has resulted in a very publicly notable uptime story.
So it raises the question of whether current systems around code can absorb what AI produces. Can CI/CD keep up when every idea becomes a build? Can open source maintainers survive floods of AI-generated slop contributions? Can GitHub preserve the human social contract of software while becoming the operating layer for agents?
Which brings us to the perfect person to answer these questions: GitHub COO Kyle Daigle. In this episode, he joins swyx to unpack what happens when AI doesn't just autocomplete code, but starts changing how companies operate, how open source works, how pull requests get reviewed, and how GitHub itself has to scale. We go deep on GitHub's internal AI workflows: micro-skills, WorkIQ, MCP, Slack, Teams, email, Copilot workflows, the new Copilot desktop app, CLI, cloud agents, and how Kyle uses agents to look backwards across company context before deciding what to do next. Kyle also reflects on GitHub's history building webhooks, APIs, Actions, npm, Dependabot, and Semmle, why the AI era is breaking GitHub in new ways, how Actions became a general-purpose compute layer, and what Copilot becomes after code completion.
We discuss:
* Kyle's expanded role across GitHub
* How AI got Kyle coding again after years in leadership
* Why GitHub rolls out AI through existing workflows instead of forcing new tools
* WorkIQ, MCP, Slack, Teams, email, and GitHub as company context
* Why massive “mega-skills” are giving way to small, atomic micro-skills
* How AI changes summarization, communications, marketing, and analyst work
* Why former developers in leadership may have a unique advantage in the AI era
* Kyle's “15 agents on Saturday” workflow
* How Kyle built an AI-generated executive presentation for CRO/CFO teams
* Why AI changes the chief of staff role without removing the human work
* GitHub Actions, webhooks, arbitrary code execution, and secure agent compute
* The npm acquisition, supply-chain security, 2FA, and token invalidation
* Slop forks, vendoring, and whether AI agents change dependency management
* What pull requests become when most PRs come from agents
* Prompt requests, vouching, AI review, and trust in open source
* What counts as a “developer” when AI lowers the barrier to building
* GitHub Spark, low-code, and why GitHub refuses to hide the code
* 14x commit growth, Actions load, databases, monorepos, and availability
* Copilot's evolution from completion to CLI, desktop app, cloud agents, and SDK
* Context, memory, rules, and making GitHub “act like Kyle wants it to act”
* Ambient AI, OpenClaw, enterprise security, and the new operating system for agents
* What swyx should ask Satya Nadella about Microsoft's AI future
Kyle Daigle
* LinkedIn: https://www.linkedin.com/in/kyledaigle
* X: https://x.com/kdaigle
Timestamps
00:00:00 Introduction
00:03:36 Why AI Got Kyle Coding Again
00:07:04 Running GitHub with AI: WorkIQ, MCP, Slack, Teams, and Skills
00:15:39 The Golden Age for Former Developers in Leadership
00:17:31 15 Agents on Saturday and AI-Generated Executive Work
00:20:20 How AI Changes the Chief of Staff Role
00:21:45 GitHub's History: Actions, npm, Webhooks, and Open Source
00:28:45 Slop Forks, Vendoring, and AI Dependency Management
00:33:57 Pull Requests, Prompt Requests, and Trust in Agent-Generated Code
00:41:21 GitHub Stars, 200M+ Developers, and the New AI Builder Wave
00:45:15 GitHub Spark, Low-Code, and Why GitHub Still Shows the Code
00:47:38 GitHub's Hardest Era: 14x Growth, Reliability, and Scale
00:59:21 Actions as the Compute Layer for CI/CD and Automation
01:02:04 The State and Future of GitHub Copilot
01:08:24 Ambient AI, Background Agents, and the Future of the SDLC
01:13:09 OpenClaw, Enterprise Security, and the New OS for Agents
01:18:03 Build Announcements, WorkIQ, FoundryIQ, and Microsoft Context
01:21:41 What Should swyx Ask Satya?
Transcript
Introduction: Kyle Daigle's Expanded Role at GitHub and Microsoft
Swyx [00:00:00]: We're here with Kyle Daigle, COO of GitHub. Welcome.
Kyle [00:00:07]: Hey, thanks for having me.
Swyx [00:00:08]: You're not just COO of GitHub. People know you as that. You have a new role.
Kyle [00:00:11]: So I have an expanded role now. I've been working at GitHub for thirteen years and doing all things developer. Joined as a developer myself. And now I'm also responsible as the CMO of Developer for Microsoft.
And so all the learnings and passion for developers, how we work with them, how we communicate, and how we bring our products to market, we're also bringing that expertise to the broader Microsoft ecosystem and helping every developer that uses a Microsoft product, or would like to have a similar experience to the one they've had with GitHub over the years. So it's a different role in some ways, but it's also just building on the experience that I've had at GitHub: tell the truth, be authentic, show people how to use it, and then let the products speak for themselves. Now I'm just doing that with all of Microsoft.
Swyx [00:01:09]: We'll be releasing this in conjunction with Build. You've got lots of stuff planned, and we can touch on that whenever it's appropriate. I think one of the interesting things is I rarely meet a COO who's also a CMO. I think you're very outward facing and you're very confident publicly. That's rare. Do you actually view yourself as COO? What is your thing?
From GitHub Developer to COO/CMO: Building the Platform and Operating GitHub
Kyle [00:01:33]: I think for me, it's been funny. The titles have always felt a little strange to me. I joined GitHub as a developer. I wrote so much of the...
Swyx [00:01:46]: Let's bring that up. You wrote the back ends?
Kyle [00:01:48]: I was going through some old photos when folks were talking about how things were being built or how there was a build GitHub. I built webhooks and worked with teams building the API, built the platform layer. Anything that integrated with GitHub, up until really 2018, I built or ran the engineering teams. And that's kind of where the beginning of my passion always was: helping people build things and deliver them to their customers. And so being a developer, building for developers, was always super unique.
I think as my role expanded, it became my ability to talk to not just developers, but also enterprise customers or business leaders, and have this translation layer. And then through all those years, GitHub has always operated pretty uniquely. Post-pandemic, working remotely was not as novel as it was when GitHub started in 2008. But all that expertise of running remote teams, and doing it well, became this sort of bigger role, ultimately turning into the COO role of how do we operate GitHub in the way that GitHub's always operated after the Microsoft acquisition. And so on from there. So for me, I still code. I love coding, but the problem has always been people. It's a much harder problem to support our own employees, and a harder problem to communicate to developers and enterprise buyers what we're building and why it matters, 'cause those are two very different messages. And so getting to work in the mix of COO, CMO, and also just being a dev, I think is what's kept me at GitHub for so long.
AI Workflows for Leadership: Commits, Retrospectives, and Context
Swyx [00:03:40]: Apparently, your commits have gone up. What's this? What's going on?
Kyle [00:03:45]: Rui's called me out pretty aggressively. So as you can imagine, you can see my normal era of being a dev in the 2013, 2014 era, and then moving into management, and then ultimately the COO role. I think what you see there is me really getting back to coding thanks to AI. Similar to attaching problems between how to market and how to operate a business and how to code, I find building agents and workflows that are connecting very disparate problems to be what's driving this. So some of it's writing software. A lot of it is connecting a ton of different data sources to help me out.
But that is completely me really diving in on the AI side and trying out our tools, trying out everyone's tools, but building for me, building for the non-technical leader (though I'm technical), and how we're able to use these tools for more than just the simple call and response that I think a lot of the non-technical folks do. Your employer says you have to use AI, and so everyone uses ChatGPT or Copilot or Claude or whatever. To really get into how this is going to help me out, I find that it's not the “I need to write a blog post” kind of simple examples. It's helping people find the workflows of, “Okay, I need you to go through all the PRs today. I need you to go through everything that we've posted online. I need you to go through what we did the last three months. Go through all of my Obsidian notes for any mentions of this, then go through my transcripts at work.” We use Teams, so using WorkIQ, go call that MCP server, grab all the transcripts, go through all the Slack, and then build me out the plan of what this week's messaging actually was. That's something that was impossible. For me, what most of this launch here is, is actually less building forward. It's actually a recursive loop backwards. I'm always looking at what had happened first: go back through the week and tell me what we did, what worked, what didn't work. And then tell me, in the next three or four days, what would you tweak based on this looking backwards and then looking ahead a little bit? I find that to be so much more valuable, especially for non-technical folks, because LLMs are actually very good at that retrospection: finding all the patterns, pulling them out, and then applying that retrospection to just a couple of days or a short period of time. It's all a bunch of apps that I've built and launched, a bunch of internal tools. I use the new GitHub Copilot app, the desktop app, with workflows.
Every time I crack open my laptop, it's running workflows for me. It's just a ton of different stuff and, of course, it all ends up on GitHub.
Swyx [00:06:47]: Of course. That's where stuff is hosted. Man, there's so much to ask you. I was going to leave the “how do you run a company with AI” thing for the end, but I have to double click on one thing. You said you are looking back at the week. You're understanding what happens. When you say “we,” that's three thousand people. How?
Rolling Out AI Internally: Skills, CLIs, and Company Context
Kyle [00:07:09]: When we started rolling out AI internally beyond engineering, one of the things that I was really passionate about is that we have to do this in a way where no one has to change how they work. I don't want to have to teach you a tool. I don't want to have to teach you something new. And so for us, we tried out a few tools. Most of them don't work because I've got to get you on board, I've got to teach you how to use it. What we've actually ended up doing is we've built a set of skills internally. We each have our set of skills, and we've just been distributing the CLI, even to the non-technical folks. And then effectively, we're just giving it access to read about everything that we're writing. For us, that's usually GitHub, Teams, email, and Slack. Teams for video chat, generally speaking.
Swyx [00:08:03]: Teams and Slack?
Kyle [00:08:04]: So we use Teams for video communication, but we don't use it for chat. GitHub has a long history, right? We're always...
Swyx [00:08:13]: Also Slack.
Kyle [00:08:14]: ...talking about ChatOps, and everything is built into Slack.
Every command, every flow.
Swyx [00:08:18]: So even though you have been acquired for, I don't know, eight years now...
Kyle [00:08:22]: We still...
Swyx [00:08:23]: You still use Slack?
Kyle [00:08:23]: It's a purpose-built tool for us, and I think the reality is that moving off of it would be, bluntly, so expensive, simply because all the tooling is baked in with that paradigm. And they both have their pros and cons, but they don't work the same way at all. We still use a bunch of different tools because it's the purpose-built tools that we need. And then...
Swyx [00:08:47]: Well, the same doesn't go for the rest of Microsoft, presumably.
Kyle [00:08:50]: Various teams operate...
Swyx [00:08:53]: They make their own decisions.
Kyle [00:08:54]: ...various ways. I think it just matters what you're trying to do. But we do work across kind of every tool that we use, and then we give everyone access to all of that context and the new WorkIQ MCP server, which is quite cool if you do live in the M365 world. I can ask it all these backwards-facing questions, and it's incredibly important for our teams that are working remotely. There's a lot of stuff you miss when you're not in an office, and we are spread out all over the world. So most of that is looking back. And then we post these sorts of findings or our industry reports automatically into either GitHub issues or discussions: what's happening this morning, today, yesterday. A little automation gets run. We'll use the app. We might use GitHub Actions with our agentic workflows just to go do that run, and then we push it into GitHub, and we keep having a conversation. So usually for us, it's about that sort of looking back, looking forward on the non-technical side. And then of course for a lot of those folks, it's also building an app, pushing it to GitHub Pages or pushing it somewhere to host it, et cetera.
But it's just enabling everyone with that power. Instead of “it's going to take me a week to figure this out,” we're going, “Okay, I built a skill. Let's put it into a repo. We'll all share that skill together, and then we'll use the CLI, or now the app, just to run it.”
Micro Skills vs. Mega Skills: How GitHub Uses AI at Work
Swyx [00:10:26]: All right. I think we're going straight into the team management and productivity thing. I think a lot of people are getting various levels of LLM psychosis. How do you manage the bloat of skills? Everyone has their thing, and they're trying to promote it to the rest of their peers in their org, right? And obviously, whoever becomes a skill influencer internally becomes an AI leader of sorts. I assume you have those.
Kyle [00:10:50]: I think we have...
Swyx [00:10:52]: And I assume it's a mess.
Kyle [00:10:54]: I think the reality is there's two pieces. First is, I think that we're ending the era of these massive, beautiful, perfect skills that are just not any of those things. 'Cause for a while, every tweet every day was, “Go download the skills, the perfectly managed thing to do this entire workflow.” And what we've found (I was just with my team this week, and we were talking about the skill side) is that we're really talking about these incredibly micro skills that are just doing one thing for us very well, versus a skill that's going to do, like I said, that full report. That doesn't really exist on our side anymore. It's usually a single skill that's going to identify the most important marketing information given any MCP server: this is the most important thing.
It's less about stitching a bunch of tools together and having it produce this mega output, because then weeks go by, months go by, things change, and you want to tweak...

Swyx [00:11:58]: It's brittle.

Kyle [00:11:58]: ...your mega skill, and you're screwed. You can't do that. And so now we're really just talking about the Legos we're using and just letting the instruction book be something we're all putting together. Whereas I think a lot of AI skills for a while have been that mega instruction book style.

Swyx [00:12:15]: I've thought a lot about Postel's law. I don't know if that's a term that means things to folks. It's the idea that you should be liberal in what you accept and strict in what you output, right? And I think that's a good framing principle for skills. This is my skills, obviously on GitHub. You know how some repos in GitHub are special repos? I feel like we should sort of reify the slash skills and everyone give it some kind of special presentation. Anyway, so, yeah, this is one of those: download anything, transcribe anything, and then you can string together the atomic skills that do one thing well into some kind of orchestration skill that calls other skills. Does that match?

Kyle [00:12:56]: I think so. I think that the...

Swyx [00:13:00]: Summarize anything.

Kyle [00:13:01]: For me, I do communications and PR and analyst relations and marketing and customer activities, and so my “summarize everything” is very different for each one of those contexts. ’Cause if I'm summarizing something for an analyst, that's a very different thing than probably how I'm going to summarize something for a customer meeting or an engagement. So that's, I think, the difference when we're talking about the tools I might use on Saturday, or the skills I might use on a Saturday when it's just for Kyle.
Yeah, those kind of have an atomic actual tool underneath, or maybe skill, and then “Kyle cares about X.” But I think when we're talking about work and enabling the marketers and communicators there, it's the atomic “this is what good summarization is,” and then “this is what I care about” for marketing, for communications, for whatever. And that, I think, is the interesting matrix problem when we go from a developer set of concerns to all kinds of different professions: what that word means to me is different than it means to you, is different than it means to the analyst or the salesperson. And that's the matrix mess that we're still starting to find. It's about these mega skills, but they're all just slight permutations, but those permutations are really important. It's the difference between someone reading this and going, “Did AI make this?” or, “This makes total sense, and I would expect this when I'm giving a briefing to Gartner,” or whatever else.

Swyx [00:14:37]: I think the beauty of it maybe is that you don't have to be that careful about what goes in there. It doesn't have to exactly fit as long as it roughly is contained in there. I used to complain about plugin hell, basically. Like when you have a framework and then you have a hundred things that you need to integrate, GitHub used to be bloated full of these things. And now we don't need them anymore ’cause now you just use skills.

Former Developers in Leadership: AI as a Creation Multiplier

Kyle [00:15:00]: And I think the most magical thing is that I can just also crack it open. Like, yes, I could go change how the plugin is coded, or I could go do that now with AI, but I think there's just something more magical about getting a response back and being like, “That's not right,” and then you just crack the skill open, you just type English words, and it's different.
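The micro-skill idea discussed above, one atomic instruction that defines what good output is plus a thin layer for what a given audience cares about, can be sketched as plain text composition. This is only an illustration under stated assumptions: the instruction strings, the audience names, and `compose_skill` are all hypothetical and are not taken from GitHub's internal skills or any skills file format.

```python
# Hypothetical atomic skill: a single plain-English instruction that does one thing.
SUMMARIZE = "Summarize the input in five bullet points, most important first."

# Hypothetical per-audience layers: what each context cares about.
AUDIENCE_NOTES = {
    "analyst": "Emphasize market positioning and supporting evidence.",
    "customer": "Emphasize outcomes and concrete next steps.",
}


def compose_skill(base: str, audience: str) -> str:
    """Combine one atomic instruction with one audience layer.

    Raises KeyError for an unknown audience, so a typo fails loudly
    instead of silently producing a generic prompt.
    """
    note = AUDIENCE_NOTES[audience]
    return f"{base}\n\nAudience ({audience}): {note}"


if __name__ == "__main__":
    print(compose_skill(SUMMARIZE, "analyst"))
```

Keeping the base instruction and the audience layer separate means either one can be edited in plain English without touching the other, which is the property Kyle describes as cracking a skill open.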
That building block is, I think, very unique, once I get everyone to kind of understand how to best make those changes to get the most power out of them.

Swyx [00:15:36]: You have a peer group of people like you. Is there a common framing for... Something I'm feeling is: is this a golden age for former developers who are now in leadership? Because you can wield the tools, you would know the right words, you're maybe not too close to the details. Doesn't matter. But you're more effective than someone who doesn't come from that background.

Kyle [00:15:59]: I think that the secret has always been your ability to identify patterns and solve problems, and I think that for folks like myself that don't code day to day anymore, that has made me successful as a developer, made me successful as a COO and now CMO. And so now that I have access to get and write code, I'm now applying that sort of pattern finding and problem solving, and I know enough still about how to then go and say, “Oh, I want to make an app, but I don't want to break into jail or create something that's not going to be able to work or be deployed at scale or whatever.” That ability to apply all that additional business knowledge and still code, I think, is what makes that so interesting to me. Slightly different than, I think, some of the other technical leaders that became business leaders and now are going back to their apps and updating them. Good for them. But I think the much more interesting thing is, well, now I have this whole new set of expertise over ten plus years. Why not take that and use that as a developer with these AI tools? So I definitely think that makes me more powerful, but I think that's true for every dev as well. Most of the dev friends I still have also have some other underlying skill and passion. There's really talented, very kind of linear computer science software devs, absolutely.
I just find that the folks that came from a different career, went to school for something else, went off and did this random thing, and then became a software dev, or were a dev, did a random thing, came back... Learning that extra set of information, learning those extra skills, and now having the power of an AI where I can crank up fifteen agents on Saturday while my kids are doing lacrosse, that's really powerful. And I think it gets me back to that feeling of creation, and it's very hard to replicate that in most other senses. That first time you build an app and you click it and you show someone, that's magical. And so being able to do that not just in code, but across all kinds of different assets, that's huge. Every year we do our revenue planning. We talk about, okay, what is it going to look like for next year? And of course, as you imagine, there's slideshows everywhere talking about what are we going to talk about, what's the narrative, et cetera. And so, as you said, I'm like, “Okay, well, I could probably just build something to build this, and that way I don't have to go build the whole spreadsheet or pass it to my team.” So we went through this process, and I got all the information and used the skills I mentioned. I built a little app just so I could look at some of the information in a SQLite database more easily. And I ultimately built this entire presentation without touching any of it, and I was like, “Okay, I'm just going to present this to our CRO, the CFO, their teams,” without mentioning I'd built it with AI. I built a skill to make it look very much not AI driven. Just not pretty.

AI-Generated Presentations, Human Taste, and the Changing Chief of Staff Role

Swyx [00:19:03]: Like a design. Yeah.

Kyle [00:19:03]: Not pretty. But just very clearly not AI.
Kind of like, don't do anything interesting.

Swyx [00:19:08]: Yeah, that is valuable.

Kyle [00:19:08]: Exactly. We did the whole thing through. It used my notes from Obsidian, it used all the context I mentioned before, the plans, and it never came up once that it was AI generated.

Swyx [00:19:20]: It didn't matter.

Kyle [00:19:20]: Never once. It didn't matter. And so now I take...

Swyx [00:19:23]: This is a tool.

Kyle [00:19:23]: I can take that tool and go, “Look, I don't want you to go build slideshows.” They're just helping us share information with each other. If this thing can do it with a little bit of crafting from you, and then we can look at it together, awesome. There's no value in all that extra work. I think that the ability to make it look humanly bad and build a little app to manipulate the data is part of that upside for devs that are now in leadership roles. Because, like I said before, that's all a people problem. I know if you've used a coworker or not to build a slide deck, unless you spent a bunch of time to not do it.

Swyx [00:20:07]: I know, but I think there's a certain charm to just being blatantly AI. ’Cause you're just honest: there may be mistakes here that I cannot vouch for. So how much value is there? But anyway, actually the real question I want to ask is: you were a chief of staff to Thomas. And in the pre-AI world, that job would've been a chief of staff job of, like, “Can you prep me these slides” and all that. And now you do it yourself.

Kyle [00:20:35]: I still have a chief of staff. The difference is, it's sort of the discussion every time we have some sort of technology evolution: it's not that the roles all go away, they just change.
And so yeah, I don't have someone spending all their time building out slides for me and presentations ’cause I don't need that anymore. But now I need that person that is able to go and find all the different connections between humans in those discussions, to help me find out, okay, I should be meeting with this group and this team, and they have an opportunity, and I'm going to be in San Francisco today, I'm going to be in Seattle tomorrow. Those sorts of human connection aspects are still incredibly valuable and have always been a big part of that chief of staff role. But now, just like chiefs of staff are not opening up letters to process, they're doing emails. It's the same thing. And now they're not building out as many of these presentations because they have the ability to have an AI take it on and share that with me, and great. Let's keep moving, ’cause it's allowing us to go faster and make better decisions more quickly.

Swyx [00:21:45]: Awesome. Well, we can dive into more productivity insights as we go. I did want to do a little bit of a brief history of Kyle and GitHub, because we started here. And then you were also involved in the NPM acquisition. I do want to touch upon that. And then more recently, I just want to bring it up to present day, where we're having uptime issues, which transparently we've already addressed publicly, but we'll discuss in the pod. Did I miss anything? Any other major highlights? Obviously, it's a lot of years to cover.

A Brief History of GitHub: Webhooks, Actions, Acquisitions, and Platform Evolution

Kyle [00:22:15]: I think one highlight was right before the acquisition closed in twenty eighteen, I got to launch the first version of Actions...

Swyx [00:22:27]: Oh.

Kyle [00:22:27]: ...at GitHub Universe. So it was...

Swyx [00:22:29]: They're that young?

Kyle [00:22:30]: It was October of twenty eighteen, I think. Yeah.
Yeah.

Swyx [00:22:33]: Gee, Jesus.

Kyle [00:22:34]: I was the engineering leader on that project and got to launch that. And then, yeah, we did acquisitions of NPM, like you said, Semmle, Dependabot, Pull Panda, a whole bunch of things. That was a big...

Swyx [00:22:47]: Pull Panda.

Kyle [00:22:48]: Abi is doing well.

Swyx [00:22:51]: DX. Holy crap.

Kyle [00:22:52]: Did well on DX. And that was the big shift after the acquisition. I had to join the sort of business side.

Swyx [00:23:00]: So I need to hit you on some of these things ’cause you were there, right? And how often do I get to talk to someone who was there? But yeah, Actions. Is that the number one source of security issues on GitHub?

Kyle [00:23:11]: Oh, I think that the number one source of security issues is probably the literal code in everyone's underlying repositories. I would say, back further than that, if you remember what I had to show in this graph, I didn't say this before, this is ultimately webhooks.

Swyx [00:23:30]: Yeah.

Kyle [00:23:31]: Like circa whatever it was.

Swyx [00:23:32]: It says Hookshot in there.

Kyle [00:23:32]: I forget. Yeah, Hookshot's in there. And back then, it says GitHub Services. Do you see, it says Hookshot FE for front end, and then it says GitHub Services. GitHub Services back in the old days, right? We had a repository that was Ruby code, and you could write any Ruby code in there, and then we would execute that on your behalf as a service, and that way, if you were trying to integrate with something, we would run it for you.

Swyx [00:23:57]: And of course no containers ’cause...

Kyle [00:23:58]: No, ’cause it was...

Swyx [00:23:59]: Well, no containers.

Kyle [00:24:00]: ...twenty fourteen. And so there was some isolation, obviously, but it was mostly the separations on the server level.
That's an example, along with the very old version of Pages, which ran on its own containerization infrastructure, not on Actions.

Swyx [00:24:15]: Which, like, all-time great product.

Kyle [00:24:16]: Pages powers the internet at this point to some degree. Those were places where clearly there were no issues, to my knowledge. But it was those things where I'm looking at it and going, “Okay, well, we can't be running arbitrary Ruby code” on everyone's behalf. Then containerizing all of that up into Actions now, where, yeah, the containerization is really good. The pinning: most folks aren't pinning to a particular...

Swyx [00:24:48]: Images.

Kyle [00:24:48]: ...SHA, et cetera, in their workflows, and so that's a big place of pain for folks if they're just doing, similar to any dependency management, just v1 or newest or latest, I think. But that journey from that day of “Okay, we're just going to run all this arbitrary code, and it'll basically be okay,” to now, no, we have really good containerization. We have a new underlying agent containerization service. We're using it under the hood. It's through Azure. They recently announced it. The Azure Dev Compute, but it's very fast compute to be able to spin up your own cloud agents or whatnot. We're using it under the hood for some parts of the new...

Swyx [00:25:36]: Microsoft Dev Box?

Kyle [00:25:37]: No. Dev Compute, yeah.

Swyx [00:25:41]: Hmm. Not finding it just yet.

Kyle [00:25:44]: Oh, it's in there somewhere.

Swyx [00:25:46]: All right. Well, we'll cut that out.

Kyle [00:25:47]: Sorry. But with Dev Compute, you can spin up really small VMs really quickly, so you're doing a tool call...

Swyx [00:25:58]: Same concept.

Kyle [00:25:58]: ...just do it containerized. Exactly.
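The pinning point Kyle raises above, workflows that reference an action by a moving tag such as `v1` or `latest` rather than a commit SHA, lends itself to a quick check. What follows is a minimal sketch, not an official GitHub tool: it scans workflow text with a regular expression instead of a real YAML parser, `unpinned_actions` is a hypothetical helper name, and `example-org/example-action` with its SHA is made up for illustration.

```python
import re

# Matches a step's `uses:` reference, e.g. `- uses: actions/checkout@v4`.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)", re.MULTILINE)
# A full-length Git commit SHA is 40 hexadecimal characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return `uses:` references that are not pinned to a full commit SHA."""
    flagged = []
    for ref in USES_RE.findall(workflow_text):
        ref = ref.strip("'\"")
        # Local actions (./path) and docker:// images carry no Git ref to pin here.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            flagged.append(ref)
    return flagged


# Hypothetical workflow: one tag reference, one SHA-pinned reference, one local action.
EXAMPLE_WORKFLOW = """\
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: example-org/example-action@0123456789abcdef0123456789abcdef01234567
      - uses: ./local-action
"""

if __name__ == "__main__":
    for ref in unpinned_actions(EXAMPLE_WORKFLOW):
        print("not pinned to a SHA:", ref)
```

A tag can be moved to point at different code later, while a full commit SHA cannot, which is why the tag reference is the only one flagged in this example.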
So we're using that, so definitely moving in that direction to protect us from every piece of code that we're ultimately running.

Swyx [00:26:07]: Look, that grows into the full SDLC. Code hosting was just the start, and then it's grown beyond that. Let's talk about NPM maybe, ’cause I think that's also a very major point in the industry. I do think it was looking for a home. It was kind of struggling as a business, right? I don't know how you would characterize that whole acquisition and how it...

NPM, Package Security, and Keeping the Internet Running

Kyle [00:26:33]: When we were talking to the team, I think the big thing for both of us was to find a way to keep NPM, which was basically powering the internet then, and way more so now to some degree, running. Keep it going, keep continuing to scale. It was having scaling problems, if I recall, back at that time. They were doing some rewrites. It...

Swyx [00:27:00]: That's cute compared to now.

Kyle [00:27:01]: Well, that's the thing: when I'm talking to folks now, there's so many more underlying uses of NPM than there were back when we had them join in with GitHub. But that was ultimately the goal. It was really, okay, we used to have pages. We have the world's code. Let's make sure that we can keep NPM running well for the world. And we put a bunch of time and investment into fixing some of the underlying backend, changes some of which we talked about, some of the manifest work, et cetera. And then now, really trying to bring the security posture of NPM up to speed. But it is a unique challenge in that every move that we make to make it more secure will break a lot of people. And security is paramount, and we take it very seriously. Any time that we have a problem with GitHub, or we make a change that makes us more secure but hurts, there's a snow day for developers or a really bad fire that they have to go put out.
And so we have changed the 2FA policies. We've changed the way the tokens work. When we find tokens that have been exposed or potentially exposed, we invalidate them, and...

Swyx [00:28:22]: I love that feature in GitHub. Yeah, it's great.

Kyle [00:28:23]: That creates issues, but that's the thing: we're trying to push the community forward without necessarily doing something that is going to break the contract that's been there for 15 years, or close to it, or some amount of years on NPM.

Slop Forks, Vendoring, and the Future of Open Source Supply Chains

Swyx [00:28:43]: So now we're talking about open source and publishing. And I think there's something here with what people are calling slop forks, which I think Malte from Vercel is doing. And part of me thinks, well, the way to get past any vulnerabilities: let's just get rid of the concept of NPM. We only publish source code, and anytime you want to import it, you have your coding agent look at it and then adapt whatever subset you're going to use, and you vendor it. But the AI vendors it. Is that realistic? I don't know. Will that solve all our security issues? I don't know.

Kyle [00:29:24]: I don't think it'll solve... So Mitchell Hashimoto was just talking about this today, and I think that in some ways, all things old are new again. Yeah, absolutely, vendoring everything. I do remember twenty thirteen, twenty fourteen.

Swyx [00:29:42]: Yeah. We must return to...

Kyle [00:29:43]: We were vendoring everything. We were having actual discussions, or at least I remember we were, like, “Should we take this full thing?” “Why is this so big?
We only need this one file.” And so I do think there's something true there, where either taking only what you need, or the dependencies just getting incredibly small over time, I think will help to some degree, but it's not going to solve the fundamental problem, I don't think, because with the vulnerabilities and an agent looking at them, time and time again, there's a million different ways in which we can convince an agent that this thing is secure or not and pull it in. Or we can do static code analysis or runtime testing to say whether the code works or not. That is, I think, the step that needs to continue to be invested in. The question is just on how much scope. Should it be this enormous project that I'm pulling down, or should it be this piece? Either way, most companies are running some amount of security checking on the packages that they're bringing in or vendoring. That, I think, won't change. That's what Advanced Security does to some degree, Socket does to some degree. Everyone is doing a piece of that. How we each do that, especially when we're talking to enterprise customers, is just very different. No one wants one single way to do it. And I think that's always been GitHub's unique position in the world. I talk a lot to maintainers, I talk a lot to folks about this. We rarely start a process and a practice and push it onto the community. We usually wait for the sort of RFC process, socially or literally, everyone agreeing, and then we'll cement something in. Because otherwise we're...

Maintainers, RFCs, Vouching, and the Social Layer of Trust

Swyx [00:31:35]: That fits your role in the ecosystem, yeah.

Kyle [00:31:36]: We're GitHub. Yeah, we don't want to shape the whole thing. We want it to be figured out.
But how do you balance that sort of role in the industry, to keep everything as secure as is possible and make sure that you're not going to be compromised as a human, ’cause that's usually how it all happens, and not create a process or lock us into a flow that you're not going to, or Mitchell's not going to, or other open source projects aren't going to like? That's always been a tricky balance for us, and I think that's something that we haven't talked about enough: we're not going to be able to fix everything for everyone in a way that everyone is going to like. So help us, tell us what is working. When Mitchell was talking about the Upvote, the up...

Swyx [00:32:22]: I was going to bring up his thing. Yeah.

Kyle [00:32:23]: I forget what it... Yeah. I was chatting with him and talking to him about this, and I put it on Twitter, and we talked also over DM, was “We're going to keep working.” But I think the important thing is I do actually want to hear what isn't working for you. And be as specific and clear for your project as is possible. And to his credit, over the many years that we've known each other through the industry, he's always done that, and I appreciate that, ’cause there are places that we need to fix up, and we hear from him, and we'll fix up, just like we do for all other kinds of maintainers. But that process between making those types of improvements and being more secure and creating... I forget what he calls it. It's not the proof process, not the claims process. Do you know what I'm talking about? His projects have a way for you to kind of...

Swyx [00:33:13]: Vouch.

Kyle [00:33:13]: Vouch. Thank you. Yeah. He has the vouch system for saying, “Hey, you should accept my PRs.” That's been...

Swyx [00:33:20]: I just built this into GitHub.
I don't know.

Kyle [00:33:22]: Well, see, but that's the thing: you say that, and he and his community really like this, and then I'll go talk to other maintainers globally, and they're like, “No, this doesn't work for me.” And that is the tension, but also the kind of beauty of GitHub, depending on which way you look at it. We want to help maintainers, so we create all these tools to let you have more control over how much you take in from AI and PRs. But you can also go use this project, and if it takes off and becomes the kind of mostly standard, then yeah, we probably wouldn't enforce it, but we would add it in, because that's the flow that we tend to do.

Swyx [00:34:02]: I hear a lot of people don't know the history of the pull request. That's something that GitHub standardized, basically.

Kyle [00:34:08]: Yeah. It was a very messy process beforehand, and now we have the benefit of it being the process. And now we have to go and figure out the next best process, or what adaptations change, or what does a pull request look like when eighty percent of your PRs are just coming from your agents and not from other devs?

Swyx [00:34:31]: Do you like the prompt request idea from Peter?

Kyle [00:34:34]: I think each idea has its merits. I'm not avoiding saying anything good or bad, but I feel like I've seen a version of... we have that, we have entire Thomas' store. Take all the assets of what you've built and put that in. I think that's got great ideas. There's all these various permutations of the PR flow, but I think the reason why there's not a single answer is ultimately we're trying to codify trust.
We're trying to say, “Okay, if Sean reviews this, I'm going to trust it because you're Sean, or you're the senior dev, or you're the whatever.” And right now, when we are working in a flow where an agent writes code and another agent reviews code and then Kyle goes and looks at it, the trust is kind of diffuse. And most of the tools that we're talking about are talking more about verification flows. We have more assets to look at, so I can probably say whether this is a good PR or not. But that still doesn't solve, I think, the human problem of: I'm looking at a PR and I want to know if I can trust it. And we still tend to use human signals for that. Mitchell approving it, or Kyle approving it, or whatever. And so I think that's why most of these options haven't really solved it: it's a social problem ultimately. It's a human problem to review it and agree. Or you fully trust the tool and you're imbuing that tool with full trust, which I think in some cases absolutely exists.

AI-Generated PRs, Trust, and the Waymo Analogy

Swyx [00:36:08]: And so, in the same way that there will be a tipping point in society when we don't allow humans to drive anymore because machines are measurably better than humans, I'm looking for that tipping point, right? Like, Mythos is ridiculously expensive. Someday we'll have Mythos on a desktop. I don't know. Does that change the equation?

Kyle [00:36:30]: I think it's more... I took a Waymo here, and I was on my phone and not looking around at all. There are other self-driving vehicles that I would not trust while staring at the road. And I think that trust is something that is...

Swyx [00:36:48]: Is this a Zoox thing? What is it...

Kyle [00:36:50]: I think that is both. I think that is both. Like...

Swyx [00:36:53]: There's Zoox in this robo taxi. That's it. It's...

Kyle [00:36:56]: Well, depending on what level of self-driving.
But my point is that I strongly believe that's a mixture of verifiable proof, like how many accidents, how much data, and so on, and the human aspect of how I feel when I'm in this car, what it tells me, et cetera. And so that's why I think some of our AI tools tend to imbue me with more of that feeling of trust, even if the data says this is 100% accurate. I feel like it takes more time for us to go, “Should I trust this or not?” And that's in the soft sense of startups with high agency, weekend projects, and open source. And then there's enterprises and regulated industries and everything else, and that is an even harder problem to go solve, because even when it is fully verified, not only do you have to have trust from the humans on the team, you probably have to have trust from multinational...

Swyx [00:37:55]: Oh my God.

Kyle [00:37:55]: ...multiple governments around the world and regulating agencies. And so that's where I feel like, to your point, once we tip over on the sort of human EQ side of it, “I feel okay, this feels okay, it's been proven enough,” then the ball will start to roll a lot faster, where we'll end up getting to the “Okay, we can trust this,” and feel good about it in the most difficult of cases.

Reputation, Sponsors, Stars, and Bot Activity on GitHub

Swyx [00:38:18]: If human trust is the thing that matters, I feel like GitHub as the developer social network could maybe do more there. Vouchers are one system. But we have star counts, and then we have contributor rights, and that's it. And I feel like there should be more in that space. I don't know if there's any other design decisions there.

Kyle [00:38:37]: I think that one of the places that we don't really expose right now in this sort of way is some degree of hard trust and support, and for me, Sponsors is a good example of that.

Swyx [00:38:49]: Ah.

Kyle [00:38:49]: It costs you something.
To prove that I believe in your project and I trust you to some degree, or I want to support you at the very least.

Swyx [00:38:56]: Solve payments for open source. Why not?

Kyle [00:38:58]: I think that as we keep moving forward, right, there's more and more projects where I'm adding more and more dollars into Sponsors personally because I want to support them. I've probably never met them in person, but I know enough of their work that I want to support them. I think the thing that I don't love about stars or commit counts or anything else is ultimately, even with all of the various abuse and de-spamming and deduplication work that we do, or anti-abuse work that we do, these are all not active social signals. They're passive ones that are ultimately gamifiable. And you may trust me, but another open source maintainer may not. And on what heuristic should you be trusting me? That, I think, is kind of where some of our thinking is right now. What signal from me is most important to you? If you can define that, potentially, honestly, in an agentic workflow, that's what we see some of these open source projects do, where you have GitHub Actions, and then you have an agentic workflow that's calling AI, and you're setting these rules. Like, if Kyle has submitted and gotten accepted PRs across any given project and has a social handle tied to his account in GitHub, and that social account's older than a certain amount. Really complex measures that matter to you, ’cause most open source projects have that heuristic built into their heads, if not written down in the contributing guidelines. You could take that and then go apply that and then just say, “Oh, we're not going to accept this PR.” Building something that is, I think, malleable to everyone's needs is a little bit better, rather than going, “Hmm, this account's too young.” Because what happens?
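The kind of maintainer-defined rule described above (accepted PRs elsewhere, a linked social handle, and a minimum age for that social account) can be written down as a small configurable policy. This is a sketch under stated assumptions rather than anything GitHub ships: `Contributor`, `TrustPolicy`, `review_decision`, every field name, and the default thresholds are hypothetical, and how a project would actually gather these signals is left out.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    # Hypothetical signals a project might collect about a PR author.
    accepted_prs_elsewhere: int
    has_linked_social_account: bool
    social_account_age_days: int


@dataclass
class TrustPolicy:
    # Illustrative thresholds; each project would choose its own.
    min_accepted_prs: int = 1
    require_linked_social_account: bool = True
    min_social_account_age_days: int = 365


def review_decision(c: Contributor, policy: TrustPolicy) -> tuple[bool, list[str]]:
    """Return (accept_for_review, reasons), where reasons lists each failed rule."""
    reasons = []
    if c.accepted_prs_elsewhere < policy.min_accepted_prs:
        reasons.append("too few accepted PRs in other projects")
    if policy.require_linked_social_account:
        if not c.has_linked_social_account:
            reasons.append("no social account linked")
        elif c.social_account_age_days < policy.min_social_account_age_days:
            reasons.append("linked social account is too new")
    return (not reasons, reasons)


if __name__ == "__main__":
    newcomer = Contributor(
        accepted_prs_elsewhere=0,
        has_linked_social_account=True,
        social_account_age_days=10,
    )
    print(review_decision(newcomer, TrustPolicy()))
```

Keeping the thresholds in a per-project policy object, rather than hard-coding one global cutoff, reflects the malleability argued for above; any single published threshold remains something a patient attacker can wait out.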
The attackers just go and create a multitude of accounts, and they wait until it ages up. Needs to have a certain amount of stars? That's how star inflation happens. Need to have a certain amount of repos...

Swyx [00:40:46]: Oh my God. Yeah.

Kyle [00:40:47]: ...with PRs? They all just create repos and submit PRs to each other, and then they come in and do something nefarious. And so it's hard. It's hard to find the measure. So I think we're looking more at how can we provide you tools so you can kind of choose what's best for you. And of course, we'll give you some standards. But the trust vector gets down to, I don't know, some version of human digital ID, like everyone's been talking about. Like, how do I prove that it's me...

Swyx [00:41:13]: Give me your eyeballs.

Kyle [00:41:14]: ...on the internet. Give me your eyeballs. Exactly.

Swyx [00:41:18]: I've got to keep moving on topics, but obviously I can go all day on this stuff because I've been involved in GitHub and open source my entire professional career. Stars. Very superficial. Everyone knows it. But I think time to one hundred thousand stars is the fastest I've ever seen. People just reach that in, I don't know, months. And then at the same time, I don't trust it, right? Like how many of these are real or bot or whatever. I don't know how to ask this, but what can we do about it? Like...

Kyle [00:41:49]: Just...

Swyx [00:41:49]: Is stars broken? Is stars fine?

Kyle [00:41:51]: I think that there's two pieces. Obviously we're constantly trying to find ways in which users are producing spam, which I would include only doing star gamification. When we find them, we pluck ’em out and we...

Swyx [00:42:08]: But it's like a Whac-A-Mole.

Kyle [00:42:10]: It's a hundred percent like a Whac-A-Mole.

Swyx [00:42:11]: There's no way.

Kyle [00:42:11]: Now, powered by AI to be helpful.
But I think more so what I'm seeing is that a lot of the fastest time to X tends to be because we're now inviting so many more people into software development on GitHub that the zeitgeist is just swarming. And it's...
Swyx [00:42:32]: It's not just developers anymore.
Kyle [00:42:33]: And it's not you and I. However you want to say what a developer is, it's not just folks who have been coding for a very long time. It's folks that have maybe started coding or only joined in since the AI era. And now...
Swyx [00:42:44]: What's the latest Octoverse number? I know eighty million was the last I remember, the number of developers on GitHub.
Kyle [00:42:50]: Oh, we're over 200 million now.
Swyx [00:42:53]: Okay. Well, so you see?
Kyle [00:42:55]: Over 200 million developers now.
Swyx [00:42:56]: But it's not developers, right? It's people with a GitHub account.

What Counts as a Developer in the AI Era?

Kyle [00:43:00]: So this is the biggest debate that, I would say, everyone loves to have at GitHub at this point. From my perspective, I think there's clearly a difference between a professional enterprise developer and then developers. But I think the idea that we should be, I don't know, splitting hairs or segmenting developers in the early era of software development is not worth the time. So...
Swyx [00:43:29]: When you get into gatekeeping...
Kyle [00:43:31]: 100%.
Swyx [00:43:31]: What is a developer?
Kyle [00:43:31]: 100%. 'Cause I wasn't a developer when I started writing code. I was going to...
Swyx [00:43:36]: Oh, no. I cloned a thing seven years before I learned to code. And then I wrote about my learning-to-code journey, and people just called me a fraud 'cause I had a GitHub account. And I'm like, “Well, no, I just used GitHub, but I didn't know what I was doing.”
Kyle [00:43:49]: I remember that. I remember those sets of posts, and that's b******t.
So I fight very clearly on the line of: if you create code, if you have an idea and you create it in some way, like “I'm going to run it and use the app right now,” you may still use AI in that moment, but that's okay. At some point you're going to do the next thing. You're going to have to learn about this database. You're going to fix a bug, whatever. We're all on the same journey, and those people are also hearing about the great new agent skill package or a new CLI tool or a new whatever. And those projects are going up because you want to be a part of this moment, just like I wanted to be a part of the Ruby community when Ruby was popping off when I started becoming a developer, and now I can just click the star button. And so I think that yes, there's clearly some amount of spamming and gamification that we're working against, but I really think we're just seeing this whole new cohort of folks that are moving from technology to technology because they're not working on a 20-year-old software application. They're working on a side app that they built on the weekend for their friends or for their new idea or whatever. And that's how you see these enormous charts going up and to the right with stars.
Swyx [00:44:59]: I think something that's remarkable is the persistence that GitHub extends to those folks. Usually when I see platforms go into a new audience, they have to have a second platform with a different name that wraps the main platform. But somehow GitHub has been able to persist and extend, and it's friendly and whatever. So it's nice.

Spark, Low-Code, and Always Showing the Code

Kyle [00:45:19]: That's partially why, I think, as we've tried to move into, I don't know, more low-code-y things... So we started working on Spark as a way to build an app and run it.
I think the reality is that anytime we put a veneer on top of something, we still always show you the code. That's kind of a tenet. We're never going to hide the code from you, ever, because what...
Swyx [00:45:52]: Why would you?
Kyle [00:45:52]: Yeah, that's the whole point. However, I think what we learned with things like Spark is that really the value of Spark for most devs is easy runtime. And you may have a runtime or a host that you're going to use for that, or you just build something and run it, but the package of making that even more simple isn't really needed for folks that are trying to build software and not just trying to build an app, which is a slightly different goal. So I want to get you in, I want to get you comfortable. I think the best thing for me, as someone that did not traditionally come into software dev way back, is that I want anyone to be able to breach that chasm and not be in the... I don't know, I feel like we're still in an era of STEM. I've got a 12-year-old and an eight-year-old, and it's “We've got to get 'em into STEM,” over and over. And I do the things that good parents do. I was like, “Oh, you want to do coding?” “Yes, I want to do coding.” Do coding classes. But now they're just not afraid of doing software. And that's, I think, the thing that's honestly kept me at GitHub for so long. Anyone should be able to go and build a thing, just like I can go change a light switch in my house. I'm not going to go into the breaker box, 'cause I'll probably kill myself. But I can go change that light switch. Everyone should be able to go and say, “This fricking app doesn't do what I want.
I want it to work like this.” And that, I think, is what's kind of kept us all connected with GitHub through the years, during the easiest of times or in the hard times, because of that opportunity: we're the home for all developers, and we want everyone to be able to have that feeling that we've had of “I had an idea, I created it, and holy s**t, here it is.”
Swyx [00:47:37]: Here it is. All right, I'm going to try to do more spicy questions.

GitHub's Hardest Scaling Moment: Growth, Agents, and Uptime

Kyle [00:47:42]: Great.
Swyx [00:47:42]: Is it an easy time now or a hard time?
Kyle [00:47:45]: Oh, at GitHub? It's a hard time. It's a hard time and also, I was just with my team and I said, “This is also the best and most exciting time that I think I can remember at GitHub.” Because...
Swyx [00:47:57]: Best of times, worst of times. It's never one...
Kyle [00:47:59]: 'Cause we were talking about Octoverse reports, and usually we do an Octoverse report once a year, and we look at the numbers, and we say, “Oh my goodness.” I was at Universe in October saying, “This was the fastest year of growth that we've ever had,” right? And now we're doing more in a month than we did in a year last year.
Swyx [00:48:20]: You're talking about PRs.
Kyle [00:48:21]: Commits.
Swyx [00:48:21]: Commits, yeah.
Kyle [00:48:22]: PRs. You name it. By roughly every measure that we're looking at, there's some amount of growth that is much bigger, and that is breaking our system in new ways, not old ways. Like, webhooks were always notoriously unreliable over the years.
Swyx [00:48:38]: Whose fault is that?
Kyle [00:48:39]: Not mine anymore, but for a period of time, I'm sure you could pull up a tweet that was “It was me. I'm sorry.” But now that got rewritten at a scale level that is still working and is not having problems today.
Now what we're finding isn't the simple stuff that folks on Twitter or on the internet are asking about: “Hey, why is this like this?” Sure. There are absolutely silly problems that shouldn't exist. But now we're talking about unique, novel permission problems that happen only at scale across all different objects or whatever, so that now we have to go rewrite this underlying system. And so there are problems that, yeah, caught us off guard, which I think I said. The growth is astronomical, but we're also making such material progress that I'm excited about what's going to be possible once we've reimagined the underlying foundation layer, or pieces of it at least, when it's not just all of us but all the new people that are being developers and all of their agents and all the tools working together. Because that'll still happen in that GitHub tool, that GitHub community. But it's a hard day anytime we can't give you what you're looking for. We have the same problem internally. We operate through github.com. Of course, we have backups for our own operations when things go down, but we feel it too. If it's not working, it's not working for us, and that's kind of the promise of dogfooding for GitHub. It's always been true. We're using the same tool you're using. We're not using a super secret version. And so we also need it to be great for us, for our customers, of course for open source. And now there's an exponential growth of agents doing it too.
Swyx [00:50:32]: I wanted to load this up for audio listeners who maybe haven't seen your tweets. So, one billion commits in twenty-five. Now it's two hundred and seventy-five million per week, on pace for fourteen billion this year if growth remains linear. Is that still the pace? I don't know.
It's been a...
Kyle [00:50:48]: It's speeding...
Swyx [00:50:50]: Roughly.
Kyle [00:50:50]: It's still speeding up.
Swyx [00:50:51]: It's April, so yeah.
Kyle [00:50:51]: Exactly. This was in April.
Swyx [00:50:53]: All right. So basically you have fourteen x growth, right? Year on year. And I think that's a scaling issue. I'm going to try to really steelman this thing. People have experienced fourteen x growth. They haven't had your downtime. Can we go dig into that? Why? What broke? What are we doing to fix it? Just anything for the community, to reassure them.

Why GitHub Reliability Is Breaking in New Ways

Kyle [00:51:18]: So, like I was saying, there are a couple of different places that we've seen the growth issues. Some of the growth issues, which is why I was talking about pushing hard on more CPUs, are in Actions in particular. More tools, more agents, more PRs mean more builds, and more builds mean more CPUs. And so we are expanding through not just our data center, but obviously we were talking about moving to Azure and adding additional cloud compute, because we simply need more CPUs. Not as much GPUs. We definitely need GPUs too, but now CPUs are becoming a factor.
Swyx [00:51:53]: It's very CPU heavy.
Kyle [00:51:54]: Underneath the hood, when it comes to some of the underlying services, we've been breaking up our database infrastructure over the years, so that we have more cognitive separation between the various services. The place that we continue to have pain is in permissioning. Right now many of our permissioning layers sit in a database that we internally call MySQL One, and old Hubbers will know what I'm talking about.
And so we've been pulling things out of MySQL One for many years, and we use Vitess and other technologies to shard, and we do it as one big...
Swyx [00:52:31]: Famous thing, PlanetScale was born from this and...
Kyle [00:52:32]: A hundred percent. Sam, old Hubber and friend. And so it's finding these opportunities to break this out and then do that globally. The other thing that I think is interesting, and both a unique opportunity and tricky, is that we also run everything I just talked about in a black box container with GitHub Enterprise Server for people that work on-prem. So we take everything I just said, and we also do it on-prem, and we also do all of that in a data residency setup for customers that need to have their data in a single location. Each of these has unique characteristics around how we're storing that data in MySQL or in a permissioning setup. That's where some of these outages have occurred, where you're seeing it more across the board rather than just the one piece...
Swyx [00:53:17]: Filling the database.
Kyle [00:53:17]: ...isn't quite working. Exactly. And so part of it is that. I think there have been some other places where... more projects appear to be moving towards monorepos, whereas we were going the other direction for many years in the industry. Repos were smaller, but there were more of them, and now we're seeing the opposite. Repos are bigger, and there are not fewer of them per se, 'cause there's new growth, but we're just seeing many more big repos. Big monorepos have always had a unique performance problem, because each one is slightly different, particularly if the underlying blobs inside the repos are incredibly big. And so we've done a ton of work that most people probably haven't experienced, unless you're in this case of the monorepo.
But that Git infrastructure layer improvement does help the overall system, because many of the improvements that make monorepos work better make all repo infrastructure work better. And so I could keep going down the line. Another thing is that we're changing how we do, I'll just say job queuing for lack of a better explanation, changing the underlying technologies there.
Swyx [00:54:32]: I spent two years being a job queuing guy, so.
Kyle [00:54:34]: And so it's a little bit piece by piece, and it's mostly because, as it was built, we built everything in a way that assumed, I guess in some ways, that the size of the pipe of work was going to remain the same. There were just going to be more people coming through each of those pipes. But instead, now, in places where a git push was generally a certain size, for example, that's no longer true.
Swyx [00:55:03]: Oh, yeah.
Kyle [00:55:03]: Or...
Swyx [00:55:05]: I push a thousand...
Kyle [00:55:06]: On the average. 100%.
Swyx [00:55:06]: A thousand-line commits, like, daily.
Kyle [00:55:07]: Same thing with PRs. And we've talked about optimizing that and making changes, and there were technology choices that did not work there. It got slow. It was not fast. It did not do what the users wanted. And so we've been reeling that all out and going, “Okay, that's just not right. Let's stop putting good money after bad and do it the right way now.” So it's a lot of things. When I've experienced scale at GitHub historically, it's almost always two options that we've used. We go vertical scaling, particularly with databases, right? And we go horizontal scaling. Oh, we just have more people using this service? Great. We're going to add more servers, and we rack them in our data center, or we use a cloud.
And now we're sort of in a diagonal, where vertical doesn't really work anymore. Horizontal doesn't work either, because we all have some CPU or GPU constraints in the world now, and now we have to go in and crack open services that have been running for 10 or 15 years and go, “Okay, the rules of this service have legitimately changed, and now we have to rewrite them.” None of this is an excuse. We have to do the work. We have to make it better.
Swyx [00:56:22]: Actually, as an infra guy, I'm like, “This is one of the most fascinating scaling challenges I've ever seen.”
Kyle [00:56:26]: That's the thing that's hard. When we weren't talking about it publicly, and I came out and said, “Hey, I just want to explain what's going on.” Part of it comes from a very old GitHub ethos, which is: it's our uptime. It's down. I know you're a developer, so you're inclined to want to understand more of what's going on. But at the same time, us going, “Hey, this service didn't perform the way we expected, and now we have to go change it,” we're not trying to hide anything from you i
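
Editor's sketch: earlier in this conversation Kyle describes maintainers encoding their own trust rule (accepted PRs on any project, a social handle tied to the GitHub account, and a minimum age for that social account). A minimal Python illustration of such a rule might look like the following. Every name and threshold here (`Contributor`, `passes_trust_rule`, the 365-day default) is hypothetical and chosen only for illustration; none of it is a GitHub API or a rule stated in the episode.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    """Hypothetical summary of a PR author (not a GitHub API type)."""
    login: str
    accepted_prs: int             # PRs merged across any projects
    has_linked_social: bool       # a social handle is tied to the account
    social_account_age_days: int  # age of that linked social account


def passes_trust_rule(
    c: Contributor,
    min_accepted_prs: int = 1,       # assumed threshold
    min_social_age_days: int = 365,  # assumed threshold
) -> bool:
    """Return True only when all three maintainer-defined signals hold."""
    return (
        c.accepted_prs >= min_accepted_prs
        and c.has_linked_social
        and c.social_account_age_days >= min_social_age_days
    )


established = Contributor("kyle", accepted_prs=12, has_linked_social=True,
                          social_account_age_days=2000)
fresh = Contributor("fresh-account", accepted_prs=0, has_linked_social=False,
                    social_account_age_days=0)
young_social = Contributor("new-social", accepted_prs=3, has_linked_social=True,
                           social_account_age_days=30)

for person in (established, fresh, young_social):
    print(person.login, passes_trust_rule(person))
```

Because the thresholds are parameters, each project can tune them, which is the "malleable to everyone's needs" point; as the conversation also notes, any single fixed threshold can be gamed by attackers who simply wait it out.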

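Editor's sketch: the commit figures quoted in the conversation (one billion commits in "twenty-five", read here as the year 2025, and 275 million commits per week now) can be checked with a few lines of Python. The only assumption beyond the quoted numbers is that the weekly rate stays flat for 52 weeks, which Kyle says understates things because the rate is still speeding up.

```python
# Figures as quoted in the conversation; the flat weekly rate is an assumption.
COMMITS_LAST_YEAR = 1_000_000_000   # "one billion commits in twenty-five"
COMMITS_PER_WEEK_NOW = 275_000_000  # "two hundred and seventy-five million per week"
WEEKS_PER_YEAR = 52


def annualized(weekly_commits: int, weeks: int = WEEKS_PER_YEAR) -> int:
    """Project a yearly total from a constant weekly rate."""
    return weekly_commits * weeks


projected = annualized(COMMITS_PER_WEEK_NOW)
growth_multiple = projected / COMMITS_LAST_YEAR

print(f"Projected commits this year: {projected / 1e9:.1f} billion")
print(f"Growth over last year: {growth_multiple:.1f}x")
```

The projection lands at roughly 14 billion commits, about fourteen times last year's total, which matches the "fourteen x" figure Swyx uses.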
Remotely Curious
Coming soon: Working Smarter season three

Remotely Curious

Play Episode Listen Later Jun 2, 2026 2:17


Modern work can be frustrating and chaotic—if you don't have the right tools. From context engineering to multimodal search, go behind the scenes and hear how Dropbox engineers are building AI that actually understands you, so you can focus on the work that matters most. If you're new to Working Smarter, we've travelled from the F1 track to the bottom of a lake, and heard real stories from chefs, doctors, lawyers, and founders about how AI is helping them do more of what they love about their jobs. But in our third season, we're talking to the people behind the tools—the engineers and product leaders building helpful, time-saving AI features into the Dropbox experience you already know and trust. You'll hear all about their work on agents, inference, security, and, of course, how the people building AI use AI themselves. ~ ~ ~  Working Smarter is brought to you by Dropbox. Find, organize, and share your work—all in one place—with context-aware AI from Dropbox. You can listen to more episodes of Working Smarter on Apple Podcasts, Spotify, YouTube, Amazon Music, or wherever you get your podcasts. To read more stories and past interviews, visit workingsmarter.ai This show would not be possible without the talented team at Cosmic Standard: producer Ben Montoya, sound engineer Aja Simpson, technical director Jacob Winik, and executive producer Eliza Smith. Special thanks to our illustrator Fanny Luor, marketing consultant Meggan Ellingboe, and editorial support from Catie Keck.  Our theme song was composed by Doug Stuart.  Working Smarter is hosted by Matthew Braga. Thanks for listening!

Azure Friday (Audio) - Channel 9
Anyscale on Azure: Scale Python AI workloads with managed Ray on AKS

Azure Friday (Audio) - Channel 9

Play Episode Listen Later Jun 2, 2026


Scott Hanselman talks with Omar Shorbaji from the Anyscale engineering team about how Anyscale on Azure scales Python AI workloads from a single notebook to thousands of CPUs and GPUs. Built on Ray, the most widely adopted AI compute engine, Anyscale gives you a unified runtime to build, train, and serve, running directly on Azure Kubernetes Service without the complexity of managing Kubernetes. See a live demo that fine-tunes a vision-language-action robotics policy, with the metrics you need to push GPU utilization higher.
Chapters
00:00 - Introduction
00:52 - Ray and the Anyscale platform
03:11 - Start of demo: Workspaces
04:38 - Running a job and viewing utilization metrics
05:24 - Choosing the right scale
06:53 - Abstracting Kubernetes on AKS
08:53 - Wrap up and where to learn more
Recommended resources
Learn Docs
Anyscale on Azure
Connect
Scott Hanselman | Twitter/X: @SHanselman
Anyscale | Twitter/X: @anyscalecompute
Azure Friday | Twitter/X: @AzureFriday
Azure | Twitter/X: @Azure

The Construction Corner
#432 - The Infrastructure Behind AI: Power, Data Centers & the Future of Construction

The Construction Corner

Play Episode Listen Later Jun 2, 2026 28:59


In this episode of Construction Corner, host Dillon breaks down the massive infrastructure transformation sweeping America and what it means for the construction industry.
Dillon covers:
The power grid under pressure: Why the surge in data centers and reshoring of manufacturing is driving unprecedented investment in utility infrastructure, and how FERC regulates rate increases tied to capital spending.
Data center geography: Which states are winning the data center race (Northern Virginia, Texas, Eastern Oregon, Nevada, Arizona) and why California keeps losing out to regulation and permitting challenges.
The $700 billion AI buildout: How the four hyperscalers (Microsoft, Google, Amazon, Meta) are committing historic CapEx, what a gigawatt-scale facility actually costs, and why supply chain, not concrete, is the real bottleneck.
Behind-the-meter power: Why major data center operators aren't waiting on utilities and are standing up their own generation (gas turbines, solar, and even small modular nuclear reactors) to turn on racks faster.
Battery energy storage at scale: How megawatt-hour battery systems are being deployed at data centers to smooth load swings, support the grid, and reduce utility dependency, and why this is very different from a home Powerwall.
The AI compute race: Why demand for GPUs shows no signs of slowing, how Anthropic's revenue explosion illustrates real consumption, and why this infrastructure build likely runs for at least five more years.
Construction is hyper-local: A reminder that no matter how big the macro trends are, your personal economy in construction is defined by the geography and relationships where you operate.
Whether you're in the trades, engineering, or just trying to understand where the industry is headed, this episode gives you a ground-level view of the biggest construction wave in a generation.

Techmeme Ride Home
Interviewing For A Job At Anthropic? DON'T Use AI.

Techmeme Ride Home

Play Episode Listen Later Jun 1, 2026 21:46


Nvidia unveiled the RTX Spark, an Arm-based consumer chip family built with MediaTek on TSMC 3, plus a DGX Station desktop that runs 1T-parameter models. Intel detailed its Crescent Island GPUs, MiniMax launched a coding model rivaling Opus 4.7 at 1/40th the price, and Anthropic bans AI in interviews. Nvidia announces the RTX Spark, an Arm-based consumer chip family it calls "the most efficient PC chip ever built", made on TSMC 3 in partnership with MediaTek (The Verge) Intel details its Crescent Island data center GPUs, built on its Xe3P architecture and using LPDDR5X memory instead of HBM, calling them "built for agentic AI" (Tom's Hardware) Nvidia unveils DGX Station for Windows, a desktop PC powered by a GB300 Grace Blackwell chip with up to 748 GB of memory, capable of running 1T-parameter models (SiliconAngle) Chinese AI developer MiniMax debuts M3, a new coding model that it says rivals Claude Opus 4.7, costing $0.12 per 1M input tokens, compared with $5 for Opus 4.7 (The Information) A look at Anthropic's hiring process, which prohibits AI use in interviews and features a culture interview that candidates describe as highly intense (Bloomberg)

This Week in Startups
This Startup Fused Human Brain Cells with Silicon Chips | E2295

This Week in Startups

Play Episode Listen Later Jun 1, 2026 66:16


This Week In Startups is made possible by:Deel https://deel.com/twistQuo https://quo.com/TWiSTLinkedIn Jobs https://LinkedIn.com/twistToday's show:Cortical Labs is the world's first company selling biological computers. Their CL1 fuses lab-grown human neurons (derived from stem cells, not actual folks) with silicon hardware to create Synthetic Biological Intelligence (SBI).Founder Dr. Hon Weng Chong walks us through how the system works and why neurons are more efficient than GPUs at reinforcement learning. (Also… is this computer alive?)PLUS Pyka co-founder and CEO Michael Norcia explains the various uses for his autonomous aircraft, from crop-spraying drones in Brazil to a a hybrid-electric defense UAV for the military.Guests:Cortical Labs: ****https://corticallabs.com/Dr. Hon Weng Chong on X: https://x.com/dr1337Pyka: https://www.flypyka.com/Pyka on Instagram: https://www.instagram.com/flypyka/?hl=enFurther Reading:2022 Pong paper in Neuron: https://www.cell.com/neuron/fulltext/S0896-6273(22)00806-62017 Paper: “Attention is All You Need”; https://arxiv.org/abs/1706.03762The “Barista Test” for Artificial Intelligence: Chris Rourk: https://medium.com/predict/the-turing-test-is-so-last-century-the-barista-test-for-artificial-general-intelligence-faf91034fa8cNotable Links:Playing “DOOM” on CL1: https://www.youtube.com/watch?v=yRV8fSw6HaEDayOne Data Center: https://dayonedc.com/NeurIPS 2026 Conference: https://neurips.cc/Neuralink: https://neuralink.com/CliniCloud Digital Stethoscope and Thermometer: https://www.design-industry.com.au/clinicloudAir Force Research Laboratory (AFWERX): https://afwerx.com/Joby Aviation: https://www.jobyaviation.com/Prime Movers Lab: https://www.primemoverslab.com/Timestamps:0:00 What is "biological computing"?2:49 Cortical's new $30 million raise4:15 The world's first biological data center9:48 Deel - Founders scale faster on Deel. 
Set up payroll for any country in minutes, hire anyone anywhere, get visas handled fast, and get back to building. Visit https://deel.com/twist to learn more.10:51 Biological computers have a learning advantage19:43 Quo (formerly OpenPhone) - Quo gives you a clean, modern way to handle every customer call, text, and thread all in one place. Try it free at https://quo.com/TWiST29:15 LinkedIn Jobs - Hire right, the first time. Post your first job and get $100 off towards your job post at https://LinkedIn.com/twist38:46 From paper airplanes to Group 4 UAVs52:20 Introducing the DropShip defense drone58:28 How regulations block US drones1:00:40 Why Pyka builds everything in-houseSubscribe to the TWiST500 newsletter: https://ticker.thisweekinstartups.comCheck out the TWIST500: https://www.twist500.comSubscribe to This Week in Startups on Apple: https://rb.gy/v19fcpFollow Lon:X: https://x.com/lonsFollow Alex:X: https://x.com/alexLinkedIn: ⁠https://www.linkedin.com/in/alexwilhelmFollow Jason:X: https://twitter.com/JasonLinkedIn: https://www.linkedin.com/in/jasoncalacanisCheck out all our partner offers: https://partners.launch.co/Great TWIST interviews: Will Guidara, Eoghan McCabe, Steve Huffman, Brian Chesky, Bob Moesta, Aaron Levie, Sophia Amoruso, Reid Hoffman, Frank Slootman, Billy McFarlandCheck out Jason's suite of newsletters: https://substack.com/@calacanisFollow TWiST:Twitter: https://twitter.com/TWiStartupsYouTube: https://www.youtube.com/thisweekinInstagram: https://www.instagram.com/thisweekinstartupsTikTok: https://www.tiktok.com/@thisweekinstartupsSubstack: https://twistartups.substack.com

ChinaTalk
The Pope has AI Takes

ChinaTalk

Play Episode Listen Later Jun 1, 2026 50:50


Pope Leo has called AI the single greatest challenge facing humanity. Not war, not poverty, not climate change. So we got a panel together to sort out what this encyclical means. Joining Jordan are Tim Hwang, deputy director of the Institute for Christian Machine Intelligence, John-Clark Levin of Kurzweil Technologies, and ChinaTalk's resident Catholic, Aqib Zakaria.
We discuss…
Why the encyclical's claim that AI cannot truly "understand" is a narrow theological term of art, and why that nuance gets lost on Twitter
Pope Leo's call to "disarm AI" and the Holy See's potential role mediating between the US and China and speaking for the global South
Tim's pitch for a Vatican alignment lab that buys GPUs and tries to beat Anthropic's benchmarks from Christian first principles
Why frontier-lab researchers, including non-believers, are treating the Pope as a moral coordinating signal
How Anthropic drifting from deontology toward virtue ethics in training Claude looks like a validation of the Christian approach
The provocation underneath all of it: is the American AI stack a Christian AI stack?
pope as chicago footwork: https://suno.com/s/1Qb9Ce3Bh6saeF2V

The Wolf Of All Streets
Bitcoin CRASHES Below $72K As Saylor Sells For The First Time

The Wolf Of All Streets

Play Episode Listen Later Jun 1, 2026 62:17


Bitcoin is teetering near $72,000 as the Iran war heats back up, with Trump claiming Tehran "really wants" a deal while air strikes resumed over the weekend near the Strait of Hormuz, sending Brent crude up 3.7% to $94.48 and WTI surging 4.3% to $91.07. A tentative 60 day memorandum of understanding would reopen the Hormuz chokepoint with unrestricted shipping and require Iran to clear all mines within 30 days, but the deal still awaits Trump's final approval and Iran's response. Meanwhile Coinbase is launching direct rupee rails in India on June 1 to attack the $3 billion local crypto market, Fed Governor Christopher Waller declared dollar stablecoins could expand the reach of U.S. monetary policy globally, and Jamie Dimon just vowed JPMorgan and the banking lobby will fight the CLARITY Act over stablecoin yield. Plus Michael Burry dropped a bombshell calling the Nvidia, xAI, Apollo, Athene structure "Fugazi", alleging $5.4 billion in GPUs are hidden off balance sheets while American retirees unknowingly hold $103 billion in Level 3 assets at 16x leverage inside a Bermuda insurance shell. We are breaking down whether Bitcoin can survive another Hormuz spike, what Waller's stablecoin endorsement means for the dollar, and why Burry's warning could be the most dangerous story nobody is talking about.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

We're announcing AIEWF speakers this week! Take the AI Engineering Survey!

Today's guest Ethan first joined us for the LS Paper Club as the lead on NVIDIA Cosmos World Model, but then joined xAI and built Grok Imagine in 3 months.

He comes back on Latent Space with some nuclear hot takes: that Video Models primarily get their intelligence from LLMs, not from training on video data, and that the next frontier for truly interactive, realtime, long-horizon world models is to work on LLMs (perhaps Interaction Models as well…)

Put it this way: In the near term, the next Sora won't be a better video model, but a video agent.

Generative Media may more closely follow the evolution of AI coding which went from focusing on one-shot output performance and cost, to multiturn reasoning and planning models for agents and systems that can plan, edit, test, debug, and submit PRs.

At a certain point, coding models got so good that the only significant next step to improve performance was handling the orchestration of these models.

Now as the performance of video models increases significantly across realism, consistency, & prompt adherence while becoming more cost efficient, the next evolution of video generation may also be systems that can plan, generate, edit, critique, and iterate across an entire creative task.

In this episode, Ethan joins swyx and Vibhu to unpack what it actually takes to build frontier image and video systems: data, VAEs, diffusion transformers, audio-video alignment, inference speedups, and the hidden cost of storing and moving massive video datasets.
From building NVIDIA's Cosmos world model to joining xAI as Grok Imagine was being built from zero to one, Ethan He has been at the center of some of the most important work in video generation, multimodal models, and real-time world models.

We go deep on Grok Imagine, how a small xAI team shipped its first multimodal video model in three months, why iteration speed matters more than almost anything in model development, and why many of the biggest gains come from fixing tiny bugs in data and training pipelines.

Flipbook: The future of Videomaxxing

Video agents are almost a sure bet to be the trend in the coming year. We end with a glance at what's beyond video agents:

Flipbook caused a minor sensation this year when it was released, but most treat it as a fun demo. Ethan takes it very seriously: with the speed and cost of inference coming down every year, the future of custom video JIT UI is closer than you think. We talked about why videogen models may become the front end of AI, how generative UI could replace traditional HTML/CSS, why world models need to be real-time, interactive, and long-horizon, and why the future of video generation may depend more on language models and agents than on diffusion alone.

We discuss:
* Why fast iteration mattered more than meetings
* Why small training bugs can drive huge model quality gains
* Why coding models may make compute the bottleneck again
* How image and video models are trained with synthetic captions
* The role of VAEs and latent space in frontier video models
* Why image models are the foundation for video models
* The tradeoff between temporal compression and real-time interactivity
* Flipbook, Neural OS, and the future of generative UI
* Why future interfaces may go from user intent to pixels
* The hidden cost of training video models: storage, egress, and GPU hours
* How step distillation and consistency models (like OpenAI sCM) make video inference orders of magnitude faster
* Grok Imagine 0.9 and large-scale audio-video generation
* Why audio-video alignment is harder than text-video alignment
* Ethan's definition of world models
* Reference-to-video, video extension, and long-context video generation
* Why xAI's research communication undersells Grok Imagine
* How xAI culture shaped the speed of development
* AI watermarking, SynthID, and detecting generated media
* Why prompt rewriting matters for video models
* Grok Imagine Agent and the rise of video agents
* Why language models may unlock better video generation
* Robotics, physical AI, and embodied world models
* Why Ethan left xAI and shifted focus toward LLMs
* Self-managed context, memory, and the next frontier for language models

Ethan He
* LinkedIn: https://www.linkedin.com/in/ethanhe42
* X: https://x.com/EthanHe_42

Timestamps
00:00:00 Introduction
00:01:25 From NVIDIA Cosmos to xAI
00:03:24 Building Grok Imagine from Zero to One
00:10:07 How Image and Video Models Are Trained
00:18:53 Video Compression, VAEs, and Real-Time Tradeoffs
00:22:10 Generative UI, Flipbook, and Neural OS
00:32:10 The Cost of Training Large Video Models
00:37:04 Distillation, GANs, and Fast Video Inference
00:41:21 Audio-Video Generation and Grok Imagine 0.9
00:48:34 What Makes a World Model?
00:55:51 Reference Videos, Long Context, and Video Memory
01:00:11 xAI Culture, Research, and First-Principles Building
01:09:45 AI Safety, Watermarking, and Prompt Rewriting
01:13:10 Video Agents and AI-Assisted Creation
01:27:32 Why Language Models Unlock Better Video
01:31:15 Robotics, Physical AI, and Embodied World Models
01:32:38 Why Ethan Left xAI
01:34:16 Self-Managed Context and the Future of LLMs
01:38:43 Ethan's Career Path and Closing Thoughts

Transcript

Introduction: Ethan He, Latent Space, and the Path to xAI

Swyx [00:00:00]: We're here in the studio with Ethan He, most recently of xAI. Welcome.
Ethan [00:00:10]: Thank you. Glad to be here.
Swyx [00:00:11]: We're also here with Vibhu.
You first came to us, or joined the Latent Space world, because you were working on Cosmos at NVIDIA, and you did a paper. We loved it. You presented it as well, so thank you for doing that.

Ethan [00:00:23]: Actually, I also presented the MoEs twice at Latent Space.

Swyx [00:00:29]: How did you actually hear about us? Did we reach out to you? Is that how it worked?

Ethan [00:00:33]: No, actually, the community. I realized, oh, there is this online community where people talk about AI and also learn from each other through papers every week through the Paper Club. It's very nice.

Ethan [00:00:49]: I learned a lot.

Swyx [00:00:49]: I think three years nonstop. We haven't stopped even on Christmas and New Year's. Many weeks I want to stop, but it keeps going.

Vibhu [00:00:58]: No, that was good. I think you had posted that you worked on a paper, and I was like, “Oh, very cool. We have Paper Club. Present then.”

Vibhu [00:01:04]: But I might have reached out to you after.

Swyx [00:01:05]: Because it's an amateur club, right?

Swyx [00:01:08]: So it's very unusual, but we sometimes have paper authors come by and actually explain the paper. Today we just did the Poolside paper, which was apparently very good.

Vibhu [00:01:18]: Came out yesterday.

Vibhu [00:01:19]: Pretty interesting, right? Fully open. They talk about everything, systems. So it's a good one. We'll recommend people read it.

Swyx [00:01:25]: Bring us up to speed on your transition to xAI, 'cause I actually don't even know when you joined. Just tell the story of the transition.

From NVIDIA Cosmos to xAI: Scaling Video and World Models

Ethan [00:01:34]: Before xAI, I was working on the Cosmos world model at NVIDIA. Cosmos is a giant video foundation model that aims to simulate the world, and it serves as a foundation for all of the roboticists to build on top of.
There, once I built Cosmos 1, I realized that this thing also has a scaling law similar to language models, so we need to scale up the video models further. That's why I realized I needed to move somewhere with much more compute resources. That's how I...

Swyx [00:02:13]: Than NVIDIA?

Vibhu [00:02:14]: The GPU rich came themselves.

Vibhu [00:02:19]: And timeline-wise, when was Cosmos? It was pretty early, right? It was an open world model, open paper, everything.

Ethan [00:02:25]: It was the end of 2024.

Vibhu [00:02:28]: End of 2024.

Ethan [00:02:30]: Then in mid-2025, I moved to xAI. I joined about the time when xAI was about to build video models and multimodal models. There was no infra, no data, and no model, and as just a few engineers, we built it in three months and released the first model, Grok Imagine 0.9.

Ethan [00:02:55]: Since then, I kept working on video models and moved more from training to post-training of the video models. For example, reference-to-video, kind of like the cameo feature, and video extension. And before I left, I worked on a world model, leading a small team focused on real-time, long-horizon video generation.

Building Grok Imagine From Scratch in Three Months

Swyx [00:03:24]: Can you give a rough roadmap? Okay, you're on a brand-new team. Grok previously was only text, or they partnered with BFL for their image gen stuff. What are the building blocks, right? You have compute, data you can procure somewhere. What is the sequence of things that people should think about when setting up a new team?

Vibhu [00:03:43]: Actually, even deeper, not just data you can procure. You guys had to go through getting the data too, right?
So you shipped it pretty fast, but yeah

Swyx [00:03:51]: Three months is like

Vibhu [00:03:52]: From everything

Swyx [00:03:52]: actually very surprisingly fast.

Ethan [00:03:55]: One thing I'd say is thanks to my experience at NVIDIA, 'cause the first time, when we were building Cosmos together, we built it for about a year. So this is the second time I've done it, and I roughly have an idea of what to do. I'd say the most important thing is the talent. Everyone was very strong and clever, and very close with each other, working towards a common goal. That speeds things up a lot. You reduce the communication bandwidth needed among people, and everyone can work towards the same goal. Every day there are not that many meetings on the calendar, maybe a sync a day, and after that it's just all building. It was pretty fun at that time.

Ethan [00:04:47]: Another thing is that xAI has very strong foundations of data inference and model inference, and the support there helps model development a lot. When I look at training models, actually the most important thing is how many iterations you can do per day. The more iterations you can do, the faster you can train the model. So if you have very strong infra and a lot of compute, you can train these models in a very short period of time. That gives you a much larger buffer for errors, and it also gives you the opportunity to spot more bugs.

Iteration Speed, Compute, and Debugging Model Pipelines

Swyx [00:05:46]: What is an iteration? Is it a few hundred steps, or what are you

Ethan [00:05:50]: Let's say just training the model: acquire new data, maybe design new algorithms, and train a new model, maybe at smaller scale, or

Swyx [00:06:01]: So cycle time for any hyperparam that you're searching.

Ethan [00:06:04]: Cycle time, and then to eval this model.
Is this model better than my previous iteration?

Ethan [00:06:11]: So

Swyx [00:06:11]: So before you, someone had already set this up so that you could iterate very quickly.

Ethan [00:06:15]: I think the foundation there is extremely good for developing and researching models.

Ethan [00:06:23]: And often I find, this is kind of boring, but a lot of the improvements do not come from new algorithms. They come from finding small bugs here and there in the data pipeline and in the model training pipeline. Those give the biggest boost to model quality.

Vibhu [00:06:46]: It's interesting, right? You say it's a small team, less communication bandwidth, but also a lot of quality comes from finding little bugs. It seems counterintuitive, right? With a lot of people, you can iron out more of those, but it's interesting to see the other side, right?

Swyx [00:07:00]: I also wonder, do you try using LLMs to look for bugs? I don't know.

Ethan [00:07:05]: I remember at that time it was mid-2025, so the coding models weren't quite there yet. I remember by December 2025 they were extremely good. Yeah, I'd been using them at that time. They were helpful. Sometimes they produced code that was kind of difficult to maintain, even though the first time they built something extremely fast. It gave me spaghetti code, thousands of lines that I couldn't maintain, and the LLM itself couldn't figure out what was wrong and how to improve on top of it. But now I find it much better. I want to bring up another point here: now coding models are much more efficient and can help us implement stuff much faster. Compute might become a bottleneck again, because previously, if you wanted to train a new model, say you wanted to generate new synthetic data or write a new algorithm, it might take a few weeks.
And during that period of time, you might not have experiments to run. But now you can build that thing within a few hours, and then you can immediately train a model.

Ethan [00:08:24]: Now you have to have enough compute to try all of the ideas. So compute might be the bottleneck of iteration speed again.

Swyx [00:08:36]: Yeah, honestly, I think it's kind of a stressful job, because you're like, “Well, I should be trying everything, and if I'm not, then I'm not doing my job well.”

Vibhu [00:08:48]: There's also the stress that you're eating thousands of GPUs per hour, which is very expensive, and that compute could go to other researchers.

Swyx [00:08:56]: You got the daddy Elon to

Vibhu [00:08:57]: You got daddy Elon.

Ethan [00:08:59]: It was

Vibhu [00:09:00]: But there's still a finite amount of compute. You want to use it, you want to use it well, you want more of it.

Ethan [00:09:06]: That was quite stressful indeed. I think one thing is that with coding models now, a lot of these jobs can be automated, which is much better. Second, it's a marathon, so you've got to maintain good health and a regular schedule.

Vibhu [00:09:28]: It's hard to hear that when you ship from zero to one in two months.

Swyx [00:09:32]: And I think obviously the culture at xAI is, very famously, that people work very hard. One thing I did want to dive into: in the notes that you sent ahead of time, you had specific comments about the cost of video gen training. Presumably this is on Colossus 1, right? The two hundred megawatt cluster. Whatever you want to share on that.

Vibhu [00:09:54]: I think there are three things we're talking about, right? There's video gen, and there's also the image gen model that you put out. Do you want to complete the, okay, so zero to one, you have a few months.
Just what are the stages of creating an image gen model?

Swyx [00:10:06]: Oh, yeah, maybe I got distracted.

How Image and Video Models Are Trained: Synthetic Captions, Tokenizers, and VAEs

Vibhu [00:10:07]: Sorry. And then from there, there's video gen, there's audio gen. Would love to get into those next. But what are those first few months like? Small team, a lot of bugs, iterations, but what does it look like? Do we take something off the shelf? Do we just get data and compute? What are the few months like? How do you get to a state-of-the-art image gen model? How do you just start?

Ethan [00:10:28]: I cannot comment specifically on how xAI did it, but it's quite a standard process. I can draw some examples from Cosmos. Mainly, to build a video model, you actually need to build an image model first. And to build these two models, the data you need is a hundred percent synthetic pairs of language and image, or language and video. Because on the internet, videos don't naturally associate with text. You can say, oh, on YouTube you have the title and the description and the comments

Swyx [00:11:11]: Title

Ethan [00:11:11]: of a video, but usually they're not relevant to the video itself. Say the video is a natural scene of mountains or something, and the title is “I'm so happy today.”

Ethan [00:11:26]: So they have no correlation at all. So the first step is you have to generate synthetic pairs of language with the videos. You gather videos from the internet, and you use a VLM to caption the videos. For that part, here's a question: how do you get a VLM to begin with? So if there's no

Swyx [00:11:55]: You, so you fuse the model, right? Like

Ethan [00:11:57]: Say no VLM exists, how do you generate the text in the beginning, right?
It's impossible.

Swyx [00:12:04]: I see.

Ethan [00:12:05]: In the beginning, you ask humans to describe the video in as much detail as possible. For example, you ask them to describe everything: all objects, all characters, and all interactions and dialogue in the videos. That's in the protocol of Cosmos labeling. The objective we gave to the labelers was that you have to describe the video in as much detail as possible, such that a blind person who hears the blob of text can reconstruct what the video is like in their head.

Swyx [00:12:43]: Video or image? You're talking about images.

Ethan [00:12:44]: Video or image, either one of them.

Vibhu [00:12:47]: This was pretty common when we went from CLIP and DALL-E, right?

Vibhu [00:12:51]: It's all training on really detailed captioning of images. So the same is applied to video, but instead

Ethan [00:12:57]: Same applied

Vibhu [00:12:57]: of using a multimodal model to pass in video images and write rich descriptions, you can also

Swyx [00:13:04]: I think there's this traditional perspective of supervised, very highly human-curated data. I feel like there's an unlock with unsupervised, right? Where you have enough to bootstrap that you can just throw common corpus on it or whatever. Unsupervised vision and language pairing, right? Where you just have interspersed image and text and it just learns. To me, that is the VLM breakthrough that is different from CLIP, different from the LM era.

Ethan [00:13:36]: It's interesting to see that you kind of need both kinds of data.

Ethan [00:13:41]: For example, for the

Swyx [00:13:41]: You need it to bootstrap it up. Yeah

Ethan [00:13:43]: for generative model training, there's also usually a small percentage of unlabeled data. So the model is instructed to generate a video without any text instruction. That can also help the model generalize.
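As a rough illustration of the data mix Ethan describes here (mostly captioned clips, plus a small share with no text at all), the sketch below blanks the caption for a random fraction of samples. The function name `build_training_pairs`, the 10% default, and the use of an empty string as the "no text instruction" marker are assumptions made for illustration; none of them come from the conversation.

```python
import random


def build_training_pairs(clips, captions, unconditional_fraction=0.1, seed=0):
    """Pair each clip with its caption, blanking a random share of captions.

    Hypothetical helper: the fraction and the empty-string convention are
    illustrative assumptions, not values stated in the interview.
    """
    if len(clips) != len(captions):
        raise ValueError("clips and captions must have the same length")
    rng = random.Random(seed)  # seeded so the split is reproducible
    pairs = []
    for clip, caption in zip(clips, captions):
        if rng.random() < unconditional_fraction:
            caption = ""  # this sample is trained with no text instruction
        pairs.append((clip, caption))
    return pairs


if __name__ == "__main__":
    clips = [f"clip_{i:04d}.mp4" for i in range(1000)]
    captions = [f"synthetic caption for clip {i}" for i in range(1000)]
    pairs = build_training_pairs(clips, captions)
    blank = sum(1 for _, text in pairs if text == "")
    print(f"{blank} of {len(pairs)} samples carry no caption")
```

The clip identifiers and captions in the usage section are placeholders; in practice the captions would come from the human or VLM labeling step described above.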
So after this stage of generating synthetic pairs, one important common step is to train a compressor, or tokenizer, for the images or videos. Theoretically, you can train image or video models on pure pixels, but the problem is that it's a lot of tokens. One image that's a thousand by a thousand is one million pixels, so one million tokens. It's impossible to train a transformer on that. So you need to train a tokenizer, which can go from image to latent space and from latent space back to image.

Swyx [00:14:45]: That's why we named the podcast.

Swyx [00:14:48]: But basically, you're talking about vocabulary size.

Ethan [00:14:50]: So vocab.

Swyx [00:14:51]: And so, a million is impossible?

Ethan [00:14:54]: In generative models, the vocab is continuous. It's a continuous space. You can think of it as mapping an image to a vector. It's a fixed-length vector, of dimension sixteen or forty-eight, something like that. And then you map that vector back to the image space. And the mapping is patch-based. So say you have

Ethan [00:15:22]: a sixteen-by-sixteen patch, and you map that patch of pixels into this latent space.

Swyx [00:15:29]: We've covered this

Vibhu [00:15:30]: This is like the vision transformers

Swyx [00:15:32]: VAEs,

Ethan [00:15:33]: VAEs.

Vibhu [00:15:34]: You basically compress your input, you do your generation and your reasoning in a smaller dimension, and then you project back out.

Swyx [00:15:43]: A VAE is a form of compression, but for me, the patching thing is from ViT, right?

Ethan [00:15:48]: You can make those.

Swyx [00:15:49]: Literally, yeah, the paper is titled something like sixteen by sixteen is all you need. Something like that.
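The token-count argument above can be checked with a little arithmetic. The sketch below is a hedged illustration only: it assumes a 1024 by 1024 image (rounded up from "a thousand by a thousand" so that it divides evenly), non-overlapping 16 by 16 patches, and one latent token per patch. The helper names are hypothetical, and the learned encoder that maps each patch to a short continuous vector is not shown.

```python
def latent_token_count(height, width, patch=16):
    """Count patch tokens, assuming non-overlapping square patches."""
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    return (height // patch) * (width // patch)


def patchify(image, patch=16):
    """Split a 2-D list-of-lists image into flattened patches, row-major.

    Each returned patch is what a learned encoder (not shown here) would
    map to one short continuous latent vector.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            patches.append([
                image[row][col]
                for row in range(top, top + patch)
                for col in range(left, left + patch)
            ])
    return patches


if __name__ == "__main__":
    side = 1024  # assumption: power-of-two stand-in for "a thousand"
    pixels = side * side
    tokens = latent_token_count(side, side)
    print(f"pixels: {pixels}, patch tokens: {tokens}, reduction: {pixels // tokens}x")
```

Under these assumptions the sequence shrinks by the patch area (256 times), which is the gap between a sequence a transformer cannot realistically train on and one it can.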
And then I think also, people make a lot of comparisons between this kind of patching and convolutions.

Swyx [00:16:02]: Which is, you're kind of reconstructing the old paradigm with the new.

Ethan [00:16:05]: Actually, in VAEs, there are both convolutional networks and transformers. You can actually do both.

Ethan [00:16:14]: After this VAE, what you've got is latent space tokens and language tokens. Now, the training of the diffusion transformer (generative models usually use diffusion transformers) is actually quite standard. It's very similar to how you train a language transformer model. There's not that much difference. It's just visual tokens in, visual tokens out. The only difference is that there's a denoising process. You add random noise to the visual tokens, and then you train the model to remove that noise and recover the clean tokens. At inference, the model can iteratively remove noise, starting from a hundred percent noise.

Swyx [00:17:12]: And then, to speed things along on the tech tree of diffusion, there's CFG, and there's also latent diffusion somewhere in there. I think, somewhere along the line, obviously Stability and all these other guys pioneered a lot of this architecture. I don't know if you want to get into that or just do the video side, up to you.

Bootstrapping Video from Image Models and Temporal Compression

Ethan [00:17:37]: After you train such an image model, the reason it's a foundation for video models is that image models are cheaper to train, and they have a much denser connection between language and images. For example, you train on a billion images, and there's a mapping from the text to the image.
And the cost to train on the same billion text-to-video pairs is much higher, because videos naturally have more tokens than images. The diffusion models' understanding of language comes purely from this mapping. So if you don't have enough of this mapping, if you only train on ten million videos or something, you might not see enough language tokens in your training, so your model does not understand human intention well enough. That's why you first train the image diffusion model, and then you bootstrap the video model from there.

Swyx [00:18:53]: One thing I did want to ask, because I think you're the first video model person I've ever talked to. We've talked to Luma and all those folks. There are all these tricks in video compression where basically, frame by frame, there's not that much difference, so you don't have to regenerate or save the whole frame, right? I think MP4 compression or something like that.

Swyx [00:19:16]: Is it tempting to use that? Or, as far as I can tell, everyone just treats it as, “No, we would just generate every frame.” Is that roughly the state of the art?

Ethan [00:19:27]: There are a few different approaches. Let's say first you want to directly use MP4 compression and use that as the tokens for the transformers to train on, right? People have actually tried that, but the main challenge is that the latent space of the MP4 tokens is not very comprehensible for the models. It's extremely hard to train on that. And there's a

Ethan [00:20:01]: So that's why they created VAEs, which create a more continuous latent space, so the models can understand that latent space and learn from it much more easily. Even within VAEs, latent spaces differ in how hard they are to learn from.
So you can imagine the simplest, most naive VAE: you have an image, and you just shuffle all of the image's pixels into a vector. So you don't need to train any VAE, right? But that latent space is extremely hard for models to train on top of. That's why there is some debate on how you compress the tokens. You mentioned you can compress frame by frame. You can also compress the temporal dimension.

Ethan [00:20:52]: The difference is that if you compress the temporal dimension, you get a much higher compression rate, because there's temporal redundancy between frames: this frame and the last frame are likely mostly similar, with only some small differences. For example, I think in the Wan 2.1 VAE, they have an eight-by-eight-by-four compression rate. So four frames along the temporal dimension are compressed into one token. That saves a lot of context length. If you do it frame by frame, you have to do maybe eight by eight by one, and your context length will be four times larger. That being said, the benefit of per-frame compression, and we might come back to this later, is real-timeness and interactivity. 'Cause if you stream the output of the model frame by frame, the model can respond to any user request immediately. If you have four-times temporal compression, then

Swyx [00:22:06]: It might be laggy

Ethan [00:22:07]: there's a lag there by nature.

Swyx [00:22:10]: So you're very pilled on this. Let's just go ahead and bring it up, 'cause we have the visual prepared anyway. There are some frontier applications of real-time video gen. Flipbook is one of the examples that went viral recently, right? What is Flipbook?

Real-Time Generative UI: Flipbook, Neural OS, and Diffusion Front Ends

Ethan [00:22:23]: Flipbook is kind of like a web browser. You can see it has the web browser UI on top.
The difference is that all of the UIs are generated by a generative image model in real time, and everything here is fake. But you can explore inside this imaginary world. Say here we have engineering the Great Pyramid. The model generates this for us to understand how it works, and if we want to navigate around and understand further, we can click on some of the descriptions here, and the model will generate a new subpage describing the details we want to know about.

Swyx [00:23:14]: So it's basically kind of like we're playing a video, but it's pausing for our next interaction, and then it just plays the next thing based on our interaction.

Swyx [00:23:23]: Which is kind of cool.

Vibhu [00:23:25]: And you kind of decide your story. So this was, how do you make a pyramid? The levering technique seemed interesting, right? It shows how you take... Okay, I want to know what this is

Swyx [00:23:35]: The demo tweet had more animation between frames.

Vibhu [00:23:38]: I think it's just skipping,

Swyx [00:23:39]: Oh, it's just skipping a lot of frames.

Ethan [00:23:40]: They also have a video mode

Vibhu [00:23:42]: It takes a lot. There's a lot of people

Ethan [00:23:42]: but a lot of people are using it.

Ethan [00:23:45]: So it's not available.

Vibhu [00:23:46]: There's a live video stream. We can try,

Swyx [00:23:50]: So this is an example of the kind of future that you see at the extreme. We're obviously not in it today.

Swyx [00:23:56]: But in a world where inference is completely free, this is better than generating code and text?

Ethan [00:24:02]: So this is the final state of where we will be at for world models, I think. Imagine the internet doesn't exist, and then you type in google.com. What should a model show you? The model can imagine something, and this is what the model imagines. And these web pages, they completely do not exist.
So I think as inference costs come down, we are going to have generative UI for everything. Think about how coding models work: they write code for a web page, the code might be converted into binary, and the binary renders the pixels on the screen. In machine learning, every time we have some breakthrough, it's obviously more end-to-end. So why don't we go from user instruction to pixels directly? Generative UI will be user intention to pixels directly. Say, even for email, everyone has the same interface, but I want it slightly different. I want email shown to me like TikTok, so I can swipe left and right through the emails. Or maybe you want something else. We can have completely different things. Or I'm looking at Instagram stories, and I don't like the Like button. I always misclick it. Generative UI resolves that. So it's going to be a revolutionary replacement of the interface. In the future, we might have much more powerful

Ethan [00:25:50]: LLMs and coding models running behind the scenes. And on the front end, the diffusion model will actually be the front end that shows stuff to you. That's how I imagine it.

Swyx [00:26:02]: Diffusion front end, deterministic back end.

Swyx [00:26:04]: Something like that. I find that very expensive, but,

Vibhu [00:26:08]: I find it interesting that you called LLMs writing code on the back end deterministic, but okay.

Swyx [00:26:14]: You write it once

Vibhu [00:26:15]: Compare it to

Swyx [00:26:16]: And then you execute.

Ethan [00:26:17]: If you think about the cost, let's say an H100 costs $1 per hour, and you use this eight hours a day for thirty days. Every month you're paying $240, and you'll actually not want to pay for that. That's even more expensive than Claude Code Max.
But if you think about compute costs coming down about two times every year, I think the future will likely arrive within a few years.

Vibhu [00:26:49]: It's everything, right? Compute cost comes down, compute gets faster, model gets smarter

Ethan [00:26:54]: More efficient

Vibhu [00:26:54]: model gets smaller.

Swyx [00:26:55]: I don't know why you say two times, 'cause I think it's like 100 times. In language models, it is roughly one hundred to a thousand times every twelve to eighteen months, for the same given level of LMSYS Elo.

Vibhu [00:27:08]: That's net of everything, right? That's model performance alongside compute. So it's different from just compute costs coming down. But a very interesting future.

Swyx [00:27:19]: So the web designers will have to shout out that accessibility is an issue, right? How do you deal with screen readers or whatever? But yes, this is higher-bandwidth storytelling than anything you can possibly generate with code, right? So I think that's the rough idea.

Ethan [00:27:34]: And I'd like to add a little bit: humans naturally have the maximum input bandwidth when we are looking at things, like videos, and we also have the maximum output bandwidth when we are talking. So in the future, it might be something like: we talk to AI models, and the AI model responds back with a generative UI. That would be the maximum input and output bandwidth for interacting with AI models before Neuralink happens.

Vibhu [00:28:06]: And it's also very custom, right? Some people are very visual, some people are not as visual, right? They prefer text. But the best thing about generative UI is that it can also be text.

Swyx [00:28:17]: There's another project that we wanted to highlight, which is Neural OS. Kind of a similar idea, but here you're literally simulating an operating system with a video model.

Swyx [00:28:27]: And you can play Doom, you can do Firefox.
I find this mildly less impressive, obviously, because it's an OS that I can run.

Swyx [00:28:37]: But here everything is imagined.

Vibhu [00:28:40]: I was used to Command+W to close the Firefox tab. It didn't crash. That's why I said

Swyx [00:28:45]: It's too immersive.

Vibhu [00:28:46]: It's too immersive for me.

Swyx [00:28:47]: Too immersive.

Vibhu [00:28:48]: I wanted to close the tab.

Vibhu [00:28:49]: But yes, I can play generated diffusion.

Swyx [00:28:51]: This is shockingly fast.

Swyx [00:28:54]: Because I remember there was a demo maybe one to two years ago. Someone tried to do a first-person shooter with an image model. There was no consistency. It was very slow. But here it realistically looks like this is Doom.

Vibhu [00:29:07]: I think there are two sides to that, right? There's, okay, what is running a game? The heavy part of it is actually the game engine, all the lighting, all that stuff, the graphics. This is just kind of video, right? Like we've solved consistency. This still looks like image generation from a few years ago. There's some temporal consistency, but it's kind of just images stitched together as frames of video. But it's a good visual representation to picture the future you want to see, right? That's what I see in these, more so.

Ethan [00:29:38]: This reminds me of how video models get better and better. If you just look at Neural OS, it feels like it's just a crappy version of the Windows we could have, right? But the difference is, this model is overfitted on existing operating systems. It can generate nothing different from that. But it's actually also similar to video models. When we train these video models and image models, we train them on the internet. There's no imaginary supernatural stuff on the internet. But once we train the model, you can prompt it to generate something supernatural that has never existed in the data set.
So if you train your Neural OS or neural computer on standard screen recordings from the entire internet, the model can imagine completely new interfaces for interacting with the computer.

Swyx [00:30:43]: This is one of those things that is magical to me. Usually generalizing out of distribution is bad, but somehow we have learned some kind of internal world model where you say, this, but it looks like rainbows and butterflies, and it'll do it, and it will kind of make sense.

Swyx [00:31:03]: So yeah, that's kind of cool. I don't know if there's any more comment there. I did want to touch a little bit more on the model architecture stuff, which I think you were getting to. It's really fascinating. We don't get a chance to talk about this enough. One of the papers that we covered: we've covered every annual Segment Anything release. And I don't know if you follow... you're a computer vision guy, so you

Ethan [00:31:26]: I know

Swyx [00:31:27]: So they did memory attention, which is kind of interesting. And I always think anything where you can keep some consistency across the temporal dimension is very fascinating. Basically, the CV side bleeding into the video gen side, I think, is underexplored, right? We talk about it for labeling, but actually you can borrow the architecture itself.

Ethan [00:31:50]: There are also completely different approaches, right? You brought up the term world model, so we went from video model to world model. There is diffusion, but there are also other approaches that people are taking. So maybe we get into those after as well?

Swyx [00:32:03]: He has a whole definition of world models and stuff. I feel like we threw a lot at you.
Whatever you want to comment on.

Why Video Models Are Expensive: Storage, I/O, and Training Scale

Ethan [00:32:10]: I think one thing that we should actually come back to is, okay, so we were talking about the steps to go from training image gen to a video model. One thing we don't see as much of is, okay, you brought up the delta in training data, right? So

Ethan [00:32:24]: you won't have as much, a video model might not generalize, but what is the cost of training a large video model? So we know for LLMs roughly, okay, even like the Poolside thing that came out today, right? It's a Gemma-level model trained on roughly forty trillion tokens on this many H200s over this much time, right? You can see what the exact cost of that is: how many GPU hours, times how much an H200 costs. So how do we do the back-of-the-envelope math for the same thing with video models, image models? How do you kind of break that down? I can share some back-of-the-envelope calculation. So surprisingly, for video models the cost is comparable to language models. Obviously the largest scale is language models, so maybe comparable to medium-scale language models. I'd say just storing the videos alone costs a lot. You can maybe look it up on AWS or something.

Ethan [00:33:20]: Say you have a billion videos, and let's just say each video is like five megabytes, then you need five petabytes just to store those videos. And also remember we talked about using a VAE to compress the videos, and typically you also need to store those continuous features in your storage. That's a comparable size to the videos themselves. So just storing these videos and the features is tens of petabytes alone. And,

Swyx [00:33:58]: I just looked up the calculation.
Five petabytes on S3 Standard is one hundred K per month.Ethan [00:34:05]: AndSwyx [00:34:05]: It's comparableEthan [00:34:05]: and you needSwyx [00:34:06]: AndEthan [00:34:06]: And then like tens of petabytes, two hundred K. And even more expensive is you have the ingress and egress.Swyx [00:34:13]: Oh, yeah.Ethan [00:34:14]: Like you-- through the internet. You have to just to download those videos, I believe it's, it's more expensive on AWS than just storing those videos.Swyx [00:34:25]: Storing, yeah.Ethan [00:34:25]: And each training runs, you probably need to pull them once. If you train multiple times, it's, it's even more than that. So it's like just storing the network, those costs is just, it would be a few, a few millions per month to just storing everything, not to mention the GPU cost.Ethan [00:34:45]: AndSwyx [00:34:45]: my side tangent, the compute rental, like GPU rental is very efficient. There's one side, okay, you can be XAI and build your data center. Should we not just build our, storage compute as well? LikeEthan [00:34:57]: Of courseSwyx [00:34:57]: cloud cost compared to just,Ethan [00:34:59]: You save so muchSwyx [00:35:00]: store. Yeah, exactly.Swyx [00:35:01]: Especially with like egress and stuff. So.Ethan [00:35:04]: That's a good idea, but it also comes to-- there are some of its own challenges.Swyx [00:35:09]: Of course, of course.Ethan [00:35:10]: like people who build the GPU data centers, they might not expect this much, storage. And yeah, people build storage, typically they just build it somewhere with just CPUs.Swyx [00:35:23]: I just looked it up. Five-- AWS only charges for egress, not ingress. Tier five for five petabytes is two hundred and thirty K.Ethan [00:35:32]: Even more expensive than the storage.Swyx [00:35:34]: But storing is per month, right? You check in, then you cannot check out. so it's so cool. It's okay. 
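The storage and egress figures quoted above can be checked with a short back-of-the-envelope script. This is a minimal Python sketch that uses only the numbers stated in the conversation (a billion videos at about five megabytes each, $100K per month quoted for five petabytes of storage, $230K quoted for five petabytes of egress). The per-gigabyte rates it derives are implied by those quoted figures rather than taken from any price list, and the names (monthly_cost, implied_storage_rate, implied_egress_rate) are made up for illustration.

```python
# Back-of-the-envelope storage math using only figures quoted in the conversation.
# The dollar amounts are the speakers' quick lookups, not authoritative pricing.

MB = 10**6
GB = 10**9
PB = 10**15

num_videos = 1_000_000_000        # "a billion videos"
bytes_per_video = 5 * MB          # "each video, like five megabyte"

raw_bytes = num_videos * bytes_per_video
raw_pb = raw_bytes / PB           # raw video footprint in petabytes

# Assumption from the transcript: stored VAE features are roughly
# the same size as the raw videos themselves.
latent_bytes = raw_bytes
total_pb = (raw_bytes + latent_bytes) / PB

quoted_storage_usd_per_month = 100_000   # quoted for storing 5 PB
quoted_egress_usd = 230_000              # quoted for pulling 5 PB out once

raw_gb = raw_bytes // GB
implied_storage_rate = quoted_storage_usd_per_month / raw_gb  # USD per GB-month
implied_egress_rate = quoted_egress_usd / raw_gb              # USD per GB pulled


def monthly_cost(stored_pb: float, pulled_pb: float) -> float:
    """Storage plus egress for one month, at the rates implied above."""
    stored_gb = stored_pb * PB / GB
    pulled_gb = pulled_pb * PB / GB
    return stored_gb * implied_storage_rate + pulled_gb * implied_egress_rate


if __name__ == "__main__":
    print(f"raw videos: {raw_pb:.1f} PB, with features: {total_pb:.1f} PB")
    print(f"implied storage rate: ${implied_storage_rate:.3f} per GB-month")
    print(f"implied egress rate:  ${implied_egress_rate:.3f} per GB")
    # Store videos plus features, pull the raw videos once for a training run.
    print(f"one month, one pull:  ${monthly_cost(total_pb, raw_pb):,.0f}")
```

Under these assumptions, every extra training run that re-pulls the raw videos adds another egress charge of the same size, which is the sense in which moving the data costs more than keeping it.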
So there's that side.

Ethan [00:35:41]: So the TLDR, my back-of-the-envelope math

Swyx [00:35:42]: Data is larger than you think. Yes.

Ethan [00:35:44]: my back-of-the-envelope math of GPU hours times GPU cost is also very much missing some storage.

Swyx [00:35:49]: You're also basically more IO bound than normal training.

Swyx [00:35:55]: Yes. ‘Cause like data loading, so caching everything, it becomes super important.

Ethan [00:36:00]: So in Cosmos, we did a lot of optimizations to make it not IO bound. So, speaking of actually training the model, the GPU cost: if you look up the open source models, how big these video models are, I think LTX has nineteen B parameters. That's a dense model. And people are also exploring MoEs, so it might be twenty B active and hundreds of B total. So that's a similar size to medium-sized LLMs. And if you look at the number of tokens, we disclosed that in Cosmos. It's also tens of trillions of visual tokens. So putting this together, the cost of training these video models is actually comparable with LLMs. Not to mention, the infra is slightly different from LLMs, so it might be less efficient to train these models.

Inference Speedups: Step Distillation, Consistency Models, and GANs

Swyx [00:37:04]: Do you get the benefits of traditional diffusion speed-ups? So for images, there's LCM, LoRAs for fine-tuning. There's a lot of stuff that's been

Ethan [00:37:15]: Flow matching.

Swyx [00:37:16]: there's flow matching. There's a lot of stuff that's been done. Is there some overlap that applies to diffusion on the inference side and stuff, or?

Ethan [00:37:23]: So the inference side is a completely different story.

Ethan [00:37:28]: I think for the training side, it might be a little bit hard to reduce that cost. And for the inference side, the biggest gain is from the distillation of these models.
You can-- It's called step distillation, slightly different from knowledge distillation in LLMs. Typically, for flow matching models, you need like 100 steps or something. A diffusion model needs even more, like 1,000 steps, to generate a good image or video. Step distillation tries to learn to generate in fewer steps from the model itself. It's kind of like: you use the full model to generate in 100 steps, and then you take a model that only generates in 10 steps and let that model learn from the perfect one.

Ethan [00:38:25]: why this works

Swyx [00:38:27]: Strong to weak, seemingly.

Ethan [00:38:28]: It is. It's kind of

Swyx [00:38:29]: Distillation

Ethan [00:38:29]: kind of like strong to weak. From the modeling perspective, the strong model, the teacher model, is trying to model the images and videos of the internet, and that distribution is extremely complex. But the step-distilled model is just trying to learn from the teacher. The teacher is a model, and its size is fixed, so its distribution is much simpler than the whole internet. That's the intuition I have for why step distillation can work. So usually when these models are served in production, they only run a few steps. In Cosmos, I believe we have four-step and eight-step versions. If you do some simpler task, like image-to-image translation, it can run in even fewer steps, like one step in Cosmos Transfer.

Swyx [00:39:22]: I think this is the same intuition that guides a lot of the consistency model work. I sent you a link for sCM. I don't know if you covered that. To me, that was actually one of the most impressive papers I've ever seen from OpenAI.

Swyx [00:39:34]: That this is the unifying grand concept of consistency models. I don't know if you have any comments on this.

Ethan [00:39:41]: So there are a few different approaches,

Swyx [00:39:46]: Oh, yeah. Here it is.

Swyx [00:39:47]: Two steps versus twenty or 100 steps, whatever.
It's already done.

Ethan [00:39:52]: So there are a few different approaches, for example, consistency models, and there are also... Actually, we shouldn't forget GANs. So GAN, actually, that was the OG of

Swyx [00:40:05]: OG

Ethan [00:40:05]: step distillation, ‘cause it was trained in just one step to begin with. So actually, for example, there's distribution matching distillation, which uses a GAN as one of the losses for distillation. A GAN just tells you, “Hey, generate an image,” and then

Ethan [00:40:31]: it has a discriminator to tell, is this image real or not? So the model just needs to learn one part of the distribution, not the full distribution. Because in training, the model is asked to reconstruct the ground truth image from the internet, which is extremely hard. And when you're training a GAN, it's a step process. It's just, “Hey, you generate an image. Does this image look as real as the image from the internet?” Which is a much simpler task. And, yeah, people typically combine a lot of these approaches together, like consistency models and distribution matching and GANs, and we can get these few-step models.

Audio-Video Generation and Time Alignment

Swyx [00:41:21]: Then there's one step I wanted to add, which is audio and video.

Ethan [00:41:26]: So, Grok Imagine 0.9, I believe it's the first joint audio-video model deployed at a large scale. So

Swyx [00:41:39]: And that was your first model?

Ethan [00:41:40]: that was Grok Imagine's first model. It's audio-video joint generation. I think the hard part is the modality alignment, ‘cause before this model, we had text-to-video alignment. We have this correspondence between text and video. Typically, most of the VLMs understand images and videos. Video's very rare, and they mostly don't understand audio.
And if you look at the audio generation on the LLM side, you can talk to them perfectly fine, but if you ask them to sing a song or something, it typically is not very good. Also, they don't have, they don't have music either. The hard part is thatUh, actually audio has two component. It has like a discrete component, a continuous component. The discrete component is like the language.Ethan [00:42:44]: So when we speak, it's just, someSwyx [00:42:47]: It's an ASR issue, yeah.Ethan [00:42:49]: It's, it's text token with some characteristics, I would say.Ethan [00:42:54]: But musicSwyx [00:42:56]: I think the speech guys would disagree with this.Swyx [00:42:57]: Like disfluencies and then,Vibhu [00:43:00]: There's tones you can get angry.Ethan [00:43:01]: Well, I say largely.Ethan [00:43:03]: the mu- but the music is completely different. It's, it's very continuous, and you cannot model them like discrete tokens in language models. this is like the hard part for models is, not to mention we have to align text, video, and audio together.Ethan [00:43:26]: SoVibhu [00:43:26]: How?Ethan [00:43:28]: So significant-- some significant challenges are like-- So first, like we talk about as the VLMs, they cannot understand most of them cannot understand audio.Ethan [00:43:39]: So you have to have some way to do the synthetic data generation for audio. You have to caption the model, and that involve, that involve synthetic data and human data effort a lot. And not just surprisingly, most of the LLMs are very bad at recognizing, like the beat, tone, and the details of the of music. They can, they can give some general prediction of which song is this, but it's very hard to describe the details of the music. like we mentioned in image generation, like you have to describe image as detailed as possible so that someone blind can reconstruct that. 
So here is like someoneVibhu [00:44:32]: DeafEthan [00:44:32]: someone deaf can reconstruct how the music sounds like without actually listening to it. Maybe you can think of it need to have the-- or they call the script.Vibhu [00:44:49]: Subtitles, yeah.Ethan [00:44:49]: You gotta have all the details of the music, and the dialogue.Vibhu [00:44:55]: So is the challenge there typically stuff like music and audio, or is it just Like is there a baseline? Okay, there's enough data where we can understand, narration, conversation, but there's nuances in audio that's where you hit all the data issues or is it just from stage zero, you just do it all right?Ethan [00:45:15]: So one important thing is like the alignment. So the model, the model has to know like the video and audio, the, uh-- it has to have a time-based alignment, like at which time step the video and the audio token correspond to each other. But we actually don't have this kind of alignment for most of the other modalities. If you think about like text and image, text and video, they are loosely aligned. So you can, you can have a description of what's going on in the video, but you don't have to exactly, You typically don't have exact description, oh, at, time step one second like what happened?Vibhu [00:46:02]: It's veryEthan [00:46:03]: At time step two second what happenedVibhu [00:46:03]: coarse. Yeah.Swyx [00:46:05]: So what was the ideal time step? You have to oblate it, and then it's like four seconds or something.Ethan [00:46:09]: So that comes down to how you design the model to, for the model to be aware of as a time, as a time modality. So the model is like a time aware. And that's something pretty unique if you think about LLMs. So if you ask LLM to complete a task, say they, uh-- you ask them and they will say, “Oh, this task will probably take twelve hours to complete,” and they come back in one hour. 
Say “I've already spent two days on this and I've exhausted everything.”Ethan [00:46:47]: So the LLMs them-themselves, they don't have a sense of time there.Vibhu [00:46:53]: I actually don't think that's just them not having a sense of time. I think it's somewhat based, right?Vibhu [00:46:58]: Like you tell someone, “Okay, go work on this feature. Go implement this,” there's a general understanding you would have of how long that would take without LLMs working at LLM speed, right? So you think back like two years ago, if I tell you to like build me like a new front end for latent space, have a search bar, have all this, you'll estimate that it'll take a few days, right?Vibhu [00:47:19]: So you tell an LLM, “Go build this.” It'll take me a few days. But I think it's somewhat grounded as opposed to them not having the best-- Not saying that they have a great understanding, but I think that example is like you can see where it comes from, right? You're trained on all over the text.Swyx [00:47:35]: They're, they're trying to estimate what a human would say.Vibhu [00:47:37]: because that's what the, that's what the data kind of represents. It's not themEthan [00:47:41]: It came from the corpus on the internet. People have a estimate of how much time.Vibhu [00:47:45]: And not even just in direct like training samples, right? Just your world understanding of tokens of how long stuff takes, right? Go read a book. It'll take you a while, right?Vibhu [00:47:56]: Even if you do nothing but read a book, it takes a few days. So yeah, LLM, I read it took me a few hours.Vibhu [00:48:01]: It'll take me a few hours to go through this research. But this is a tangent.Swyx [00:48:05]: Somewhat, yeah.Swyx [00:48:06]: This is a train of thought I haven't really expressed until now is, which is basically like a full world model must also be recursive, meaning that the participant in the world model must also be aware that they have a world model. 
which is like this whole recursive thing down the, down the line. but yes, and that the world model can be wrong and that they need to update it and blah. Yeah. We've, argued this on the, newsletter as well, that there needs to be sort of recursive or adversarial world models.World Models: Real-Time, Long-Horizon, Interactive VideoVibhu [00:48:34]: just, to ask, how do you define world model?Swyx [00:48:38]: Oh, yeah, let's go there.Ethan [00:48:40]: SoVibhu [00:48:40]: So just for context, we talked about, video generation, and then there's a-- if you say there's a distinction between world models, what's your, what's your definition? How do you see the two?Ethan [00:48:53]: So disclaimer, I'm not going to debate, what is world model. Yeah. there are many definitions, so I'll just talk about my definition. Since I came from the multi-model, multi-model domain, so mainly talking from video. So world model is like real-time interactive long horizon videos. So there are three parts. so we-- let's talk about them one by one. So the so interaction, so we just, we just look at Facebook and neural computer. So the interaction part of it, so you, world model can allow you to interact with them through keyboard, mouse, and maybe also voice. So these all is-- all is a modality. You can, you can interact with the model, and the model should respond reasonably. Second part is real time. So once you, once, say, you move your mouse, if, say, the world model generate a game, how fast can the game respond? So if you're like professional CS: GO players- -my say, oh, you have to respond- He's beginner within sub ten milliseconds or- Yeah even less. So that's not most of the- No, sixty FPS. Let's go. Oh, three hundred FPS. Oh, five hundred FPS. Wait. okay, yeah. I didn't do the math, but yeah, okay. Uh- Yeah, three hundred FPS, that's a three millisecond. So you have to respond- Oh, s**t. Okay. YeahEthan [00:50:29]: within a millisecond. Most of the video models cannot do that. 
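The frame-rate arithmetic in this exchange is easy to check: the time available per frame is 1000 / FPS milliseconds. Here is a minimal Python sketch of that conversion. The second function is a hypothetical extension, not something stated in the conversation: it assumes each frame is generated on its own with a fixed number of denoising steps (the four and eight step counts mentioned earlier for distilled models) and ignores VAE decoding, temporal compression, and every other overhead.

```python
def frame_budget_ms(fps: float) -> float:
    """Milliseconds available to produce one frame at a given frame rate."""
    return 1000.0 / fps


def per_step_budget_ms(fps: float, denoise_steps: int) -> float:
    """Hypothetical per-step budget if every frame needs `denoise_steps`
    sequential model evaluations and nothing else takes any time."""
    return frame_budget_ms(fps) / denoise_steps


if __name__ == "__main__":
    for fps in (60, 300, 500):
        line = f"{fps:>3} FPS: {frame_budget_ms(fps):6.2f} ms per frame"
        for steps in (4, 8):
            line += f", {per_step_budget_ms(fps, steps):5.2f} ms/step at {steps} steps"
        print(line)
```

At 300 FPS the per-frame budget is a little over three milliseconds, which matches the "three millisecond" figure worked out on the fly in the conversation.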
Yeah. And, but if you, say, if you have a video model that is, say, like a digital human, the response time might be more generous. Maybe typically, for real-time voice interaction, it's like two hundred millisecond. So that's, that's much more generous. But even two hundred millisecond is pretty, it is pretty tricky, ‘cause remember we mentionedEthan [00:51:01]: you have this, temporal compression coming from the VAE. So if you, if you don't compress the temporal dimension, your sequence length is going to explode. So if you want to have this real-time, real-timeness in your model, you have to do is one context problem. And the third part is long horizon, ‘cause we-- if you're not going to just play with, video games just, a few seconds, most video models only a few seconds. We're going to play with minutes, hours. The model have to be able to generate long-form content.Ethan [00:51:42]: So putting these three together, it's, real-time, long horizon interactive videos. I think the final state will be, for example, like a video, a video version of Playbook, where you can, you can interact with, a neural computer. You move your mouse, and you click on the generative interface, and it will reply to you through pixels- generating in real time. But getting there, it's, it's a very long way to get there. So one of the first step, at Grok Imagine, where I led a small world model team there, was to build video extension. So, video extension- it's the first step of interactivity. Yeah. It's, it's the first step. Yeah. So it's the first step- You have it here, video editing, yeah. Yeah. Yeah. So the first step is because, this unlocks long horizon videos. Typically, for most of the video generation models, you give it a prompt or an image as an initial frame. You generate video, that's it. That's just, one time, done. And some creators would try to, use the last frame as a first frame for the second video. 
Sometimes it works, but if you do it a few times, the quality decreases. And- It doesn't have that context- Yeah over the full video, so the temporal- Yeah, exactly. Yeah, ‘cause you only gave it the last frame, of course, right? Yeah. Exactly. And- it's actually a pretty fun hack. If you've seen like- Oh, no, he's saying something better. Yeah. And for example, Veo, I remember Veo 3 has like a second of context from the last video. It is slightly better than using the last frame, but it has a similar problem: the quality decreases. If you extend a few times to one minute, the video quality looks much worse than the first video. Another problem is that the model doesn't have long-range knowledge of what happened before. Say it generates some dialogue, two people speaking; their voices might change over time, especially if that second of conditioning does not cover the previous context. So these are the core challenges. The Grok Imagine video extension has historical context of all of the previously generated videos. It has the context of who is speaking and what objects have appeared and everything, and uses that to generate the next video. If we do this naively, you can imagine, just put all of the previous history video tokens into the context, the context length will easily explode. Especially for video models, that can be like a few million tokens of context, I would imagine- context length. Yes. Yeah.

Swyx [00:54:58]: Let's run with that.

Ethan [00:54:59]: For example, in Cosmos, I think just five seconds of video is like fifty K or sixty K tokens. So if you do fifty seconds, that's five hundred K tokens. If you go longer than that, it easily explodes. This long-horizon problem was the first step we tried to solve toward world models. It turns out people, yeah, people love video extension.
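To make the token arithmetic concrete, here is a small Python sketch that scales the quoted figure (five seconds of video at roughly fifty to sixty thousand tokens) to longer clips. It assumes token count grows linearly with duration, which is a simplification; the function names and the one-million-token budget are hypothetical and chosen only for illustration.

```python
# Quoted range: about 50K to 60K tokens for five seconds of video.
TOKENS_PER_5S_LOW = 50_000
TOKENS_PER_5S_HIGH = 60_000


def tokens_for(seconds: float, tokens_per_5s: int) -> float:
    """Tokens needed for a clip, assuming linear growth with duration."""
    return seconds * tokens_per_5s / 5


def seconds_until(budget_tokens: int, tokens_per_5s: int) -> float:
    """How many seconds of video fit in a given context budget."""
    return budget_tokens * 5 / tokens_per_5s


if __name__ == "__main__":
    for seconds in (5, 50, 60, 600):
        low = tokens_for(seconds, TOKENS_PER_5S_LOW)
        high = tokens_for(seconds, TOKENS_PER_5S_HIGH)
        print(f"{seconds:>4} s of video: {low:,.0f} to {high:,.0f} tokens")

    budget = 1_000_000  # hypothetical context budget
    print(f"{budget:,} tokens hold between "
          f"{seconds_until(budget, TOKENS_PER_5S_HIGH):.0f} and "
          f"{seconds_until(budget, TOKENS_PER_5S_LOW):.0f} seconds of video")
```

Under the linear assumption, a few minutes of uncompressed history already reaches the "few million" tokens mentioned in the conversation.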
Like a lot, a lot of the creators love using video extension to create longer form videos. This is the part I liked that you have a, you have an intermediate step toward the final goal instead of just a straight shot to the final version very much.Swyx [00:55:48]: But I can see you have a strong vision of where we want to end up.Long Context, Redundancy, and Efficient Interactive VideoVibhu [00:55:51]: Does it seem like it's an efficiency issue? okay, we're at a few million tokens context,. If you draw the parallel to language models, we had very short context, two thousand, eight thousand, then, you scale it up one million, ten million. sure, there's effective context, but at the end of the day, it's just what's it worth? sure, there's a whole training data side. In video, it might be slightly easier ‘cause we have a hundred million token video, right? Just take a movie with the full context there. Like is this efficiency from an inference standpoint that like it's expensive, but we know how to solve it? Or like why is this not the approach? So like my broader point was on your second point of world models, you say it needs to be interactive and live, right? You should be able to play a game and see the interaction live. So one thing I see with research is a lot of what you actually serve is different than what you build, right? So we talked about distillation. You train big model, you distill it, you do quantization, speculative decoding. We do all this stuff to serve it efficiently. Should we not just have a solution, like a world model that can interact well, do inference optimization, serve it, distill it secondary, so make it real time after you solve it? So like a-- another parallel is say, continual learning, right? What we need is someone to solve it and show it works inefficiently. Give it a few years, people will make it efficient. Same thing with regular attention, right? It worked. 
Over a few years, people came up with different forms of attention, and we've scaled it to be efficient at long context. So kind of two things there, right? One is it seems like it works. You've scaled it. Can we not just scale it a lot more efficiently over time? Do we need a separate approach if this works? And same thing with interaction, right? If we can get it done, like if we can find some way that it works, we can solve making it more efficient from an inference standpoint later.

Ethan [00:57:53]: That's actually a very good point. So in videos, there's actually a lot of redundancy. We remove a lot of the pixel redundancy with the VAE, but there's more redundancy in long-range, long-horizon videos. Say a character appears in the first clip and then disappears, and it only reappears at the end of the video; you probably don't need the context from the middle of the generation. You only need that character where you need it. So that's why I helped build another feature. It's a reference video.

Vibhu [00:58:36]: Is it here?

Swyx [00:58:36]: Is it the same model release or a different one?

Ethan [00:58:39]: It's a different one.

Ethan [00:58:41]: You probably need to search on

Swyx [00:58:43]: I'll find it

Ethan [00:58:43]: X reference to video.

Ethan [00:58:46]: So reference video allows you to upload up to seven images as conditions and generate the video. They can be characters or objects or even scenes. Say I want to condition on Sean's selfie and holding a blade

Swyx [00:59:07]: We have a dog

Ethan [00:59:08]: or whatever.

Swyx [00:59:08]: We put the dog in the thing.

Ethan [00:59:09]: you can put them there and the video model will generate the video from them and copy the context over. So that can solve a lot of the problems there, like the long context problem. It doesn't need to have a very long context, but I feel like it's an intermediate solution.
The model

Swyx [00:59:29]: It's cheating.

Ethan [00:59:30]: the model should be able to selectively know where it should draw the references from. So say I want to generate a movie, and I generate it autoregressively, like ten seconds at a time or something. And now this character appears; I can look back to where it first appeared and bring that back. Yeah, this one, I put the references. Yeah, that's Optimus, Einstein, myself, Annie.

Vibhu [01:00:02]: Oddly enough, I used Grok Search to find it, and it pulled your LinkedIn post. But yeah, we found it.

Ethan [01:00:08]: Interesting.

Vibhu [01:00:10]: But

xAI's Underrated Work, Culture, and Watermarking

Swyx [01:00:11]: this is a problem. This is not your fault, but like xAI doesn't communicate all this work that you do very well, because they just have the model release and then that's it. But actually, these details are very good.

Swyx [01:00:22]: As far as I understand, everything you just described is state of the art, like no one else has done it.

Vibhu [01:00:30]: A lot of-- yeah, I have a lot more

Swyx [01:00:32]: And then you just put out this blog post with the cookies. I'm like, this is not enough.

Swyx [01:00:37]: But obviously these are like the high-level numbers that people want to know. But no, okay, so

Vibhu [01:00:42]: And I wonder, like part of that is also some labs don't share research into what happens. And if

Swyx [01:00:50]: No, but this is literally bragging about how good they are, right?

Swyx [01:00:54]: Like, why would you not say that you are capable of extending with full context? This is not a secret sauce. This is like, we did the work. Yeah, I don't know.

Ethan [01:01:02]: Different labs have slightly different communication styles.

Swyx [01:01:07]: Anyway, if anyone from xAI is listening, we are always happy to help you tell your story. Yeah, okay, so you did references, and I think kind of the point you're making is it is sort of like a kludge, right?
This is-- you can do seven, but what about 100?

Swyx [01:01:23]: Right? Then you need a completely different thing.

Ethan [01:01:26]: So I think this is a mechanism to select the context from the history, and you might not put the entire history into the context. For example, there's a paper called FramePack, which has

Ethan [01:01:41]: a heuristic: for the latest history, the last one second, I put in the entire history, and the history before that, I compress it and make the video smaller. So they follow this pattern, this overall pattern where the maximum sequence length is fixed. So the further you are from the current frame, the smaller the image you have. So this is just a heuristic. I think it can be more automatic: the model is aware of which part of the history can be selected. This part of the research is actually being actively worked on by a lot of people. It's also quite interesting. I feel this part of long context is actually a little bit ahead of the LLM part.

Ethan [01:02:31]: So for example, in LLMs, contexts keep growing. Let's say you call a tool and the tool call history is extremely long; that's still in context, and it keeps growing, keeps growing. Even if you switch the topic to something else, the whole context is still there. There are some agentic harnesses that help you to, say, prune the tool results. Like when you query a file, only show the top 200 lines or something. Those are very heuristic-driven.

Swyx [01:03:08]: For listeners, we did a write-up on the Claude Code leak, where there are eight different kinds of pruning, including like you prune the tool results and all that.
So you can, you can read up on that kind of thing.Ethan [01:03:17]: I think, one breakthrough in continual learning might be like a way to automatically, manage its own context.Swyx [01:03:27]: These are all heuristics, and they will be replaced by machine learning.Ethan [01:03:30]: InterestinglyVibhu [01:03:32]: TheEthan [01:03:32]: the same thing is being researched in both LLMs and video models.Vibhu [01:03:36]: The interesting thing is also like in the paper you showed, it's actually happening at the model level, right? Compared to like language models, sure, we have base attention, but we'll do our own compression, we'll do our own pruning, which is separate from model error.Vibhu [01:03:49]: Eventually, it all just boils in, hopefully.Swyx [01:03:52]: I think this is a form of like attention, but like also know sort of reasoning attention. I feel like that's different than normal attention.Swyx [01:04:03]: Does that, does that make sense?Ethan [01:04:04]: It's, it's different in the sense that attention, not to mention, set sparse attention aside,

ChinaEconTalk
The Pope has AI Takes

Jun 1, 2026, 50:50


Pope Leo has called AI the single greatest challenge facing humanity. Not war, not poverty, not climate change. So we got a panel together to sort out what this encyclical means. Joining Jordan are Tim Hwang, deputy director of the Institute for Christian Machine Intelligence, John-Clark Levin of Kurzweil Technologies, and ChinaTalk's resident Catholic, Aqib Zakaria.

We discuss…
  • Why the encyclical's claim that AI cannot truly "understand" is a narrow theological term of art, and why that nuance gets lost on Twitter
  • Pope Leo's call to "disarm AI" and the Holy See's potential role mediating between the US and China and speaking for the global South
  • Tim's pitch for a Vatican alignment lab that buys GPUs and tries to beat Anthropic's benchmarks from Christian first principles
  • Why frontier-lab researchers, including non-believers, are treating the Pope as a moral coordinating signal
  • How Anthropic's drift from deontology toward virtue ethics in training Claude looks like a validation of the Christian approach
  • The provocation underneath all of it: is the American AI stack a Christian AI stack?

pope as chicago footwork: https://suno.com/s/1Qb9Ce3Bh6saeF2V

Scrum Master Toolbox Podcast
BONUS How AI Is Reshaping Software Teams From the Inside With Dwarak Rajagopal

May 30, 2026, 36:20


BONUS: How AI Is Reshaping Software Teams From the Inside (Lessons From Google, Meta, and Snowflake)

In this episode, Dwarak Rajagopal, VP of AI Engineering and Research at Snowflake, shares what he's seeing firsthand as AI agents become part of the software development process. From compressed sprint cycles to automated standups across time zones, Dwarak draws on two decades of building AI infrastructure at Google, Meta, Uber, and Apple to show what's actually changing inside engineering organizations today.

From Compiler Engineer to AI Leader: The Thread That Connects Two Decades

"In AI, the hardest part isn't just the models itself, it's making them work in real environments where data is messy, fragmented, and governed."

Dwarak started his career as an open-source GCC compiler engineer over two decades ago, optimizing hardware performance. He moved into graphics at Apple, then pivoted to AI when AlexNet started running on GPUs around 2011-2012. From there, he built autonomous driving software at Uber, led Meta's PyTorch core framework team bridging research and production, and at Google led AI Frameworks including getting Gemini training on TPUs. The common thread: always working at the intersection of research and production, making powerful technology work in the real world. That focus on real-world application is what drew him to Snowflake, where enterprise data meets AI at scale.

AI Is Changing What Engineers Actually Do All Day

"Engineers are spending more time on system design, validation, production reliability, and less time doing the implementation itself, because AI is helping that."

The shift Dwarak sees is concrete: AI is accelerating development, but the real value comes when it's grounded in enterprise data and context. At Snowflake, teams use tools like Cortex Code, Snowflake Intelligence, and other LLMs to generate code and tests faster, because the friction cost of development has dropped dramatically.
Customer example: Whoop, the fitness band company, used Cortex Code with conversational data assistance and agents to reduce development cycles from weeks to hours, freeing teams to focus on high-value work.

The End of "This or That" — Try Both, Kill Fast

"There's a lot more choices now. You don't have to think about this versus that. Do both and then figure out what is the best."

One of the most practical shifts Dwarak describes: teams no longer need to commit to one architectural approach upfront. Because AI reduces the cost of building, teams can pursue two designs in parallel and evaluate both. A concrete example: instead of choosing a cross-platform framework like Flutter or React Native for a mobile app, Snowflake's teams now build native iOS and Android apps simultaneously — one human-led, the other agent-built — at roughly the same speed. But this creates a new challenge: teams have to learn to kill projects faster. When you can build more, you also discard more — and engineers need to detach from "their baby."

Smaller Teams, Bigger Output — The Cross-Functional Shift

"You could build multiple products now faster with different smaller teams. One back-end person, one front-end person — build vertically end-to-end."

Dwarak's teams moved from functional structures (separate backend, frontend, and feature teams) to project-based teams that own the full vertical stack. This isn't theoretical — Snowflake Intelligence was built this way. The result: fewer dependencies, faster delivery, more products in parallel. The tradeoff is coordination cost — more things running in parallel means more decisions to synchronize.

Recruiting Has Fundamentally Changed — Systems Thinking Over Syntax

"We used to ask an engineer to code a specific search algorithm. Now we ask them to build a whole search system within an hour."

Dwarak is clear: fundamentals matter more than ever.
Systems thinking, judgment, the ability to work with complex data and production systems — these are what hiring evaluates now. AI handles execution; humans need to define problems clearly and ensure systems behave at scale. For junior engineers, the news is encouraging: onboarding is faster because team-specific skills are codified and shared, and the barrier to building end-to-end systems has dropped. "Learning by building is more true than ever now."

Monday Planning, Friday Demos — The Compressed Sprint

"You basically decide what to do on Monday, and you're testing together as a team on Friday and getting the feedback for the next week."

Daily work has transformed at Snowflake. The traditional multi-week sprint has compressed to a single week: Monday planning, Friday team demos and testing. Standups still happen — but faster, sometimes multiple times per day. For distributed teams across the Bay Area, Seattle, and Poland, an automated skill scans each day's code changes and posts a summary in a shared Slack channel — so the next timezone knows exactly what happened without waiting for a meeting. This solves one of the oldest problems in distributed development.

The Road to Lights-Out Codebases — Governance, Observability, Reversibility

"Can agents take actions? Which of these actions cannot be taken back? You need the concept of committing actions or rolling back."

Building on the "lights-out codebases" concept from Philip Su's episode, Dwarak agrees the direction is clear — agents are already writing more code than humans in some contexts. But enterprise adoption requires governance, observability, traceability, and reversibility of agent actions. The shift from "AI as a tool" to "AI as part of the system" is happening now, with the focus moving from getting answers to enabling actions at scale.

What Most People Get Wrong About AI in Software

"It's very easy to build prototypes, even end-to-end systems.
But it's very hard to get it working in enterprises where the data is so messy."

The gap between demo and production is where most organizations hit the wall. Enterprise data is scattered across invoices, factory outputs, and dozens of systems — combining it meaningfully for AI to generate insights and actions is the real challenge. This is different from the "AI will replace developers" narrative. The bottleneck isn't code generation; it's data integration, governance, and controlled execution at scale.

About Dwarak Rajagopal

Dwarak Rajagopal is VP of AI Engineering at Snowflake, where he leads the Cortex AI and AI Research teams. Before Snowflake, he led Google's AI Frameworks and On-Device ML teams (including Gemini), ran Meta's PyTorch Core Frameworks team, and built autonomous driving software at Uber. Two decades of shipping AI at the companies that define the field.

You can link with Dwarak Rajagopal on LinkedIn.

Dev Interrupted
The cost of intelligence will never be this cheap again, the failure of intensive specs, and how bots disguise inefficient workflows

Dev Interrupted

Play Episode Listen Later May 29, 2026 38:45


Are we officially entering the "Eternal Sloptember"? This week on the Friday Deploy, Ben and Andrew unpack the quiet rebellion against skyrocketing API costs as teams transition to fine-tuned local models. They also explore the changing physical architecture of AI data centers, the dangers of using autonomous tools as a crutch for broken workflows, and why spec-driven development is critical for keeping agentic code in check. Finally, the hosts share their latest personal agent experiments, from benchmarking open-source models on a local Mac Studio to taming an AI-generated second brain.

Learn why: LinearB is a Leader in the 2026 Gartner® Magic Quadrant™ for Developer Productivity Insight Platforms

Follow the show:
Subscribe to our Substack
Follow us on LinkedIn
Subscribe to our YouTube Channel
Leave us a Review

Follow the hosts:
Follow Andrew
Follow Ben
Follow Dan

Follow today's stories:
Outsourcing plus LocalAI will soon become more economical vs Frontier labs
AI Datacenters Were Built for GPUs. What Happens When You Remove the GPUs?
"The AI Can Do It" Is Not an Excuse To Tolerate a Mess
The Eternal Sloptember
I'm tired of talking to AI
If you let AI do your writing, I will come to your house and kill you
A Blast from the Past: SDD and the Illusion of Known Scope
Andrew's paper: Mise en Place for Agentic Coding: Deliberate Preparation as Context Engineering Methodology

OFFERS
Start Free Trial: Get started with LinearB's AI productivity platform for free.
Book a Demo: Learn how you can ship faster, improve DevEx, and lead with confidence in the AI era.

LEARN ABOUT LINEARB
AI Code Reviews: Automate reviews to catch bugs, security risks, and performance issues before they hit production.
AI & Productivity Insights: Go beyond DORA with AI-powered recommendations and dashboards to measure and improve performance.
AI-Powered Workflow Automations: Use AI-generated PR descriptions, smart routing, and other automations to reduce developer toil.
MCP Server: Interact with your engineering data using natural language to build custom reports and get answers on the fly.

Late Confirmation by CoinDesk
Blockspace: Robinhood Opens AI Trading, IREN's $1.6B GPU Buy, Hut 8 CEO Beacon Point Update

Late Confirmation by CoinDesk

Play Episode Listen Later May 28, 2026 63:02


Robinhood now allows AI to trade for you, and IREN just purchased $1.6 billion worth of GPUs for AI workloads. Welcome back to The Blockspace Podcast! Today, Asher Genoot, CEO of Hut 8 joins us to talk about the company's $9.8 billion deal for its Beacon Point AI data center in Texas. For news, we cover Robinhood launching access for AI trading agents, IREN's $1.6B Dell Blackwell purchase, hardware rollout, Core 42's $500M financing, and Mara's rising executive security expenses. Check out our latest report, “What's a Megawatt Worth?” where we quantify the trillion dollar opportunity for bitcoin miners venturing into the AI sector. Download here: https://megawattreport.com/ Subscribe to our newsletter to receive updates for all of our shows and content: https://newsletter.blockspacemedia.com

The Neuron: AI Explained
What Comes After GPUs? Great Sky's Bet on Brain-Like AI

The Neuron: AI Explained

Play Episode Listen Later May 27, 2026 59:54


What if the next big AI breakthrough is not a bigger model, but a completely different kind of computer?

Jeff Shainline, co-founder and CEO of Great Sky, joins The Neuron to explain how his team is building brain-inspired AI hardware using superconductors, photonics, and analog computation. Great Sky's architecture, called Superconducting Optoelectronic Networks, or SOENs, is designed to move beyond the traditional GPU roadmap by co-locating memory and processing, communicating with light, and mimicking some of the high-connectivity dynamics found in biological brains.

In this conversation, Jeff breaks down why today's chips can struggle with fast, multimodal inference; why transformers may be powerful but inefficient for some future workloads; how Great Sky's system differs from quantum computing; and why early applications could include fusion reactors, particle physics, video understanding, content moderation, and eventually new model architectures that do not map neatly onto today's hardware.

Subscribe to The Neuron for grounded, practical conversations about where AI is going next—and what actually has to work before the hype becomes real.

The John Batchelor Show
S8 Ep923: AI Valley examines the "innovator's dilemma," where tech giants like Google hesitate to release advanced AI that might cannibalize their lucrative search advertising profits. This "bigness" often slows innovation, leading ge

The John Batchelor Show

Play Episode Listen Later May 25, 2026 11:10


AI Valley examines the "innovator's dilemma," where tech giants like Google hesitate to release advanced AI that might cannibalize their lucrative search advertising profits. This "bigness" often slows innovation, leading geniuses like Mustafa Suleyman to leave DeepMind at Google to found independent ventures like Inflection. However, the staggering cost of GPUs and computing power often pulls these startups back into the orbit of trillion-dollar corporations. For example, Suleyman eventually moved Inflection to Microsoft to leverage their near-bottomless cash reserves. This dynamic ensures that only the wealthiest companies with massive reach can truly compete in the expensive race for generative AI supremacy. (5/8) 1905 LA

The John Batchelor Show
S8 Ep924: Keach Hagey recounts the January 2016 founding of OpenAI in San Francisco, initially established as a modest nonprofit research lab in Greg Brockman's apartment. Co-founded by Sam Altman, Brockman, and chief scientist Ilya Sutskever, the organi

The John Batchelor Show

Play Episode Listen Later May 25, 2026 10:25


Keach Hagey recounts the January 2016 founding of OpenAI in San Francisco, initially established as a modest nonprofit research lab in Greg Brockman's apartment. Co-founded by Sam Altman, Brockman, and chief scientist Ilya Sutskever, the organization aimed to develop artificial general intelligence (AGI) safely outside of profit motives. Major initial backers included Elon Musk and Peter Thiel, who sought to create a counterweight to Google's DeepMind. The discussion explains how neural networks utilize Nvidia's GPUs—originally designed for video games—to mimic human thought, forming the technical foundation for the current AI race. (1/4) MARCH 1959

Invest Like the Best with Patrick O'Shaughnessy
Gavin Baker - Watts and Wafers - [Invest Like the Best, EP.473]

Invest Like the Best with Patrick O'Shaughnessy

Play Episode Listen Later May 20, 2026 76:51


My guest today is Gavin Baker, founding partner and CIO of Atreides Management, and this is our sixth conversation. The central theme is watts and wafers, the two physical constraints that in Gavin's view will dictate the next phase of AI. On power, he thinks the near-term shortage starts to ease in 2027 and 2028 as new sources of energy come online, and that orbital compute solves it in the long term. On wafers, he explains what is different this time from the dotcom bubble and why TSMC's capacity decisions may be the single most important variable to watch. We also discuss Elon's Terrafab, the disaggregation of GPUs, the role of new chip companies, and whether the economic value of AI will keep accruing to frontier models. For the full show notes, transcript, and links to mentioned content, check out the episode page here.

-----
Become a Colossus member to get our quarterly print magazine and private audio experience, including exclusive profiles and early access to select episodes. Subscribe at colossus.com/subscribe.

-----
Ramp's mission is to help companies manage their spend in a way that reduces expenses and frees up time for teams to work on more valuable projects. Go to ramp.com/invest to sign up for free and get a $250 welcome bonus.

-----
Trusted by thousands of businesses, Vanta continuously monitors your security posture and streamlines audits so you can win enterprise deals and build customer trust without the traditional overhead. Invest Like the Best listeners get a special offer of $1,000 off Vanta when you go to vanta.com/invest.

-----
WorkOS is the infrastructure B2B and AI-native companies use to sell to enterprise. It covers everything enterprise security requires: SSO, SCIM, RBAC, Audit Logs, AI governance, and more. Trusted by 2,000+ fast-growing companies, including OpenAI, Anthropic, Cursor, and Vercel.

-----
Rogo is the AI platform for finance. They're building agents for Wall Street that are trained to understand how bankers and investors actually do work: from diligence and modeling, to turning analysis into deliverables. To learn more, visit rogo.ai/invest.

-----
Ridgeline has built a complete, real-time, modern operating system for investment managers. It handles trading, portfolio management, compliance, customer reporting, and much more through an all-in-one real-time cloud platform. Visit ridgelineapps.com.

-----
Editing and post-production work for this episode was provided by The Podcast Consultant (https://thepodcastconsultant.com).

Timestamps:
(00:00:00) Welcome to Invest Like The Best
(00:02:29) Gavin Baker Intro
(00:03:32) Anthropic's Record ARR Growth
(00:11:49) Should OpenAI and Anthropic Raise at a Much Higher Valuation?
(00:13:23) How Elon Preserves Investor Trust
(00:14:00) Watts & Wafers
(00:15:45) Data Centers in Space Explained
(00:20:51) Orbital Compute's Impact on Terrestrial Data Centers
(00:26:24) TSMC Supply Discipline & Bubble Risk
(00:30:50) Demand for Frontier Tokens & The Bitter Lesson
(00:35:33) Continual Learning & Memory
(00:40:01) New Chip Companies & Startups
(00:42:49) Prefill vs. Decode Disaggregation
(00:48:40) AI-Native Founders: Different & Hard
(00:51:27) Token Path & Application Layer
(00:56:13) How Gavin Uses AI in Atreides
(01:00:06) Signs of a Diversity Breakdown
(01:05:42) Google, Meta, Amazon, Microsoft
(01:11:42) Broader Knock-On Effects of AI

Grumpy Old Geeks
746: Reality is Frequently Inaccurate

Grumpy Old Geeks

Play Episode Listen Later May 15, 2026 79:40


FOLLOW UP starts with merchandise promotion and YouTube begging reminiscent of 2007, before GameStop CEO Ryan Cohen gets thoroughly criticized by eBay after proposing a $56 billion takeover plan that eBay called “neither credible nor attractive,” which is corporate-speak for “please stop emailing us at 3 a.m.” Meanwhile, California residents might finally receive a small settlement check from Grubhub worth about half a burrito, just as Americans realize they dislike AI data centers even more than nuclear plants because nobody wants a warehouse full of GPUs boiling away the local water supply. Lake Tahoe residents are learning their electricity now goes to AI processing plants instead of people, xAI keeps adding methane turbines despite being sued over them, and SpaceXAI employees are fleeing Elon's “sleep under your desk forever” lifestyle as if it were the last helicopter out of Saigon.

IN THE NEWS, we start gently with the revelation that everyone at the Musk v. Altman trial is sitting on luxury butt cushions because apparently the singularity requires lumbar support, before plunging straight into the abyss: fake AI crypto journalists haunting Forbes and HuffPost like SEO poltergeists, OpenAI launching “Daybreak” so the robots can now secure the software they helped break, Anthropic trying to stop AI from becoming evil by feeding it morality fan fiction, and Google catching AI-generated zero-day exploits in the wild because cyberpunk novels were apparently instructional manuals.
Waymo robotaxis are experimenting with driving into floodwaters, a family is suing OpenAI after ChatGPT allegedly advised their son to mix drugs with fatal results, graduating students booed an executive for praising AI as if she were announcing the arrival of cholera, and Meta continues its speedrun toward becoming the world's largest scam mall while simultaneously demanding everyone trust its shiny new “encrypted AI chats.” Also: Meta is testing Grok-for-Threads, somebody created an AI poop-analysis startup that quietly sells your bowel movements to data brokers, GM got nailed for selling driver data, Lime still somehow exists and wants an IPO, and Japan's first 3D-printed house shows that the future will at least look cool even as society collapses.

MEDIA CANDY features Spotify celebrating twenty years of collecting your listening habits into a psychological profile you absolutely didn't care about during the CD era, plus The Punisher: One Last Kill ironically looking like unfinished PlayStation cutscenes, Good Omens Season 3, Devil May Cry Season 2, NBC somehow turning Wordle into a TV show because every executive has fully given up, shorter waits for Severance Season 3, and Rings of Power returning in November to continue spending the GDP of a small nation on elf misery.

APPS & DOODADS checks in with Apple as it prepares Siri app integrations that developers already suspect will become subscription-based hostage situations. TikTok is testing an ad-free tier in the UK because, somehow, ads weren't already enough punishment. Venmo is finally realizing that public payment feeds are insane.
There's a Wikipedia clone made entirely of AI hallucinations, and an iPad arm mount sturdy enough to survive the upcoming climate wars.

AT THE LIBRARY wraps up with Clowns (First Contact), Dungeon Crawler Carl, the demise of another Goodreads competitor, Kindle alternatives for those trying to escape Amazon's panopticon, and a reminder that Douglas Adams has now been gone for 25 years, which remains, in the immortal words of the man himself, widely regarded as a bad move.

Sponsors:
DeleteMe - Get 20% off your DeleteMe plan when you go to JoinDeleteMe.com/GOG and use promo code GOG at checkout.
Shopify - Sign up for your one-dollar-per-month trial today at Shopify.com/grumpy
CleanMyMac - Get Tidy Today! Try 7 days free and use code OLDGEEKS for 20% off at clnmy.com/OLDGEEKS
Private Internet Access - Go to GOG.Show/vpn and sign up today. For a limited time only, you can get OUR favorite VPN for as little as $2.03 a month.
SetApp - With a single monthly subscription you get 240+ apps for your Mac. Go to SetApp and get started today!!!
1Password - Get a great deal on the only password manager recommended by Grumpy Old Geeks! gog.show/1password

Show notes at https://gog.show/746
Watch on YouTube at https://youtu.be/ICjNBnP3sMk

FOLLOW UP
Grumpy Old Geeks Merch Store
Grumpy Old Geeks on YouTube
eBay Brutally Rejects GameStop's $56 Billion Proposal: ‘Neither Credible nor Attractive'
Wang et al. v. Grubhub, Inc.
Americans Oppose AI Data Centers in Their Area
Energy supplier abandons Lake Tahoe residents to serve data centers
xAI Got Sued Over Its Gas Turbines, so It Naturally Added More of Them
Elon Musk's SpaceXAI has been bleeding staff since its merger

IN THE NEWS
Everyone at the Musk v. Altman Trial Is Using Fancy Butt Cushions
Four Financial Journalists Accused of Being Fake AI-Generated Puppets That Shill Crypto in Forbes, HuffPost, and More
Daybreak is OpenAI's response to Anthropic's Claude Mythos
Anthropic blames dystopian sci-fi for training AI models to act “evil”
Google announces its first-ever discovery of a zero-day exploit made with AI
Waymo Admits Its Robotaxis Have a Small Issue With Driving Into Floodwaters
Family sues OpenAI, alleging ChatGPT advice led to accidental overdose
Graduation Speaker Says AI Is ‘The Next Industrial Revolution,' Immediately Drowned Out by Booing Students
Meta is facing another lawsuit over scam ads on Facebook and Instagram
After Killing Encrypted DMs, Mark Zuckerberg Wants You to Trust His New Encrypted AI Chat
Hey @meta.ai is that true? Threads is testing a Grok-like AI feature
Internet of Shit: AI Poop Analysis App Offered to Sell Me Database of Its Users' Poops
GM agrees to pay $12.75 million to settle California lawsuit over misuse of customers' driving data
The electric scooter rental company Lime has filed for IPO
This startup built Japan's first 3D-printed two-story home. It wants to solve the country's construction crisis

APPS & DOODADS
Apple wants apps to integrate with Siri in iOS 27, but one fear holds some back: report
TikTok is rolling out an ad-free option in the UK
Venmo's redesigned app offers more discreet payments by default
New Wikipedia Clone Made Entirely of AI Hallucinations
YICOSUN iPad Mount Tablet Holder, 3-Section Foldable Adjustable Aluminum Alloy Arm with Rotating Clamp Base, Heavy Duty Desk Bracket for iPad Tablet Phone Portable Monitor, Bed Office Kitchen

MEDIA CANDY
Spotify is celebrating its 20th birthday with a Wrapped-like feature that covers your entire time on the app
The Punisher: One Last Kill
Here's the Real Deal With That Viral Shot From 'Punisher: One Last Kill'
Good Omens Season 3 - The Finale
Devil May Cry Season 2
NBC is turning Wordle into a TV show
Adam Scott Promises the Wait for ‘Severance' Season 3 Won't Be Nearly as Long
‘Lord of the Rings: The Rings of Power' Is Returning in November

AT THE LIBRARY
Clowns (First Contact) by Peter Cawdron
Dungeon Crawler Carl by Matt Dinniman
Tome, another Goodreads booktracker rival, shuts down
Bookshop.org
Kobo
Smashwords
eBooks.com
Kobo E-readers
ONYX BOOX
The Ultimate Hitchhiker's Guide to the Galaxy Omnibus

CLOSING SHOUT-OUTS
'Revenge of the Nerds' Actor Donald Gibb Dead at 71

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.