Podcasts about Kubernetes

Software to manage containers on a server-cluster

  • 1,392PODCASTS
  • 10,615EPISODES
  • 45mAVG DURATION
  • 2DAILY NEW EPISODES
  • Jun 17, 2026LATEST
Kubernetes

POPULARITY

20192020202120222023202420252026

Categories



Best podcasts about Kubernetes

Show all podcasts related to kubernetes

Latest podcast episodes about Kubernetes

Code Story
The AI Control Loop: AI Discovery isn't just AI - with Tim Ebbers of Wallarm

Code Story

Play Episode Listen Later Jun 17, 2026 16:10 Transcription Available


Today, we are dropping another episode in our series The AI Control Loop, How enterprises govern the AI they've already deployed - sponsored by our friends at Wallarm.Wallarm is the AI Control Platform for Enterprise AI, protecting every AI workload, API, and application in production, giving CISOs the governance they need and CIOs the speed they demand. Organizations choose Wallarm for a complete inventory of APIs, AI agents, and AI apps, patented AI/ML-based threat detection and blocking that operates at production traffic speeds.We all know that you can't secure what you can't see, which is why AI discovery is a first principle for AI security, but what's really required for AI discovery? It's more than just LLMs and agents. Today's episode is entitled AI Discovery isn't just AI, and joining us is Tim Ebbers, Field CTO at Wallarm. Tim and I discuss the real requirements for AI discovery, and why the connections between assets and infrastructure are part of the puzzle.QuestionsSecurity teams often say, “You can't secure what you can't see.” In the context of AI, what exactly do they need to see? What supporting infrastructure matters most when mapping AI risk, such as APIs, cloud services, Kubernetes workloads, data stores, identities, and external integrations?Where does shadow AI typically appear first inside an enterprise environment? How can it be prevented?How do relationships between assets change the risk picture? For example, why does it matter which API an agent can call or which data source a workflow can reach?What makes AI discovery harder than traditional application or cloud asset discovery? What are the similarities and differences?How should organizations prioritize what they find? Is every AI asset equally risky?What does “continuous discovery” mean in a world where AI services can be deployed, connected, or changed in minutes?Once an organization has visibility into its AI footprint, what's next? What are the biggest gaps in today's AI security programs?Linkshttps://www.wallarm.com/https://www.linkedin.com/in/tebbers/Full AbstractMost security teams know that you can't secure what you can't see. In the context of AI, that rule turns out to be a lot harder to satisfy than it sounds.AI discovery isn't just a matter of cataloging your LLMs and agents. The real picture includes the APIs those agents call, the data sources they reach, the infrastructure they run on, and all the AI that got deployed without anyone telling security. Building that picture requires understanding relationships, not just inventories, because risk doesn't live in assets in isolation. It lives in what those assets can do together.In this episode, Tim Ebbers, Field CTO at Wallarm, examines what a complete AI control loop actually requires at the discovery stage: what needs to be visible, why the connections between assets change the risk calculation, where shadow AI tends to appear first and how it becomes unmanaged risk, and what makes AI discovery structurally different from traditional cloud or application discovery. It also looks at what organizations should do once discovery is in place, and where the biggest gaps remain in AI security programs today.If your team is building toward continuous AI governance, this is where that work starts.Our Sponsors:* Check out Cash App and use my code CASHAPP10 for a great deal: https://click.cash.app/ui6m/mt82fpxl #CashAppPod. Cash App is a financial services platform, not a bank. Banking services provided by Cash App's bank partner(s). Prepaid debit cards issued by Sutton Bank, Member FDIC. See terms and conditions at https://cash.app/legal/us/en-us/card-agreement. Cash App Green, overdraft coverage, borrow, cash back offers and promotions provided by Cash App, a Block, Inc. brand. Visit http://cash.app/legal/podcast for full disclosures.* Check out Plaud AI and use my code CODESTORY for a great deal: https://plaud.aiAdvertising Inquiries: https://redcircle.com/brandsPrivacy & Opt-Out: https://redcircle.com/privacy

Alexa's Input (AI)
David Aronchick on Distributed Data Orchestration with Expanso

Alexa's Input (AI)

Play Episode Listen Later Jun 15, 2026 77:32


In this episode of Alexa's Input (AI), I sit down with David Aronchick, co-founder and CEO of Expanso and former product lead for Kubernetes at Google.Data is growing everywhere outside your data center. Solar panels in remote across a country. Security cameras at retail stores. IoT sensors across factory floors. And moving that data to the cloud for processing? It's expensive, slow, and often restricted by compliance.David is an expert when it comes to solving distribution problems. He led Kubernetes product at Google, co-founded Kubeflow to bring ML to production, and now he's building Expanso to tackle a difficult constraint: when your data can't move, how do you process it where it lives?We discuss:- The need for distributed data orchestration-Upstream data control: filtering and transforming at the source- Three forces making edge computing inevitable (physics, regulations, economics)- How to build successful open source infrastructure projects- Customer discovery and finding real pain points- His transition from Protocol Labs to founding Expanso- ETL pipelines: moving the first four steps closer to the data- Context loss and lineage in distributed systems- Processing 400,000 signals per second with 150MB agents- AI observability: attaching source metadata to training data- Running ML pipelines at the edge- Real-world deployment challenges (bandwidth, regulations, cost)Expanso is rethinking how we process data in an AI-native world—moving compute to data instead of data to compute. If you want to understand where distributed systems and edge computing are heading, this is a deep dive into the infrastructure layer beneath modern AI applications.General Podcast LinksWatch: https://www.youtube.com/@alexa_griffith Read: https://alexasinput.substack.com/ Listen: https://creators.spotify.com/pod/profile/alexagriffith/ More: https://linktr.ee/alexagriffithLearn more about the host atWebsite: https://alexagriffith.com/ LinkedIn: https://www.linkedin.com/in/alexa-griffith/Find out more about the guest atLinkedIn: https://www.linkedin.com/in/aronchick/ Twitter/X: https://x.com/aronchick GitHub: https://github.com/aronchick Expanso Website: https://expanso.io/ResourcesExpanso Website: https://expanso.io/ Kubernetes: https://kubernetes.io/ Kubeflow: https://www.kubeflow.org/ CNCF (Cloud Native Computing Foundation): https://www.cncf.io/ Protocol Labs: https://protocol.ai/KeywordsDavid Aronchick, Expanso, Kubernetes, Kubeflow, distributed systems, edge computing, data pipelines, ETL, upstream data control, Google Kubernetes Engine, open source, CNCF, observability, log processing, data lineage, provenance, schema enforcement, IoT, edge AI, distributed data, machine learning infrastructure, Protocol Labs, IPFS, Filecoin, data governance, compliance, GDPR, bandwidth optimization, data aggregation, AI infrastructure, multi-cloud, hybrid cloud, real-time processing

DevOps and Docker Talk
K8s Maxxing with AI-Native Platform Engineering Stack with OpenChoreo

DevOps and Docker Talk

Play Episode Listen Later Jun 13, 2026 54:59


OpenChoreo is an opinionated, “batteries included”, AI-native Kubernetes platform stack for Platform Engineers that combines GitOps, Observability, AI Agents, and Workflows into a custom K8s distribution “super pack” that is managed via Backstage, CLI, API, or MCP. Now a CNCF project.Check out the video podcast version here: 

Kubernetes Podcast from Google
Agent Sandbox with Lovable, with Jonathan Grahl

Kubernetes Podcast from Google

Play Episode Listen Later Jun 12, 2026 38:35


In this episode we speak to Jonathan Grahl.  Jonathan is the Team Lead of Infrastructure at Lovable where he oversees the platform stack the company runs on. We talked about Kubernetes, Sandboxes and Chocolate.   Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod - bluesky: @kubernetespodcast.com   News of the week OpenTelemetry is a CNCF Graduated Project CNCF TAG Elections KubeCon India Kubernetes Community Days global events Links from the interview Lovable Cilium Etcd Cloudflare Sandbox Agent Sandbox (Kubernetes) OpenClaw Claude OpenAI Bun toolkit for Javascript gVisor Firecracker Kata Containers Vitess

Engineering Culture by InfoQ
Craig McLuckie on Culture as a Team's Operating System in the AI Era

Engineering Culture by InfoQ

Play Episode Listen Later Jun 12, 2026 27:29


This is the Engineering Culture Podcast, from the people behind InfoQ.com and the QCon conferences. In this podcast, Shane Hastie, Lead Editor for Culture & Methods, spoke to Craig McLuckie, co-creator of Kubernetes and CEO of Stacklok, about the impact of AI coding tools on open source communities and engineering teams, designing deliberate organisational culture, and navigating evolving career paths for engineers in the age of AI. Read a transcript of this interview: https://bit.ly/3RGzPCa Newsletter: Subscribe to the Software Architects' Newsletter for your monthly guide to the essential news and experience from industry peers on emerging patterns and technologies: https://www.infoq.com/software-architects-newsletter InfoQ online certification cohorts: Online cohorts for senior engineers and architects, built around QCon talks. Join a 5-week confidential peer group to validate your approach and apply practitioner frameworks to the technical challenges you face at work. Learn more: https://certification.qconferences.com/ Upcoming Events: QCon AI Boston 2026 (June 1-2, 2026) Learn how real teams are accelerating the entire software lifecycle with AI. https://boston.qcon.ai QCon San Francisco 2026 (November 16-20, 2026) https://qconsf.com/ The InfoQ Podcasts: Weekly inspiration to drive innovation and build great teams from senior software leaders. Listen to all our podcasts and read interview transcripts: - The InfoQ Podcast https://www.infoq.com/podcasts/ - Engineering Culture Podcast by InfoQ https://www.infoq.com/podcasts/#engineering_culture - Generally AI: https://www.infoq.com/generally-ai-podcast/ Follow InfoQ: - Mastodon: https://techhub.social/@infoq - X: https://x.com/InfoQ?from=@ - LinkedIn: https://www.linkedin.com/company/infoq/ - Facebook: https://www.facebook.com/InfoQdotcom# - Instagram: https://www.instagram.com/infoqdotcom/?hl=en - Youtube: https://www.youtube.com/infoq - Bluesky: https://bsky.app/profile/infoq.com Write for InfoQ: Learn and share the changes and innovations in professional software development. - Join a community of practitioners. - Increase your visibility. - Grow your career. https://www.infoq.com/write-for-infoq

airhacks.fm podcast with adam bien
Split-Brain, ContainerD, Quarkus and a Postgres Cloud Control Plane

airhacks.fm podcast with adam bien

Play Episode Listen Later Jun 12, 2026 55:23


An airhacks.fm conversation with Alvaro Hernandez (@ahachete) about: discussion about the quarkus Insights episode "#337 The Database Cloud" stackgres live demo, StackGres as a Quarkus and GraalVM native kubernetes operator for running Postgres, comparing CloudNativePG (CNPG) by EnterpriseDB to StackGres, Patroni for Postgres high availability, the split-brain risk of relying on Kubernetes and etcd alone, distributed consensus and leader lock election via etcd, why distributed systems and cryptography should not be self-implemented, async, synchronous and quorum (semi-synchronous) Postgres replication trade-offs, cascading and cross-region replication topologies, the false-positive problem and heuristic exceptions in two-phase commit, the ondb ("own your database") project for self-hosted Postgres, losing control with managed cloud services and untestable backups, vanilla unmodified Postgres on StackGres, the "Kubernetes without Kubernetes" (Kubeless) pattern, talking directly to ContainerD through the CRI API, runc and the Docker to ContainerD chain, a self-contained native binary that embeds ContainerD over Unix domain sockets, the slony node-local component named after the Postgres slonik elephant mascot, the Matriarch orchestrator component, reverse gRPC tunnels with Slonies phoning home across NAT and firewalls, a multi-tenant cloud control plane provided as a service, curl-pipe-shell node installation with a token, end-to-end encrypted Postgres protocol tunneling for JDBC from anywhere, psql compiled to wasm in the web console, Tailscale-inspired user experience, unifying nodes, Kubernetes clusters and cloud pools as resources, Slony Kubernetes controller, Java 25 source-mode scripting without dependencies, implementing your own MCP server for Postgres JDBC metadata, the Goose agentic UI donated by Block to the Linux Foundation, AI Rails BCE, Java, Web Components skills Alvaro Hernandez on twitter: @ahachete

LowOpsCast
#50 Da advocacia ao Cloud Native global: carreira internacional com Julia Furst Morgado

LowOpsCast

Play Episode Listen Later Jun 12, 2026 71:43


Neste episódio do LowOpsCast, o papo é com Julia Furst Morgado, CNCF Ambassador, Community Manager OpenTelemetry, AWS Container Hero, Docker Captain, organizadora do KCD New York e palestrante internacional.Mas antes dos títulos e das comunidades globais, a conversa começa pela pessoa.A Julia tem uma trajetória diferente daquelas que parecem roteiro pronto de carreira em tecnologia. Ela começou no Direito no Brasil, estudou negócios na UC Berkeley, passou por marketing e, em 2022, entrou de vez na área tech. Um caminho nada linear, mas cheio de repertório, comunicação, estratégia e coragem para mudar de rota.Hoje, ela atua como Principal Developer Relations Engineer na Dash0, ajudando pessoas desenvolvedoras a adotarem OpenTelemetry e criarem fluxos práticos de observabilidade. Também participa ativamente da comunidade OpenTelemetry e de diversas iniciativas cloud native ao redor do mundo.Neste episódio, a gente conversa sobre:- Transição de carreira para tecnologia- Como transformar uma trajetória não linear em vantagem- Comunidade, pertencimento e construção de confiança- Developer Relations e o papel de educar, conectar e apoiar pessoas- OpenTelemetry e observabilidade na prática- Cloud Native, Kubernetes, Docker e AWS- Como é palestrar internacionalmente em diferentes idiomas- Organização de eventos como KCD New York, AWS Community Day NY e CNCF Meetup NYCTambém iremos falar sobre carreira internacional, open source, desafios de entrar em tecnologia vindo de outra área e a importância de não se excluir só porque o caminho parece diferente do “padrão”.Esse episódio é sobre tecnologia, sim.Mas também é sobre coragem, curiosidade, comunidade e sobre entender que não existe uma única forma certa de construir uma carreira em tech.Se você está em transição de carreira, trabalha com DevOps, SRE, Cloud Native, observabilidade, open source ou quer entender melhor o papel de comunidade na evolução profissional, esse episódio vai fazer muito sentido.Links compartilhados pela Julia:https://www.juliafmorgado.com/posts/the-complete-guide-to-ace-your-next-networking-coffee-chat/https://www.youtube.com/watch?v=HHAXlDu49rEhttps://github.com/juliafmorgado

Ardan Labs Podcast
Innovation, Viva Technology, and Startups with François Bitouzet

Ardan Labs Podcast

Play Episode Listen Later Jun 10, 2026 86:57


In this episode of the Ardan Labs Podcast, Ale Kennedy talks with François Bitouzet, Managing Director of Viva Technology, about the forces shaping the future of technology and innovation. François shares his journey from studying in France to leading one of the world's largest technology and startup events, connecting entrepreneurs, investors, and industry leaders from around the globe.00:00 Introduction02:58 Education and Early Influences08:53 Early Career and Communication17:47 Communication in a Changing World32:25 Innovation and Technology42:40 Creativity and Marketing49:38 Leadership and Career Growth54:54 Adapting to Technological Change59:42 The Future of Events01:06:54 AI and Society01:11:20 Startups and Innovation01:15:35 Deep Tech and the FutureConnect with François: LinkedIn: https://www.linkedin.com/in/fran%C3%A7ois-bitouzet-180a89/Mentioned in this Episode:Viva Technology: https://vivatechnology.comWant more from Ardan Labs? You can learn Go, Kubernetes, Docker & more through our video training, live events, or through our blog!Online Courses : https://ardanlabs.com/education/ Live Events : https://www.ardanlabs.com/live-training-events/ Blog : https://www.ardanlabs.com/blog Github : https://github.com/ardanlabs

FinanZe
Episode 26: Matt Moore, Co-Founder and CTO of Chainguard

FinanZe

Play Episode Listen Later Jun 10, 2026 53:21 Transcription Available


Send us Fan MailToday's guest is Matt Moore, Co-Founder and CTO of Chainguard — the software supply chain security company on a mission to make the open-source ecosystem safe for enterprises everywhere. Matt's background spans some of the most consequential corners of the cloud-native world. Before co-founding Chainguard, he was a core contributor to the open-source ecosystem and played a pivotal role at Google, where he helped shape the Kubernetes and supply chain security landscape that underpins much of modern software infrastructure today.In today's conversation, we'll dive into Matt's journey from open-source engineering to founding a company, how Chainguard is tackling one of the most critical — and often overlooked — challenges in enterprise software, the philosophy behind building security into the foundation rather than bolting it on, what it takes to turn deep technical expertise into a venture-backed company, and his vision for the future of software supply chain security in an AI-first world.Support the show

Founded and Funded
The Best Infrastructure Moment Since Cloud

Founded and Funded

Play Episode Listen Later Jun 10, 2026 41:36


Joe Beda and Craig McLuckie co-created Kubernetes, the infrastructure standard that became the default for cloud native computing. Now running Stacklok, they're watching enterprises hit the same identity, permissions, and security problems with AI agents that took the container ecosystem years to resolve, and they're building tools to compress that timeline. In this episode of Founded & Funded, Madrona's Tim Porter sits down with Joe and Craig to talk through what AI adoption actually requires: why MCP is the Docker moment for AI-native applications, how the LLM gateway is becoming a strategic chokepoint for cost, safety, and model flexibility, and why enterprises that don't get the architecture right early will face a familiar trap: vertical integration that looks like productivity and acts like lock-in. They cover: Why the developer workflow is the template for knowledge worker AI adoption, and where the analogy breaks down The mainframe vs. open platform question that will define the AI infrastructure era Why the knowledge worker transition is harder than it looks — and what has to be built differently before developer-grade AI tooling can scale to the rest of your organization The governance gap between human accountability and AI behavior, and what enterprises actually need to build to close it Where to start: MCP controls first, LLM gateway second, and why deploying a platform without staying to close the loop consistently fails Transcript: https://www.madrona.com/the-best-infrastructure-moment-since-cloud Chapters: (0:00) – Introduction (1:04) – Why the Kubernetes Creators Are the Right People to Read This AI Moment (2:18) – Joe's Lesson from Cloud Native: Ignore Conventional Wisdom, Except When You Shouldn't (4:16) – Craig on Enterprises and the Chaos of a New Infrastructure Era (5:32) – Why Joe Rejoined Craig at Stacklok: The Engineer's Case for Getting Your Hands Dirty (7:05) – Developers as Agent Orchestrators: How the Knowledge Worker Transition Will Follow (10:10) – MCP Explained: Craig Sees Docker in 2013 When He Looks at the MCP Spec 1 (7:53) – The Mainframe vs. Open Platform Question That Will Define the AI Era (20:24) – LLM Lock-In Is the Wrong Worry: The Real Risk Is Left of the Model (25:19) – Where Enterprises Actually Start: Developer Posture First, Knowledge Workers Second (29:10) – MCP First, LLM Gateway Second: The Concrete Technical Starting Point (31:19) – How Stacklok Builds Software Now: Agents, Smaller Teams, the Unrecognizable Developer Profile (38:07) – The Recruiter Who Started Building Agents: What AI Tools Do to Role Boundaries

The DevOps Kitchen Talks's Podcast
DKT98: IPv8, Terraform 1.15, Terragrunt 1.0, NGINX 1.30 - новости DevOps

The DevOps Kitchen Talks's Podcast

Play Episode Listen Later Jun 10, 2026 85:30


IPv8 уже в драфте, Terraform хоронят в блогах, а одна компания жжёт $7M в год на Claude Code. Собрали новости DevOps, которые вы накидали через бота. О ЧЁМ ВЫПУСК Новостной выпуск: накопилось 63 новости, разобрали самое горячее. И снова с нами Ярослав. В этом выпуске: • IPv8: драфт в IETF - ASN в первых 32 битах, старый IPv4 во вторых. Без NAT и dual-stack, плюс токен-идентичность на каждый девайс • Terraform 1.15: переменные в source и version модулей, отдельная аутентификация для S3 backend • "Terraform is dead": разбираем хайповую статью - спека как desired state, Pulumi, CDK и причём тут AI • Terragrunt 1.0: units, stacks и фильтр по affected-ресурсам через git worktree • NGINX 1.30: sticky sessions, keep-alive и HTTP/2 к апстримам, Early Hints (103), Encrypted Client Hello • Экономика AI: Semi-Analysis масштабировала Claude Code до $7M/год, дефицит RAM • Google Agent Sandbox: новый Kubernetes CRD между StatefulSet и Deployment Сквозная мысль выпуска: AI ускоряет всё, но без понимания, как работают системы, спека и вайб-кодинг рано или поздно стреляют в ногу. ГОСТЬ Ярослав Бледковский - Un-principal SRE, Wargaming ССЫЛКИ Все новости выпуска (тезисы, голосование, ссылки): https://dkt-ai.github.io/episodes-news/episodes/episode-97-ru Присылайте новости через бота: @dkt_news_bot Упомянутые ресурсы: • IPv8 draft (IETF): https://www.ietf.org/archive/id/draft-thain-ipv8-00.html • Terraform 1.15: https://github.com/hashicorp/terraform/releases/tag/v1.15.0 • "Terraform is dead" (статья): https://grahamgilbert.com/blog/2026/04/20/terraform-is-dead/ • Terragrunt 1.0: https://github.com/gruntwork-io/terragrunt/releases/tag/v1.0.0 • NGINX 1.30: https://github.com/nginx/nginx/releases/tag/release-1.30.0 • AI tokens (Dylan Patel / Semi-Analysis): https://www.youtube.com/watch?v=LF3aUIM57uw • Kubernetes Agent Sandbox: https://github.com/kubernetes-sigs/agent-sandbox ПОДКАСТ YouTube - www.youtube.com/@DevOpsKitchenTalks Apple Podcasts - https://apple.co/41O6mqA Spotify - https://t.ly/Jg5_2 Yandex Music - https://music.yandex.ru/album/10151746 PodBean - https://devopskitchentalks.podbean.com НАВИГАЦИЯ 00:00 - Интро: вы уже на IPv6 или ещё IPv4? И снова в гостях Ярослав 02:53 - Anthropic и 200K карточек от Маска: лимиты Claude отпустило 04:26 - Адженда из 63 новостей через наш Telegram-бот

The Kubelist Podcast
Ep. #53, Render and the New Cloud Stack with Anurag Goel

The Kubelist Podcast

Play Episode Listen Later Jun 9, 2026 59:55


On episode 53 of The Kubelist Podcast, Marc Campbell and Benjie De Groot sit down with Anurag Goel. Anurag shares his journey from early employee at Stripe to founder and CEO of Render, one of the fastest-growing application platforms in cloud infrastructure. They discuss Kubernetes, platform engineering, bare metal, AI agents, and what the future of application deployment looks like.

Heavybit Podcast Network: Master Feed
Ep. #53, Render and the New Cloud Stack with Anurag Goel

Heavybit Podcast Network: Master Feed

Play Episode Listen Later Jun 9, 2026 59:55


On episode 53 of The Kubelist Podcast, Marc Campbell and Benjie De Groot sit down with Anurag Goel. Anurag shares his journey from early employee at Stripe to founder and CEO of Render, one of the fastest-growing application platforms in cloud infrastructure. They discuss Kubernetes, platform engineering, bare metal, AI agents, and what the future of application deployment looks like.

De Nederlandse Kubernetes Podcast
#136: vLLM, LMD, and the Quest to Build the Linux of AI Inference

De Nederlandse Kubernetes Podcast

Play Episode Listen Later Jun 9, 2026 32:21


In this episode, hosts Ronald and Jan are joined at KubeCon by two guests from Red Hat: Brian Stevens, AI CTO and one of the original architects behind the creation of Kubernetes and the CNCF, and Rob Shaw, co-lead of the vLLM project and maintainer of LMD.Brian shares the remarkable backstory of how Kubernetes came to be open source, including how Red Hat negotiated a single committer seat before agreeing to be a launch partner, and how he later pushed Google to contribute Kubernetes to the newly formed CNCF rather than keeping it proprietary like TensorFlow.Rob explains what an inference runtime actually is: the critical piece of software that takes an abstract AI model and runs it as efficiently as possible on a GPU or other accelerator — handling everything from CUDA-level kernel optimization to memory management and concurrent request scheduling. vLLM serves as a "Rosetta Stone" between the ever-growing zoo of models (Llama, DeepSeek, Mistral, Qwen, Nvidia Nemotron) and accelerators (Nvidia, AMD, Intel, Google TPUs).The conversation covers model compression and quantization how techniques like 4-bit precision can deliver 2x hardware efficiency gains while preserving 99%+ model accuracy. Brian and Rob also address the "big model vs. many small models" debate, recommending to always start with the largest capable model to validate a use case before optimizing down.Looking ahead, both guests see inference as potentially the single largest workload ever run on Kubernetes, and position LMD (now contributed to the CNCF) as the distributed inference layer that will make this possible across heterogeneous accelerator environments  preventing enterprises from ending up with 42 incompatible AI stacks.The episode closes with a discussion on AI slop, human-in-the-loop thinking, and the future of Kubernetes as the universal platform for running AI agents at scale.Powered by  @acc-ict ​Stuur ons een bericht.ACC ICT Specialist in IT-CONTINUÏTEIT Bedrijfskritische applicaties én data veilig beschikbaar, onafhankelijk van derden, altijd en overalSupport the showLike and subscribe! It helps out a lot.You can also find us on:De Nederlandse Kubernetes Podcast - YouTubeNederlandse Kubernetes Podcast (@k8spodcast.nl) | TikTokDe Nederlandse Kubernetes PodcastWhere can you meet us:EventsThis Podcast is powered by:ACC ICT - IT-Continuïteit voor Bedrijfskritische Applicaties | ACC ICT

AWS Morning Brief
OpenAI on Bedrock and Other Strange Bedfellows

AWS Morning Brief

Play Episode Listen Later Jun 8, 2026 7:25


AWS Morning Brief for the week of June 8th, with Corey Quinn. Links:AWS Interconnect - multicloud now offers a free 500 Mbps tierOracle Database@AWS is now available in twenty AWS RegionsAmazon Cognito now supports multi-Region replicationAmazon EKS and Amazon EKS Distro now supports Kubernetes version 1.36Amazon SES now supports tenant-level suppression listsAWS Compute Optimizer now supports 32-day lookback for EBS volume and ECS service rightsizing recommendationsAWS Cost and Usage Report 2.0 now supports Athena and Redshift integrationAmazon ElastiCache for Valkey now supports durabilityUnderstanding how backups work in Amazon AuroraOpenAI models and Codex on Amazon Bedrock are now generally availableHow Bedrock Streaming optimizes its AWS costsFrom Monolith to Multi-Account: Pinterest's AWS Organization Transformation JourneyGain visibility into DDoS attacks with flow logs in AWS Shield AdvancedIdentify unused AWS KMS keys and prevent accidental key deletionsCVE-2026-10591 - Kiro IDE Insufficient File Write Restrictions to Execution-Sensitive PathsCVE-2026-10584 - HTTPS Fallback to HTTP in Graph Explorer

The Generative AI Meetup Podcast
The Best Open Source US Model (Right behind China)

The Generative AI Meetup Podcast

Play Episode Listen Later Jun 7, 2026 114:55 Transcription Available


https://novacut.ai/  https://genaimeetup.com/  Anthropic has officially closed a $65 billion Series H at a $965 billion valuation, nearly 2.5x its valuation from just 100 days ago. Meanwhile, funding is flowing across the ecosystem: Frameworks AI at $15B, Baseten at $11B, OpenRouter's $113M Series B, and Cognition AI's $1B Series D. NVIDIA went on an open-source super week with Nemotron 3 Ultra, Cosmos 3, and Nemotron 3.5 ASR. Microsoft dropped 5 new MAI models. Google released Gemma 4 12B, and Anthropic shipped Opus 4.8. On the benchmarks front, DeepSWE crowns GPT-5.5 as the leader in long-horizon coding tasks, while ITBench shows even frontier models struggle with real-world SRE incidents — Claude Opus 4.7 tops out at just 47%. Plus: Cloudflare acquires VoidZero to build the future of AI-native edge development, and Google is paying SpaceX $920M/month for compute. Topics covered: • Anthropic's $65B Series H and path to $1T • Fireworks AI, Baseten, OpenRouter & Cognition funding rounds • Microsoft's 5 new MAI models • NVIDIA's open-source super week (Nemotron, Cosmos 3) • MiniMax M3, Gemma 4 12B, JetBrains Mellum2, Opus 4.8 • DeepSWE benchmark: GPT-5.5 leads long-horizon coding • ITBench: Frontier models under 50% on real SRE tasks • Cloudflare + VoidZero for AI-native edge dev • Google's $920M/month SpaceX compute deal #AI #Anthropic #NVIDIA #OpenAI #AInews #TechNews #LLM     Funding rounds Anthropic formally confirmed the closure of its $65 billion Series H funding round at a post-money valuation of $965 billion. This represents a 2.5-fold increase over its $380 billion Series G valuation from February 2026, adding $585 billion in value in approximately 100 days https://www.anthropic.com/news/series-h  Frameworks AI raising at 15B valuation representing a near fourfold increase from its $4 billion Series C valuation recorded in October 2025 processing 15 trillion tokens daily for major production clients including Cursor, Notion, and Perplexity https://finance.yahoo.com/sectors/technology/articles/fireworks-ai-eyes-15-billion-174609357.html Baseten is raising 1B at 11B valuation annualized revenue, which skyrocketed from $200 million to $600 million over a single quarter https://techstartups.com/2026/05/26/ai-inference-startup-baseten-in-talks-to-raise-1-billion-at-11-billion-valuation/  OpenRouter has secured a $113 million Series B funding OpenRouter has experienced exponential traffic growth, with weekly production throughput expanding fivefold from 5 trillion to 25 trillion tokens over a six-month horizon https://www.businesswire.com/news/home/20260526953416/en/OpenRouter-Raises-%24113-Million-CapitalG-led-Series-B-as-Weekly-Volume-Explodes-to-25T-Tokens  Further up the stack: Cognition AI secured a $1 billion Series D round led by Lux Capital and 8VC https://cognition.ai/blog/series-d   Model Releases MAI models: MAI-Code-1-Flash: A 5-billion active parameter model optimized for ultra-low latency within GitHub Copilot and VS Code. MAI-Image-2.5: A high-fidelity image generation model ranking third on global image evaluation arenas, outperforming competing architectures like Nano Banana Pro. MAI-Transcribe-1.5: A multi-lingual speech processing engine offering fivefold speed improvements across 43 languages. MAI-Voice-2: Natural audio and voice generation across 15 languages, available at a highly competitive price point. Web IQ: A search-grounding API engineered to directly compete with Perplexity. https://microsoft.ai/models/    https://www.peoplematters.in/news/ai-and-emerging-tech/uber-imposes-dollar1500-monthly-ai-spending-limit-on-employees-amid-rising-costs-50073    Nvidia has executed an "Open-Source Super Week," positioning itself as a dominant software and model publisher: Nemotron 3 Ultra (best US open source open weights model but behind china): A massive 550-billion parameter MoE (55 billion active) designed with a 1-million token context window, optimized specifically for high-throughput, cyclical agent loops. It achieved peak throughput rates of 400 tokens per second on day-zero optimized clusters. Cosmos 3: A physical AI world-modeling framework comprising 16-billion Nano and 64-billion Super variants. Built on a Mixture-of-Transformers (MoT) architecture, Cosmos 3 natively binds textual, visual, auditory, and physical kinetic vectors. Nemotron 3.5 ASR: A highly compact 0.6-billion parameter streaming speech recognition model pushing sub-100 millisecond latencies across 40 language locales.   https://www.minimax.io/models/text/m3  MiniMax M3: A 1-million token context model hitting 59.0% on SWE-Bench Pro and 74.2% on MCP Atlas, though noted for high token consumption due to intensive internal self-validation loops.   https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/  Gemma 4 12B: Google's Apache 2.0 on-device model, which utilizes an encoder-free architecture that projects vision and audio vectors directly into the text-token space, bypassing separate CLIP-style encoders to minimize local memory footprints. https://www.jetbrains.com/mellum/  JetBrains Mellum2: A compact 12-billion parameter MoE (2.5 billion active) engineered for ultra-low latency routing and retrieval-augmented generation (RAG) sub-agents within developer IDEs. Opus 4.8 https://www.anthropic.com/news/claude-opus-4-8    https://www.cnbc.com/2026/06/05/google-to-pay-spacex-920-million-a-month-for-xai-compute-capacity.html      Benchmarks: https://deepswe.d atacurve.ai/blog https://venturebeat.com/technology/deepswe-blows-up-the-ai-coding-leaderboard-crowns-gpt-5-5-and-finds-claude-opus-exploiting-a-benchmark-loophole (GPT 5.5 the winner in long horizon tasks) a highly complex software engineering benchmark focused on original, long-horizon tasks across five distinct programming languages. Comprising 113 chaotic tasks across 91 live, production-grade repositories, DeepSWE forces agents to generate 5.5 times more code and modify an average of 7 separate files per task compared to standard evaluations. On this challenging leaderboard, GPT-5.5 leads with a score of 70%, establishing a significant 16-percentage-point lead over contemporary alternatives I think older benchmarks where models reach ~90% accuracy can be considered saturated. Few percentage points don't give us any good signal.  https://research.ibm.com/publications/developing-ai-agents-for-it-automation-tasks-with-itbench  ITBench-AA, an evaluation framework focusing on live Kubernetes incident response and Site Reliability Engineering (SRE) operations. Comprising 59 live, containerized SRE incident snapshots, the results are remarkably sobering: every frontier model scored under 50% on successful incident resolution, with Claude Opus 4.7 leading at 47% and GPT-5.5 following closely at 46%.   Edge AI announcements: https://www.cloudflare.com/press/press-releases/2026/cloudflare-acquires-voidzero-to-build-the-future-of-the-ai-native-web/  The consolidation of the AI-native developer stack has reached the runtime virtualization layer. Cloudflare recently completed the acquisition of VoidZero, the development group responsible for Vite, Vitest, Rolldown, and Oxc, backing the transaction with a $1 million open-source ecosystem fund. This acquisition is highly strategic; as autonomous agents write an increasing proportion of production software, local development environments, compilation pipelines, and bundlers must be optimized for execution speeds that match agent speeds. Cloudflare's goal is to construct a localized, full-stack edge playground. In this sandbox, AI agents can generate, test, bundle (utilizing the highly parallelized, Rust-based Oxc and Rolldown engines), and deploy entire web applications end-to-end within milliseconds. This architecture completely bypasses traditional local machine container bottlenecks, enabling high-velocity agent loops to execute in a fully sandboxed, web-scale edge runtime.

IT Talks
289 From containers to AI-driven infrastructure

IT Talks

Play Episode Listen Later Jun 5, 2026 26:33


Guest: Per Stene Language: Swedish Duration: 26.33 min Per Stene from Redpill Linpro gives an update on Kubernetes, OpenShift and container platforms four years after his last visit to the podcast. The conversation ranges from how Kubernetes has become standard in modern IT environments to why the platforms have simultaneously become more complex, expensive and more business-critical. Per also highlights trends such as edge computing, air-gapped environments, GitOps, Infrastructure as Code and how organizations are once again starting to look towards their own data centers for security and sovereignty reasons. A central question is how AI is changing the operation of Kubernetes, from monitoring and troubleshooting to the more automated platforms of the future. Kubernetes will not disappear, but developers will likely meet it through increasingly higher abstraction layers.

The DevSecOps Talks Podcast
#101 - Infrastructure as Code in 2026: Still Essential or Already Changing?

The DevSecOps Talks Podcast

Play Episode Listen Later Jun 4, 2026 38:26


Six years after the podcast first covered infrastructure as code, what still holds up and what does not? The hosts revisit IaC through a 2026 lens: platform teams shipping secure-by-default modules, stacks becoming standard, GitOps making more sense for Kubernetes, and AI raising new questions instead of removing old ones. It is a practical look at where infra tooling is heading and what teams should stop assuming.  We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners. DevSecOps Talks podcast LinkedIn page DevSecOps Talks podcast website DevSecOps Talks podcast YouTube channel

CarahCast: Podcasts on Technology in the Public Sector
Securing Kubernetes at the Edge with Spectro Cloud Palette

CarahCast: Podcasts on Technology in the Public Sector

Play Episode Listen Later Jun 4, 2026 69:03


Explore how the Spectro Cloud Palette platform secures Kubernetes & edge management for Government IT teams. Access the podcast series to accelerate ATO today.

explore cloud securing palette kubernetes ato software supply chain security spectro government it
Packet Pushers - Full Podcast Feed
TCG077: News Roundtable: Data Center Backlash and the AI Chip War

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Jun 3, 2026 45:09


William and Eyvonne discuss recent tech news, including the growing political and community opposition to AI data centers driven by fears over power and water usage. They also analyze the “AI Chip War” as hyperscalers such as AWS and Google invest in specialized silicon for training and inference.  Episode Links: Amid backlash, O'Leary Digital CEO... Read more »

Alexa's Input (AI)
How vLLM and llm-d Changed AI Inference with Rob Shaw

Alexa's Input (AI)

Play Episode Listen Later Jun 3, 2026 102:59


In this episode of Alexa's Input (AI), I sat down with Rob Shaw from Red Hat to talk about how AI inference evolved from a simple model serving problem into a large-scale distributed systems problem.We explored the infrastructure shifts behind modern LLM serving, including how vLLM and PagedAttention changed the economics and efficiency of inference, why KV cache management became one of the most important bottlenecks in production AI systems, and how orchestration layers like llm-d are emerging to coordinate distributed inference.We also discuss:how LLM inference differs from traditional model serving runtimesKV cache, prefix caching, and cache-aware routingwhy throughput and latency became major infrastructure challengeslong-context agents and repeated inference callsdistributed inference on Kubernetesintelligent routing, flow control, and load balancingprefill/decode disaggregationenterprise AI deployment realitiesvLLM has become one of the most important open-source projects in AI infrastructure, and llm-d represents a newer shift toward treating inference as a coordinated distributed system rather than just a single runtime problem.If you want to better understand the systems layer beneath modern AI applications, this episode is a deep dive into where inference infrastructure is heading next.General Podcast LinksWatch: ⁠⁠⁠⁠⁠⁠https://www.youtube.com/@alexa_griffith⁠⁠⁠⁠⁠⁠Read: ⁠⁠⁠⁠⁠⁠⁠⁠https://alexasinput.substack.com/⁠⁠⁠⁠⁠⁠⁠⁠Listen:⁠⁠ ⁠⁠https://creators.spotify.com/pod/profile/alexagriffith/⁠⁠⁠⁠More: ⁠⁠⁠⁠⁠⁠https://linktr.ee/alexagriffith⁠⁠⁠⁠⁠⁠Learn more about the host atWebsite: ⁠⁠⁠⁠⁠⁠https://alexagriffith.com/⁠⁠⁠⁠⁠⁠LinkedIn: ⁠⁠⁠⁠⁠⁠https://www.linkedin.com/in/alexa-griffith/⁠⁠⁠⁠⁠⁠Find out more about the guest at:LinkedIn: https://www.linkedin.com/in/robert-shaw-1a01399a/ Red Hat Articles: https://developers.redhat.com/author/robert-shawGithub: https://github.com/robertgshaw2-redhat ResourcesvLLM Website: https://vllm.ai/vLLM GitHub Repository: https://github.com/vllm-project/vllmllm-d Website: https://llm-d.ai/llm-d GitHub Repository - https://github.com/llm-d/llm-d KeywordsAI inference, VLLM, LMD, distributed inference, GPU optimization, open source AI, Kubernetes, multi-cluster deployment, AI infrastructure, enterprise AI AI infrastructure, Kubernetes, model optimization, speculative decoding, mixture of experts, AI deployment, performance tuning, AI systems, neural network scaling Key TopicsEvolution of vLLM and llm-dDistributed inference and routingGPU utilization and performance optimizationOpen source AI infrastructureEnterprise deployment challenges and solutions Standardization in Kubernetes for NIC exposurePerformance optimizations: quantization and speculative decodingMixture of experts architecture and parallelism strategiesFlow control and request scheduling in AI systemsEmerging hardware for AI inference, Cerebras processorReinforcement learning and AI system supportModular architecture of vLLM and ecosystem projects

airhacks.fm podcast with adam bien
JAZ, Copilot SDK, and Why LLMs Write Better Java

airhacks.fm podcast with adam bien

Play Episode Listen Later Jun 3, 2026 76:59


An airhacks.fm conversation with Bruno Borges (@brunoborges) about: discussion about the JAZ command launcher for Java, JVM tuning and default ergonomics for containers versus dedicated cloud environments, replacing the Java launcher with jaz in container images, supporting Java 8 to 25, maximizing resource utilization on kubernetes to reduce waste, running Java on Azure Functions, Azure App Service deploying a fat JAR without a container image, Azure Container Apps as a platform on AKS without YAML, Azure Kubernetes Service and AKS Automatic, Bicep as infrastructure as code, deploying a JAR to Kubernetes via OCI artifacts and a custom operator, Microsoft Foundry and the Microsoft Agent Framework, Semantic Kernel learnings, the Copilot SDK for Java communicating with headless CLIs, A2A and ACP protocols and MCP, agents as microservices with scoped tasks, guardrails, and sandboxing, per-agent model selection for cost and reasoning trade-offs, observability and traceability between agents with opentelemetry, grounding LLMs against MicroProfile, Jakarta EE, JAX-RS normative RFC 2119 specifications for hallucination-free Java code generation, the Boundary Control Entity pattern and business components as Java packages, package-info.java for semantic context, GitHub Copilot skills and custom instructions in Visual Studio Code, the AI Rails skills site, zero-dependency Java CLI scripting, reducing dependencies by reusing source code instead of JARs, the org.json reference implementation reduced to five classes, StackGres and OnGres running Quarkus and GraalVM to manage Postgres on Kubernetes, the Digg Into Java community Bruno Borges on twitter: @brunoborges

The Look Back with Host Keith Newman
Becoming the VMware of AI Infrastructure: Lukas on Building the Operating System for GPU Clouds

The Look Back with Host Keith Newman

Play Episode Listen Later Jun 2, 2026 10:05


AI infrastructure is breaking the old data center model.In this episode of Liftoff with Keith, I sit down with Lukas Gentele, CEO & Co-Founder of vCluster Labs, to unpack what it really takes to operate GPU infrastructure at scale in 2026.As AI workloads explode and neoclouds race to meet demand, Lukas and his team are building the operational backbone for modern AI clouds — from managed Kubernetes and tenant isolation to automated node provisioning and GPU lifecycle management.We discuss:Why traditional data center assumptions are collapsing under AI pressureWhat's fundamentally changed since the VMware eraHow an early partnership with CoreWeave shaped vCluster's trajectoryAnd the one mistake AI cloud operators are making right now that could hurt them over the next 18 monthsIf you care about AI infrastructure, GPU economics, hyperscaler strategy, or building category-defining platforms — this conversation is essential.Sponsor Info: We are strategic business advisors with decades of leadership experience and a proven track record of driving businesses' growth. We specialize in creating custom-tailored strategies to introduce your company, drive growth, build leadership teams, and ensure companies implement appropriate compensation programs. Our mission is to utilize our expansive network to benefit your company https://www.compass-strategic-advisors.com/ Connect with Lukas Gentele: Website: https://www.vcluster.com/ LinkedIn: https://www.linkedin.com/in/gentele/ Subscribe for more founder insights and hit the bell for notifications! Follow us on our channels for exclusive startup content and behind-the-scenes insights from interviews like this one. Spotify: https://open.spotify.com/show/3cFpLXfYvcUsxvsT9MwyAD?si=f5a14e779777487d Apple Podcasts: https://podcasts.apple.com/ca/podcast/liftoff-with-keith-newman/id1560219589 Substack: https://keithnewman.substack.com/ Newman Media Studios: https://newmanmediastudios.com/ LinkedIn: https://www.linkedin.com/company/liftoffwithkeith For sponsorship inquiries, please contact: sponsorships@wherewithstudio.comFrom the Host: A special shout-out to our Great Host of the Ignite Studios: https://www.ignitegtm.com/ and Producers of AI Infra5 @Plug and Play World, HQ in Sunnyvale, CALiftoff is sponsored by a strategic consulting firm and the M&A specialists at Compass Strategic Advisors - https://www.compass-strategic-advisors.com/ and The GTM Firm - https://www.thegtmfirm.com/

Azure Friday (HD) - Channel 9
Anyscale on Azure: Scale Python AI workloads with managed Ray on AKS

Azure Friday (HD) - Channel 9

Play Episode Listen Later Jun 2, 2026


Scott Hanselman talks with Omar Shorbaji from the Anyscale engineering team about how Anyscale on Azure scales Python AI workloads from a single notebook to thousands of CPUs and GPUs. Built on Ray, the most widely adopted AI compute engine, Anyscale gives you a unified runtime to build, train, and serve, running directly on Azure Kubernetes Service without the complexity of managing Kubernetes. See a live demo that fine-tunes a vision-language-action robotics policy, with the metrics you need to push GPU utilization higher. Chapters 00:00 - Introduction 00:52 - Ray and the Anyscale platform 03:11 - Start of demo: Workspaces 04:38 - Running a job and viewing utilization metrics 05:24 - Choosing the right scale 06:53 - Abstracting Kubernetes on AKS 08:53 - Wrap up and where to learn more Recommended resources Learn Docs Anyscale on Azure Connect Scott Hanselman | Twitter/X: @SHanselman Anyscale | Twitter/X: @anyscalecompute Azure Friday | Twitter/X: @AzureFriday Azure | Twitter/X: @Azure

Azure Friday (Audio) - Channel 9
Anyscale on Azure: Scale Python AI workloads with managed Ray on AKS

Azure Friday (Audio) - Channel 9

Play Episode Listen Later Jun 2, 2026


Scott Hanselman talks with Omar Shorbaji from the Anyscale engineering team about how Anyscale on Azure scales Python AI workloads from a single notebook to thousands of CPUs and GPUs. Built on Ray, the most widely adopted AI compute engine, Anyscale gives you a unified runtime to build, train, and serve, running directly on Azure Kubernetes Service without the complexity of managing Kubernetes. See a live demo that fine-tunes a vision-language-action robotics policy, with the metrics you need to push GPU utilization higher. Chapters 00:00 - Introduction 00:52 - Ray and the Anyscale platform 03:11 - Start of demo: Workspaces 04:38 - Running a job and viewing utilization metrics 05:24 - Choosing the right scale 06:53 - Abstracting Kubernetes on AKS 08:53 - Wrap up and where to learn more Recommended resources Learn Docs Anyscale on Azure Connect Scott Hanselman | Twitter/X: @SHanselman Anyscale | Twitter/X: @anyscalecompute Azure Friday | Twitter/X: @AzureFriday Azure | Twitter/X: @Azure

The DevOps Kitchen Talks's Podcast
DKT97 | DevOps в 2026: Platform Engineering, AI-агенты и будущее джунов

The DevOps Kitchen Talks's Podcast

Play Episode Listen Later Jun 1, 2026 93:26


Состояние DevOps в 2026: Platform Engineering, AI-агенты и что стало с junior-инженерами. Собрались на кухне с тимлидом системного юнита из большой компании - поговорить что и как сейча. О ЧЁМ ВЫПУСК • DevOps vs SRE в 2026: где проходит граница и почему «you build it, you run it» иногда создаёт больше проблем, чем решает. • War story: упавший Kubernetes во время корпоратива с пивом - классика первых K8s-внедрений. • Момент, когда DevOps ломается: 600 сервисов, 3600 пайплайнов и почему каждый новый инженер пишет 3601-й. • Platform Engineering: зачем нужна платформа, что такое метаплатформы и как устроены слои внутри крупной компании. • Junior + AI = middle: что изменилось с приходом AI-ассистентов и сколько теперь занимает обучение DevOps. • AI в работе DevOps прямо сейчас: мультиагентные помощники для расследования инцидентов, внутренние vs внешние модели. • Реальные AI-катастрофы 2025-2026: Replit дропнул базу и бэкапы, сервис аренды машин ушатал прод. • Multi-agent flow: refiner + архитектор + автономный бот, PR за час вместо недели. • Тимлидам: не носи инженерам PR, которые ты навайпкодил за вечер. • Что реально учить в 2026: Linux, сеть, Kubernetes, один язык программирования и AI-грамотность. • Знать базу vs спрашивать AI: почему без фундамента ты не поймёшь, куда тебя модель направляет. ГОСТЬ В гостях - Андрей Волхонский, руководитель юнита System в Центре разработки инфраструктуры Авито. 13+ лет опыта: от Windows-DevOps в TravelLine и Kaspersky до платформенной инженерии в большой продуктовой компании. ССЫЛКИ

Kubernetes Bytes
Building Grafana Labs and the Future of Observability with Anthony Woods

Kubernetes Bytes

Play Episode Listen Later May 29, 2026 58:05


In this episode of the Kubernetes Bytes podcast, Bhavin talks to Anthony Woods about all things Grafana Labs, Observability, Telemetry, and how AI impacts both of these ecosystems. The discussion starts off by talking about the early days of Grafana Labs, what is Adaptive Telemetry, and how AI plays a role both in building Observability capabilities in applications, and how it helps perform root cause analysis. Listen to learn more! Check out our website at https://kubernetesbytes.com/ Show Notes: GrafanaCON 2026: https://youtube.com/playlist?list=PLDGkOdUX1UjoSfz1IRj5c0xetw8tl8iin&si=JpT85m4t4bP8ZXgX Grafana Labs Blog: https://grafana.com/blog/ https://www.linkedin.com/in/anthonywoods1/

Practical AI
Rebooting Enterprise AI with MCP and Kubernetes

Practical AI

Play Episode Listen Later May 28, 2026 48:09 Transcription Available


What happens when AI agents start acting less like chatbots and more like coworkers? In this episode, Dan and Chris sit down with Craig McLuckie, CEO of Stacklok to explore MCP, Kubernetes, ToolHive, enterprise AI, and the emerging infrastructure powering AI-native applications. From identity management to agent orchestration and system architecture, this conversation dives into how organizations may soon manage entire fleets of AI agents working behind the scenes.Featuring:Craig McLuckie – LinkedInChris Benson – Website, LinkedIn, Bluesky, GitHub, XDaniel Whitenack – Website, GitHub, XLinks:StacklokToolhiveSponsors:Prediction Guard: A self-hosted AI control plane for running agents in high impact environments. predictionguard.com/practicalaiUpcoming Events: Register for upcoming webinars here!Midwest AI Summit 2026

Kubernetes Podcast from Google
Kubernetes 1.36, with Ryota Sawada

Kubernetes Podcast from Google

Play Episode Listen Later May 27, 2026 26:55


Ryota Sawada is software engineer at Numtide and the release lead of Kubernetes 1.36 code name Haru. He has over a decade of experience mainly in the finance industry including working on Cloud Native technologies, and outside of Cloud Native, he's been tinkering with Emacs and Nix.   Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod - bluesky: @kubernetespodcast.com   News of the week Etcd version 3.7.0 is out Merge Forward Community Truepositive Links from the interview Kubernetes 1.35 codename Haru Release theme and logo Logo designer User Namespaces Workload API Workload Aware Scheduling (WAS) DRA features graduating to Stable GKE 10 years and SIG Networking, With Antonio Ojea  

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Take the 2026 AI Engineering Survey and get >$2k in credits and AIE WF tickets!On the product side, everyone is getting Computer - Perplexity, Manus, Cursor, and so on. Meanwhile on the research side, agentic evals like TerminalBench and GDPVal are also assuming computer (Harbor). On both ends, the consolidating LLM OS stack has become a standard toolkit, and Daytona is one of a small set of AI Infra companies that are booming because of it.“The end of localhost” has been Ivan Burazin's obsession for more than a decade.Something that is all too familiar…Long before agents became the default way people talked about software development, Ivan was already chasing the idea that development should not depend on a fragile local machine. CodeAnywhere, one of the first browser-based IDEs, was an early attempt at that future: move the development environment into the cloud, make setup reproducible, and free developers from the endless “works on my machine” tax.The thesis was directionally right, but the market wasn't ready yet.However, agents changed that. They do not care about a laptop, desk setup, or favorite editor. They need a computer they can access through an API: something stateful enough to keep working, fast enough to spin up instantly, flexible enough to resize, isolated enough to be safe, and composable enough to run the messy real-world workflows that real software engineering actually requires.Daytona isn't just selling “sandboxes” in the narrow code-execution sense. It is the latest version of Ivan's original localhost thesis.In this episode, Daytona's CEO joins swyx to explain why AI agents need more than code execution boxes: they need composable computers, stateful sandboxes, instant startup, dynamic resources, and infrastructure that can survive workloads going from zero to 100,000 CPUs.We go deep on the new agent compute market: Daytona's hard pivot from human dev environments to AI sandboxes, the New Year's Eve MVP that customers begged for, why Daytona runs on bare metal with its own scheduler, how one customer runs almost 850,000 sandboxes a day, and why RL/eval workloads went from 0% to roughly 50% of usage in just months. Ivan also explains why agents need Windows and macOS machines, why CLI may matter more than MCP, why Kubernetes is painful for this workload, and why the future AI cloud may look more like Stripe than AWS.We discuss:* How Daytona grew out of CodeAnywhere, Shift, and the “end of localhost” thesis* Why Daytona pivoted from human dev environments to AI sandboxes* Why agents need composable computers instead of disposable code execution boxes* The New Year's Eve MVP that customers chased API keys for* Why Daytona chose bare metal, stateful snapshots, and its own scheduler* How Daytona spins up one sandbox in ~60ms and 50,000 sandboxes in ~75 seconds* Why Daytona's biggest customer runs ~850,000 sandboxes a day* How RL/eval workloads create zero-to-100,000 CPU spikes* Why RL workloads went from 0% to roughly 50% of Daytona usage* Why customers compare Daytona against EKS/GKS and say they're “never going back”* Why every AI agent may need a computer, including Windows and macOS environments* The Apple licensing constraints that make macOS sandboxes hard* Why CLI gives agents more power than MCP* How open source helps agents integrate Daytona* Why agent-generated PRs may break today's CI/CD assumptions* Why AI SaaS companies reselling tokens may face a cold shower* Why the AI cloud may look more like Stripe than AWSIvan Burazin* LinkedIn: https://www.linkedin.com/in/ivanburazin* X: https://x.com/ivanburazinDaytona* Website: https://www.daytona.io* X: https://x.com/daytonaioTimestamps* 00:00:00 Hook* 00:01:12 Introduction* 00:03:15 CodeAnywhere, Shift, and the end of localhost* 00:05:58 What Daytona is: composable computers for AI agents* 00:08:07 The pivot from dev environments to AI sandboxes* 00:10:17 The New Year's Eve MVP and customers begging for API keys* 00:12:56 Bare metal, stateful sandboxes, and Daytona's scheduler* 00:17:28 60ms startup, 50,000 sandboxes, and 850K daily runs* 00:21:53 Spiky RL/eval workloads and the new agent infra problem* 00:28:12 RL workloads, Kubernetes pain, and dynamic resizing* 00:33:31 Why every AI agent needs a computer* 00:38:48 macOS sandboxes and Apple's licensing problem* 00:44:28 Why CLI may matter more than MCP* 00:48:11 Open source, GitHub stars, and agent integration* 00:53:11 Git, CI/CD, and agent collaboration bottlenecks* 00:58:15 Founder life and building a 25-person infra company* 01:02:44 AI SaaS, token resale, and API-first business models* 01:06:10 GPU sandboxes, data centers, and compute growth* 01:09:48 Why the AI cloud may look more like Stripe than AWS* 01:11:26 Closing thoughtsTranscriptIntroduction: Daytona, CodeAnywhere, and the End of LocalhostSwyx [00:00:02]: Okay, we're in the studio with Ivan Burazin, CEO of Daytona. Welcome.Ivan [00:00:07]: Thanks for having me, man.Swyx [00:00:08]: Ivan, you and I go back.Ivan [00:00:10]: Way back.Swyx [00:00:11]: How I don't even know how, you found, did you reach out or, for Shift.Ivan [00:00:17]: I reached out to you. The reason was you - we were just - we were thinking about I was one of the co-founders of CodeAnywhere, the first browser-based IDE, and so we were thinking a long time of, localhost should die. And you had this article.Swyx [00:00:29]: End of localhost.Ivan [00:00:30]: Then I reached out to you because of that, and then we talked, and I was actually at a different job and learning about I was the head of, developer experience, and you were quite well-versed in that, and I actually reached out to you, among other people, how do we go about that? What are the key things and whatnot at this point in time? And you were nice enough to take the call, and I remember I was late on your call with you.Swyx [00:00:51]: I don't remember.Ivan [00:00:52]: I remember because I was with my then I'm thinking of a girlfriend or wife at that point in time, I'm not sure. It's the same person, so that's great, and I was late ‘cause we were, in, Italy on, vacation, and then I was late for something. I felt so bad, and you were so nice to be, good about.Swyx [00:01:10]: The reason I'm nice is because I'm also late to other people, so it's like, who's, who's without sin here, yeah, so I have to, for those who don't know, InfoBip Shift, there's this whole thing that, you did in the past, and, and that was basically one of the inspirations for me starting AI Engineer, which is like, I have to thank you for giving me that push to be like, “Oh, you can, you can build and sell conferences?”Ivan [00:01:34]: I remember you asked you asked me at the beginning to give me advisory shares, and I was so focused on what we were doing, I said no, and I should've took the advisory shares. So I'm sorry, dude. But anyway.Swyx [00:01:43]: We're not, we're not venture backed.Ivan [00:01:44]: No, it doesn't matter.Swyx [00:01:45]: It's Yeah, anyway, so I think what's impressive about you is that CodeAnywhere is the thing that you've been trying to build, and, you kind of put it on hold and then came back after InfoBip. Just give us the story, do you - the story and the origin story, going into Daytona.From CodeAnywhere and Shift to DaytonaIvan [00:02:05]: Sure. Like, really way back, me and my co-founder have been together. I say this, I've said this multiple times, it's like we were married and divorced and married. Some people actually ask me is my co-founder my partner. they thought it literally. It's not literally, but we have done multiple companies together, and to your point, we had this shift where we went from the CodeAnywhere to the conference called Shift, and then back to, Daytona. We originally started stacking servers, doing like virtualization in the early 2000s and, routers and doing basically all these things, at a foundational level, and that was a services company which we sold to focus on what my co-founder actually invented, which was the very first browser-based IDE, right, I say the first. Before us was actually Heroku. They did it for a very short time until they became Heroku. But outside of them, we were the only one, and it was called.Swyx [00:02:55]: There was Cloud9.Ivan [00:02:57]: Cloud9 came out slightly after us. There was Replit, which came out when we stopped doing it, Replit came out, and they have been successful since then, which is great. There was Nitrous.io. There was quite a few that existed at the time, but it was like too early. But the interesting part is that we, at that point in time, because there was no VS Code, there was no Kubernetes, and Docker had just started when we Or I'm not sure if it was even public at that point in time. And so we had to build everything to the whole stack ourselves and that was the key learning that we brought into and that we've been using in Daytona today. So it was super early. There's about 3 million people used CodeAnywhere. It was slightly, it was angel-backed more than venture-backed. We ended up paying everyone back because it didn't have that sort of scale. But, three years ago, we started something similar with Daytona, which is not what we are today, but it was automating dev environments for human engineers, the basically the underlying stack of CodeAnywhere. And then we did a hard pivot last January to sandboxes. And so here we are.Swyx [00:04:01]: Historic pivot, yeah, and, it's one of those things where, I had independently invested in CodeAnywhere, but also in E2B, and then both of you pivoted into the same thing, and I'm like, “F**k.”Ivan [00:04:12]: You invested, you invested in Daytona. You invested in Daytona. But you were the first If we had not got your check, we wouldn't have done it.Swyx [00:04:18]: No way.Ivan [00:04:19]: No, it was like, “We have to get him on board first,” and you were that kicker that we, that got us off the ground.Swyx [00:04:23]: No, because you were putting me on your pitch deck, man. I was like, “Man, this is like a good trip if I don't invest.”Ivan [00:04:29]: That's because it was your quote. It's like we.Swyx [00:04:30]: Yeah. It's the end of localhost.Ivan [00:04:31]: Did a bunch of research about end of localhost and who was interested in that,.Swyx [00:04:34]: No, that's like, I put, I wrote that blog post, and every single company in that field reached out to me, and then every VC who was receiving those pitches then also had to call me and, talk it, talk through it with me.Ivan [00:04:47]: It's finally happening though.Swyx [00:04:48]: It was really super interesting.Ivan [00:04:48]: It's finally happening.Swyx [00:04:49]: It's finally happening.Ivan [00:04:49]: Yeah, it's finally.Swyx [00:04:49]: It's finally happening, with maybe sort of non-human users. Yeah, so what is Daytona today? Let's get like a quick description. I'm wearing the shirt.What Daytona Is Today: Composable Computers for AI AgentsIvan [00:04:58]: You're wearing the shirt. Yes,.Swyx [00:04:59]: It says, I think your branding is very good. Like, it's very consistent. It runs AI code. Like, it cannot be simpler.Ivan [00:05:05]: Exactly, but we're gonna probably have to change that.Swyx [00:05:07]: Oh, s**t.Ivan [00:05:07]: It's also a subset of what we do. Unfortunately, we really love this, Run AI Code is super simple. People interpret it different ways. I think we've given out 5,000, 6,000 of these shirts. People wear them with pride because it doesn't really market about us.Swyx [00:05:21]: Yeah, Daytona's on the back.Ivan [00:05:22]: It markets the back. It markets to the person itself, so I think we did a really good job on that one. But it is also a subset of what we do, because people, when they think about Run AI Code, they just think about these small, let's call it isolates, code execution boxes that, you send some code, you get an output. Whereas what Daytona is today is essentially composable computers for AI agents. It is, the market calls them sandboxes which can be misleading.Swyx [00:05:44]: All these things. All these things on.Ivan [00:05:45]: Yeah, exactly, ‘cause it can be misleading ‘cause people usually think about sandboxes as a demo or a test environment versus a production-grade environment. But what Daytona does, if you think of the laptop that you have in front of you or the computer that's over there, or, my wife is an architect, so she has like a Windows with a 3D graphics card inside to do 3D rendering. Like, as humans, we have different computers or different compositions of computers. And our belief is strongly that agents today and going forward will need all these different compositions of computers to do different types of tasks. And so we offer that basically through an API.Swyx [00:06:19]: Yeah, to give people - I'm trying to sort of front-load all the aha moments or the wow moments so that people can, stay engaged and click like and subscribe. the market is exploding, right? Like, you have been reporting 74% month-on-month growth, and it also, it's just been growing for a while. Like, it's been going like this. And every single - It's not just you guys. It's every single.Ivan [00:06:41]: Everyone, yeah.Swyx [00:06:42]: Sort of, compute provider. I don't know if you agree with me saying compute provider or not.Ivan [00:06:48]: It's fine.Swyx [00:06:48]: Yeah. So like organically PLG-driven growth, but also enterprise is doing super well, I think I wanna rewind to January of last year when you did the pivot. Like, so you obviously called this market early, and you were positioned for it, and you are now one of the market leaders. But what was the insight that made you do the pivot?The Pivot: From Human Dev Environments to Agent SandboxesIvan [00:07:06]: The insight that made us do this pivot is the quarter before that, so end of 2024, when we had - Basically, we did a demo with - I don't I think we discussed this as well, Devin was not public. You actually gave me access to Devin at that time. So Devin.Swyx [00:07:25]: I did?Ivan [00:07:26]: Yeah, you gave me access.Swyx [00:07:26]: I don't think I was supposed.Ivan [00:07:27]: Yeah, exactly.Swyx [00:07:28]: Yeah, I.Ivan [00:07:28]: So it doesn't matter. You.Swyx [00:07:29]: Yeah. I gave like three friends access.Ivan [00:07:31]: Yeah, or it was a call and you showed it to me. It doesn't matter. but OpenDevin was available, which is now called OpenHands. And so we're like, “Oh, this seems to be a thing. This is not public. Let's take our for human automation of dev environments and take, OpenDevin and launch that as a SaaS.” And we did that. Not very many people signed up and used it, but a lot of people reached out that were building agents, and they were like, “Hey, my agent needs a compute sandbox runtime,” whatever you wanna call it. I forgot what it was called at that point. And then we were like, “Oh, amazing. This is a new market. Here is our infrastructure. Here's our product, and go.” And what we found really fast, soon, was that people did not like what we had built. It didn't work. And I remember talking to people at the beginning when we're doing this, the sandbox we're building for agents. People were like, “Oh, why is it different? It's the same thing. We have like EC2, we have VMs, we have all these things.” But we saw that everyone we gave it to, it was like 20, 30 people, they all said, “No.” Like, “This is not what we need. This sort of breaks.” And basically, me and my co-founder not knowing a lot about - ‘cause we're infra people. We're not AI people. So I basically took it upon myself to like watch every single podcast that exists, including all of, all of these and all that, and sort of get up to date, read all the blogs, like get, understand what's going on.Swyx [00:08:45]: Do you wanna shout out who else was useful, just in case people are also looking.Ivan [00:08:49]: Generally we -, I looked at There's a few of podcast, different segments and different types. So there's you guys, No Priors, Bill Gurley's was great while.Swyx [00:09:04]: VG2, yeah.Ivan [00:09:05]: Yeah, while it was around. So there's a few. 20VC is interesting from a different dynamic, and some are different dynamic. But there was, also Red Points.Swyx [00:09:14]: We're not really about the compute market.Ivan [00:09:15]: It was also already - Sorry?Swyx [00:09:16]: You're, you want - You're looking at the agent infra market.Ivan [00:09:19]: I was looking at the agent market and the AI market in general and sort of understanding who are the players, what the perception, and how that goes. And like obviously you complement this with like going to conferences, going to events, going to meetups, reading white papers, like doing all the things that you have to do to understand what's happening. And so when we figured, when we sort of had an idea of what we had to build, literally over the New Year's Eve, literally on New Year's Eve, I half vibe coded the first MVP, first minimal viable product of what Daytona is today. And I went to sleep at like 3:00 AM or something like that. I was doing - I just put my like baby daughter and wife to sleep and, Happy New Year's, and go back to just, doing this. And I sent it to my co-founder, my CTO, and he saw it in the morning. He's like, “This is absolute garbage.” “Do not show this to anybody at all, but the idea is good.” And so he took two weeks, and he rebuilt it.Swyx [00:10:09]: Did it like look like that? Listen, I - It was rough idea.Ivan [00:10:12]: Oh, not even, not even close. Like it was it was way worse. But it was like a very - It was a simplistic view of what it should be. Like, it worked, but it was not ideal. And so he went, we went down the whole, which is his job as CTO, to go, and he came back with this version. We then called all the people that had said like, “This is garbage,” a quarter ago. And we set up these calls, and we gave it to - We just demoed it to everyone. And all the calls went long, every single one. They were 15-minute calls, and they all went to like 25, 30 minutes or whatnot. And everyone said, “We need, we want access.” There was no login, just an API key, ‘cause it was just a beta or an alpha. And they said, “Oh, we want access.” And we're like, “Sure, yeah. Okay, thank you very much.” But after like the next day, if we'd not send it, every single one, like every call that we did, everyone came back, “Where is my API key?” Like everyone wanted it. We're like, “S**t.” Like this is it. Like I've never felt So one, the understanding to your point was like most people thought it was the same infrastructure for humans and agents. We understood a quarter ago it's not. We just didn't know what was the right primitive. And then when we came, and we can talk about what that is, and we gave it to these people, I've never seen, I've never experienced - I've done multiple companies in my life. I've never experienced this, that people literally call you if you do not give them access. Like they want access right now. And so it's like, okay, they don't want this. the thing that they want doesn't seem to exist, or they have not found it, and they really want what we want. And then when we understood that we're onto something, and then when you think about the size of the market, like the market for human engineers and enterprise is a very large market, so think GitLab or whatnot. But the market for every single agent that will exist ever in the future is just like, what is that market? How big is that? And we're like, “We are all in on this.” And so that is where we made sort of the cut between the old product and the new one.Bare Metal, Stateful Sandboxes, and the Lambda + EC2 ModelSwyx [00:12:02]: Yeah. But it wasn't composable at the time?Ivan [00:12:05]: It was very - It was basically just a Linux box that you could change, that you could define number of CPUs, disk, and RAM. Like that is what you could do, but you couldn't have multiple operating systems, you couldn't resize it on the fly, you couldn't add a GPU, you couldn't do like all the things. It was just the, just the first sort of variation of that, yeah.Swyx [00:12:22]: Was it bare metal from the start?Ivan [00:12:24]: It was bare metal from the start. And so the interesting thing that we thought about right away, so our.Swyx [00:12:29]: Which, give people the background, what is the normal path?Ivan [00:12:32]: Yeah, so, basically most providers run this on top of VMs. And also.Swyx [00:12:37]: Firecracker.Ivan [00:12:38]: Yeah, they run on Firecracker and VM. And so we also fire - We can get - We have multiple isolation layers and we can do that. But the common way to do it is that they, one, that the state of the machine, or the hard disk is not part of the sandbox itself. And the other thing is they're not meant to last forever. So most of them are preemptible, like they can There's a time that they can live. And so our thought was when we were going into this is, agents will be like humans in the sense of you don't want your laptop to be shut down until you're done with work. Like, and you want to close the lid and open the lid, it's the same state. So you - Agents would want that, like the pause and come back. They want those two things. But also agents really want speed, right? Can they get it? So when we thought about it's like we need something insanely fast, how to make it fast, how to make it long-running, and stateful. And so those two things, it's like combining a Lambda and an EC2, right? Those two things together. And so we didn't have an idea how others did it, ‘cause we didn't know too that there was a market around this. It was more like, okay, this is what we need, what they need. And we looked at Kubernetes, it wasn't wasn't good enough for that. We looked at Nomad, it didn't enable that. And so our history in rewriting our own scheduler at CodeAnywhere is basically what my CTO came up with. Like, he's like, “Oh, the learnings from there,” and he brought it. And the funny thing is, our third co-founder, when he saw it, he's like, “Dude, what is this? This is like 2008.” Like, we went back in time, and he's like, “Exactly.” And so the reason why Daytona is like super fast, and you see this on benchmarks, is we essentially, we run on bare metal. We have our own scheduler, we use the underlying, disk, CPU, and RAM of the underlying machine, which means your IOPS are insanely fast because there's no, there's no network between an EBS or something like that. But also the snapshot, the point in time, the templates, are also preloaded on the bare metal machines. So when you fire off a sandbox from a template or a snapshot, you're essentially directed to the bare metal machine where that snapshot is based on that NVMe drive, and then it literally just turns on that machine, and it's local. There's no network latency, anything on there. And so that is sort of the specificities that we, when we're thinking from first principles, what a computer would look like for an agent, that is what we came up with, and that's what we created.Benchmarks, 60ms Startup, and 50,000 SandboxesSwyx [00:15:02]: Yeah. I should maybe, I don't know if you endorse this, but there's someone that does compute SDK, you guys do very well on there, with like the TTI, right? I. is this a, is this a is this a relevant benchmark for you guys? I don't know.Ivan [00:15:16]: I don't know, and it changes every day. So today RKL is.Swyx [00:15:18]: I don't know what RKL is. Never heard of it.Ivan [00:15:20]: Yeah. RK, yeah, so it is there.Swyx [00:15:22]: You are, at least a third of the next tier of performance, and then, there's a lot of other better-known names that are very slow to start.Ivan [00:15:31]: Yeah. We've been the number one by far for a long time, and now there's different, there's different definitions also of sandboxes, different isolation patterns, different other things. So RKL runs it literally on the S3, the data, so it's very different, and they spin up a sandbox, spin up a container for that, so it's a different type of thing. So the definition of a sandbox is something that we can all, we all need to get along with. But yeah, we're insanely fast on getting these things, up and running. And so you can see even there that it's a zero point 0.10 to 0.11, so.Swyx [00:16:03]: Close enough. Yeah. what else do you need, right?Ivan [00:16:05]: Yeah. So the benchmarks itself, so, in this, in I don't think the benchmarks equate to market ownership or revenue or anything like that. and I've seen this with multiple benchmarks, not just in sandboxes, but in general benchmarks around.Swyx [00:16:20]: It's table stakes. It's just like.Ivan [00:16:21]: Exactly. But it doesn't hurt.Swyx [00:16:22]: Just roughly check.Ivan [00:16:22]: Like you definitely have to be up there and you have to be competing so that people know that, oh, this is definitely one of the top. Because this is only one dimension of what customers look for. There's other things like how many can you spin up consecutively? There's a feature set, there's support, there's like all different things that people look at, but you definitely have to be there, on the benchmarks.Swyx [00:16:40]: How many people do people spin up consecutively?Ivan [00:16:43]: So we have.Swyx [00:16:43]: Or concurrently, is the Concurrency, right?Ivan [00:16:45]: There's three metrics that we look at. And so one is like time to spin up one, and so our time to spin up one is 60 milliseconds with network latency. So request, spin up, reply, 60, the whole thing, 60 milliseconds. That is one. But if you wanna spin up 50,000 at once, we are now at about 75 seconds. So it takes about 75 seconds to spin up concurrently 50,000. Some others, there's public data around this, like take 2,000 seconds, which is 30 minutes. Like there's different variations of that. And then there is the so it is speed of one, speed of like multiple, and then how many can you consistently have up and running. And so we basically have right now no limit to how much we can add because we basically own our own metal. But the biggest customer of ours does like about 850,000 every single day is sort of where they're, where they're just shy of a million every single day that they're running, we do have a request for half a million concurrent, which is literally half a million CPUs somewhere running. So that's an interesting.Swyx [00:17:44]: They pay by like vCPU seconds.Ivan [00:17:47]: By seconds, yeah.Swyx [00:17:47]: Or whatever. Yeah. Okay, and so and then, and the other thing is, the sleeping and the resuming, ‘cause it's all the stateful resumption of all these things, how, what kind of workload are people putting through this, right? Like how is it Do we measure by gigabytes in memory, gigabytes in storage? I don't In like network attached storage. I, what are the costly ones of, out of all these features?Workload Economics: CPU, RAM, Network, and StorageIvan [00:18:15]: The most expensive thing are CPU.Swyx [00:18:18]: Okay. Yeah, of course.Ivan [00:18:18]: The second one, yeah Then it's RAM, then it's disk. We actually don't charge.Swyx [00:18:22]: Which is snapshotting, right?Ivan [00:18:23]: No, it's actually the, snapshotting's part of it, but basically the size of your hard disk, of your machine. So do you have 10 gigabytes, do you have 20, do you have 50, do you have whatever? And then the transference of that. Right now, currently we don't charge for, network at all at Polychron.Swyx [00:18:37]: Oh, you gotta, yeah, you gotta fix.Ivan [00:18:38]: Yeah. It is very much a it's a larger and larger part of our bill, so we're working around, that part there. Obviously, that is the least, expensive, so the hard disk is the least expensive, so it's basically CPU, RAM, for us network, ‘cause we don't charge the customer, and then hard disk, is how it's split up. But there's also different types of workloads, so we basically split it up into two types of workloads in Daytona. One is what we call background agents or long-running agents. and the other is, basically RLs and evals, which I put sort of together. And so they have very different patterns of usage, and if you look at the usage of a background And I'll just name names of companies, not specifically.Background Agents vs. RL/Evals: Two Usage ShapesSwyx [00:19:21]: Yeah, open, all hands.Ivan [00:19:23]: Yeah. So like a background agent's a Cognition, a Lovable, a like all these things are Harvey. These are all long-running, background agents. And so if you look at their usage patterns, their usage patterns are similar to human, which is like follow the sun. Basically, the usage patterns of that is like noon is probably the highest, and the midnight is the lowest, and then weekends are lower. weekday is higher.Swyx [00:19:42]: Yeah, that's a fun question. How global is it? Is it very US-centric or?Ivan [00:19:46]: The US is a large part, but we have currently, we have Asia, Europe, and the US regions.Swyx [00:19:52]: So it's quite global.Ivan [00:19:53]: Yeah, it's quite global. We have it all over. It's interesting that our I talked to you a bit about this. Our number one city by user.Swyx [00:20:01]: Hmm.Ivan [00:20:02]: Is Singapore.Swyx [00:20:04]: Oh, wow. Amazing.Ivan [00:20:05]: Which is an interesting one, right? Not by revenue, just by just like by individual head count.Swyx [00:20:09]: Really?Ivan [00:20:09]: Just like an interesting thing.Swyx [00:20:10]: Singapore is, Singapore is weirdly high in the adoption charts of AI for the population. It's like an, seven, eight million population. And it's like keeps showing up.Ivan [00:20:20]: No, it's quite interesting. We were quite shocked, and I was like, “Oh, this is interesting.” And also one that's up there.Swyx [00:20:24]: There's a reason I'm doing AI using Singapore. it's because I'm from there.Ivan [00:20:27]: We're there. We're gonna, we're gonna be there as well. and it's interesting that Japan is in the top or like Tokyo's in the top, which is in all the tech cycles it has never been. It has never been, so it's quite interesting that they're.Swyx [00:20:39]: I think the Japanese just love AI. Yeah. It's that, and then it's Brazil. That's it.Ivan [00:20:44]: Brazil has always been in.Swyx [00:20:45]: I think.Ivan [00:20:46]: Even when I look, if you look at like GitHub's data and ask historically with CodeAnywhere, it was always like US, Western Europe, and then you'd have like India, Brazil, China, like that would be there. But like Singapore was not in, specifically Japan was never in sort of that top, that top.Swyx [00:21:01]: Yeah. Weird pockets.Ivan [00:21:01]: Weird. Yeah, so it's very global.Swyx [00:21:02]: Okay, so actually that, but that's helps you to distribute your load through, all time?Ivan [00:21:08]: The interesting thing is like we have those kind of loads, but if you look at the researcher loads, they're quite different. So what they are is like if you give them concurrency of 10,000 or 50,000 or 100,000 CPUs at ARMb, when they fire off a run, it's just 100%. And then it just runs, and then it stops. So it's very, the usage pattern is squares basically, right? And it's also not follow the sun, because people will fire it off at midnight before they go to sleep but then wake up and so it's very unpredictable, so you don't know where that is. So the shapes of the usage are quite different than we have had before. And also what's interesting is when it's sort of a follow the sun, even if you have a high growth company, you can sort of predict your usage patterns and have enough capacity for that, because it's sort of, it grows in a, in a way you can project. When you have companies doing sort of like evals and RL, they're super spiky. So they're gonna come in, it's like, “We're gonna use nothing, then can we have 100,000?” Right? And then go back down. And then 100,000, go back down. So it's very different, right? And.Swyx [00:22:09]: Do you want to lock them into commits so.Ivan [00:22:11]: Yeah, we do.Swyx [00:22:12]: Yeah, okay.Ivan [00:22:12]: We so we have to lock them into some sort of commits to have that capacity, because we have to have, basically we have to have the capacity for peak. Right? And so right now, Daytona's mean utilization is 15%, 1-5.Swyx [00:22:25]: Oh my God.Ivan [00:22:26]: So it's very low.Swyx [00:22:27]: Because it's very spiky.Ivan [00:22:27]: It's very spiky, but we get up to 90%. so we have these things. And so what we're, what we're looking at right now as a company is similar to Cloudflare where you can like geo move things around, but that works really well for basically the background agent where it's follow the sun. But this, it's not. Like it's a very different shape. Obviously with scale you figure these things out, but that's an interesting new problem that we have, as a compute provider in the agent space. And when we were doing the conference recently, and so we talked to like Nikita from Neon and.Swyx [00:22:57]: I should bring it up.Ivan [00:22:58]: Parag from Parallel and whatnot, everyone has the same problem. Whereas the usage is super spiky, and this is something that has not happened before, that you have these types of like it was always, it the amplitudes were not this high, right? So it's quite interesting use case and problem solve.Compute Conference and Spiky Agent InfrastructureSwyx [00:23:12]: Yeah, I don't know if we're gonna bring this up again, but let's just talk about the conference, you had like 1,000 something people at the Warriors game, at the Sorry, where is it? What's.Ivan [00:23:22]: Chase Center.Swyx [00:23:23]: Chase Center.Ivan [00:23:23]: Chase Center.Swyx [00:23:24]: I went. It was, it was very impressive. Obviously, you can, how to throw a conference, what did you learn? you put, you pulled together all these impressive names.Ivan [00:23:33]: What I.Swyx [00:23:34]: What were you looking for?Ivan [00:23:35]: My thesis behind the Compute Conference was let's bring together people that are building infrastructure for AI agents. Because when I think of what we're building, it is the agent is the primary user, what are the ergonomics and usage patterns of agents, and so we can do that. And what I found, this was a theory, it wasn't proven, is that we all have these problems, as I touched onto. And I was, as I was talking on stage, it was like we all have the same underlying infra problems, which is this spiky workloads, unpredictable workloads that we've never had before, in human, compute or human infrastructure. And it's, again, it's the same when I was talking to Parag or when I was talking.Swyx [00:24:20]: Lynn. Nikita.Ivan [00:24:21]: Lynn, Nikita. Lynn especially, I was talking to her the other day as well. Like the It is a very interesting type of problem to solve because I can touch on Cloudflare because there's a lot of like talk about that recently as to how they solve that, which is they have a bunch of geos, and basically, as users work in different places, and depending on your tier, they can move you around the geos. And so that how, that's how they get the higher utilization. But you can sort of predict these, and it's If it's something in You'll rarely get a spike that is 10 orders of magnitude. Like you'll get a like let's say one of your customers has some like an exponential curve. What is that to I'm using Cloudflare as an example. 10%, 20%, whatever it is. I don't, I don't have this data, I'm just assessing. It's surely not 10x, right? It's surely not something there. And so how do you go out and solve this problem? And we're all solving this in different ways. So we have.Swyx [00:25:11]: She also has the same thing.Ivan [00:25:12]: Yeah, I know specifically that like Neon had that issue as well. Like how are we solving these spiky loads and things like that ‘cause we talked about it. And so the interesting thing for me to actually internalize was, yes, everyone that's building for agents first is going through this, and we're all solving similar problems, which is quite.Swyx [00:25:28]: Let me let me double-click on this. Okay. So for example, Neon, I happen to know that they're very sort of S3 oriented, right? so they're just like fully bet on S3. And you get to benefit from S3's distribution and infrastructure. So I would imagine that Neon doesn't have to care, whereas Lynn maybe has to care a bit more because obviously she's doing GPU inference. And, for listeners, we did an episode with her, one and a half years ago. And you have to care. But like, right?Ivan [00:25:54]: Parag cares for sure, and Nikita.Swyx [00:25:58]: And Parag is C of, Parallel.Ivan [00:25:59]: Parallel, yeah.Swyx [00:26:00]: Former CTO of Twitter.Ivan [00:26:01]: Twitter, yeah.Swyx [00:26:02]: They are the search.Ivan [00:26:03]: Yeah, they're search, yeah.Swyx [00:26:03]: I You and I know but the listeners don't know.Ivan [00:26:08]: Yeah, we can put it down in the screen, and so ‘cause we, when we were talking.Swyx [00:26:11]: I'll put it up on the, on the screen.Ivan [00:26:12]: Yeah, right.Swyx [00:26:12]: People can look it up if they need.Ivan [00:26:14]: Look it up. And, yes, but they still have CPU and RAM, allocation that you have to have up and running. And so CPU and RAM, you have to allocate that and have that ready. And so there's basically two ways to do it. One is you either over-provision and you can handle the bursts, or two, you basically have, I don't know if this is a term, just-in-time compute, which is like as your load becomes, as your usage comes in, you can fire off requests for VMs or bare metals at other cloud providers and then get them up and running.Swyx [00:26:43]: This is if you go above 100%, right?Ivan [00:26:45]: Yeah, this is.Swyx [00:26:46]: Like your overflow.Ivan [00:26:46]: If your overflow, like spillage or whatever you do.Swyx [00:26:48]: You probably lose money on it, but it doesn't matter, right?Ivan [00:26:50]: It, not Well, you might, you might not That is a more cost-effective way to do it but it's a slower way to do it. Because basically what you have to do is you have to like queue your requests, spin up these just-in-time compute, get it all ready, provision it, and then get your workload there. And so if the time isn't important that much, that's fine, and you can do that. But if your customer, and especially for, let's say, the RL training runs, the reason why a lot of people come to us is because GPUs are more expensive than CPUs, right? So you want your GPU running at, what, 100% the entire time. And so when you're running runs on CPUs, when the when the CPU cycle is like down and spinning up the next one, you want that to be instantaneous so that your GPU doesn't go down, right? And if you then have to like go out and provision machines, you're essentially telling the GPU that it has to wait, and that's incurring our cost. So there's things that you have to try to solve for there.RL Workloads, Declarative Images, and Kubernetes ReplacementSwyx [00:27:43]: Yeah, let's talk about the different workload, right? You said that, what was it? A few months ago, you had zero RL workload and now it's 50%.Ivan [00:27:52]: It will be this one, 50%, yeah.Swyx [00:27:54]: Let's talk about how different it is, right? Like I imagine, for example, a lot less dynamic code generation of like arbitrary code. Like here, it's probably all the same code. You're just doing parallel runs or something, I don't know.Ivan [00:28:05]: Yeah. So you'll have multiple Depends on the like for each run, you'll have a snapshot. And they, for the most part, they actually do use our declarative image builder, which is like, “Oh, we, the agent wants these dependencies, these env vars.”Swyx [00:28:17]: These ones, yeah.Ivan [00:28:18]: Yeah, the declarative image builder, it.Swyx [00:28:20]: Which is a very modal like thing that they.Ivan [00:28:22]: Yeah. And so we build it on the fly and then we propagate that snapshot, and you can spin up as many sandboxes as you want against that snapshot. And then if you have to do changes, the model can, or like it could be also be automated. It's like, “Oh, now for the next run, we need to install these things or remove these things or whatever to get, a task done,” and then it goes off and runs that. So yes, that is something that it seems that they prefer. The number one reason I found, or should I say, let's take a step back. What we are competing against in that environment is essentially managed Kubernetes. So EKS, GKE, whatever. That is what the vast majority run on. And anyone that has tried Daytona versus GKE, EKS is like, “I'm never going back.” That has always been. There's a few reasons. One is the ergonomics. So if you have, if you're using Kubernetes to spin that up, you have to essentially manage the interface interactions with that. Daytona, although as a compute provider, it's more akin to a Twilio and Stripe from a consumption perspective than it is an AWS. Like you have an API, an SDK, it's quite like easy and seamless to get these things up and running, that's one. The other is the speed to which we spin up, which we mentioned earlier, which is much faster, and the scale to which we can go to. We haven't got into features, but an interesting feature is that it's very hard to OOM, or out of memory, our sandboxes, because we can dynamically on the fly.Swyx [00:29:48]: Resize.Ivan [00:29:49]: Resize, which is like impossible on almost any other thing. There are some technologies that enable you to do that, but it's like a very hard thing. And so we actually saw this when, the Terminal Revenge team is, brought us actually. So thank you, Alex and the team, that brought us into this whole space.Swyx [00:30:05]: It's just very rare that, a framework would just say, “Guys, just use Daytona.”Ivan [00:30:11]: Yeah, I think it says it somewhere. Yeah.Swyx [00:30:13]: Yeah. I was like, “What is this?”Ivan [00:30:15]: There's all, there's multiple there, but they also mention a few other places. and so Daytona specifically-We have, the, just jumping on themes here We, I don't know where it says Data Center.Swyx [00:30:27]: I, there.Ivan [00:30:27]: Doesn't matter.Swyx [00:30:28]: There's a very strong recommendation, which is, very unusual. Which is, it's.Ivan [00:30:33]: We do not pay them for this, just.Swyx [00:30:34]: I know, yeah. They just like you.Ivan [00:30:35]: Yeah, they like us. yeah, and also a thing, so, Data Center has multiple isolation sets underneath. The customer doesn't have to know what they are. But basically we have Docker, which is a container, that's hardened with Sysbox. So it's Docker's, isolation that is a security equivalent to a VM, but it's still a container. And that is the default, and they, especially in these training workloads, really like that as an interface to be able to use just a basic Docker container, and we enable Docker and Docker. Which for these RL runs, if you need to do a Docker compose or Kubernetes, you can spin up a K3S inside of these things, which unlocks a huge amount of workloads that you can do that you cannot do on other providers. So just on that part is much more interesting. And so we went that, through that. We showed them that we could do that, and they enjoyed that quite a bit. They being the general venture people.Swyx [00:31:28]: Those people, yeah.Ivan [00:31:29]: And Harbor people.Swyx [00:31:29]: Harbor people, do are they, are they a company yet?Ivan [00:31:33]: As far, I do not know.Customer Pull, Slack Connect, and the Computer Use BetSwyx [00:31:35]: Okay. All right. Yeah. It's like super obvious that like, there's a lot of excitement and success around these things, okay, so yeah, tell us more, right? Like, this is an exploding workload, Harbor adopted you, which helped speed things along. But what are you learning as this new workload comes online?Ivan [00:31:53]: There's a couple things that we learned, which we chat about in the beginning. We, and this has led our story, as we mentioned, we like talked to a lot of customers along the way, and we add more features and more tool sets as we talk to customers. And it's interesting that And I think it's that the ecosystem is so small and/or the models get smarter, where when we see one user come with a request, we know it goes on a roadmap if like three to five customers come with the same request in that week. It's like very bizarre. It happens so many times, which is.Swyx [00:32:27]: Because they're all friends.Ivan [00:32:28]: Sorry?Swyx [00:32:28]: They all, they're all friends. They're all in the same group chat.Ivan [00:32:30]: Yeah, probably, yeah. ‘Cause and they're like, “Oh, can you do this?” And I'm like, “Okay, this is interesting. We'll put it on a feature request.” And then the next one's like, “Oh, can you do this?” “Okay.” It's all the same, right? It's always the same. And so what we try to do, and I personally try to do, I try to be on as many call, quote-unquote “sales calls” I can. I'm in every Slack channel. We literally have about 1,000 Slack Connect channels, something like that. It's an interesting, there's so many interesting things you find out when you have all the Slack channels. You can also see where people, transfer between companies. You see leave Slack channel, enter Slack channel. It's an interesting thing. Also, just I digress, I feel that Slack Connect is literally LinkedIn what it should be. You have a list.Swyx [00:33:08]: LinkedIn charges you to, use your own connections, but Slack doesn't, right? Slack is like, do it for free. It's more lock-in. It's great.Ivan [00:33:15]: Yeah. It's amazing. Yeah. It's one of the reasons.Swyx [00:33:17]: You're gonna pay Slack for life.Ivan [00:33:18]: Exactly. You're there for life. So that's interesting. And so one of the things, the newer things we were talking about earlier is we made a big bet and put a lot of investment on computer use. that is not seen publicly the light of day. We haven't GA'd that yet, but we have.Swyx [00:33:32]: Is there a thing I can pull up?Ivan [00:33:33]: There is computer use there. It's right up a bit.Swyx [00:33:36]: Oh, yeah. Okay.Ivan [00:33:38]: What we have, what we talked about and what we've seen publicly is there's this theme now about, the human emulator where And Elon from XAI has talked about this publicly, and if you think about the models today, they're actually quite sophisticated and they can do a lot of work, but they still don't have access to all the tools. Like, I'm a strong believer that the most efficient way for an agent to work is essentially headless or through, terminal or whatnot. But if we, if we look at knowledge work in general, there's about 100 million knowledge workers in the US, about a billion in the world, and knowledge workers, and the salaries of them aggregate to 10 trillion in the US 50 trillion worldwide.Swyx [00:34:24]: Wow.Ivan [00:34:25]: Something like that. And if we look at, the five most important sectors of that, so like healthcare and government and financial services and whatnot, that's about 56% of that. So let's say it's about half of that. So in the US it's about 25 trillion, and most of them, most of that work is actually still locked into legacy apps inside of Windows, which is not going anywhere for a very long time. Like, people just won't invest in that. How much of it? our assumption is the following: if, in the RPA market, which is similar market, well, not the same 25% of, these white collar, workers', work is automated. If an agent is more sophisticated, can go through more runs, figure stuff out, let's say it's, 40%, right? And so if you take 40% of that, you get to essentially, $10 trillion a year.Swyx [00:35:17]: That's a TAM.Ivan [00:35:18]: That is a that is a TAM. So that's the TAM of the models, right? That's not our, essentially ours. But you get to that size, and to be able to do that, you essentially have to give agents these computers with the legacy. So computer use, either Mac or Windows or Linux. Linux we also obviously have and others have. But Windows specifically is something very new, and the only option right now is an EC2 with, Windows or on Azure. Both of them take anywhere from three to five minutes to spin up. We've created an actual sandbox, so it's a second instead of milliseconds, but you have, point in time snapshots, you have, forking, you have all the things that you have from a sandbox, but essentially enables you to hopefully unlock all this value. And so that's been our big push and bet, but we've sort of, kept our ear to the ground. What is sort of the next things in the market?RPA Returns: Why Agents Still Need ComputersSwyx [00:36:06]: Yeah, knowledge work, and building, and sort of RPA, the next wave of RPA. I got very excited about RPA kind of during COVID times. The UI path was IPO-ing. And it was, a very hot Isn't it, Eastern European?Ivan [00:36:20]: It is, Romanian.Swyx [00:36:21]: Romanian?Yeah, it might be the only Romanian, big unicorn okay, yeah. This I don't I don't, I don't have like a I think there's, I think there's a stage being set for the resurgence of RPA, ‘cause everyone understands that, yeah, no one wants to deal with these shitty apps and no one's gonna rewrite them. Like, you just have to do, a remote operation and programmatic operation of them.Ivan [00:36:45]: If you wanna unlock it, my own setup was basically the following. So I was doing a board deck recently, last month, whatever, and I'm like, “Okay, let's just, let's just do automated.” So, all our data's in, ClickHouse and PostHog and QuickBooks, where everyone else's is, and I'm basically, connected that all to, my Cloud code, like go off and go Cloud code whatever. Go off and, here's the integrations, go do that. It pulled out the first report, which was great. It connected to Brex and all these things, pulled it, which was great, and then I say, “Okay, now pull out this, and this,” and I kept getting, really well McKinsey-style design reports, but the data said partial data. all the missing data, partial data. Like, it can't access all the things, and I got so frustrated, and so I got, I got, my Mac Mini virtual sandbox with OpenClaw. I gave it its own account in our company, and then I went to all these services and created a read-only account, so literally like an intern in your company. And so I would say, “Now go and do this report,” and it would get the same, or like, “I can't via the MCP or the API or whatever. I can't get all the information.” I'm like, “Go log in.” And it will log into the website, then go in, export the data. It'll export the data and do the thing end to end. So even for things that have today APIs, not all of it is exposed, and I to get value, I get immense value right now, but it has to be a computer usage, unfortunately, and so I spend a bunch of tokens just on that, but I get the job done. And so if even a startup like ours, and using all the hottest tools, still needs a computer agent what hope does, Goldman have to have a headless, right?Swyx [00:38:22]: Yeah, what a - Why isn't Microsoft doing this?Ivan [00:38:27]: I'm pretty sure, Satya had a post yesterday.Swyx [00:38:29]: Oh, okay. I see.Ivan [00:38:29]: Which was like, “Every agent needs a computer.”Swyx [00:38:31]: I see, I see.Ivan [00:38:32]: So they have launched something recently.Swyx [00:38:34]: Yeah, they have Microsoft Power Automate, I'm sure, I'm sure, they're gonna have their version.macOS Sandboxes, Apple Constraints, and the Windows OpportunityIvan [00:38:39]: Version of that, yeah.Swyx [00:38:39]: You're gonna try to do yours, and it - I always know there's always demand for Mac, but I know it's, tricky to host, macOS sandboxes.Ivan [00:38:49]: We will have macOS sandboxes fairly soon. The problem with macOS, OS sandboxes is, I'm deep in this, I don't know how much interesting is.Swyx [00:38:55]: No, it's.Ivan [00:38:56]: MacOS has this problem.Swyx [00:38:57]: It's a licensing thing, right?Ivan [00:38:58]: Licensing thing. So one, you're allowed to run only two parallel VMs per machine, so that's one. Two, you can only license to a different user every 24 hours. So if you come in and theoretically, if I wanna charge you per second and I charge you one second, I have to have it idle for the rest of the day. I can't have anyone else doing that. So the pricing will be different in the sense that I will have to - we would have to charge for 24 hours, and that's not even, that's not even the most difficult thing. But the, thing above that is, from a security perspective, they enable you to do memory snapshot, pause, resume, but only on the same physical drive, physical machine. And so what you can do in, Windows world or Linux world is that I can move in the background, your snapshot from one to the other and manage load, right? Here, if you wanna do that, you essentially have to have your.Swyx [00:39:49]: Yeah, snapshots. Yeah.Ivan [00:39:50]: Your.Swyx [00:39:51]: It's like.Ivan [00:39:51]: Physical machine.Swyx [00:39:52]: You can't break it up.Ivan [00:39:53]: You can't, you can't move things around that, and all of that is, that part is, from a security standpoint, if it is written. Like, I understand the security aspect of that, but it disables you from doing these agentic, like really scalable agentic workloads.Swyx [00:40:08]: You need to do a vibe-coded, clean room implementation on macOS that you can then - That's like Clean OS or something. I don't know.Ivan [00:40:17]: So. We have.Swyx [00:40:18]: ‘cause like Linux was originally like a clean room rewrite of Unix.Ivan [00:40:21]: Okay. Yeah.Swyx [00:40:21]: Or something like that, right? Like same thing to macOS. Someone needs to do it.Ivan [00:40:25]: Someone will do that, and someone will have some long-running agents for a few days to figure this stuff out. But yeah. So definitely we - we're really close to offering something ‘cause people do want it, but the pricing will be different, and the feature set will be sort of stringent.Swyx [00:40:38]: Yeah, nobody's gonna use this. like, the labs, the labs will because they want to automate macOS.Ivan [00:40:42]: They have to do RL. They have to do RL again. But even if you The - So the point is with the RL part, if you, if you do RL on macOS, then the next iteration of the model comes out, it will be able to use these tools significantly. Then you actually need to run those, that somewhere. So you're gonna have to have that, later on. And from, if anyone at Apple is listening, I very much feel that they are shooting themselves in the foot of the scale of the revenue of compute or licensing they could get if they would just enable a concurrency model similar to what you can get on a Windows and a, and Linux.Swyx [00:41:17]: Yeah. Yeah. And I'm sure they've heard this before. They just don't care. Yeah, it's And maybe they will change their mind with the new CEO.Ivan [00:41:24]: Yeah. We'll see.Swyx [00:41:25]: We'll see.Ivan [00:41:25]: High hopes.Swyx [00:41:26]: High hopes.Ivan [00:41:26]: High hopes.Swyx [00:41:27]: Okay. But I, it's very clear the market opportunity is huge in Windows, and you can go for a long time on just Windows, but your customers are gonna want both. and I think, it is interesting to me that, this is the sort of God application of agents, right? Like, I don't It was - How big was OpenClaw for you guys? Like, was it, was there, a significant bump.OpenClaw, Agent Labs, and the B2B2C Sandbox MarketIvan [00:41:54]: Not for us because we.Swyx [00:41:54]: Because you already.Ivan [00:41:55]: We're kind of positioned differently. Whereas although it's completely PLG and we have individual developers that use it, most of the users that use Daytona are sort of a B2B2C. Sort of it's either B2B or B2B2C. So, in the researcher world, it's B2B, so you're selling to, labs and neo labs and things like that. But on the long-running agents, it's mostly, from a scale revenue perspective, it's mostly B2B2C, where you have a app layer agent that uses you at a big scale.Swyx [00:42:26]: Like a Manus. Yeah.Ivan [00:42:28]: Like a Manus Lovable type of thing.Swyx [00:42:31]: Yeah. I think that's the question of, well how, um-Uh, yeah, B2B to C is basically to me what I've been calling an agent lab, which is kind of like you're not in a model lab, but you're making a very good wrapper that is a platform that other people can sign up so they don't have to code those things. Yeah, it sound, it sounds like a much better market than the direct OpenClaw market.Ivan [00:42:56]: I've like - We I've done multiple things. So the CodeAnywhere's part of our career path R in the calendar, was very much an end user developer product. And so that is great. It You can get a lot of developer love, and I feel that we do as a company have a bunch of developer love. But it's a different type, where it's people building these things. Again, it's more akin to a Twilio because you don't really run - As a person, you wouldn't run Twilio. I don't know how many people remember. It was like ask your developer billboard and whatnot. And people really love Twilio, but they only used it inside of like, “Oh, I'm building this app or service for thing.” And so we're very much directly to that. And you also know that I used to work for a competitor for Twilio, so it's kind of ingrained, in my DNA.Swyx [00:43:35]: People don't know InfoBip is that big.Ivan [00:43:38]: Yeah, it's.Swyx [00:43:39]: Because.Ivan [00:43:40]: It's a billion euro.Swyx [00:43:40]: They're all American. They're like, “Whatever's in Europe doesn't matter to me.” But like it's the, it's the same size or bigger? Same size?Ivan [00:43:46]: It's about half the size.Swyx [00:43:47]: Half the size?Ivan [00:43:48]: Yeah, about half the size.Swyx [00:43:48]: It's like, yeah.Ivan [00:43:48]: Still huge. Multiple billions a year. Yes.Swyx [00:43:51]: That's crazy.Ivan [00:43:51]: Exactly, and so that - These are like really interesting and large revenue-generating, very sticky businesses. Whereas when you're selling to the - When your focus is the end developer, it is a very hard sell because they're very price sensitive, very price conscious, very around that. And there's very It's very hard to scale. Your cap is the number of people that are willing to spin up - First of all, wanna spin that up, and then spin up multiple of these. Whereas if you're in the enterprise one, like we know everyone's talking about like how many tokens they're spending, I'm spending. Like a lot of companies today are like, “If this is our company, spend as much as you can.” Like basically that is where we're going. And so if you think about that paradigm, where you're selling to companies that say, “Spend as much as you can to generate, productivity,” versus, “Oh, I'm a single person. I have this much budget, and I'm doing this thing because it's fun or it's helping me out or whatever.” Like it is a different, it's a different go-to-market, I think, strategy.MCP, CLIs, and Sandboxes as the Agent RuntimeSwyx [00:44:50]: Yeah, there's a lot of discussion. I'm just kind of going through like the mental list of things that are in your favor, which is, for example, MCP versus CLI. Like obviously you want CLI. It's been very good for you. I feel like it's maybe a drop in the bucket or maybe it's huge. I'm just checking whether it's like these are big trends.Ivan [00:45:10]: Those things you - work well in our favor, to your point just because every.Swyx [00:45:13]: They're kind of drop in the bucket, right?Ivan [00:45:15]: I think it's like sort of all the things come together. And so there's so many things that impact that. To your point, like OpenClaw wasn't huge for us, but like having the agent SDK, from Anthropic, so or Cloud Claude Code was very interesting. The reason why it was interesting is that a lot of, let's call them app I don't know what to call them, app layer agent companies, essentially they are like, “Oh, I can create this new app, this new agent. All I need, I just use Claude Code, and I throw it into a sandbox, and then I have my interface to the human to that.” And so that enabled so many more companies to actually offer this, and then they would pull on sandbox. So that was, that was interesting. And to your point, like MCP, versus the CLI, the MCP is an interface against an API, whereas the CLI is like you can actually go do things. Like this is it. The difference between integrations and actually running scripts or data or analysis against a thing. So being able to use a CLI very well enables the agent to do more things, and it's because that people will invoke a sandbox, they'll run it in the CLI, and but it'll do anal-analysis on that data and then give you an actual result versus just, pulling data from an API source.Swyx [00:46:29]: Yeah, it's a layer of indirection basically, it's the same thing as agentic search versus RAG, which where you're.Ivan [00:46:34]: Exactly, yeah.Swyx [00:46:34]: Just like you just win whenever people put more agents into their workflow. And so like it doesn't really matter, but I'm just kinda teasing out like what else have people heard about that like it's sort of, “Oh yeah, this is another sandbox use case. Oh yeah, that's another one.” Am I, am I missing any big ones?Ivan [00:46:51]: The thing, the thing that people, which is the computer use stuff, which I think is probably the most interesting one, is, and to your point, we've talked to so many people over the last year. It's like, “Oh, like why do you need a sandbox? Why do you need this? Why this?” And to your point, it's like, “Oh, I need sandbox for this. I need sandbox for that. I need sandbox-” It's like, “Oh, I need it for every single thing.” And so basically what I, what I - and it sounds like a broken record, it's like you use a laptop every single day, right? And you are n of one. It's just you. But now imagine how And by the way, the laptop, the computer PC market, the PC market is about equal to the cloud market in total. So it's about 150, 180 billion a year. Something like that. It's about roughly the three cloud hyperscalers is about equal to like Apple, HP, Lenovo, whatever, It's a little bit less, but it's sort of like that. And now imagine And that's just like, so how big is the addressable market? What, how many people are there in the world now? What's the last data?Swyx [00:47:45]: Let's call it eight billion.Ivan [00:47:46]: Eight billion. And so let's say you can have two computer, like you have one personal and one business, whatever. Like so it's double that, right? and so that's 16 billion, right? How many agents are gonna be running in two years, in 10 years, in 100 years? Like And for every single task, they will need one of these. And so how big is that? That market is essentially quote unquote “infinite”. You will get to the point, and Dylan Patel was at the conference talking about, from SemiAnalysis, that talks usually about GPUs, was also talking about how CPUs will now be a bottleneck because it will be the constraint. You won't be able to grow, or we won't be able to have enough of these because there won't be enough CPUs to basically do.Swyx [00:48:23]: Yeah. Well, I actually had a really good podcast with Doug Oliphant, who, which was his president at SemiAnalysis, where they've basically been like, yeah, it's been a GPU shortage first, but then it's cascaded down to memory and now to CPUs.Ivan [00:48:35]: CPU, yeah.Swyx [00:48:35]: It-What's next? So networking. So, networking actually has been in shortage for a while if you're looking at, just GPU networking. But, yeah, it's really crazy the amount of computer use that's going on, yeah, cool. I, other questions are, just the one very big part is the open sourceness which you didn't have to do, your competitors don't do, like it's not, a lot of people are worried about keeping their projects open source because some competitor can just slot fork it. I don't know if there's any reflections on just being an open source company.Open Source, Trust, and Enterprise ProcurementIvan [00:49:15]: Yeah. There's a bunch. So we the original product that we did was open source.Swyx [00:49:19]: Yeah. CodeAnywhere.Ivan [00:49:20]: So doing that was actually very good for us. There's basically a saying of, What's the saying? Like, companies that are, that are doing really well, measure themselves against, free cashflow, that are kinda okay, it's EBITDA, then, it's, it goes all the way down.Swyx [00:49:36]: The worst is like GitHub stars.Ivan [00:49:37]: GitHub stars. GitHub stars are the worst, yeah. So you go all the way down to GitHub stars. And so our original one was GitHub stars. That's what we talked about, we're at the point we're talking about revenue, so we're we've gone up the stack on that. And so we started.Swyx [00:49:47]: No, profit.Ivan [00:49:48]: Yeah. We haven't, we're, we'll get there. We'll get there. But basically at that point we did stars and GitHub and it was useful, and the original variation that we did, it we split the core into its own repo and it was Apache 2.0, so very, permissive. And then we basically would bundl

Getup Kubicast
#1 - O que faz um pesquisador de segurança?

Getup Kubicast

Play Episode Listen Later May 21, 2026 18:24


O Antes do Deploy chegou! Depois de mais de 190 episódios do Kubicast, o primeiro e maior podcast do Kubernetes do Brasil, a Getup estreia um novo formato: conversas técnicas sobre software supply chain security com quem vive esse assunto na prática.No primeiro episódio, Camila Bedretchuk e Heitor Gouvêa conversaram sobre o que faz um pesquisador de segurança e o que esse trabalho revela sobre como as empresas ainda tratam segurança hoje.O episódio cobre:O que faz um pesquisador de segurança na prática e como esse perfil se formaA relação entre pesquisa de segurança e produtoComo times de produto podem começar a incluir segurança nas discussões do dia a diaA maturidade do mercado brasileiro em software supply chain security e o que está mudando com novas regulaçõesNosso podcast é construído com a comunidade. Se você tem um tema que quer ver discutido, conhece alguém que tem muito a contribuir ou quer aparecer por aqui para trocar uma ideia sobre software supply chain security, preencha o formulário.#Getup #Quor #AntesdoDeploy #SoftwareSupplyChainSecurity #DevOps #Kubernetes #SRE #CloudNative #CNCF #DevSecOps #Cibersegurança #Podcast O Antes do Deploy é uma produção da Getup, empresa especialista em Kubernetes e projetos open source para Kubernetes. Também somos criadores do Quor, catálogo de imagens de container com CVE próximo de zero. Os episódios estão nas principais plataformas de áudio digital e no YouTube.com/@getupcloud. 

Packet Pushers - Full Podcast Feed
TCG076: Packet Pushers Assemble! Bridging the Telemetry Divide

Packet Pushers - Full Podcast Feed

Play Episode Listen Later May 20, 2026 56:25


Today our Packet Pushers team assembles to discuss whether the grass is greener on the NetOps or DevOps side of the telemetry fence. William of The Cloud Gambit, Scott of Total Network Operations, and Ned and Kyler of Day Two DevOps discuss the difficulties and differences of getting telemetry and state from devices across different... Read more »

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Take the 2026 AI Engineering Survey and get >$2k in credits and AIE WF tickets!This was recorded before Railway suffered a major GCP outage on May 19, despite being a multi-AZ, multi-zone mesh ring, with HA fiber interconnects between their Metal GCP AWS, because workload discoverability was unintentionally still tied to GCP. All has been resolved with a post-mortem.Railway did not start as an AI infrastructure company.It was founded in 2020 years before agents became the default way people thought about deploying software. Jake Cooper, formerly at Bloomberg and Uber, started Railway with a simple obsession: the activation energy to ship something to production should be near zero. Push code, get a URL, iterate. No Docker files, no Kubernetes manifests, no Ansible scripts stacked on Ansible scripts.For years, this was a slow grind. Railway spent its first 18 months hand-acquiring its first 100 users with Jake personally greeting every Discord signup on a second monitor.Today, Railway has raised $124m and is growing very fast. A 35-person team supports 3 million users, adding roughly 100,000 signups a week. Their bare metal data centers have a 3-month payback period vs. renting in the cloud, with 70% margins funding aggressive cloud bursting when needed. The servers they own have actually appreciated in value as RAM prices have climbed basically meaning the value of their hardware now exceeds the capital they've raised.From rebuilding Railway's network overlay over a weekend to moving the vast majority of workloads onto its own bare metal data centers, Jake Cooper is trying to build a new cloud for an agent-native world. In this episode, Railway's founder and “conductor” joins swyx and Alessio to unpack why the next era of software infrastructure is not just “Heroku but newer,” what agents need that humans did not, and why the old deployment loop of Git, PRs, CI/CD, and static cloud resources may be heading for a rewrite.We go deep on Railway's infrastructure stack: own-metal data centers, three-month cloud payback periods, cloud bursting, data center debt, Railpack, Nixpacks, Temporal, feature flags, Central Station, content-addressable filesystems, agent-safe production forks, and why the CLI may become more important than the canvas in an agent world. Jake also shares the founder journey behind Railway, how the company survived losing $500K/month, why it now serves millions of users with only 35 people, and why he believes the pull request is dying.We discuss:* How Railway went from a slow six-year grind to adding 100,000 users a week* How Railway thinks about agents as the next dominant software species* Why agents need version control, observability, compute, storage, and orchestration at 1000x scale* The economics of Railway's own-metal data centers and three-month payback* How Railway uses cloud bursting while scaling its own infrastructure* Why data center debt can be a better tool than venture debt for infra startups* Central Station, Railway's internal system for clustering customer feedback and incidents* Why responsible disclosure and over-communication matter for platforms* Why feature flags, progressive rollouts, and shadow traffic are essential for agents* Temporal's strengths, pain points, and why workflows matter for agents* Railpack, Nixpacks, Nix, and lazy-loaded content-addressable filesystems* Why “cattle, not pets” may change if you can clone the pets* Why Railway is building a new cloud from scratch instead of copying hyperscalers* The solo founder path, focus, writing, and how Jake thinks about company buildingRailway:* Website: https://railway.com/* X: https://x.com/RailwayJake Cooper:* LinkedIn: https://www.linkedin.com/in/thejakecooper/* X: https://x.com/JustJakeTimestamps00:00:00 Introduction: What Is Railway?00:02:07 Jake's Path to Railway00:06:13 Railway's Six-Year Growth Story00:08:52 Rebuilding the Business After the Free Tier00:11:17 Agents as the Next Software Platform00:13:29 Railway's Infrastructure Philosophy00:15:42 Bare Metal, Cloud Economics, and the Compute Crunch00:17:22 Cloud Bursting and Five-Cloud Networking00:20:20 Data Center Debt and Infra Financing00:23:31 Data Centers in Space00:25:24 What Agents Need From Infrastructure00:28:24 CLIs, Canvas, and Agent-Native UX00:35:15 Central Station, Incidents, and Responsible Disclosure00:40:30 Safe Rollouts, SRE Agents, and Production Forks00:45:00 AI SRE, Specs, Code, and Tests00:48:24 Self-Replicating Infrastructure and the New Serverless00:53:18 Heroku, Temporal, and Workflow Engines01:04:07 Railpack, Nixpacks, and Lazy-Loaded Filesystems01:06:01 Coding Agents, Token Spend, and Roadmap Acceleration01:10:56 The Pull Request Is Dying01:12:28 Feature Flags and the Agent-Era SDLC01:16:15 Cattle, Pets, and Cloning Machines01:19:29 Solo Founder Lessons01:24:12 Focus, GPUs, and Building a New Cloud01:28:20 Closing ThoughtsTranscriptAlessio [00:00:00]: Hey, everyone. Welcome to the Latent Space Podcast. This is Alessio, founder of Kernel Labs, and I'm joined by Swyx, editor of Latent Space.Swyx [00:00:10]: Hey, hey, hey. Today we're in the studio with Jake Cooper of Railway.Alessio [00:00:14]: Conductor of Railway.Swyx [00:00:15]: Conductor at Railway. Yeah.Alessio [00:00:16]: Choo-choo.Swyx [00:00:17]: Do you actually have that anywhere, like on your business card?Jake [00:00:20]: We call some of our volunteer moderators conductors. I don't have a business card. We're not that big yet. At some point I will. I got handed a nice business card from the Supermicro folks, and I was like, “Damn, this is pretty official.”Swyx [00:00:30]: Business cards are coming back.Jake [00:00:32]: They're cool. They're hip. The conductor thing is good. We're trying to figure out what we want to call each other internally. Some people think it's super cringe and say, “You don't need a name for people internally.” Some people want to call each other something. We still don't have a really good one.Jake [00:00:55]: We've got New Railcrews, Trainiacs. Nothing has stuck yet.Swyx [00:01:00]: I like Trainiac. Trainiac sounds good. Railwayians. For those who don't know, what is Railway? Let's give people a crisp definition up front.Jake [00:01:09]: Railway is the easiest way to ship anything. You go to the canvas, or you talk with Claude, and you say, “Deploy a Postgres instance, deploy my GitHub repository, run this code,” and you're off to the races.Swyx [00:01:22]: You've got a nice animation on the landing page.Jake [00:01:24]: Thank you. None of my work, by the way. They don't let me touch the design stuff anymore.Jake [00:01:25]: We want to make it trivially easy not just to deploy things, but to evolve applications over time. Most tooling right now stacks entropy on top of entropy: Docker, Kubernetes, Ansible scripts, and all these other things. If we can version all of your software and keep track of all the changes, then we can make it trivial to clone environments, fork into a parallel universe, get copies of production data, get copies of any services, make changes, validate them, and collapse them back in without reproducing everything across a staging environment.The Railway Origin Story: From Uber Systems to a New CloudSwyx [00:02:07]: I was looking at your background: Bloomberg, Uber. Nothing immediately stands out as, “This guy is going to found the next great platform as a service.” What prepared you for Railway?Jake [00:02:21]: It was curiosity to keep going deeper. I started out on front-end stuff, working on Wolfram Mathematica and porting it over. Then I briefly moved to Bloomberg, then toward Uber and distributed systems, taking the Jump Bikes systems and moving them to a distributed system built on top of Cadence, the pre-Temporal Temporal.Swyx [00:02:44]: Which, by the way, I'm happy to talk about, pros and cons.Jake [00:02:48]: Totally.Swyx [00:02:51]: But let's do the Railway story.Jake [00:02:52]: It has been a continual step of wanting an experience. Whether it's walking up to a bike, unlocking it, and having it work frictionlessly, or something else, the depth required to make that happen follows from the experience. A lot of the work I do, and a lot of the team does, is in service of that experience. We fundamentally don't care how deep we have to go. We will swim to the bottom of the swimming pool to get the experience.Jake [00:03:17]: I don't have a physics PhD. I did an EECS degree. It has always been about figuring out the next step: how do we get there? That's what led to starting Railway for that experience and then moving all the way to bare metal data centers. I was adding patches to the kernel this week to get the experience there because I can see how much better it can be.Swyx [00:03:49]: Other patches to the Linux kernel this week?Jake [00:03:51]: Yeah. Not upstream. Our fork.Swyx [00:03:52]: That's a flex. Railpack? No, this is different. This is the OS on top of Railpack?Jake [00:03:57]: No, this is an actual kernel patch. It's always literally: what do we have to do to get that experience? Then figure it out. Anything is figureoutable.Swyx [00:04:10]: Would you send the patch upstream, or does it not fit other use cases?Jake [00:04:13]: Maybe. We have to work out the experience internally. It has to do with the storage layer we're building for some of the agentic stuff. Maybe it'll be useful upstream, but it's deeply useful for us internally.Open Source, Forks, and Non-Deterministic VersioningSwyx [00:04:29]: You mentioned open source before. How do you think about starting from open source, and then coding agents letting you do a lot more from forks of it?Jake [00:04:38]: GitHub's original sin is that it's almost a series of broken pointers. You have this thing, then you clone it, and now you've lost the whole upstream. How do we make it trivial for people to modify really small pieces of it?Jake [00:04:51]: We think of Git in a discrete sense: I've either made a change and merged upstream, or I haven't. What would it look like if it were percentage-based, a little more non-deterministic, or a stream of changes that users traverse as a percentage rolled out in general and then rolled all the way up?Jake [00:05:13]: We have the open-source kickback program and let you deploy templates because we want to make it trivial for people to version these shards over time. It solves a large problem around authentication, authorization, and security. NPM has a way to define, “Don't take any new packages.” The ideal end state is that you roll out progressively to users with the minimum impact zone and continue rolling up. JPMorgan should probably be the last one on the patch line, for all our sakes, because our money and livelihoods are there.Jake [00:05:53]: It's okay if Johnny Vibe Coder gets a broken patch because there's so much entropy in the system that the rubber has to meet the road at some point. You have to test at varying levels.The Long Grind: First Users, Free Tier, and Making the Business WorkSwyx [00:06:13]: I wanted to pull up this glorious chart, which is your usage or number of daily signups?Jake [00:06:22]: Daily signups, I think.Swyx [00:06:24]: You started six years ago. It was a slow grind, and now you're on a rocket ship. You say, “Don't doubt your fight and don't quit.” Maybe pick out certain points that were key inflections for the company.Jake [00:06:40]: At the start, it's about getting your first 100 users, hell or high water. We had a website and a support link. The support link was the Discord channel. I had notifications on with two monitors: the monitor I was working on and the other monitor with Discord. If anybody came in, I was immediately like, “Hey, how's it going?” It was rare, so getting those first 100 users to come back was the start.Jake [00:07:14]: Then you build a consultancy factory because users want all these things. You have to go back to the board and ask, “What is the actual product offering I want to build on top of this?”Jake [00:07:28]: VCs want charts that always go up and to the right, but in reality you don't necessarily want charts that look like that. For us, there have been periods of expansion where we add features to test use cases, and periods of compaction where we ask, “If the experience we have is good, how do we make it significantly better?” Maybe we strip out features that don't fit our ICP anymore.Jake [00:07:57]: The boom from 2022 to 2023 came from the free tier. Everybody under the sun was using it.Swyx [00:08:09]: A lot of Reddit bots and Discord bots.Jake [00:08:12]: And crypto miners. When you build an open product on the internet where anybody can sign up, the internet is a horrible place with so many things. You go through periods of asking, “How do I reach as many people as possible?” Then, “How do I fit the exact use case for the people who really matter and are really excited about this specific thing?”Jake [00:08:39]: Then there was a two-year period of making the actual business work. During the free-tier era, we were losing about half a million dollars a month.Swyx [00:08:59]: On a $20 million bank account.Jake [00:09:02]: On a $20 million bank account with maybe $50,000 a month in revenue. That's a horrible business. I don't know how anybody invested. But you have to go through it and say, “We have an experience people love, but the business has to work.”Jake [00:09:17]: There are two schools of thought. You can run the horrible business all the way up with bad margins, or you can go back and make it work. We've always wanted a super lean team. We're 35 people right now. It's very small.Swyx [00:09:36]: Supporting three million already?Jake [00:09:38]: Yeah. We're adding 100,000 users a week right now, so it's growing fast. We don't want to add headcount for the sake of headcount or throw bodies at problems. We want to build systems. It's hard to build systems during expansion because you're adding things to the system because people are asking for them or things are breaking.Jake [00:10:00]: We had to cut off the free users for a little while, rebuild the business, and make sure it worked. We want to reach as many people as possible because software is important. It's become difficult to create things in the physical world, so it's important to make it easy for people to build in the virtual world and have access to creation. But there are legs to that journey.Jake [00:10:30]: You can see divots in the charts. If you follow between 2025 and 2026, it's either summer or winter. People go on holiday with family.Swyx [00:10:50]: It affects that much?Jake [00:10:51]: Yeah. It's kind of B2C and kind of B2B. People are shipping constantly, then they stop. Our activation curve now shows more people activating on weekdays because we have more business users, so it smooths out over time.Agents as the New Interface to DeploymentSwyx [00:11:17]: Was there a point where you started prioritizing AI development or agent development?Jake [00:11:24]: We've prioritized agentic as a top-of-funnel thing. Over the last six months, we've deeply prioritized agentic as a mechanism to build and deploy things because we believe the curve is so steep and that is how people will build and deploy software.Jake [00:11:42]: It almost fundamentally doesn't matter whether this is dot-com or not because we're all on the internet anyway. If agents are going to deploy a bunch of things and we hit an inference wall at some point, we'll fix those problems. The dominant species over the next 10 years is that we've moved from assembly to C to C++ to JavaScript to words. You're going to need to close that loop.Swyx [00:12:13]: When you say this is dot-com, did you mean buying the domain, or the general case?Jake [00:12:17]: I mean the dot-com era, when companies had a huge run-up because people understood the internet was important. Then they hit bottlenecks, fundamental laws of physics, math didn't work, and everybody came back down to earth. But it didn't matter because the internet became so impactful. If you operate on a long enough time horizon, you should build these things anyway because you can see where it's going.Jake [00:12:45]: That's where I think a lot of agent stuff is. You get to a point where you're running thousands of agents in parallel. What is the inference cost? What is the compute cost? How do you make that efficient? How do you coordinate all this? We have issues coordinating humans; we don't even have good tooling for that. Now we have to figure out how to get agents to coordinate, safely version changes, and know when to raise their hand for someone to intervene. Otherwise it becomes an interrupt factory.Railway's Infrastructure Thesis: Network, Compute, Storage, and MetalSwyx [00:13:19]: Let's go right into the technical side. What are the core infrastructure or architectural beliefs of Railway that allow you to do what you do?Jake [00:13:29]: The primitives matter a lot for us. We need network, compute, storage, and orchestration around it. You need control over a lot of those things. We've talked a lot about how we don't really use Kubernetes because we want higher-order control to place workloads in very specific places.Jake [00:13:48]: The reason is that you have to be very efficient with agents: memory reuse and all these other things, or you're going to massively blow up your cost structure. Being able to rack and stack your own servers and build your own metal unlocks performance and cost. Experiences where you're running 1,000 agents in parallel are not massively cost prohibitive.Jake [00:14:13]: Token use and compute use are blowing up. Over time, those things have to get a lot more efficient. You can get a lot of margin to make those experiences solid by building your own metal. That's all in service of offering a differentiated experience to as many people as humanly possible.Swyx [00:14:51]: You have a data center in Singapore.Jake [00:14:53]: Yeah. We have two in every other region now. In Singapore, we're adding a second one in Q3.Swyx [00:14:58]: What's it like? I've never built a data center. Do you go to Equinix and say, “I want some slots?”Jake [00:15:05]: Yeah. Equinix. You basically go and say, “I want power and I want a cage.” They say, “Great, here's what it's going to be.” You rent the cage for a period of time, fill it with racks and servers, and hook up internet to it. That's all the pieces.Swyx [00:15:36]: Then you handle everything else.Jake [00:15:37]: You handle everything else.Swyx [00:15:39]: What's the math versus clouds doing it for you?Jake [00:15:43]: If we rented in the cloud, our payback period when we go to metal is about three months.Swyx [00:15:50]: Which is crazy.Jake [00:15:51]: It's nuts. That's four years of depreciated hardware. You're going to see a lot of this compute crunch because hyperscalers are buying up a lot of stuff. We're working directly with OEMs, resellers, and people building these machines: Supermicro, Dell, and others.Jake [00:16:11]: Upstream, there's a bunch of supply pressure. When we raised our last round, between deploying capital for servers and now, the amount of money we've raised is less than the amount of money we have in the bank plus the value of the servers because the servers have appreciated as RAM has gone up. It's nuts how valuable hardware has become.Jake [00:16:50]: If you look at hyperscalers, they deployed around $80 billion of capital expenditures this year, and next year will be more. That's a massive infrastructure build-out. You look at that and think it's crazy that they're spending way more than the Manhattan Project. But if every person is going to run dozens or hundreds of agents in parallel, you have no conceptual idea how much compute is required to make that experience happen, even if you're deeply efficient and sharing resources. And that doesn't even count inference.Swyx [00:17:22]: How do you plan the build-out? The growth chart is so vertical. Are you usually at 100% utilization as soon as racks are live? How far ahead are you planning?Jake [00:17:33]: We still maintain cloud presence for bursting. We work with AWS, GCP, and a few other clouds. We can rent, and then the moment we get space or power, we compact those workloads off the cloud. We started on the clouds, then built a system to migrate to our own metal. There's nothing that says you can't continually do that again, and that's exactly what we do. We never want to be compute constrained.Jake [00:18:09]: At the start of the year, we actually became compute constrained because one upstream provider wasn't able to give us quota at the rate we needed, and the hardware was slower. I spent a weekend rebuilding our entire network overlay so we could straddle five clouds: Oracle, AWS, ourselves, GCP, and one other one. We can do more than that now.Jake [00:18:38]: We got into a spot where we were trying to pack instances tight because we couldn't get enough compute. That led to a few reliability issues, which are now past us. I made a tweet pointing out that it's becoming harder and harder to acquire compute at the rate these models need to acquire compute. We got bit by it.Swyx [00:19:15]: How do you think about pricing knowing you might not have your own metal available at all times? Are you pricing assuming you need extra margin if you end up going into the cloud?Jake [00:19:26]: Because we've built out our metal data centers, our margins on metal are around 70%. We can deeply subsidize the cloud business if we want to scale at a reasonable rate. We have a few levers: metal, which makes the margins; cloud burst; debt to buy servers; and venture capital. It's an interesting operational problem: how much cash do we have, how much should we raise, how quickly can we deploy it, and can we scale revenue as quickly as we scale compute?Jake [00:20:05]: If we continue making it trivially easy for people to build and deploy, then the faster we close that loop and the more operationally excellent we are with capital, the faster the business can scale. It's almost a straight linear deployment rate.Financing Infrastructure: Hardware Debt, VC, and Operational LeverageSwyx [00:20:20]: I think infra startups raising debt is a tool people don't utilize enough or know enough about. What can you tell us about that? Is it secured against your CPUs?Jake [00:20:32]: It's secured against our hardware.Swyx [00:20:37]: What rates do you get? Who are the lenders?Jake [00:20:39]: We pay prime plus a spread, and we can refinance any of the debt as rates go down. The terms are pretty good. The unfortunate thing is that Twitter has no nuance, so people say, “Venture debt bad.” But as with all things, there are specific tools and areas where you can be deliberate instead of using one tool as a hammer. Venture capital is not the hammer for everything. You have to explore and figure out what works.Swyx [00:21:12]: VC is usually the most expensive financing you can get.Jake [00:21:15]: Yeah. I also think people think about VC incorrectly from a capital-raising perspective. Most people think, “How do I raise as much money as possible from whoever is probably the best I can get at that time?” That's close to right, but what we've tried to do is figure out what unfair advantage we can buy with that equity.Jake [00:21:34]: It's the most expensive equity you're going to give away at that point in time, assuming the company keeps getting better. How do you use it to work with someone stellar who complements you? In the seed stage, I had never started a company. Ray Tonsing had good advice, and I could text him all the time. He was really fast. Awesome.Jake [00:22:01]: Then with John and Erica at Unusual, they said, “You roughly know what you're doing building a product. We'll mostly leave you alone and be available for advice.” Amazing. Then we got to Series A and the business was an operational tire fire because we didn't know how to scale a business. Work with Erica, and Jordan is over at Redpoint, so bonus.Jake [00:22:28]: Now we've raised from TQ and FPV as we're moving into enterprises. Every step of the way, we've asked: who can we partner with at this specific time to unlock the next section of the journey? I don't know enterprise sales. As an engineer, I can eyeball what features we might need, and we have wonderful people internally who can help. But you want boardroom dynamics where everyone is aligned and asking, “How do we win this?” instead of bickering about strategy.Data Centers in Space and the Physics of ComputeSwyx [00:23:31]: You had a tweet about data centers in space. Why no data centers in space?Jake [00:23:37]: It's not “no data centers in space.” My hot take is that I think it is solvable. I've just never seen anybody solve it.Swyx [00:23:49]: You said, “How are you going to dissipate that much heat in a vacuum?” You're making a physics claim.Jake [00:23:55]: I haven't seen anybody prove how you're going to dissipate that much heat in a vacuum. It doesn't mean it's not possible. It just means nobody has brought it up yet.Swyx [00:24:05]: Astrophage.Jake [00:24:06]: I don't know what that is.Swyx [00:24:07]: The Martian thing. Okay, you're very logical.Jake [00:24:09]: It could work. A lot of people are putting the cart before the horse. They say, “We're going to put data centers in space.” Okay, but how? “We have time to figure it out.” It's like in The Martian where they ask how they're going to intercept something and say, “We'll figure it out.”Swyx [00:24:36]: Making a bet on human invention is weird because you blind trust that it can be solved. But with physics, there are first-principles bounds you can put on it. Maybe not. Maybe you're asking to travel time or break a fundamental thermodynamic law.Jake [00:24:57]: I don't know how VCs do this either. How do you know what's not possible and a grift versus what's possible but sounds completely insane? “We're going to put data centers in space.” Coin flip as to which it is, and I guess you'll know in 10 years. That's one cycle.What Agents Need: Versioning, Observability, and 1,000x ScaleSwyx [00:25:23]: Moving back to agents. The branching, fast spin-up, and orchestration you do feels like pre-work that happened to be exactly what agents want. What do agents want differently than humans?Jake [00:25:37]: They want the ability to version things. It's not that different; it materializes slightly differently. Agents want a way to test changes incrementally. Engineers have feature flags. Is there a reason agents can't use feature flags? I don't think so.Jake [00:25:54]: They want version control. Can we use Git or not Git? That one is up in the air. I think something outside Git will emerge for how we version these things over time. They need observability. You need to query what happened, when it happened, which steps failed, traces, logs, metrics, and all the rest. They need network, compute, and storage. They need to write files, save files, iterate on files, and snapshot file systems.Jake [00:26:25]: A lot of what humans needed is in line with what agents need. Branching and forking are not different; we're just moving 1,000 times quicker. It can look like you need something massively different, but what you need is something massively better than what existed. You need orchestration massively better than Kubernetes. You need networking probably better than Envoy. It goes all the way down the stack.Jake [00:26:55]: If the workload profile doesn't change so much as it gets massively compressed because you need thousands of these things, what assumptions change? etcd is going to melt. You need to replace it with something. You can go all the way down the stack and say, “That part has to change, that part has to change, and that part has to change.”Jake [00:27:19]: The interesting thing about the super-exponential curve is that you have to build systems where you can rip out those parts at any time because a new bottleneck might emerge. You get good at parallel agents, and a different part of the system breaks. So it's similar to what humans needed, but at 1,000x scale.Jake [00:27:55]: How do you do code review in the age of agents?Swyx [00:28:00]: You throw more agents at it.Jake [00:28:01]: You don't. But then who reviews for CVEs and all these other things?Swyx [00:28:07]: More agents.Jake [00:28:08]: And that's how we hit the inference wall. You can continually throw agents at the problem, but I think there's a limit to the number of agents you can throw at a problem.CLI, Agent Handles, and Closing the LoopSwyx [00:28:24]: You already had a CLI before it was cool. How is the shape of what you're exposing changing, if at all?Jake [00:28:28]: CLIs have always been cool. The CLI changes because we think about how to give Claude, Codex, ChatGPT, or any model a handhold.Jake [00:28:50]: A CLI is a single command: deploy, get logs, and so on. Things that were prohibitively annoying to humans are not annoying to agents. They're nice. If I handed you a CLI with 40 arguments and 600 flags, you'd think, “I'm never going to use all of this.” But if you hand it to an agent, it says, “This is excellent. I have so many handles to work with.”Jake [00:29:24]: If you're going to expose things to agents that way, you want as many handles as possible where they can get information, query dynamic information, and close the loop quickly. Most problems right now are about how to close the loop as quickly as possible. Where does the agent get stuck, and how can you remove that?Jake [00:29:49]: Telemetry is important. If you can tell where the agent gets stuck from the CLI and say, “12% of people deviate from the happy path because of this, and now I add this argument and drive it down to 2%,” you massively increase the rate of loop closure.Jake [00:30:03]: That's how we think about not just the CLI, but every point in the dashboard. It's a user journey: I hear about Railway. I get something deployed. I get my first green build or aha moment. I see an endpoint, logs, whatever. Then I iterate. The iteration loop is indefinite. The user wants to deploy a new thing, a Postgres instance, change code, and keep iterating.Jake [00:30:36]: If you focus on the iteration loops and what's blocking them from closing quickly, one thing we say internally is: you never want to be waiting on compute anymore. You always want to be waiting on intelligence. If you're waiting on compute, there's a bottleneck that needs to be destroyed because eventually that bottleneck becomes so large that another workflow emerges to change it.Jake [00:31:04]: We've built a product where you push code, build it, and so on. But I fundamentally believe the push-pull loop is going away. We'll get to a point where you make a small change in production, that change is versioned across your infrastructure, you're working alongside copy-on-write versions of your database and infrastructure, and then you merge it in and it's instantaneously live. That's the holy grail of loops. The push-pull-rebuild thing is a point of friction that we're removing entirely.Canvas as Output: Dashboards, Context Anchors, and HyperstructuresSwyx [00:31:43]: It's incredibly fast. If anyone hasn't tried it, that fast feedback is great. My hot take is that Railway was famous for its canvas, which visualizes your infrastructure and lets you manipulate it visually. But that was for humans. For the next phase of growth, Railway CLI is more important than canvas.Jake [00:32:05]: The canvas is funny because it's a mechanism to show changes over time. You're right that previously we used it a lot as an input. Moving forward, its goal is more like an output. You would go to the canvas, make changes, see them, and watch your infrastructure evolve. Now agents have access to the CLI and can make those changes. So the canvas becomes an output: what information does the human need at this moment to make suitable decisions about control requests? Do I approve this or not?Jake [00:32:57]: It also has to be an anchor for your context, a port in the storm. Think of it like layers in a file system. You start with a project, then drill down into services, then into a function or code, because you want to represent the entire thing not just in your head, but in the canvas. Other people can share that representation, think on the same wavelength, and move quickly.Jake [00:33:33]: A lot of organizations get in trouble as they scale because all the context lives in someone's head. “How does this microservice work?” “I have no idea; go ask this person.” Then you have whole categories of products built around context discovery. A lot of that melts away if you have a solid hierarchy and can infinitely nest services, code, context, and everything else all the way down. That's what lets you build these structures over time.Jake [00:34:18]: It's also what lets us build what I've called hyperstructures: things that are way bigger. You look at the Golden Gate Bridge and ask, “How did we build that?” There's a meme that we lost the technology. To some extent, yes, because the coordination that built those things evolved and changed. We lost some of the art of building structure as we jammed everything into Slack.Swyx [00:34:52]: But you jam everything in Discord.Jake [00:34:53]: Same point. It doesn't matter. It's message passing and interrupts, message passing and interrupts.Swyx [00:35:00]: So you're arguing there should be something better and more structured than Slack?Jake [00:35:04]: Yeah. For sure. I think Slack is awful, and Discord is awful too.Central Station: Context Routing, Support, and Incident ClustersSwyx [00:35:09]: This is the equivalent of my mom test. What have you done that has your solution to this?Jake [00:35:15]: Internally, we've built a tool called Central Station that aggregates all the context from our users. Every piece of feedback, every customer support item, everything gets aggregated into clusters. If an incident is brewing, we can determine how many users are affected and break off a discussion based on that.Jake [00:35:40]: That is more helpful than long-running channels where you're trying to decide which channel to put something in. If you can dynamically aggregate information and dynamically route it to the right person based on context, it works better. We know internally that these four people are close to networking. If we see a networking thing, we can drill it down to those four people. If it's with this part, we can look at the commits. This is no longer a manual process internally.Jake [00:36:13]: If you go to station or help.railway.com, that's why we built it. We wanted to scale with a massive amount of leverage by aggregating feedback.Swyx [00:36:27]: This is built in-house?Jake [00:36:28]: Yep.Swyx [00:36:29]: I remember helping out on this one with Angelo in 2023. You scale a lot with a very small team.Jake [00:36:38]: Yeah. We're about 10 times bigger now.Swyx [00:36:40]: You have your full developer code here? Very cool.Jake [00:36:44]: If you go to railway.com/stats, we expose this as a pub-sub-able thing. It's all real-time metrics. There's a way to get it as JSON somewhere if you care.Jake [00:37:01]: We're big on trying to build everything in public and talk about what we're working on. We've had issues in the past, and we'll say, “Here's how we're fixing these things.” We've gotten compliments and flak for incident reports. We're always trying to make them better and talk with people.Incidents, Disclosure, and Progressive RolloutsSwyx [00:37:20]: You had a big one recently. I liked that it was scoped to 3,000. You presumably used Central Station. Talk through what happened and how you address it internally as a team.Jake [00:37:38]: Internally, this one really sucked. It had to do with an upstream provider that didn't do the behavior it said it documented, which is unfortunate given they wrote the RFC for how the behavior should work. We rolled those things out, and Central Station caught it initially when a couple users said caches weren't invalidating. We turned it off immediately.Jake [00:38:03]: When you roll out to a large user base of three million people, you get a lot of disparate behaviors. We tested in staging and had tests, but we hit an edge case. We've hardened those systems, and now we can make that better. But it was a tough one.Swyx [00:38:39]: I always wonder how private disclosure is supposed to work if people find an issue. Are they supposed to contact you first? When you run a platform, these things will happen. What channels should people pursue to quietly resolve it before it becomes a bigger incident?Jake [00:38:59]: There's responsible disclosure. We err on the side of over-disclosing and letting you know something is wrong versus having your provider gaslight you. We've erred on sharing those things more publicly, even if they impact a small subset of users. That's a decision we've made internally. We have four values. One is honor. The honorable thing is to notify people to the widest degree at which they may have been affected or there was an issue, and then confront it head-on: why did it happen, what can we do better?Swyx [00:39:45]: Not the whole user base. That's because of incremental rollouts and other things?Jake [00:39:50]: Yeah. Progressive rollouts.Swyx [00:39:54]: That should be the norm at all large platforms.Jake [00:39:58]: It should. A variety of companies do this. There's the quote that Meta runs 10,000 different versions of Meta. To our earlier point about agents, they need the same thing. They need shadow traffic and all these other things. We've built so much ceremony around production being sacred that we need to make it trivially easy to test different behaviors in a safe environment. Then you can make mistakes in a safe environment.Safe AI SRE: Customer Agents, Forked Environments, and Production ParityAlessio [00:40:30]: Do you see a world where these things get automatically caught, not necessarily by your agent, but by your customer's agent? The cache invalidation issue seems easy to check if you know to look for it.Jake [00:40:44]: It's hard because to determine it, we almost need to hook into your observability infrastructure. That's why we have the template loop on the platform: so you can roll things out progressively. You can roll out to Johnny Vibe Coder initially, or push a shard that someone consumes at their own leisure. Or you can roll it out over weeks: 0.1% of people, 1% of people, early adopters, then all the way up. That's the non-deterministic version control we talked about earlier.Jake [00:41:30]: I believe that's where most things should go, because most companies end up building staged rollout systems in-house. It's the same thing built again and again at every company. There's a massive opportunity to consolidate developer debt.Alessio [00:41:45]: You should have a free tier. Model providers give free tokens if you let them use the data. You could give free compute if someone is the number-one shard that goes out and lets you plug into their observability.Jake [00:41:55]: We do that. That's why we talked about the impact on 3,000 people. We start with lower-impact people. Larger companies on the platform are last to receive those rollouts so they have a version of the platform that's deeply stable.Alessio [00:42:16]: I have three services, so I'm sure I get the first rollout. You can nuke my thing at any time. There are all these SRE agent companies. Observability people also want agents that fix upstream problems. You have your own agent in the canvas now. How do you see that playing out?Jake [00:42:39]: It's the stacking entropy problem. If you don't have primitives to make iteration in production safe, it becomes difficult. If you're an observability provider saying, “Here's the fix to this error,” assume 80% are good and make sense. But in the last 20% long tail of complex issues, if you let somebody stamp it, you create an opportunity for an incident.Jake [00:43:08]: That's why forked environments are important. People have staging, but it always drifts from production. You need primitives, workflows, and experience built first-party on the platform so you can fork any service at any point in time.Jake [00:43:33]: I think of the canvas as a sheet of transparency paper. The agent is a little guy you push up into the canvas. It should say, “I need to copy that service and that service so I can test these two things.” It gets a read-only copy of production. Anything that's PII gets marked as a transform when we clone the database, create a copy-on-write version, or read from it. Then the agent makes changes and asks, “Does this actually work?” as close to production as possible.Jake [00:44:22]: That's how close you have to be, or you get massive drift. The system becomes unstable. You see this with massive systems built on Docker for local, Kubernetes for production, and a specific thing for something else. That complexity slows developers and becomes unstable at scale, making it hard to iterate. We want to compress that way down and say, “As close to prod as possible is where we want to be.”From AISRE Skeptic to Agent BelieverSwyx [00:45:00]: I was texting Erica for questions, and she says you were originally not a believer in AISRE. Have you come around on it?Jake [00:45:10]: I flipped, but I'm still not a believer in AISRE if you don't have the primitives to make it safe. If you unleash AISRE on production infrastructure without safe primitives for copying volumes and making sure things are fine, it's going to nuke your production database. It's not a matter of if, but when. I'm a big believer in making those loops safe.Jake [00:45:33]: I was a deep AI skeptic until 2023. In 2024, I thought, “Maybe I can roughly make this thing do it.” In 2025, I thought, “Now I can hold this.” Over winter break, everybody came back saying, “It's almost impossible to hold this.”Swyx [00:46:01]: Did you see this on the Claude docs? CloudBot? OpenCloud?Jake [00:46:06]: It's gotten to a point where it's harder to hold it wrong than to hold it right. There's a scene in Avengers where Vision picks up Thor's hammer and says it's terribly well-balanced. It self-balances and works well. I'm a deep believer at this point that this will be the dominant species: assembly, C, C++, JavaScript, words.Swyx [00:46:35]: It feels like a big jump.Jake [00:46:37]: It is. But it's not like you abandon CPU-based discrete logic and move straight to fuzzy logic. You need both. Your skills should call code or applications or some static structure. You can use skills to distill what the procedure should be or how the code should act.Jake [00:47:02]: I'm coming to a thesis: you need three points. You need a clear spec defining the system, the code, and the tests. When you say it out loud, if you've been in engineering long enough, you're like, “Of course. That's an RFC, tests, and code.” But they all matter. Having them together lets them reinforce each other: the spec and tests match, but the code doesn't, so reconcile it. Or the tests and code match but the spec doesn't, so reconcile that. That's the iteration loop.Jake [00:47:41]: That's why you're seeing people talk about software factories, docs, and reconciliation. Some of that is architectural astronomy if you don't implement it, but that loop is where most things will end up.Swyx [00:48:07]: For listeners, we've been talking about this on the pod for three years: the holy trinity of specs and tests. Itamar Friedman from Qodo is the reference if people want to look it up.Self-Modifying Infrastructure and the End of Push-Pull-RebuildSwyx [00:48:18]: One thing I want to mention on the OpenCloud idea is self-modification. I don't know how Railway would support it, but I have my OpenClaw, and I just tell it it has the Railway CLI and can do whatever. In theory, whatever capabilities or new infra it needs, it can call the Railway CLI, provision it, and add it to itself. The agent can modify its own infra.Jake [00:48:45]: It's nuts. I have a loop set up where you put the Railway CLI on top of something that runs on Railway. You're authenticated as whatever the current box is, and you can make any changes to it. Then you call Railway deploy, and it deploys itself.Jake [00:49:04]: It's like: “I need to spin up this instance of this environment. I already exist in this environment. Excellent, I have access to a Postgres instance now.” That's where we want to go with agentic, self-replicating infrastructure. That's your loop: iterate in production. You continue making changes. If it works, merge it upstream. If it doesn't, throw it away.Jake [00:49:37]: How do you make throwaway copies trivial to spin up and super cheap? The era of “I have an AWS instance with four vCPU and 16 gigs of RAM” is going to get destroyed. If you do that for agents, you need a thousand of those machines. It's prohibitively expensive compared with what we've spent a ton of time figuring out: the atomic unit of deploy, whether you call it isolates, sandboxes, or something else. Only pay for what you use, spin up instantaneously, and close the loop as quickly as possible.Jake [00:50:15]: If the system can self-replicate safely and say, “This is my environment, I'm making these changes,” it can come back with, “Does this look good? This is a new state of infrastructure given this prompt. I think I've solved it.” Then you go back and say, “Actually, it looks different.” It does the loop again. Then you say, “Cool. Apply.”Swyx [00:50:38]: That's retroactively obvious, which is the most useful kind. Any other comments on agent deployment on Railway?Jake [00:50:51]: It's getting better every day. I'm on X or Twitter. You can always yell at me about the parts not working as well as they should, because plenty of things should work way better.The New Serverless: Stateful, Long-Running, Pay-for-What-You-Use LinuxSwyx [00:51:04]: At this stage, when people want massively or embarrassingly parallel compute, they usually talk serverless. I feel like there's a new serverless compared to the previous five years of serverless. You're in that new bucket. Do you have comparisons or philosophical differences you want to call out?Jake [00:51:31]: It's somewhere in between. It's the ability to run stateful, long-running workflows or executions.Swyx [00:51:42]: Vercel has Fluid Compute, Cloudflare has some container thing, Google has App Runner and others.Jake [00:51:55]: That's where everything is roughly going, and it's why we've been working on this for six years. We believe users need access to a computer: a box that speaks Linux. They need to deploy what they want. Other systems change the surface area of what you can build. For us, users need a computer and need to deploy anything they truly want. That's why we've focused on the primitives: network, compute, storage. If we give you those and expose them so you can run things indefinitely, that's where we believe it's going.Jake [00:52:43]: Twitter has no nuance, so everyone says “servers” or “serverless.” It's always somewhere in the middle: I want to run it for a long time, but I don't want to provision the resource statically or pay for things I'm not using. That's been our thesis from day one: pay only for what you use, run it indefinitely, and it is full Linux.Swyx [00:53:12]: That's why I like the naming of Fluid. It's fluid. Flexible.Heroku, Focus, and Carrying the Torch Without Becoming the PastSwyx [00:53:18]: Another milestone is the Heroku official deprecation. You're one of the presumptive new Herokus. “New Heroku” has been a category for as long as I've been in developer tooling. It's finally happening. What was that like? Any behind-the-scenes of, “This is the moment”?Jake [00:53:42]: You have people where you're like, “You were running stuff on here? You, as this company?” It's crazy that names you would know are running on it and now coming to us saying, “We want to move a lot of this off.”Swyx [00:54:00]: Any behind-the-scenes on why Salesforce let Heroku stagnate?Jake [00:54:05]: I can only guess. It's hard when it's not your business. Salesforce's business is to build a great CRM. That's their focus. Then you acquire a compute business as an offshoot. A lot of early Meta people talk about focus. Boz has a write-up about how in the early days of Meta they had no money, so they were forced to focus. Then they turned on the money tree and had no reason not to split their focus.Jake [00:54:52]: But that dilutes your product. You get offshoots where you ask, “Is this the focus of the business?” If it's not core, it languishes. A lot of companies get in trouble when they split focus because they're fighting a multi-front war, not just externally but internally for alignment. Where are we going? What are we doing? What is our purpose?Jake [00:55:24]: If you're Salesforce-built and mission-driven, you want to work on Salesforce. Heroku is off to the side. It's not core to the business. Getting resources, budget, focus, and alignment internally becomes hard. It was a matter of time.Swyx [00:56:06]: Kudos for them to call it out instead of leaving it unknown.Jake [00:56:12]: Their release was a little odd. They called it out, but they didn't say they were shutting it down. Behind the scenes, I think they issued messages to people saying they should close accounts and that they were going to deprecate and remove things over time.Jake [00:56:30]: It's crazy because some of my first deployment experiences were on Heroku. You start with dragging things into an FTP server, then you try to get a deploy working, and then it's Heroku. It was the on-ramp for us. But the wheel turns. New things emerge. We're happy to carry the torch for a lot of that. But we don't want to be the new Heroku. We want to be the way people build and deploy software, and ultimately the way people monetize software over time.Swyx [00:57:19]: It's still a big crown to be the new Heroku. There are 50 companies that fought for that.Jake [00:57:23]: Everybody is holding some portion of it. We're happy to support people and companies. The platform works differently. The game loop is similar, but we've been dogmatic about where these things are going: primitives, agents, fan-out. Some things fit; some workflows need to change. We have an approximation of Heroku pipelines with the environment system. It's exciting. We've got a ton of people we can support, and it's growing a lot.Temporal, Workflow Engines, and State MachinesSwyx [00:58:12]: I have one more technical question about Temporal. I've sold my shares. You're a power user and one of our earliest customers. I met you through Temporal. You built on Temporal. You have complaints. This may be the most neutral and informed conversation anyone will hear about Temporal without someone working at the company.Jake [00:58:39]: That's fair. I've used Temporal for almost 10 years because of Cadence at Uber.Swyx [00:58:52]: Give people a sense of what Cadence was at Uber.Jake [00:58:57]: Cadence was the precursor to Temporal. It powers trip actions, rides, when you rent a Jump bike or scooter or car. You're running workflows for a period of time and saying, “This ride will run indefinitely until it finishes.” You attach information: you paused in this zone, so add this charge to the bill. When you end the trip, the workflow is done. That experience was powered by Cadence at the time.Swyx [00:59:34]: I used to say it's like programming the entire user journey top-down as one function.Jake [00:59:39]: It's a powerful idea and important. It's also important for the next phase of the agentic journey. You want an agent to do a specific task, be complete or incomplete on that task, and move on to the next thing. You need a way to manage workflows dynamically.Jake [00:59:59]: Temporal was always great in theory, and great when you got it working the way you wanted in production. But it required you to model the entire journey in your head. If you didn't, you could cause issues where replaying the state of the workflow causes non-determinism.Swyx [01:00:25]: Because it works on deterministic workflow history.Jake [01:00:28]: Exactly. I describe it as a jet engine. If you know how to operate it and run it, it's great. But you can't hand it to people trying to build complicated things if they don't have the whole state in their head.Jake [01:00:48]: We run our whole deployment pipeline on top of it. That's a reasonably complicated workflow: pre-commit hooks, signaling, queuing, and all the rest. We ran into the same thing at Uber. As you express a large workflow, it gets more complicated, with more states in the state machine that you have to map back to the workflow.Swyx [01:01:15]: It's a lot of ifs.Jake [01:01:16]: Exactly. At Uber, we built a system for doing the state machine and testing it. We've started to build some of those things here because it's grown heavily. It's not quite love-hate. When it works well, it works super well. But if someone who doesn't have full context puts something into the system that invalidates state or causes non-determinism, or spins off a ton of activities, you have to keep track of underlying SRE knobs like activity slots. Those should scale with memory, vCPU, and so on. It becomes a bear to scale.Swyx [01:02:10]: You need a capable sysadmin running things behind the scenes. If you moved off, what would you do?Jake [01:02:19]: We'd build our own workflow engine. We have a few internally that we've worked on.Swyx [01:02:27]: This is one of those classes of things you typically wouldn't vibe code, but I'm wondering if you can.Jake [01:02:33]: I still don't think you should vibe code it. You still want to run decent tests to make sure it works.Swyx [01:02:39]: Timo didn't invent that from scratch either. There are libraries you can run. On top of that, it's just a state machine that you have to map out. Ultimately, you define the instructions you want and run them through a state machine.Jake [01:03:00]: It's very doable. Workflow stuff is interesting. Restate is doing neat stuff here.Swyx [01:03:10]: You're tied into JavaScript. Are you a JavaScript maxi?Jake [01:03:13]: Internally, we have TypeScript, Rust, and Go. We don't add more languages. Actually, we have a little C because we write BPF code and hooks. But those are the languages.Swyx [01:03:28]: Is this for sidecars?Jake [01:03:32]: No. It's for the networking stack, volumes, and things like that. We use TypeScript a lot because it powers the dashboard, but we're moving a lot of workflow stuff off the dashboard stack and into the infrastructure stack.Railpack, Nixpacks, and Content-Addressable FilesystemsSwyx [01:04:00]: Cool. Any other technical infrastructure stuff? Railpacks?Jake [01:04:07]: We built an engine for determining dependencies based on source code. It's called Railpack. We built the first version, Nixpacks, on top of Nix, and then we moved.Swyx [01:04:17]: People have been trying to get me to adopt Nix and NixOS for four years. Is it ever going to be a thing?Jake [01:04:23]: I don't know. We're excited about it, but it has pain points. Think of it as a stack of versioned binaries at specific slices in time. If you want version X and version Y, you bloat the package space, which blows up image size and makes real-world workloads difficult.Swyx [01:04:53]: But you content-address it and cache it. In theory, there are optimizations.Jake [01:05:00]: In theory, yes. But with a large enough user base and disparate enough machines, you run into a problem Meta described in the XFAAS paper, their internal serverless system. It becomes difficult at scale unless you break out specific runtimes.Jake [01:05:24]: We didn't want to do that because we wanted to truly allow you to deploy anything. That was our initial thing with Nix. But we've moved toward interesting work around content-addressable file systems that can lazy-load anything from any point and page it into memory.Swyx [01:05:48]: Amazing.Jake [01:05:49]: The future is very bright. It's crazy, and it's going to be nuts.Coding Agent Spend, Roadmaps, and Token ROISwyx [01:05:54]: Founder journey stuff?Alessio [01:05:56]: Your cloud usage: you tweeted you're going to spend $300K this month?Jake [01:06:01]: I think we got to $200K.Alessio [01:06:02]: Coding agents?Jake [01:06:03]: Yeah.Swyx [01:06:04]: Across the company?Alessio [01:06:05]: You only have 35 people, so I'm sure they're not all spending $10K a month. What's the distribution?Jake [01:06:10]: I think I'm at about $25K. We have power users all the way down. We came back from winter break, and I basically said, “If you're writing code by hand, you're doing this wrong.” The tools are good enough now that you can move extremely quickly. There are issues and pain points, but you should be reviewing the code you are writing instead of writing it by hand.Jake [01:06:40]: Architectural patterns matter more now than ever, but you shouldn't spend your time generating code you would write. If you know how to write it, ask the agent to write it and reconcile it until it looks like you would have written it yourself.Jake [01:06:58]: People misconstrue my propensity to push people toward agents as connected to our growth and some reliability bumps. They're not necessarily related. The tools are good enough to move extremely quickly and build things way larger than you could before.Jake [01:07:19]: To the earlier point about cooling data centers in space: I don't know. But with software, you can ask, “How would I build block storage from scratch? How would I do these things?” I have ideas because I have history and have read papers. Let me work them out and build massive test benches with thousands of tests, because those are now free to author. If you're not using AI systems to speed-run your roadmap and reconcile your existing system onto the future, you're missing a large point of what's happening.Alessio [01:08:12]: What's the path to spending $3 million a month? Is it bound by ideas and things customers can absorb?Jake [01:08:19]: For most companies, it's bound by deployment at this point. That's why we've seen a massive boom in users and companies, from Fortune 50s down, asking how to get developers to move faster. You'll probably hit your CFO before any technical limits because they'll look at the eye-watering amount of money spent on tokens. Inference costs have to come down, but we're inference constrained now. There will be price discovery around what makes sense for an org to adopt.Jake [01:09:06]: I think you'll end up with the F1 driver concept. If someone is really adept at these things, it makes sense to put them in a $3 million car. If they're not, it probably doesn't make sense. You'll take a few people and say, “You can drive the F1 car. We need to go in this direction. Figure out if it works and prototype it.”Jake [01:09:33]: We've done some of that and vastly accelerated our roadmap. We thought we'd ship something in a few years; now we can probably ship it in a few months because we validated it and don't have to build it incrementally. We can skip steps and move toward our vision.Alessio [01:09:58]: A lot of people are realizing the roadmap doesn't always have a business impact, so they say tokens are too expensive. But if your roadmap were built to make more money by the time you built it, you'd have token pricing for it, the same way you do with sales. You'd spend a billion dollars on sales if you knew you would get $2 billion of revenue.Jake [01:10:19]: Exactly. A naive way to measure this is the percentage of tokens that end up in production. If you can measure impact because those tokens end up in production, that's awesome. But the burden of proof will rise. Internally, we have a growing number of pull requests that haven't merged. The question becomes: how do you get this into production? It's about how quickly you can build and deploy software, which is exciting because that's our whole thing.The SDLC Shift: Prompt Requests, Feature Flags, and Safe RolloutsSwyx [01:10:56]: The SDLC is changing. One thesis is that the pull request is dying. It's going to be the prompt request. Beyond that, code review is also kind of dying if you have all the other systems in place. What else is changing about the SDLC?Jake [01:11:19]: The AISRE and the tools to make it happen. AISRE is pie-in-the-sky aspirational. What does it take to get an AISRE? What tools do you need to build?Swyx [01:11:32]: You should expose your tooling to customers at some point. The Central Station command center.Jake [01:11:39]: We have it for template maintainers. Template maintainers can deploy and maintain templates, and they get feedback. We're going to expose those things incrementally.Swyx [01:11:51]: Clustering around incidents. Everyone has a version of that, but I don't think anyone has solved it.Jake [01:11:56]: I won't say we've solved it internally, but it's gotten so good that we can see incidents forming pretty quickly. At some point, those will be things either someone else builds or we build. We've always built things purpose-built for us. If it makes sense to make it useful for users, monetize it, or turn that loop into a profit center instead of a cost center, we want to do that.Jake [01:12:28]: Pull request is definitely dying.Swyx [01:12:29]: Do you do first-party feature flagging and incremental rollout stuff?Jake [01:12:34]: We have a feature-flagging engine we built internally and will eventually roll out.Swyx [01:12:38]: I don't see it as a user. How come you didn't give us what you have?Jake [01:12:43]: We have to beta test it. We care a lot about the quality of the things. There's plenty we've used internally that doesn't make it all the way through the journey because it fails. It works for one service but not multiple services. We'd have to build it for multiple services and know that if we released it, we'd rebuild it again and again. Some things are worth that, but many inform the roadmap.Jake [01:13:18]: We don't want to dilute the experience by saying, “This works, but only for this service,” unless it's a core initiative. Over the next few months, we'll roll out things that work for a single service, then multiple services, then multiple services across the environment. You have to be deliberate. Otherwise you create broken disparate experiences and support load because people ask how to use the feature.Jake [01:13:52]: It's the earlier expansion and compaction pattern. You expand the company to get features, then compact and smooth them out so the experience is stellar. You told me in the hallway, “It's gotten so much better.” Internally we're saying, “This part really sucks. We need to make it significantly better.”Swyx [01:14:11]: I can attest to that over the last three years watching you build Railway. For listeners, feature flagging is a huge part of Uber culture. So much so that they have too many feature flags and another thing to remove feature flags. Facebook has Gatekeeper. Agents are going to need this. It's fundamental to incremental rollouts. OpenAI acquired Statsig. GPT-5 is routing and flagging through different models.Jake [01:14:56]: It's super important. If the software development lifecycle is going to change because we're doing things 1,000 times faster and 1,000 times more concurrently, what becomes important at scale?Jake [01:15:16]: Before I started Railway, I built a feature-flagging product and tried to sell it. It was an easier version of LaunchDarkly. I ran into a problem: anyone small enough to adopt your technology doesn't care about feature flags, and anyone large enough to need feature flags needs so much scale that you have to build out all the infrastructure. I scrapped it.Jake [01:15:42]: But what is old is new again. Companies are trying to move quickly, but you can't YOLO a vibe-coded thing straight into production. You need to say, “Here's my blast radius, my impact, and I want to shadow it for these users.” Feature flags. You're going to need the tools larger companies built to maintain their structures. Everything gets compressed by 1,000x so everybody can build those structures quickly.Jake [01:16:07]: That's exactly where we are: compressing the software development lifecycle, then expanding it and adding more new things.Cattle, Pets, and Clonable InfrastructureSwyx [01:16:15]: Another term that comes to mind for newer developers is “cattle, not pets.” People treat production like a pet. It has a name. You baby it and keep it alive. With cattle, you can mass farm, roll out, portion parts out, and kill them.Jake [01:16:37]: I think that might change. You can move toward having pets as long as you have a cloning machine for your pets.Swyx [01:16:52]: Yeah.Jake [01:16:52]: If you can snapshot every single thing at every frame, it doesn't matter if something gets obliterated because you have a snapshot of it. The things we've built right now are designed to block changes from the hermetically sealed DevOps line. You have to write a Dockerfile because you nee

The Product Experience
How PMs can win with open source - Dan Ciruli (Product Leader, Nutanix)

The Product Experience

Play Episode Listen Later May 20, 2026 41:45 Transcription Available


Dan Ciruli is VP and General Manager of Cloud Native at Nutanix. A computer science graduate of UC Berkeley, Dan spent a decade in engineering before pivoting to product management in 2003, a role that barely had a name when he started. Since then he has held product leadership positions at EMC and Google, where he was part of the team that helped create Kubernetes and open source Google's cloud infrastructure.He was a founding member of the OpenAPI Initiative and a steering committee member for the Istio service mesh project, and has spent the last two decades with one foot in commercial product development and one in the open source community.In this episode, Dan explains why open source is not a charity exercise, how companies actually make money from code they give away for free, and what product managers get wrong when they tell their engineers to avoid it.Key takeaways— Open source is not crowdsourcing from individuals — much of the contribution comes from companies investing on the clock, because broad adoption benefits everyone more than proprietary lock-in.— The CNCF succeeded because it created a neutral space where the largest and smallest organisations felt equally safe contributing and consuming. That structure — not the code itself — is what made cloud native computing universal.— Being a product manager in open source requires the same core instinct as any other PM role: understanding the why. The difference is that your engineers may work for a competitor, and your roadmap is not entirely yours to control.— AI is multiplying the capability of both good actors and bad actors in open source security. The answer is not to slow adoption but to keep a credible human in the loop — someone with accumulated trust, judgement and accountability.— Before open sourcing your own work, be clear on how your company will make money, articulate it concisely for leadership, and then find at least one other organisation — even a competitor — willing to join you. A consortium signals a standard. A solo release signals a gamble.Chapters1:16 — From engineering to product management3:11 — Bridging open source and commercial work5:05 — The origin of Kubernetes at Google6:35 — How Nutanix embraces open source7:16 — The crowdsourcing misconception8:51 — Why the CNCF changed everything11:25 — Building a defensible moat in open source12:13 — The business models behind free code14:18 — Managing roadmaps you don't fully control15:04 — When your competitor writes your code16:04 — The CEO who wore his secrets around his neck18:13 — Developing an open source strategy19:37 — The one question every PM must ask22:44 — What is the CNCF?23:34 — AI, open source and the security arms race29:45 — Chop wood, carry water: the human in the loop31:48 — Advice for PMs running open source products33:15 — Harnessing a community you don't manage34:38 — Should you open source your own work?36:35 — How messy does it really get?39:33 — Linux is an anti-patternOur HostsLily Smith enjoys working as a consultant product manager with early-stage and growing startups and as a mentor to other product managers. She's currently Chief Product Officer at BBC Maestro, and has spent 13 years in the tech industry working with startups in the SaaS and mobile space. She's worked on a diverse range of products – leading the product teams through discovery, prototyping, testing and delivery. Lily also founded ProductTank Bristol and runs ProductCamp in Bristol and Bath.Randy Silver is a Leadership & Product Coach and Consultant. He gets teams unstuck, helping you to supercharge your results. Randy's held interim CPO and Leadership roles at scale-ups and SMEs, advised start-ups, and been Head of Product at HSBC and Sainsbury's. He participated in Silicon Valley Product Group's Coaching the Coaches forum, and speaks frequently at conferences and events. You can join one of communities he runs for CPOs (CPO Circles), Product Managers (Product In the {A}ether) and Product Coaches. He's the author of What Do We Do Now? A Product Manager's Guide to Strategy in the Time of COVID-19. A recovering music journalist and editor, Randy also launched Amazon's music stores in the US & UK.

Ardan Labs Podcast
AI, Open Source, and Accessibility with Eugene Cheah

Ardan Labs Podcast

Play Episode Listen Later May 20, 2026 80:57


In this episode of the Ardan Labs Podcast, Ale Kennedy talks with Eugene Cheah, founder of Featherless, about his journey from physics to building globally accessible AI systems. Eugene shares his vision for making AI more affordable, multilingual, and open to communities around the world through efficient architectures and open-source collaboration.The conversation explores GPU optimization, evolving AI infrastructure, the importance of multilingual support, and the balance between innovation and regulation. Eugene also reflects on speaking at the United Nations, the future of open-source AI, and why accessibility and transparency are essential for the next generation of AI technology.00:00 Introduction and Featherless02:25 Education and Early Interests10:24 University and Military Service15:19 Entering the AI Industry22:33 Startups and AI Development30:42 AI as a Force for Good34:28 AI, Culture, and Automation42:13 Fundraising and Building a Startup50:10 AI Architecture and Optimization58:23 The Evolution of Featherless01:02:37 Building a Global AI Vision01:06:57 Open Source and AI Accessibility01:12:35 AI Risks and Real-World Concerns01:18:20 Lessons Learned and Final ThoughtsConnect with Eugene: LinkedIn: https://www.linkedin.com/in/eugene-cheah-a47791126/Mentioned in this Episode:Featherless AI: https://featherless.ai/Want more from Ardan Labs? You can learn Go, Kubernetes, Docker & more through our video training, live events, or through our blog!Online Courses : https://ardanlabs.com/education/ Live Events : https://www.ardanlabs.com/live-training-events/ Blog : https://www.ardanlabs.com/blog Github : https://github.com/ardanlabs

Late Night Linux All Episodes
Hybrid Cloud Show – Episode 56

Late Night Linux All Episodes

Play Episode Listen Later May 15, 2026 29:51


We get into some homelab updates. Sean has been consolidating hardware, Gary has been implementing high availability with Proxmox, and Shane has been working hard to get Home Assistant working with Kubernetes, as well as downloading YouTube videos. Shane’s homelab Support us on patreon and get an ad-free RSS feed with early episodes sometimes Subscribe to the RSS feed.

MLOps.community
Agents are Just While Loops

MLOps.community

Play Episode Listen Later May 15, 2026 41:11


Hamza Tahir, co-founder of ZenML, joins the show to cut through the hype around long-running agents — arguing that at the end of the day, an agent is just a while loop that talks to a model, calls a tool, and writes to a file system. He covers the architecture of agent harnesses (inner and outer), what durable execution actually guarantees (and what it doesn't), and why the ML pipeline paradigm is a cleaner mental model than transactions for most agent workloads.Hamza also announces Kitaru — ZenML's new open-source execution runtime for async Python agents — built on five years of running ML workloads in enterprise environments.What we get into:Agents are while loops: The surprising simplicity under all the tooling: a brain (LLM), hands (tool calls), and a file system, stacked recursivelyInner harness vs outer harness: Why Pydantic AI owns the inner loop while production deployment needs a separate runtime layerWhat "long-running" actually means: Why the infrastructure we need to build is about extrapolating the future, not defining a time window todayDurable execution demystified: What checkpointing actually guarantees (infra failures, pod death, network drops) vs. what it never will (external state, bad LLM outputs, Snowflake rollbacks)ML pipelines vs transactions: Why bursty containers in Kubernetes map more naturally to agent workloads than microsecond-latency queue workers — and why Hamza argues against the complexity taxAnthropic opening the harness: Why letting other models run Claude Cowork is a "boss move," and what it means for the one-harness vs one-model debateHuman-in-the-loop, done right: The pod-kill-and-resume pattern, and why warm pools matter less when your agent runs for daysKitaru: ZenML's new open source durable execution runtime: zero-config local, Kubernetes/SageMaker/Vertex in production, built on Pydantic AI integrationArguing with Claude about Temporal: Hamza's story of spending hours getting an LLM to admit ZenML and Temporal solves the same problemIf you're architecting agents for production, picking between Pydantic AI, LangGraph, and Temporal, or just want to understand what "durable execution" actually means — this is the episode.// LINKS & RESOURCESKitaru on GitHub: https://github.com/zenml-io/kitaruKitaru launch blog post: https://www.zenml.io/blog/kitaru-launchKitaru on Hacker News: https://news.ycombinator.com/item?id=47520115Hamza Tahir on LinkedIn: https://www.linkedin.com/in/hamzatahirofficial/ZenML: https://www.zenml.io/ Timestamps[00:00] While Loop Checkpointing[00:24] Long-Running Agents Explained[01:28] Agent Harness Model Definitions[06:30] Durability and State Recovery[11:03] Agent Systems Layers[18:45] Durability in Agent Systems[22:07] ML Pipeline vs Transactions[29:23] Durability vs Guarantees[33:13] Durability vs Chaos Engineering[39:50] Kitaru Naming and Purpose[40:38] Wrap up#AIAgents #DurableExecution #OpenSource

The DevOps Kitchen Talks's Podcast
DKT 96 | Mock-интервью DevOps: AWS EKS, Terraform, Kubernetes, AI + много практики

The DevOps Kitchen Talks's Podcast

Play Episode Listen Later May 15, 2026 125:45


Mock-интервью с Николаем Лебедевым - DevOps/SRE-инженер, 17 лет в Linux, 4 года AWS EKS. Stack: Terraform, Flux, Cassandra, Kafka, Vault, SOPS. Два часа - много практики, много каверзных вопросов. ЧТО СПРАШИВАЛИ ☁️ AWS: EKS и IRSA, VPC с нуля (CIDR, multi-AZ, multi-region), managed K8s vs self-hosted, Elasticache, Golden Signals и метрики SRE.

Cloud Realities
RR012 How Open Source is powering AI with Richard Harmond, Red Hat

Cloud Realities

Play Episode Listen Later May 14, 2026 53:47


Open Source is giving AI a real boost, making it easier and faster for organisations to build and experiment with new ideas. As adoption grows, these open ecosystems are helping businesses move quicker, stay flexible, and unlock value with more confidence.This week, Dave, Esmee, and Rob are joined by Richard Harmon, VP & Global Head of Financial Services at Red Hat to explore how Open Source is shaping AI, from mainframes to Kubernetes, and from regulation and sovereignty to a future of AI agents writing code. TLDR00:25 – Introduction00:52 – Hangout: Deep democracy training and “what instrument are you?”03:19 – Dig in: Open‑source culture and AI, do they complement each other?10:02 – Conversation with Richard Harmon51:12 – Sitting in the chair and trying to keep up with AI GuestRichard Harmon: https://www.linkedin.com/in/richardlaurenharmon/ HostsDave Chapman:  https://www.linkedin.com/in/chapmandr/Esmee van de Giessen:  https://www.linkedin.com/in/esmeevandegiessen/Rob Kernahan:  https://www.linkedin.com/in/rob-kernahan/ ProductionMarcel van der Burg:  https://www.linkedin.com/in/marcel-vd-burg/Dave Chapman:  https://www.linkedin.com/in/chapmandr/ SoundBen Corbett:  https://www.linkedin.com/in/ben-corbett-3b6a11135/Louis Corbett:   https://www.linkedin.com/in/louis-corbett-087250264/ 'Realities Remixed' is an original podcast from Capgemini

Kubernetes Podcast from Google
Kubernetes at Uber with Lucy Sweet

Kubernetes Podcast from Google

Play Episode Listen Later May 13, 2026 40:34


Guest is Lucy Sweet, a Staff Software engineer at Uber and the lead for the Kubernetes Node Lifecycle Working Group. Imagine trying to move millions of compute cores and thousands of microservices to a brand new platform. All without dropping a single user request, ride, or delivery. Sounds like an absolute logistical nightmare, right? Well, today we are sitting down with someone who actually lived to tell the tale Lucy. In this episode, we are diving deep into Uber's monumental infrastructure journey: moving away from their in-house system to Kubernetes. We'll be unpacking the reality of running at this scale, why it's always DNS and why building things for fun is worth it. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod - bluesky: @kubernetespodcast.com   News of the week Broadcom announced donating Velero to the CNCF Sandbox Level KubeCon && CloudNativeCon Amsterdam 2026 Transparency report Call for Proposals for KubeCon && CloudNativeCon North America 2026 closes May 31 OpenChoreo v1.0 CNCF Sandbox Links from the interview Lucy on Linkedin Lucy's website [Article] Migrating Uber's Compute Platform to Kubernetes [Lucy Video] Migrating 2 million CPU cores to Kubernetes Up: Portable Microservices Ready for the Cloud Peloton: Uber's Unified Resource Scheduler for Diverse Cluster Workloads Odin: Uber's Stateful Platform Uber Batch platform Apache Mesos Hyrum's Law GKE Blue-Green nodepools Node Lifecycle Working Group Scaling Infrastructure Management with Grail kubegpt.org Osquery Uber Careers  

Crazy Wisdom
Episode #546: Beyond Postgres and Node.js: What Happens When Your Database Runs Your Code

Crazy Wisdom

Play Episode Listen Later May 11, 2026 56:42


In this episode of the Crazy Wisdom Podcast, host Stewart Alsop sits down with Tyler Cloutier, founder of Clockwork Labs and creator of SpaceTimeDB. They explore how SpaceTimeDB functions as more than just a database—it's essentially a distributed operating system that merges server logic with data storage, enabling real-time applications and time-travel capabilities. The conversation ranges from the technical architecture of databases and operating systems to the philosophy of distributed systems, touching on everything from Unix and Linux to how SpaceTimeDB could revolutionize AI-generated software deployment. Tyler explains how their system reduces the complexity of building real-time applications, makes deployment simpler for both humans and AI agents, and why games like their MMORPG BitCraft Online drove them to create this new infrastructure. They also discuss the future of the internet, the role of bots in gaming, and how SpaceTimeDB fits into the broader landscape of cloud computing alongside tools like Cloudflare, Vercel, and Docker. For more information, visit spacetimedb.com or check out Clockwork Labs on GitHub and Twitter.Timestamps00:00 Stewart introduces Tyler Cloutier, founder of Clockwork Labs, discussing the origin of SpaceTimeDB's name inspired by Einstein's theory and its time travel capabilities that store all operations indefinitely05:00 Tyler explains SpaceTimeDB as more of an operating system than a database, using tables instead of file systems while running code in a sandboxed environment with full atomic properties10:00 Discussion of how SpaceTimeDB replaces both Node.js and Postgres by merging web server and database functionality, eliminating separate deployment concerns15:00 Tyler explains JavaScript execution through Chrome's V8 engine and JIT compiling, leading to Node.js creation for server-side JavaScript development20:00 Explanation of stateless web servers versus stateful game servers, and why games require in-memory state management for real-time performance25:00 Tyler introduces reducers and real-time subscriptions, questioning why more applications aren't real-time when state changes should update immediately30:00 Discussion of Facebook as essentially a text-based MMO, comparing social media architecture to game server requirements and the need for unified systems35:00 Tyler explains ACID properties in databases: atomic, consistent, isolated, and durable, using game item trading examples40:00 Comparing SpaceTimeDB to smart contract systems without cryptocurrency or global consensus, positioning it as a smart database with centralized trust45:00 Tyler reveals SpaceTimeDB uses 43% fewer tokens than Postgres for AI-generated applications, making it valuable for vibe coding platforms50:00 Conversation shifts to bots in games and proof-of-human concepts, with Tyler proposing biometric systems and discussing potential in-person gaming applications55:00 Closing discussion about tracking AI-driven traffic through UTM parameters and finding SpaceTimeDB at spacetimedb.comKey Insights1. SpaceTimeDB is fundamentally a database that runs application code directly inside it, combining what traditionally required separate systems like Postgres and Node.js. Users compile their application logic into WebAssembly or JavaScript and upload it to run within the database itself. This architecture provides high performance because the entire server backend operates inside the database environment. The system also features time travel capabilities, storing every operation and change to data persistently and indefinitely, allowing users to set application state back to any earlier point in time. This makes SpaceTimeDB more accurately described as an operating system rather than just a database, where the abstraction is that everything is a table rather than a file.2. The inspiration for SpaceTimeDB came from building BitCraft Online, an MMORPG where all players exist in a single persistent world and rebuild civilization together. Traditional MMO backends required complex custom solutions to handle real-time state, with game servers storing state in memory and periodically writing to databases. This complexity existed because games cannot afford the latency of constantly delegating to distant databases like traditional web applications can. SpaceTimeDB solved this by making the database fast enough to handle real-time requirements directly, eliminating the need for separate game servers. This same performance advantage that benefits games also applies to web applications, which is why SpaceTimeDB evolved from a game-specific tool to a general-purpose platform.3. SpaceTimeDB functions as a distributed operating system where each database acts like a process in an actor model system, similar to Erlang or Scala Akka. Databases can send messages to other databases and be spawned across a cluster for horizontal scaling. This represents an overlay operating system running on top of Linux rather than competing with it, providing a distributed abstraction across many machines while Linux handles device drivers and hardware support. The vision is for the cloud to function as a single enormous computer running one operating system, where developers simply publish their programs without managing separate services, deployment, routing, networking, or persistence infrastructure.4. The real-time capabilities of SpaceTimeDB address a fundamental limitation in how most web applications work today. Traditional web servers are stateless, delegating all state to databases and accepting network round-trip latency for each request, which is why users often must refresh pages to see updates. SpaceTimeDB allows queries to be subscribed to, maintaining open connections that stream changes whenever query results update. This makes applications like Discord, Facebook, or banking systems naturally real-time without requiring page refreshes. The historical accident that more things are not real-time represents a problem SpaceTimeDB solves by unifying the web world with the game world's real-time requirements.5. SpaceTimeDB implements ACID properties—Atomic, Consistent, Isolated, and Durable—ensuring database operations are reliable and safe. Atomic means operations either fully happen or not at all, preventing issues like item duplication in games when trading between players. Consistent means declared invariants like unique usernames are always enforced. Isolated means concurrent operations do not interfere with each other. Durable means changes persist even if computers restart, with varying levels from in-memory on one machine to disk storage across multiple geographic locations. These properties are managed through reducers, functions inspired by React Redux that fold changes into application state incrementally.6. For AI and large language models, SpaceTimeDB offers significant advantages in building and deploying applications. Testing showed that creating applications with SpaceTimeDB uses 43% fewer tokens compared to Postgres implementations, costs less, has fewer bugs, and is easier to extend. This matters because the primary cost for vibe coding platforms is tokens. As more software gets written in the next twelve months than ever before, there is insufficient focus on infrastructure required to run all this AI-generated software. SpaceTimeDB positions itself as ideal for LLMs to target because of its simplified deployment model where developers just publish code and the system handles everything behind the scenes.7. SpaceTimeDB can be understood as a smart contract system without cryptocurrency or global decentralized consensus. Like blockchain smart contracts, it executes code with atomic, consistent, isolated, and durable properties, but avoids the expense and slowness of requiring all computers worldwide to agree on everything. Instead, it offers centralized trust where users trust Clockwork Labs not to modify deployed contracts, rather than the trustless but extremely costly blockchain approach. This makes it functionally similar to Cloudflare's durable objects but with full relational database capabilities. The system exists before the networking layer where Cloudflare operates, handling deployment, server, and database functions while Cloudflare could provide DDoS protection in front of it.

Open Source Security Podcast
Open source is critical infrastructure with Kat Cosgrove

Open Source Security Podcast

Play Episode Listen Later May 11, 2026 38:01


Josh talks to Kat Cosgrove about a how companies should be treating open source more like their critical infrastructure than free stuff. Kat has a ton of knowledge about how the interactions between companies and open source communities can work well, or not work at all. Kat's time on the Kubernetes Release Team. We touch on how a project like Kubernetes is super successful, while another, Ingress NGINX, was not. It's a super insightful discussion with a ton of lessons and advice for everyone. The show notes and blog post for this episode can be found at https://opensourcesecurity.io/2026/2026-05-open-source-infrastructure-kat/

Cabeça de Lab
KUBERNETES NA MAGALU CLOUD

Cabeça de Lab

Play Episode Listen Later May 11, 2026 68:30


Neste episódio, mergulhamos no universo de Kubernetes e containerização na Magalu Cloud. Falamos sobre containers, Docker, Kubernetes sem hype, escalabilidade, arquitetura, desafios reais de produção e como a cloud brasileira está construindo uma experiência mais simples, flexível e próxima da realidade dos desenvolvedores no Brasil.---Nos siga no Twitter e no Instagram: @luizalabs e @cabecadelabDúvidas, cabeçadas ou sugestões? Mande um e-mail para ⁠cabecadelab@luizalabs.com⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Participantes:LIA SALES | https://www.linkedin.com/in/liasales/JONATAS BALDIN | https://www.linkedin.com/in/jonatasbaldin/THIAGO PAGOTTO | https://www.linkedin.com/in/thiago-pagotto/MARCOS NORIYUKI | https://www.linkedin.com/in/marcos-noriyuki-miyata/

Kubernetes Podcast from Google
GKE Turns 10 Hackathon, with Amie Wei

Kubernetes Podcast from Google

Play Episode Listen Later May 7, 2026 16:36


Amie Wei is a Sr. Solutions Engineer at HashiCorp and was the winner of last year's GKE Turns 10 Hackathon. It was Amie's first time entering a hackathon and she ended up bringing the prize home with a Cart-To-Kitchen AI Assistant. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod - bluesky: @kubernetespodcast.com   News of the week Kubernetes 1.36 codename Haru is here LLM-d is a CNCF Sandbox project CNCF Certification Advancement & Recertification Experience program (CARE) Agones is a CNCF Sandbox project Links from the interview Amie Wei on LinkedIn GKE turns 10 Hackathon Hackathon winner announcement Online boutique sample Bank of Anthos sample The Cart-to-Kitchen AI Assistant on GKE  

Talk Python To Me - Python conversations for passionate developers
#547: Parallel Python at Anyscale with Ray

Talk Python To Me - Python conversations for passionate developers

Play Episode Listen Later May 6, 2026 59:16 Transcription Available


When OpenAI trained GPT-3, they didn't roll their own orchestration layer. They used Ray, an open source Python framework born out of the same Berkeley research lab lineage that gave us Apache Spark. And here's the twist: Ray was originally built for reinforcement learning research, then quietly faded as RL hit a wall. Until ChatGPT showed up. Suddenly reinforcement learning was back, as the post-training step that turns a raw language model into something genuinely useful. Edward Oakes and Richard Liaw, two founding engineers behind Ray and Anyscale, join me on Talk Python to tell that story. We'll trace Ray from its RISE Lab origins at UC Berkeley to powering some of the largest training runs in the world. We'll talk about what Ray actually is, a distributed execution engine for AI workloads, and how a few lines of Python become work running across hundreds of GPUs. We'll cover Ray Data for multimodal pipelines, the dashboard, the VS Code remote debugger, KubRay for Kubernetes, and where Ray fits alongside Dask, multiprocessing, and asyncio. If you've ever stared at a single-machine Python script and thought, "there has to be a better way to scale this", this one's for you Episode sponsors Sentry Error Monitoring, Code talkpython26 AgentField AI Talk Python Courses Links from the show Guests Richard Liaw: github.com Edward Oakes: github.com Ray: www.ray.io Example code (we used for walk-through): docs.ray.io Getting Started with Ray: docs.ray.io Ray Libraries: docs.ray.io kuberay: github.com Watch this episode on YouTube: youtube.com Episode #547 deep-dive: talkpython.fm/547 Episode transcripts: talkpython.fm Theme Song: Developer Rap

Packet Pushers - Full Podcast Feed
TCG075: Say the Thing: How the Network Automation Conference Circuit Shaped One SP Operator's Voice

Packet Pushers - Full Podcast Feed

Play Episode Listen Later May 6, 2026 48:53


Eyvonne and William sit down with Joseph Nicholson, a Network Operations Engineer with NTT DATA, to share how public speaking transformed his career and technical experience. Joseph went from a terrifying ten minute lightning talk at AutoCon 2 to presenting 45-minute sessions at conferences like NANOG. Together they discuss how conversations in conference halls influenced... Read more »

Ardan Labs Podcast
Creativity, AI Systems, and True Foundry with Nikunj Bajaj

Ardan Labs Podcast

Play Episode Listen Later May 6, 2026 86:55


In this episode of the Ardan Labs Podcast, Ale Kennedy talks with Nikunj Bajaj, co-founder of True Foundry, about his journey from India to Silicon Valley and his work building modern AI infrastructure. Nikunj shares insights from his time at Meta, where he worked on large-scale machine learning systems, and how those experiences shaped the foundation of True Foundry.The conversation explores the evolution of AI—from early machine learning systems to today's generative models—and the infrastructure required to support them. Nikunj also discusses leadership lessons, long-term thinking, and what it takes to build and scale an AI startup in a rapidly changing landscape.00:00 Introduction07:38 IIT and Early Academic Journey16:45 Internships and Career Decisions29:15 Research at UC Berkeley36:37 Entering the Workforce and AI Evolution40:16 Leadership Lessons from Meta46:42 Leaving Meta and Starting a Startup52:50 Building During the Pandemic01:00:44 Founding True Foundry01:06:31 Product Development and Early Challenges01:10:31 Evolution of AI Infrastructure01:13:35 Vision for True Foundry01:20:55 Reflections and Lessons LearnedConnect with Nikunj: LinkedIn: https://www.linkedin.com/in/nikunj-bajaj-10476824/Mentioned in this Episode:True Foundry: https://www.truefoundry.comWant more from Ardan Labs? You can learn Go, Kubernetes, Docker & more through our video training, live events, or through our blog!Online Courses : https://ardanlabs.com/education/ Live Events : https://www.ardanlabs.com/live-training-events/ Blog : https://www.ardanlabs.com/blog Github : https://github.com/ardanlabs

The Cloud Pod
TCP-Talks: Keep the Raccoons Out: Service Mesh, MCP, and Securing Agentic Workloads.

The Cloud Pod

Play Episode Listen Later May 5, 2026 38:05


Keep the Raccoons Out: Service Mesh, MCP, and Securing Agentic Workloads With William Morgan, CEO of Buoyant and creator of Linkerd Linkerd just turned 10, so we brought on the person who built it and coined the term “service mesh” in the first place. William Morgan joins Jonathan and Justin to talk about where service mesh came from, where it’s going, and the very specific kind of chaos that agentic AI is about to unleash on anyone who owns a Kubernetes cluster. The short version: lock your doors, because the raccoons are coming. “Our job is basically to make Linkerd as boring as possible.” William traces Linkerd’s origins back to Twitter’s infrastructure work between 2010 and 2014, when a Ruby on Rails monolith turned into a sprawling distributed system — the same problems we have today, just a different decade. As the fifth project ever to join the CNCF, Linkerd has had a front-row seat to the ecosystem’s evolution, and William explains why his actual goal these days is to make it as boring as humanly possible: the kind of dependable infrastructure layer you can trust to still be around in another 90 years. That’s also why he’s not adding AI to Linkerd — an infrastructure layer has to be fast, lightweight, and predictable, and generative AI is the opposite of all three. “At some point your agentic workload is going to figure out how to delete the production database. And it’s going to try it.” The heart of the conversation is what the AI wave means for the platform teams who own the clusters. Developers just got an army of AI assistants, and that has real consequences for CI/CD, code quality, and blast radius. William digs into the boundary problem — agentic workloads are untrusted but need access to your most important systems — and why zero trust has suddenly stopped being optional now that the code hitting your database no longer clears peer review and a security committee. Along the way they get into cache-aware routing that can take a 13-second inference call down to one, the still-unsolved mess of agentic identity, and why we keep anthropomorphizing these tools and letting our guard down. “If you don’t use Linkerd, your data system will be overrun by raccoons.” Finally, they turn to MCP — building a catalog of MCP servers, detecting tool calls, and adding DLP-style protection in front of the services an agent never sees. But William’s real point is that MCP is something of a red herring for a much older problem: uncontrolled access to your APIs. Whatever protocol you use, once an unconstrained workload is loose in your environment, you need an immune system to keep it in check. Links and resources: Linkerd:

DevOps Paradox
DOP 348: Now It's Time to Panic

DevOps Paradox

Play Episode Listen Later Apr 29, 2026 50:05


Something flipped this year. Chatbots were a toy. Useful sometimes, but a toy. Agents are not. Agents take actions, hold credentials, write code, move Kanban cards, and run on cron schedules. The window between "this is interesting" and "this is existential" has closed faster than cloud, faster than Kubernetes, faster than any prior shift. Viktor's read is blunt. One person can now build a bigger business than most mid-size companies have ever managed. That is not hyperbole -- that is a description of what is already happening with a handful of solo-built projects shipping in weeks what used to take a hundred-person org years. The thesis: panic. Not because the sky is falling, but because larger companies cannot turn around overnight, and the gap between the people who get this and the people who are still scheduling meetings about scheduling meetings is widening every week. The conversation walks through what each big provider is actually doing. AWS is not pretending to compete on models -- they want the inference revenue. Microsoft is lost in Copilot button-stuffing. Google is quietly winning on three layers at once: TPUs, models, and inference infrastructure. Anthropic is on the path to becoming the next defining IPO, while OpenAI looks like a place to take money out of, not put more in. The Linux Foundation's new Agentic AI foundation got Anthropic's MCP, Block's Goose, and OpenAI's AGENTS.md spec. Viktor's reaction: those are heavy hitters donating not very much. Then it gets practical. Vendor-provided agents are like hiring a genius engineer who knows nothing about your company. Public skills are mostly nonsense -- if it is in public training data, the model already knows it; what is missing is everything specific to you, which is exactly what no public skill can provide. OWASP just published an Agentic AI Top 10 and most of it is least-privilege rebranded for agents. The cost story is also not what the marketing says: a 00 monthly subscription will not last a day for anyone working full-time with agents. There is a true story in here about a leaked token that turned a 00 monthly spend into 5,000 in two days. The hardest part of the episode is the part nobody likes hearing. If your output stays the same in 2026, you are in trouble. If you multiply your output, you are fine. Companies have always wanted to do more than they could afford to do. Now they can. The middle is where careers used to live. The middle is where the cuts are going.   YouTube channel: https://youtube.com/devopsparadox   Review the podcast on Apple Podcasts: https://www.devopsparadox.com/review-podcast/   Slack: https://www.devopsparadox.com/slack/   Connect with us at: https://www.devopsparadox.com/contact/

Packet Pushers - Full Podcast Feed
TCG074: From SOAR to Agents: Why Practical Automation Has to Survive Contact with Real Infrastructure

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Apr 22, 2026 44:49


Eyvonne Sharp and William Collins speak with Sif Baksh, Principal Solutions Architect at Tines, to discuss the power of automation. Sif shares some personal stories of how he has been able to use automation to innovate and modernize networking operations. They also discuss the importance of learning AI and using it as a tool, how... Read more »