How are innovations introduced in banks, and how in a neurosurgery ward? In episode 371 of the Pojačalo podcast, Ivan continues his conversation with Nikola Kalinović, partner at the Evoke agency and the Design Thinkers Academy. Nikola takes us through his dynamic life path, from ambitions on the football pitch, through freelancing, to introducing the Design Thinking methodology and setting up serious business systems in our market. We find out why corporations are sluggish when it comes to innovation, how to recognize users' real needs, and how to build a brand with a purpose. In a deeply personal part of the conversation, Nikola shares his experience of facing a spinal tumor and how his stay in the neurosurgery ward taught him the most important lessons about empathy, entrepreneurship, and gratitude for what we have. The right episode for anyone looking for concrete business advice and strong life inspiration. If you missed the beginning of the story, we warmly recommend watching the first part of the conversation first, to get the full picture of Nikola's path - https://www.youtube.com/watch?v=mJIH0cTizYE

What we talked about:
- Talanoa and mentor Dalibor Vasiljević
- Discovering design thinking
- Erste Bank and loans for pensioners
- Organizations' resistance to the new
- Why talking to users is key
- The tobacco industry and the trap of global personas
- Why large corporations are hard to change
- Projects played away from home: Dubai and the Czech Republic
- Udruženje mladih privrednika Srbije (Association of Young Entrepreneurs of Serbia)
- The return to football
- Partnership at Evoke
- Putting processes and tools in order

The production of this episode was supported by our friends and sponsors:
- Epson Srbija - https://www.epson.rs
- Orion telekom - https://oriontelekom.rs
- Smilies - https://smilies.rs

Thank you for your trust and support!
Support us on BuyMeACoffee: https://bit.ly/3uSBmoa
Read the transcript of this episode: https://bit.ly/43J6rO4
Visit our site and sign up for our mailing list: http://bit.ly/2LUKSBG
Subscribe to our YouTube channel: http://bit.ly/2Rgnu7o
Follow Pojačalo on social media:
FB: https://www.facebook.com/PojacaloRS/
IG: https://www.instagram.com/pojacalo.rs/
X: https://x.com/PojacaloRS
LN: https://www.linkedin.com/company/pojacalo
TikTok: https://www.tiktok.com/@pojacalo.rs
In this episode, Corey Quinn sits down with Dexter Horthy, CEO and Co-founder of HumanLayer, to unpack what engineers are getting wrong about AI, especially when it comes to coding agents. From the obsession with “just throwing more tokens at the problem” to the reality of building scalable AI workflows, Dexter shares hard-earned insights on how to actually push models to their limits. They dive into the evolution of developer workflows, the rise of AI-powered software factories, and why understanding context and verification matters more than raw model power. If you're building with AI or trying to, this episode will challenge how you think about what these systems can (and can't) do.

Show highlights:
(00:00) Throwing Tokens Too Far
(01:04) Meet Dexter Horthy
(01:52) Personal AI Benchmarks
(04:12) Human Layer Race Condition
(05:59) Rewrites and Tech Debt
(07:19) Software Factories Mindset
(10:20) Verifiable Problems and Token Limits
(13:45) Agents in the Trenches
(18:05) GitHub at Agent Scale
(26:23) Safety Ethics and Closing Thoughts

About Dexter: Dexter Horthy is the CEO and Co-Founder of HumanLayer, where he helps engineering teams tackle complex problems in large codebases using coding agents. Previously, he worked in DevOps, SRE, and Solutions Engineering at Replicated, and contributed to lunar navigation software at NASA JPL. Outside of work, he's a fan of tacos and burpees, though not necessarily in that order.

Links:
LinkedIn: https://www.linkedin.com/in/dexterihorthy/
Website: https://humanlayer.dev

Sponsored by: duckbillhq.com
More investment for schools in Edomex
SRE promotes four ambassadors
Machado will seek the presidency in Venezuela
More information in our podcast
#grc
In Elixir Wizards S15E04, Charles Suggs and Emma Whamond are joined by Somtochi Onyekwere, a software engineer at Fly.io and contributor to the Corrosion distributed database project, to talk about distributed systems, infrastructure resilience, and the growing fragility of centralized cloud platforms. We discuss what recent outages across major providers reveal about modern infrastructure and why more teams are starting to rethink assumptions around reliability, failover, and system design. Somtochi explains how Fly.io approaches geographic distribution, eventual consistency, and replication across nodes, along with the trade-offs that come with building systems this way. The conversation explores CRDTs (Conflict-free Replicated Data Types), consensus, split-brain prevention, and what actually happens when distributed systems fail in production. We also talk about testing strategies, rollback planning, property-based testing tools, and how teams can reduce blast radius when things inevitably go wrong. Along the way, we discuss AI infrastructure, sandboxing AI agents, and how newer workloads may add pressure to already centralized systems. The episode closes with practical advice for developers who want to build more resilient applications without over-complicating their architecture. 
Topics Discussed in this Episode:
Corrosion and distributed database replication
Centralized cloud fragility and recent outage patterns
Distributed systems versus traditional cloud architectures
Multi-region deployment strategies for Phoenix applications
CRDTs and conflict resolution in distributed systems
Eventual consistency versus strict consistency tradeoffs
Consensus, leader election, and split-brain prevention
Testing failover and recovery scenarios
Property-based testing and Antithesis
Rollback planning for database schema migrations
Reducing blast radius through system isolation
Health checks and blue-green deployment strategies
Fly Proxy request routing and replay behavior
Cross-region synchronization and replication challenges
Single points of failure inside “redundant” systems
Backup restoration testing and disaster recovery planning
Network partitions and failure handling in production
Infrastructure monitoring and operational visibility
AI infrastructure workloads and operational strain
Sandboxing and securing AI agents
Sprites and AI workflows at Fly.io
Latency improvements from geographic distribution
Distributed systems tradeoffs in real-world environments
Transitive dependency failures across cloud providers
Practical resilience strategies for modern engineering teams

Links Mentioned:
https://fly.io
https://github.com/superfly/corrosion
https://docs.gitops.weaveworks.org/
FluxCD https://fluxcd.io/
Fly.io Stateful Sandbox Environments https://sprites.dev/
Cloudflare Workers AI Inference Platform https://www.cloudflare.com/products/workers-ai/
“An AI Agent Just Destroyed Our Production Data. It Confessed in Writing” Twitter post from PocketOS founder: https://x.com/lifeof_jer/status/2048103471019434248
Oct 2025 AWS Outage https://www.theguardian.com/technology/2025/oct/24/amazon-reveals-cause-of-aws-outage
Dec 2025 Cloudflare Outage https://www.theguardian.com/technology/2025/dec/05/another-cloudflare-outage-takes-down-websites-linkedin-zoom
July 2025 Crowdstrike Outage https://www.ibm.com/think/news/recent-crowdstrike-outage-what-you-should-know
March 2026 Stryker Cyber Attack https://www.stryker.com/us/en/about/news/2026/a-message-to-our-customers-03-2026.html
https://aws.amazon.com/
https://cloud.google.com/
https://azure.microsoft.com/en-us
https://fly.io/docs/elixir/
CRDTs!! https://smartlogic.io/podcast/elixir-wizards/s13-e03-local-first-liveview-svelte-pwa/
https://antithesis.com/docs/resources/property_based_testing/
https://hex.pm/packages/proper
Take the 2026 AI Engineering Survey and get >$2k in credits and AIE WF tickets!

This was recorded before Railway suffered a major GCP outage on May 19, despite being a multi-AZ, multi-zone mesh ring with HA fiber interconnects between their metal, GCP, and AWS, because workload discoverability was unintentionally still tied to GCP. All has been resolved with a post-mortem.

Railway did not start as an AI infrastructure company.

It was founded in 2020, years before agents became the default way people thought about deploying software. Jake Cooper, formerly at Bloomberg and Uber, started Railway with a simple obsession: the activation energy to ship something to production should be near zero. Push code, get a URL, iterate. No Docker files, no Kubernetes manifests, no Ansible scripts stacked on Ansible scripts.

For years, this was a slow grind. Railway spent its first 18 months hand-acquiring its first 100 users, with Jake personally greeting every Discord signup on a second monitor.

Today, Railway has raised $124m and is growing very fast. A 35-person team supports 3 million users, adding roughly 100,000 signups a week. Their bare metal data centers have a 3-month payback period vs. renting in the cloud, with 70% margins funding aggressive cloud bursting when needed. The servers they own have actually appreciated in value as RAM prices have climbed, basically meaning the value of their hardware now exceeds the capital they've raised.

From rebuilding Railway's network overlay over a weekend to moving the vast majority of workloads onto its own bare metal data centers, Jake Cooper is trying to build a new cloud for an agent-native world. 
In this episode, Railway's founder and “conductor” joins swyx and Alessio to unpack why the next era of software infrastructure is not just “Heroku but newer,” what agents need that humans did not, and why the old deployment loop of Git, PRs, CI/CD, and static cloud resources may be heading for a rewrite.

We go deep on Railway's infrastructure stack: own-metal data centers, three-month cloud payback periods, cloud bursting, data center debt, Railpack, Nixpacks, Temporal, feature flags, Central Station, content-addressable filesystems, agent-safe production forks, and why the CLI may become more important than the canvas in an agent world. Jake also shares the founder journey behind Railway, how the company survived losing $500K/month, why it now serves millions of users with only 35 people, and why he believes the pull request is dying.

We discuss:
* How Railway went from a slow six-year grind to adding 100,000 users a week
* How Railway thinks about agents as the next dominant software species
* Why agents need version control, observability, compute, storage, and orchestration at 1000x scale
* The economics of Railway's own-metal data centers and three-month payback
* How Railway uses cloud bursting while scaling its own infrastructure
* Why data center debt can be a better tool than venture debt for infra startups
* Central Station, Railway's internal system for clustering customer feedback and incidents
* Why responsible disclosure and over-communication matter for platforms
* Why feature flags, progressive rollouts, and shadow traffic are essential for agents
* Temporal's strengths, pain points, and why workflows matter for agents
* Railpack, Nixpacks, Nix, and lazy-loaded content-addressable filesystems
* Why “cattle, not pets” may change if you can clone the pets
* Why Railway is building a new cloud from scratch instead of copying hyperscalers
* The solo founder path, focus, writing, and how Jake thinks about company building

Railway:
* Website: https://railway.com/
* X: 
https://x.com/Railway

Jake Cooper:
* LinkedIn: https://www.linkedin.com/in/thejakecooper/
* X: https://x.com/JustJake

Timestamps
00:00:00 Introduction: What Is Railway?
00:02:07 Jake's Path to Railway
00:06:13 Railway's Six-Year Growth Story
00:08:52 Rebuilding the Business After the Free Tier
00:11:17 Agents as the Next Software Platform
00:13:29 Railway's Infrastructure Philosophy
00:15:42 Bare Metal, Cloud Economics, and the Compute Crunch
00:17:22 Cloud Bursting and Five-Cloud Networking
00:20:20 Data Center Debt and Infra Financing
00:23:31 Data Centers in Space
00:25:24 What Agents Need From Infrastructure
00:28:24 CLIs, Canvas, and Agent-Native UX
00:35:15 Central Station, Incidents, and Responsible Disclosure
00:40:30 Safe Rollouts, SRE Agents, and Production Forks
00:45:00 AI SRE, Specs, Code, and Tests
00:48:24 Self-Replicating Infrastructure and the New Serverless
00:53:18 Heroku, Temporal, and Workflow Engines
01:04:07 Railpack, Nixpacks, and Lazy-Loaded Filesystems
01:06:01 Coding Agents, Token Spend, and Roadmap Acceleration
01:10:56 The Pull Request Is Dying
01:12:28 Feature Flags and the Agent-Era SDLC
01:16:15 Cattle, Pets, and Cloning Machines
01:19:29 Solo Founder Lessons
01:24:12 Focus, GPUs, and Building a New Cloud
01:28:20 Closing Thoughts

Transcript

Alessio [00:00:00]: Hey, everyone. Welcome to the Latent Space Podcast. This is Alessio, founder of Kernel Labs, and I'm joined by Swyx, editor of Latent Space.

Swyx [00:00:10]: Hey, hey, hey. Today we're in the studio with Jake Cooper of Railway.

Alessio [00:00:14]: Conductor of Railway.

Swyx [00:00:15]: Conductor at Railway. Yeah.

Alessio [00:00:16]: Choo-choo.

Swyx [00:00:17]: Do you actually have that anywhere, like on your business card?

Jake [00:00:20]: We call some of our volunteer moderators conductors. I don't have a business card. We're not that big yet. At some point I will. 
I got handed a nice business card from the Supermicro folks, and I was like, “Damn, this is pretty official.”

Swyx [00:00:30]: Business cards are coming back.

Jake [00:00:32]: They're cool. They're hip. The conductor thing is good. We're trying to figure out what we want to call each other internally. Some people think it's super cringe and say, “You don't need a name for people internally.” Some people want to call each other something. We still don't have a really good one.

Jake [00:00:55]: We've got New Railcrews, Trainiacs. Nothing has stuck yet.

Swyx [00:01:00]: I like Trainiac. Trainiac sounds good. Railwayians. For those who don't know, what is Railway? Let's give people a crisp definition up front.

Jake [00:01:09]: Railway is the easiest way to ship anything. You go to the canvas, or you talk with Claude, and you say, “Deploy a Postgres instance, deploy my GitHub repository, run this code,” and you're off to the races.

Swyx [00:01:22]: You've got a nice animation on the landing page.

Jake [00:01:24]: Thank you. None of my work, by the way. They don't let me touch the design stuff anymore.

Jake [00:01:25]: We want to make it trivially easy not just to deploy things, but to evolve applications over time. Most tooling right now stacks entropy on top of entropy: Docker, Kubernetes, Ansible scripts, and all these other things. If we can version all of your software and keep track of all the changes, then we can make it trivial to clone environments, fork into a parallel universe, get copies of production data, get copies of any services, make changes, validate them, and collapse them back in without reproducing everything across a staging environment.

The Railway Origin Story: From Uber Systems to a New Cloud

Swyx [00:02:07]: I was looking at your background: Bloomberg, Uber. Nothing immediately stands out as, “This guy is going to found the next great platform as a service.” What prepared you for Railway?

Jake [00:02:21]: It was curiosity to keep going deeper. 
I started out on front-end stuff, working on Wolfram Mathematica and porting it over. Then I briefly moved to Bloomberg, then toward Uber and distributed systems, taking the Jump Bikes systems and moving them to a distributed system built on top of Cadence, the pre-Temporal Temporal.

Swyx [00:02:44]: Which, by the way, I'm happy to talk about, pros and cons.

Jake [00:02:48]: Totally.

Swyx [00:02:51]: But let's do the Railway story.

Jake [00:02:52]: It has been a continual step of wanting an experience. Whether it's walking up to a bike, unlocking it, and having it work frictionlessly, or something else, the depth required to make that happen follows from the experience. A lot of the work I do, and a lot of the team does, is in service of that experience. We fundamentally don't care how deep we have to go. We will swim to the bottom of the swimming pool to get the experience.

Jake [00:03:17]: I don't have a physics PhD. I did an EECS degree. It has always been about figuring out the next step: how do we get there? That's what led to starting Railway for that experience and then moving all the way to bare metal data centers. I was adding patches to the kernel this week to get the experience there because I can see how much better it can be.

Swyx [00:03:49]: Other patches to the Linux kernel this week?

Jake [00:03:51]: Yeah. Not upstream. Our fork.

Swyx [00:03:52]: That's a flex. Railpack? No, this is different. This is the OS on top of Railpack?

Jake [00:03:57]: No, this is an actual kernel patch. It's always literally: what do we have to do to get that experience? Then figure it out. Anything is figureoutable.

Swyx [00:04:10]: Would you send the patch upstream, or does it not fit other use cases?

Jake [00:04:13]: Maybe. We have to work out the experience internally. It has to do with the storage layer we're building for some of the agentic stuff. 
Maybe it'll be useful upstream, but it's deeply useful for us internally.

Open Source, Forks, and Non-Deterministic Versioning

Swyx [00:04:29]: You mentioned open source before. How do you think about starting from open source, and then coding agents letting you do a lot more from forks of it?

Jake [00:04:38]: GitHub's original sin is that it's almost a series of broken pointers. You have this thing, then you clone it, and now you've lost the whole upstream. How do we make it trivial for people to modify really small pieces of it?

Jake [00:04:51]: We think of Git in a discrete sense: I've either made a change and merged upstream, or I haven't. What would it look like if it were percentage-based, a little more non-deterministic, or a stream of changes that users traverse as a percentage rolled out in general and then rolled all the way up?

Jake [00:05:13]: We have the open-source kickback program and let you deploy templates because we want to make it trivial for people to version these shards over time. It solves a large problem around authentication, authorization, and security. NPM has a way to define, “Don't take any new packages.” The ideal end state is that you roll out progressively to users with the minimum impact zone and continue rolling up. JPMorgan should probably be the last one on the patch line, for all our sakes, because our money and livelihoods are there.

Jake [00:05:53]: It's okay if Johnny Vibe Coder gets a broken patch because there's so much entropy in the system that the rubber has to meet the road at some point. You have to test at varying levels.

The Long Grind: First Users, Free Tier, and Making the Business Work

Swyx [00:06:13]: I wanted to pull up this glorious chart, which is your usage or number of daily signups?

Jake [00:06:22]: Daily signups, I think.

Swyx [00:06:24]: You started six years ago. It was a slow grind, and now you're on a rocket ship. 
You say, “Don't doubt your fight and don't quit.” Maybe pick out certain points that were key inflections for the company.

Jake [00:06:40]: At the start, it's about getting your first 100 users, hell or high water. We had a website and a support link. The support link was the Discord channel. I had notifications on with two monitors: the monitor I was working on and the other monitor with Discord. If anybody came in, I was immediately like, “Hey, how's it going?” It was rare, so getting those first 100 users to come back was the start.

Jake [00:07:14]: Then you build a consultancy factory because users want all these things. You have to go back to the board and ask, “What is the actual product offering I want to build on top of this?”

Jake [00:07:28]: VCs want charts that always go up and to the right, but in reality you don't necessarily want charts that look like that. For us, there have been periods of expansion where we add features to test use cases, and periods of compaction where we ask, “If the experience we have is good, how do we make it significantly better?” Maybe we strip out features that don't fit our ICP anymore.

Jake [00:07:57]: The boom from 2022 to 2023 came from the free tier. Everybody under the sun was using it.

Swyx [00:08:09]: A lot of Reddit bots and Discord bots.

Jake [00:08:12]: And crypto miners. When you build an open product on the internet where anybody can sign up, the internet is a horrible place with so many things. You go through periods of asking, “How do I reach as many people as possible?” Then, “How do I fit the exact use case for the people who really matter and are really excited about this specific thing?”

Jake [00:08:39]: Then there was a two-year period of making the actual business work. During the free-tier era, we were losing about half a million dollars a month.

Swyx [00:08:59]: On a $20 million bank account.

Jake [00:09:02]: On a $20 million bank account with maybe $50,000 a month in revenue. That's a horrible business. 
I don't know how anybody invested. But you have to go through it and say, “We have an experience people love, but the business has to work.”

Jake [00:09:17]: There are two schools of thought. You can run the horrible business all the way up with bad margins, or you can go back and make it work. We've always wanted a super lean team. We're 35 people right now. It's very small.

Swyx [00:09:36]: Supporting three million already?

Jake [00:09:38]: Yeah. We're adding 100,000 users a week right now, so it's growing fast. We don't want to add headcount for the sake of headcount or throw bodies at problems. We want to build systems. It's hard to build systems during expansion because you're adding things to the system because people are asking for them or things are breaking.

Jake [00:10:00]: We had to cut off the free users for a little while, rebuild the business, and make sure it worked. We want to reach as many people as possible because software is important. It's become difficult to create things in the physical world, so it's important to make it easy for people to build in the virtual world and have access to creation. But there are legs to that journey.

Jake [00:10:30]: You can see divots in the charts. If you follow between 2025 and 2026, it's either summer or winter. People go on holiday with family.

Swyx [00:10:50]: It affects that much?

Jake [00:10:51]: Yeah. It's kind of B2C and kind of B2B. People are shipping constantly, then they stop. Our activation curve now shows more people activating on weekdays because we have more business users, so it smooths out over time.

Agents as the New Interface to Deployment

Swyx [00:11:17]: Was there a point where you started prioritizing AI development or agent development?

Jake [00:11:24]: We've prioritized agentic as a top-of-funnel thing. 
Over the last six months, we've deeply prioritized agentic as a mechanism to build and deploy things because we believe the curve is so steep and that is how people will build and deploy software.

Jake [00:11:42]: It almost fundamentally doesn't matter whether this is dot-com or not because we're all on the internet anyway. If agents are going to deploy a bunch of things and we hit an inference wall at some point, we'll fix those problems. The dominant species over the next 10 years is that we've moved from assembly to C to C++ to JavaScript to words. You're going to need to close that loop.

Swyx [00:12:13]: When you say this is dot-com, did you mean buying the domain, or the general case?

Jake [00:12:17]: I mean the dot-com era, when companies had a huge run-up because people understood the internet was important. Then they hit bottlenecks, fundamental laws of physics, math didn't work, and everybody came back down to earth. But it didn't matter because the internet became so impactful. If you operate on a long enough time horizon, you should build these things anyway because you can see where it's going.

Jake [00:12:45]: That's where I think a lot of agent stuff is. You get to a point where you're running thousands of agents in parallel. What is the inference cost? What is the compute cost? How do you make that efficient? How do you coordinate all this? We have issues coordinating humans; we don't even have good tooling for that. Now we have to figure out how to get agents to coordinate, safely version changes, and know when to raise their hand for someone to intervene. Otherwise it becomes an interrupt factory.

Railway's Infrastructure Thesis: Network, Compute, Storage, and Metal

Swyx [00:13:19]: Let's go right into the technical side. What are the core infrastructure or architectural beliefs of Railway that allow you to do what you do?

Jake [00:13:29]: The primitives matter a lot for us. We need network, compute, storage, and orchestration around it. 
You need control over a lot of those things. We've talked a lot about how we don't really use Kubernetes because we want higher-order control to place workloads in very specific places.

Jake [00:13:48]: The reason is that you have to be very efficient with agents: memory reuse and all these other things, or you're going to massively blow up your cost structure. Being able to rack and stack your own servers and build your own metal unlocks performance and cost. Experiences where you're running 1,000 agents in parallel are not massively cost prohibitive.

Jake [00:14:13]: Token use and compute use are blowing up. Over time, those things have to get a lot more efficient. You can get a lot of margin to make those experiences solid by building your own metal. That's all in service of offering a differentiated experience to as many people as humanly possible.

Swyx [00:14:51]: You have a data center in Singapore.

Jake [00:14:53]: Yeah. We have two in every other region now. In Singapore, we're adding a second one in Q3.

Swyx [00:14:58]: What's it like? I've never built a data center. Do you go to Equinix and say, “I want some slots?”

Jake [00:15:05]: Yeah. Equinix. You basically go and say, “I want power and I want a cage.” They say, “Great, here's what it's going to be.” You rent the cage for a period of time, fill it with racks and servers, and hook up internet to it. That's all the pieces.

Swyx [00:15:36]: Then you handle everything else.

Jake [00:15:37]: You handle everything else.

Swyx [00:15:39]: What's the math versus clouds doing it for you?

Jake [00:15:43]: If we rented in the cloud, our payback period when we go to metal is about three months.

Swyx [00:15:50]: Which is crazy.

Jake [00:15:51]: It's nuts. That's four years of depreciated hardware. You're going to see a lot of this compute crunch because hyperscalers are buying up a lot of stuff. 
We're working directly with OEMs, resellers, and people building these machines: Supermicro, Dell, and others.

Jake [00:16:11]: Upstream, there's a bunch of supply pressure. When we raised our last round, between deploying capital for servers and now, the amount of money we've raised is less than the amount of money we have in the bank plus the value of the servers because the servers have appreciated as RAM has gone up. It's nuts how valuable hardware has become.

Jake [00:16:50]: If you look at hyperscalers, they deployed around $80 billion of capital expenditures this year, and next year will be more. That's a massive infrastructure build-out. You look at that and think it's crazy that they're spending way more than the Manhattan Project. But if every person is going to run dozens or hundreds of agents in parallel, you have no conceptual idea how much compute is required to make that experience happen, even if you're deeply efficient and sharing resources. And that doesn't even count inference.

Swyx [00:17:22]: How do you plan the build-out? The growth chart is so vertical. Are you usually at 100% utilization as soon as racks are live? How far ahead are you planning?

Jake [00:17:33]: We still maintain cloud presence for bursting. We work with AWS, GCP, and a few other clouds. We can rent, and then the moment we get space or power, we compact those workloads off the cloud. We started on the clouds, then built a system to migrate to our own metal. There's nothing that says you can't continually do that again, and that's exactly what we do. We never want to be compute constrained.

Jake [00:18:09]: At the start of the year, we actually became compute constrained because one upstream provider wasn't able to give us quota at the rate we needed, and the hardware was slower. I spent a weekend rebuilding our entire network overlay so we could straddle five clouds: Oracle, AWS, ourselves, GCP, and one other one. 
We can do more than that now.

Jake [00:18:38]: We got into a spot where we were trying to pack instances tight because we couldn't get enough compute. That led to a few reliability issues, which are now past us. I made a tweet pointing out that it's becoming harder and harder to acquire compute at the rate these models need to acquire compute. We got bit by it.

Swyx [00:19:15]: How do you think about pricing knowing you might not have your own metal available at all times? Are you pricing assuming you need extra margin if you end up going into the cloud?

Jake [00:19:26]: Because we've built out our metal data centers, our margins on metal are around 70%. We can deeply subsidize the cloud business if we want to scale at a reasonable rate. We have a few levers: metal, which makes the margins; cloud burst; debt to buy servers; and venture capital. It's an interesting operational problem: how much cash do we have, how much should we raise, how quickly can we deploy it, and can we scale revenue as quickly as we scale compute?

Jake [00:20:05]: If we continue making it trivially easy for people to build and deploy, then the faster we close that loop and the more operationally excellent we are with capital, the faster the business can scale. It's almost a straight linear deployment rate.

Financing Infrastructure: Hardware Debt, VC, and Operational Leverage

Swyx [00:20:20]: I think infra startups raising debt is a tool people don't utilize enough or know enough about. What can you tell us about that? Is it secured against your CPUs?

Jake [00:20:32]: It's secured against our hardware.

Swyx [00:20:37]: What rates do you get? Who are the lenders?

Jake [00:20:39]: We pay prime plus a spread, and we can refinance any of the debt as rates go down. The terms are pretty good. The unfortunate thing is that Twitter has no nuance, so people say, “Venture debt bad.” But as with all things, there are specific tools and areas where you can be deliberate instead of using one tool as a hammer. 
Venture capital is not the hammer for everything. You have to explore and figure out what works.

Swyx [00:21:12]: VC is usually the most expensive financing you can get.

Jake [00:21:15]: Yeah. I also think people think about VC incorrectly from a capital-raising perspective. Most people think, “How do I raise as much money as possible from whoever is probably the best I can get at that time?” That's close to right, but what we've tried to do is figure out what unfair advantage we can buy with that equity.

Jake [00:21:34]: It's the most expensive equity you're going to give away at that point in time, assuming the company keeps getting better. How do you use it to work with someone stellar who complements you? In the seed stage, I had never started a company. Ray Tonsing had good advice, and I could text him all the time. He was really fast. Awesome.

Jake [00:22:01]: Then with John and Erica at Unusual, they said, “You roughly know what you're doing building a product. We'll mostly leave you alone and be available for advice.” Amazing. Then we got to Series A and the business was an operational tire fire because we didn't know how to scale a business. Work with Erica, and Jordan is over at Redpoint, so bonus.

Jake [00:22:28]: Now we've raised from TQ and FPV as we're moving into enterprises. Every step of the way, we've asked: who can we partner with at this specific time to unlock the next section of the journey? I don't know enterprise sales. As an engineer, I can eyeball what features we might need, and we have wonderful people internally who can help. But you want boardroom dynamics where everyone is aligned and asking, “How do we win this?” instead of bickering about strategy.

Data Centers in Space and the Physics of Compute

Swyx [00:23:31]: You had a tweet about data centers in space. Why no data centers in space?

Jake [00:23:37]: It's not “no data centers in space.” My hot take is that I think it is solvable. 
I've just never seen anybody solve it.

Swyx [00:23:49]: You said, “How are you going to dissipate that much heat in a vacuum?” You're making a physics claim.

Jake [00:23:55]: I haven't seen anybody prove how you're going to dissipate that much heat in a vacuum. It doesn't mean it's not possible. It just means nobody has brought it up yet.

Swyx [00:24:05]: Astrophage.

Jake [00:24:06]: I don't know what that is.

Swyx [00:24:07]: The Martian thing. Okay, you're very logical.

Jake [00:24:09]: It could work. A lot of people are putting the cart before the horse. They say, “We're going to put data centers in space.” Okay, but how? “We have time to figure it out.” It's like in The Martian where they ask how they're going to intercept something and say, “We'll figure it out.”

Swyx [00:24:36]: Making a bet on human invention is weird because you blindly trust that it can be solved. But with physics, there are first-principles bounds you can put on it. Maybe not. Maybe you're asking to travel through time or break a fundamental thermodynamic law.

Jake [00:24:57]: I don't know how VCs do this either. How do you know what's not possible and a grift versus what's possible but sounds completely insane? “We're going to put data centers in space.” Coin flip as to which it is, and I guess you'll know in 10 years. That's one cycle.

What Agents Need: Versioning, Observability, and 1,000x Scale

Swyx [00:25:23]: Moving back to agents. The branching, fast spin-up, and orchestration you do feels like pre-work that happened to be exactly what agents want. What do agents want differently than humans?

Jake [00:25:37]: They want the ability to version things. It's not that different; it materializes slightly differently. Agents want a way to test changes incrementally. Engineers have feature flags. Is there a reason agents can't use feature flags? I don't think so.

Jake [00:25:54]: They want version control. Can we use Git or not Git? That one is up in the air.
I think something outside Git will emerge for how we version these things over time. They need observability. You need to query what happened, when it happened, which steps failed, traces, logs, metrics, and all the rest. They need network, compute, and storage. They need to write files, save files, iterate on files, and snapshot file systems.

Jake [00:26:25]: A lot of what humans needed is in line with what agents need. Branching and forking are not different; we're just moving 1,000 times quicker. It can look like you need something massively different, but what you need is something massively better than what existed. You need orchestration massively better than Kubernetes. You need networking probably better than Envoy. It goes all the way down the stack.

Jake [00:26:55]: If the workload profile doesn't change so much as it gets massively compressed because you need thousands of these things, what assumptions change? etcd is going to melt. You need to replace it with something. You can go all the way down the stack and say, “That part has to change, that part has to change, and that part has to change.”

Jake [00:27:19]: The interesting thing about the super-exponential curve is that you have to build systems where you can rip out those parts at any time because a new bottleneck might emerge. You get good at parallel agents, and a different part of the system breaks. So it's similar to what humans needed, but at 1,000x scale.

Jake [00:27:55]: How do you do code review in the age of agents?

Swyx [00:28:00]: You throw more agents at it.

Jake [00:28:01]: You don't. But then who reviews for CVEs and all these other things?

Swyx [00:28:07]: More agents.

Jake [00:28:08]: And that's how we hit the inference wall. You can continually throw agents at the problem, but I think there's a limit to the number of agents you can throw at a problem.

CLI, Agent Handles, and Closing the Loop

Swyx [00:28:24]: You already had a CLI before it was cool.
How is the shape of what you're exposing changing, if at all?

Jake [00:28:28]: CLIs have always been cool. The CLI changes because we think about how to give Claude, Codex, ChatGPT, or any model a handhold.

Jake [00:28:50]: A CLI is a single command: deploy, get logs, and so on. Things that were prohibitively annoying to humans are not annoying to agents. They're nice. If I handed you a CLI with 40 arguments and 600 flags, you'd think, “I'm never going to use all of this.” But if you hand it to an agent, it says, “This is excellent. I have so many handles to work with.”

Jake [00:29:24]: If you're going to expose things to agents that way, you want as many handles as possible where they can get information, query dynamic information, and close the loop quickly. Most problems right now are about how to close the loop as quickly as possible. Where does the agent get stuck, and how can you remove that?

Jake [00:29:49]: Telemetry is important. If you can tell where the agent gets stuck from the CLI and say, “12% of people deviate from the happy path because of this, and now I add this argument and drive it down to 2%,” you massively increase the rate of loop closure.

Jake [00:30:03]: That's how we think about not just the CLI, but every point in the dashboard. It's a user journey: I hear about Railway. I get something deployed. I get my first green build or aha moment. I see an endpoint, logs, whatever. Then I iterate. The iteration loop is indefinite. The user wants to deploy a new thing, a Postgres instance, change code, and keep iterating.

Jake [00:30:36]: If you focus on the iteration loops and what's blocking them from closing quickly, one thing we say internally is: you never want to be waiting on compute anymore. You always want to be waiting on intelligence.
If you're waiting on compute, there's a bottleneck that needs to be destroyed because eventually that bottleneck becomes so large that another workflow emerges to change it.

Jake [00:31:04]: We've built a product where you push code, build it, and so on. But I fundamentally believe the push-pull loop is going away. We'll get to a point where you make a small change in production, that change is versioned across your infrastructure, you're working alongside copy-on-write versions of your database and infrastructure, and then you merge it in and it's instantaneously live. That's the holy grail of loops. The push-pull-rebuild thing is a point of friction that we're removing entirely.

Canvas as Output: Dashboards, Context Anchors, and Hyperstructures

Swyx [00:31:43]: It's incredibly fast. If anyone hasn't tried it, that fast feedback is great. My hot take is that Railway was famous for its canvas, which visualizes your infrastructure and lets you manipulate it visually. But that was for humans. For the next phase of growth, Railway CLI is more important than canvas.

Jake [00:32:05]: The canvas is funny because it's a mechanism to show changes over time. You're right that previously we used it a lot as an input. Moving forward, its goal is more like an output. You would go to the canvas, make changes, see them, and watch your infrastructure evolve. Now agents have access to the CLI and can make those changes. So the canvas becomes an output: what information does the human need at this moment to make suitable decisions about control requests? Do I approve this or not?

Jake [00:32:57]: It also has to be an anchor for your context, a port in the storm. Think of it like layers in a file system. You start with a project, then drill down into services, then into a function or code, because you want to represent the entire thing not just in your head, but in the canvas.
Other people can share that representation, think on the same wavelength, and move quickly.

Jake [00:33:33]: A lot of organizations get in trouble as they scale because all the context lives in someone's head. “How does this microservice work?” “I have no idea; go ask this person.” Then you have whole categories of products built around context discovery. A lot of that melts away if you have a solid hierarchy and can infinitely nest services, code, context, and everything else all the way down. That's what lets you build these structures over time.

Jake [00:34:18]: It's also what lets us build what I've called hyperstructures: things that are way bigger. You look at the Golden Gate Bridge and ask, “How did we build that?” There's a meme that we lost the technology. To some extent, yes, because the coordination that built those things evolved and changed. We lost some of the art of building structure as we jammed everything into Slack.

Swyx [00:34:52]: But you jam everything in Discord.

Jake [00:34:53]: Same point. It doesn't matter. It's message passing and interrupts, message passing and interrupts.

Swyx [00:35:00]: So you're arguing there should be something better and more structured than Slack?

Jake [00:35:04]: Yeah. For sure. I think Slack is awful, and Discord is awful too.

Central Station: Context Routing, Support, and Incident Clusters

Swyx [00:35:09]: This is the equivalent of my mom test. What have you done that has your solution to this?

Jake [00:35:15]: Internally, we've built a tool called Central Station that aggregates all the context from our users. Every piece of feedback, every customer support item, everything gets aggregated into clusters. If an incident is brewing, we can determine how many users are affected and break off a discussion based on that.

Jake [00:35:40]: That is more helpful than long-running channels where you're trying to decide which channel to put something in.
If you can dynamically aggregate information and dynamically route it to the right person based on context, it works better. We know internally that these four people are close to networking. If we see a networking thing, we can drill it down to those four people. If it's with this part, we can look at the commits. This is no longer a manual process internally.

Jake [00:36:13]: If you go to station or help.railway.com, that's why we built it. We wanted to scale with a massive amount of leverage by aggregating feedback.

Swyx [00:36:27]: This is built in-house?

Jake [00:36:28]: Yep.

Swyx [00:36:29]: I remember helping out on this one with Angelo in 2023. You scale a lot with a very small team.

Jake [00:36:38]: Yeah. We're about 10 times bigger now.

Swyx [00:36:40]: You have your full developer code here? Very cool.

Jake [00:36:44]: If you go to railway.com/stats, we expose this as a pub-sub-able thing. It's all real-time metrics. There's a way to get it as JSON somewhere if you care.

Jake [00:37:01]: We're big on trying to build everything in public and talk about what we're working on. We've had issues in the past, and we'll say, “Here's how we're fixing these things.” We've gotten compliments and flak for incident reports. We're always trying to make them better and talk with people.

Incidents, Disclosure, and Progressive Rollouts

Swyx [00:37:20]: You had a big one recently. I liked that it was scoped to 3,000. You presumably used Central Station. Talk through what happened and how you address it internally as a team.

Jake [00:37:38]: Internally, this one really sucked. It had to do with an upstream provider that didn't behave the way its documentation said it would, which is unfortunate given they wrote the RFC for how the behavior should work. We rolled those things out, and Central Station caught it initially when a couple of users said caches weren't invalidating.
We turned it off immediately.

Jake [00:38:03]: When you roll out to a large user base of three million people, you get a lot of disparate behaviors. We tested in staging and had tests, but we hit an edge case. We've hardened those systems, and now we can make that better. But it was a tough one.

Swyx [00:38:39]: I always wonder how private disclosure is supposed to work if people find an issue. Are they supposed to contact you first? When you run a platform, these things will happen. What channels should people pursue to quietly resolve it before it becomes a bigger incident?

Jake [00:38:59]: There's responsible disclosure. We err on the side of over-disclosing and letting you know something is wrong versus having your provider gaslight you. We've erred on sharing those things more publicly, even if they impact a small subset of users. That's a decision we've made internally. We have four values. One is honor. The honorable thing is to notify people to the widest degree at which they may have been affected or there was an issue, and then confront it head-on: why did it happen, what can we do better?

Swyx [00:39:45]: Not the whole user base. That's because of incremental rollouts and other things?

Jake [00:39:50]: Yeah. Progressive rollouts.

Swyx [00:39:54]: That should be the norm at all large platforms.

Jake [00:39:58]: It should. A variety of companies do this. There's the quote that Meta runs 10,000 different versions of Meta. To our earlier point about agents, they need the same thing. They need shadow traffic and all these other things. We've built so much ceremony around production being sacred that we need to make it trivially easy to test different behaviors in a safe environment. Then you can make mistakes in a safe environment.

Safe AI SRE: Customer Agents, Forked Environments, and Production Parity

Alessio [00:40:30]: Do you see a world where these things get automatically caught, not necessarily by your agent, but by your customer's agent?
The cache invalidation issue seems easy to check if you know to look for it.

Jake [00:40:44]: It's hard because to determine it, we almost need to hook into your observability infrastructure. That's why we have the template loop on the platform: so you can roll things out progressively. You can roll out to Johnny Vibe Coder initially, or push a shard that someone consumes at their own leisure. Or you can roll it out over weeks: 0.1% of people, 1% of people, early adopters, then all the way up. That's the non-deterministic version control we talked about earlier.

Jake [00:41:30]: I believe that's where most things should go, because most companies end up building staged rollout systems in-house. It's the same thing built again and again at every company. There's a massive opportunity to consolidate developer debt.

Alessio [00:41:45]: You should have a free tier. Model providers give free tokens if you let them use the data. You could give free compute if someone is the number-one shard that goes out and lets you plug into their observability.

Jake [00:41:55]: We do that. That's why we talked about the impact on 3,000 people. We start with lower-impact people. Larger companies on the platform are last to receive those rollouts so they have a version of the platform that's deeply stable.

Alessio [00:42:16]: I have three services, so I'm sure I get the first rollout. You can nuke my thing at any time. There are all these SRE agent companies. Observability people also want agents that fix upstream problems. You have your own agent in the canvas now. How do you see that playing out?

Jake [00:42:39]: It's the stacking entropy problem. If you don't have primitives to make iteration in production safe, it becomes difficult. If you're an observability provider saying, “Here's the fix to this error,” assume 80% are good and make sense.
But in the last 20% long tail of complex issues, if you let somebody stamp it, you create an opportunity for an incident.

Jake [00:43:08]: That's why forked environments are important. People have staging, but it always drifts from production. You need primitives, workflows, and experience built first-party on the platform so you can fork any service at any point in time.

Jake [00:43:33]: I think of the canvas as a sheet of transparency paper. The agent is a little guy you push up into the canvas. It should say, “I need to copy that service and that service so I can test these two things.” It gets a read-only copy of production. Anything that's PII gets marked as a transform when we clone the database, create a copy-on-write version, or read from it. Then the agent makes changes and asks, “Does this actually work?” as close to production as possible.

Jake [00:44:22]: That's how close you have to be, or you get massive drift. The system becomes unstable. You see this with massive systems built on Docker for local, Kubernetes for production, and a specific thing for something else. That complexity slows developers and becomes unstable at scale, making it hard to iterate. We want to compress that way down and say, “As close to prod as possible is where we want to be.”

From AI SRE Skeptic to Agent Believer

Swyx [00:45:00]: I was texting Erica for questions, and she says you were originally not a believer in AI SRE. Have you come around on it?

Jake [00:45:10]: I flipped, but I'm still not a believer in AI SRE if you don't have the primitives to make it safe. If you unleash AI SRE on production infrastructure without safe primitives for copying volumes and making sure things are fine, it's going to nuke your production database. It's not a matter of if, but when. I'm a big believer in making those loops safe.

Jake [00:45:33]: I was a deep AI skeptic until 2023.
In 2024, I thought, “Maybe I can roughly make this thing do it.” In 2025, I thought, “Now I can hold this.” Over winter break, everybody came back saying, “It's almost impossible to hold this.”

Swyx [00:46:01]: Did you see this on the Claude docs? Clawdbot? OpenClaw?

Jake [00:46:06]: It's gotten to a point where it's harder to hold it wrong than to hold it right. There's a scene in Avengers where Vision picks up Thor's hammer and says it's terribly well-balanced. It self-balances and works well. I'm a deep believer at this point that this will be the dominant species: assembly, C, C++, JavaScript, words.

Swyx [00:46:35]: It feels like a big jump.

Jake [00:46:37]: It is. But it's not like you abandon CPU-based discrete logic and move straight to fuzzy logic. You need both. Your skills should call code or applications or some static structure. You can use skills to distill what the procedure should be or how the code should act.

Jake [00:47:02]: I'm coming to a thesis: you need three points. You need a clear spec defining the system, the code, and the tests. When you say it out loud, if you've been in engineering long enough, you're like, “Of course. That's an RFC, tests, and code.” But they all matter. Having them together lets them reinforce each other: the spec and tests match, but the code doesn't, so reconcile it. Or the tests and code match but the spec doesn't, so reconcile that. That's the iteration loop.

Jake [00:47:41]: That's why you're seeing people talk about software factories, docs, and reconciliation. Some of that is architectural astronomy if you don't implement it, but that loop is where most things will end up.

Swyx [00:48:07]: For listeners, we've been talking about this on the pod for three years: the holy trinity of specs and tests. Itamar Friedman from Qodo is the reference if people want to look it up.

Self-Modifying Infrastructure and the End of Push-Pull-Rebuild

Swyx [00:48:18]: One thing I want to mention on the OpenClaw idea is self-modification.
I don't know how Railway would support it, but I have my OpenClaw, and I just tell it it has the Railway CLI and can do whatever. In theory, whatever capabilities or new infra it needs, it can call the Railway CLI, provision it, and add it to itself. The agent can modify its own infra.

Jake [00:48:45]: It's nuts. I have a loop set up where you put the Railway CLI on top of something that runs on Railway. You're authenticated as whatever the current box is, and you can make any changes to it. Then you call Railway deploy, and it deploys itself.

Jake [00:49:04]: It's like: “I need to spin up this instance of this environment. I already exist in this environment. Excellent, I have access to a Postgres instance now.” That's where we want to go with agentic, self-replicating infrastructure. That's your loop: iterate in production. You continue making changes. If it works, merge it upstream. If it doesn't, throw it away.

Jake [00:49:37]: How do you make throwaway copies trivial to spin up and super cheap? The era of “I have an AWS instance with four vCPU and 16 gigs of RAM” is going to get destroyed. If you do that for agents, you need a thousand of those machines. It's prohibitively expensive compared with what we've spent a ton of time figuring out: the atomic unit of deploy, whether you call it isolates, sandboxes, or something else. Only pay for what you use, spin up instantaneously, and close the loop as quickly as possible.

Jake [00:50:15]: If the system can self-replicate safely and say, “This is my environment, I'm making these changes,” it can come back with, “Does this look good? This is a new state of infrastructure given this prompt. I think I've solved it.” Then you go back and say, “Actually, it looks different.” It does the loop again. Then you say, “Cool. Apply.”

Swyx [00:50:38]: That's retroactively obvious, which is the most useful kind. Any other comments on agent deployment on Railway?

Jake [00:50:51]: It's getting better every day. I'm on X or Twitter.
You can always yell at me about the parts not working as well as they should, because plenty of things should work way better.

The New Serverless: Stateful, Long-Running, Pay-for-What-You-Use Linux

Swyx [00:51:04]: At this stage, when people want massively or embarrassingly parallel compute, they usually talk serverless. I feel like there's a new serverless compared to the previous five years of serverless. You're in that new bucket. Do you have comparisons or philosophical differences you want to call out?

Jake [00:51:31]: It's somewhere in between. It's the ability to run stateful, long-running workflows or executions.

Swyx [00:51:42]: Vercel has Fluid Compute, Cloudflare has some container thing, Google has App Runner and others.

Jake [00:51:55]: That's where everything is roughly going, and it's why we've been working on this for six years. We believe users need access to a computer: a box that speaks Linux. They need to deploy what they want. Other systems change the surface area of what you can build. For us, users need a computer and need to deploy anything they truly want. That's why we've focused on the primitives: network, compute, storage. If we give you those and expose them so you can run things indefinitely, that's where we believe it's going.

Jake [00:52:43]: Twitter has no nuance, so everyone says “servers” or “serverless.” It's always somewhere in the middle: I want to run it for a long time, but I don't want to provision the resource statically or pay for things I'm not using. That's been our thesis from day one: pay only for what you use, run it indefinitely, and it is full Linux.

Swyx [00:53:12]: That's why I like the naming of Fluid. It's fluid. Flexible.

Heroku, Focus, and Carrying the Torch Without Becoming the Past

Swyx [00:53:18]: Another milestone is the Heroku official deprecation. You're one of the presumptive new Herokus. “New Heroku” has been a category for as long as I've been in developer tooling. It's finally happening. What was that like?
Any behind-the-scenes of, “This is the moment”?

Jake [00:53:42]: You have people where you're like, “You were running stuff on here? You, as this company?” It's crazy that names you would know are running on it and now coming to us saying, “We want to move a lot of this off.”

Swyx [00:54:00]: Any behind-the-scenes on why Salesforce let Heroku stagnate?

Jake [00:54:05]: I can only guess. It's hard when it's not your business. Salesforce's business is to build a great CRM. That's their focus. Then you acquire a compute business as an offshoot. A lot of early Meta people talk about focus. Boz has a write-up about how in the early days of Meta they had no money, so they were forced to focus. Then they turned on the money tree and had no reason not to split their focus.

Jake [00:54:52]: But that dilutes your product. You get offshoots where you ask, “Is this the focus of the business?” If it's not core, it languishes. A lot of companies get in trouble when they split focus because they're fighting a multi-front war, not just externally but internally for alignment. Where are we going? What are we doing? What is our purpose?

Jake [00:55:24]: If you're Salesforce-built and mission-driven, you want to work on Salesforce. Heroku is off to the side. It's not core to the business. Getting resources, budget, focus, and alignment internally becomes hard. It was a matter of time.

Swyx [00:56:06]: Kudos to them for calling it out instead of leaving it unknown.

Jake [00:56:12]: Their release was a little odd. They called it out, but they didn't say they were shutting it down. Behind the scenes, I think they issued messages to people saying they should close accounts and that they were going to deprecate and remove things over time.

Jake [00:56:30]: It's crazy because some of my first deployment experiences were on Heroku. You start with dragging things into an FTP server, then you try to get a deploy working, and then it's Heroku. It was the on-ramp for us. But the wheel turns.
New things emerge. We're happy to carry the torch for a lot of that. But we don't want to be the new Heroku. We want to be the way people build and deploy software, and ultimately the way people monetize software over time.

Swyx [00:57:19]: It's still a big crown to be the new Heroku. There are 50 companies that fought for that.

Jake [00:57:23]: Everybody is holding some portion of it. We're happy to support people and companies. The platform works differently. The game loop is similar, but we've been dogmatic about where these things are going: primitives, agents, fan-out. Some things fit; some workflows need to change. We have an approximation of Heroku pipelines with the environment system. It's exciting. We've got a ton of people we can support, and it's growing a lot.

Temporal, Workflow Engines, and State Machines

Swyx [00:58:12]: I have one more technical question about Temporal. I've sold my shares. You're a power user and one of our earliest customers. I met you through Temporal. You built on Temporal. You have complaints. This may be the most neutral and informed conversation anyone will hear about Temporal without someone working at the company.

Jake [00:58:39]: That's fair. I've used Temporal for almost 10 years because of Cadence at Uber.

Swyx [00:58:52]: Give people a sense of what Cadence was at Uber.

Jake [00:58:57]: Cadence was the precursor to Temporal. It powers trip actions, rides, when you rent a Jump bike or scooter or car. You're running workflows for a period of time and saying, “This ride will run indefinitely until it finishes.” You attach information: you paused in this zone, so add this charge to the bill. When you end the trip, the workflow is done. That experience was powered by Cadence at the time.

Swyx [00:59:34]: I used to say it's like programming the entire user journey top-down as one function.

Jake [00:59:39]: It's a powerful idea and important. It's also important for the next phase of the agentic journey.
You want an agent to do a specific task, be complete or incomplete on that task, and move on to the next thing. You need a way to manage workflows dynamically.

Jake [00:59:59]: Temporal was always great in theory, and great when you got it working the way you wanted in production. But it required you to model the entire journey in your head. If you didn't, you could cause issues where replaying the state of the workflow causes non-determinism.

Swyx [01:00:25]: Because it works on deterministic workflow history.

Jake [01:00:28]: Exactly. I describe it as a jet engine. If you know how to operate it and run it, it's great. But you can't hand it to people trying to build complicated things if they don't have the whole state in their head.

Jake [01:00:48]: We run our whole deployment pipeline on top of it. That's a reasonably complicated workflow: pre-commit hooks, signaling, queuing, and all the rest. We ran into the same thing at Uber. As you express a large workflow, it gets more complicated, with more states in the state machine that you have to map back to the workflow.

Swyx [01:01:15]: It's a lot of ifs.

Jake [01:01:16]: Exactly. At Uber, we built a system for doing the state machine and testing it. We've started to build some of those things here because it's grown heavily. It's not quite love-hate. When it works well, it works super well. But if someone who doesn't have full context puts something into the system that invalidates state or causes non-determinism, or spins off a ton of activities, you have to keep track of underlying SRE knobs like activity slots. Those should scale with memory, vCPU, and so on. It becomes a bear to scale.

Swyx [01:02:10]: You need a capable sysadmin running things behind the scenes. If you moved off, what would you do?

Jake [01:02:19]: We'd build our own workflow engine.
We have a few internally that we've worked on.

Swyx [01:02:27]: This is one of those classes of things you typically wouldn't vibe code, but I'm wondering if you can.

Jake [01:02:33]: I still don't think you should vibe code it. You still want to run decent tests to make sure it works.

Swyx [01:02:39]: Timo didn't invent that from scratch either. There are libraries you can run. On top of that, it's just a state machine that you have to map out. Ultimately, you define the instructions you want and run them through a state machine.

Jake [01:03:00]: It's very doable. Workflow stuff is interesting. Restate is doing neat stuff here.

Swyx [01:03:10]: You're tied into JavaScript. Are you a JavaScript maxi?

Jake [01:03:13]: Internally, we have TypeScript, Rust, and Go. We don't add more languages. Actually, we have a little C because we write BPF code and hooks. But those are the languages.

Swyx [01:03:28]: Is this for sidecars?

Jake [01:03:32]: No. It's for the networking stack, volumes, and things like that. We use TypeScript a lot because it powers the dashboard, but we're moving a lot of workflow stuff off the dashboard stack and into the infrastructure stack.

Railpack, Nixpacks, and Content-Addressable Filesystems

Swyx [01:04:00]: Cool. Any other technical infrastructure stuff? Railpacks?

Jake [01:04:07]: We built an engine for determining dependencies based on source code. It's called Railpack. We built the first version, Nixpacks, on top of Nix, and then we moved.

Swyx [01:04:17]: People have been trying to get me to adopt Nix and NixOS for four years. Is it ever going to be a thing?

Jake [01:04:23]: I don't know. We're excited about it, but it has pain points. Think of it as a stack of versioned binaries at specific slices in time. If you want version X and version Y, you bloat the package space, which blows up image size and makes real-world workloads difficult.

Swyx [01:04:53]: But you content-address it and cache it.
In theory, there are optimizations.

Jake [01:05:00]: In theory, yes. But with a large enough user base and disparate enough machines, you run into a problem Meta described in the XFAAS paper, their internal serverless system. It becomes difficult at scale unless you break out specific runtimes.

Jake [01:05:24]: We didn't want to do that because we wanted to truly allow you to deploy anything. That was our initial thing with Nix. But we've moved toward interesting work around content-addressable file systems that can lazy-load anything from any point and page it into memory.

Swyx [01:05:48]: Amazing.

Jake [01:05:49]: The future is very bright. It's crazy, and it's going to be nuts.

Coding Agent Spend, Roadmaps, and Token ROI

Swyx [01:05:54]: Founder journey stuff?

Alessio [01:05:56]: Your cloud usage: you tweeted you're going to spend $300K this month?

Jake [01:06:01]: I think we got to $200K.

Alessio [01:06:02]: Coding agents?

Jake [01:06:03]: Yeah.

Swyx [01:06:04]: Across the company?

Alessio [01:06:05]: You only have 35 people, so I'm sure they're not all spending $10K a month. What's the distribution?

Jake [01:06:10]: I think I'm at about $25K. We have power users all the way down. We came back from winter break, and I basically said, “If you're writing code by hand, you're doing this wrong.” The tools are good enough now that you can move extremely quickly. There are issues and pain points, but you should be reviewing the code you are writing instead of writing it by hand.

Jake [01:06:40]: Architectural patterns matter more now than ever, but you shouldn't spend your time generating code you would write. If you know how to write it, ask the agent to write it and reconcile it until it looks like you would have written it yourself.

Jake [01:06:58]: People misconstrue my propensity to push people toward agents as connected to our growth and some reliability bumps. They're not necessarily related.
The tools are good enough to move extremely quickly and build things way larger than you could before.Jake [01:07:19]: To the earlier point about cooling data centers in space: I don't know. But with software, you can ask, “How would I build block storage from scratch? How would I do these things?” I have ideas because I have history and have read papers. Let me work them out and build massive test benches with thousands of tests, because those are now free to author. If you're not using AI systems to speed-run your roadmap and reconcile your existing system onto the future, you're missing a large point of what's happening.Alessio [01:08:12]: What's the path to spending $3 million a month? Is it bound by ideas and things customers can absorb?Jake [01:08:19]: For most companies, it's bound by deployment at this point. That's why we've seen a massive boom in users and companies, from Fortune 50s down, asking how to get developers to move faster. You'll probably hit your CFO before any technical limits because they'll look at the eye-watering amount of money spent on tokens. Inference costs have to come down, but we're inference constrained now. There will be price discovery around what makes sense for an org to adopt.Jake [01:09:06]: I think you'll end up with the F1 driver concept. If someone is really adept at these things, it makes sense to put them in a $3 million car. If they're not, it probably doesn't make sense. You'll take a few people and say, “You can drive the F1 car. We need to go in this direction. Figure out if it works and prototype it.”Jake [01:09:33]: We've done some of that and vastly accelerated our roadmap. We thought we'd ship something in a few years; now we can probably ship it in a few months because we validated it and don't have to build it incrementally. We can skip steps and move toward our vision.Alessio [01:09:58]: A lot of people are realizing the roadmap doesn't always have a business impact, so they say tokens are too expensive. 
But if your roadmap were built to make more money by the time you built it, you'd have token pricing for it, the same way you do with sales. You'd spend a billion dollars on sales if you knew you would get $2 billion of revenue.Jake [01:10:19]: Exactly. A naive way to measure this is the percentage of tokens that end up in production. If you can measure impact because those tokens end up in production, that's awesome. But the burden of proof will rise. Internally, we have a growing number of pull requests that haven't merged. The question becomes: how do you get this into production? It's about how quickly you can build and deploy software, which is exciting because that's our whole thing.The SDLC Shift: Prompt Requests, Feature Flags, and Safe RolloutsSwyx [01:10:56]: The SDLC is changing. One thesis is that the pull request is dying. It's going to be the prompt request. Beyond that, code review is also kind of dying if you have all the other systems in place. What else is changing about the SDLC?Jake [01:11:19]: The AISRE and the tools to make it happen. AISRE is pie-in-the-sky aspirational. What does it take to get an AISRE? What tools do you need to build?Swyx [01:11:32]: You should expose your tooling to customers at some point. The Central Station command center.Jake [01:11:39]: We have it for template maintainers. Template maintainers can deploy and maintain templates, and they get feedback. We're going to expose those things incrementally.Swyx [01:11:51]: Clustering around incidents. Everyone has a version of that, but I don't think anyone has solved it.Jake [01:11:56]: I won't say we've solved it internally, but it's gotten so good that we can see incidents forming pretty quickly. At some point, those will be things either someone else builds or we build. We've always built things purpose-built for us. 
If it makes sense to make it useful for users, monetize it, or turn that loop into a profit center instead of a cost center, we want to do that.Jake [01:12:28]: Pull request is definitely dying.Swyx [01:12:29]: Do you do first-party feature flagging and incremental rollout stuff?Jake [01:12:34]: We have a feature-flagging engine we built internally and will eventually roll out.Swyx [01:12:38]: I don't see it as a user. How come you didn't give us what you have?Jake [01:12:43]: We have to beta test it. We care a lot about the quality of the things. There's plenty we've used internally that doesn't make it all the way through the journey because it fails. It works for one service but not multiple services. We'd have to build it for multiple services and know that if we released it, we'd rebuild it again and again. Some things are worth that, but many inform the roadmap.Jake [01:13:18]: We don't want to dilute the experience by saying, “This works, but only for this service,” unless it's a core initiative. Over the next few months, we'll roll out things that work for a single service, then multiple services, then multiple services across the environment. You have to be deliberate. Otherwise you create broken disparate experiences and support load because people ask how to use the feature.Jake [01:13:52]: It's the earlier expansion and compaction pattern. You expand the company to get features, then compact and smooth them out so the experience is stellar. You told me in the hallway, “It's gotten so much better.” Internally we're saying, “This part really sucks. We need to make it significantly better.”Swyx [01:14:11]: I can attest to that over the last three years watching you build Railway. For listeners, feature flagging is a huge part of Uber culture. So much so that they have too many feature flags and another thing to remove feature flags. Facebook has Gatekeeper. Agents are going to need this. It's fundamental to incremental rollouts. OpenAI acquired Statsig. 
GPT-5 is routing and flagging through different models.Jake [01:14:56]: It's super important. If the software development lifecycle is going to change because we're doing things 1,000 times faster and 1,000 times more concurrently, what becomes important at scale?Jake [01:15:16]: Before I started Railway, I built a feature-flagging product and tried to sell it. It was an easier version of LaunchDarkly. I ran into a problem: anyone small enough to adopt your technology doesn't care about feature flags, and anyone large enough to need feature flags needs so much scale that you have to build out all the infrastructure. I scrapped it.Jake [01:15:42]: But what is old is new again. Companies are trying to move quickly, but you can't YOLO a vibe-coded thing straight into production. You need to say, “Here's my blast radius, my impact, and I want to shadow it for these users.” Feature flags. You're going to need the tools larger companies built to maintain their structures. Everything gets compressed by 1,000x so everybody can build those structures quickly.Jake [01:16:07]: That's exactly where we are: compressing the software development lifecycle, then expanding it and adding more new things.Cattle, Pets, and Clonable InfrastructureSwyx [01:16:15]: Another term that comes to mind for newer developers is “cattle, not pets.” People treat production like a pet. It has a name. You baby it and keep it alive. With cattle, you can mass farm, roll out, portion parts out, and kill them.Jake [01:16:37]: I think that might change. You can move toward having pets as long as you have a cloning machine for your pets.Swyx [01:16:52]: Yeah.Jake [01:16:52]: If you can snapshot every single thing at every frame, it doesn't matter if something gets obliterated because you have a snapshot of it. The things we've built right now are designed to block changes from the hermetically sealed DevOps line. You have to write a Dockerfile because you nee
Of 269 extradition requests, the US has not handed over anyone: SRE. SCJN resolves more than 7,000 cases in eight months. WHO warns of the rapid spread of Ebola in Congo. More information in our podcast. #grc
Oleg Fedotkin, CTO at Cian, is the guest of Andrey Smirnov from Weekend Talk.
New episode of the podcast «В SREду на кухне»: https://clc.to/33TGUg
Andrey Smirnov's Telegram channel: https://t.me/itsmirnov
00:00 Start
00:29 What might my audience know you for?
00:48 Ad break
02:11 How did the path from Rails developer in Saransk to CTO go?
09:22 What did 12x hypergrowth and the move from platform to product teach you?
22:16 Why did you start running a Telegram channel, giving talks, and sharing your experience?
33:07 Who in IT will have a harder time because of AI, the cooling market, and salaries?
39:25 What does a typical workday look like, and where does AI not yet replace a manager?
43:54 Why did you create a personal YouTube channel after closing the podcast «Для tech и этих»?
49:18 What would you have become if the IT industry did not exist?
51:17 Why is it worth moving to Saransk?
52:06 What is the main problem in modern IT right now?
Related links:
1) Telegram channel «Инженер и Менеджер»: https://t.me/engineering_manager
2) Podcast «Для tech и этих»: https://youtube.com/playlist?list=PLCfzTvRRQvO6X1ZpXhgtO3Lx7gQ-1_grM
3) Oleg's personal YouTube channel: https://youtube.com/@oleg_fedotkin
4) Oleg's conference talks: https://ctoconf.ru/2025/authors/17400
Israel intercepted 28 vessels of the Global Sumud Flotilla. Federal highway paving program advances. Ricardo Monreal anticipates an extraordinary session for the judicial reform. More information in our podcast. #grc
Meetings between the leaders of China and the United States are always labeled historic, partly because of the memory of 1972 and Richard Nixon's visit to Mao Zedong, where one of the first dominoes is said to have fallen that set off the collapse of the Soviet Union and the Eastern Bloc. Will we remember Trump's latest visit to Xi Jinping for Xi's mention of the Thucydides Trap, with which he cast the United States as the old superpower and China as the rising one? Will China strengthen its role in the search for peace in the Middle East? Is the US prepared to reduce its military support for Taiwan? Beijing will remain at the center of world attention for some time, as visits from all over the world follow one another there. Vladimir Putin is among those expected. We will discuss this and more in this edition of Studio ob 17h.
The Society of Slovenian Intellectuals (Društvo slovenskih izobražencev, DSI) from Trieste has titled tonight's meeting "Park miru na Opčinah" (Peace Park in Opicina). After a screening of Marko Sosič's film Strel v tišino, Marija Brecelj, Štefan Čok, Dušan Kalc, Emil Petaros, and Fulvia Premolin will take part in a discussion. The event takes place today at 8 p.m. in the Peterlin Hall (Peterlinova dvorana) in Trieste. Twice this week you will be able to see the literary and musical event Srečko in jaz, directed by Aleksander Tolmaier, in which Janko Krištof presents the poetry of Srečko Kosovel in a recital marking the 100th anniversary of the poet's death. He will perform tomorrow at 7:30 p.m. at iKult in Klagenfurt and on Wednesday at the same time at the Kulturni dom in Bleiburg. Srečko in jaz is not merely a classical recitation but a deep personal dialogue with the visionary from the Karst. The project came about on the initiative of Dean Krištof, who has brought Kosovel's verses to life with his interpretation. Despite the distance of a century, the verses remain chillingly relevant, because they speak directly to contemporary people, their distress, their hope, and their ethical stance. To make the poet's words resonate even more strongly, the evening is enriched with musical interludes. Ana Tijssen accompanies the reciter on the piano, complementing the poet's spiritual landscape with carefully chosen melodies. The invitation comes from the Christian Cultural Association (Krščanska kulturna zveza, KKZ) in Klagenfurt.
In this episode of Alexa's Input (AI), I sit down with Sal Furino to explore the hidden engineering work that keeps modern systems reliable.

We break down what Service Level Objectives, Indicators (SLOs/SLIs), and error budgets actually mean in practice, why reliability is as much a cultural problem as a technical one, and how teams can better measure real user experience instead of just infrastructure health.

Sal also explains reliability engineering and the challenges of reliability at scale, like:
Why latency and correctness become harder to measure with GenAI
The difference between a bad incident and a fundamentally bad system
How observability and telemetry shape modern engineering organizations
Why most teams focus too much on infrastructure metrics and not enough on user happiness
Why “the best systems are the ones nobody notices.”

If you work in AI infrastructure, distributed systems, platform engineering, observability, or SRE, this episode is a must listen!

SRECon Talk Dashboards & Dragons: Reliability Magic for AI Platforms by Alexa Griffith and Sal Furino: https://youtu.be/aWMB_7ksbkc?si=S49nPyAl_hCUIH7y

General Podcast Links
Watch: https://www.youtube.com/@alexa_griffith
Read: https://alexasinput.substack.com/
Listen: https://creators.spotify.com/pod/profile/alexagriffith/
More: https://linktr.ee/alexagriffith

Learn more about the host at
Website: https://alexagriffith.com/
LinkedIn: https://www.linkedin.com/in/alexa-griffith/

Find out more about the guest at:
LinkedIn: https://www.linkedin.com/in/salvatore-furino/
Rootly Interview: https://rootly.com/humans-of-reliability/salvatore-furino
Reliability at Scale Talk: https://youtu.be/J-VrU5JHPlk?si=8aV8acy57NWX30KA
Bloomberg Careers: https://bloomberg.avature.net/careers/SearchJobs

Chapters
00:00 - Introduction: Reliability in a world reshaped by generative AI
02:22 - The importance of seamless, background system design
04:41 - Becoming a Customer Reliability Engineer at Bloomberg
05:17 - Clarifying the CRE role and its customer focus
08:02 - The importance of observability and high-scale performance in finance
09:00 - Balancing technical and cultural aspects of reliability
10:19 - Coaching teams to be proactive using error budgets and SLIs
12:21 - The social-technical system: People, processes, and tools
13:06 - Mediation of differing opinions on reliability practices
15:06 - The nuanced approach to alerting and incident response
17:08 - The significance of tiered SLOs and the concept of error budgets
21:08 - Using signals like latency, correctness, availability, saturation in system measurement
22:53 - The impact of service level "nines" on system design and resilience
28:00 - Handling non-determinism and trust in AI responses
33:01 - Error budgets and their role in managing deployments
34:10 - The challenge of achieving five nines and data durability considerations
40:03 - Adapting SLOs for GenAI systems: core principles remain intact
42:23 - Measuring non-deterministic AI responses and quality proxies
44:41 - The ongoing importance of reliability even in AI/ML contexts
47:25 - Reacting to error budget exhaustion and proactive mitigation
50:42 - The significance of involving cross-functional teams during outages
55:36 - Advocating reliability investment to leadership
56:24 - The customer perspective: reliability as a fundamental feature
58:42 - Connecting with Sal Furino: where to follow his work and learn more about Bloomberg's engineering culture
59:20 - Final advice: Focus on user happiness to avoid common pitfalls in adopting SLOs
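
The error-budget and "nines" chapters listed above come down to simple arithmetic. As a minimal sketch (not taken from the episode; the function names, the 30-day window, and the request counts are illustrative assumptions), an availability SLO can be turned into an allowed-downtime budget and a remaining-budget fraction like this:

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for an availability SLO.

    `slo` is a fraction such as 0.999 ("three nines"). The 30-day default
    window is an assumption for illustration, not a standard.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes


def budget_remaining(slo: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative once overspent).

    The SLI here is the hypothetical ratio good_events / total_events.
    """
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    allowed_bad = (1.0 - slo) * total_events   # bad events the SLO tolerates
    actual_bad = total_events - good_events    # bad events actually observed
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    for slo in (0.99, 0.999, 0.9999, 0.99999):
        print(f"{slo:.5f} -> {error_budget_minutes(slo):.2f} min per 30 days")
    print(f"remaining: {budget_remaining(0.999, 999_500, 1_000_000):.2f}")
```

Under these assumptions, a 99.9% target over 30 days allows roughly 43 minutes of unavailability, while five nines allows under half a minute, which is why each extra nine changes system design so sharply.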
The council of the Resnica party has decided that the party will not be part of the next government but will support Janez Janša as prime minister-designate. Now that Nova Slovenija and Demokrati have already backed joining the coalition, SDS is expected to do the same on Monday. Also in the program: - Human Rights Ombudsman Simona Drenik Bavdek stresses that burying the victims of the post-war killings is an international obligation of Slovenia - Two mass protest rallies in London. Police described the operation as one of the most demanding in recent years. - In his home village of Tomaj, people commemorated the poet Srečko Kosovel. The evening also brought new insights about him.
Mock interview with Nikolay Lebedev, a DevOps/SRE engineer with 17 years in Linux and 4 years with AWS EKS. Stack: Terraform, Flux, Cassandra, Kafka, Vault, SOPS. Two hours, with lots of practice and lots of tricky questions. WHAT WAS ASKED ☁️ AWS: EKS and IRSA, VPC from scratch (CIDR, multi-AZ, multi-region), managed K8s vs self-hosted, ElastiCache, Golden Signals and SRE metrics.
In this episode, Corey Quinn sits down with AWS Senior Principal Engineer David Yanacek to explore the next evolution of DevOps.

After two decades of building systems to reduce operational pain, David shares how AWS's new DevOps Agent is pushing automation to a whole new level, autonomously diagnosing incidents, suggesting fixes, and proactively improving systems before engineers even log in.

From pager overload to autonomous remediation, this conversation is a glimpse into a world where software isn't the bottleneck anymore, operations are evolving into something entirely new.

If you care about DevOps, SRE, platform engineering, or just want fewer 3 a.m. alerts, this episode is for you.

Show highlights:
(00:00) DevOps Meets Agents
(00:13) Welcome and Sponsor Break
(01:29) David Yanacek Backstory
(02:34) DevOps Roots at Amazon
(04:22) DevOps Agent GA Overview
(05:32) LLMs MCP and Any Cloud
(08:32) Guardrails and Safe Changes
(11:47) Beta Results and Consistency
(14:13) Troubleshooting Theory and On Demand
(17:29) Future of DevOps and Closing

About David: David Yanacek is a Senior Principal Engineer at AWS and a lead advisor on the Agentic AI team. His current work focuses on Kiro, Amazon Bedrock AgentCore, and AWS's operational agents, where he helps shape the future of intelligent, autonomous systems.

Over a 19+ year career at Amazon and AWS, David has been at the forefront of building services that simplify life for developers and operators. His experience spans serverless, DevOps, and CloudOps, including launching Amazon DynamoDB and AWS IoT Core, and contributing to the direction of cornerstone services like AWS Lambda, Amazon API Gateway, and Amazon CloudWatch.

David also served as the lead publisher for the Amazon Builders' Library, helping customers apply Amazon's hard-earned architectural and operational lessons to their own systems.

Outside of engineering, David plays the French horn in a local Seattle ensemble.

Links:
LinkedIn: https://www.linkedin.com/in/david-yanacek/
Website: https://aws.amazon.com/builders-library/authors/david-yanacek/

Sponsored by: duckbillhq.com
Mexico's international reserves rise by 766 million dollars. SAT reminds taxpayers of the deadline for the tax audit report. China will require labeling of AI-generated content. More information in our podcast. #grc
What's happening in the world of SRE and resilience engineering? Join us as we catch up with fellow podcast hosts Colette Alexander and Clint Byrum of the This Is Fine! podcast at SREcon in Seattle.
SRE highlights historic progress with the European Union. Profepa shuts down properties over forest damage in the State of Mexico. CCH warns about alcohol sales to minors. More information in our podcast. #grc
My guest today is Tyler Wells, co-founder of BrainGrid.

Tyler recounts 25+ years in software, from an early IBM XT to work across military communications, startups, Skype/Microsoft, and seven and a half years at Twilio building video and SRE organizations, before founding Propel Data (which didn't find product-market fit) and then BrainGrid. He describes an experiment-driven approach to building high-performance systems by defining hypotheses, creating a “steel thread” MVP, and prioritizing observability for 2:00 AM incidents. He discusses how AI coding shifts focus from typing code to architecture, documentation, critical thinking, and red-teaming plans, while warning that agents need guidance on separation of concerns and DRY to avoid refactor side effects. BrainGrid emerged from using Cursor agents during Propel's wind-down and aims to generate detailed specs, acceptance criteria, and validation loops so agents implement features reliably, with attention to token efficiency. He also covers co-founder traits, chaos engineering, compliance challenges for solopreneurs, career advice, and staying grounded through exercise, cooking, and family.

Tyler Wells is the co-founder and CTO of BrainGrid, one of the first platforms built specifically to replace the missing product management role in AI-native software development.

He is currently building BrainGrid, helping engineering teams ship faster with AI-assisted requirements breakdown and task management. The company is focused on bridging the gap between product ideas and implementation-ready work.

His background: He has spent 25+ years building systems where failure isn't an option, from satellite communications at Hughes Space to real-time video at global scale. He led the team that built Facebook's first video calling feature powered by Skype, then spent 7+ years at Twilio building its Video Platform (WebRTC) and leading SRE/Observability across the company.
"For me, prod should be a relaxing subject." The D.E.V. of the week is Kevin Davin, Google Developer Expert for Cloud and Kotlin, SRE at Gradle.

In this episode, we clarify the (sometimes blurry) boundary between SRE and DevOps, and how the DevOps philosophy contrasts with the concrete job of an SRE. Kevin breaks down the question of roles and responsibilities, the importance of the error budget, and the real impact of managed tools and AI. We discuss the rise of observability, the declarative model, and shared responsibility for running production. In the end, prod does not have to mean tension: well mastered, it can even become relaxing.

Chapters
00:01:06 : Introduction to DevOps and SRE
00:01:20 : Introducing Kevin
00:02:37 : SRE and DevOps roles and responsibilities
00:05:23 : Collaboration between Dev and Ops
00:09:49 : The error budget concept
00:12:35 : The importance of observability
00:14:42 : How technologies and jobs are evolving
00:17:43 : Platform Engineering and its implications
00:20:49 : The growing complexity of technologies
00:23:42 : Advice for developers
00:28:34 : Taking an interest in production
00:32:17 : Changes in working practices
00:37:42 : The impact of AI on jobs
00:41:41 : Toward an integration of roles
00:45:24 : Conclusion and final thoughts

Links mentioned during the show
Series: Sense 8
Mexico raises a protest over the Israeli interception of the flotilla. Dozens of illegally sold birds rescued in Guadalajara. US judge blocks El Chapo's request to return to Mexico. More information in our podcast. #grc
SRE remains in contact with Mexicans in the Global Sumud Flotilla. Álvaro Obregón recovers 12 hectares in Alameda Poniente. WHO warns of a rise in attacks on health workers in Iran. More information in our podcast. #grc
Lexi talks us through passing Network Plus even after feeling sure she would fail, and we break down the study habits that actually help when the pressure hits. We also get real about career direction, why SRE work looks appealing, and why digital “ownership” keeps feeling more like renting.
• passing the Network Plus exam and what made it manageable
• test anxiety and how it shows up right before check-in
• what we have been playing lately and why Diablo can be brutal
• car projects, saving money on OEM repairs, and why Charlotte traffic is a no
• getting back into yerba mate and why the setup matters
• figuring out what parts of IT work we genuinely enjoy
• moving away from management goals and toward SRE responsibilities
• choosing AWS certification levels based on existing experience
• learning methods that work for us: visual, hands-on, classes, practice tests
• using AI tools to generate quizzes, notes, and study structure
• learning Spanish through listening and repetition instead of classrooms
• the rumor about Sony PS5 DRM check-ins and why it worries us
• subscription creep across games, music, and even car features
https://www.carolinaotakus.com/
US investigates the governor of Sinaloa over alleged ties to drug trafficking. Rubén Rocha denies links to organized crime. Fishermen protest over lack of support. #grc
Why did Viktor pay an incredible 300 euros for early access to the series "Senke nad Balkanom 3", and what happens when the man with the most beautiful voice in Yugoslavia, Igor Brakus, one half of @ObneobRadio, brings recommendations for the most depressing series in the Balkans? In the 79th episode of Njuz POPkast we bring you the ultimate film (plus music and book) recommendations! From Pejaković's masterful and dark trilogy "Meso, Kosti, Koža", through the new SF hit "Hail Mary" with Ryan Gosling and a brutal film by Park Chan-wook, all the way to the scandal known as "Beogradsko poselo na Zappa barci"! Igor Brakus explains why he hasn't read a book since he was 12, while we teach you how to become millionaires through index funds. Be sure to leave a like and tell us in the comments: which domestic series do you think is the darkest?
SRE will regulate the use of Mexican embassy premises. Manufacturing exports grow 29.5% in 2026. Putin promises to promote peace in the Middle East. More information in our podcast. #grc
Government moves to eliminate paper-based official letters and procedures. Rigoberta Menchú joins Mexico's foreign policy effort. Venice at risk: experts warn it could disappear. More information in our podcast. #grc
A new hit emerges: "El príncipe de la SRE", a work by our dear Vampipe. El Estaquita picioso pulls off a "GabyCam" and recommends a movie in the Gaby Cam segment. Yuri is on top of the 'conejo malo' (Bad Bunny), how enviable! Ángel Reyna speaks out strongly about corruption in soccer, and Joserra speaks out strongly too, but about sports "influencers." And Martinoli fondly remembers when Joserra told him he was a pend...
For all those who missed out on London, see you in Miami next week!Notion, the knowledge work decacorn, has been building AI tooling since before ChatGPT, with many hits from Q&A in 2023 and unified AI in 2024 and Meeting Notes in 2025. At the end of their last Make user conference, Ryan Nystrom teased Notion 3.0's Custom Agents - and they are finally embracing the Agent Lab playbook!Sarah Sachs and Simon Last of Notion join us for a deep dive into how Notion built Custom Agents, why it took years and multiple rebuilds to get right, and what it means to turn a productivity tool into an agent-native system of record for enterprise work.We go inside the product, engineering, evals, pricing, and org design decisions behind one of the most ambitious AI product efforts in software today — from early failed tool-calling experiments in 2022 to agent harnesses, progressive tool disclosure, meeting notes as data capture, and the long-term vision for software factories and agentic work.We discuss:* Sarah and Simon's path to launching Notion Custom Agents, and why the feature was rebuilt four or five times before it was ready for production* Why early agent attempts failed: no tool-calling standard, short context windows, unreliable models, and too much complexity exposed to the model* The “Agent Lab” thesis: not just wrapping a model, but understanding how people collaborate and building the right product system around frontier capabilities* How Notion thinks about roadmap timing: not swimming upstream against model limitations, but also building early enough that the product is ready when the models are* Why coding agents feel like the kernel of AGI, and how Notion is thinking about “software factories” made up of agents that spec, code, test, debug, review, and maintain codebases together* How Sarah runs AI engineering at Notion (“notes from Token Town”): objective-setting over idea ownership, low-ego teams comfortable deleting their own work, and a culture designed to 
swarm around fast-changing opportunities* The “Simon Vortex,” company hackathons, and why security gets pulled in early rather than late* How Notion organizes AI: core AI capabilities and infrastructure, product packaging teams, and a broader company mandate that every product surface must increasingly work for both humans and agents* Why prototypes have become much easier to build internally, and how “demos over memos” changes product development inside a tool the whole company already uses every day* Notion's eval philosophy: regression tests, launch-quality evals, and “frontier/headroom” evals that intentionally only pass ~30% of the time so the company can see where model capabilities are going* What a “Model Behavior Engineer” is, and why Notion treats eval writing, failure analysis, and model understanding as a distinct function rather than just software engineering* The changing role of software engineers in the age of coding agents, and why the new job looks less like typing code and more like supervising a rigorous outer system of agents, PRs, and verification loops* How the “software factory” should work: specs, self-verification, bug flows, subagents, and minimizing human intervention while preserving the invariants that matter* A live walkthrough of a Notion Custom Agent handling coworking space tenant applications by triaging email, enriching applicants with web search, and writing structured data into a Notion database* How agents compose inside Notion: shared databases as primitives, agents invoking other agents, “manager agents” supervising dozens of specialized agents, and memory implemented simply as pages and databases* Notion's take on MCP vs CLI: why Simon is bullish on CLI's self-debugging nature, where MCP still makes sense, and how Sarah thinks about capability, determinism, permissioning, and pricing alignment* The evolution of Notion's internal agent harness: from early JavaScript coding agents, to custom XML, to Markdown and SQL-like 
abstractions, to tool definitions, progressive disclosure, and a much shorter system prompt* Why Notion cares about teaching “the top of the class,” building for sophisticated operators rather than abstracting away too much capability for everyone* How agent setup works today: agents that can configure themselves, inspect their own failures, and edit their own instructions — with guardrails around permissions* How Notion prices Custom Agents: credits as an abstraction over tokens, model type, serving tier, web search, and future sandbox costs; why usage-based pricing was necessary; and how “auto” tries to match the right model to the right task* Why Notion is not eager to train a foundation model, where they do fine-tune and optimize today, and why retrieval/ranking is one of the most important investment areas as more searches come from agents rather than humans* Why Meeting Notes became one of Notion's strongest growth loops: not just as transcription, but as high-signal data capture that powers search, custom agents, follow-up workflows, and the broader system of record for company collaboration* Why Notion is more interested in being the place where collaboration data lives than in building hardware themselves — and how wearables or other capture devices may eventually feed into that systemSarah SachsLinkedIn: https://www.linkedin.com/in/sarahmsachsX: https://x.com/sarahmsachsSimon LastLinkedIn: https://www.linkedin.com/in/simon-last-41404140X: https://x.com/simonlastFull Video EpisodeTimestamps* 00:00:00 Introduction and launching Notion Custom Agents* 00:01:17 Why Notion rebuilt agents four or five times* 00:03:35 Building for where models are going, not just where they are* 00:05:32 The Agent Lab thesis, wrappers, and product intuition* 00:08:07 User journeys, leadership, and low-ego AI teams* 00:13:16 The Simon Vortex, hackathons, and bringing security in early* 00:16:39 Team structure, demos over memos, and building for agents* 00:20:25 Evals, Notion's 
Last Exam, and the Model Behavior Engineer role* 00:27:37 Evals as an agent harness and the changing role of software engineers* 00:30:42 The software factory: specs, verification, and agent workflows* 00:32:18 Live demo: a custom agent for coworking space applications* 00:35:08 Composing agents, manager agents, and memory as pages* 00:38:15 Notion Mail, Gmail, native integrations, and tools* 00:39:43 MCP vs CLI and the cost of capability* 00:44:13 When Notion uses MCP vs building its own integrations* 00:47:43 The history of Notion's agent harness rebuilds* 00:55:35 Power users, public tools, and the setup agent* 00:58:01 Self-fixing agents, permissions, and “flippy”* 01:01:13 Pricing, credits, and choosing the right model automatically* 01:09:01 Why Notion isn't training its own frontier model* 01:14:07 Retrieval, ranking, and search built for agents* 01:17:27 Meeting Notes as data capture and workflow automation* 01:21:18 Wearables, hardware, and Notion as the system of record* 01:23:45 OutroTranscript[00:00:00] Alessio: Hey everyone. Welcome to the Latent Space podcast. This is Alessio founder of Kernel Labs and I'm joined by swyx, editor of the Latent Space.[00:00:11] swyx: Hello. Hello. We're back in the beautiful studio that, uh, Alessio has set up for us with Simon and Sarah from Notion. Welcome.[00:00:18] Sarah Sachs: Thanks for having us.[00:00:19] Alessio: Thanks for having us. Yeah.[00:00:20] swyx: Congrats on the launch recently the custom agents, finally it's here. How's it feel?[00:00:26] Sarah Sachs: We ship things slowly. So it had been in Alpha for a little bit and at the point at which is it's an alpha, um, there's a group of people that are making sure it's ready for prod, and then there's a group of people working on the next thing.So sometimes some of these launches are a bit delayed satisfaction, so it's quite nice to remind yourself all the work you did because we do have a habit of like. Being two or three milestones ahead. 
Uh, just 'cause you have to be, you know, you can't get complacent. Um, but it's been great that people understood how this is helpful. And I think that's just easier in general building AI tools today than it was two, three years ago. People kind of get it, and so that user education, um, there's just, it was our most successful launch in terms of free trials and converting people and things like that. It was really successful, so yeah. But there's a lot to build.
[00:01:12] swyx: Making it free for three months helps.
[00:01:16] Sarah Sachs: Yep.
[00:01:17] Simon Last: It was definitely super exciting for me because it's probably the fourth or fifth time that we rebuilt that.
[00:01:22] swyx: Yes.
[00:01:23] Simon Last: And I mean,
[00:01:24] swyx: you've been building this since, like, 2022.
[00:01:26] Simon Last: Yeah, I mean, like, it was even right when we got access to, like, GPT-4 in late 2022. One of the first ideas we had is like, oh, okay, let's make an agent that, I, we used the word assistant at the time, there wasn't really the word agent yet, but, oh, we'll give it access to all the tools that Notion can do, and then it will run in the background, like, do work for us. And then we just tried that many times and it just was too early. Um,
[00:01:48] swyx: I need to force you to, like, double click on that. What is too early? What didn't work?
[00:01:52] Sarah Sachs: We were fine-tuning, like, before function calling came out. We were trying to fine-tune with the frontier labs and with Fireworks, like, a function-calling model on Notion functions. This is right when I joined. I joined because, um, we needed a manager, as Simon needed to be able to go on vacation.
So, uh, that's, that's around when I joined, so you can speak much more to it.
[00:02:11] Simon Last: Yeah, we did partnerships with both Anthropic and OpenAI at different times, uh, to try to, at the time the, I mean, when we first tried, there wasn't even a concept of, like, tools yet. We, we sort of designed our own, like, tool calling framework and then we tried to fine-tune the models to, uh, to use it over multiple turns. Um, and because it, it didn't work well out of the box, I think. Yeah. The models were just too dumb and the context window was also way too short.
[00:02:37] Alessio: Yeah.
[00:02:37] Simon Last: Um, and yeah, we just kind of banged our head against it for a long time. Uh, unfortunately it was always like, there were always sort of glimmers that it was working, but, um, it never felt quite robust enough to be, like, a useful, delightful thing. Um, until I would say, uh, the big unlock was probably, like, Sonnet 3.6 or 3.7, uh, early last year. And that's when we started working on our agent, which we shipped last year. Um, and then, and then, uh, custom agents, kind of a similar capability, and that one just took longer because we, we just wanted to get the reliability up a lot higher, 'cause it's actually running in the background.
[00:03:14] Sarah Sachs: And the product interface of, like, permissions and understanding, you know, this custom agent is shared in a Slack channel with X group of people and has access to documents that are surfaced to Y group of people. And the intersection of X and Y might not be whole. And so how do you build the product around making sure administrators understand that? Permissioning took multiple swings.
[00:03:35] Alessio: Everything is RBAC at the end of the day. Yeah.
I'm curious, like, when the models are not working, how do you inform the product roadmap of, like, okay, we should probably build expecting the models to be better at some reasonable pace, but at the same time we need to, you know, you had a lot of customers in 2022. It's not like you were a new company or, like, no user base.
[00:03:54] Simon Last: Yeah, I mean, I think there's always the balance of, you know, like, you want to be AGI-pilled and thinking ahead and building for where things are going. Uh, but also you wanna be, like, shipping useful things. And so we always try to, like, keep a balance there. You know, we try to take, like, a portfolio approach. You know, we're always working on multiple projects and, and we're always trying to work on, you know, maintaining things that have already shipped, like, shipping new things that are, like, eminently working well and make them really good. And, and then we wanna always have a few projects that are a little bit crazy. Um,
[00:04:23] Alessio: and what are the AGI-pilled projects that you have today? I'm curious about, uh, you don't have to share exactly what you're working on, but I'm curious what are things today that maybe in 18 months people will be like, oh, obviously this was gonna work.
[00:04:35] Sarah Sachs: 18 months.
[00:04:37] Alessio: Yeah, 18 months is, you know,
[00:04:37] Sarah Sachs: it's a long time and, yeah. Yeah.
[00:04:39] Simon Last: I mean, there's a number of things happening. I think one thing that's becoming more clear is, I think, like, uh, coding agents are the kernel of AGI, sort of, everything is a coding agent. Mm-hmm. I think that's, that's sort of one, one direction. Um, and then, yeah, the exciting thing about that is sort of your agent can sort of bootstrap its own software and capabilities and actually debug and maintain them. And so yeah, we're, we're, we're thinking a lot about that.
And then, yeah, like, another category of things that I'm, I'm really excited about is, like, uh, what we call the software factory. Also, people are using this, uh, this sort of word. Um, basically it just means can you create sort of, like, an as-automated-as-possible workflow for developing, debugging, mm-hmm, merging, reviewing, and maintaining a code base and a service where there's a bunch of agents working together inside, and, like, how does that work?
[00:05:28] Sarah Sachs: If you think back to your initial question, like, why did this take so long? I think something,
[00:05:32] swyx: I didn't say that, but yes. Okay. Go ahead.
[00:05:34] Sarah Sachs: Why, what, what changed over the three and a half years of trying
[00:05:37] swyx: it? Exactly. Right. Because most people always say, like, it didn't work yet. Then reasoning models came, then it worked. I was like, okay, let's go a little
[00:05:43] Sarah Sachs: bit. That's, I mean, that's part of it, but I think the other part of it that I actually think is really what will set Notion apart for every new capability is we have, like, two skills that are crucial when it comes to frontier capabilities. One is not letting yourself swim upstream. So, like, quickly realizing if you're just pressing against model capabilities versus not exposing the model to the right information, not having the right infrastructure set up. That in and of itself is the skill of intuition. And the second is to see, okay, you're not swimming upstream. Which direction is the river flowing and what is, like, how do we think ahead about the product and start building it even if it's not great yet, so that when it is there, we're ready for it. Right? And, like, those can sometimes feel like counterintuitive things. Like, we can be trying to fine-tune a tool calling model when they don't exist yet. And the trick is to not do that for too long, but realize that there was something there.
And we've had a lot of things which, like, um, we're just, like, not swimming in the right direction with the streams. I think we had multiple versions of transcription before we got Meeting Notes, right? Oh, I gotta talk
[00:06:39] swyx: about that. Yeah.
[00:06:40] Sarah Sachs: Yeah. Um, and so, I, I, I think that, like, we really closely partner with the frontier labs on capabilities, and we also have to have strong conviction on, as those capabilities move, Notion is about being the best place for you to collaborate and do your work. And how does that narrative change if the way that we work changes? Yeah.
[00:06:58] swyx: Yeah. You told me you were a fan of the Agent Lab thesis, and this is, this is kind of it, right?
[00:07:02] Sarah Sachs: Right. I show that thesis to so many candidates. Like, I have it as, like, my Chrome autofill. Um, at this point, like, it's one of my most visited
[00:07:10] swyx: because, like, is this the, here's why you should work at Notion and not OpenAI. I, it's like,
[00:07:14] Sarah Sachs: here's, here's what's different about it.
[00:07:16] swyx: Yeah.
[00:07:16] Sarah Sachs: And here's why it's not just a wrapper. I actually think more and more people understand it's not just a wrapper.
[00:07:21] swyx: Yeah.
[00:07:22] Sarah Sachs: Um, and by the way, like, in the beginning, parts of what we build are wrappers on functionality that works well, of course, but that's not really the most, um, I would say that's not the product that, that drives revenue. And that's not necessarily always what users need.
[00:07:35] swyx: I mean, you know, Notion is the AWS wrapper, but, like, the wrapper is very beautiful and, like, very, very well polished. So
[00:07:40] Sarah Sachs: like the analogy,
[00:07:41] swyx: like
[00:07:42] Sarah Sachs: the analogy that I've been coming back to is Datadog and AWS.
[00:07:45] swyx: Yeah.
[00:07:46] Sarah Sachs: So, uh, Datadog could not exist with, without cloud storage. Right.
That it's kind of fundamental that that works. Um, and AWS has, like, a CloudWatch product, but Datadog is an expert on understanding how people want observability on the products they launch. And we're experts in understanding how people wanna collaborate, and that's really where our expertise lies.
[00:08:04] swyx: Totally.
[00:08:04] Sarah Sachs: Um, regardless of the tools that we use,
[00:08:07] Alessio: I'm kind of curious how you think about implicit versus explicit expertise. I feel like Datadog is half and half implicit and explicit. It's like they understand across markets and industries what engineering teams usually look for. With Notion, it's almost like more of the expertise is at the edge, because you as a platform, you're, like, so horizontal that the end user is not really the same. Mm-hmm. Like, with Datadog, the end user is always, like, yeah, an engineering lead, a kinda, like, SRE-related person. With Notion, it can be anything. So I'm curious how you put that expertise into a product versus, you know, obviously AWS cannot build Notion. It's, that doesn't quite work in this case, but
[00:08:44] Simon Last: it's, it's a little bit differently shaped. I think, you know, a classic vertical SaaS, like, Datadog is kind of like that. They understand their individual customer very deeply. It's kinda a narrow slice. Um, Notion has always been super horizontal. And our, our task has always been to sort of balance these two somewhat opposing forces of, like, we're listening to our customers and what they want us to build. It's a broad slice. And then also we're thinking about, like, okay, how do we decompose what they want into, uh, nice primitives that are, that are really nice to use and will get us, like, as much bang for the buck as possible. And then, you know, maintain the whole system, make it all, like, super clean and nice to use.
[00:09:22] Sarah Sachs: We still have user journeys. I mean, we still focus on, like, core.
I actually think the failure of our team is when we focus too much on what are cools that are, what are tools that are
[00:09:31] Simon Last: mm-hmm.
[00:09:31] Sarah Sachs: Cool tools. I actually think that's when we have the least velocity, because you still need some sort of focus on a user journey. So, like, for instance, we'll all sit down every Friday and look at the P99 of, like, the most token-exhaustive custom agent transcript and just look at why it didn't do well and cut a bunch of tasks. Like, we still focus on, like, this should work. Email triaging should work. Mm-hmm. Right. And similarly, like, when we were chatting, um, before we started filming, about, okay, how can I do PDF export? Well, that's functionality that then merits, maybe we should build a tool that has access to a computer sandbox and a file system and the ability to write code. Right? Right. Um, but it's because we're thinking about the fact that our users, to do their daily work, need to export PDFs, not because we're like, hmm, I think a computer tool could be cool. Like, let's just see what happens. Mm-hmm. Like, we, we have to focus on some user journeys, otherwise we just don't have, like, enough strategy to, to prioritize.
[00:10:29] swyx: I think there's a lot of, like, really strong opinions that you've had. Do you have, like, sort of, like, a Tao of Sarah Sachs? Like, you know, like, what, how do you run your team? Like, I feel like you just have accumulated all these strong opinions. Obviously part, part of this is your, your token town thing.
[00:10:43] Sarah Sachs: I think the Tao of working with Sarah Sachs is, um, you'd have to, it depends who you ask. Um, I think it depends if you're on my team or a partner, right, or a vendor.
[00:10:54] swyx: Yeah. Other people want to run their teams the way that you're, yeah. You're, like, bringing these things.
And then also similarly, uh, Simon, when you did the custom agents demo, you had, like, well, we've been using custom agents and here's the super long list of everything that we do. No humans ever read it. Right? That's what you said. I was like,
[00:11:07] Sarah Sachs: yeah. So I think for, for me, um, something that I learned very quickly and became very comfortable with was that my job was not to be the ideas person or the technical expert. My job was to make it so that everybody understood the objective, had a resource to help prioritize what they should work on, and had an avenue to prioritize what they thought was important. And I think that's true with all, all leadership, but I think especially on the AI team. Almost all of our best ideas come from prototypes, from people that have a cool idea because they saw a user problem, and it's a huge disservice if all of those ideas have to pass, like, the sniff test of what me and a product partner or Simon and Ivan decided were the direction, right? Because a lot of what we're doing is leaning into capabilities. So I think that's the first thing, is, like, I don't really view, like, the role of engineering leadership as, like, uh, hierarchical, nor has it ever been, but especially now, like, very willing to change direction based on, um, like, proof is in the pudding. Yeah. And, like, and I think we have rebuilt our harness three or four times. And when you do that, then the second rule of engineering leadership is, like, you need to build a team that's comfortable deleting their own code and is very low ego and is driven by what's best for the company. And, um, doesn't write design docs because they think it's their promotion packet. Right. And that's a culture that Notion had long before I joined, but, like, our willingness to just swarm on different problems and, um, redo things that we've built before because something has changed. Like, there's a lot of friction that can happen at companies when you do that.
And it doesn't happen at Notion. And because it doesn't happen, when new people join, like, they don't wanna be the ones that are saying, we shouldn't do this, I wrote that code. So then it's, you know, you, you create a culture that everyone adopts, and that culture comes directly, I think, from Simon and Ivan though, um, because they're very open-minded.
[00:12:50] swyx: Anything that you,
[00:12:50] Simon Last: you'd add? I'm not a manager, like, like, like Sarah is. Um, a lot of my role is really to try to think a little bit ahead, make sure that we're, we're building on the right capabilities, and then, like, the prototyping stuff. And yeah, it's really, really critical to always just be starting again. It's like, okay, this is a new thing. What does this mean? What if we just rethought everything or rewrote everything? And so I, I'm, I'm basically just doing that in a loop every six months.
[00:13:16] swyx: Yeah. Do you believe in internal hackathons for this stuff?
[00:13:19] Sarah Sachs: I think there's, like, two different versions. So one is, like, we just have a, a, a solid bench of senior engineers that come and go on what we call the Simon Vortex and productionizing what we built, right? Because when you're in the Simon Vortex, the velocity is super high. The direction changes daily, and it's meant to be like the equivalent of a Skunk Works lab. We don't need to do hackathons for that. We need to have senior engineers that we trust to come in and out of those projects. For instance, like, management boundaries are really loose. Like, you report to him, but you work for her right now. Yeah. That's something that when we hire managers, it's important they don't care about, because we tend to form more structures. Yeah. Don't be too
[00:13:54] swyx: territorial.
[00:13:55] Sarah Sachs: We form more. It's after we ship things, not, not before, just historically.
Um, the second thing is we do have companywide hackathons. Actually, we just had our demo day this morning for the hackathon we had last week. That's more for people that aren't directly working on the project, feeling like they have the time to pause and learn how to make themselves more productive or how they would use Notion custom agents to build something. Or part of the hackathon was actually encouraging everyone across the company to build their own agentic tool-calling loop from scratch. Follow, like, an Every blog post on how to do it, I think, because we want
[00:14:26] swyx: just with the compound engineering one. Yeah.
[00:14:28] Sarah Sachs: We want everyone to use Claude Code in the company, or whatever coding agent they please, and understand that fundamental. So we set aside a day and a half where all leadership encouraged everyone on their teams across the company to do it. So we have hackathons like that. I would say, like, kind of facetiously, like, everything we build is a little bit like a hackathon until it graduates and puts on big boy pants and has a product ops rollout leader and has assigned data scientists and stuff like that,
[00:14:54] swyx: security review, enterprise stuff,
[00:14:56] Sarah Sachs: actually, security review is one of the things that we bring in first, because it just slows us down way more and, um, causes a lot of tension, and they build better product if they're involved early. So, um, that is probably the first person to get involved in something that's the
[00:15:09] swyx: right PR-approved answer.
[00:15:10] Sarah Sachs: No, but it's not just PR approved. It, like, um, um, it's
[00:15:13] swyx: actually real. It's actually real. It's like, um, I'm just saying scar
[00:15:15] Sarah Sachs: tissue.
[00:15:15] swyx: Yeah,
[00:15:16] Sarah Sachs: because, like, you know, my background's also, I worked at Robinhood for a number of years. Yes.
So, like, uh, compliance and things like that, um, are a little bit more, you learn the hard way when it doesn't come naturally.
[00:15:26] Simon Last: Yeah. I think the, the hackathon is really important for uplifting the general population, but, like, if that's the only way you can build new things, you're kind of toast. I mean, it, it has to be, like, the daily processes, like, you know, building these new things. Um, and it has to be about, I think, like, I think in the AI era a lot more leverage accumulates to the most curious and excited people. And so it's like we're all about just, like, activating that energy. You know, like, if someone's prototyping something on the weekend that they're excited about and it's important, that should be the main thing that we're doing. Yeah. Um, it's not a hackathon that we schedule once a quarter, it's just, like, yeah, daily process. Part of the culture.
[00:16:02] Sarah Sachs: I mean, that's how we shipped image generation in Notion now. It was always this thing that would be kind of nice to have, but it wasn't really clear where that was necessarily aligned in product priorities. It'd be a lot of work. And we had someone on the database collections team, Jimmy, who was like, I really wanna do image generation for cover photos and inside Notion. And we're like, if you wanna build it, like, it's, do it, please. Like, we encourage you. We gave 'em all the resources of working directly with Gemini and being able to, like, track the token usage and it working through endpoints. We gave them eval support, everything, and then it became a, a full project.
[00:16:34] Alessio: Yeah.
[00:16:35] Sarah Sachs: That's why you can't have, like, ego as a, a leader. Like, that's, that's how we work.
[00:16:39] Alessio: What's the size of the team today, both engineering and overall?
[00:16:43] Sarah Sachs: I manage, uh, the team, that's what we'll call it, core AI capabilities and infrastructure. That's about 50 people.
But then we have peer partner teams that do packaging. So how it shows up in the corner chat versus custom agents versus meeting notes, that's another 30, 40 people. And, and then every team that has a product surface at Notion that a user can interface with owns the tool that the agent interfaces with. The editor team, the team that did CRDT for offline mode, is the same team that handles how two agents, um, edit competing blocks. Mm-hmm. Right? It's the same problem. The team that built the underlying SQL engine is the same team that owns how the agent asks it to run a SQL query, and it does it performantly. And so from that regard, anyone working on product engineering is tasked with making them work for customers that are humans and agents, because over time the majority of our traffic will be coming from agents using our interface, not humans. And so our objective is to make it so that the whole product org is building for agents.
[00:17:40] Alessio: Yeah. How has it changed internally? The activation bar is kind of lowered a lot. Like, anybody can kind of create a prototype very, somewhat easily, especially if you're in, like, an existing code base. Have you raised the bar on, like, what type of prototype people need to bring forward to be taken, not, like, seriously, but, like, you know what I
[00:17:58] Simon Last: mean? Yeah. I think the bar is lowered in many ways. Like, one thing our, uh, our team built that is really cool is our, uh, our, our design team made a whole separate GitHub repo, uh, called the, the design playground. And it's basically just to create a bunch of, like, helper components, uh, for, for quickly throwing together UIs. And it's become, like, actually quite sophisticated. Like, it has, like, an agent in there and, like, uh, that's pretty fun. So, like, we pretty much, like, they don't do mocks, they just make, like, full, full prototypes.
[00:18:27] swyx: Here it is. It works.
[00:18:28] Simon Last: They give you, like, a URL.
They're like, okay, all right. So we have to make the, like, the real production version of that. Um, and then for engineers, a prototype looks like just making it a feature flag that actually works. Like, that's sort of the bar.
[00:18:39] Sarah Sachs: Something to understand that's really unique about Notion, one of the reasons I joined, we're super lucky, is no one uses Notion in their job as much as people that work at Notion.
[00:18:46] Simon Last: Of course.
[00:18:47] Sarah Sachs: So I think there's very few companies, maybe if you worked on Chrome, I guess, but, like, everything that we ship, we ship internally first and get a lot of really quick feedback. And also sometimes our dev instance is totally borked and you have to change a bunch of flags to get things done. And that's kind of like, but everyone, so people that do IT ticketing, people that do supply chain procurement, recruiting, everyone is using the same instance of Notion with, like, a lot of flags on for these prototypes people build. Um, and so we have this, Brian Levin, one of the designers on our team, I think evangelized this concept of demos over memos.
[00:19:18] swyx: Ooh, too
[00:19:20] Sarah Sachs: good. Um, which has been, uh, very good for building demos, and I think it's put a big pressure point on us to have really strong product conviction, because if anything can be demoed, you really need a strong filter of making sure that if, you know, you're doing X amount of work, you're making the, you're, you're focusing on one tower, you're not just building a really flat hill. Right. That's actually where I think there has to be more conviction from our PMs, um, and our designers and, and well, the company really, to have conviction of what journey we're going on.
[00:19:52] Simon Last: But overall, I feel like it works pretty well.
Like, people, almost all the engineers have good enough taste to realize that, like, this prototype doesn't actually make sense in the product, or, or it does. So it's not that common that I would see a prototype, it's like, oh, this makes no sense. Mm-hmm. It's like, you know, people are doing reasonable things and, and, and then it's just a matter of which things we build first, and then often just, just figuring out how to turn it on and off. There's, in our, like, experimental chat UI, there's this, there's probably, like, a hundred check boxes in there.
[00:20:22] Sarah Sachs: Kills me
[00:20:23] Simon Last: the things you could turn on and off.
[00:20:25] Sarah Sachs: Uh, but I think that, okay, so that is kind of true, Simon, but, like, being the person that manages the evals team, like, there is a level of intensity that it adds to the platform team. So, you know, if we're gonna do image generation in Notion, all of a sudden the way that we do attachments and the way that we, um, our LLM completion, like, cortex talks and expects tokens back and now it's getting images back. Like, there's a lot of platform work that we do need to, like, solidify a little bit. So sometimes it'll be in dev for a couple weeks before it makes it to prod, just because we still have to, like, make it robust, make it HIPAA compliant, ZDR compliant, figure out the right contracting with the vendor, whatever it is. And we need to eval it, because we want the team to still maintain what they build. That's the one thing, is, like, if we have a bunch of prototypes, it can't just be, like, a small group of people that then maintain whatever N prototypes. So we have invested a lot of people in an eval and model behavior understanding team that, we call it agent dev velocity. So your dev velocity building agents can be faster if we invest in that platform.
And so we have a whole org dedicated to agent, um, platform velocity so that you can build your own eval and then maintain it once you ship it. So if a new model release comes out and we, every
[00:21:38] swyx: team maintains their own eval,
[00:21:40] Sarah Sachs: we maintain the eval framework. Every team owns their own evals, and a lot of them we've integrated to opt in to CI, or we run them nightly and we have a team, uh, a custom agent that triggers to a team to look at the major failures. That's really critical, because if we have, like, all these different surfaces now, a lot of it's on the same agent harness, so it's easier to maintain. It's just packaging of different agent harnesses, but new functionality of the agent. Let's say that, like, we wanna update, like, uh, you know, they deprecated Sonnet, um, 4 or whatever it is, and we need to auto-update. Are
[00:22:11] swyx: they already? That's so, okay. Yeah. Actually wasn't that long ago.
[00:22:14] Alessio: They were just 3.5.
[00:22:15] Sarah Sachs: 3.5, 3.7 just got deprecated.
[00:22:18] swyx: 3.7, 5.2 or, yeah. No,
[00:22:20] Sarah Sachs: it's not. 5.2 is five point, five point, no. Yeah, 5.4 is 40% more expensive than 5.2. So if they deprecated 5.2, you would hear, they can, you would hear from me about that one. Um, but, uh, another conversation to have.
[00:22:35] swyx: I have a cheeky evals question for you. Have you noticed any secret degradation from any of the major model providers?
[00:22:40] Sarah Sachs: Secret degradation,
[00:22:42] swyx: like, during the workday, when it's high traffic, it suddenly gets dumber.
[00:22:47] Sarah Sachs: Yeah. I mean, not just between the, I mean, we definitely notice flakiness, we've definitely noticed, particularly for some providers, that things are slower during working hours and
[00:22:57] swyx: there's a latency argument. Yes. Not a quality argument.
[00:22:59] Sarah Sachs: No.
I think the quality difference that's interesting is, um, even though companies that say they're selling the same, it's really into, like, quantization, but, like, companies that say they're selling the same model through different vendors, whether it be through first party or Bedrock, Azure, et cetera, we do see different qualities sometimes, and that's not necessarily what's advertised.
[00:23:21] swyx: Yeah. Kimi went to the point of, like, they shipped, like, this, like, eval across all the providers and it was, like, very obvious who was secretly quantizing and it was very,
[00:23:28] Sarah Sachs: yeah. But
[00:23:29] swyx: that's very embarrassing.
[00:23:30] Sarah Sachs: You know, um, we hire Subprocess to figure that out for us. So we just wanna understand where it's regressing or where it's optimized. And sometimes we're okay with regressions that optimize latency if they're the appropriate regressions. Our job is to make sure we have the evals to understand the changes that are important to us. And even, like, when we're partnering with labs on pre-releases of models, they'll send us multiple snapshots. And this is less about quantization, but more just regressions. Like, they have shipped models that were not the snapshots that we wanted, and they have changed the snapshots that they shipped based on the feedback that we give, because our feedback tends to be more enterprise-work focused and not coding-agent focused. And definitely those can be bummers, like, you know, uh, we know that this wasn't the version you wanted, but we'll help you make it work. I mean, we always make it work, but that definitely happens.
[00:24:16] Alessio: Yeah. Do you have, um, failing evals that you're just hoping, oh, that will have success eventually when a good model comes out?
[00:24:23] Sarah Sachs: Uh, I mean, yeah. So I think, I mean, I could talk about this for 60 minutes, so I will limit myself.
I think it's a real issue when people say evals and it's just like, that's quality, that's, like, unit, I mean, it's like saying testing. It's not just unit tests, right? So we have the equivalent of unit tests, regression tests. Those live in CI, those have to pass a certain percent, you know, within some stochastic error rate. Then we have, as you're building a product, evals of, these aren't passing right now, and this is launch quality. So we have a report card and we need to, on these categories, you know, be at 80 or 90% of all of these user journeys to launch. And then we have what we call frontier or headroom evals, where we actively wanna be at 30% pass rate. And that's actually been an effort that we took in partnership with Anthropic and OpenAI in the past maybe two or three months, because we actually hit a point where our evals were saturated and we weren't able to really give insightful feedback other than it wasn't worse. And not only is that not helpful for our partners, it's not helpful for us to understand where the stream is going, you know, going back to that analogy. And so we spent a lot of time thinking about what Notion's Last Exam looks like, right? Mm-hmm. Not just Humanity's Last Exam. Ooh, Notion's Last Exam. Mm-hmm. And, um, there's a lot of, you know, dreams about what that would look like. I know we've talked a lot about benchmarking, um, swyx, but, uh, yeah. Notion's Last Exam is a big thing inside the company and we have people staffed to it full-time, exclusively. Mm. We have a data scientist, a model behavior engineer, and a full-time, um, evals engineer just dedicated to the evals that we pass 30% of the time.
[00:25:56] swyx: What you're hiring for
[00:25:57] Sarah Sachs: MBEs? I am hiring
[00:25:58] swyx: What is an MBE?
[00:25:59] Sarah Sachs: A Model Behavior Engineer. Model
behavior engineers started with the title "data specialist" before I joined, when they were working with Simon on, like, uh, Google Sheets, and Simon just needed someone to look through Google Sheets and say, yes, no, this looks bad, this looks good. Right? And so we hired people with kind of diverse linguistics backgrounds. We had, like, a linguistics PhD dropout. Mm-hmm. And a Stanford new grad. And they're amazing. And they formed a new function, basically. And over time we've built a whole team, um, with a manager who's now kind of reinventing what that role is with coding agents. So they used to be kind of manually inspecting code. Now they're primarily building agents that can write evals for themselves, or LLM judges. There's a really funny day, I can send you the picture, where Simon, about a year and a half ago, was teaching them how to use GitHub. Um, and they're at the whiteboard and it was like, okay, I think it would be so much faster if our data specialists learned how to use GitHub and, like, learned how to commit these things in the code. And that was then, and now I think, you know, coding has become a lot more accessible. Um, but moving forward it's this mix of, like, data scientist, PM and prompt engineer, because there's craft in understanding even, like, what models can and can't do. How do we define that headroom? How do we define what a good journey is? Um, is this model better or not? Why is this failing? There's some qualitative work, but then there's also a lot of instinct and taste to it, and that's not necessarily software engineering. And so we have very firm conviction, and we have had for a number of years now, that that is its own career path, and we have always welcomed the misfits, so to speak. So we really firmly believe that you don't need an engineering background to be the best at this job.
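The three eval tiers described earlier in this answer (CI regression tests that must pass a certain percent within a stochastic error rate, a launch report card at 80 or 90%, and frontier or headroom evals held near a 30% pass rate) can be sketched as a small gate. This is an illustrative sketch only, not Notion's system: the names `EvalTier` and `check`, the 0.95 regression threshold, the 0.02 tolerance, and the 0.50 saturation ceiling are all assumptions rather than numbers from the conversation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EvalTier:
    name: str
    min_pass_rate: float                     # fraction of cases that must pass
    saturation_rate: Optional[float] = None  # above this, the tier stops giving signal


# Hypothetical thresholds; only "80 or 90%" and "30%" appear in the conversation.
REGRESSION = EvalTier("regression", min_pass_rate=0.95)
LAUNCH = EvalTier("launch", min_pass_rate=0.80)
FRONTIER = EvalTier("frontier", min_pass_rate=0.0, saturation_rate=0.50)


def pass_rate(results: Sequence[bool]) -> float:
    """Fraction of eval cases that passed."""
    if not results:
        raise ValueError("no eval results")
    return sum(results) / len(results)


def check(tier: EvalTier, results: Sequence[bool], tolerance: float = 0.02) -> str:
    """Classify one eval run as 'failing', 'saturated', or 'ok'.

    `tolerance` is a crude allowance for run-to-run stochastic noise.
    """
    rate = pass_rate(results)
    if rate + tolerance < tier.min_pass_rate:
        return "failing"
    if tier.saturation_rate is not None and rate > tier.saturation_rate:
        return "saturated"
    return "ok"
```

Under these assumed thresholds, a frontier suite that starts passing most of its cases is reported as saturated rather than as a success, which is the signal that harder evals are needed.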
And that's what's quite unique about this particular role.

[00:27:37] Simon Last: Yeah, this is something that I've been pretty excited about recently: we made an effort basically to treat the eval system as, like, an agent harness. So if you think about it, like, you know, you should be able to have an agent end-to-end download a dataset, run an eval, iterate on a failure, debug, and then implement a fix. And ultimately you should be able to, you know, drive the full process with a human sort of observing the, you know, the outer, uh, system. So yeah, we went pretty hard on that. And that's worked extremely well so far. It's basically just to turn it into a coding agent, uh, problem.

[00:28:11] swyx: Your coding agent or just whatever harness?

[00:28:13] Simon Last: No, any coding agent. Yeah, Codex, Claude Code. It should be totally general. Yeah. I think it would be a mistake to, like, fix it on any particular coding agent. At the end of the day, it's just, like, CLI tools.

[00:28:21] Sarah Sachs: It's like the same way that you would have a coding agent write the unit test. You should have a coding agent write the eval.

[00:28:26] swyx: Yeah.

[00:28:26] Sarah Sachs: But there's a lot of supervision in that still. We just don't believe that supervision has to come from software engineers, because a lot of it is, like, um, kind of UXR-y and whatever, and these are the people that also triage failures and tell us where we should be investing next.

[00:28:40] swyx: Yeah. I'm gonna go ahead and ask a spicy question. Is there a date where there are no software engineers at Notion?

[00:28:46] Simon Last: Um,

[00:28:46] Sarah Sachs: What does it mean to be a software engineer?

[00:28:47] swyx: Exactly.

[00:28:48] Simon Last: I mean, I think the way things are going is, like, we're on some continuum.
If you look back three years ago, humans were typing all the code, and then we had autocomplete, so you're typing less of the code. Then we had sort of, like, agents filling in lines, and now we're getting into, like, agents doing longer range tasks where you can debug and implement a fix and then verify it works and, you know, get your PR even, like, merged and deployed. I think we're sort of just moving up the abstraction ladder, and then the human role becomes more about observing and maintaining the outer system. There's a stream of agents flowing through, like, merging PRs. What's going off the rails? Like, what do I need to approve? Is there, like, a learning or memory mechanism that works? So it's kind of a hard engineering problem. There's, you know, there's a lot to do there. I think we're just sort of moving up the stack.

[00:29:34] Sarah Sachs: The same transition machine learning engineers have made, right? Like, I haven't looked at a PR curve in a while.

[00:29:39] swyx: Yeah. You used to do this stuff and now, um, auto research can do it,

[00:29:42] Sarah Sachs: right? Like, I think it depends on what you define as a software engineer.

[00:29:46] swyx: Yes. That's changing for sure.

[00:29:49] Sarah Sachs: I think every software engineer at Notion this summer went through, like, this, um, sheer, um, one of our engineering leads at the company called it, like, every software engineer is going through the, uh, identity crisis that every manager goes through, where all of a sudden they realize their ability to write code is less important than their ability to delegate and context switch. And I think that is a transition out of being a software engineer. But

[00:30:12] Simon Last: yeah. Yeah, there's a critical difference to being a manager, which is that, like, it is actually very deeply technical.
The problem, you know, humans are very, like, fuzzy, and you can't treat a team of humans like a rigorous system where, you know, PRs flow through and can be in, like, a blocked status, and then what happens when they're blocked, right? With a set of agents, you actually can do that. And I think there's a lot of interesting technical rigor that goes into that. It's a technical design problem, ultimately.

[00:30:42] Alessio: What is the design of the software factory that you're building?

[00:30:46] Simon Last: Yeah, I mean, I think we're trying a lot of different things. I mean, ultimately you want to design a system that requires as little human intervention as possible, but, like, still maintaining the invariants that you care about. So yeah, we're exploring a lot of different ideas there. I mean, I think I could talk about a few things I think are important there. Like, one thing I think is really important is, um, having some kind of, like, specification layer. You can just commit markdown files. Mm-hmm. That works pretty well, but

[00:31:15] swyx: it's nice to be Notion, man. I'm just saying, like, the spec, like, yeah. The natural home for specs is Notion.

[00:31:21] Simon Last: Yeah. Right. It can be a database of pages. Yeah. I mean, it needs to be something that is, you know, human-readable and viewable, and I think that's pretty key. Another really key component is, like, the self-verification loop. Yes. You need really, really good testing layers, basically. And that's a really deep, uh, problem. But by getting that right, you know, and then it's kinda like the workflow of, like, what happens when there's a bug? How does it flow into the system? Like, is it a subagent working on it? How does it make a PR, and how does that get reviewed and merged? And then, you know, so there's, like, the flow or process.

[00:31:56] swyx: Yeah. Cool.
Uh, you know, one thing we did work out before you guys came in was this demo or this

[00:32:01] Simon Last: agents

[00:32:02] swyx: agent demo. Uh,

[00:32:03] Simon Last: so every,

[00:32:04] Alessio: every time we do an episode, we try the product. Right. I don't think there's ever been an episode where I haven't tried it. Yeah. Um,

[00:32:11] swyx: and we, "try" is a big word. Like, since day one Latent Space has been on Notion, but this is the net new thing. Yes.

[00:32:18] Alessio: So this is for Kernel Labs, which is the space we're in. So next week we're opening applications for tenants. So there's a web form, let me, we got this form done here. Uh, so, uh, before, the workflow would be: I get an email, then I look at the person. It was like, should I spend time talking to this person? Then I respond, they respond back. So I built this. So the name, it came up with on its own. Can you maybe, how does it come up with its own name?

[00:32:43] Simon Last: Yeah, that's a pretty apt name. It is just a random, it's a random name generator.

[00:32:47] Alessio: Oh, that's funny. It just came,

[00:32:49] Simon Last: the fact that it picked that is kind of hilarious. I'm pretty sure it's just deterministic,

[00:32:54] Sarah Sachs: Resilient Collector. I think I've never looked at the code for that. I've never second guessed it. I think it's kind of like a Mad Libs situation.

[00:33:00] Simon Last: Yeah, I think you're right. Yeah. It's totally deterministic. Oh, I thought it was great. Yes. Although when the, if you use the AI to set itself up, it can update its own name, so. Okay. Um,

[00:33:11] Sarah Sachs: how did you create it? Did you just do

[00:33:12] Alessio: classroom? I,

[00:33:13] Sarah Sachs: okay.

[00:33:13] Alessio: I did, yeah. I'll say, just check my inbox for applications for a coworking space. Keep a people, so it created the database for me. Which I have here.
And I guess database is like a Notion table, because everything is Notion. Um, and then whenever, um, an email comes in, like here, it just creates a new row for the person. Mm-hmm. And then it uses web search to enrich the, mm-hmm, the profile. So it kind of, like, searches the web and it's like, this is who this person is, this is when they say they wanna move in, and kind of updates everything else. This is, I mean, it's not AGI, but to me, I don't wanna do this work. So it feels like, I mean, it took me maybe, like, 15 minutes to set up the whole thing. Um, and I really like that most of the information should live here. You know, it is not like some other tool asking me

[00:34:01] Sarah Sachs: Yeah.

[00:34:01] Alessio: to, like, bring my stuff there. It's like, I would've probably already created a Notion thing.

[00:34:06] Sarah Sachs: Mm-hmm.

[00:34:06] Alessio: So

[00:34:07] Sarah Sachs: most of our biggest use cases and gains are from that extra layer of human involvement in the process to make it so, right? And so, like, one of our biggest use cases is bug triaging. So if someone posts something in Slack, can you just have a custom agent that lives there, that has its own routing constitution of what team this belongs to, creates a task in your task database and then posts in that Slack channel, right? Like, that's one of the first things that we built internally, I think. And it's completely changed the way that Notion functions as a company. Nothing falls through, well, most things don't fall through the cracks. We don't know what we don't know. But it's not replacing people, it's replacing processes.

[00:34:44] Alessio: Yeah.

[00:34:44] Sarah Sachs: Right.

[00:34:45] Alessio: And I'm curious how you think about composability of these things. So the other one I was working on is, like, a lease filler. So whenever somebody signs up as a tenant, it'll kind of fill the lease for them. There should probably be some agent that is, like, an office manager agent, mm-hmm,
that can handle the request, make the lease, and then, uh, give them access to the office and all of that. How do you think about that feature?

[00:35:08] Simon Last: Yeah, so I mean, there's two ways you can compose. One way is by using, like, the data primitives. So you can, you know, you could have one agent, uh, be writing to the database, and there's another agent that's watching the database. So that's one way that they can coordinate that's, like, a little bit more decoupled and, mm-hmm, works really well. Or you can couple them. So I think it's actually not released yet, we're releasing it, like, next week: in the settings for an agent, you can give access to invoke any other agent.

[00:35:34] swyx: Hmm.

[00:35:34] Simon Last: So you can have them just, uh, talk directly. So

[00:35:37] swyx: is there a limit on, like, number of recursions or just,

[00:35:40] Simon Last: um, probably,

[00:35:42] swyx: you know what I mean? Like, you can just get an infinite loop that way. There's

[00:35:45] Simon Last: some kind of, yeah,

[00:35:46] Sarah Sachs: I think there is actually a number somewhere.

[00:35:49] swyx: I believe, I'm just, you know, like, someone's gonna screw up. You

[00:35:51] Simon Last: should, you should try and see.

[00:35:53] swyx: Yeah. I mean, everything's gonna be paperclips.

[00:35:55] Simon Last: Oh, yeah. Yeah. But, uh, but that's really useful. Yeah. So, you know, like, I helped, uh, someone internally the other day. They had built, like, over 30 custom agents for, uh, our go-to-market team, doing all kinds of different things. You know, for example, like, researching, you know, filling in information about a customer, or triaging customer feedback, or, uh, something like that. Literally over 30 of them.
And then he even made, like, a database of all the agents, and then he is like, okay, and now I'm getting over 70 notifications per day, with just, the agents are blocked on various things. Uh, and then I was like, oh, okay, cool. You know, the obvious thing to do there is to make a manager agent,

[00:36:32] Sarah Sachs: right?

[00:36:33] Simon Last: That's gonna sort of be another abstraction layer in between you and your, uh, 30 agents. Uh, so yeah, we set it up with, like, a manager agent that has access to invoke all the other agents, and it's sort of, like, watching and observing them, and then it just creates a layer of abstraction. So instead of 70 notifications per day, it's, like, five. And then the manager agent can help, like, uh, debug and fix any problems with the,

[00:36:54] swyx: is there a concept of, like, an inbox or something? Like, you're basically saying that they can message each other?

[00:37:00] Simon Last: Yeah.

[00:37:01] Sarah Sachs: Well

[00:37:01] swyx: they use the system of record, which is

[00:37:02] Sarah Sachs: Notion, so we

[00:37:03] Simon Last: actually, yeah, we didn't make any special concepts at all.

[00:37:06] swyx: They're interested in the Notion notifications that I would've got,

[00:37:09] Sarah Sachs: they can just, like, write a task to a database that the other agent is tasked with listening to, or they can actually call a webhook to the agent, like, they can just @ the agent. Okay.

[00:37:17] Simon Last: Yeah, I mean, this is something that we're still working on. I think, you know, generally the way we do these things is, you know, you first make it possible, maybe in, like, a sort of janky way. So I think the way I set 'em up is, like, you know, we created, like, a new database that was sort of like issues, mm-hmm,
that the custom agents were experiencing, and then gave them all access to file an issue, and then the manager has access to read the issues. Um, and that works pretty well, essentially, like, give it its own internal issue tracker just for the agents. And then, you know, if that becomes a concept that seems useful generally, maybe we will think of how to package it in. But I mean, generally we try to just keep it to composing the primitives if we can. You know, another example of this is we have no built-in memory concept. Memory is just pages and databases. And so if you wanna give it a memory, just give it a page and give it edit access to that page, and the

[00:38:03] swyx: human can edit it. Agent can edit

[00:38:04] Simon Last: it. Yeah. And so that pattern works extremely well. And, you know, depending on the case, you can have it be just a page, or it could be an entire database, or, you know, it can have sub pages. It's pretty open what you can do with that.

[00:38:15] Alessio: So when I was setting this up, uh, I connected my inbox and it was like, do you wanna use Gmail or Notion Mail? And I'm like, I don't wanna use either, I just want you to do it. I'm curious how you think about, you know, Notion Mail, Notion Calendar, all of these kind of UI/UX interfaces, full stack

[00:38:29] Simon Last: Notion.

[00:38:30] Alessio: Yeah. When, like, at the same time you have the agents abstracting them away from you in a way, you know, how do you spend, like, the product calories, so to speak?

[00:38:37] Simon Last: Yeah, I mean, I think it's pretty important that you don't have to use Notion Mail to connect to the mail capability. So we can just connect to Gmail or whatever you want, uh, to use. And we're thinking of the mail service as being really great to the extent that it's really agent built, right?
So maybe the mail app is just sort of a prepackaged agent that helps you automate your inbox.

[00:39:00] Alessio: Yeah, the auto labeling is great.

[00:39:03] Sarah Sachs: I think, when we, um, integrate with Gmail, for instance, we have a series of tools that are available via MCP or API to Gmail. When we integrate with Notion Mail, we have the Notion Mail engineering team to build us the, um, exact right tools that optimize latency, optimize performance and quality. They own that quality. Um, there's product leads there. They're directly thinking about the user problems that happen in mail. So it tends to be, when we build integrations and connections, we build natively first, um, and then think about, um, extending them generally, just because it's also easier, mm-hmm, um, to build natively first. Um, so that tends to be how we phase things out.

[00:39:43] swyx: Talking about integrations, you prompted me, so I gotta ask. MCP, CLI. What's going on? What's the opinion?

[00:39:48] Simon Last: Yeah. I think, I mean, I'm definitely bullish and excited about CLI. I think there's a few really cool things about CLI. So one really cool thing is, um, that it's in the terminal environment, so it gets a bunch of extra power. So, you know, for example, it can, like, paginate and cursor through long outputs. Um, and it has progressive disclosure inherently. Uh, so, you know, you don't see all the tools at once. You just see the CLI wrapper, and you can use the help commands and read files. And then I think the most important thing that's super cool is that it's also inherently, uh, bootstrapped. So if there's an issue, uh, the agent can debug and fix itself within the same environment that it uses the tool in.

[00:40:30] swyx: Mm.

[00:40:30] Simon Last: Right. Like, you know, I think I saw a tweet this morning.
Someone said, you know, my agent didn't have a browser, so I asked it to make a browser tool, and within a hundred lines of code it gave itself a little browser, like, wrapping the Chromium API, um. That's pretty incredible. And then if there was a bug, it would just immediately try to fix it. Mm-hmm. Right. On the other hand, you know, if you use, like, the Chrome DevTools MCP, I've had this issue where, like, sometimes the transport gets messed up. If it gets messed up, the agent has no way to fix itself. It no longer has a browser, it's now broken. Right. I think that's pretty fundamental, but I would say a lot of the bad things about it can be fixed. Uh, so I think, like, as for progressive disclosure, that can be fixed with the right harness. Like, it obviously doesn't make sense to show it all the tools all the time. That's not really inherent to the MCP protocol. It's just how you wrap it and use it.

[00:41:16] swyx: There's many poorly built MCPs because we didn't know.

[00:41:19] Simon Last: Yeah, yeah. I mean, it was just early. Like, the obvious thing, uh, you know, to start with is to just show it all the tools, and it's like, okay, now we have a hundred tools. Yeah. And, like, the tool calling actually works. So let's,

[00:41:28] swyx: of your success.

[00:41:29] Simon Last: give it a way to, like, filter or search the tools. So yeah, I would say, broadly speaking, I'm really bullish on CLI. I'm still bullish on MCPs in a certain environment. I think, in particular, MCP is really great for when you want sort of, like, a narrow, lightweight agent. I think there's definitely a lot of use cases where you don't want, like, a full coding agent with a compute runtime. And also you want it to be more tightly permissioned. MCP inherently has a really strong permission model, like, all you can do is call the tools.
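The point above about not showing the model every tool at once, and instead giving it a way to filter the tool list, could look roughly like the following. This is a minimal sketch under stated assumptions, not Notion's harness and not part of the MCP specification: `Tool`, `ToolRegistry`, the example tool names, and the naive keyword scoring are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


class ToolRegistry:
    """Holds every tool, but only discloses the ones matching a query."""

    def __init__(self, tools: Iterable[Tool]) -> None:
        self._tools = {tool.name: tool for tool in tools}

    def search(self, query: str, limit: int = 5) -> List[Tool]:
        # Score each tool by how many query terms appear in its name or description.
        terms = query.lower().split()
        scored = []
        for tool in self._tools.values():
            text = f"{tool.name} {tool.description}".lower()
            score = sum(term in text for term in terms)
            if score:
                scored.append((score, tool.name, tool))
        # Best score first; ties broken alphabetically so results are stable.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [tool for _, _, tool in scored[:limit]]


# Hypothetical tools, for illustration only.
registry = ToolRegistry([
    Tool("gmail_search", "Search messages in a mailbox"),
    Tool("calendar_create_event", "Create a calendar event"),
    Tool("browser_open", "Open a URL in a browser"),
])
```

With a registry like this, the agent's context would carry a single `search` entry point plus whatever it returns, rather than the full list of a hundred tool definitions.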
A CLI is a little bit murkier. It's like, can it access the API token? Are you, like, properly sort of encrypting the token so it can't exfiltrate it? It introduces a lot of new issues, which are real and hard to solve. And MCP is just, like, the dumb simple thing that works, and it's pretty good.

[00:42:12] Sarah Sachs: I'll add two more perspectives, not from it working well for Notion, but how Notion, like, commits to both platforms. Notion is dedicated to being the best system of record for where people do their enterprise work. So we will always support our MCP insofar as other people are using MCPs, right? So regardless of our perspective, we've put a lot of effort into our MCP, and we have a fantastic team that we're building, um, to do more there. And the second thing I'll say, I think, um, we all think a lot, but lately I've been thinking a lot about making sure there's a value alignment and pricing, um, with capability.

[00:42:43] swyx: Literally our next question.

[00:42:44] Sarah Sachs: And needing a language model to execute deterministic tasks feels wasteful, and relying on a language model to interface with third party providers seems wasteful for tasks that don't require it. And particularly because our custom agents are using usage-based pricing. We think of pricing as, like, the barrier of entry for use of our product, and we're quite committed to making sure that it's not wasteful. Um, not just because it's a bad deal for our customers, but it's also bad business.
We wanna have as many buyers, like, there's an elasticity of demand. And so if we can have our agents properly execute code that calls a CLI deterministically, it's a one-time cost, right? Versus constantly having a language model integrate with an MCP over and over and over, and paying those, like, repeated token fees, and it's happening outside the cache window, then you're paying for it over and over and over, and it's just kind of unnecessary and less deterministic when it doesn't have to be.

[00:43:36] Alessio: Yeah, the open-endedness I think is, like, the main thing. It's like, well, if I go write code to just call an API, I would never use an MCP. But then you need an MCP sometimes when you know what to call, but you don't want it to restart, versus, like, I think the "it built a browser from scratch" is, like, it's great when you're doing it on your own, but if your customers were having your AI write a browser from scratch every time and you had to pay the token cost of that, yeah, you'd be like, no, no. The Chrome DevTools MCP is actually pretty great. Just use that. I'm curious, how do you make that decision? Like, should it be just a straight API call, very narrow? Should it be an MCP? Should it be super open-ended?

[00:44:10] Sarah Sachs: Do you mean for when we ship Notion capabilities or when we add capabilities to

[00:44:13] Alessio: Notion

[00:44:14] Sarah Sachs: AI or,

[00:44:14] Alessio: I mean, you might have a capability that the only way to do is an open-ended agent, like an agent with a coding sandbox.

[00:44:21] Sarah Sachs: Yeah. In Notion AI they're not explicit, not. We also ship an MCP.

[00:44:24] Alessio: Yeah. Yeah. In B,

[00:44:25] Sarah Sachs: yeah.

[00:44:26] Alessio: Internally. Okay. Like, is there ever a discussion of, like, we're not gonna ship it because we're not able to tie it down? Or are you happy to just, like,

[00:44:33] Sarah Sachs: um, no.
I mean, there are a lot of things where we choose not to use MCP because we wanna add more high touch to quality. I think search an agent to find is, like, the largest instance of that, where we have, um, Slack and Linear and Jira search in Notion that is not necessarily using the search MCP functionality that is provided by those companies. And that's because it's quite critical, we think, to how our agent trajectories work for us to have a little bit more control over the functionality of the search journey. And so it usually comes from quality, and there's a long tail of things, and that's why we built an MCP client, or an MCP server, excuse me, so that people can connect whatever they want. There's that long tail, right. But for search particularly, I would say that's, like, the primary entry point, but there are other connections as well where it's a little bit of secret sauce a
Government intimidates tortilla makers and gas station owners over price increases; a revelation: Marcelo Ebrard sent his son to live at the embassy in London, at public expense, while he was head of the SRE; US Treasury sanctions a Morena lawyer, casinos and an activist over ties to drug trafficking.
In this episode of the PurePerformance Podcast, Andi and Brian sit down with Chris LaBrado, Solutions Architect for AI Enablement, FSO, SRE, and ITSM at HSN/QVC, where he has spent an incredible 27 years shaping operational excellence. Their conversation dives deep into how AI is transforming software creation, enterprise workflows, and even the very role of developers.

Chris shares how the barrier to entry for building tools and automation has dropped overnight thanks to natural-language-based development: "Everyone can now create automation or tools without having to worry about the syntax." He explains why AI is rapidly becoming the primary interface into the enterprise, capable of navigating presentations, emails, and complex back-office systems, and why the future of engineering may shift from human-oriented coding to AI-driven development models such as MDCD (MarkDown Continuous Development).

The discussion also takes unexpected but fascinating detours into Chris's background as a former bowling-industry podcaster, his recent work with generative agents like DynaClaude, his Vibe Coded Root Cause Agent, and a philosophical exploration of AI, creativity, and the concept of singularity.

Amidst all the change, Chris remains optimistic: "AI opens up a lot of new opportunity for everyone willing to adapt. It will result in us creating more things that ultimately help us as humans." This episode is a thoughtful, energizing look at where software engineering is headed, and why the future might be brighter than we think.

Links we discussed
Chris LaBrado on LinkedIn: https://www.linkedin.com/in/chrislabrado/
Mo Gawdat, former Google Executive on the Singularity "moment of truth": https://x.com/vitrupo/status/2008824930646057380?s=20
CEO of NVIDIA had an interesting excerpt from interview: https://x.com/MinusWells/status/2031974516155695414?s=20
Elon Musk on speed of AI: https://x.com/r0ck3t23/status/2031639621465931903?s=20
AI brain emulation of a fly (e.g. "a sign of the times"): https://x.com/alexwg/status/2030217301929132323?s=20
Elon on fiat currency transforming based on AI manufacturing loop: https://x.com/elonmusk/status/2020202496547844312?s=20
Fiat currency moves to model based on thermodynamics: https://x.com/r0ck3t23/status/2033371028202602547?s=20
Truckers warn of a crisis over tolls and diesel. Flood prevention is reinforced in Texcoco. The IMF and World Bank assess the impact of global tensions. More information in our podcast. #grc
Bilateral cooperation between Mexico and the US is strengthened. Registration continues for the Servicio Universal de Salud card. Peru deploys security forces for its presidential elections. More information in our podcast. #grc
In this edition of Saga Noticias with Max Espejel, we cover the Senate's ratification of Roberto Velasco as the new head of the SRE amid tensions with the UN. We also analyze Claudia Sheinbaum's position on the price of Premium gasoline and the debate over the UIF's freezing of accounts. In addition, the controversy over the possible return of fracking in Mexico continues. On the national front, highlights include the luxuries of legislators in Oaxaca and the scandal over a multimillion-peso expense for a quinceañera party at Pemex. Finally, we address security and international topics such as the attack on a congresswoman in Culiacán and the tension between the United States, Israel and Iran. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.
Military personnel are prosecuted in the case of the girls killed in Sinaloa. Mexico and Brazil prepare for Sheinbaum's visit. Macron calls for the truce in the Middle East to be respected. More information in our podcast. #grc
CDMX considers home office for the opening of the World Cup. Caravans bring government services closer to municipalities in Edomex. The US backs a new phase with the Mexican foreign minister. More information in our podcast. #grc
Natural gas supply is guaranteed with the US: Sheinbaum. The side lanes of Calzada Ignacio Zaragoza remain closed. On April 8, 1973, the Spanish painter Pablo Picasso died. #grc
The newly appointed foreign minister will rely on retired ambassadors. Veracruz muralist Melchor Peredo García has died at age 99. Iran closes the Strait of Hormuz after Israeli attacks on Lebanon. #grc
IMSS reports more than 22.7 million formal jobs. The rehabilitation of Metro Line 2 goes beyond the World Cup. On this day: Spain attempted to reconquer Mexico in 1829. More information in our podcast. #grc
Alert for heavy rain, wind and possible hail in CDMX. IV solutions linked to eight deaths are seized in Sonora. Zelensky pushes for a truce and sees progress toward peace. More information in our podcast. #grc
The SRE asks Mexicans in Iran to avoid sharing photos of the conflict. The unfreezing of Tomás Yarrington's accounts is ordered. Metro CDMX speeds up service on Line 3 after a train inspection. More information in our podcast. #grc
The Senate receives the nomination of Roberto Velasco to head the SRE. The Senate approves the departure of the crew of the training ship Cuauhtémoc. The Pope calls for avoiding war with Iran. More information in our podcast. #grc
In this Monday, April 6 edition of Me lo dijo Adela, Adela Micha addresses Mexico's growing insecurity crisis, marked by a new attack on the LeBarón family in Chihuahua and the federal government's controversial response to UN reports on forced disappearances; she also analyzes the impact of the national truckers' strike, which threatens to paralyze the country's main highways. Throughout the program, the discussion goes deeper into the SRE's stance toward international criticism over the crisis of the disappeared, while we hear the testimony of Brenda Valenzuela, mother of Carlos Emilio, who faces the pain of his absence and the lack of an institutional response. Julián LeBarón also recounts the incursion of armed groups into his community, evidence of the persistent violence in the north of the country, and David Estévez Gamboa, president of ANTAC, explains the trucking sector's security demands. Finally, the program closes with sports news from Juan Carlos Díaz Murrieta and advice from Javi Derma on caring for your skin after sun exposure. Join us to understand what is happening in Mexico!
SUMMARY: With the explosion of AI-generated code and applications, the modern SRE requires an AI-native approach to managing complex systems. GUEST: Anish Agarwal - CEO/Cofounder of Traversal SHOW: 1016 SHOW TRANSCRIPT: The Reasoning Show #1016 Transcript SHOW VIDEO: https://youtu.be/hF3MCRDhMno SHOW SPONSORS: Nasuni - Activate your data for AI and request a demo; ShareGate - ShareGate Protect. Microsoft 365 Governance, we got this! SHOW NOTES: Traversal (homepage) Topic 1 - Welcome to the show. Tell us a little bit about your background, and what you focus on these days at Traversal. Topic 2 - AI is dramatically accelerating code generation, but not improving production outcomes. What's fundamentally breaking in the traditional SRE model, and where do you see the biggest friction between speed and reliability? Topic 3 - What are the most common failure patterns or mistakes you're seeing in production from AI-generated code, and what's driving them? Topic 4 - AI can generate functional code, but it often lacks context about how systems behave in production. How is this changing what 'good observability' needs to look like? Topic 5 - How do you see SRE evolving in an AI-first world? Does it become more automated, more policy-driven, or even partially autonomous? Topic 6 - For organizations that want to embrace AI-assisted development but avoid production chaos, what are the most important guardrails they should put in place? Topic 7 - If we fast-forward 2–3 years, what does a 'modern' production stack look like in a world where most code is AI-generated? What capabilities become absolutely essential? In one sentence, what's the #1 thing a CTO should do right now? FEEDBACK? Email: show @ reasoning dot show Bluesky: @reasoningshow.bsky.social Twitter/X: @ReasoningShow Instagram: @reasoningshow TikTok: @reasoningshow
Why does a curious child always get further than a smart one? Uroš Petrović on how the brain "lights up" at challenges and why play is the most serious work in the world. Uroš Petrović is a writer many know for "Zagonetne priče", "Peti leptir", or the title "the smartest Serb", but few know the whole story behind it. In this conversation with Ivan, Uroš tells, for the first time in one place, the entire path that brought him to where he is today: from his childhood in Gornji Milanovac and growing up in Novi Beograd, through school clashes with teachers who did not tolerate anyone different, drawing comics and selling them as single copies, a boutique with hand-designed clothing, and an aquarium fish shop in Beograđanka, to the moment he printed the novel "Aven i jazopas u zemlji Vauka" himself, without copyediting, without proofreading, with pagination on the title page, and built an entire career from that book. He talks about standing at midnight in front of the closed Plato bookstore to see whether they had put his book in the window, about the 60-dinar magnifying glass that earned him a contract with Laguna, about why he closed a successful business to live from writing, and about taking the hardest IQ test in the world out of spite after an email that practically told him "don't even try". Running through it all is a conversation about children's curiosity, about what happens to a brain that stops making an effort, why what is hard is good, and why a curious child always gets further than a smart one. What we talked about: - Start of the conversation - Smilies question: What did he want to be when he grew up?
- School: creativity and rebelliousness - High school - Technology, the brain, and artificial intelligence - Choosing a university - Difficulties as a driving force - Happiness and the quality of relationships - University, descriptive geometry, and riddles - Development of speech and creativity in children - First jobs and entrepreneurship - The fashion brand Aventurier - Aven i jazopas u zemlji Vauka - The path of the first book - How the collaboration with Laguna came about - Body of work: from fantasy to manga - The visual component of the books - Fairy tales, imagination, and the power of books - Mensa and the smartest Serb - Photography and National Geographic Support us on BuyMeACoffee: https://bit.ly/3uSBmoa Read the transcript of this episode: https://bit.ly/4sXnRSi Visit our website and sign up for our mailing list: http://bit.ly/2LUKSBG Subscribe to our YouTube channel: http://bit.ly/2Rgnu7o Follow Pojačalo on social media: FB: https://www.facebook.com/PojacaloRS/ IG: https://www.instagram.com/pojacalo.rs/ X: https://x.com/PojacaloRS LN: https://www.linkedin.com/company/pojacalo TikTok: https://www.tiktok.com/@pojacalo.rs
If you overdo it at dinner, you may be plagued by nightmares at night, the wise owl told the little rabbit ... Narrated by: Tamara Doneva, Boštjan Napotnik Napo, Lili Bačer Kermavner, and Srečko Kermavner. Written by: Tomo Kočar. Recorded in the studios of Radio Slovenija, 1999.
Collectives denounce an assault at a protest in Tlalpan. Río Lerma cleaned and tons of trash removed. New changeover at the Mexican foreign ministry. More information in our podcast. #grc
Tourism boom expected in Quintana Roo. Mexicana de Aviación expands routes. EU to send more aid to Cuba. More information in our podcast. #grc
Tamaulipas reaches 60% of its Viviendas para el Bienestar housing target. Foreign ministry maintains its recommendation not to travel to the Middle East. Trump says Iran has asked for a ceasefire. #grc
Vijoy Pandey of Outshift by Cisco lays out his vision for an "Internet of Cognition," where AI agents can share context, build reputation, and collaborate safely at scale. He offers a useful mental model for superintelligence: progress has to scale in two directions: up, through better individual models, and out, through networks of agents and humans thinking together. The conversation explores how distributed, protocol-driven agent systems could give enterprises fine-grained permissions, auditability, and controlled interfaces, in contrast to today's centralized frontier models. Vijoy also walks through Cisco's internal CAIPE system of 20 cooperating agents, the open-source AGNTCY project, and a live multi-agent healthcare demo spanning diagnostics, insurance, pharmacy, and scheduling. LINKS: AGNTCY Project: Open source multi-agent infrastructure under Linux Foundation governance. Covers discovery, identity, communication, observability. Vijoy walks through the architecture at [00:34:57] and [00:41:17]. Scaling Out Superintelligence Whitepaper: The technical whitepaper detailing the Internet of Cognition architecture, three-layer stack, and cognition state protocols. Referenced at [01:25:40]. Internet of Cognition Interactive Demo: Clickable walkthrough showing per-agent activity, intent, context, and collective reasoning across a multi-agent SRE system. Vijoy demos at [01:26:20]. CAIPE Project (GitHub): Cloud Native AI Platform Engineer. Multi-agent system with participation from Adobe, AWS, Cisco, Nike. 20 agents, 100+ tool calls, 10+ workflows. Referenced at [00:11:52]. Sponsors: Tasklet: Build your own Cognitive Revolution monitoring agent in one click. Try it for free and use code COGREV for 50% off your first month at https://tasklet.ai VCX: VCX, by Fundrise, is the public ticker for private tech, giving everyday investors access to high-growth private companies in AI, space, defense tech, and more.
Learn how to invest at https://getvcx.com Claude: Claude is the AI collaborator that understands your entire workflow, from drafting and research to coding and complex problem-solving. Start tackling bigger problems with Claude and unlock Claude Pro's full capabilities at https://claude.ai/tcr CHAPTERS: (00:00) About the Episode (04:16) Cisco and networking foundations (13:34) Jarvis and ASI vision (Part 1) (18:16) Sponsors: Tasklet | VCX (21:09) Jarvis and ASI vision (Part 2) (Part 1) (31:46) Sponsor: Claude (33:59) Jarvis and ASI vision (Part 2) (Part 2) (34:00) Practical multi-agent examples (50:02) Multi-agent plumbing architecture (01:01:44) Agent identity and TBAC (01:15:23) Internet of cognition fabric (01:21:48) Emergent agents and safety (01:36:52) Outro PRODUCED BY: https://aipodcast.ing SOCIAL LINKS: Website: https://www.cognitiverevolution.ai Twitter (Podcast): https://x.com/cogrev_podcast Twitter (Nathan): https://x.com/labenz LinkedIn: https://linkedin.com/in/nathanlabenz/ Youtube: https://youtube.com/@CognitiveRevolutionPodcast Apple: https://podcasts.apple.com/de/podcast/the-cognitive-revolution-ai-builders-researchers-and/id1669813431 Spotify: https://open.spotify.com/show/6yHyok3M3BjqzR0VB5MSyk