POPULARITY
Categories
SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
How has use of framing protection security headers changed in the past 3 years? https://isc.sans.edu/diary/How%20has%20use%20of%20framing%20protection%20security%20headers%20changed%20in%20the%20past%203%20years%3F/33068 Preparing for npm v12: install scripts and non-registry sources become opt-in https://github.com/orgs/community/discussions/198547 Adobe Patches https://helpx.adobe.com/security.html Rogue Planet new Microsoft Defender Vulnerability https://github.com/MSNightmare/RoguePlanet My Upcoming Classes https://www.sans.org/profiles/dr-johannes-ullrich
On this episode, Norton Rose Fulbright's Ammad Waheed joins Kyle Younker to discuss why “powered land” now defines data center strategy. He explores its spectrum, valuation by megawatt, and rising importance of speed to power.The discussion also covers development risks, financing and leasing dynamics, and growing community scrutiny shaping the industry's next phase.NPM is a leading data, intelligence & events company providing business development led coverage of the US & European power, storage & data center markets for the development, finance, M&A and corporate community.Download our mobile app.
Today's Ministry Monday features Annie Donnellon, an NPM certified cantor, as well as Rebecca Schaffer Wells and Joe Simmons, both cantors and leaders in the field. In this Ministry Monday episode, we explore the call to being a cantor in ministry, ways to strengthen ministerial and musical skills, and considerations for Directors of Music to support inclusion for both blind and sighted cantors for music ministers in a program.
Parce que… c'est l'épisode 0x306! Shameless plug 24 et 25 juin 2026 - Troopers 26 et 27 juin 2026 - leHACK 30 juin au 2 juillet 2026 - Pass the SALT 19 septembre 2026 - Bsides Montréal 20 au 26 septembre 2026 - BruCON 13 novembre 2026 - DEATHCon 16 au 19 novembre - European Cyber Week 1 au 3 décembre 2026 - Forum INCYBER - Canada 2026 24 et 25 février 2027 - SéQCure 2027 Notes IA ou Ghost in the shell Mythos Anthropic invites EU to access Mythos hacking tech Anthropic scales Claude Mythos to critical infrastructure in 15+ countries Anthropic Expands Project Glasswing Claude Mythos Preview to 150 New Organizations Kevin Beaumont: “Mythos is not great btw. Runni…” - Cyberplace Free AI model powers self-spreading worm in enterprise test network Instapassword Hackers Used Meta's AI Support Bot to Seize Instagram Accounts Instagram Meta AI Vulnerability Allegedly Enables Password Reset for Accounts Hackers duped Meta AI support chatbot to steal celebrity Instagram accounts Instagram Fixes Password Reset Flaw That Exposes User Emails and Phone Numbers Hackers Simply Asked Meta AI to Give Them Access to High-Profile Instagram Accounts. It Worked Kevin Beaumont: “How people hacked Meta account…” - Cyberplace Injecte moi ça ChatGPT for Google Sheets Exfiltrates Workbooks New Google Gemini Vulnerability Exploited via Prompt Injections from WhatsApp, Slack, and SMS New ChatGPT Lockdown Mode Limits Tools That Could Enable Data Exfiltration Irresponsable Florida sues OpenAI, Sam Altman after multiple ChatGPT-linked murders School shooting survivor sues AI gun detection firm after system failed to spot weapon AI Agents Get Their Own Directory Built Atop DNS Remove all LLM generated commits before people get hurt by this nonsense. · Issue #934 · RsyncProject/rsync Amazon Shuts Down Internal AI Leaderboard After Employees Cheated Open source project contains hidden instruction for “AI” agents: delete my code DOD wants to integrate cyber in all operations, and integrate security into AI Trump plan to test AI models has a problem—US security teams were gutted by DOGE Kevin Beaumont: “xAI have asked a court to stri…” - Cyberplace Commvault says it's time to rethink resiliency as AI crooks leave victims in a ‘dark, dead' state Attackers Use AI to Automate EDR Evasion Testing Pluralistic: Delusion as a service (04 Jun 2026) – Pluralistic: Daily links from Cory Doctorow These LLMs are the best at resisting Russian propaganda RAG Security and Privacy: Formalizing the Threat Model and Attack Surface From Attack Simulation to SIEM Rule: Deterministic Detection-as-Code Synthesis with Probe-Level Traceability Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals Critical Hugging Face Transformers Vulnerability Enables Remote Code Execution Attacks La guerre, la guerre, c'est pas une raison pour se faire mal! Iran-Linked Hackers Destroy IT, Backups, and Recovery Systems in Cyberattack targeting Middle East Pentagon raised threat of Israeli spying on U.S. to highest level, sources say Souveraineté ou vive le numérique libre! EU plots long game against US digital supremacy OSI welcomes the European Union's “Tech Sovereignty” package Cable lobby warns of chaos if FCC doesn't relax ban on foreign routers Privacy ou cachez ces informations que je ne saurais voir The Pentagon Finally Admits That Location Data Is a Battlefield Problem Age verification for social media – the beginning of the end for a free internet? Privacy isn't dead: it's just that tech companies have made it inconvenient Amazon-owned Ring should pay Americans for scanning their faces, lawsuit says Elon Musk tries again to escape FTC audits of X data handling I am the law Policy-Compliant Cloud Storage Systems GrapheneOS user reported to authorities for using GrapheneOS Red ou tout ce qui est brisé Cachez ce fiasco que j'ai fait Microsoft's Zero-Day Legal Threats Spark Backlash Microsoft Clarifies It Won't Sue Security Researchers Amid Nightmare-Eclipse Controversy Microsoft reaches for olive branch after public dustup with 0-day researcher Nightmare Eclipse incident shows the researcher-vendor fights may never fully go away Another bug hunter leaks Microsoft exploits in defiance of company's handling of vulnerability disclosures Microsoft MSRC Allegedly Dismissed Dependency Confusion Vulnerability, Claims Researcher Just LOL BIN BAS Kevin Beaumont: “Wake up babe, new lolbins and …” - Cyberplace Microsoft's Coreutils project brings Linux commands to Windows Microsoft Investigates MFA Setup Failure and MySigns-In Portal Outage Dozens of Red Hat packages backdoored through its official NPM channel Inspector general finds NIST mistakes have made vulnerability database ineffective Sur le serveur X.Org, neuf nouvelles failles de sécurité dont huit débusquées par une IA HTTP/2 Bomb : une mini-requête suffit pour faire tomber nginx, Apache ou IIS Blue ou tout ce qui améliore notre posture - An Analysis of GrapheneOS's Server Infrastructure - Android phones will soon be able to detect spoofed calls and impersonation scams - Kernel-Level Ground Truth: Why eBPF is Replacing User-Space Agents for Security Observability - Dashlane explains how attackers managed to download encrypted password vaults - Let's Encrypt Unveils Merkle Tree Certificates to Secure the Web Against Quantum Threats Divers ou parce que j'ai aucune idée où les placer - The Infosec Phrasebook - United Airlines Flight To Spain Pulls U-Turn Over Bluetooth Device Name - Cyber Insurance Rates Are Dropping, but Exclusions Widen - DNS is for people - not for IT infrastructure - The US Military Quietly Turned GPS Into a Global ‘Numbers Station,' Evidence Suggests - I led the 2014 U.S. CDC Ebola response. An action plan is needed now - Teen social media ban risks strengthening Big Tech dominance: Bluesky Collaborateurs Nicolas-Loïc Fortin Crédits Montage par Intrasecure inc Locaux réels par Intrasecure inc
Nesse episódio trouxemos as notícias e novidades do mundo da programação que nos chamaram atenção dos dias 30/05 a 05/06.☕ Café Código FontePrograme sua xícara para o sabor certo!https://cafe.codigofonte.com.br
Nesse episódio trouxemos as notícias e novidades do mundo da programação que nos chamaram atenção dos dias 30/05 a 05/06.☕ Café Código FontePrograme sua xícara para o sabor certo!https://cafe.codigofonte.com.br
Send us Fan MailThe breach that takes down a company often does not kick in the front door. It walks in through a “simple” integration you set up months ago, powered by a token no one remembered to rotate. We start with a real-world Zapier-style scenario and unpack how researchers chained together a harmless-looking code block, an AWS Lambda environment, and a misconfigured IAM role to reach private repository files and ultimately an NPM token that could enable a supply chain attack.From there, we zoom out to the bigger cloud security problem: non-human identities. Service accounts, API keys, and OAuth tokens multiply fast, and they are frequently overprivileged, poorly tracked, and left active long after an integration is retired. We also talk about why SaaS-to-SaaS connections are so hard to secure, and why agentic AI makes visibility even more urgent. If you do not know what systems are connected, what data crosses those links, and who owns the risk, you are effectively trusting an invisible tunnel into your environment.To make this actionable, we lay out a four-phase third-party risk management (TPRM) framework you can apply immediately: build a vendor and integration inventory with tiering, run real due diligence (SOC 2 Type II, ISO 27001, data access scope, subprocessors and fourth parties), lock protections into contracts (DPA language, right to audit, breach notification expectations), then enforce ongoing monitoring and governance with quarterly token reviews, logging, and incident response playbooks. If you are studying for the CISSP, you will also see exactly how this maps to Domain 1, Domain 3, Domain 4, and Domain 5.Subscribe for more practical CISSP training, share this with a teammate who owns vendor approvals, and leave a review so more security pros can find it. What is the one integration you would audit first?Gain exclusive access to 360 FREE CISSP Practice Questions at FreeCISSPQuestions.com and have them delivered directly to your inbox! Don't miss this valuable opportunity to strengthen your CISSP exam preparation and boost your chances of certification success. Join now and start your journey toward CISSP mastery today!
On this week's show special guest co-host Andy Boyd joins Patrick Gray and James Wilson to discuss the week's cybersecurity news. Andy is the CEO of REDLattice, which makes the Paragon “intelligence collection and reconnaissance” solution. They cover: Adversaries are tracking US troop locations with commercially available location data A new Signal phishing campaign is going after message backups 404 Media is suing ICE to get its spyware contract with REDLattice (lol) Microsoft's tone-deaf response to ‘never justifiable' zero-day disclosures Mini Shai-Hulud pops up again just as Glassworm gets shattered Much, much more This week's episode is sponsored by Authentik, an open source identity platform that you can host yourself. In this week's sponsor interview Authentik's CEO Fletcher Heisler joins Patrick Gray to talk about how they're keeping up with the bugpocalypse, and also the work they're doing to support identities for AI agents. This episode is also available on YouTube. Show notes The Pentagon Knew Enemies Could Track Troops' Phones for Years. Now They Are | wired.com U.S. says troops were targeted with location data, as senator warns ad industry is a ‘national security threat' | TechCrunch Security DOD location data attachment (Wyden) | Risky Business #830 -- LiteLLM and security scanner supply chains compromised | Risky Business Media US has seized nearly $1 billion in crypto from Iran, Bessent says | Russia claims foreign spy agencies hacked officials' phones | therecord.media Hackers are trying to steal Signal users' backups in new wave of phishing attacks | TechCrunch Security We Sued ICE to Get Its Spyware Contract. The Agency Is Redacting Essentially Everything | Social Signals Microsoft calls zero-day releases ‘never justifiable' as researcher threatens to drop more | therecord.media A shared responsibility: Protecting customers through Coordinated Vulnerability Disclosure | Social Signals Microsoft says it will not pursue security researchers after zero-day backlash | therecord.media IBM's new $5B initiative will help enterprises rapidly patch open-source vulnerabilities | Social Signals Federal audit reveals NIST's NVD is plagued by poor planning and duplication | cyberscoop.com Hackers Used Meta's AI Support Bot to Seize Instagram Accounts | krebsonsecurity.com Critical Windows Netlogon RCE flaw now exploited in attacks | BleepingComputer CISA adds exploited Palo Alto Networks GlobalProtect flaw to KEV | Cybersecurity Dive Password manager Dashlane says hackers stole some customers' password vaults | TechCrunch Security CrowdStrike disrupts Glassworm botnet that preyed on open-source supply chain | cyberscoop.com Botnet of more than 17 million devices dismantled | arstechnica.com Chinese-speaking fraud gang could be stealing millions from 2026 World Cup fans | therecord.media ACCC investigating Olympics ticket scam | ABC Dozens of Red Hat packages backdoored through its offical NPM channel | arstechnica.com Solo podcast: A deep dive on TeamPCP - Risky Business Media | Trump administration releases scaled-back AI executive order | cyberscoop.com Google security engineer accused of turning confidential search trends into $1.2M win on Polymarket | cyberscoop.com
In this episode of the Microsoft Threat Intelligence Podcast, host Sherrod DeGrippo is joined by Allie Luhrs and Mario Samolis from Microsoft Security to explore the growing threat of open source software supply chain attacks. They discuss how malicious NPM packages, compromised developer ecosystems, AI-generated attacks, and software dependency risks are reshaping modern incident response, while sharing insights from their recent presentation at BlueHat IL 2025. In this episode you'll learn: How attackers are targeting open source software ecosystems at scale Why AI is accelerating both cyberattacks and threat detection What was uncovered during their BlueHat presentation on modern software supply chain attacks Some questions we ask: What patterns did you uncover in NPM attack campaigns? Should developers rely on dependencies or build everything themselves? Why should organizations pay closer attention to open source security risks? Resources: View Allie Luhrs on LinkedIn View Mario Samolis on LinkedIn View Sherrod DeGrippo on LinkedIn Related Microsoft Podcasts: Afternoon Cyber Tea with Ann Johnson The BlueHat Podcast Uncovering Hidden Risks Discover and follow other Microsoft podcasts at microsoft.com/podcasts Get the latest threat intelligence insights and guidance at Microsoft Security Insider The Microsoft Threat Intelligence Podcast is produced by Microsoft, Hangar Studios and distributed as part of N2K media network.
SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
Unidentified RAT pushes NetSupport RAT https://isc.sans.edu/diary/Unidentified%20RAT%20pushes%20NetSupport%20RAT/33034 CVE-2026-41089: Windows Netlogon Vulnerability Exploited https://ccb.belgium.be/advisories/warning-microsoft-patch-tuesday-may-2026-patches-118-vulnerabilities-16-critical-102 RedHat npm Packages Affected https://www.aikido.dev/blog/red-hat-npm-packages-compromised-credential-stealing-worm Dashlane Locking Accounts after Brute Force https://status.dashlane.com/pages/5aabcb89fccc4b04d3774443 My Upcoming Classes https://www.sans.org/profiles/dr-johannes-ullrich
I'm excited to work with Microsoft once again as the presenting sponsors of the AI Engineer World's Fair! We'll streaming live from MS Build today for a special crossover pod with our friends at No Priors and the one and only Satya Nadella. However we did not hold back with this interview - we asked all the burning questions about uptime and Copilot that we know you have in your minds. Lets go!For almost two decades, GitHub has been the home of software, where both open source and closed flow, through commits, pull requests, reviews, actions, etc.This ecosystem flourished as open-source maintainers and contributors would continue shipping code for the benefit of the community. However as coding agents began to ship mass quantities of code - growing 1400% in 2026, it marked a new era that was both extremely exciting and challenging for GitHub.While these agents help more people ship more projects, they also significantly increase the floor of how much code is shipped, how often it is shipped, how many people commit code, and basically orders of magnitude multiples in every dimension of GitHub infrastructure:Now GitHub inevitably experiences more pressure on their infrastructure which was originally designed around human developers moving at human speed. This has resulted in a very publicly notable uptime story:So it begs the question of whether current systems around code can absorb what AI produces. Can CI/CD keep up when every idea becomes a build? Can open source maintainers survive floods of AI-generated slop contributions? Can GitHub preserve the human social contract of software while becoming the operating layer for agents?Which brings us to the perfect person to answer these questions: GitHub COO Kyle Daigle. In this episode, he joins swyx to unpack what happens when AI doesn't just autocomplete code, but starts changing how companies operate, how open source works, how pull requests get reviewed, and how GitHub itself has to scale. We go deep on GitHub's internal AI workflows: micro-skills, WorkIQ, MCP, Slack, Teams, email, Copilot workflows, the new Copilot desktop app, CLI, cloud agents, and how Kyle uses agents to look backwards across company context before deciding what to do next. Kyle also reflects on GitHub's history building webhooks, APIs, Actions, npm, Dependabot, and Semmle, why the AI era is breaking GitHub in new ways, how Actions became a general-purpose compute layer, and what Copilot becomes after code completion.Full Video PodWe discuss:* Kyle's expanded role across GitHub* How AI got Kyle coding again after years in leadership* Why GitHub rolls out AI through existing workflows instead of forcing new tools* WorkIQ, MCP, Slack, Teams, email, and GitHub as company context* Why massive “mega-skills” are giving way to small, atomic micro-skills* How AI changes summarization, communications, marketing, and analyst work* Why former developers in leadership may have a unique advantage in the AI era* Kyle's “15 agents on Saturday” workflow* How Kyle built an AI-generated executive presentation for CRO/CFO teams* Why AI changes the chief of staff role without removing the human work* GitHub Actions, webhooks, arbitrary code execution, and secure agent compute* The npm acquisition, supply-chain security, 2FA, and token invalidation* Slop forks, vendoring, and whether AI agents change dependency management* What pull requests become when most PRs come from agents* Prompt requests, vouching, AI review, and trust in open source* What counts as a “developer” when AI lowers the barrier to building* GitHub Spark, low-code, and why GitHub refuses to hide the code* 14x commit growth, Actions load, databases, monorepos, and availability* Copilot's evolution from completion to CLI, desktop app, cloud agents, and SDK* Context, memory, rules, and making GitHub “act like Kyle wants it to act”* Ambient AI, OpenClaw, enterprise security, and the new operating system for agents* What swyx should ask Satya Nadella about Microsoft's AI futureKyle Daigle* LinkedIn: https://www.linkedin.com/in/kyledaigle* X: https://x.com/kdaigleTimestamps00:00:00 Introduction00:03:36 Why AI Got Kyle Coding Again00:07:04 Running GitHub with AI: WorkIQ, MCP, Slack, Teams, and Skills00:15:39 The Golden Age for Former Developers in Leadership00:17:31 15 Agents on Saturday and AI-Generated Executive Work00:20:20 How AI Changes the Chief of Staff Role00:21:45 GitHub's History: Actions, npm, Webhooks, and Open Source00:28:45 Slop Forks, Vendoring, and AI Dependency Management00:33:57 Pull Requests, Prompt Requests, and Trust in Agent-Generated Code00:41:21 GitHub Stars, 200M+ Developers, and the New AI Builder Wave00:45:15 GitHub Spark, Low-Code, and Why GitHub Still Shows the Code00:47:38 GitHub's Hardest Era: 14x Growth, Reliability, and Scale00:59:21 Actions as the Compute Layer for CI/CD and Automation01:02:04 The State and Future of GitHub Copilot01:08:24 Ambient AI, Background Agents, and the Future of the SDLC01:13:09 OpenClaw, Enterprise Security, and the New OS for Agents01:18:03 Build Announcements, WorkIQ, FoundryIQ, and Microsoft Context01:21:41 What Should swyx Ask Satya?TranscriptIntroduction: Kyle Daigle's Expanded Role at GitHub and MicrosoftSwyx [00:00:00]: We're here with Kyle Daigle, COO of GitHub. Welcome.Kyle [00:00:07]: Hey, thanks for having me.Swyx [00:00:08]: You're not just CEO of GitHub. People know you as that. You have a new role.Kyle [00:00:11]: So I have an expanded role now. I've been working at GitHub for thirteen years and doing all things developer. Joined as a developer myself. And now, I'm also responsible as the CMO of Developer for Microsoft. And so all the kind of learnings and passion for developers and how we work with them and how we communicate and how we bring our products to market, we're also bringing that expertise to the broader Microsoft ecosystem and helping every developer that uses a Microsoft product or would like to have a sort of similar experience that they've had with GitHub over the years. So it's a different role in some ways, but it's also just building on the experience that I've had at GitHub of just sort of tell the truth, be authentic, show people how to use it and then let the products speak for themselves. Now just doing that with, all of Microsoft.Swyx [00:01:09]: We'll be releasing this in conjunction with Build. You got lots of stuff planned, and we can sort of touch on that whenever it's appropriate. I think one of the interesting things is I rarely meet a COO who's also a CMO. I think you're a very outward facing and you're very confident publicly. That's rare. Do you actually view yourself as COO? What's What is your thing?From GitHub Developer to COO/CMO: Building the Platform and Operating GitHubKyle [00:01:33]: I think for me, it's been funny. The titles have always been, a— have always felt a little strange to me. I joined GitHub as a developer? I wrote so much of theSwyx [00:01:46]: Let's bring that up. You wrote the back ends?Kyle [00:01:48]: I was going through, I was going through, some old photos, when folks were talking about how things were being built or how there was a build GitHub. I built, webhooks and worked with teams building the API, built the platform layer. Anything that integrated with GitHub, up until really twenty eighteen, I built or ran the engineering teams. And that's kind of where my the beginning of my passion always was helping people build things, deliver them to, their customers. And so being a developer, building for developers was always super unique. In a— I think as my role expanded, it became my ability to talk to not just developers, but also enterprise customers or business leaders and have this translation layer. And then through all those years, GitHub has always operated pretty uniquely. Post-pandemic, working remotely was not as novel as it was when GitHub started in two thousand and eight. But all that expertise of running remote teams, doing it well, became this sort of bigger role, ultimately turning into the COO role of how do we operate GitHub in the way that GitHub's always operated after the Microsoft acquisition. And kind of so on from there. So like for me, I think the— I've, I still code. I love coding but the problem has always been, people. It's a much harder problem to both support our own employees, a harder problem to communicate to developers and enterprise buyers what we're building why it matters, ‘cause those are two very different messages. And so getting to work in the mix of COO, CMO, also just being a dev, I think is what's kept me at GitHub for so long.AI Workflows for Leadership: Commits, Retrospectives, and ContextSwyx [00:03:40]: Apparently, you have— your commits have gone up. What's this? What's going on?Kyle [00:03:45]: Rui's called me out pretty aggressively. So I think— as you can imagine, right, you can see my normal era of being a dev In the twenty thirteen, twenty fourteen era, and then moving into management, and then ultimately the COO role. I think what you see there is me, really getting back to coding thanks to AI. I— similar to, attaching problems between how to market and how to operate a business and how to code, I find, building agents and workflows that are connecting very disparate problems to be what's driving this. So that's, some of it's writing software. A lot of it is, connecting a ton of a different data sources to, help me out. But that is completely me really diving in on the AI side in trying out our tools, trying out everyone's tools, But building for me, building for the non-technical leader, though I'm technical and how we're, able to use these tools more than just the simple, call and response that I think a lot of the non-technical, your employers, you have to get— you have to use AI, and so everyone uses, ChatGPT or Copilot or Claude or whatever. To really get into, how is this going to help me out, it— I find that it's not the I need to write a blog post, I need to those simple examples. Helping people find the workflows of, “Okay, I need you to go through all the PRs today. I need you to go through everything that we've posted online. I need you to go through what we did the last three months. Go through all of my Obsidian notes for any mentions of this then go through my transcripts at work.” We use, Teams, so, using WorkIQ, go call that MCP server, grab all the transcripts, go through all the Slack, and then build me out the plan of, what this week's messaging actually was. That's something that was, impossible because for me, I find AI in a what most of this launch here is actually, less building forward. It's actually, a recursive loop backwards. I'm always looking at what had happened first. Go back through the week and tell me what we did, what worked, what didn't work? And then tell me in the next three or four days-What would you tweak based on this sort of like looking backwards and then looking ahead a little bit? I find that to be so much more valuable, especially for like non-technical, because that retrospection is actually LLMs are very good at that. Like finding all the patterns, pulling them out, and then applying that retrospection to just a couple of days or just like a short period of time. Is all a bunch of apps that I've built and launched a bunch of, internal tools. I use the new, GitHub Copilot app, the desktop app with workflows. Every time I crack open my laptop, it's running workflows for me. It's just a ton of different stuff and of course, it all ends up on, it all ends up on GitHub.Swyx [00:06:47]: Of course. That's where, that's where, stuff is hosted. Man, there's so much to ask you. I was going to leave the how do you run a company with AI thing at the end. I have to ask one— double click one thing. You said, you are looking back at the week. You're, you're understanding what happens. When you say we That's three thousand people. How?Rolling Out AI Internally: Skills, CLIs, and Company ContextKyle [00:07:09]: I think when we started rolling out AI internally beyond engineering, right? One of the things that I was really, passionate about is like we have to do this in a way where no one has to change how they work. I don't want to have to teach you a tool. I don't want to have to teach you something new. And so for us, we tried out a few tools. Most of them don't work because I got to get you on board? I got to teach you how to use it. What we've actually ended up doing is we've built like a set of skills internally. We have we each have our set of skills, and we've just been distributing even to the non-technical folks, the CLI. And then effectively, we're just giving it access to like read about everything that we're writing. So that's for us, that's usually GitHub, Teams, Email, and Slack. So Teams for, video chat, generally speaking.Swyx [00:08:03]: Teams and Slack?Kyle [00:08:04]: so we use Teams for video communication, but we don't use it for chat. W-we— GitHub for a long history, right? We're alwaysSwyx [00:08:13]: Also SlackKyle [00:08:14]: Talking about ChatOps and like everything is built into Slack. Like every command, every flow.Swyx [00:08:18]: So even though you have been acquired for I don't know, eight years nowKyle [00:08:22]: we stillSwyx [00:08:23]: You still use Slack?Kyle [00:08:23]: it's a purpose-built tool for us, and I think the reality is that moving off of it would be so bluntly expensive? Simply because all the tooling is, baked in with that paradigm. And they both have their pros and cons but they don't work the same way at all. We still use a bunch of different tools Because it's the purpose-built tools that We need. And thenSwyx [00:08:47]: Well, the same doesn't go for the rest of Microsoft, presumably.Kyle [00:08:50]: like the like various teams like operateSwyx [00:08:53]: They make their own decisionsKyle [00:08:54]: Various ways. I think it just matters what you're trying to what you're trying to do. But we do we do work across kind of every tool that we use, and then by giving everyone access to all of that context and the new WorkIQ MCP server, which is quite cool if you do live in the M365 like world. I can ask it all these backwards-facing questions, and it's incredibly important for our teams that are working remotely. There's a lot of stuff you miss when you're not in an office, and we are spread out all over the world. So most of that is looking back. And then we post, we post either auto-automatically into GitHub issues or discussions, these sorts of like findings or like our industry reports. Like what's happening this morning, today, yesterday. A little automation gets run. We'll use the app. We might use GitHub Actions like with, our agentic workflows just to go do that run, and then we push it into GitHub, and w-we keep having a conversation. So usually for us, it's about that sort of like looking back, looking forward on the non-technical side. And then of course for a lot of those folks, it's also building an app, pushing it to GitHub pages or pushing it somewhere to host it et cetera. But it's just like enabling everyone with that power of it's going to take me a week to figure this out. Instead, we're going “Okay I built a skill. Let's put it into a repo. We'll all share that skill together, and then we'll use the CLI or now the app-” “just to run it.”Micro Skills vs. Mega Skills: How GitHub Uses AI at WorkSwyx [00:10:26]: All right. I think, I think we're going straight into like the team management and productivity thing. I think a lot of people are getting various levels of LLM psychosis. How do you manage the bloat of skills? Like everyone Has their thing, and they're Like trying to promote it to the rest of their peers in their org, right? And obviously, whoever becomes a skill influencer internally becomes like an AI leader, right? Of sorts. I assume you have those.Kyle [00:10:50]: like I think we haveSwyx [00:10:52]: And I assume it's a mess a Yeah.Kyle [00:10:54]: there's like I— like I think the reality is there's two pieces. Like first is I think that we're ending the era of these like massive, beautiful, perfect skills that are just like not any of those things. ‘cause for a while, right every tweet every day is like go download the skills, the perfectly managed thing to do this entire workflow. And I think that like what we've found and what— I was just with my team, this week, and we were talking about the skill side, and we're really talking about these like incredibly micro skills that are just doing one thing for us very well Versus a skill that's going to do I said, that full report. That doesn't really exist on our side anymore. It's usually how do— like a single skill that's going to identify the most important marketing information given any MCP server. Like this is the most important thing. Less about stitch a bunch of tools together and have it produce this mega output because then weeks go by, months go by, things change, and you want to tweakSwyx [00:11:58]: It's brittleKyle [00:11:58]: Your mega skill and you're screwed? You can't do that. And so now we're really just talking about the Legos we're using and just letting the instruction book be something we're all putting together. Whereas I think a lot of AI skills for a while have been that mega instruction book style.Swyx [00:12:15]: I've, thought a lot about Postel's law. I don't know if that's a term that is, means things to folks. It's the idea that you should be liberal in what you accept and strict in what you output, right? And I think that's like a good framing principle for skills. This is my skills, obviously on GitHub. I feel like everyone should have like how like some repos In GitHub are special repos? I feel like we should sort of reify the slash skills and everyone like give it some kind of special presentation. Anyway, so, yeah, this is one of those like download Download anything, transcribe anything, and then you can string together the atomic skills that do one thing well Into like some kind of orchestration skill that calls other skills. I assume, does that match?Kyle [00:12:56]: I like I think so. I think that theSwyx [00:13:00]: Summarize anything.Kyle [00:13:01]: Like I think the- For me, summarizing something for I do communications and PR and analyst relations and marketing and customer activities, and so my summarize everything is very different for each one of those like Contexts. What ‘Cause if I'm summarizing something for an analyst, that's a very different thing than, probably how I'm going to summarize something for like a customer meeting or an engagement. So that's I think like the difference when we're talking about the like the tools I might use on Saturday or the skills I might use on a Saturday when it's just for Kyle. Yeah, those are kind of like they have an atomic actual tool underneath or maybe skill, and then Kyle cares about X. But I think when we're talking about work and enabling the the marketers, communicators there, it's the atomic, this is what good summarization is, and then this is what I care about as for marketing for communications For whatever. And that I think is like the interesting matrix problem when we go from like a developer set of concerns to all kinds of different professions, is that what that word means to me is different than it means to you is different than it means to the analyst or the salesperson, and that's where I think the matrix mess is that we're starting to like still starting to find. It's about these mega skills but they're all just slight permutations, but those permutations are really important. It's the difference between someone reading this and going “Did AI make this?” what Or “This makes total sense, and I would expect this when I'm giving a briefing to Gartner,” or like whatever else.Swyx [00:14:37]: I think the beauty of it maybe is that you don't have to be that careful about what goes in there. It doesn't have to exactly fit as long as it like roughly is contained in there. I used to complain about plugin hell, basically. Like when you have a framework and then you have a hundred things that you need to integrate, everyone does like the GitHub used to be bloated full of these things. And now we don't need them anymore ‘cause now you just use skills.Former Developers in Leadership: AI as a Creation MultiplierKyle [00:15:00]: And like I think the most magical thing is the just that like I can just also crack it open. Like Like yes, I could go like change the how the plugin is coded, or like I could go do that now with AI, but I think there's just something more magical about getting a response back and being “That's not right,” and then you just crack the skill open, you just type English words and it's different. That building block is just, I think very unique. Once I get everyone to kind of understand how to best how to best make those changes to get the most power out of them.Swyx [00:15:36]: Is there a— you have a your peer group that Of people like you. Is there a common framing for Something I'm feeling is, which is true, is that is this a golden age for former developers who are now in leadership? Because you can wield the tools, you would know the right words, you're maybe not too close to the details. Doesn't matter. But like you're more effective than someone who doesn't come from that background.Kyle [00:15:59]: I think that like the secret has always been your ability to identify patterns and solve problems, and I think that for folks that like myself that don't code day to day anymore, that has made me successful as a developer, made me successful as a COO and now CMO. And so now that I have access to get and write code, I'm now applying that sort of like pattern finding and problem solving, and I know enough still about how to then go and say, “Oh, I want to make an app, but I don't want to break into jail or create something that's not going to be able to work or to be deployed scale or whatever.” that ability to apply all that additional business knowledge and still code I think is what makes that so interesting to me. Slightly different than I think some of the other like technical leaders that became business leaders and now are going back to their apps and updating them. Good for them? But I think the more, much more interesting thing is, well, now I have this whole new set of expertise over ten plus years. Why not take that and use that as a developer with these AI tools? So I definitely think that makes me more powerful, but I think that's true for like every dev as well. Most of the dev friends I still have also have some other underlying skill and passion. There's really talented, very kind of linear computer science software devs, absolutely. I just find that the folks that came from a different career, went to school for something else, went off and did this random thing, and then became a software dev, or were a dev, did a random thing, came back. Learning that extra set of information, learning those extra skills, and now having the power of an AI where I can crank up fifteen agents on Saturday while my kids are doing lacrosse, That's like really powerful. And I think it gets me back to that feeling of like creation, and it's very hard to replicate that in most other senses? That first time you build an app and you click it and you show someone that's magical. And so being able to do that not just in code, but across all kinds of different assets that's, that's huge. We were doing we're doing our every year we do our revenue planning. We talk about okay, what is it going to look like for next year? And of course as you imagine, there's, slideshows everywhere talking about what are we going to talk about, what's the narrative, et cetera. And so as you said I'm “Okay, well, I could probably just like build something to build this and then that way I don't have to go build the whole spreadsheet or I have to pass it to my team.” So we went through this process, and I got all the information and used the skills I mentioned. I built like a little app just to make it so I could look at some of the information in a SQLite database, more easily. And I ultimately built this entire presentation without touching any of it and I was “Okay, I'm just going to present this to our CRO, the CFO, their teams,” without mentioning I'd built it with AI. I like built a skill to make it look very much not AI driven. Just not pretty.AI-Generated Presentations, Human Taste, and the Changing Chief of Staff RoleSwyx [00:19:03]: Like a design. Yeah.Kyle [00:19:03]: Not pretty. But just like very clearly not AI. Kind of like don't do anything interesting.Swyx [00:19:08]: That's, yeah, that is valuable.Kyle [00:19:08]: Just go Exactly. We did the whole thing through. It used my notes from Obsidian, it used all the context I mentioned before, the plans, and Never came up once that it was AI generated.Swyx [00:19:20]: It didn't matter.Kyle [00:19:20]: Never once. D It didn't matter. And so now I takeSwyx [00:19:23]: This is a toolKyle [00:19:23]: I can take that tool and go, “Look, I don't want you to go build slideshows.” They're just helping us share information with each other. If this thing can do it With a little bit of crafting from you and then we can look at it together, awesome. There's no value in all that extra work. I think that the ability to, make it look humanly bad and and build a little app to, manipulate the data I think is part of, that upside for devs that are now in leadership roles. Because, the thing that I feel like I said before, this that's all a people, that's all a people problem. I know if you've used a coworker or not to build a slide deck, unless you spent a bunch of time to not do it.Swyx [00:20:07]: I know, but like it was so, I think there's a certain charm to just being blatantly AI. ‘Cause I think that you're well, you're just honest about There may be mistakes here that I cannot vouch for. So how much value is there? But anyway I think, actually the real question I want to ask is, there's a— You were a chief of staff To Thomas. And in the pre-AI world, the that job would've been a chief of staff job of like Can you prep me these slides and all that? And now you do it yourself.Kyle [00:20:35]: I still, I still have a chief of staff. Because, the difference is it's sort of the discussion every time we have some sort of technology evolution is it's not that the jobs the roles don't all go away, they just change? And so yeah, I don't have someone spending all their time building out slides for me and presentations ‘cause I don't need that anymore. But now I need that person that is able to go and find all the different connections between humans in those discussions to help me find out, okay, I should be meeting with this group and this team, and they have an opportunity, and I'm going to be in San Francisco today, I'm going to be in Seattle tomorrow. Those sorts of human connection aspects are still incredibly valuable and has always been a big part of that chief of staff role. But now just like chiefs of staff are not opening up, letters to process, they're doing emails. What It's the same thing. And now they're, they're not building out as many of these presentations because they have the the ability to have a AI take it on for, and share that with me and great. Let's keep moving ‘cause it's allowing us to go faster and make better decisions more quickly.Swyx [00:21:45]: Awesome. Well, so we can dive into more sort of, Productivity insights as you go. I did want to do a little bit of a brief history of colleague and hub. Because, we started here. And then you also involved the NPM acquisition. I did, I do want to touch upon that. And then more recently, I just want to bring up to present day where we're having uptime issues Which transparently we've already Addressed publicly, but we'll, we'll discuss in the pod. Did I miss anything? Like what, any other major highlights? Obviously, it's, it's a lot of years to cover.A Brief History of GitHub: Webhooks, Actions, Acquisitions, and Platform EvolutionKyle [00:22:15]: No the I think one of one highlight was right before the acquisition closed in twenty eighteen, I got to launch the first version of ActionsSwyx [00:22:27]: OhKyle [00:22:27]: At GitHub Universe. So it was OSwyx [00:22:29]: They're that young?Kyle [00:22:30]: It was October of twenty eighteen, I think. Yeah. Yeah.Swyx [00:22:33]: Gee, Jesus.Kyle [00:22:34]: I got to I was the engineering leader on that project and got to launch that. And then, yeah, we did acquisitions of NPM you said, Semmle, Dependabot Pul Panda a whole bunch of things. That was a bigSwyx [00:22:47]: Pul Panda.Kyle [00:22:48]: Abi is doing well.Swyx [00:22:51]: DX. Holy crap.Kyle [00:22:52]: Did well on DX. I and like that was a that was the big shift, after the acquisition. I had to join the sort of business side.Swyx [00:23:00]: So I need to hit you on some of these things ‘cause you were there. Right? And how often do I get to talk to someone who was there? But yeah, Actions. Is that the number one source of security issues on GitHub?Kyle [00:23:11]: Oh, sh I think that the number one source of, security issues is probably like all, the literal code in everyone's like underlying repositories. I would say back further than that is, if you remember I had to show in this graph was this is, I'm, didn't say this before, this is ultimately webhooks.Swyx [00:23:30]: You yeah.Kyle [00:23:31]: Like circa whatever it was.Swyx [00:23:32]: It says Hookshot in there.Kyle [00:23:32]: I forget. Yeah. Yeah, Hookshot's in there. And so like back then, it says GitHub Services. Do you see, it says Hookshot FE for front end, and then it says GitHub Services. GitHub Services back in the old days, right? You we had a repository that was Ruby code, and you could write any Ruby code in there, and then we would execute that On your behalf As a service, and then that way if an if you were trying to integrate with something, it didn't we would run it for you.Swyx [00:23:57]: And of course no containers ‘causeKyle [00:23:58]: No, ‘cause it wasSwyx [00:23:59]: Well, no containersKyle [00:24:00]: Twenty fourteen. And so there was some isolation obviously, but it was mostly the separations on the server level. That's like an example as long as the very old version of Pages, which ran on its own containerization infrastructure, not on Actions.Swyx [00:24:15]: Which like all-time great product.Kyle [00:24:16]: Pages powers the internet at this point to some degree. Those were places where like clearly there were no like issues like to my knowledge. But it was those things where I'm looking at and going “Okay, well we can't be running arbitrary Ruby code,” like on everyone's behalf. Then containerizing all of that up intoUh into actions now where yeah the containerization, is r-really good. The pinning most folks aren't pinning it the like to a particularSwyx [00:24:48]: ImagesKyle [00:24:48]: Sha, et cetera like their workflows, and so that's a big that's a big place Of pain for folks if they're just doing similar to any dependency management, just V1 or newest or latest, I think. But, that journey from that day to “Okay, we're just going to run all this arbitrary code, and, it'll basically be okay,” to now, no, we have, really good containerization. We have a new, underlying, ag-agent, containerization, service. It's like we're using it under the hood. It's through Azure. They recently announced it. The Azure, Dev Compute, but it's, very fast, very fast compute to be able to, spin up your own cloud agents, or whatnot. We're using it under the hood for some parts of the new,Swyx [00:25:36]: Microsoft Dev Box?Kyle [00:25:37]: No. Dev Compute, yeah.Swyx [00:25:41]: Hmm. Not finding it just yet.Kyle [00:25:44]: Oh, it's, it's in there somewhere.Swyx [00:25:46]: All right. Well, we'll cut that out.Kyle [00:25:47]: Sorry. But with, Dev Compute, you can, run, really fast, spin up really, small VMs really quickly, so you're doing a tool callSwyx [00:25:58]: Same conceptKyle [00:25:58]: Just do it containerize exact-exactly. So we're using that so definitely moving that direction to protect us from every every piece of code that we're ultimately running.Swyx [00:26:07]: look, that grows into the full SDLC? Code hosting was just the start and and then it's grown beyond that. Let's talk about NPM may-maybe ‘cause I think that's also, a very major point in the industry. I do think, it was looking for a home. It was, kind of struggling as a business, right? I don't know, I don't know how you would characterize that whole acquisition and how itNPM, Package Security, and Keeping the Internet RunningKyle [00:26:33]: like when we were talking to the team, I think the big thing for the both of us was to find a way to keep NPM, which was basically powering the internet then and way more so now to some degree running. Keep it going keep continuing to scale. It was having scaling problems, if I recall, back at that time. They were doing some rewrites. ItSwyx [00:27:00]: that's cute compared to now.Kyle [00:27:01]: Well, that's the thing is like when I'm talking to folks now, there's there's so many more underlying uses of NPM than there were back when we had them join in with GitHub. But that was ultimately the goal. It was really okay, we used to have pages. We have, the world's code. Let's make sure that we can keep NPM running well for the world. And we put a bunch of time and investment into fixing some of the underlying backend, changes, some of which we talked about some of the manifest work, et cetera. And then now, really trying to bring the the security posture of NPM up to speed. But, it is a unique challenge in that every move that we make to make it more secure will break a lot of people. And security is paramount. And also, we take it very seriously. We're, the any time that we have a problem with GitHub or we make a change that makes us more secure but hurts, there's, a snow day for developers or a really bad fire that they have to go put out. And so we've, have changed the 2FA policies. We've changed the way the tokens work. When we find tokens that have been exposed or potentially, exposed, we invalidate them, andSwyx [00:28:22]: I love that feature in GitHub. Yeah, it's greatKyle [00:28:23]: That creates issues, but, the but that's the thing is we're trying to push the community, forward without necessarily, doing something that is going to break the contract that's been for 15 years or close to it or some amount of years on NPM.Slop Forks, Vendoring, and the Future of Open Source Supply ChainsSwyx [00:28:43]: I think the— So now we're talking about, open source and publishing. And I think there's something here with what people are calling slop forks, which, I think Malta from Vercel is doing. And, part of me thinks, well, the way to get past any vulnerabilities, we just, let's just get rid of the concept of NPM. And we only publish source code. And anytime you want to import it you have your coding agent look at it and then adapt whatever subset you're going to use into your vendor it. But, the AI vendor it. Is that realistic? I don't know. Is it— Will that solve all our security issues? I don't know.Kyle [00:29:24]: I don't think it'll solve I so Mitchell was just talking Mitchell Hashimoto Was just talking about this today, and I think that I-in some ways, it's all all things, old or new again? Yeah, absolutely vendoring everything. Like I do I do remember twenty thirteen, twenty fourteen.Swyx [00:29:42]: This is Yeah. Let's, we must return toKyle [00:29:43]: That's what is We were vendoring everything. We were having actual discussions around, or at least I remember we were “Should we take this full thing?” “Why is this so big? We only need this one file.” And so I do think there's something true there where having either taking only what you need or the dependencies just getting incredibly small over time, I think will help to some degree, but it's not going to solve the fundamental problem, I don't think, because the vulnerabilities in an agent looking at them, there's time and time again, there's a million different ways in which we can convince an agent that this thing is, secure or not and pull it in. Or we can do static code analysis or runtime testing to say whether the code works or not. That is, I think, the step that needs to continue to be, invested in. The question is just on, how much scope. Should it be this enormous project that I'm pulling down, or should it be this piece? Either most companies are running some amount of security checking on the on the packages that they're bringing in or vendoring. That I think won't change. That's like what advanced security does to some degree, Socket does some degree. Like everyone is doing a piece of that. How we each do that like especially when we're talking to enterprise customers, is just like very different. No there's no one wants one single way to do it. And I think that's always been GitHub's, unique position in the world. I talk a lot to maintainers, I talk a lot to folks about this. It's we're— we rarely start like a process and a practice and like push it onto the community. We usually wait for the sort of like RFC process socially or literally, everyone agreeing, and then we'll cement something in. Because otherwise we'reMaintainers, RFCs, Vouching, and the Social Layer of TrustSwyx [00:31:35]: That fits your role in the ecosystem, yeahKyle [00:31:36]: We're GitHub. Yeah, we don't want to shape the whole thing. We want it to be figured out. But like how do you balance that like sort of Role in the industry to keep everything as secure as is possible and make sure that you're you're not going to be compromised as a human, ‘cause that's usually how it all happens. And Not not create a process or lock us into a flow that you're not going to or like Mitchell's not going to or other open source projects aren't going to like. That's always been a tricky balance for us, and I think that's something that we haven't talked about enough is we're not going to be able to fix everything for everyone in a way that everyone is going to like. So tell, help us, tell us what is working. When Mitchell was talking about, the Upvote, the upSwyx [00:32:22]: I was going to bring up his thing. Yeah.Kyle [00:32:23]: I forget what it Yeah. When he's talking to us, I was chatting with him and talking to him about this and I put it on Twitter and we talked to, also over DM, was “We're going to keep working.” but I think the important thing is I do actually want to hear what isn't working for you. And as, be as specific and clear for your project as is possible. And to every piece of credit over the many years that we've known each other through the industry, he's always done that and I appreciate that ‘cause there are places that we need to fix up, and we hear from him, and we'll fix up just like we do all other kinds of maintainers. But that that process between making those types of improvements and being more secure and like creating, I forget what he calls it's not the proof process, not the claims process. Do what I'm talking about? He has that he his projects have a way for you to kind of like,Swyx [00:33:13]: VouchKyle [00:33:13]: Vouch. Thank you. Yeah. He has like the vouch system for saying, “Hey, you should accept my PRs.” That's beenSwyx [00:33:20]: I just built this into GitHub. I don't know.Kyle [00:33:22]: Well, see, but that's the thing is that you say that and like he and his community really likes this and then I'll go talk to other maintainers and other maintainers, globally, and they're “No, this doesn't work for me.” And that is the tension, but also the kind of beauty of GitHub, depending on which way you look at it is we want to help maintainers, so we create all these tools to let you have more control over how much you take in from AI and PRs. But you can also use this. What You can go use this project, and if it takes off and becomes the kind of mostly standard, then yeah, we probably wouldn't enforce it but we would add it in because that's the flow that we tend to do?Swyx [00:34:02]: I hear a lot of people don't know the history of the pull request. And like like that's how, that's something that GitHub standardized basically.Kyle [00:34:08]: Yeah. It was a very messy process Like beforehand, and now the we have the benefit of it being the process? And now we have to go and Figure out the next best process or what adaptations change, or what does a pull request look like when eighty percent of your PRs are just coming from your agents and not From other devs?Swyx [00:34:31]: Do you like the prompt request idea from Peter?Kyle [00:34:34]: like I think that for each like each idea I think has its merits. I'm not, I'm not avoiding saying anything good or bad, but I feel like I've seen a version of we have that we have entire Thomas' store. Take all the assets of what you've built and put that in. I think that's got great ideas. There's all these various permutations of the PR flow, but I think the reason why there's not a single answer is ultimately we're trying to codify trust. We're trying to say “Okay, if Sean reviews this I'm going to trust it because you're Sean or you're the senior dev or you're the whatever.” And right now, when we are working in a flow where an agent writes code and another agent reviews code and then Kyle goes and looks at it the trust is kind of diffuse. And most of the tools that we're talking about are talking more about verification flows. We have more assets to look at, so I can probably say whether this is a good PR or not. But that still doesn't solve, I think, the human problem of I'm looking at a PR and I want to know if I can trust it. And we're still, we still tend to use human signals for that? Mitchell approving it or Kyle approving it or whatever. And so I think that's, I think that's why most of these options haven't really solved it is because, it's a social problem ultimately. It's a it's a human problem to review it and agree. Or you fully trust the tool and you're imbuing that tool with full trust Which I think in some cases that absolutely exists.AI-Generated PRs, Trust, and the Waymo AnalogySwyx [00:36:08]: And so like in the same way that there will be a tipping point in society when we don't allow humans to drive anymore Because machines are measurably better than Than humans. I'm looking for that tipping point, right? Like Mythos is ridiculously expensive. Someday we'll have Mythos on a desktop. I don't know. Will, does that change the equation?Kyle [00:36:30]: I think it's more I took a Waymo here, and I was on my phone and not looking around at all. There are other, self-driving, vehicles that I would not trust while, staring at the road. And I think that trust is something that isSwyx [00:36:48]: Is this a Zoox thing? What is itKyle [00:36:50]: I think that is both. I think that is both. LikeSwyx [00:36:53]: There's Zoox in this robo taxi. That's it. It'sKyle [00:36:56]: Well, depending on what level Of self-driving. But, my point is sort of that I think part of that is I strongly believe that's, a mixture of verifiable proof. Like how many accidents, how much data, and so on, and the human aspect of how I feel when I'm in this car, what it tells me, et cetera. And so that's why I think some of the like Some of these some of our AI tools tend to, imbue me with more of that feeling of trust, even if the data says this is 100% accurate. I feel like it takes more time for us to go, “Should I trust this or not?” And that's in the soft sense of, startups with high agency, weekend projects, and open source. And then there's enterprises and regulated industries and everything else, and that is an even harder problem to go solve because even when it is fully verified, not only do you have to have trust from the humans on the team, you probably have to have trust from multinational,Swyx [00:37:55]: Oh my GodKyle [00:37:55]: Multi governments around the world and regulating agencies. And so that's where I feel like until we tip over to your point on the sort of like human EQ side of it. I feel okay this feels okay I've been proven enough. Then the ball will start to roll a lot faster, where we'll end up getting to the “Okay, we can trust this,” and feel good about it in the Most difficult of cases.Reputation, Sponsors, Stars, and Bot Activity on GitHubSwyx [00:38:18]: If human trust is the thing that matters, I feel like GitHub as the developer social network could maybe do more there. Like vouchers are one system But, we have star counts, and then we have Contributor rights, and that's it. And I feel like there should be more in that space. I don't know if there's any other design decisions there.Kyle [00:38:37]: I think that one of the places that we don't really expose right now in this sort of way is, some degree of like hard trust and support, which would like for me is like sponsors is a good example of that.Swyx [00:38:49]: Ah.Kyle [00:38:49]: It like costs you something. To prove that I believe in your project and I trust you To some degree or I want to support you at the very least.Swyx [00:38:56]: Solve payments for open source. Why not?Kyle [00:38:58]: I think that I think that like as we keep moving forward, right, there's more and more projects where I'm, adding more and more dollars into sponsors personally because I want to like support them, but I also like know of I've probably never met them in person, but, I know of enough of their work that I want to support them. I think the thing that I don't love about stars or commit counts or anything else is ultimately, even with all of the various, abuse and de-spamming and deduplication work that we do or anti-abuse work that we do, these are all, not active social signals. They're passive ones that are ultimately gamifiable. And you may trust me, but another open source maintainer may not. And on what heuristic should you be, trusting me? That I think, is kind of where some of our thinking is right now. What signal from me is most important to you? You— If you can define that potentially, honestly in an agentic workflow that's what we see some of these open source projects do, where you have GitHub actions, and then you have like an agentic workflow that's calling AI, and you're setting these rules. Like if Kyle has submitted and gotten accepted PRs across any given project and has a social handle tied to his account in GitHub, and that social account's older than a certain amount. Really complex measures that matter to you ‘cause most open source projects have that heuristic built into their heads, if not written down in the contributing guidelines. You could take that and then go apply that and then just say, “Oh, we're not going to accept this PR.” Building something that is, I think, malleable to everyone's needs, is a little bit better, rather than going “Hmm, this account's too young.” Because what happens? The attackers just go and go and create a multitude of accounts, and they wait Until it ages up. Needs to have a certain amount of stars. That's how star inflation happens. Need to have a certain amount of reposSwyx [00:40:46]: Oh my God. YeahKyle [00:40:47]: With PRs. They all just create repos and submit PRs to each other, and then they come in and do something nefarious. And so, it's hard. It's hard to find the measure. So I think we're, we're looking more at how can we provide you tools so you can kind of choose what's best for you. And of course, we'll give you some standards. But the trust vector, gets down to I don't know, some version of like human digital ID like everyone's been talking about. Like how do I prove that it's meSwyx [00:41:13]: Give me your eyeballsKyle [00:41:14]: On the internet. Give me your eyeballs. Exactly.Swyx [00:41:18]: The I got to keep moving on Topics, but obviously I can go all day on this stuff because, I've been involved in GitHub and open source My entire professional career. Stars. Very superficial. Everyone knows it. But I think time to one hundred thousand stars is the fastest I've ever seen. Like people just reached that in I don't know, months. And then like at the same time I don't trust it right? Like how many of these are real or bot or like whatever. I don't know how to ask this but like what can we do about it? LikeKyle [00:41:49]: JustSwyx [00:41:49]: Is stars broken? Is stars fine?Kyle [00:41:51]: I think that there's kind of two, there's like two pieces. Obviously we're constantly like trying to find ways in which like your users are producing spam, which would, I would include like be like only doing star gamification. When we find them, we pluck ‘em out and we,Swyx [00:42:08]: But it's like a Whac-A-MoleKyle [00:42:10]: It's a hundred percent like a Whac-A-MoleSwyx [00:42:11]: There's no wayKyle [00:42:11]: Now, powered by AI to be helpful. But I think more so what I'm seeing is, a lot of the like fastest time to X tends to be because we're now inviting so many more people into like software development on GitHub That like the zeitgeist is just swarming? And it'sSwyx [00:42:32]: It's not just developers anymoreKyle [00:42:33]: And it's not you and I. Like like however you want to say like what a developer is it's not just folks who have been coding for a very long time. It's folks that have maybe started coding or only joined in since the AI era. And nowSwyx [00:42:44]: what's the latest Octoverse number? I know eighty million was my lastRem- member that a number of developers on GitHubKyle [00:42:50]: Oh, we're over 200 million now.Swyx [00:42:53]: Okay. Well, so you see?Kyle [00:42:55]: Like over 200 million developers now.Swyx [00:42:56]: But it's not developers, right? It's, it's people with a GitHub account.What Counts as a Developer in the AI Era?Kyle [00:43:00]: So, so this is, this is the biggest debate that I would say, everyone loves to have at GitHub at this point. From my perspective, right, I think that there's, there's clearly a difference between, professional enterprise developer and then developers. But I think that I think that the idea that we should be I don't know, splitting hairs or segmenting developers in the early era of software development is, not worth our not worth the time. SoSwyx [00:43:29]: When you get into gatekeepingKyle [00:43:31]: 100%Swyx [00:43:31]: What is a developer?Kyle [00:43:31]: 100%. ‘Cause I wasn't a developer when I started writing code? I was going toSwyx [00:43:36]: Oh, no. I made— I cloned a thing, seven years before I learned to code. And then I and then I wrote about my learning to code journey, and people Just called me a fraud ‘cause I had a GitHub account. And I'm “Well, no, I just use GitHub, but I don't know-” “I didn't know what I was doing.”Kyle [00:43:49]: I I remember that. I remember those sets of posts, and like that's, that's b******t. So I fight very clearly on the line of, if you create code, if you have an idea and you create it into some way of, I'm, I'm going to run it and use the app right now, you may still use AI in that moment, but that's okay. At some point you're going to do the next thing. You're going to create a big— You're going to have to learn about this database. You're going to fix a bug, whatever. We're all on some same journey, and those people are also hearing about the great new agent skill package or a new CLI tool or a new whatever. And those projects are going up because you want to be a part of this moment, just like I wanted to be a part of the Ruby community when Ruby was popping off when I started becoming a developer, and now I can just click the star button. And so I think that yes, there's clearly some amount of like spamming and game gamification that we're working against, but I really think we're just seeing this whole new cohort of folks that are moving from technology to technology because they're not working on a 20-year-old software application. They're working on a side app that they built on the weekend for their friends or for their new idea or whatever. And that's how you see these enormous charts going up and to the right with With stars.Swyx [00:44:59]: I think something that's remarkable is the persistence or, that GitHub extends to those folks. Usually when I see platforms go into a new audience, they usually have to, have like a second platform with a different name that wraps the main platform. But somehow GitHub has been able to sort of persist and extend, and it's friendly and whatever? So it's, it's nice.Spark, Low-Code, and Always Showing the CodeKyle [00:45:19]: I that's partially why I think as we've tried to move into I don't know, more like low-code-y things. We so we started working on Spark as like a way to, build an app and run it. I think that the reality is that we anytime we try to, kind of put even a veneer on top of it without when we put a veneer on top of something, we still always show you the code. That's kind of like a tenant. We're never going to, hide the code from you ever, because whatSwyx [00:45:52]: Why would you?Kyle [00:45:52]: That's, yeah, that's the whole point? However, I think that what we learned with things like Spark is that really the value of Spark for most devs is, easy runtime. And you may have a runtime or a host that you're going to use for that or you just build something and run it but, the package of making that even more simple isn't really needed for folks that are trying to build software and not just trying to build, an app, which is, slightly different, a slightly different goal. So I want to get you in, I want to get you comfortable. I think the best thing for me as, someone that did not traditionally come into software dev way back, I want anyone to be able to breach that chasm and not be in the I don't know, I feel like we're, we're still in an era of, STEM. I've got a 12-year-old and an eight-year-old, and it's “We got to get ‘em into STEM,”? Over and over. And I like I do, I do the things that good parents do. I was “Oh, you want to do coding?” “Yes, I want to do coding.” Do coding classes. But now they're just not afraid of doing software. And that's, I think, the thing that's honestly kept me at GitHub for so long. Anyone should be able to go and build a thing, just like I can go change a light switch in my house. I'm not going to go into the breaker box ‘cause I'll probably kill myself? But, I can go change that light switch. Everyone should be able to go and say, “This fricking app doesn't do what I want. I want it to work like this.” And that I think, is what's kind of kept us all connected with GitHub through the years and some and during the easiest of times or in the hard times because of that opportunity of, we're the home for all developers, and we want everyone to be able to have that feeling that we've had of, had an idea, I created it and holy s**t here it is.Swyx [00:47:37]: Here it is. All right, I'm going to try to do more spicy questions.GitHub's Hardest Scaling Moment: Growth, Agents, and UptimeKyle [00:47:42]: Great.Swyx [00:47:42]: Is it an easy time now or a hard time?Kyle [00:47:45]: Oh at GitHub? It's a hard time. Like, it's a hard time and also, I was just with my team and I said, “This is also, the best and most exciting time that I think I can remember at GitHub.” BecauseSwyx [00:47:57]: Best of times, worst of times. It's never oneKyle [00:47:59]: ‘cause we've we were talking about Octoverse reports and, usually we do an Octoverse report once a year, and we look at the numbers, and we say, “Oh my goodness.” I was at Universe in October saying, “This was the fastest year of growth that we've ever had,” right? And now we're doing more in a month than we did in a year last year.Swyx [00:48:20]: You're talking about PRs.Kyle [00:48:21]: Commits.Swyx [00:48:21]: Commits, yeah.Kyle [00:48:22]: PRs. Kind of like you name it by roughly every measure that we're looking at, there's some amount of sort of growth that is much bigger, and that is breaking our system in new ways, not old ways. Like webhooks were always notoriously, unreliable over the years?Swyx [00:48:38]: Whose fault is that?Kyle [00:48:39]: not anymore mine, but for a period of time, I'm sure you could pull up a tweet that was “It was me. I'm sorry.” but, now, that got rewritten at a scale level that is still working and is not having problems today. Now what we're finding isn't just the isn't the-The simple stuff that folks are on the sometimes on Twitter or on the internet are “Hey, why is this like this?” Sure. There's absolutely silly problems that we shouldn't exist. But now we're talking about, unique, novel permission problems that happen only at a scale across all different objects or whatever, that now we have to go rewrite this underlying system. And so it's, there are problems that yeah, caught us off guard, which I think I said. Like the growth is astronomical, but also we're making such material progress in that I'm excited once we're once we've kind of like reimagined the underlying foundation layer, or pieces of it at least, what's going to be possible when it's not just all of us and all the new people that are being developers and all of their agents and all the tools like working together. Because that'll still happen in that in that GitHub tool, that GitHub community. But it's a it's a hard day anytime we can't give you what you're looking for. We have the same problem internally. We operate through github. Com. Of course, we have backups when things go down and whatnot for our own operations but we feel it too. If it's not working it's not working for us, and that's kind of like the promise of dogfooding for GitHub. It's always been true. We're using the same tool you're using. We're not using a super secret version. We and so we also need it to be great for us for our customers of course for open source. And now an exponential growth of agents, Doing it too.Swyx [00:50:32]: I wanted to load for audio listeners who maybe haven't seen your tweets, whatever. So one billion commits in twenty-five. Now it's two hundred and seventy-five million per week on pace for fourteen billion this year, if growth remains linear. Is that still the pace? I don't know. It's been aKyle [00:50:48]: it's, it's speedingSwyx [00:50:50]: Roughly.Kyle [00:50:50]: It's still speeding up.Swyx [00:50:51]: It's, it's April, so yeah.Kyle [00:50:51]: Exactly. This was in April.Swyx [00:50:53]: All right. So basically you have fourteen x growth, right? Year on year on year. And I think that's a scaling issue. I think, I'm going to like try to really steel man this thing. People have experienced fourteen x growth. They haven't had your downtime. And that's like— C-can we go dig into that? Why? Like what's the— what broke? What are we doing to fix it? Like just anything for the community to reassure them.Why GitHub Reliability Is Breaking in New WaysKyle [00:51:18]: so there's a Like I was saying, there's a couple different places that we've seen the growth issues. Some of the growth issues, which is why we're t— I was talking about pushing hard on more CPUs is in actions in particular. More tools, more agents, more PRs mean more builds, more builds mean more CPUs. And so we are expanding through not just our data center, but obviously we were talking about moving to Azure and moving to, adding an additional cloud compute because we simply need more CPUs. Not as much GPUs. We definitely need GPUs too, but now CPUs are becoming a factor.Swyx [00:51:53]: It's very CPU heavy.Kyle [00:51:54]: Underneath the hood when it comes to some of the underlying services, we've been breaking up over the years our database infrastructure, so that way we have, more cognitive separation between our the various services. The place that we continue to have pain is in, permissioning. And so right now m-many of our permissioning layers sit into a database that we like internally call MySQL One, and old Hubbers will know what I'm talking about. And so we've been pulling things out of MySQL One for many years, because like and we use we use Vitess and we use other technologies to shard and we do it as one bigSwyx [00:52:31]: Famous thing, PlanetScale was born from this andKyle [00:52:32]: A hundred percent. Sam Old Hubber and friend. And so finding these opportunities to like break this out and then do that globally. The other thing that I think is interesting and both a unique opportunity and tricky is we also run everything I just talked about in a black box container with GitHub Enterprise Server for people that work on-prem. So we take everything I just said, and we also do it on-prem, and we also do all of that and we do it in a data residence setup for customers that need to have their data in a single location. Each of these has the unique characteristic around how we're sort of storing that data in MySQL or in a permissioning setup. That's where some of these outages have oc-occurred, where you're seeing it more like across the board rather than just like the one pieceSwyx [00:53:17]: Filling the databaseKyle [00:53:17]: Isn't quite working. Exactly. And so part of it is that. I think there's been some other places where agents are much more or more projects appear to be moving towards monorepo versus we were going the other direction for many years in the industry. Repos were smaller, but there were more of them, and now we're seeing the opposite. Repos are bigger, and there's, not fewer of them per se ‘cause there's new growth, but, we're just seeing many more big repos. Big repos, big monorepos have always had, a unique performance problem. Because each one, is slightly different if, particularly if the underlying blobs are incredibly big Inside the repos. And so we've done a ton of work that you pro— like most people haven't probably experienced, unless you're in this case of the monorepo. But that Git, infrastructure layer improvement does help the overall, system because, many of the improvements that make monorepos work better make all repo infrastructure work better. And so, I could kind of keep going down the line where it's another thing where we're moving out of, We're changing how we do j I'll just say job queuing for lack of a better, explanation changing the underlying technologies there.Swyx [00:54:32]: I spent two years being a job queuing guy, so.Kyle [00:54:34]: And so it's kind of a little bit of a little bit of piece by piece, and it's mostly because as we were— as it was built, we built everything in a way that assumed, I guess in some ways that the size of the pipe of work was going to remain the same. There's just going to be more people coming through each of those pipes. But instead now in places whereA git push was, generally a certain size for example, is now, no longer true.Swyx [00:55:03]: Oh, yeah.Kyle [00:55:03]: OrSwyx [00:55:05]: I push a thousandKyle [00:55:06]: On the average. 100%Swyx [00:55:06]: A thousand line commits like dailyKyle [00:55:07]: Same thing with PRs. Like PRs same thing. And like we've talked about optimizing that and making changes where, and there were technology choices that did not work there? And it got slow, and it didn't It was not fast. It did not do what the users wanted. And so we've been reeling that all out and going “Okay, that's just not right. Let's stop putting good money after bad and do it the do it the right way or the right way now.” So there's It's a it's a lot of things, not quite when I've experienced scale at GitHub historically, it's almost always two options that we've used. We go vertical scaling, particularly with databases, right? And we go horizontal scaling. Oh, we just have more people using this service. Great. We're going to add more servers, and we rack them in our data center, or we use it in a cloud. And now we're sort of in a like diagonal, where like vertical doesn't really work anymore. Horizontal isn't work either because we're all We all have some CPU or GPU constraints in the world now, and now we have to go in and like crack open services that have been running for 10 or 15 years and go, “Okay, the rules of this service have legitimately changed, and now we have to rewrite them.” None of this is an excuse. This is like we're We have to do the work. We have to make it better.Swyx [00:56:22]: actually as an infra guy, I'm “This is like one of the most fascinating scaling challenges I've ever seen.”Kyle [00:56:26]: That's that's, that's the thing that's the thing that it's hard for Like when we weren't talking about it publicly, and I was like I came out, and I was “Hey, I just want to explain what's going on.” Part of it comes from a very old GitHub ethos, which is it's our it's our uptime. It's down. W What I know you're a developer, so you're, you're inclined to want to understand more what's going on. But at the same time us going “Hey, this service didn't, perform the way we expected, and now we have to go change it,” we weren't We're not trying to hide anything from you i
terralayr CEO Philipp Man joins NPM to discuss the evolution of the company's virtualised battery storage model and the rapidly changing dynamics of the German BESS market.Philipp reflects on the original idea behind founding terralayr in 2023, key milestones in building its proprietary trading platform, and where the business now stands following a landmark equity raise concluded earlier in 2026. The conversation also explores financing conditions for battery storage in Germany, evolving revenue stacks, regulatory change including grid fee reforms, and how geopolitical events are intersecting with BESS deployment in Europe.NPM is a leading data, intelligence & events company providing business development led coverage of the US & European power, storage & data center markets for the development, finance, M&A and corporate community.Download our mobile app.
PHP Podcast – May 28, 2026 Hosts: Eric Van Johnson & John Congdon Links from the show: PHP barely avoided disaster – YouTube CVE-2026-45793: Anatomy of a 14-Hour PHP Supply-Chain Near-Miss · graycoreio/github-actions-magento2 · Discussion #261 · GitHub An Update on Composer & Packagist Supply Chain Security PHP Tek: A Homecoming by Ben Ramsey Tek Roundup – Roave Speaking at PHP Tek 2026! #tech – YouTube PHP Tek is behind us, the ballroom is cleaned up, and we’re back to talk about all of it. Here’s what we covered: RIP Archie Bot After a long fight to keep him alive, Eric has officially retired Archie — the Discord bot built on OpenClaw that handled team standups, monitored PHP Architect’s Twitter/X group for join requests, and did a surprising amount of background work for the consulting team. When Anthropic shut down the OpenClaw API, Eric tried every model and service he could find to bring Archie back to form, but nothing got him all the way there. After a month of “almost working,” the call was made. He’s dead. Eric hasn’t ruled out revisiting it eventually — maybe with Claude Cowork — but for now, the bot is gone and the starting-soon link in Discord is broken because of it. Reviving a Six-Year-Old Codebase A client PHP Architect Consulting worked with from 2018 to 2021 has come back. The project — a reimagining of their app — was killed off when COVID hit and the CEO couldn’t align with the team’s vision. The last commit was six years ago. Now the client wants to bring it back, and Eric is spending the next few days analyzing what it’ll take to get it running again. Outdated packages, an old PHP version, and the general entropy of time are all on the checklist. Eric has genuine affection for this codebase — it was one of the first projects where he felt like the team was truly operating as a team, not just as an extension of him. Now it’s time to dust it off. Partner Spotlight: PHP Score → Our CVEs The PHP Score sponsor read may be getting a refresh — the folks at Artisan Build, who built PHP Score, have a new product they’re excited about: ourCVEs.com. It monitors your codebase’s Composer and NPM packages — and optionally your servers via a lightweight agent — for exposure to open CVEs, and alerts you when something needs attention. Pricing is generous: free forever for open source projects, $17/month for solo devs, $83/month for teams (or $1,000/year), with server monitoring scaling at $1 per server above 50. Ed from Artisan Build was at PHP Tek and made a strong impression. Go check it out at ourcves.com. How PHP Barely Avoided a Supply Chain Disaster Brent Roose released a 22-minute video covering a near-miss in the PHP ecosystem involving GitHub and Composer. The short version: GitHub changed their token format and briefly released it before Composer was ready to handle it. Composer was logging the token when the format check failed — meaning GitHub tokens were ending up in CI logs. In GitHub Actions, depending on how your action is configured, that container (and its token) might stick around for a while, giving an attacker a window to act. An alert developer caught the issue, used Claude to help research it, then did responsible disclosure — contacting the Composer maintainers and reaching out to Taylor Otwell, Vincent Pontier, and others in the ecosystem to disable their actions until the fix was in place. Update your Composer. GitHub rolled back the new token format but won’t keep it rolled back forever. Packagist MFA and Account Security Following up on the supply chain theme: Nils and Igor (Composer/Packagist maintainers) released a blog post on what they’re doing to improve supply chain security. The immediate ask for anyone publishing packages is to enable MFA on your Packagist account — it’s not required yet, but it will be. Eric went to check his own account, found MFA was already on, but noticed his username was still “diegodev” and he was using an old email. While updating it, he noted that Packagist didn’t require him to re-authenticate or confirm the change via the old email — a gap worth flagging if you have popular packages and someone ever gets into your session. PHP Tek 2026 Recap — The Good PHP Tek 2026 in Chicago is done, and despite everything (see below), the team is proud of how it went. Some highlights: Holly (CodeLorax) built a conference mobile app from scratch, released on both Google Play and the Apple App Store within 24 hours of the conference opening. The app let attendees build their own schedule, detected conflicting talk selections, sent push notifications when talks moved rooms, and even included a vendor lead-scanning feature where vendors could scan attendee QR codes to capture contacts. It was a genuine game-changer for the event. Eric and John named the conference elephant after Holly in appreciation — she also changed a trailer tire during setup, which sealed the deal. Clayton Kendall sponsored and produced the conference shirts and bags on an extremely tight timeline — shirts two weeks out, bags just one week before the event. Both were a hit. Attendees at the conference were getting questions about the rainbow PHP Architect shirt in particular. A job fair ran for the first time, with four companies represented. One hiring manager showed up even though they already had 1,400 applicants — because they knew that conference attendees are exactly the kind of motivated, self-improving developers they want. Attendees got to ask questions directly, including the real-world stuff like remote vs. office. Eric would love feedback on how to make it better next year. JS Tech debuted as a fourth track alongside the three PHP tracks, bringing in fresh faces from the JavaScript community. Eric came away energized by the cross-pollination — different people, different approaches to similar problems. Ben Ramsey and James Tickham (Rove) both wrote great blog posts about the conference. Ben’s will be featured in the magazine. Diana Pham also put together a video recap. Links in the show notes. PHP Tek 2026 Recap — The Incident On Monday during final setup, a hotel employee had a medical incident while walking through the main ballroom — leaving a trail that required hazmat-suited cleanup crews and forced the team to quarantine the ballroom, the hallway leading to it, and the adjacent bathroom. The person is okay and was back at the hotel by Friday, which was a relief. But in the moment, nobody knew what was happening or how long the room would be unavailable. The team had to rebuild the entire conference footprint overnight. The keynote moved, the JS Tech track went into the quiet room, vendors moved to the atrium, and the hotel staff — to their enormous credit — cleared their own furniture and accommodated every ask without complaint. Attendees were equally patient; once they understood the situation, there was no drama, just “tell us where to go.” The incident also took out the streaming setup for day one, compounding an already-difficult start. The solution that eventually worked — plugging the Ethernet into a hub before the streaming equipment — wasn’t tried until day three. Eric is mad at himself for thinking of it and not doing it sooner. PHP Tek 2027 — Save the Date (TBD) Planning for next year is already underway. The current target is April 2027 — away from the May timing that caused Eric to miss two of his kid’s band performances this year. Nothing is locked yet, but they’re working through venue and date options and hope to have an announcement soon. Links from the show: ourCVEs.com — Daily security audit on autopilot PHPScore — Technical debt monitoring for PHP Brent Roose — “How PHP Barely Avoided Disaster” (YouTube) Packagist — Enable MFA on your account PHP Architect Discord PHP Architect Merch Store PHP Architect YouTube Host: Eric Van Johnson X: @shocm Mastodon: @eric@phparch.social Bluesky: @ericvanjohnson.bsky.social PHPArch.me: @eric John Congdon X: @johncongdon Mastodon: @john@phparch.social Bluesky: @johncongdon.bsky.social PHPArch.me: @john Streams: Youtube Channel Twitch Connect & Hire PHP Architect Website Twitter/X Mastodon Hire PHP Developers Looking to hire PHP developers? Email support@phparch.com – Joe and the team are available for consulting, infrastructure work, Ansible playbooks, and code review. Partner This podcast is made a little better thanks to our partners Displace Infrastructure Management, Simplified Automate Kubernetes deployments across any cloud provider or bare metal with a single command. Deploy, manage, and scale your infrastructure with ease. https://displace.tech/ PHPScore Put Your Technical Debt on Autopay with PHPScore CodeRabbit Cut code review time & bugs in half instantly with CodeRabbit. Music Provided by Epidemic Sound https://www.epidemicsound.com/ Join Us Live Next Week Youtube Channel Got feedback? Join us on Discord at discord.phparch.com The post The PHP Podcast 2026.05.28 appeared first on PHP Architect.
Today in the business of podcasting:A new Signal Hill Insights Pulse Report, conducted in partnership with FlightStory, finds that 45% of monthly podcast consumers in the U.K. used a smart TV to listen to podcasts in the past month, making it the second most-used device behind smartphones and ahead of computers. More than half of video podcast viewers watch during prime time hours, suggesting video podcasting is becoming a living-room, prime-time television behavior in the U.K.NPR, its sponsorship subsidiary National Public Media, and podcast distributor PRX have announced a collaboration enabling NPM to sell sponsorships for station-produced podcasts hosted on PRX's Dovetail platform, with stations including Boise State Public Radio, LAist, and WWNO already participating.Podcast subscription platform Supporting Cast has launched delivery of gated, subscriber-only video on Spotify using Spotify's Distribution API, making it the first subscription platform to offer this capability. Journalist-led podcasts Libero and Legacy are the first shows to use the feature.Media writer Brian Morrissey examines how AI-driven changes to Google Search are accelerating the decline of page-view-based publisher models, and argues that the recent Vox Media sale reflects how owned podcast networks have become central to media companies' valuations.To find links to these, and every article covered in today's episode, click here. You can also subscribe to The Download's newsletter to receive the full issue straight to your email inbox every day.
Entérate de lo que está cambiando el podcasting y el marketing digital:-Spotify amplía su apuesta por el audio hablado con artículos narrados.-Del dolor personal al éxito internacional con “Vos Podés”.-NPR, PRX y NPM impulsan alianza para monetizar pódcast públicos.Tips y herramientas para podcasters-CapCut impulsa la edición de vídeo con IA junto a Gemini.PatrociniosSuscríbete a la newsletter de Vía Podcast y recibe a diario en tu bandeja de entrada las últimas noticias de inteligencia artificial, marketing digital y podcasting.Este episodio es presentado por RSS.com, la plataforma de hosting de pódcast que te permite publicar, distribuir y monetizar tu pódcast de forma sencilla. Lanza tu pódcast hoy mismo y haz crecer tu audiencia con herramientas profesionales y analíticas avanzadas.
Today in the business of podcasting:A new Signal Hill Insights Pulse Report, conducted in partnership with FlightStory, finds that 45% of monthly podcast consumers in the U.K. used a smart TV to listen to podcasts in the past month, making it the second most-used device behind smartphones and ahead of computers. More than half of video podcast viewers watch during prime time hours, suggesting video podcasting is becoming a living-room, prime-time television behavior in the U.K.NPR, its sponsorship subsidiary National Public Media, and podcast distributor PRX have announced a collaboration enabling NPM to sell sponsorships for station-produced podcasts hosted on PRX's Dovetail platform, with stations including Boise State Public Radio, LAist, and WWNO already participating.Podcast subscription platform Supporting Cast has launched delivery of gated, subscriber-only video on Spotify using Spotify's Distribution API, making it the first subscription platform to offer this capability. Journalist-led podcasts Libero and Legacy are the first shows to use the feature.Media writer Brian Morrissey examines how AI-driven changes to Google Search are accelerating the decline of page-view-based publisher models, and argues that the recent Vox Media sale reflects how owned podcast networks have become central to media companies' valuations.To find links to these, and every article covered in today's episode, click here. You can also subscribe to The Download's newsletter to receive the full issue straight to your email inbox every day.
In episode 322, the co-hosts examine critical vulnerabilities, changing security standards, and adaptive defense mechanisms. They deep dive into the recent "Megalodon" breach, identifying it as a direct poisoned pipeline execution attack. Rather than exposing a flaw inside GitHub itself , researchers at Hudson Rock traced the root cause to credentials stolen from developer desktops via infostealer malware, which allowed attackers to push base64-encoded payloads into GitHub Actions workflow YAML files. To counter these types of automated supply chain threats, the hosts praise NPM's newly released "staged publishing" pipeline, which mandates two-factor authentication from human maintainers before releasing packages pushed by automated CI/CD workflows. Shifting to framework flaws, they highlight a catastrophic, vanilla SQL injection flaw discovered in GoCMS during active exploitation. Finally, the duo reviews the emergence of AI-powered honeypots highlighted Talos Intelligence. They conclude that turning the tables on attackers by utilizing LLM-driven "hall of mirrors" environments to impersonate real systems represents an innovative, under-explored AppSec strategy designed to drain attacker resources and trigger high token costs.
Today our “Story Series” continues with a conversation I had with Brother Louis Canter. Brother Louis is a member of the Order of Ecumenical Franciscans and founder of Franciscan Ministry of Peace. Louis has been involved with ecumenical and inter faith dialogue. His work as a composer and concert artist for International Liturgical Publications has offered him a wide variety of opportunities for dialogue with many faith communities. Brother Louis was kind enough to sit down with me and discuss how he decided to become a Brother of Ecumenical Franciscans, his pastoral music initiatives, and why he thinks NPM is as important as ever in our ministry of the Church today.
In this episode, host Kyle Younker sits down with Liam Weld, Head of Data Centers at Meter, to explore how AI is reshaping the modern data center from the inside out. While data centers are often discussed in terms of land and power, Weld explains why networking has become one of the most critical—yet underappreciated—drivers of performance and cost in AI facilities.The conversation unpacks the shift from traditional north–south traffic to east–west traffic between GPUs, why AI clusters now function more like a single massive computer, and how latency, reliability, and cabling can directly impact model performance. Weld also details Meter's vertically integrated approach to networking, spanning hardware, software, installation, and operations, and how this enables automation and long-term efficiency. NPM is a leading data, intelligence & events company providing business development led coverage of the US & European power, storage & data center markets for the development, finance, M&A and corporate community.Download our mobile app.
Take the 2026 AI Engineering Survey and get >$2k in credits and AIE WF tickets!This was recorded before Railway suffered a major GCP outage on May 19, despite being a multi-AZ, multi-zone mesh ring, with HA fiber interconnects between their Metal GCP AWS, because workload discoverability was unintentionally still tied to GCP. All has been resolved with a post-mortem.Railway did not start as an AI infrastructure company.It was founded in 2020 years before agents became the default way people thought about deploying software. Jake Cooper, formerly at Bloomberg and Uber, started Railway with a simple obsession: the activation energy to ship something to production should be near zero. Push code, get a URL, iterate. No Docker files, no Kubernetes manifests, no Ansible scripts stacked on Ansible scripts.For years, this was a slow grind. Railway spent its first 18 months hand-acquiring its first 100 users with Jake personally greeting every Discord signup on a second monitor.Today, Railway has raised $124m and is growing very fast. A 35-person team supports 3 million users, adding roughly 100,000 signups a week. Their bare metal data centers have a 3-month payback period vs. renting in the cloud, with 70% margins funding aggressive cloud bursting when needed. The servers they own have actually appreciated in value as RAM prices have climbed basically meaning the value of their hardware now exceeds the capital they've raised.From rebuilding Railway's network overlay over a weekend to moving the vast majority of workloads onto its own bare metal data centers, Jake Cooper is trying to build a new cloud for an agent-native world. In this episode, Railway's founder and “conductor” joins swyx and Alessio to unpack why the next era of software infrastructure is not just “Heroku but newer,” what agents need that humans did not, and why the old deployment loop of Git, PRs, CI/CD, and static cloud resources may be heading for a rewrite.We go deep on Railway's infrastructure stack: own-metal data centers, three-month cloud payback periods, cloud bursting, data center debt, Railpack, Nixpacks, Temporal, feature flags, Central Station, content-addressable filesystems, agent-safe production forks, and why the CLI may become more important than the canvas in an agent world. Jake also shares the founder journey behind Railway, how the company survived losing $500K/month, why it now serves millions of users with only 35 people, and why he believes the pull request is dying.We discuss:* How Railway went from a slow six-year grind to adding 100,000 users a week* How Railway thinks about agents as the next dominant software species* Why agents need version control, observability, compute, storage, and orchestration at 1000x scale* The economics of Railway's own-metal data centers and three-month payback* How Railway uses cloud bursting while scaling its own infrastructure* Why data center debt can be a better tool than venture debt for infra startups* Central Station, Railway's internal system for clustering customer feedback and incidents* Why responsible disclosure and over-communication matter for platforms* Why feature flags, progressive rollouts, and shadow traffic are essential for agents* Temporal's strengths, pain points, and why workflows matter for agents* Railpack, Nixpacks, Nix, and lazy-loaded content-addressable filesystems* Why “cattle, not pets” may change if you can clone the pets* Why Railway is building a new cloud from scratch instead of copying hyperscalers* The solo founder path, focus, writing, and how Jake thinks about company buildingRailway:* Website: https://railway.com/* X: https://x.com/RailwayJake Cooper:* LinkedIn: https://www.linkedin.com/in/thejakecooper/* X: https://x.com/JustJakeTimestamps00:00:00 Introduction: What Is Railway?00:02:07 Jake's Path to Railway00:06:13 Railway's Six-Year Growth Story00:08:52 Rebuilding the Business After the Free Tier00:11:17 Agents as the Next Software Platform00:13:29 Railway's Infrastructure Philosophy00:15:42 Bare Metal, Cloud Economics, and the Compute Crunch00:17:22 Cloud Bursting and Five-Cloud Networking00:20:20 Data Center Debt and Infra Financing00:23:31 Data Centers in Space00:25:24 What Agents Need From Infrastructure00:28:24 CLIs, Canvas, and Agent-Native UX00:35:15 Central Station, Incidents, and Responsible Disclosure00:40:30 Safe Rollouts, SRE Agents, and Production Forks00:45:00 AI SRE, Specs, Code, and Tests00:48:24 Self-Replicating Infrastructure and the New Serverless00:53:18 Heroku, Temporal, and Workflow Engines01:04:07 Railpack, Nixpacks, and Lazy-Loaded Filesystems01:06:01 Coding Agents, Token Spend, and Roadmap Acceleration01:10:56 The Pull Request Is Dying01:12:28 Feature Flags and the Agent-Era SDLC01:16:15 Cattle, Pets, and Cloning Machines01:19:29 Solo Founder Lessons01:24:12 Focus, GPUs, and Building a New Cloud01:28:20 Closing ThoughtsTranscriptAlessio [00:00:00]: Hey, everyone. Welcome to the Latent Space Podcast. This is Alessio, founder of Kernel Labs, and I'm joined by Swyx, editor of Latent Space.Swyx [00:00:10]: Hey, hey, hey. Today we're in the studio with Jake Cooper of Railway.Alessio [00:00:14]: Conductor of Railway.Swyx [00:00:15]: Conductor at Railway. Yeah.Alessio [00:00:16]: Choo-choo.Swyx [00:00:17]: Do you actually have that anywhere, like on your business card?Jake [00:00:20]: We call some of our volunteer moderators conductors. I don't have a business card. We're not that big yet. At some point I will. I got handed a nice business card from the Supermicro folks, and I was like, “Damn, this is pretty official.”Swyx [00:00:30]: Business cards are coming back.Jake [00:00:32]: They're cool. They're hip. The conductor thing is good. We're trying to figure out what we want to call each other internally. Some people think it's super cringe and say, “You don't need a name for people internally.” Some people want to call each other something. We still don't have a really good one.Jake [00:00:55]: We've got New Railcrews, Trainiacs. Nothing has stuck yet.Swyx [00:01:00]: I like Trainiac. Trainiac sounds good. Railwayians. For those who don't know, what is Railway? Let's give people a crisp definition up front.Jake [00:01:09]: Railway is the easiest way to ship anything. You go to the canvas, or you talk with Claude, and you say, “Deploy a Postgres instance, deploy my GitHub repository, run this code,” and you're off to the races.Swyx [00:01:22]: You've got a nice animation on the landing page.Jake [00:01:24]: Thank you. None of my work, by the way. They don't let me touch the design stuff anymore.Jake [00:01:25]: We want to make it trivially easy not just to deploy things, but to evolve applications over time. Most tooling right now stacks entropy on top of entropy: Docker, Kubernetes, Ansible scripts, and all these other things. If we can version all of your software and keep track of all the changes, then we can make it trivial to clone environments, fork into a parallel universe, get copies of production data, get copies of any services, make changes, validate them, and collapse them back in without reproducing everything across a staging environment.The Railway Origin Story: From Uber Systems to a New CloudSwyx [00:02:07]: I was looking at your background: Bloomberg, Uber. Nothing immediately stands out as, “This guy is going to found the next great platform as a service.” What prepared you for Railway?Jake [00:02:21]: It was curiosity to keep going deeper. I started out on front-end stuff, working on Wolfram Mathematica and porting it over. Then I briefly moved to Bloomberg, then toward Uber and distributed systems, taking the Jump Bikes systems and moving them to a distributed system built on top of Cadence, the pre-Temporal Temporal.Swyx [00:02:44]: Which, by the way, I'm happy to talk about, pros and cons.Jake [00:02:48]: Totally.Swyx [00:02:51]: But let's do the Railway story.Jake [00:02:52]: It has been a continual step of wanting an experience. Whether it's walking up to a bike, unlocking it, and having it work frictionlessly, or something else, the depth required to make that happen follows from the experience. A lot of the work I do, and a lot of the team does, is in service of that experience. We fundamentally don't care how deep we have to go. We will swim to the bottom of the swimming pool to get the experience.Jake [00:03:17]: I don't have a physics PhD. I did an EECS degree. It has always been about figuring out the next step: how do we get there? That's what led to starting Railway for that experience and then moving all the way to bare metal data centers. I was adding patches to the kernel this week to get the experience there because I can see how much better it can be.Swyx [00:03:49]: Other patches to the Linux kernel this week?Jake [00:03:51]: Yeah. Not upstream. Our fork.Swyx [00:03:52]: That's a flex. Railpack? No, this is different. This is the OS on top of Railpack?Jake [00:03:57]: No, this is an actual kernel patch. It's always literally: what do we have to do to get that experience? Then figure it out. Anything is figureoutable.Swyx [00:04:10]: Would you send the patch upstream, or does it not fit other use cases?Jake [00:04:13]: Maybe. We have to work out the experience internally. It has to do with the storage layer we're building for some of the agentic stuff. Maybe it'll be useful upstream, but it's deeply useful for us internally.Open Source, Forks, and Non-Deterministic VersioningSwyx [00:04:29]: You mentioned open source before. How do you think about starting from open source, and then coding agents letting you do a lot more from forks of it?Jake [00:04:38]: GitHub's original sin is that it's almost a series of broken pointers. You have this thing, then you clone it, and now you've lost the whole upstream. How do we make it trivial for people to modify really small pieces of it?Jake [00:04:51]: We think of Git in a discrete sense: I've either made a change and merged upstream, or I haven't. What would it look like if it were percentage-based, a little more non-deterministic, or a stream of changes that users traverse as a percentage rolled out in general and then rolled all the way up?Jake [00:05:13]: We have the open-source kickback program and let you deploy templates because we want to make it trivial for people to version these shards over time. It solves a large problem around authentication, authorization, and security. NPM has a way to define, “Don't take any new packages.” The ideal end state is that you roll out progressively to users with the minimum impact zone and continue rolling up. JPMorgan should probably be the last one on the patch line, for all our sakes, because our money and livelihoods are there.Jake [00:05:53]: It's okay if Johnny Vibe Coder gets a broken patch because there's so much entropy in the system that the rubber has to meet the road at some point. You have to test at varying levels.The Long Grind: First Users, Free Tier, and Making the Business WorkSwyx [00:06:13]: I wanted to pull up this glorious chart, which is your usage or number of daily signups?Jake [00:06:22]: Daily signups, I think.Swyx [00:06:24]: You started six years ago. It was a slow grind, and now you're on a rocket ship. You say, “Don't doubt your fight and don't quit.” Maybe pick out certain points that were key inflections for the company.Jake [00:06:40]: At the start, it's about getting your first 100 users, hell or high water. We had a website and a support link. The support link was the Discord channel. I had notifications on with two monitors: the monitor I was working on and the other monitor with Discord. If anybody came in, I was immediately like, “Hey, how's it going?” It was rare, so getting those first 100 users to come back was the start.Jake [00:07:14]: Then you build a consultancy factory because users want all these things. You have to go back to the board and ask, “What is the actual product offering I want to build on top of this?”Jake [00:07:28]: VCs want charts that always go up and to the right, but in reality you don't necessarily want charts that look like that. For us, there have been periods of expansion where we add features to test use cases, and periods of compaction where we ask, “If the experience we have is good, how do we make it significantly better?” Maybe we strip out features that don't fit our ICP anymore.Jake [00:07:57]: The boom from 2022 to 2023 came from the free tier. Everybody under the sun was using it.Swyx [00:08:09]: A lot of Reddit bots and Discord bots.Jake [00:08:12]: And crypto miners. When you build an open product on the internet where anybody can sign up, the internet is a horrible place with so many things. You go through periods of asking, “How do I reach as many people as possible?” Then, “How do I fit the exact use case for the people who really matter and are really excited about this specific thing?”Jake [00:08:39]: Then there was a two-year period of making the actual business work. During the free-tier era, we were losing about half a million dollars a month.Swyx [00:08:59]: On a $20 million bank account.Jake [00:09:02]: On a $20 million bank account with maybe $50,000 a month in revenue. That's a horrible business. I don't know how anybody invested. But you have to go through it and say, “We have an experience people love, but the business has to work.”Jake [00:09:17]: There are two schools of thought. You can run the horrible business all the way up with bad margins, or you can go back and make it work. We've always wanted a super lean team. We're 35 people right now. It's very small.Swyx [00:09:36]: Supporting three million already?Jake [00:09:38]: Yeah. We're adding 100,000 users a week right now, so it's growing fast. We don't want to add headcount for the sake of headcount or throw bodies at problems. We want to build systems. It's hard to build systems during expansion because you're adding things to the system because people are asking for them or things are breaking.Jake [00:10:00]: We had to cut off the free users for a little while, rebuild the business, and make sure it worked. We want to reach as many people as possible because software is important. It's become difficult to create things in the physical world, so it's important to make it easy for people to build in the virtual world and have access to creation. But there are legs to that journey.Jake [00:10:30]: You can see divots in the charts. If you follow between 2025 and 2026, it's either summer or winter. People go on holiday with family.Swyx [00:10:50]: It affects that much?Jake [00:10:51]: Yeah. It's kind of B2C and kind of B2B. People are shipping constantly, then they stop. Our activation curve now shows more people activating on weekdays because we have more business users, so it smooths out over time.Agents as the New Interface to DeploymentSwyx [00:11:17]: Was there a point where you started prioritizing AI development or agent development?Jake [00:11:24]: We've prioritized agentic as a top-of-funnel thing. Over the last six months, we've deeply prioritized agentic as a mechanism to build and deploy things because we believe the curve is so steep and that is how people will build and deploy software.Jake [00:11:42]: It almost fundamentally doesn't matter whether this is dot-com or not because we're all on the internet anyway. If agents are going to deploy a bunch of things and we hit an inference wall at some point, we'll fix those problems. The dominant species over the next 10 years is that we've moved from assembly to C to C++ to JavaScript to words. You're going to need to close that loop.Swyx [00:12:13]: When you say this is dot-com, did you mean buying the domain, or the general case?Jake [00:12:17]: I mean the dot-com era, when companies had a huge run-up because people understood the internet was important. Then they hit bottlenecks, fundamental laws of physics, math didn't work, and everybody came back down to earth. But it didn't matter because the internet became so impactful. If you operate on a long enough time horizon, you should build these things anyway because you can see where it's going.Jake [00:12:45]: That's where I think a lot of agent stuff is. You get to a point where you're running thousands of agents in parallel. What is the inference cost? What is the compute cost? How do you make that efficient? How do you coordinate all this? We have issues coordinating humans; we don't even have good tooling for that. Now we have to figure out how to get agents to coordinate, safely version changes, and know when to raise their hand for someone to intervene. Otherwise it becomes an interrupt factory.Railway's Infrastructure Thesis: Network, Compute, Storage, and MetalSwyx [00:13:19]: Let's go right into the technical side. What are the core infrastructure or architectural beliefs of Railway that allow you to do what you do?Jake [00:13:29]: The primitives matter a lot for us. We need network, compute, storage, and orchestration around it. You need control over a lot of those things. We've talked a lot about how we don't really use Kubernetes because we want higher-order control to place workloads in very specific places.Jake [00:13:48]: The reason is that you have to be very efficient with agents: memory reuse and all these other things, or you're going to massively blow up your cost structure. Being able to rack and stack your own servers and build your own metal unlocks performance and cost. Experiences where you're running 1,000 agents in parallel are not massively cost prohibitive.Jake [00:14:13]: Token use and compute use are blowing up. Over time, those things have to get a lot more efficient. You can get a lot of margin to make those experiences solid by building your own metal. That's all in service of offering a differentiated experience to as many people as humanly possible.Swyx [00:14:51]: You have a data center in Singapore.Jake [00:14:53]: Yeah. We have two in every other region now. In Singapore, we're adding a second one in Q3.Swyx [00:14:58]: What's it like? I've never built a data center. Do you go to Equinix and say, “I want some slots?”Jake [00:15:05]: Yeah. Equinix. You basically go and say, “I want power and I want a cage.” They say, “Great, here's what it's going to be.” You rent the cage for a period of time, fill it with racks and servers, and hook up internet to it. That's all the pieces.Swyx [00:15:36]: Then you handle everything else.Jake [00:15:37]: You handle everything else.Swyx [00:15:39]: What's the math versus clouds doing it for you?Jake [00:15:43]: If we rented in the cloud, our payback period when we go to metal is about three months.Swyx [00:15:50]: Which is crazy.Jake [00:15:51]: It's nuts. That's four years of depreciated hardware. You're going to see a lot of this compute crunch because hyperscalers are buying up a lot of stuff. We're working directly with OEMs, resellers, and people building these machines: Supermicro, Dell, and others.Jake [00:16:11]: Upstream, there's a bunch of supply pressure. When we raised our last round, between deploying capital for servers and now, the amount of money we've raised is less than the amount of money we have in the bank plus the value of the servers because the servers have appreciated as RAM has gone up. It's nuts how valuable hardware has become.Jake [00:16:50]: If you look at hyperscalers, they deployed around $80 billion of capital expenditures this year, and next year will be more. That's a massive infrastructure build-out. You look at that and think it's crazy that they're spending way more than the Manhattan Project. But if every person is going to run dozens or hundreds of agents in parallel, you have no conceptual idea how much compute is required to make that experience happen, even if you're deeply efficient and sharing resources. And that doesn't even count inference.Swyx [00:17:22]: How do you plan the build-out? The growth chart is so vertical. Are you usually at 100% utilization as soon as racks are live? How far ahead are you planning?Jake [00:17:33]: We still maintain cloud presence for bursting. We work with AWS, GCP, and a few other clouds. We can rent, and then the moment we get space or power, we compact those workloads off the cloud. We started on the clouds, then built a system to migrate to our own metal. There's nothing that says you can't continually do that again, and that's exactly what we do. We never want to be compute constrained.Jake [00:18:09]: At the start of the year, we actually became compute constrained because one upstream provider wasn't able to give us quota at the rate we needed, and the hardware was slower. I spent a weekend rebuilding our entire network overlay so we could straddle five clouds: Oracle, AWS, ourselves, GCP, and one other one. We can do more than that now.Jake [00:18:38]: We got into a spot where we were trying to pack instances tight because we couldn't get enough compute. That led to a few reliability issues, which are now past us. I made a tweet pointing out that it's becoming harder and harder to acquire compute at the rate these models need to acquire compute. We got bit by it.Swyx [00:19:15]: How do you think about pricing knowing you might not have your own metal available at all times? Are you pricing assuming you need extra margin if you end up going into the cloud?Jake [00:19:26]: Because we've built out our metal data centers, our margins on metal are around 70%. We can deeply subsidize the cloud business if we want to scale at a reasonable rate. We have a few levers: metal, which makes the margins; cloud burst; debt to buy servers; and venture capital. It's an interesting operational problem: how much cash do we have, how much should we raise, how quickly can we deploy it, and can we scale revenue as quickly as we scale compute?Jake [00:20:05]: If we continue making it trivially easy for people to build and deploy, then the faster we close that loop and the more operationally excellent we are with capital, the faster the business can scale. It's almost a straight linear deployment rate.Financing Infrastructure: Hardware Debt, VC, and Operational LeverageSwyx [00:20:20]: I think infra startups raising debt is a tool people don't utilize enough or know enough about. What can you tell us about that? Is it secured against your CPUs?Jake [00:20:32]: It's secured against our hardware.Swyx [00:20:37]: What rates do you get? Who are the lenders?Jake [00:20:39]: We pay prime plus a spread, and we can refinance any of the debt as rates go down. The terms are pretty good. The unfortunate thing is that Twitter has no nuance, so people say, “Venture debt bad.” But as with all things, there are specific tools and areas where you can be deliberate instead of using one tool as a hammer. Venture capital is not the hammer for everything. You have to explore and figure out what works.Swyx [00:21:12]: VC is usually the most expensive financing you can get.Jake [00:21:15]: Yeah. I also think people think about VC incorrectly from a capital-raising perspective. Most people think, “How do I raise as much money as possible from whoever is probably the best I can get at that time?” That's close to right, but what we've tried to do is figure out what unfair advantage we can buy with that equity.Jake [00:21:34]: It's the most expensive equity you're going to give away at that point in time, assuming the company keeps getting better. How do you use it to work with someone stellar who complements you? In the seed stage, I had never started a company. Ray Tonsing had good advice, and I could text him all the time. He was really fast. Awesome.Jake [00:22:01]: Then with John and Erica at Unusual, they said, “You roughly know what you're doing building a product. We'll mostly leave you alone and be available for advice.” Amazing. Then we got to Series A and the business was an operational tire fire because we didn't know how to scale a business. Work with Erica, and Jordan is over at Redpoint, so bonus.Jake [00:22:28]: Now we've raised from TQ and FPV as we're moving into enterprises. Every step of the way, we've asked: who can we partner with at this specific time to unlock the next section of the journey? I don't know enterprise sales. As an engineer, I can eyeball what features we might need, and we have wonderful people internally who can help. But you want boardroom dynamics where everyone is aligned and asking, “How do we win this?” instead of bickering about strategy.Data Centers in Space and the Physics of ComputeSwyx [00:23:31]: You had a tweet about data centers in space. Why no data centers in space?Jake [00:23:37]: It's not “no data centers in space.” My hot take is that I think it is solvable. I've just never seen anybody solve it.Swyx [00:23:49]: You said, “How are you going to dissipate that much heat in a vacuum?” You're making a physics claim.Jake [00:23:55]: I haven't seen anybody prove how you're going to dissipate that much heat in a vacuum. It doesn't mean it's not possible. It just means nobody has brought it up yet.Swyx [00:24:05]: Astrophage.Jake [00:24:06]: I don't know what that is.Swyx [00:24:07]: The Martian thing. Okay, you're very logical.Jake [00:24:09]: It could work. A lot of people are putting the cart before the horse. They say, “We're going to put data centers in space.” Okay, but how? “We have time to figure it out.” It's like in The Martian where they ask how they're going to intercept something and say, “We'll figure it out.”Swyx [00:24:36]: Making a bet on human invention is weird because you blind trust that it can be solved. But with physics, there are first-principles bounds you can put on it. Maybe not. Maybe you're asking to travel time or break a fundamental thermodynamic law.Jake [00:24:57]: I don't know how VCs do this either. How do you know what's not possible and a grift versus what's possible but sounds completely insane? “We're going to put data centers in space.” Coin flip as to which it is, and I guess you'll know in 10 years. That's one cycle.What Agents Need: Versioning, Observability, and 1,000x ScaleSwyx [00:25:23]: Moving back to agents. The branching, fast spin-up, and orchestration you do feels like pre-work that happened to be exactly what agents want. What do agents want differently than humans?Jake [00:25:37]: They want the ability to version things. It's not that different; it materializes slightly differently. Agents want a way to test changes incrementally. Engineers have feature flags. Is there a reason agents can't use feature flags? I don't think so.Jake [00:25:54]: They want version control. Can we use Git or not Git? That one is up in the air. I think something outside Git will emerge for how we version these things over time. They need observability. You need to query what happened, when it happened, which steps failed, traces, logs, metrics, and all the rest. They need network, compute, and storage. They need to write files, save files, iterate on files, and snapshot file systems.Jake [00:26:25]: A lot of what humans needed is in line with what agents need. Branching and forking are not different; we're just moving 1,000 times quicker. It can look like you need something massively different, but what you need is something massively better than what existed. You need orchestration massively better than Kubernetes. You need networking probably better than Envoy. It goes all the way down the stack.Jake [00:26:55]: If the workload profile doesn't change so much as it gets massively compressed because you need thousands of these things, what assumptions change? etcd is going to melt. You need to replace it with something. You can go all the way down the stack and say, “That part has to change, that part has to change, and that part has to change.”Jake [00:27:19]: The interesting thing about the super-exponential curve is that you have to build systems where you can rip out those parts at any time because a new bottleneck might emerge. You get good at parallel agents, and a different part of the system breaks. So it's similar to what humans needed, but at 1,000x scale.Jake [00:27:55]: How do you do code review in the age of agents?Swyx [00:28:00]: You throw more agents at it.Jake [00:28:01]: You don't. But then who reviews for CVEs and all these other things?Swyx [00:28:07]: More agents.Jake [00:28:08]: And that's how we hit the inference wall. You can continually throw agents at the problem, but I think there's a limit to the number of agents you can throw at a problem.CLI, Agent Handles, and Closing the LoopSwyx [00:28:24]: You already had a CLI before it was cool. How is the shape of what you're exposing changing, if at all?Jake [00:28:28]: CLIs have always been cool. The CLI changes because we think about how to give Claude, Codex, ChatGPT, or any model a handhold.Jake [00:28:50]: A CLI is a single command: deploy, get logs, and so on. Things that were prohibitively annoying to humans are not annoying to agents. They're nice. If I handed you a CLI with 40 arguments and 600 flags, you'd think, “I'm never going to use all of this.” But if you hand it to an agent, it says, “This is excellent. I have so many handles to work with.”Jake [00:29:24]: If you're going to expose things to agents that way, you want as many handles as possible where they can get information, query dynamic information, and close the loop quickly. Most problems right now are about how to close the loop as quickly as possible. Where does the agent get stuck, and how can you remove that?Jake [00:29:49]: Telemetry is important. If you can tell where the agent gets stuck from the CLI and say, “12% of people deviate from the happy path because of this, and now I add this argument and drive it down to 2%,” you massively increase the rate of loop closure.Jake [00:30:03]: That's how we think about not just the CLI, but every point in the dashboard. It's a user journey: I hear about Railway. I get something deployed. I get my first green build or aha moment. I see an endpoint, logs, whatever. Then I iterate. The iteration loop is indefinite. The user wants to deploy a new thing, a Postgres instance, change code, and keep iterating.Jake [00:30:36]: If you focus on the iteration loops and what's blocking them from closing quickly, one thing we say internally is: you never want to be waiting on compute anymore. You always want to be waiting on intelligence. If you're waiting on compute, there's a bottleneck that needs to be destroyed because eventually that bottleneck becomes so large that another workflow emerges to change it.Jake [00:31:04]: We've built a product where you push code, build it, and so on. But I fundamentally believe the push-pull loop is going away. We'll get to a point where you make a small change in production, that change is versioned across your infrastructure, you're working alongside copy-on-write versions of your database and infrastructure, and then you merge it in and it's instantaneously live. That's the holy grail of loops. The push-pull-rebuild thing is a point of friction that we're removing entirely.Canvas as Output: Dashboards, Context Anchors, and HyperstructuresSwyx [00:31:43]: It's incredibly fast. If anyone hasn't tried it, that fast feedback is great. My hot take is that Railway was famous for its canvas, which visualizes your infrastructure and lets you manipulate it visually. But that was for humans. For the next phase of growth, Railway CLI is more important than canvas.Jake [00:32:05]: The canvas is funny because it's a mechanism to show changes over time. You're right that previously we used it a lot as an input. Moving forward, its goal is more like an output. You would go to the canvas, make changes, see them, and watch your infrastructure evolve. Now agents have access to the CLI and can make those changes. So the canvas becomes an output: what information does the human need at this moment to make suitable decisions about control requests? Do I approve this or not?Jake [00:32:57]: It also has to be an anchor for your context, a port in the storm. Think of it like layers in a file system. You start with a project, then drill down into services, then into a function or code, because you want to represent the entire thing not just in your head, but in the canvas. Other people can share that representation, think on the same wavelength, and move quickly.Jake [00:33:33]: A lot of organizations get in trouble as they scale because all the context lives in someone's head. “How does this microservice work?” “I have no idea; go ask this person.” Then you have whole categories of products built around context discovery. A lot of that melts away if you have a solid hierarchy and can infinitely nest services, code, context, and everything else all the way down. That's what lets you build these structures over time.Jake [00:34:18]: It's also what lets us build what I've called hyperstructures: things that are way bigger. You look at the Golden Gate Bridge and ask, “How did we build that?” There's a meme that we lost the technology. To some extent, yes, because the coordination that built those things evolved and changed. We lost some of the art of building structure as we jammed everything into Slack.Swyx [00:34:52]: But you jam everything in Discord.Jake [00:34:53]: Same point. It doesn't matter. It's message passing and interrupts, message passing and interrupts.Swyx [00:35:00]: So you're arguing there should be something better and more structured than Slack?Jake [00:35:04]: Yeah. For sure. I think Slack is awful, and Discord is awful too.Central Station: Context Routing, Support, and Incident ClustersSwyx [00:35:09]: This is the equivalent of my mom test. What have you done that has your solution to this?Jake [00:35:15]: Internally, we've built a tool called Central Station that aggregates all the context from our users. Every piece of feedback, every customer support item, everything gets aggregated into clusters. If an incident is brewing, we can determine how many users are affected and break off a discussion based on that.Jake [00:35:40]: That is more helpful than long-running channels where you're trying to decide which channel to put something in. If you can dynamically aggregate information and dynamically route it to the right person based on context, it works better. We know internally that these four people are close to networking. If we see a networking thing, we can drill it down to those four people. If it's with this part, we can look at the commits. This is no longer a manual process internally.Jake [00:36:13]: If you go to station or help.railway.com, that's why we built it. We wanted to scale with a massive amount of leverage by aggregating feedback.Swyx [00:36:27]: This is built in-house?Jake [00:36:28]: Yep.Swyx [00:36:29]: I remember helping out on this one with Angelo in 2023. You scale a lot with a very small team.Jake [00:36:38]: Yeah. We're about 10 times bigger now.Swyx [00:36:40]: You have your full developer code here? Very cool.Jake [00:36:44]: If you go to railway.com/stats, we expose this as a pub-sub-able thing. It's all real-time metrics. There's a way to get it as JSON somewhere if you care.Jake [00:37:01]: We're big on trying to build everything in public and talk about what we're working on. We've had issues in the past, and we'll say, “Here's how we're fixing these things.” We've gotten compliments and flak for incident reports. We're always trying to make them better and talk with people.Incidents, Disclosure, and Progressive RolloutsSwyx [00:37:20]: You had a big one recently. I liked that it was scoped to 3,000. You presumably used Central Station. Talk through what happened and how you address it internally as a team.Jake [00:37:38]: Internally, this one really sucked. It had to do with an upstream provider that didn't do the behavior it said it documented, which is unfortunate given they wrote the RFC for how the behavior should work. We rolled those things out, and Central Station caught it initially when a couple users said caches weren't invalidating. We turned it off immediately.Jake [00:38:03]: When you roll out to a large user base of three million people, you get a lot of disparate behaviors. We tested in staging and had tests, but we hit an edge case. We've hardened those systems, and now we can make that better. But it was a tough one.Swyx [00:38:39]: I always wonder how private disclosure is supposed to work if people find an issue. Are they supposed to contact you first? When you run a platform, these things will happen. What channels should people pursue to quietly resolve it before it becomes a bigger incident?Jake [00:38:59]: There's responsible disclosure. We err on the side of over-disclosing and letting you know something is wrong versus having your provider gaslight you. We've erred on sharing those things more publicly, even if they impact a small subset of users. That's a decision we've made internally. We have four values. One is honor. The honorable thing is to notify people to the widest degree at which they may have been affected or there was an issue, and then confront it head-on: why did it happen, what can we do better?Swyx [00:39:45]: Not the whole user base. That's because of incremental rollouts and other things?Jake [00:39:50]: Yeah. Progressive rollouts.Swyx [00:39:54]: That should be the norm at all large platforms.Jake [00:39:58]: It should. A variety of companies do this. There's the quote that Meta runs 10,000 different versions of Meta. To our earlier point about agents, they need the same thing. They need shadow traffic and all these other things. We've built so much ceremony around production being sacred that we need to make it trivially easy to test different behaviors in a safe environment. Then you can make mistakes in a safe environment.Safe AI SRE: Customer Agents, Forked Environments, and Production ParityAlessio [00:40:30]: Do you see a world where these things get automatically caught, not necessarily by your agent, but by your customer's agent? The cache invalidation issue seems easy to check if you know to look for it.Jake [00:40:44]: It's hard because to determine it, we almost need to hook into your observability infrastructure. That's why we have the template loop on the platform: so you can roll things out progressively. You can roll out to Johnny Vibe Coder initially, or push a shard that someone consumes at their own leisure. Or you can roll it out over weeks: 0.1% of people, 1% of people, early adopters, then all the way up. That's the non-deterministic version control we talked about earlier.Jake [00:41:30]: I believe that's where most things should go, because most companies end up building staged rollout systems in-house. It's the same thing built again and again at every company. There's a massive opportunity to consolidate developer debt.Alessio [00:41:45]: You should have a free tier. Model providers give free tokens if you let them use the data. You could give free compute if someone is the number-one shard that goes out and lets you plug into their observability.Jake [00:41:55]: We do that. That's why we talked about the impact on 3,000 people. We start with lower-impact people. Larger companies on the platform are last to receive those rollouts so they have a version of the platform that's deeply stable.Alessio [00:42:16]: I have three services, so I'm sure I get the first rollout. You can nuke my thing at any time. There are all these SRE agent companies. Observability people also want agents that fix upstream problems. You have your own agent in the canvas now. How do you see that playing out?Jake [00:42:39]: It's the stacking entropy problem. If you don't have primitives to make iteration in production safe, it becomes difficult. If you're an observability provider saying, “Here's the fix to this error,” assume 80% are good and make sense. But in the last 20% long tail of complex issues, if you let somebody stamp it, you create an opportunity for an incident.Jake [00:43:08]: That's why forked environments are important. People have staging, but it always drifts from production. You need primitives, workflows, and experience built first-party on the platform so you can fork any service at any point in time.Jake [00:43:33]: I think of the canvas as a sheet of transparency paper. The agent is a little guy you push up into the canvas. It should say, “I need to copy that service and that service so I can test these two things.” It gets a read-only copy of production. Anything that's PII gets marked as a transform when we clone the database, create a copy-on-write version, or read from it. Then the agent makes changes and asks, “Does this actually work?” as close to production as possible.Jake [00:44:22]: That's how close you have to be, or you get massive drift. The system becomes unstable. You see this with massive systems built on Docker for local, Kubernetes for production, and a specific thing for something else. That complexity slows developers and becomes unstable at scale, making it hard to iterate. We want to compress that way down and say, “As close to prod as possible is where we want to be.”From AISRE Skeptic to Agent BelieverSwyx [00:45:00]: I was texting Erica for questions, and she says you were originally not a believer in AISRE. Have you come around on it?Jake [00:45:10]: I flipped, but I'm still not a believer in AISRE if you don't have the primitives to make it safe. If you unleash AISRE on production infrastructure without safe primitives for copying volumes and making sure things are fine, it's going to nuke your production database. It's not a matter of if, but when. I'm a big believer in making those loops safe.Jake [00:45:33]: I was a deep AI skeptic until 2023. In 2024, I thought, “Maybe I can roughly make this thing do it.” In 2025, I thought, “Now I can hold this.” Over winter break, everybody came back saying, “It's almost impossible to hold this.”Swyx [00:46:01]: Did you see this on the Claude docs? CloudBot? OpenCloud?Jake [00:46:06]: It's gotten to a point where it's harder to hold it wrong than to hold it right. There's a scene in Avengers where Vision picks up Thor's hammer and says it's terribly well-balanced. It self-balances and works well. I'm a deep believer at this point that this will be the dominant species: assembly, C, C++, JavaScript, words.Swyx [00:46:35]: It feels like a big jump.Jake [00:46:37]: It is. But it's not like you abandon CPU-based discrete logic and move straight to fuzzy logic. You need both. Your skills should call code or applications or some static structure. You can use skills to distill what the procedure should be or how the code should act.Jake [00:47:02]: I'm coming to a thesis: you need three points. You need a clear spec defining the system, the code, and the tests. When you say it out loud, if you've been in engineering long enough, you're like, “Of course. That's an RFC, tests, and code.” But they all matter. Having them together lets them reinforce each other: the spec and tests match, but the code doesn't, so reconcile it. Or the tests and code match but the spec doesn't, so reconcile that. That's the iteration loop.Jake [00:47:41]: That's why you're seeing people talk about software factories, docs, and reconciliation. Some of that is architectural astronomy if you don't implement it, but that loop is where most things will end up.Swyx [00:48:07]: For listeners, we've been talking about this on the pod for three years: the holy trinity of specs and tests. Itamar Friedman from Qodo is the reference if people want to look it up.Self-Modifying Infrastructure and the End of Push-Pull-RebuildSwyx [00:48:18]: One thing I want to mention on the OpenCloud idea is self-modification. I don't know how Railway would support it, but I have my OpenClaw, and I just tell it it has the Railway CLI and can do whatever. In theory, whatever capabilities or new infra it needs, it can call the Railway CLI, provision it, and add it to itself. The agent can modify its own infra.Jake [00:48:45]: It's nuts. I have a loop set up where you put the Railway CLI on top of something that runs on Railway. You're authenticated as whatever the current box is, and you can make any changes to it. Then you call Railway deploy, and it deploys itself.Jake [00:49:04]: It's like: “I need to spin up this instance of this environment. I already exist in this environment. Excellent, I have access to a Postgres instance now.” That's where we want to go with agentic, self-replicating infrastructure. That's your loop: iterate in production. You continue making changes. If it works, merge it upstream. If it doesn't, throw it away.Jake [00:49:37]: How do you make throwaway copies trivial to spin up and super cheap? The era of “I have an AWS instance with four vCPU and 16 gigs of RAM” is going to get destroyed. If you do that for agents, you need a thousand of those machines. It's prohibitively expensive compared with what we've spent a ton of time figuring out: the atomic unit of deploy, whether you call it isolates, sandboxes, or something else. Only pay for what you use, spin up instantaneously, and close the loop as quickly as possible.Jake [00:50:15]: If the system can self-replicate safely and say, “This is my environment, I'm making these changes,” it can come back with, “Does this look good? This is a new state of infrastructure given this prompt. I think I've solved it.” Then you go back and say, “Actually, it looks different.” It does the loop again. Then you say, “Cool. Apply.”Swyx [00:50:38]: That's retroactively obvious, which is the most useful kind. Any other comments on agent deployment on Railway?Jake [00:50:51]: It's getting better every day. I'm on X or Twitter. You can always yell at me about the parts not working as well as they should, because plenty of things should work way better.The New Serverless: Stateful, Long-Running, Pay-for-What-You-Use LinuxSwyx [00:51:04]: At this stage, when people want massively or embarrassingly parallel compute, they usually talk serverless. I feel like there's a new serverless compared to the previous five years of serverless. You're in that new bucket. Do you have comparisons or philosophical differences you want to call out?Jake [00:51:31]: It's somewhere in between. It's the ability to run stateful, long-running workflows or executions.Swyx [00:51:42]: Vercel has Fluid Compute, Cloudflare has some container thing, Google has App Runner and others.Jake [00:51:55]: That's where everything is roughly going, and it's why we've been working on this for six years. We believe users need access to a computer: a box that speaks Linux. They need to deploy what they want. Other systems change the surface area of what you can build. For us, users need a computer and need to deploy anything they truly want. That's why we've focused on the primitives: network, compute, storage. If we give you those and expose them so you can run things indefinitely, that's where we believe it's going.Jake [00:52:43]: Twitter has no nuance, so everyone says “servers” or “serverless.” It's always somewhere in the middle: I want to run it for a long time, but I don't want to provision the resource statically or pay for things I'm not using. That's been our thesis from day one: pay only for what you use, run it indefinitely, and it is full Linux.Swyx [00:53:12]: That's why I like the naming of Fluid. It's fluid. Flexible.Heroku, Focus, and Carrying the Torch Without Becoming the PastSwyx [00:53:18]: Another milestone is the Heroku official deprecation. You're one of the presumptive new Herokus. “New Heroku” has been a category for as long as I've been in developer tooling. It's finally happening. What was that like? Any behind-the-scenes of, “This is the moment”?Jake [00:53:42]: You have people where you're like, “You were running stuff on here? You, as this company?” It's crazy that names you would know are running on it and now coming to us saying, “We want to move a lot of this off.”Swyx [00:54:00]: Any behind-the-scenes on why Salesforce let Heroku stagnate?Jake [00:54:05]: I can only guess. It's hard when it's not your business. Salesforce's business is to build a great CRM. That's their focus. Then you acquire a compute business as an offshoot. A lot of early Meta people talk about focus. Boz has a write-up about how in the early days of Meta they had no money, so they were forced to focus. Then they turned on the money tree and had no reason not to split their focus.Jake [00:54:52]: But that dilutes your product. You get offshoots where you ask, “Is this the focus of the business?” If it's not core, it languishes. A lot of companies get in trouble when they split focus because they're fighting a multi-front war, not just externally but internally for alignment. Where are we going? What are we doing? What is our purpose?Jake [00:55:24]: If you're Salesforce-built and mission-driven, you want to work on Salesforce. Heroku is off to the side. It's not core to the business. Getting resources, budget, focus, and alignment internally becomes hard. It was a matter of time.Swyx [00:56:06]: Kudos for them to call it out instead of leaving it unknown.Jake [00:56:12]: Their release was a little odd. They called it out, but they didn't say they were shutting it down. Behind the scenes, I think they issued messages to people saying they should close accounts and that they were going to deprecate and remove things over time.Jake [00:56:30]: It's crazy because some of my first deployment experiences were on Heroku. You start with dragging things into an FTP server, then you try to get a deploy working, and then it's Heroku. It was the on-ramp for us. But the wheel turns. New things emerge. We're happy to carry the torch for a lot of that. But we don't want to be the new Heroku. We want to be the way people build and deploy software, and ultimately the way people monetize software over time.Swyx [00:57:19]: It's still a big crown to be the new Heroku. There are 50 companies that fought for that.Jake [00:57:23]: Everybody is holding some portion of it. We're happy to support people and companies. The platform works differently. The game loop is similar, but we've been dogmatic about where these things are going: primitives, agents, fan-out. Some things fit; some workflows need to change. We have an approximation of Heroku pipelines with the environment system. It's exciting. We've got a ton of people we can support, and it's growing a lot.Temporal, Workflow Engines, and State MachinesSwyx [00:58:12]: I have one more technical question about Temporal. I've sold my shares. You're a power user and one of our earliest customers. I met you through Temporal. You built on Temporal. You have complaints. This may be the most neutral and informed conversation anyone will hear about Temporal without someone working at the company.Jake [00:58:39]: That's fair. I've used Temporal for almost 10 years because of Cadence at Uber.Swyx [00:58:52]: Give people a sense of what Cadence was at Uber.Jake [00:58:57]: Cadence was the precursor to Temporal. It powers trip actions, rides, when you rent a Jump bike or scooter or car. You're running workflows for a period of time and saying, “This ride will run indefinitely until it finishes.” You attach information: you paused in this zone, so add this charge to the bill. When you end the trip, the workflow is done. That experience was powered by Cadence at the time.Swyx [00:59:34]: I used to say it's like programming the entire user journey top-down as one function.Jake [00:59:39]: It's a powerful idea and important. It's also important for the next phase of the agentic journey. You want an agent to do a specific task, be complete or incomplete on that task, and move on to the next thing. You need a way to manage workflows dynamically.Jake [00:59:59]: Temporal was always great in theory, and great when you got it working the way you wanted in production. But it required you to model the entire journey in your head. If you didn't, you could cause issues where replaying the state of the workflow causes non-determinism.Swyx [01:00:25]: Because it works on deterministic workflow history.Jake [01:00:28]: Exactly. I describe it as a jet engine. If you know how to operate it and run it, it's great. But you can't hand it to people trying to build complicated things if they don't have the whole state in their head.Jake [01:00:48]: We run our whole deployment pipeline on top of it. That's a reasonably complicated workflow: pre-commit hooks, signaling, queuing, and all the rest. We ran into the same thing at Uber. As you express a large workflow, it gets more complicated, with more states in the state machine that you have to map back to the workflow.Swyx [01:01:15]: It's a lot of ifs.Jake [01:01:16]: Exactly. At Uber, we built a system for doing the state machine and testing it. We've started to build some of those things here because it's grown heavily. It's not quite love-hate. When it works well, it works super well. But if someone who doesn't have full context puts something into the system that invalidates state or causes non-determinism, or spins off a ton of activities, you have to keep track of underlying SRE knobs like activity slots. Those should scale with memory, vCPU, and so on. It becomes a bear to scale.Swyx [01:02:10]: You need a capable sysadmin running things behind the scenes. If you moved off, what would you do?Jake [01:02:19]: We'd build our own workflow engine. We have a few internally that we've worked on.Swyx [01:02:27]: This is one of those classes of things you typically wouldn't vibe code, but I'm wondering if you can.Jake [01:02:33]: I still don't think you should vibe code it. You still want to run decent tests to make sure it works.Swyx [01:02:39]: Timo didn't invent that from scratch either. There are libraries you can run. On top of that, it's just a state machine that you have to map out. Ultimately, you define the instructions you want and run them through a state machine.Jake [01:03:00]: It's very doable. Workflow stuff is interesting. Restate is doing neat stuff here.Swyx [01:03:10]: You're tied into JavaScript. Are you a JavaScript maxi?Jake [01:03:13]: Internally, we have TypeScript, Rust, and Go. We don't add more languages. Actually, we have a little C because we write BPF code and hooks. But those are the languages.Swyx [01:03:28]: Is this for sidecars?Jake [01:03:32]: No. It's for the networking stack, volumes, and things like that. We use TypeScript a lot because it powers the dashboard, but we're moving a lot of workflow stuff off the dashboard stack and into the infrastructure stack.Railpack, Nixpacks, and Content-Addressable FilesystemsSwyx [01:04:00]: Cool. Any other technical infrastructure stuff? Railpacks?Jake [01:04:07]: We built an engine for determining dependencies based on source code. It's called Railpack. We built the first version, Nixpacks, on top of Nix, and then we moved.Swyx [01:04:17]: People have been trying to get me to adopt Nix and NixOS for four years. Is it ever going to be a thing?Jake [01:04:23]: I don't know. We're excited about it, but it has pain points. Think of it as a stack of versioned binaries at specific slices in time. If you want version X and version Y, you bloat the package space, which blows up image size and makes real-world workloads difficult.Swyx [01:04:53]: But you content-address it and cache it. In theory, there are optimizations.Jake [01:05:00]: In theory, yes. But with a large enough user base and disparate enough machines, you run into a problem Meta described in the XFAAS paper, their internal serverless system. It becomes difficult at scale unless you break out specific runtimes.Jake [01:05:24]: We didn't want to do that because we wanted to truly allow you to deploy anything. That was our initial thing with Nix. But we've moved toward interesting work around content-addressable file systems that can lazy-load anything from any point and page it into memory.Swyx [01:05:48]: Amazing.Jake [01:05:49]: The future is very bright. It's crazy, and it's going to be nuts.Coding Agent Spend, Roadmaps, and Token ROISwyx [01:05:54]: Founder journey stuff?Alessio [01:05:56]: Your cloud usage: you tweeted you're going to spend $300K this month?Jake [01:06:01]: I think we got to $200K.Alessio [01:06:02]: Coding agents?Jake [01:06:03]: Yeah.Swyx [01:06:04]: Across the company?Alessio [01:06:05]: You only have 35 people, so I'm sure they're not all spending $10K a month. What's the distribution?Jake [01:06:10]: I think I'm at about $25K. We have power users all the way down. We came back from winter break, and I basically said, “If you're writing code by hand, you're doing this wrong.” The tools are good enough now that you can move extremely quickly. There are issues and pain points, but you should be reviewing the code you are writing instead of writing it by hand.Jake [01:06:40]: Architectural patterns matter more now than ever, but you shouldn't spend your time generating code you would write. If you know how to write it, ask the agent to write it and reconcile it until it looks like you would have written it yourself.Jake [01:06:58]: People misconstrue my propensity to push people toward agents as connected to our growth and some reliability bumps. They're not necessarily related. The tools are good enough to move extremely quickly and build things way larger than you could before.Jake [01:07:19]: To the earlier point about cooling data centers in space: I don't know. But with software, you can ask, “How would I build block storage from scratch? How would I do these things?” I have ideas because I have history and have read papers. Let me work them out and build massive test benches with thousands of tests, because those are now free to author. If you're not using AI systems to speed-run your roadmap and reconcile your existing system onto the future, you're missing a large point of what's happening.Alessio [01:08:12]: What's the path to spending $3 million a month? Is it bound by ideas and things customers can absorb?Jake [01:08:19]: For most companies, it's bound by deployment at this point. That's why we've seen a massive boom in users and companies, from Fortune 50s down, asking how to get developers to move faster. You'll probably hit your CFO before any technical limits because they'll look at the eye-watering amount of money spent on tokens. Inference costs have to come down, but we're inference constrained now. There will be price discovery around what makes sense for an org to adopt.Jake [01:09:06]: I think you'll end up with the F1 driver concept. If someone is really adept at these things, it makes sense to put them in a $3 million car. If they're not, it probably doesn't make sense. You'll take a few people and say, “You can drive the F1 car. We need to go in this direction. Figure out if it works and prototype it.”Jake [01:09:33]: We've done some of that and vastly accelerated our roadmap. We thought we'd ship something in a few years; now we can probably ship it in a few months because we validated it and don't have to build it incrementally. We can skip steps and move toward our vision.Alessio [01:09:58]: A lot of people are realizing the roadmap doesn't always have a business impact, so they say tokens are too expensive. But if your roadmap were built to make more money by the time you built it, you'd have token pricing for it, the same way you do with sales. You'd spend a billion dollars on sales if you knew you would get $2 billion of revenue.Jake [01:10:19]: Exactly. A naive way to measure this is the percentage of tokens that end up in production. If you can measure impact because those tokens end up in production, that's awesome. But the burden of proof will rise. Internally, we have a growing number of pull requests that haven't merged. The question becomes: how do you get this into production? It's about how quickly you can build and deploy software, which is exciting because that's our whole thing.The SDLC Shift: Prompt Requests, Feature Flags, and Safe RolloutsSwyx [01:10:56]: The SDLC is changing. One thesis is that the pull request is dying. It's going to be the prompt request. Beyond that, code review is also kind of dying if you have all the other systems in place. What else is changing about the SDLC?Jake [01:11:19]: The AISRE and the tools to make it happen. AISRE is pie-in-the-sky aspirational. What does it take to get an AISRE? What tools do you need to build?Swyx [01:11:32]: You should expose your tooling to customers at some point. The Central Station command center.Jake [01:11:39]: We have it for template maintainers. Template maintainers can deploy and maintain templates, and they get feedback. We're going to expose those things incrementally.Swyx [01:11:51]: Clustering around incidents. Everyone has a version of that, but I don't think anyone has solved it.Jake [01:11:56]: I won't say we've solved it internally, but it's gotten so good that we can see incidents forming pretty quickly. At some point, those will be things either someone else builds or we build. We've always built things purpose-built for us. If it makes sense to make it useful for users, monetize it, or turn that loop into a profit center instead of a cost center, we want to do that.Jake [01:12:28]: Pull request is definitely dying.Swyx [01:12:29]: Do you do first-party feature flagging and incremental rollout stuff?Jake [01:12:34]: We have a feature-flagging engine we built internally and will eventually roll out.Swyx [01:12:38]: I don't see it as a user. How come you didn't give us what you have?Jake [01:12:43]: We have to beta test it. We care a lot about the quality of the things. There's plenty we've used internally that doesn't make it all the way through the journey because it fails. It works for one service but not multiple services. We'd have to build it for multiple services and know that if we released it, we'd rebuild it again and again. Some things are worth that, but many inform the roadmap.Jake [01:13:18]: We don't want to dilute the experience by saying, “This works, but only for this service,” unless it's a core initiative. Over the next few months, we'll roll out things that work for a single service, then multiple services, then multiple services across the environment. You have to be deliberate. Otherwise you create broken disparate experiences and support load because people ask how to use the feature.Jake [01:13:52]: It's the earlier expansion and compaction pattern. You expand the company to get features, then compact and smooth them out so the experience is stellar. You told me in the hallway, “It's gotten so much better.” Internally we're saying, “This part really sucks. We need to make it significantly better.”Swyx [01:14:11]: I can attest to that over the last three years watching you build Railway. For listeners, feature flagging is a huge part of Uber culture. So much so that they have too many feature flags and another thing to remove feature flags. Facebook has Gatekeeper. Agents are going to need this. It's fundamental to incremental rollouts. OpenAI acquired Statsig. GPT-5 is routing and flagging through different models.Jake [01:14:56]: It's super important. If the software development lifecycle is going to change because we're doing things 1,000 times faster and 1,000 times more concurrently, what becomes important at scale?Jake [01:15:16]: Before I started Railway, I built a feature-flagging product and tried to sell it. It was an easier version of LaunchDarkly. I ran into a problem: anyone small enough to adopt your technology doesn't care about feature flags, and anyone large enough to need feature flags needs so much scale that you have to build out all the infrastructure. I scrapped it.Jake [01:15:42]: But what is old is new again. Companies are trying to move quickly, but you can't YOLO a vibe-coded thing straight into production. You need to say, “Here's my blast radius, my impact, and I want to shadow it for these users.” Feature flags. You're going to need the tools larger companies built to maintain their structures. Everything gets compressed by 1,000x so everybody can build those structures quickly.Jake [01:16:07]: That's exactly where we are: compressing the software development lifecycle, then expanding it and adding more new things.Cattle, Pets, and Clonable InfrastructureSwyx [01:16:15]: Another term that comes to mind for newer developers is “cattle, not pets.” People treat production like a pet. It has a name. You baby it and keep it alive. With cattle, you can mass farm, roll out, portion parts out, and kill them.Jake [01:16:37]: I think that might change. You can move toward having pets as long as you have a cloning machine for your pets.Swyx [01:16:52]: Yeah.Jake [01:16:52]: If you can snapshot every single thing at every frame, it doesn't matter if something gets obliterated because you have a snapshot of it. The things we've built right now are designed to block changes from the hermetically sealed DevOps line. You have to write a Dockerfile because you nee
A serious new Windows 11 BitLocker vulnerability, open-sourced offensive malware tools, a suspected Iranian cyber campaign targeting U.S. fuel infrastructure, and malware that appears designed to interfere with nuclear weapons simulation systems. Cybersecurity Today would like to thank Material Security for sponsoring this podcast. Material Security provides faster, more complete detection and response for email, identity, and data threats inside Google Workspace and Microsoft 365. You can contact them at material[dot]security. David Shipley breaks down four major cybersecurity stories on Cybersecurity Today. First, a newly disclosed zero-day dubbed YellowKey reportedly defeats default Windows 11 BitLocker protection on systems using TPM-only encryption, giving attackers with physical access a path to unencrypted data through the Windows Recovery Environment. Microsoft is investigating, while security experts are urging stronger BitLocker configurations. The episode also examines the TeamPCP threat group's decision to release offensive tooling publicly, dramatically lowering the barrier for copycat supply-chain attacks. Researchers have already spotted malicious NPM packages borrowing similar techniques, including persistence mechanisms aimed at developer environments such as Visual Studio Code and Claude Code. David also looks at disturbing analysis of the FAST16 malware, which researchers believe was engineered to tamper with nuclear weapons simulation software including LS-DYNA and AutoDyn. And finally, U.S. officials reportedly suspect Iranian actors in cyberattacks targeting internet-exposed gas station automatic tank gauge systems, a reminder that weak operational technology security can quickly become a real-world infrastructure problem. 00:00 Sponsor Message 00:24 Headlines Overview 00:50 BitLocker Zero Day 03:32 TeamPCP Tools Leak 06:13 Copycat NPM Malware 06:50 Fast16 Nuclear Sabotage 08:37 Iran Gas Station Hacks 10:28 Hardening Critical Infrastructure 11:16 Wrap Up And Events 11:59 Sponsor Deep Dive #Cybersecurity #Windows11 #BitLocker #ZeroDay #TeamPCP #IranCyberAttack #SupplyChainAttack #CriticalInfrastructure #CyberSecurityToday
On this week's episode, we sat down with Emmanuel Becker, CEO of Mediterra Datacenters, a Southern European data centre business backed by PEIF III, a fund managed by DWS Infrastructure. Becker discusses investor appetite, the reasons behind focusing on Tier II markets, and the huge potential for growth across the Southern European data centre market. He also reveals what he believes to be the biggest challenge in the market – which may come as a surprise to some.NPM is a leading data, intelligence & events company providing business development led coverage of the US & European power, storage & data center markets for the development, finance, M&A and corporate community.Download our mobile app.
SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
Microsoft Patch Tuesday https://isc.sans.edu/diary/32980 Tanstack npm and others compromised https://socket.dev/blog/tanstack-npm-packages-compromised-mini-shai-hulud-supply-chain-attack Ruby Gems Attack https://x.com/maciejmensfeld/status/2054164602577940619
Java 26 est là, GraalVM cartonne chez Trivago (43 à 12 réplicas !), OpenJDK interdit le code généré par LLM, Spring et Quarkus enchaînent les releases. Côté IA : ADK 1.0, A2A, Lyria 3 chante (mal ?), Yann LeCun lance Ami Labs et ses World Models. Mythos d'Anthropic fait trembler la sécu, Claude Code a leaké son source, et les git worktrees envahissent vos terminaux. Bonus : la mort annoncée de l'IDE, vagues de licenciement chez Oracle et Block, et nos voix toutes clonées. Bon week-ends de mai ! Enregistré le 7 mai 2026 Téléchargement de l'épisode LesCastCodeurs-Episode-340.mp3 ou en vidéo sur YouTube. News Langages Retour d'expérience d'une migration vers graalVM chez Trivago https://medium.com/graalvm/inside-trivagos-graalvm-migration-native-image-for-graphql-at-scale-912bca9df841 La passerelle GraphQL de Trivago (point d'entrée de tout le trafic vers 48 microservices) souffrait de pics de timeout au démarrage JVM Résultats spectaculaires après migration vers GraalVM Native Image : réduction des réplicas de 43 à 12, CPU de 15 à 5 cœurs, images Docker plus légères Obstacles techniques : incompatibilité Log4j → migration vers Logback, remplacement de Mockk par Testcontainers, compilation CI/CD très gourmande Netflix DGS et d'autres librairies manquaient de support GraalVM → l'équipe a contribué des correctifs upstream en open source Approche recommandée : commencer par les services les moins complexes, investir massivement dans les tests automatisés À la 14e migration, le processus était si rodé qu'il allait plus vite que la toute première tentative OpenJDK Interim Policy on Generative AI - https://openjdk.org/legal/ai OpenJDK adopte une politique intérimaire interdisant toute contribution incluant du contenu généré par des LLMs, modèles de diffusion ou systèmes deep-learning Le périmètre est large : code source, texte, images dans les dépôts Git, pull requests GitHub, emails, pages wiki et issues JBS Les contributeurs peuvent utiliser les outils d'IA de manière privée pour comprendre, déboguer et relire le code OpenJDK, mais ne peuvent pas contribuer le contenu généré Trois risques justifient cette politique : surcharge des relecteurs face au code plausible mais incorrect, risques de sûreté/sécurité pour une plateforme critique, et risques de propriété intellectuelle (l'OCA exige que les contributeurs possèdent les droits IP de leurs contributions) Même éditer partiellement du code AI-généré ne le rend pas acceptable à la contribution Oracle, sponsor corporatif d'OpenJDK, travaille sur une politique complète à soumettre au Governing Board GraalVM Native Image et la Closed-World Assumption en Java https://pvs-studio.com/en/blog/posts/java/1357/ Un bon article de rappel du contexte de closed world en Java GraalVM Native Image compile les applications Java en exécutables natifs statiques, sans JVM au runtime. La JVM fonctionne en monde ouvert : les classes sont chargées à la demande, les appels sont des références symboliques résolues dynamiquement. Native Image impose la "closed-world assumption" : tous les chemins d'exécution doivent être connus à la compilation. Les fonctionnalités dynamiques Java (réflexion, proxies, chargement de classes) créent des chemins cachés invisibles à l'analyse statique. C'est pourquoi Native Image exige des fichiers de configuration explicites pour la réflexion, les proxies, les ressources et la FFM API. L'article illustre le problème avec la Foreign Function & Memory API pour appeler printf natif : fonctionne sur JVM, échoue en Native Image sans config. Inclure tout le bytecode accessible serait inutilisable : binaire géant, compilation très lente, et la réflexion nécessite des métadonnées précises. La configuration n'est pas un défaut de conception mais une conséquence logique du passage du dynamique au statique. Java 26 : les nouveautés https://foojay.io/today/java-26-whats-new/ Java est le langage de la JVM, publié tous les 6 mois depuis Java 9 ; Java 26 est une version non-LTS avec 10 JEPs. JEP 500 : protection des champs final modifiés par réflexion profonde, avec des avertissements configurables. JEP 504 : suppression définitive de l'API Applet, plus supportée par les navigateurs. JEP 516 : le cache AOT (Project Leyden) fonctionne désormais avec n'importe quel garbage collector. JEP 517 : support HTTP/3 dans le client HTTP, HTTP/2 reste le défaut mais HTTP/3 est accessible à la demande. JEP 522 : amélioration du débit du GC G1 en réduisant la synchronisation entre threads applicatifs et threads GC. Nouveau support des UUIDv7 via UUID.ofEpochMillis(), naturellement triables et adaptés aux identifiants de bases de données. Process devient AutoCloseable, utilisable dans un try-with-resources. Aucune fonctionnalité en preview n'est graduée en standard ; Structured Concurrency en est à sa 6e preview. Librairies Guillaume a créé une petite librairie Java sans dépendance pour extraire le JSON d'une réponse d'un LLM un peu verbeux https://glaforge.dev/posts/2026/03/22/extracting-json-from-llm-chatter-with-jsonspotter/ Les LLM génèrent souvent du JSON, mais il est parfois entouré de bla-bla et/ou contient des erreurs (ex: commentaires, virgules finales) qui bloquent les parseurs JSON standards. Guillaume a créé une petite librairie légère sans dépendance pour localiser et extraire la structure la plus longue ressemblant à du JSON (même malformé) On peut ensuite passé cette chaîne à un parseur "lénient" (plus tolérant) comme Jackson pour ensuite avoir de bons vieux objets Java fortement typés Librairie dispo sur Maven Central ADK Java sort sa version 1.0 (Agent Development Kit par Google) https://developers.googleblog.com/announcing-adk-for-java-100-building-the-future-of-ai-agents-in-java/ ADK est un framework open source de Google pour créer des agents IA, initialement en Python, maintenant multi-langages (Python, Java, Go, Typescript). Nouvelles fonctionnalités majeures : Outils puissants : GoogleMapsTool, UrlContextTool, ContainerCodeExecutor, VertexAiCodeExecutor, abstraction ComputerUseTool. Architecture de plugins centralisée : Nouveau conteneur App pour gérer les Plugins à l'échelle de l'application (ex: LoggingPlugin, GlobalInstructionPlugin). Context engineering amélioré : Compaction d'événements pour gérer la taille des fenêtres de contexte (résumé et rétention). Human-in-the-Loop (HITL) : Supporte les workflows ToolConfirmation pour approbation humaine des actions d'agent. Services de session et de mémoire : Contrats clairs pour la gestion de l'état (InMemory, VertexAI, Firestore) et la mémoire à long terme. Support Agent2Agent (A2A) : Collaboration native entre agents distants de différents frameworks via le protocole A2A. Dans cet autre article, Guillaume partage comment il a développé l'application Comic Trip montrée dans la vidéo YouTube et qui utilise ADK 1.0 https://glaforge.dev/posts/2026/03/30/building-my-comic-trip-agent-with-adk-java-1-0/ Nouvelle version du SDK Java pour Agent2Agent Protocol, avec le support de la version 1.0 de la spécification https://medium.com/google-cloud/a2a-java-sdk-1-0-0-beta1-released-e83c414b34cc Alignement avec la version 1.0 de la spécification Nouveau groupId org.a2aproject.sdk et package org.a2aproject.sdk Protocoles de transport : support complet et équivalent pour JSON-RPC, gRPC et HTTP+JSON/REST. Gestion des erreurs : introduction de codes d'erreur et détails structurés pour une meilleure observabilité. Optimisation HTTP : ajout d'en-têtes de cache pour les métadonnées des agents (Agent Card). Flexibilité du client HTTP : support par défaut du JDK HttpClient, avec option Vert.x pour les environnements Quarkus. Nouvelles fonctionnalités techniques : méthode DataPart.fromJson() pour la création simplifiée d'objets depuis du JSON brut. Prochaines étapes (v1.0.0.GA) : support simultané des versions 1.0.0 et 0.3.0 du protocole pour assurer l'interopérabilité. JPA 4.0 Milestone 2 : nouvelles fonctionnalités pour Jakarta Persistence https://in.relation.to/2026/04/23/JPA-4-M2/ Jakarta Persistence (JPA) est la spécification standard Java pour le mapping objet-relationnel (ORM), implémentée notamment par Hibernate. JPA 4.0 M2 est la deuxième milestone de la prochaine version majeure de la spécification, annoncée par Gavin King. Construction de requêtes Criteria à partir de chaînes JPQL, offrant plus de flexibilité dans la composition dynamique des requêtes. Nouveaux types d'expressions spécialisés (TextExpression, NumericExpression) pour simplifier l'écriture des requêtes Criteria. Nouvelle interface FetchOption pour contrôler explicitement la stratégie de chargement des associations, dont un BatchSize intégré. Nouvelle annotation @EntityListener qui découple les classes entités de leurs listeners, supprimant les dépendances à la compilation. Les listeners peuvent cibler plusieurs types de callbacks et s'appliquer globalement à toute l'unité de persistance. Introduction de FlushModeType.EXPLICIT et QueryFlushMode pour un contrôle plus fin de la synchronisation avec la base de données. La méta-annotation @Discoverable permet de placer des annotations comme @NamedQuery sur n'importe quelle classe ou interface. Améliorations du DDL via @Index amélioré et clarifications de la spécification via la javadoc. Quarkus 3.35 : tree-shaking, PGO et AOT Semeru https://quarkus.io/blog/quarkus-3-35-released/ Quarkus est un framework Java cloud-natif optimisé pour GraalVM et HotSpot, conçu pour les microservices et les environnements conteneurisés. Nouveau JAR tree-shaking expérimental : analyse des dépendances à la compilation pour supprimer les classes inutilisées. Sur le CLI Quarkus, cela supprime plus de 6 000 classes et économise environ 18 Mo (39,5 %). Support du Profile-Guided Optimization (PGO) pour les builds natifs via quarkus.native.pgo.enabled=true. Le PGO est une fonctionnalité Oracle GraalVM, non disponible dans la Community Edition. Support de l'AOT IBM Semeru : le démarrage passe de ~380 ms à ~190 ms dans les premiers tests. Nouvelle extension quarkus-reactive-transactions : support de @Transactional pour les méthodes Hibernate Reactive retournant Uni. Configuration CORS dédiée pour l'interface de management, indépendante de l'interface HTTP principale. Les tests n'utilisent plus les System Properties pour la propagation de configuration, facilitant la parallélisation future. Le serializer jackson sans reflection n'est pas le default du aux retours de cas limites, encore du travail This Week in Spring - 21 avril 2026 https://spring.io/blog/2026/04/21/this-week-in-spring-april-21-2026 Spring Framework 6.2.18 et 7.0.7 corrigent trois failles de sécurité : DoS via fichiers multipart WebFlux, empoisonnement de cache de ressources statiques, et DoS sur Windows. Le support open source de Spring Framework 5.3.x et 6.1.x est terminé, la migration est recommandée. Spring Data 2026.0.0-RC1 introduit l'upsert (MERGE/INSERT ON CONFLICT) dans l'API Template de Spring Data Relational. Spring Data ajoute un RedisMessageSendingTemplate pour la cohérence avec les listeners Redis, et une optimisation de réinitialisation de caches en un seul appel. Spring AI introduit une Session API (série Agentic Patterns, partie 7) : architecture event-sourcée pour la mémoire des agents IA. La Session API supporte la compaction turn-safe, l'isolation de sous-agents en parallèle, et la persistence JDBC (PostgreSQL, MySQL, MariaDB, H2). Elle vise Spring AI 2.1 (novembre 2026) et remplacera à terme l'API ChatMemory. Spring Vault 4.1.0-RC1 et 4.0.2 sont disponibles. Netflix a présenté son usage de Java, Spring Boot et Spring AI dans une vidéo. This Week in Spring - 28 avril 2026 https://spring.io/blog/2026/04/28/this-week-in-spring-april-28-2026 Cette série hebdomadaire de Josh Long compile les nouveautés de l'écosystème Spring : articles, outils, podcasts et annonces de la communauté. Spring Boot 4 introduit un package natif de résilience org.springframework.resilience avec une nouvelle API de retry qui remplace les approches fragiles via Spring Retry ou Resilience4j. L'API retry native de Spring Boot 4 a des noms d'attributs et sémantiques différents des anciennes bibliothèques, rendant les tutoriels pré-2025 obsolètes et sources de bugs silencieux. Le SDK Spring AI pour Amazon Bedrock AgentCore est disponible en GA : il intègre les capacités AgentCore dans Spring AI via annotations et auto-configuration. Le SDK AgentCore gère automatiquement le contrat runtime AgentCore : endpoint /invocations, health check /ping, SSE avec backpressure. Il offre mémoire court terme (sliding window) et long terme (sémantique, préférences, résumé, épisodique), ainsi que des outils pour navigateur et exécution de code en sandbox. Un plugin Maven (Nullability Maven Plugin) simplifie l'intégration de JSpecify et NullAway pour enforcer la null-safety à la compilation dans les projets Java. Le plugin génère automatiquement les fichiers package-info.java par package et configure le compilateur pour traiter les violations de nullabilité comme des erreurs. Josh Long et Dr. Venkat Subramaniam ont co-présenté à Voxxed Days Amsterdam sur "Intelligent Kotlin", avec un épisode de podcast associé. Cloud Amazon S3 Files https://aws.amazon.com/about-aws/whats-new/2026/04/amazon-s3-files/ Amazon S3 Files est un nouveau service donnant un accès système de fichiers direct aux données stockées dans les buckets S3 Basé sur la technologie Amazon EFS, il supprime la barrière entre stockage objet et interface système de fichiers sans dupliquer les données Débit en lecture pouvant atteindre plusieurs téraoctets par seconde ; des milliers de ressources de calcul peuvent y accéder simultanément Les données restent accessibles via les deux interfaces : S3 API classique et système de fichiers standard, sans migration nécessaire Cas d'usage : agents IA pour la persistance de mémoire entre pipelines, équipes ML sans staging, simplification des data lakes Disponible dans 34 régions AWS Data et Intelligence Artificielle Comment générer de la musique et des clips audio en Java avec le modèle Lyria 3 https://glaforge.dev/posts/2026/03/25/generating-music-with-lyria-3-and-the-gemini-interactions-java-sdk/ Génération musicale avec Lyria 3 (DeepMind) et le SDK Java Gemini Interactions. Lyria 3 : modèle d'IA générative pour créer musique avec paroles ou pistes instrumentales. Utilisation via le SDK Java de l'API Gemini, nécessite une clé API Gemini. Deux versions de modèle Lyria 3 : lyria-3-clip-preview : Clips courts (30s), extraits. lyria-3-pro-preview : Chansons complètes (jusqu'à 3 min), structurées. Personnalisation via les prompts : Fournir ses propres paroles ou les faire générer. Contrôler la structure de la chanson ([Intro], [Verse], [Chorus], [Outro]). Générer des morceaux instrumentaux uniquement. Utiliser des images comme source d'inspiration (modèle multimodal). Sortie : Audio (MP3) et texte (paroles/structure) directement, sans décodage complexe. Facilite l'intégration de la génération musicale dans les applications Java. Les world model, la prochaine étape pour les IA https://www.lepoint.fr/sciences-nature/comment-le-commando-de-yann-le-cun-se-prepare-a-ringardiser-les-geants-mondiaux-de-lia-depuis-paris-OZVUWTDYBNE25C6WF44265ZQKE/ Yann LeCun a quitté Meta FAIR pour créer AMI Labs (Advanced Machine Intelligence) basée à Paris Sa thèse : les LLMs ne mèneront pas à l'intelligence générale, la vraie IA doit partir de la compréhension du monde physique AMI Labs a levé 1,03 milliard de dollars en seed (le plus grand seed round de l'histoire européenne) à 3,5 milliards de valorisation Les world models apprennent à prédire et comprendre la réalité physique plutôt qu'à prédire le prochain token d'une séquence Slogan d'AMI : "Real intelligence does not start in language. It starts in the world." Paris comme base stratégique pour challenger la Silicon Valley dans la prochaine rupture de l'IA Debezium 2026 : résultats du sondage communautaire https://debezium.io/blog/2026/04/27/debezium-2026-survey-results/ Debezium est un outil de Change Data Capture (CDC) open source qui capture les modifications de bases de données en temps réel pour les diffuser vers des systèmes comme Kafka. 98,6% des répondants utilisent Debezium activement ou prévoient de le faire dans l'année, avec 91,3% déjà en production. 63,8% des déploiements tournent sur Kubernetes, 60,9% utilisent Kafka Connect auto-géré, et 17,4% restent sur des VMs ou bare metal. Helm charts est l'approche dominante pour la gestion de configuration, souvent combiné avec GitOps, CI/CD, Ansible ou Terraform. PostgreSQL domine les connecteurs utilisés à 69,6%, suivi de MySQL (33,3%), SQL Server (29%) et Oracle (27,5%). Les volumes de changements capturés vont de 1-25 modifications par minute jusqu'à 1-2 millions par minute selon les environnements. Infinispan rejoint l'écosystème OGX comme fournisseur de stockage vectoriel https://infinispan.org/blog/2026/04/17/infinispan-joins-ogx-ecosystem OGX (anciennement Llama Stack) est un serveur API agentique open source pour construire des applications d'IA complètes. OGX compose des fournisseurs d'inférence, des stores vectoriels, des backends de sécurité, des runtimes d'outils et du stockage de fichiers en un seul serveur déployable. OGX se positionne comme une alternative à l'API OpenAI, déployable sur diverses infrastructures et modèles. OGX cible les workflows RAG (Retrieval-Augmented Generation) et les applications agentiques. Infinispan s'y intègre comme fournisseur de vector IO, apportant recherche vectorielle, par mots-clés et hybride. Je n'ai pas entendu parlé de ce renommage, vous le voyez dans vos deploiements ? Outillage cmux un nouveau terminal basé sur Ghostty spécialisé pour les coding agents https://cmux.com/ Application macOS native construite sur le moteur de rendu Ghostty (libghostty), offrant une accélération GPU pour une fluidité maximale Conçu spécifiquement pour le multitâche et les workflows assistés par IA, avec des onglets verticaux affichant la branche Git, le répertoire et les ports actifs Intègre des notifications qui illuminent les panneaux lorsqu'un agent IA (Claude Code, Codex, etc.) nécessite l'attention de l'utilisateur Propose un navigateur web intégré et scriptable qui peut être affiché en écran scindé à côté du terminal via une API Alternative moderne à tmux, ne nécessitant pas de fichiers de configuration complexes ou de préfixes de touches pour la gestion des vitres et des sessions Supporte nativement tous les agents de codage en ligne de commande et permet l'automatisation via une API socket et une interface CLI dédiée Git Worktree comme un chef https://www.metal3d.org/blog/2026/git-worktree-comme-un-chef/ Article par Patrice Ferlet Git Worktree: Travailler sur plusieurs branches simultanément via des répertoires distincts. Évite git stash ou clones multiples pour le changement de contexte rapide. Méthode "bare" (recommandée): Cloner le dépôt en mode bare (ex: .bare). Lier le dossier racine au dépôt bare via un fichier .git. Configurer le remote tracking pour voir toutes les branches distantes. Ajouter des worktrees pour chaque branche (git worktree add ). Avantages: Économie d'espace, source de vérité unique (un git fetch met tout à jour), hooks/configs partagés, sécurité. Conseils: Ne jamais faire de git checkout à l'intérieur d'un worktree. git fetch --all depuis n'importe quel worktree pour tout mettre à jour. git worktree add --detach pour tester des merges temporaires sans créer de branche. Supprimer: git worktree remove puis git worktree prune. Un script wtree est fourni pour automatiser l'initialisation du setup "bare". Améliore considérablement le workflow. L'IDE meurt et vite https://x.com/jdegoes/status/2036931874057314390?s=46&t=C18cckWlfukmsB_Fx0FfxQ Des leaders techniques prédisent la fin rapide de l'IDE traditionnel, remplacé par des interfaces conversationnelles agentiques Le changement de paradigme : le développeur n'écrit plus des lignes de code mais exprime son intention et supervise des agents autonomes Des outils comme Claude Code, Copilot et Cursor transforment déjà radicalement les workflows de développement quotidiens L'IDE centré sur l'éditeur de code perd sa raison d'être quand l'agent lit, modifie et structure le code de manière autonome La transition est comparable au passage du desktop au mobile : les pratiques établies depuis 30 ans remises en question en quelques mois Le source de Claude Code a leaké via probablement le codemap et un site decrit sont fonctionnement https://ccunpacked.dev/ Le 31 mars 2026, Anthropic a accidentellement inclus les sourcemaps dans un package npm de Claude Code, exposant ~512 000 lignes de TypeScript La fuite n'était pas un piratage mais une erreur humaine : un "*.map" oublié dans .npmignore Le site ccunpacked.dev a été lancé pour analyser et visualiser le code source décompressé Le code révèle un agent background permanent nommé "KAIROS", un mode furtif pour cacher les contributions des employés Anthropic à l'open source, et 44 feature flags cachés Une fonctionnalité inédite "Buddy" (animal de compagnie électronique dans le terminal) et un mode "dream" pour l'idéation continue ont été découverts Anthropic a confirmé : "Aucune donnée client sensible n'était impliquée. Erreur humaine dans le packaging de la release." Gemini CLI passe aux agents https://x.com/srithreepo/status/2039794081925382307?s=46&t=GLj1NFxZoCFCjw2oYpiJpw Gemini CLI, l'agent IA open source de Google pour le terminal, introduit des hooks dans sa boucle agentique Les hooks permettent d'exécuter des scripts automatiquement (scanners de sécurité, vérifications de conformité, logging) à chaque étape de l'agent Lancement de Gemini CLI GitHub Actions : un agent autonome pour les repositories qui peut exécuter des tâches de codage de routine Support des MCP servers pour étendre les capacités et des "Agent Skills" pour des workflows spécialisés Mode agent disponible dans VS Code et IntelliJ avec accès aux outils du système de fichiers et terminal Wispr, le speech to text en local sur macOS http://wispr.stormacq.com/ Wispr est une application macOS de dictée vocale entièrement locale, propulsée par Whisper (OpenAI) sur appareil, sans cloud ni tracking Sébastien Stormacq a développé Wispr en un jour et demi sans écrire une seule ligne de code, grâce à Kiro CLI (agent IA Amazon) Disponible en open source sur GitHub et via Homebrew Détection automatique de la langue, insertion du texte au curseur dans n'importe quelle application via un raccourci global En un mois : 19 releases incluant mode mains-libres, suppression des mots de remplissage, auto-envoi pour les chats, et un outil CLI Exemple concret de développement vibe coding produisant un outil de qualité production sans expertise Swift préalable Comment, Gordon, l'assistant spécialisé en Docker est né https://n9o.xyz/posts/202603-building-gordon/ Nuno Coração (n9o.xyz) détaille comment Gordon, l'assistant spécialisé Docker, a été construit sur docker-agent, le runtime d'agents IA open source de Docker écrit en Go Les agents sont définis en YAML déclaratif et distribués comme des artefacts OCI, sans mise à jour binaire nécessaire L'architecture initiale en essaim de 9 agents spécialisés a été abandonnée au profit d'un agent racine unique avec un prompt soigneusement conçu Le modèle utilisé est Claude Haiku 4.5, suffisant après optimisation des prompts Principe clé "show, then do" : toute action de l'agent nécessite une approbation explicite de l'utilisateur La description des outils impacte fortement la précision du LLM : ajouter des outils peut paradoxalement dégrader les performances existantes Le prompt est une spécification détaillée (identité, patterns d'accès fichiers, règles de sécurité) plutôt qu'une simple instruction IBM Bob https://bob.ibm.com/blog/announcing-ibm-bob-launch IBM Bob assistant IA d'IBM pour coder sur de vraies codebases (lancé avril 2026) 5 modes : Ask, Plan, Code, Advanced (MCP), Orchestrator Détecte la complexité du code en temps réel et propose des refactos Fait des revues de code automatiques sur tes branches/issues GitHub Permet d'écrire en langage naturel directement dans l'éditeur Fonctionne aussi en terminal/CLI et dans les pipelines CI/CD Sécurité : approbation manuelle, .bobignore, checkpoints, pas de training sur tes prompts How I use Claude - 50 tips pratiques https://www.youtube.com/watch?v=mZzhfPle9QU Staff Engineer Meta partage 50 tips après 6 mois d'utilisation intensive de Claude Code Basé sur ~12h/jour d'usage perso et professionnel Couvre tout : bases, workflows avancés, parallélisation Objectif : partager ce qu'il aurait voulu savoir dès le départ Méthodologies Quelqu'un rale sur la non soutenabilité des bases de code écritent avec des agents https://mariozechner.at/posts/2026-03-25-thoughts-on-slowing-the-fuck-down/ Mario Zechner estime que les agents IA font les mêmes erreurs répétitivement sans apprendre, accumulant la complexité à grande vitesse faute de bottlenecks humains Sans vision globale, les agents créent du cargo-cult : les "best practices" de l'industrie appliquées localement sans cohérence architecturale La croissance de la base de code dégrade la capacité des agents à retrouver le code existant → duplication et incohérences croissantes Il cite des pannes AWS et des initiatives qualité Microsoft comme signes préoccupants liés au code généré par IA Solution : réserver les agents aux tâches délimitées et évaluables, garder l'architecture, les APIs et les systèmes critiques écrits à la main Maintenir une revue de code rigoureuse et traiter les humains comme les gardiens finaux de la qualité On m'oblige à utiliser l'IA https://n.survol.fr/n/on-moblige-a-utiliser-lia Éric D. défend l'adoption obligatoire de l'IA comme décision stratégique légitime, comparable au choix du full remote ou de la stack technique Il distingue la décision stratégique (adoption IA) de la méthode d'accompagnement (qui reste collaborative et bienveillante) La compétence IA devient un critère de recrutement : chercher des candidats déjà curieux et explorateurs de ces outils L'alignement culturel sur les pratiques et outils est un prérequis à la cohésion d'équipe Le refus d'adopter certains outils stratégiques peut justifier de ne pas recruter un candidat autrement compétent Encore une metodo SPDD https://martinfowler.com/articles/structured-prompt-driven/ Problème : l'IA accélère le dev individuel mais amplifie ambiguïtés et incohérences à l'échelle d'une équipe. martinfowler SPDD : traiter les prompts comme des artefacts versionnés, révisables et réutilisables plutôt que des échanges jetables. martinfowler Canvas REASONS : 7 dimensions (Requirements, Entities, Approach, Structure, Operations, Norms, Safeguards) pour guider le LLM de l'intention à l'exécution. martinfowler Workflow en 6 étapes : exigences → analyse → contexte → prompt structuré → code → tests unitaires, chaque étape s'appuyant sur la précédente. martinfowler 3 compétences clés : abstraction d'abord, alignement de l'intention, revue itérative. martinfowler Limites : fort ROI sur du code métier complexe, peu adapté aux hotfixes urgents, scripts jetables ou travail créatif/visuel. m Sécurité Le projet Glasswing pour sécuriser les logiciels https://www.anthropic.com/glasswing Anthropic lance Glasswing, une initiative de cybersécurité utilisant Claude Mythos Preview pour identifier des vulnérabilités zero-day 12 partenaires fondateurs dont AWS, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft et NVIDIA Anthropic investit 100 millions de dollars en crédits de modèle et 4 millions en dons aux organisations de sécurité open source Le modèle opère avec une autonomie substantielle, identifiant des milliers de vulnérabilités dans les OS, navigateurs et infrastructures critiques Plus de 40 organisations supplémentaires ont accès pour scanner et sécuriser leurs systèmes Objectif : donner l'avantage aux défenseurs avant que les techniques de hacking assistées par IA ne se généralisent chez les attaquants LinkedIn vous espionne https://frenchbreaches.com/blog/linkedin-est-accuse-de-fouiller-dans-votre-ordinateur-illegalement Scandale "BrowserGate" : LinkedIn injecte du JavaScript qui tente de détecter les extensions Chrome installées sur votre navigateur Le script analysé contient une liste codée en dur de 6 222 extensions Chrome avec identifiants et chemins de fichiers internes Croissance alarmante de la liste ciblée : 38 extensions en 2017 → 461 en 2024 → ~1 000 en mai 2025 → 6 222 début 2026 Les données collectées incluent aussi CPU, RAM, résolution d'écran, timezone et état batterie pour du fingerprinting Certaines extensions ciblées sont liées à la neurodivergence, aux pratiques religieuses ou aux opinions politiques → violation grave du RGPD LinkedIn défend que le scan vise uniquement à détecter les extensions qui pratiquent le scraping de données Post mortem de la supply chain attack sur la librairie NPM axios https://github.com/axios/axios/issues/10636 Le 31 mars 2026, deux versions malveillantes d'axios (1.14.1 et 0.30.4) ont été publiées via un compte mainteneur compromis Vecteur d'attaque : RAT installé via ingénierie sociale ciblée sur la machine personnelle du mainteneur principal La 2FA ne protège pas si la machine de l'utilisateur est compromise : l'attaquant contrôle tout et peut agir comme l'utilisateur Les packages malveillants injectaient plain-crypto-js@4.2.1, un cheval de Troie multi-plateforme (macOS, Windows, Linux) Détection communautaire en ~3 heures, suppression par npm, mesures correctives : rotation complète des credentials Changements préventifs : publication via OIDC, releases immuables, amélioration des pratiques GitHub Actions Passbolt un gestionnaire de mots de passe open source https://lesjoiesducode.fr/passbolt-gestionnaire-de-mots-de-passe-gratuit-open-source-que-votre-equipe-merite-vraiment Gestionnaire de mots de passe open source conçu pour le partage d'identifiants en équipe, utilisé par plus de 50 000 organisations Chiffrement individuel par utilisateur et par version de credential, pas de coffre-fort partagé — architecture zero-knowledge "Forward secrecy" : quand un membre quitte l'équipe, ses copies chiffrées sont automatiquement révoquées sans reset manuel Supporte TOTP, clés SSH, tokens API et champs personnalisés avec piste d'audit complète de tous les accès Édition communautaire entièrement gratuite avec utilisateurs illimités, auto-hébergeable ou cloud Chiffrement OpenPGP nécessitant passphrase + clé privée, avec tokens visuels anti-phishing Loi, société et organisation Anthropic fait un don d'1,5 millions de dollars à la fondation Apache https://news.apache.org/foundation/entry/the-apache-software-foundation-announces-1-5m-donation-from-anthropic Anthropic donne 1,5 million de dollars à l'ASF pour soutenir l'infrastructure, la sécurité et la communauté open source Vitaly Gudanets (CISO d'Anthropic) : "Soutenir l'ASF est un investissement direct dans la résilience et l'intégrité des systèmes dont dépend l'IA moderne" Les fonds financeront les systèmes de build, les processus de sécurité et les services aux projets Apache Ce don est le déclencheur de l'initiative IA responsable à 10 millions de dollars de l'ASF L'infrastructure Apache est invisible mais critique : des systèmes financiers aux plateformes de santé, elle sous-tend l'écosystème logiciel mondial L'ASF lance l'initiative IA responsable https://news.apache.org/foundation/entry/the-apache-software-foundation-launches-10m-responsible-ai-initiative-with-initial-1-75m-donation L'ASF lance une initiative pour une IA responsable dotée d'un budget de 10 millions de dollars sur 3 ans minimum Anthropic est le premier donateur avec 1,5 million de dollars ; Alpha-Omega contribue 250 000 dollars L'initiative fournit aux projets Apache un accès à des modèles IA pour l'expérimentation et la sécurité Elle soutient l'ensemble de la chaîne IA/ML : pipelines de données, infrastructure, frameworks de deep learning Des tracks de conférences, hackathons et bourses de voyage sont prévus pour élargir la communauté Les principes directeurs incluent la supervision humaine, l'intégrité des licences et la sécurité open source Oracle vire 30000 personnes https://rollingout.com/2026/03/31/oracle-slashes-30000-jobs-with-a-cold-6/ Oracle licencie 20 000 à 30 000 employés, 18% de ses effectifs mondiaux. Les salariés ont appris leur licenciement par un simple email à 6h du matin, sans aucun préavis. L'accès à tous les systèmes (Slack, Zoom, badges) a été coupé immédiatement après. But : libérer 8 à 10 milliards de dollars pour construire des centres de données IA. Oracle a déjà contracté 50 milliards de dettes en 2026 pour financer ses projets IA. Paradoxe : l'entreprise affiche un bénéfice record de 6,13 milliards, mais ses liquidités sont dans le rouge. L'action Oracle a perdu plus de la moitié de sa valeur depuis septembre 2025. Et si l'IA n'était qu'un prétexte pour licencier https://eventuallycoding.com/p/ia-licenciements-et-si-l-intelligence-artificielle-n-etait-qu-une-excuse Hugo Lassiège (eventuallycoding) estime que les entreprises utilisent l'IA comme narratif commode pour masquer des erreurs de gestion passées (Block a triplé ses effectifs post-COVID sans croissance des revenus correspondante) Moins de 1% des licenciements technologiques seraient réellement dus à des gains de productivité IA selon les analyses citées Mesurer la productivité des développeurs reste un problème non résolu, mais les entreprises affirment des gains d'efficacité sans preuves Des pressions économiques réelles (inflation, guerres commerciales, coûts énergétiques) sont masquées derrière le discours IA Les restructurations nécessaires sont présentées comme des transformations AI-driven positives pour rassurer les investisseurs Il y voit une fenêtre d'opportunité pour l'Europe pendant que les géants américains se restructurent GitHub Copilot va utiliser les interacitons pour entrainer ses modèles sauf si vous vous délistez https://github.blog/news-insights/company-news/updates-to-github-copilot-interaction-data-usage-policy/ À partir du 24 avril 2026, GitHub utilise par défaut les interactions des utilisateurs Copilot Free, Pro et Pro+ pour entraîner ses modèles Les données collectées incluent le code accepté ou modifié, les snippets envoyés, les noms de fichiers et structures de dépôts, et les retours utilisateurs Les utilisateurs Copilot Business, Enterprise et les dépôts d'entreprise sont exclus de cette collecte de données d'entraînement Opt-out disponible dans les paramètres GitHub > "Privacy" ; les préférences de désactivation préalables sont conservées automatiquement Objectif déclaré : améliorer la précision des modèles sur les langages et cas d'usage du monde réel Grosse percée de Claude Code dans les commits sur GitHub https://aifoc.us/damn-claude-thats-a-lot-of-commits/ Explosion de Claude Code : En six mois, Claude Code est passé de 0,7 % à 4,5 % de tous les commits publics sur GitHub, surpassant tous les autres outils d'IA combinés. Adoption massive des agents IA : Environ 5 % des commits publics sur GitHub sont désormais générés par des agents IA, un chiffre en croissance rapide depuis fin 2025. Domination des bots sur GitHub : Au-delà des commits, les outils d'IA sont omniprésents dans la gestion des pull requests et des problèmes (Copilot et CodeRabbit notamment). Limites méthodologiques : Les données ne concernent que les dépôts publics (les entreprises utilisent massivement des dépôts privés, invisibles ici). Le comptage dépend fortement de la visibilité des signatures (certains outils comme Claude marquent systématiquement leurs commits, d'autres non) L'API de recherche GitHub présente une fiabilité variable à cette échelle. Changement de paradigme : Le développement logiciel vit une transition majeure, comparable au passage du desktop au mobile. L'intégration des agents IA dans le cycle de production n'est plus une expérimentation, mais une réalité opérationnelle à grande échelle. Dysmaths une application pour aider à apprendre les mathématiques et la géométrie lorsque l'on souffre de dyspraxie, dysgraphie https://dysmaths.com/ Application web pour aider les élèves de collège et lycée souffrant de dysgraphie et dyspraxie à faire des maths et de la géométrie Outils de dessin à main levée, géométrie précise (compas, rapporteur, règle) et opérations structurées (fractions, racines, puissances, symboles mathématiques) Export PDF et PNG avec conservation fidèle de l'échelle pour l'impression et la soumission des exercices Options d'accessibilité : police OpenDyslexic, personnalisations d'interface, import d'images et de PDFs Répond à un besoin réel : les outils standards ne sont pas adaptés aux difficultés de coordination et d'organisation spatiale en mathématiques IA ou réalité ? Par Amistory https://www.youtube.com/watch?v=PPYdAhBBF2I L'IA génère des contenus (images, voix, vidéos) de plus en plus indétectables Les arnaques au clonage de voix et deepfakes sont en forte hausse Les faux contenus viraux manipulent l'opinion à grande échelle Le faux n'est plus un accident, c'est devenu un système organisé La société entre dans une ère de doute généralisé sur le réel Comment s'informer quand le réel lui-même peut être simulé ? Conférences La liste des conférences provenant de Developers Conferences Agenda/List par Aurélie Vache et contributeurs : 6-7 mai 2026 : Devoxx UK 2026 - London (UK) 12 mai 2026 : Lead Innovation Day - Leadership Edition - Paris (France) 12-13 mai 2026 : Lyon Craft - Lyon (France) 19 mai 2026 : La Product Conf Paris 2026 - Paris (France) 19-20 mai 2026 : Green Code Challenge - Paris (France) 21-22 mai 2026 : Flupa UX Days 2026 - Paris (France) 22 mai 2026 : AFUP Day 2026 Lille - Lille (France) 22 mai 2026 : AFUP Day 2026 Paris - Paris (France) 22 mai 2026 : AFUP Day 2026 Bordeaux - Bordeaux (France) 22 mai 2026 : AFUP Day 2026 Lyon - Lyon (France) 27 mai 2026 : aMP Day Strasbourg 2026 - Strasbourg (France) 28 mai 2026 : DevCon 27 : I.A. & Vibe Coding - Paris (France) 28 mai 2026 : Cloud Toulouse 2026 - Toulouse (France) 29 mai 2026 : NG Baguette Conf 2026 - Paris (France) 29 mai 2026 : Agile Tour Strasbourg 2026 - Strasbourg (France) 2-3 juin 2026 : Agile Tour Rennes 2026 - Rennes (France) 2-3 juin 2026 : OW2Con - Paris-Châtillon (France) 3 juin 2026 : IA–NA - La Rochelle (France) 4 juin 2026 : Workplace Intelligence Days - 1ère édition - Lyon (France) 5 juin 2026 : TechReady - Nantes (France) 5 juin 2026 : Fork it! - Rouen - Rouen (France) 6 juin 2026 : Polycloud - Montpellier (France) 9 juin 2026 : JFTL - Montrouge (France) 9 juin 2026 : C: - Caen (France) 9 juin 2026 : France API 2026 - Paris (France) 11-12 juin 2026 : DevQuest Niort - Niort (France) 11-12 juin 2026 : DevLille 2026 - Lille (France) 12 juin 2026 : Tech F'Est 2026 - Nancy (France) 15 juin 2026 : Jupyter Workshops: Demystifying MyST Markdown in Education - Orsay (France) 16 juin 2026 : Mobilis In Mobile 2026 - Nantes (France) 17-19 juin 2026 : Devoxx Poland - Krakow (Poland) 17-20 juin 2026 : VivaTech - Paris (France) 18 juin 2026 : Tech'Work - Lyon (France) 22-26 juin 2026 : Galaxy Community Conference - Clermont-Ferrand (France) 23-24 juin 2026 : MWCP 2026 - Paris (France) 24-25 juin 2026 : Agi'Lille 2026 - Lille (France) 24-26 juin 2026 : BreizhCamp 2026 - Rennes (France) 25-26 juin 2026 : Agile Tour Toulouse 2026 - Toulouse (France) 27 juin 2026 : Asynconf - Paris (France) 2 juillet 2026 : Azur Tech Summer 2026 - Valbonne (France) 2-3 juillet 2026 : Sunny Tech - Montpellier (France) 3 juillet 2026 : Agile Lyon 2026 - Lyon (France) 6-8 juillet 2026 : Riviera Dev - Sophia Antipolis (France) 28-30 août 2026 : State of the Map - Champs-sur-Marne (France) 4 septembre 2026 : JUG Summer Camp 2026 - La Rochelle (France) 10-11 septembre 2026 : Nantes Craft - Nantes (France) 17 septembre 2026 : dotAI - Paris (France) 17-18 septembre 2026 : API Platform Conference 2026 - Lille (France) 18 septembre 2026 : dotJS - Paris (France) 18 septembre 2026 : WordCamp Bretagne - Rennes (France) 22 septembre 2026 : Salon Data 2026 - Nantes (France) 22-23 septembre 2026 : Agile en Seine & IA 2026 - Paris (France) 24 septembre 2026 : OWASP AppSec Days France 2026 - Paris (France) 24 septembre 2026 : PlatformCon Paris - Paris (France) 24 septembre 2026 : React Native Connection 2026 - Paris (France) 24-26 septembre 2026 : Paris Web 2026 - Paris (France) 28-29 septembre 2026 : 4th Tech Summit on AI & Robotics - Paris (France) & Online 1 octobre 2026 : WAX 2026 - Marseille (France) 1-2 octobre 2026 : Volcamp - Clermont-Ferrand (France) 2 octobre 2026 : DevFest Perros-Guirec 2026 - Perros-Guirec (France) 5-9 octobre 2026 : Devoxx Belgium - Antwerp (Belgium) 12 octobre 2026 : Dev With AI - Paris (France) 27-29 octobre 2026 : Directions EMEA 2026 - Paris (France) 29-30 octobre 2026 : BDX I/O 2026 - Bordeaux (France) 30 octobre 2026 : Cloud Nord 2026 - Lille (France) 4-5 novembre 2026 : Devoxx Morocco - Casablanca (Morocco) 14-15 novembre 2026 : Capitole du Libre - Toulouse (France) 19 novembre 2026 : DevFest Toulouse 2026 - Toulouse (France) 27 novembre 2026 : DevFest Paris 2026 - Paris (France) 1-3 décembre 2026 : Apidays Paris - Paris (France) 4 décembre 2026 : DevFest Lyon 2026 - Lyon (France) 4 décembre 2026 : DevFest Dijon 2026 - Dijon (France) 9-10 décembre 2026 : OpenSource Expérience - Paris (France) 9-10 décembre 2026 : DevOps REX - Paris (France) 10 décembre 2026 : KCD Provence - Aix-en-Provence (France) 7-9 avril 2027 : Devoxx France 2027 - Paris (France) Nous contacter Pour réagir à cet épisode, venez discuter sur le groupe Google https://groups.google.com/group/lescastcodeurs Contactez-nous via X/twitter https://twitter.com/lescastcodeurs ou Bluesky https://bsky.app/profile/lescastcodeurs.com Faire un crowdcast ou une crowdquestion Soutenez Les Cast Codeurs sur Patreon https://www.patreon.com/LesCastCodeurs Tous les épisodes et toutes les infos sur https://lescastcodeurs.com/
On this week's episode, we sat down with Jennifer Smith, CFO of Pan-European fibre infrastructure operator and telco service provider Zayo Europe to discuss the critical role connectivity plays in the rapidly expanding data centre market.Smith examined the growing partnerships between fibre providers and data centre operators – including its recently announced venture with QTS for its EUR 10bn Cambois project – and unpacked some of the key challenges shaping the sector today.NPM is a leading data, intelligence & events company providing business development led coverage of the US & European power, storage & data center markets for the development, finance, M&A and corporate community.Download our mobile app.
SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
Today's Odd Web Requests https://isc.sans.edu/diary/Today%27s%20Odd%20Web%20Requests/32934 Incomplete Patch of APT28's Zero-Day Leads to CVE-2026-32202 https://www.akamai.com/blog/security-research/2026/apr/incomplete-patch-apt28s-zero-day-cve-2026-32202 Assess Secure Boot status with Microsoft Defender https://techcommunity.microsoft.com/blog/MicrosoftDefenderATPBlog/assess-secure-boot-status-with-microsoft-defender/4510356 Deprecating Legacy TLS and Endpoints for POP and IMAP in Exchange Online https://techcommunity.microsoft.com/blog/exchange/deprecating-legacy-tls-and-endpoints-for-pop-and-imap-in-exchange-online/4515201 SAP Related npm Packages Compromised https://www.stepsecurity.io/blog/a-mini-shai-hulud-has-appeared
Ministry Monday's Story Series continues with Rodolfo López and Estela García-López, a husband and wife duo who serve in music ministry in the pacific northwest and are two of NPM's Pastoral Musicians of the Year. Rudy López and Estela García-López are pastoral musicians, composers, and event presenters originally from Los Angeles, California, and from Portland, Oregon, for the last twenty-six years. They have been committed to enriching the liturgy -- in both English and Spanish--through music. They have served as music directors, cantors, choristers, and instrumentalists in parishes throughout Southern California and Portland, Oregon, for over four decades. Today I sit with Rudy and Stella and discuss their pastoral music journey with its joys, challenges, and many blessings.
On this week's episode, AtlasEdge CEO Tesh Durvasula joins Madeline Sherratt to unpack the company's European growth strategy and shifting data centre landscape.Tesh highlights Lisbon as a key hub in AtlasEdge's Iberian expansion, citing strong connectivity, renewable energy, and rising demand. He also explains the rationale behind the sale of nine edge data centres, as the company refocuses on larger, scalable campuses targeting hyperscalers and major enterprises.The discussion explores growth beyond traditional FLAP-D markets, with Germany and Austria emerging as core regions, and why less saturated cities offer faster access to power and permits.Tesh also dives into the impact of AI on infrastructure demand, the push toward localised data processing, and the key challenges facing developers - from supply chain delays to power constraints and capital discipline.NPM is a leading data, intelligence & events company providing business development-led coverage of the US & European power, storage & data centre markets for the development, finance, M&A and corporate community. Download our mobile app.
Malware has shifted from phishing expeditions to open source packages, domains, and repositories. Ned and Kyler welcome Jenn Gile, co-founder of Open Source Malware, to discuss how malware is making its way into open source software. Together they break down NPM compromises, AI-driven infiltration, malicious agent skills, and more. Episode Links: Open Source Malware –... Read more »
Malware has shifted from phishing expeditions to open source packages, domains, and repositories. Ned and Kyler welcome Jenn Gile, co-founder of Open Source Malware, to discuss how malware is making its way into open source software. Together they break down NPM compromises, AI-driven infiltration, malicious agent skills, and more. Episode Links: Open Source Malware –... Read more »
Malware has shifted from phishing expeditions to open source packages, domains, and repositories. Ned and Kyler welcome Jenn Gile, co-founder of Open Source Malware, to discuss how malware is making its way into open source software. Together they break down NPM compromises, AI-driven infiltration, malicious agent skills, and more. Episode Links: Open Source Malware –... Read more »
On this week's episode, Alsym Energy COO Graeme Grant joins Andrew Burnes to discuss the growing opportunity for sodium-ion batteries to take hold in the battery storage landscape of the US. Graeme discusses the state of the sodium-ion market today as compared to lithium-ion and some of the benefits of the burgeoning technology but also challenges that sodium-ion will need to overcome to reach commercial market adoption. The discussion also dives into the recent release from the US Treasury detailing PFE rules and how sodium-ion may be positioned to benefit while lithium-ion is bottlenecked.NPM is a leading data, intelligence & events company providing business development led coverage of the US & European power, storage & data center markets for the development, finance, M&A and corporate community.Download our mobile app.
Introducing "The Story Series", a recurring series of episodes on Ministry Monday! The Story Series interviews pastoral musicians across the country and shares their pastoral music journey, both in its rewards and challenges along the way. Today we speak with Rick Gibala, a lifelong pastoral musician who started his ministry in the Diocese of Pittsburgh. He was the first layperson to be the Diocesan Director of Music, and founded the Pittsburgh NPM chapter. His ministry led him to the DC-Virginia area where he's ministered there since 1986. Rick shares how he began to serve in pastoral ministry, what brought him to Virginia, and why he thinks NPM remains a critical part of music ministry in the Church today.
This episode covers several major cybersecurity and tech news stories, including a sophisticated NPM supply chain attack that compromised the widely used Axios library through advanced social engineering, and the broader implications for software security. The hosts also discuss the accidental leak of Anthropic's Claude codebase, what it reveals about AI development practices, and the risks of misconfigurations exposing sensitive systems. Additional conversation touches on AI reliability, “vibe-coded” software, and the growing role of AI in both development and attack techniques.Join us LIVE on Mondays, 4:30pm EST.A weekly Podcast with BHIS and Friends. We discuss notable Infosec, and infosec-adjacent news stories gathered by our community news team.https://www.youtube.com/@BlackHillsInformationSecurityChat with us on Discord! - https://discord.gg/bhis
Fortinet EMS Zero-Day Exploited, Anthropic's AI Finds Thousands of Bugs, and Iranian Hackers Target US ICS Cybersecurity Today would like to thank Meter for their support in bringing you this podcast. Meter delivers a complete networking stack, wired, wireless and cellular in one integrated solution that's built for performance and scale. You can find them at Meter.com/cst Host David Shipley reports Fortinet issued emergency hotfixes for a new actively exploited FortiClient EMS unauthenticated RCE zero-day (CVE-2026-35616) affecting 7.4.0.5/7.4.0.6, with over 2,000 exposed instances online and a full fix coming in 7.4.0.7. Anthropic says its Claude "Mythos" model (Project Glasswing) has found thousands of high-severity zero days and demonstrated advanced exploit chaining and sandbox escape, but will not be released publicly; it is being used with major partners and funded with up to $100M in credits plus $4M for open-source security. A postmortem details a North Korea–linked social-engineering supply-chain breach of Axios on NPM, part of a broader campaign spreading 1,700+ malicious packages across multiple ecosystems. US agencies warn Iranian-linked hackers are targeting Rockwell/Allen-Bradley PLCs in critical infrastructure. The White House proposes a $707M cut to CISA, reducing staffing while preserving $1.4B for core cybersecurity. 00:00 Headlines and Sponsor 00:55 Fortinet EMS Zero Day 03:21 AI Finds Zero Days 05:56 Axios Supply Chain Breach 08:02 North Korea Package Campaign 10:13 Iran Targets Industrial Control 12:22 CISA Budget Cuts Debate 14:05 Wrap Up and Thanks 14:59 Sponsor Message Meter
Security problems aren't changing very much even though security teams are. We catch up on the implications of the Claude Code source leak, the very human lessons from the axios NPM compromise, and what secure design looks like when it involves agents, humans, or both. AppSec has always celebrated interesting and impactful vulns. And LLMs are now a favored tool for finding flaws. We shouldn't forget the success and effectiveness of fuzzers like OSS-Fuzz, which has improved security for over 1,000 projects and found over 50,000 bugs. But we can't ignore the ease of prompting an agent to go find -- and exploit -- a vuln when the UX and overhead of doing so is hardly more than writing some markdown. The SDLC Blind Spot: Why Breaches Start with Identity, Not Code Developers have access to source code, CI/CD pipelines, and cloud infrastructure — and attackers know it. Target lost 860GB of source code through a single compromised credential. Recruitment fraud campaigns have pivoted from a compromised developer to cloud admin in under 10 minutes. As agents join human developers, contractors, and service accounts in the SDLC, the attack surface is expanding faster than static security tools can track. Security teams need real-time visibility beyond code and into who has access and what they're actually doing. This segment is sponsored by Apiiro. To lean more, visit https://securityweekly.com/apiirorsac. How AI-Driven Development is Reshaping the Application Risk Landscape Agent coding assistants are accelerating software development, generating more code and more change than security teams were built to handle. In this interview, Idan Plotnik discusses how AI-driven development is reshaping the application risk landscape and why traditional vulnerability management models can't keep up. Make sure to schedule a free SDLC Risk Assessment with BlueFlag Security - 30 minutes to deploy. 48 hours to results. Please visit https://securityweekly.com/blueflagrsac. Visit https://www.securityweekly.com/asw for all the latest episodes! Show Notes: https://securityweekly.com/asw-377
We're proud to release this ahead of Ryan's keynote at AIE Europe. Hit the bell, get notified when it is live! Attendees: come prepped for Ryan's AMA with Vibhu after.Move over, context engineering. Now it's time for Harness engineering and the age of the token billionaires.Ryan Lopopolo of OpenAI is leading that charge, recently publishing a lengthy essay on Harness Eng that has become the talk of the town:In it, Ryan peeled back the curtains on how the recently announced OpenAI Frontier team have become OpenAI's top Codex users, running a >1m LOC codebase with 0 human written code and, crucially for the Dark Factory fans, no human REVIEWED code before merge. Ryan is admirably evangelical about this, calling it borderline “negligent” if you aren't using >1B tokens a day (roughly $2-3k/day in token spend based on market rates and caching assumptions):Over the past five months, they ran an extreme experiment: building and shipping an internal beta product with zero manually written code. Through the experiment, they adopted a different model of engineering work: when the agent failed, instead of prompting it better or to “try harder,” the team would look at “what capability, context, or structure is missing?”The result was Symphony, “a ghost library” and reference Elixir implementation (by Alex Kotliarskyi) that sets up a massive system of Codex agents all extensively prompted with the specificity of a proper PRD spec, but without full implementation:The future starts taking shape as one where coding agents stop being copilots and start becoming real teammates anyone can use and Codex is doubling down on that mission with their Superbowl messaging of “you can just build things”.Across Codex, internal observability stacks, and the multi-agent orchestration system his team calls Symphony, Ryan has been pushing what happens when you optimize an entire codebase, workflow, and organization around agent legibility instead of human habit.We sat down with Ryan to dig into how OpenAI's internal teams actually use Codex, why the real bottleneck in AI-native software development is now human attention rather than tokens, how fast build loops, observability, specs, and skills let agents operate autonomously, why software increasingly needs to be written for the model as much as for the engineer, and how Frontier points toward a future where agents can safely do economically valuable work across the enterprise.We discuss:* Ryan's background from Snowflake, Brex, Stripe, and Citadel to OpenAI Frontier Product Exploration, where he works on new product development for deploying agents safely at enterprise scale* The origin of “harness engineering” and the constraint that kicked off the whole experiment: Ryan deliberately refused to write code himself so the agent had to do the job end to end* Building an internal product over five months with zero lines of human-written code, more than a million lines in the repo, and thousands of PRs across multiple Codex model generations* Why early Codex was painfully slow at first, and how the team learned to decompose tasks, build better primitives, and gradually turn the agent into a much faster engineer than any individual human* The obsession with fast build times: why one minute became the upper bound for the inner loop, and how the team repeatedly retooled the build system to keep agents productive* Why humans became the bottleneck, and how Ryan's team shifted from reviewing code directly to building systems, observability, and context that let agents review, fix, and merge work autonomously* Skills, docs, tests, markdown trackers, and quality scores as ways of encoding engineering taste and non-functional requirements directly into context the agent can use* The shift from predefined scaffolds to reasoning-model-led workflows, where the harness becomes the box and the model chooses how to proceed* Symphony, OpenAI's internal Elixir-based orchestration layer for spinning up, supervising, reworking, and coordinating large numbers of coding agents across tickets and repos* Why code is increasingly disposable, why worktrees and merge conflicts matter less when agents can resolve them, and what it really means to fully delegate the PR lifecycle* “Ghost libraries”, spec-driven software, and the idea that a coding agent can reproduce complex systems from a high-fidelity specification rather than shared source code* The broader future of Frontier: safely deploying observable, governable agents into enterprises, and building the collaboration, security, and control layers needed for real-world agentic workRyan Lopopolo* X: https://x.com/_lopopolo* Linkedin: https://www.linkedin.com/in/ryanlopopolo/* Website: https://hyperbo.la/contact/Timestamps00:00:00 Introduction: Harness Engineering and OpenAI Frontier00:02:20 Ryan's background and the “no human-written code” experiment00:08:48 Humans as the bottleneck: systems thinking, observability, and agent workflows00:12:24 Skills, scaffolds, and encoding engineering taste into context00:17:17 What humans still do, what agents already own, and why software must be agent-legible00:24:27 Delegating the PR lifecycle: worktrees, merge conflicts, and non-functional requirements00:31:57 Spec-driven software, “ghost libraries,” and the path to Symphony00:35:20 Symphony: orchestrating large numbers of coding agents00:43:42 Skill distillation, self-improving workflows, and team-wide learning00:50:04 CLI design, policy layers, and building token-efficient tools for agents00:59:43 What current models still struggle with: zero-to-one products and gnarly refactors01:02:05 Frontier's vision for enterprise AI deployment01:08:15 Culture, humor, and teaching agents how the company works01:12:29 Harness vs. training, Codex model progress, and “you can just do things”01:15:09 Bellevue, hiring, and OpenAI's expansion beyond San FranciscoTranscriptRyan Lopopolo: I do think that there is an interesting space to explore here with Codex, the harness, as part of building AI products, right? There's a ton of momentum around getting the models to be good at coding. We've seen big leaps in like the task complexity with each incremental model release where if you can figure out how to collapse a product that you're trying to.Build a user journey that you're trying to solve into code. It's pretty natural to use the Codex Harness to solve that problem for you. It's done all the wiring and lets you just communicate in prompts. To let the model cook, you have to step back, right? Like you need to take a systems thinking mindset to things and constantly be asking, where is the Asian making mistakes?Where am I spending my time? How can I not spend that time going forward? And then build confidence in the automation that I'm putting in place. So I have solved this part of the SDLC.swyx: [00:01:00] All right.[00:01:03] Meet Ryan swyx: We're in the studio with Ryan from OpenAI. Welcome.Ryan Lopopolo: Hi,swyx: Thanks for visiting San Francisco and thanks for spending some time with us.Ryan Lopopolo: Yeah, thank you. I'm super excited to be here.swyx: You wrote a blockbuster article on harness engineering. It's probably going to be the defining piece of this emerging discipline, huh?Ryan Lopopolo: Thank you. It is it's been fun to feel like we've defined the discourse in some sense.swyx: Let's contextualize a little bit, this first podcast you've ever done. Yes. And thank you for spending with us. What is, where is this coming from? What team are you in all that jazz?Ryan Lopopolo: Sure, sure.Ryan Lopopolo: I work on Frontier Product Exploration, new product development in the space of OpenAI Frontier, which is our enterprise platform for deploying agents safely at scale, with good governance in any business. And. The role of VMI team has been to figure out novel ways to deploy our models into package and products that we can sell as solutions to enterprises.swyx: And you have a background, I'll just squeeze it in there. Snowflake, brick, [00:02:00] stripe, citadel.Ryan Lopopolo: Yes. Yes. Same. Any kind of customerswyx: entire life. Yes. The exact kind of customer that you want to,Vibhu: so I'll say, I was actually, I didn't expect the background when I looked at your Twitter, I'm seeing the opposite.Stuff like this. So you've got the mindset of like full send AI, coding stuff about slop, like buckling in your laptop on your Waymo's. Yes. And then I look at your profile, I'm like, oh, you're just like, you're in the other end too. Oh, perfect. Makes perfect.Ryan Lopopolo: I it's quite fun to be AI maximalist if you're gonna live that persona.Open eye is the place to do it. And it'sswyx: token is what you say.Ryan Lopopolo: Yeah. Certainly helps that we have no rate limits internally. And I can go, like you said, full send at this stay.swyx: Yeah. Yeah. So the Frontier, and you're a special team within O Frontier.Ryan Lopopolo: We had been given some space to cook, which has been super, super exciting.[00:02:47] Zero Code ExperimentRyan Lopopolo: And this is why I started with kind of a out there constraint to not write any of the code myself. I was figuring if we're trying to make agents that can be deployed into end to enterprises, they should be [00:03:00] able to do all the things that I do. And having worked with these coding models, these coding harnesses over 6, 7, 8 months, I do feel like the models are there enough, the harnesses are there enough where they're isomorphic to me in capability and the ability to do the job.So starting with this constraint of I can't write the code meant that the only way I could do my job was to get the agent to do my job.Vibhu: And like a, just a bit of background before that. This is basically the article. So what you guys did is five months of working on an internal tool, zero lines of code over a mi, a million lines of code in the total code base.You say it was cenex, more like it was cenex faster than you would've. If you had done it by end. SoRyan Lopopolo: yeah, thatVibhu: was the mindset going into this, right?Ryan Lopopolo: That's right.[00:03:46] Model Upgrades LessonsRyan Lopopolo: Started with some of the very first versions of Codex CLI, with the Codex Mini model, which was obviously much less capable than the ones we have today.Which was also a very good constraint, right? Quite a visceral feeling to ask the [00:04:00] model to build you a product feature. And it just not being able to assemble the pieces together.Which kind of defined one of the mindsets we had for going into this, which is whenever the model just cannot, you always pop open at the task, double click into it, and build smaller building blocks that then you can reassemble into the broader objective.And it was quite painful to do this. Honestly, the first month and a half was. 10 times slower than I would be. But because we paid that cost, we ended up getting to something much more productive than any one engineer could be because we built the tools, the assembly station for the agent to do the whole thing.[00:04:43] Model Generations, Build Systems & Background ShellsRyan Lopopolo: But yeah, so onward to G BT 5, 5, 1, 5, 2, 5, 3, 5 4. To go through all these model generations and see their kind of corks and different working styles also meant we had to adapt the code base to change things up when the model was revved. [00:05:00] One interesting thing here is five two, the Codex harness at the time did not have background shells in it, which means we were able to rely on blocking scripts to perform long horizon work.But with five, three and background shells, it became less patient, less willing to block. So we had to retool the entire build system to complete in under a minute and. This is not a thing I would expect to be able to do in a code base where people have opinions. But because the only goal was to make the Asian productive over the course of a week, we went from a bespoke make file build to Basil, to turbo to nx and just left it there because builds were fast at that point.swyx: Interesting. Talk more about Turbo TenX. That's interesting ‘cause that's the other direction that other people have been doing.Ryan Lopopolo: Ultimately I have. Not a lot of experience with actual frontend repo architecture.swyx: You're talking that Jessica built the sky. So I'm like, I know the NX team. I know Turbo from Jared [00:06:00] Palmer.And I'm like, yeah, that's an interesting comparison.[00:06:02] One Minute Build LoopRyan Lopopolo: The hill we were climbing right, was make it fast.swyx: Is there a micro front end involved? Is it how how complex reactRyan Lopopolo: electron base single app sort of thingswyx: And must be under a minute. That's an interesting limitation. I'm actually not super familiar with the background shelf stuff.Probably was talked about in the fight three release.Ryan Lopopolo: BA basically means that codex is able to spawn commands in the background and then go continue to work while it waits for them to finish. So it can spawn an expensive build and then continue reviewing the code, for example.swyx: Yeah.Ryan Lopopolo: And this helps it be more time efficient for the user invoking the harness.swyx: And I guess and just to really nail this, like what does one minute matter? Like why not five, okay, good. We want no. WeRyan Lopopolo: want the inner loop to be as fast as possible. Okay. One minute was just a nice round number and we were able to hit it.swyx: And if it doesn't complete, it kills it or some something,Ryan Lopopolo: No.We just take that as a signal that we need to stop what we're doing, double click, decompose a build graph a bit to get us to high back under so that we [00:07:00] can able the agent continue to operate.swyx: It's almost like you're, it's like a ratchet. It's like you're forcing build time discipline, because if you don't, it'll just grow and grow.That's right. And you mentioned that my current, like the software I work on currently is at 12 minutes. It sucks.Ryan Lopopolo: This has been my experience with platform teams in the past, where you have an envelope of acceptable build times and you let it go up to breach and then you spend two, three weeks to bring it back down to the lower end of the average low bed stop.But because tokens are so cheap Yeah. And we're so insanely parallel with the model, we can just constantly be gardening this thing to make sure that we maintain these in variants, which means. There's way less dispersion in the code and the SDLC, which means we can simplify in a way and rely on a lot more in variance as we write the software.[00:07:45] Observability, Traces & Local Dev StackVibhu: Lovely.[00:07:46] Humans Are BottleneckVibhu: You mentioned in your article, like humans became the bottleneck, right? You kicked off as a team of three people. You're putting out a million line of code, like 1500 prs, basically. What's the mindset there? So as much as code is disposable, you're doing a lot of review. A lot [00:08:00] of the article talks about how you wanna rephrase everything is prompting everything, is what the agent can't see.It's kind of garbage, right? You shouldn't have it in there. So what's like the high level of how you went about building it, and then how you address okay, humans are just PR review. Like how is human in the loop for this?Ryan Lopopolo: We've moved beyond even the humans reviewing the code as well.[00:08:19] Human Review, PR Automation & Agent Code ReviewRyan Lopopolo: Most of the human review is post merge at this point.But post, post merge, that's not even reviewed. That's justswyx: Oh, let's just make ourselves happy by YouRyan Lopopolo: haven't used fundamentally. The model is trivially paralyzable, right? As many GPUs and tokens as I am willing to spend, I can have capacity to work with my hood base.The only fundamentally scarce thing is the synchronous human attention of my team. There's only so many hours in the day we have to eat lunch. I would like to sleep, although it's quite difficult to, stop poking the machine because it makes me want to feed it. You have to step back, right?Like you need to take a systems thinking mindset to things and [00:09:00] constantly be asking where is the agent making mistakes? Where am I spending my time? How can I not spend that time going forward? And then build confidence in the automation that I'm putting in place. So I have solved this part of the SDLC, and usually what that has looked like is like we started needing to pay very close attention to the code because the agent did not have the right building blocks to produce.Modular software that decomposed appropriately that was reliable and observable and actually accrued a working front end in these things, right?[00:09:35] Observability First SetupRyan Lopopolo: So in order to not spend all of our time sitting in front of a terminal at most, doing one or two things at a time, invested in giving the model that observability, which is that that graph in the post here.swyx: Yeah. Let's walk through this traces and which existed firstRyan Lopopolo: we started with just the app and the whole rest of it. From vector through to all these login metrics, APIs was, I dunno, half an [00:10:00] afternoon of my time. We have intentionally chosen very high level fast developer tools. There's a ton of great stuff out there now.We use me a bunch, which makes it trivial to pull down all these go written Victoria Stack binaries in our local development. Tiny little bit of python glue to spin all these up. And off you go. One neat thing here is we have tried to invert things as much as possible, which is instead of setting up an environment to spawn the coding agent into, instead we spawn the coding agent, like that's the entry point.It's just Codex. And then we give Codex via skills and scripts the ability to boot the stack if it chooses to, and then tell it how to set some end variables. So the app and local Devrel points at this stack that it has chosen to spin up. And this I think is like the fundamental difference between reasoning models and the four ones and four ohs of the past, where these models could not think so you had to put them in [00:11:00] boxes with a predefined set of state transitions.Whereas here we have the model, the harness be the whole box. And give it a bunch of options for how to proceed with enough context for it to make intelligent choices. SoVibhu: sales, so like a lot of that is around scaffolding, right? Yes. Previous agents, you would define a scaffold. It would operate in that.Lube, try again. That's pivoted off from when we've had reasoning models. They're seeming to perform better when you don't have a scaffold, right? That's right.[00:11:28] Docs Skills GuardrailsVibhu: And you go into like niches here too, like your SPEC MD and like having a very short agent MG Agent md.swyx: Yes. Yes.Vibhu: Yeah. So you even lay out what it is here, but I likeswyx: the table contents.Vibhu: Yeah.swyx: Like stuff like this, it really helps guide people because everyone's trying to do this.Ryan Lopopolo: This structure also makes it super cheap to put new content into the repository to steer both the humans and the agents.swyx: You, you reinvented skills, right?Vibhu: One big agents andswyx: skills from first princip holdsRyan Lopopolo: all skills did not exist when we started doing this.Vibhu: You have a short [00:12:00] one 100 line overall table of contents and then you have little skills, right? Core beliefs, MD tech tracker. Yeah. Yeah. The scale is overRyan Lopopolo: The tech jet tracker and the quality score are pretty interesting because this is basically a tiny little scaffold, like a markdown table, which is a hook for Codex to review all the business logic that we have defined in the app, assess how it matches all these documented guardrails and propose follow up work for itself.Before beads and all these ticketing systems, we were just tracking follow up work as notes in a markdown file, which, we could spa an agent on Aron to burn down. There's this really neat thing that like the models fundamentally crave text. So a lot of what we have done here is figure out ways to inject textswyx: intoRyan Lopopolo: the system right when we get a page, because we're missing a timeout, for example.I can just add Codex in Slack on that page and say, I'm gonna fix this by adding a timeout. Please update our reliability documentation. To require that all network calls have [00:13:00] timeouts. So I have not only made a point in time fix, but also like durably encoded this process knowledge around what good looks like.swyx: Yeah.Ryan Lopopolo: And we give that to the root coding agent as it goes and does the thing. But you can also use that to distill tests out of, or a code review agent, which is pointed at the same things to narrow the acceptable universe of the code that's produced.swyx: I think one of the concerns I have with that kind of stuff is you think you're making the right call by making, it's persisted for all time across everything.Yes. But then you didn't think about the exceptions that you need to make, right? And that you have to roll it back.Vibhu: Part of it isswyx: also sometimes it can follow your s instructions too.Vibhu: It's somewhat a skill, right? So it determines when it uses the tools, right? Like it's not like it'll run outta every call.It'll determine when it wants to check quality score, right?Ryan Lopopolo: Yeah. And we do in the prompts we give these agents, allow them to push back,[00:13:51] Agent Code Review RulesRyan Lopopolo: When we first started adding code review agents to the pr, it would be Codex, CLI. Locally writes the change, pushes up a PR on [00:14:00] those PR synchronizations of review agent fires.It posts a comment. We instruct Codex that it has to at least acknowledge and respond to that feedback. And initially the Codex driving the code author was willing to be bullied by the PR reviewer, which meant you could end up in a situation where things were not converging. So yeah, we had to,swyx: he's just a thrash.Ryan Lopopolo: We had to add more optionality to the prompts on both of these things, right? The reviewer agents were instructed to bias toward merging the thing to not surface anything greater than a P two in priority. We didn't really define P two, but we gave it, youswyx: did define P two.Ryan Lopopolo: We gave it a framework within which to score its outputswyx: and then greater than P zero is worse, right?Yes. P two is very good.Ryan Lopopolo: P zero is you will mute the code place ifswyx: you merch thisRyan Lopopolo: thing, right?swyx: Yeah.Ryan Lopopolo: But also on the code authoring agent side, we also gave it the flexibility to either defer or push back against review feedback, right? This happens all the time, right? Like I happen to notice something and leave a code review, [00:15:00] which.Could blow up the scope by a factor of two. I usually don't mean for that to be addressed Exactly. In the moment. It's more of an FYI file it to the backlog, pick it up in the next fix it week sort of thing. And without the context that this is permissible, the coding agents are gonna bias toward what they do, which is following instructions.swyx: Yeah.[00:15:19] Autonomous Merging Flowswyx: I do wanted to check in on a couple things, right? Sure. All the coding review agent, it can merge autonomously. I think that's something that a lot of people aren't comfortable with. And you have a list here of how much agents do they do Product code and tests, CI configuration and release tooling, internal Devrel tools, documentation eval, harness review, comments, scripts that manage the repository itself, production dashboard definition files, like everything.Yes. And so they're just all churning at the same time, is there like a record that, that any human on the team pulls to stop everythingRyan Lopopolo: Because we are building a native application here. We're not doing continuous deploy. So there's still a human in the loop for cutting the release branch.I see. We require a blessed [00:16:00] human approved smoke test of the app before we promote it to distribution, these sort of things.swyx: So you're working on the app, you're not building like infrastructure where you have like nines of reliability, that kinda stuff?Ryan Lopopolo: That's correct. That's correct. Okay. And also like full recognition here that all of this activity took in a completely greenfield repository.There's. Should be no script that this applies generally toswyx: this is a production thing, you're gonna shipRyan Lopopolo: toswyx: customers. Of course. Yeah, of course. So this is realVibhu: And like one of the things there is, you mentioned you started this as a repo from scratch. The onboarding first month or so was pretty, it was like working backwards, right?Yeah. And then you had to work with the system and now you're at that point where you know, you're very autonomous. I'm curious like, okay, so what, how human in the loop is it? So what are the bottlenecks that you wish you could still automate? And part of that is also like, where do you see the model trajectory improving and offloading more human in the loop?We just got 5.4. It's a really good,Ryan Lopopolo: fantastic model, by the way.Vibhu: Yeah. Yeah. It's the first one that's merged. Top tier coding. So it's codex level coding and reasoning. So general reasoning both in one model. SoRyan Lopopolo: andVibhu: computer [00:17:00] use vision.Ryan Lopopolo: Now we now with five four, I can just have Codex write the blog post, whereas for this one I had to balance between chat.swyx: Oh, I need to, I might be out of a job. Oh my God.Ryan Lopopolo: Oh,swyx: I know. You just gave me an idea for a completely AI newsletter that five four could do. Yeah, I get it Now.Ryan Lopopolo: This sort of thing is just one example of closing the loop, right? Like the dashboard thing you mentioned. We have Codex authoring the Js ON, for the Grafana dashboards and publishing them and also responding to the pages, which means when it gets the page, it knows exactly which dashboards are defined and what alerts.What alert was triggered by which exact log in the code base. ‘cause all of this stuff is collated together.swyx: It has to own everything.Yes. Yeah. Yeah.Ryan Lopopolo: And it means that if we have an outage that did not result in a page. It has the existing set of dashboards available to it. It has the existing set of metrics and logs and can figure out where the gaps in the dashboard are or [00:18:00] in the underlying metrics and fix them in one go.In the same way, you would have a full stack engineer be able to drive a feature from the backend all the way to the front end.Vibhu: So it, it seems like a lot of the work you guys had to do was you as a small team are fully working for a way that the model wants the software to be written. It's like less human legible for better. Code legibility, agent legibility. How do you think that affects broader teams? So one at OpenAI, do liaison, like this is how software should be written. Like I can imagine, say you join a new team with this methodology, this mindset there's ways that, teams do code review, teams write code, like teams are structured and a lot of it is for human legibility.So should we all swap? Like how does this play back one broader into OpenAI and then like broader into the software engineering, right? Is it like teams that pick this up will it's pretty drastic, right? You have to make a pretty big switch. Should they just full send Yeah.Ryan Lopopolo: The mindset is very much that I'm removed from the process, right? I can't really have deep code level opinions about [00:19:00] things. It's as if I'm. Group tech leading a 500 person organization.Vibhu: Yeah.Ryan Lopopolo: Like it's not appropriate for me to be in the weeds on every pr. This is why that post merge code review thing is like a good analog here, right?Like I have some representative sample of the code as it is written, and I have to use that to infer what the teams are struggling with, where they could use help, where they're already moving quickly and I can pivot my focus elsewhere.Vibhu: Yeah.Ryan Lopopolo: So I don't really have too many opinions around the code as it is written.I do, however, have a command based class, which is used to have repeatable chunks of business logic that comes with tracing and metrics and observability for free. And the thing to focus on is not how that business logic is structured, but that it uses this primitive ‘cause I know that's gonna give leverage by default.Vibhu: Yeah.Ryan Lopopolo: Yeah, back to that sort of systems stinking,Vibhu: and you have part of that in your blog post, enforcing architecture and ta taste how you set boundaries for what's used. There's also a section on redefining [00:20:00] engineering and stuff, but yeah, it's just, it's interesting to hear,Ryan Lopopolo: and as the models have gotten better, they have gotten better at proposing these abstractions to unblock themselves, which again, lets me move higher and higher up the stack to look deeper into the future on what ultimately blocked the team from shipping.swyx: Yeah. You mentioned so you, this is primarily a, it is like a 1 million line of code base electron app. But it manages its own services as well, so it's like a backend for front end type thing.Ryan Lopopolo: We do have a backend in there, but that's hosted in the cloud.Yeah. This sort of structure is actually within the separate main and render processesWithin theswyx: electric.That's just how electronic works.Ryan Lopopolo: Yeah, of course. So have also treated like. MVC style decomposition with the same level of rigor, which has been very fun.swyx: I have a fun pun. This is a tangent, NVC is model view controller. Any sort of full stack web Devrel knows that.But my AI native version of this is Model view Claw, the clause the harness.Ryan Lopopolo: That's right. That's right. I do think that there is an interesting space to [00:21:00] explore here with Codex, the harness as part of building AI products, right? There's a ton of momentum around getting the models to be good at coding.We've seen big leaps in like the task complexity with each incremental model release where if you can figure out how to collapse a product that you're trying to build, a user journey that you're trying to solve into code, it's pretty natural to use the Codex Harness to solve that problem for you. It's done all the wiring and lets you just communicate and prompts to let the model cook.Yeah. It's been very fun. And there's also a very engineering legible way of increasing capabil. It's fantastic, right? Yeah. Just give you, just give the model scripts, the same scripts you would already build for yourself.swyx: Yeah.Yeah. So for listeners, this is Ryan saying that software engineering or coding against will eat knowledge work like the non-coding parts that you would normally think.Oh, you have to build a separate agent for it. No, start a coding agent and go out from there. Which open Claw has like it's pie Underhood.Ryan Lopopolo: [00:22:00] Yes.Vibhu: Basically define your task in code. Everything is a codingswyx: agent by the way. Since I brought it up, it's probably the only place we bring it up. Is any open claw usage from you?Any?Ryan Lopopolo: No. No. Not for me. I don't have any spare Mac Minis rattling around my house.swyx: You can afford it? No. I just, I'm curious if it's changed anything in opening eye yet, but it's probably early days. And then the other, the other thing I, I wanna pull on here is like you mentioned ticketing systems and you mentioned prs and I'm wondering if both those things have to go away or be reinvented for this kind of coding.So the git itself and is like very hostile to multi-agent.Ryan Lopopolo: Yeah. We make very heavy use of work trees.swyx: But like even then, like I just did a, dropped a podcast yesterday with Cursors saying, and they said they're getting rid of work trees ‘cause it still has too many merge conflicts.It's still un too un unintuitive. But go ahead.Ryan Lopopolo: The models are really great at resolving merge conflicts. Yeah. And to get to a state where I'm not synchronously in the loop in my terminal, I almost don't care that there are mergeswyx: with disposable.[00:23:00] Yeah.Ryan Lopopolo: We invoke a dollar land skill and that coaches codex to push the PR Wait for human and agent reviewers Wait for CI to be green.Fix the flakes if there are any merged upstream. If the PR comes into conflict, wait for everything to pass. Put it in the merge queue. Deal with flakes until it's in Maine. End. This is what it means to delegate fully, right? This is in a, very large model re probably a significant tax on humans to get PRS merged, but the agent is more than capable of doing this and I really don't have to think about it other than keep my laptop open.swyx: Yeah. I used to be much more of a control freak, but now I'm like, yeah, actually you could do a better job of this than me. Yeah. With the right context. Yes.[00:23:47] Encoding Requirementsswyx: Anything else in harness in general? Just this piece, I just wanna make sure we,Ryan Lopopolo: I think one thing that I maybe didn't make super clear in the article that I heard on Twitter as an interesting, that's respond [00:24:00]swyx: to them.What's the chatter and then what's your response?Ryan Lopopolo: Ultimately, all the things that we have encoded in docs and tests and review agents and all these things are ways to put all the non-functional requirements of building high scale, high quality, reliable software into a space that prompt injects the agent.We either write it down as docs, we add links where the error messages tell how to do the right thing. So the whole meta of the thing is to basically tease out of the heads of all the engineers on my team, what they think good looks like, what they would do by default, or what they would coach a new hire on the team to do to get things to merch.And that's why we pay attention to all the mistakes, mistakes that the agent makes, right? This is code being written that is misaligned with some as yet not written down, non-functional requirement.swyx: Sorry, what? Did the online people misunderstand orRyan Lopopolo: No,swyx: whatyouRyan Lopopolo: responded to? Somebody just literally said that.I was like, oh yeah,swyx: okay,Ryan Lopopolo: This is the [00:25:00] thing. This is what I've been doing. Oh, youswyx: agree? Yeah. I see. Interesting.Ryan Lopopolo: One other neat thing, which I did totally did not expect is folks were just. Taking the link to the article and giving it to pi or Codex and say, make my repo this,Vibhu: you achi a whole recursion.Ryan Lopopolo: And it was wildly effective. Really? It was wildly effective. NoVibhu: way. It just actually is something I tried with five, four yesterday. I didn't have time. Last time I was like out speaking of something, and this is one of my things, I was like, okay, I have this article. Can we just scaffold out what it would be like to run this?And I, I did it first as that and then I was like, okay, let me take another little side repo and say okay, if I was to fully automate this like this because I haven't written a line of code, it'sRyan Lopopolo: like over full, setVibhu: it right. The side thing I'm doing of voice. TTS I'm just like, slobbing out, whatever.It's nothing production. I'm like, how would I make this like this? And it's actually like a really good way. It's like a good way to learn what could be changed, what could be like, it's just a good analyzing, right? You give it all the codes, you give it all the context, you give it the article and it walks you through it very well.That's right. That's right.[00:25:57] Inlining Dependencies[00:25:57] Dependencies Going Away & Brett Taylor's Responseswyx: I guess one more thing before we go to Symphony is I wanted to cover [00:26:00] Brett Taylor's response. We had him on the show. He is your chairman, which is wild. Yeah. That he's reading your articles as well and like getting engaged in it. He says software dependencies are going away.Basically they can just be like vendored. Yes. Response.Ryan Lopopolo: Aswyx: hundred percent. A hundred percent agree. You still pro qr, you still pay Datadog. You still pay Temporal. Thank you.Ryan Lopopolo: Yep. The level of complexity of the dependencies that we can internalize is, I would say low, medium right now. Just based on model capability.What does the,swyx: what is medium?Ryan Lopopolo: I would say like a. A couple thousand line dependency is a thing that we could in-house No problem. Call in an afternoon of time. One neat thing about it is like probably most of that code you don't even need. Like by in-house and abstraction, you can strip away all the generic parts of it and only focus on what you need to enable the specific thing.Yes. You're building,swyx: I've been calling this the end of b******t plugins.Ryan Lopopolo: Yeah.swyx: Because there's so much when I published an open source thing, I want to accept everything, be liberal. I want to accept, this is post's law, but that means there's so much bloat. Yes. There's so much overhead.Ryan Lopopolo: One other neat thing about [00:27:00] this too is when we deploy Codex Security on the repo, it is able to deeply review and change. The internalized dependencies in a much lower friction way than it would be to like, push patches upstream, wait for them to be released, pull them down, make sure that's compatible with all the transitive I have in my repo and things like that.So it's also much lower friction to internalize some of these things if code is free. ‘cause the tokens are cheap sort of thing.swyx: Yeah. Yeah. I think like the only argument I have against this is basically scale testing, which obviously the larger pieces of software like Linux, MySQL, he calls up even the Datadog and Temporals and then maybe security testing where Yes.Classically, I think, is it linis tos, it said security open source is the best disinfectant.Ryan Lopopolo: Many eyes.swyx: Many eyes. And if inline your dependencies and code them up, you're gonna have to relearn mistakes from other people that Yep.Ryan Lopopolo: Yep. And to internalize that dependency, you're back to zero and you have to start.Reassembling all those bits and pieces to Yeah. Have [00:28:00] high confidence in the code as it is written. Yeah.Vibhu: Even part of the first intro of this, you basically mentioned like everything was written by codex, including internal tooling, right? So internal tooling, like when you're visualizing what's going on it's writing it for itself.swyx: Yeah. I'm built internal tools way I now, and like I just show them off and they're like, how long did you spend? And I didn't spend any time. I just prompted it,Ryan Lopopolo: very funny story here.swyx: Yeah, go ahead.Ryan Lopopolo: We had deployed our app to the first dozen users internally had some performance issues, so we asked them to export a trace for us get a tar ball, gave it to our on-call engineer, and he did a fantastic job of working with Codex to build this beautiful local Devrel tool, next JS app, the drag and drop the tar ball in, and it visualizes the entire trace.It's fantastic. Took an afternoon, but none of this was necessary. Because you could just spin up codex and give it the tar ball and ask the same thing and get the response immediately. So in a way, optimizing for human [00:29:00] legibility of that debugging process was wrong. It kept him in the loop unnecessarily when instead he could have just like Codex cooked for five minutes and gotten this same.swyx: Yeah, you verify your instincts here of this is how we used to do it. Or this is how I would have used to solve it.Ryan Lopopolo: Yeah. In this local observability stack. Like sure, you can de deploy Yeager to visualize the traces, but I wouldn't expect to be looking at the traces in the first place because I'm not gonna write the code to fix them.swyx: Yeah. So basically there needs to be like this kind of house stack and owning the whole loop. I think that is very well established. And it sounds like you might be like sharing more about that in the future, right?Ryan Lopopolo: Yeah. I think we're excited to do[00:29:36] Ghost Libraries Specs[00:29:36] Ghost Libraries & Distributing Software as SpecsRyan Lopopolo: We're gonna talk about Symphony in a little bit, but like the way we distribute it as a spec, which I think folks are calling Ghost Libraries on Twitter.This is like a such a cool name. It does mean it becomes much cheaper to share software with the world, right? You define a spec, how you could build your own specifying as much as is required for a coding agent to reassemble it [00:30:00] locally. The flow here is very cool. Like we have taken. All the scaffolding that has existed in our proprietary repo spun up a new one.Ask Codex with our repo as a reference. Write the spec. We tell it. Spin up a team ox spawn a disconnected codex to implement the spec. Wait for it to be done. Spawn another codex and another team ox to review the spec com or review the implementation compared to upstream and update the spec so it diverges less.And then you just loop over and over Ralph style until you get a spec that is with high fidelity able to reproduce the system as it is. It's fantastic.Vibhu: And you're basically, you're not really adding any of your human bias in there, right? That's correct. A lot of times people write a spec and be like, okay, I think it should be done this way, and you'll riff on something.And it's no, the agent could have just handled it like you're still scaffolding in a sense, right? I want it done this way. It can determine its spec better.swyx: That's right. That's right. Part of me it, I'm, I've been working a lot on evals recently, and part of me is wondering if [00:31:00] an agent can produce a spec that it cannot solve.Is it always capable of things that he can imagine or can you imagine things that it is impossible to do?Ryan Lopopolo: I think with Symphony, we, there's like this there's this axis where you have things that are easier, hard, or established or new, right? And I think things that are hard and new is still something that the models need humans.Yeah. Drive.swyx: Yeah. Yeah.Ryan Lopopolo: But I think those other quadrants are largely salt. Given the right scaffold and the right thing that's gonna drive the agent to completion,swyx: it's crazy that it solved,Ryan Lopopolo: but it means that the humans, the ones with limited time and attention get to work on the hardest stuff, like the problems where it's pure white space out in front. Or like the deepest refactorings where you don't know what the proper shape of the interfaces are. And this is where I wanna spend my time. ‘cause it lets me set up for the next level of scale.swyx: Yeah. Yeah. Amazing. Let's introduce Symphony.I think we've been mentioning it every now and then. Elixir. Interesting option.Ryan Lopopolo: Yeah.swyx: Yeah. I'm not,Ryan Lopopolo: again, like the [00:32:00] elixir manifestation here is just a derivative. Is it a modelswyx: chosen? Yeah.Ryan Lopopolo: Yeah. Yeah. And it chose that because the process supervision and the gen servers are super amenable to the type of process orchestration that we're doing here.You are essentially spinning up little Damons for every task that is in execution and driving it to completion, which. Means the mall gets a ton of stuff for free by using Elixir and the Beam.swyx: I had to go do a crash course in Beam and Elixir, and I think most people are not operating at that scale of concurrency where you need that.But it is a good mental model for Resum ability and all those things. And these are things I care about. But tell me the story, the origin story of Symphony. What do you use it for? Is this, how did it form maybe any abandoned paths that you didn't take?[00:32:46] Terminal Free Orchestration[00:32:46] Symphony: Removing Humans from the LoopRyan Lopopolo: At the end of December we were at about three and a half PRS per engineer per day.This was before five two came out in the beginning of January. Everyone gets back from holiday with five two and no other work [00:33:00] on the repository. We were up in the five to 10 PRS per day per engineer. And I don't know about y'all, but like it's very taxing to constantly be switching like that. Like I was pretty tapped out at the end of the day, again, where are the humans spending their time? They're spending their time context switching between all these active tmox pains to drive the agent forward.swyx: Yeah. No way. Yeah.Ryan Lopopolo: So let's again, build something to remove ourselves from the loop. And this is what frantic sprinted adapt here to find a way to remove the need for the human to sit in front of their terminal.So a lot of experimentation with Devrel boxes and, automatically spinning up agents, like it seems like a fantastic end state here, where my life is beach. I open live twice a day and say yes no to these things. Yeah. And this is again, a super, super interesting framing for how the work is done.Because I become more latency and sensitive. I have [00:34:00] way less attachment to the code as it is written. Like I've had close to zero investment in the actual authorship experience. So if it's garbage. I can just throw it away and not care too much about it. In Symphony, there's this like rework state where once the PR is proposed and it's escalated to the human for review, it should be a cheap review.It is either mergeable or it is not. And if it's not, you move it to rework. The elixir service will completely trash the entire work tree NPR and start it again from scratch. Okay. And this is that opportunity again to say, why was it trash right? What did the agent do that wasswyx: bad. Yeah.Ryan Lopopolo: Fix that before moving the ticket toswyx: endRyan Lopopolo: of progress again.swyx: Yeah. Why is this not in codex app? I guess this, you guys are ahead of Codex app,Ryan Lopopolo: yeah, so the way the team has been working is basically to be as AI pilled as possible and spread ahead. And a lot of the things we have worked on have fallen out [00:35:00] into a lot of the products that we have.Like we were in deep consultation with the Codex team to. Have the Codex app be a thing that exists, right? To have skills be a thing that Codex is able to use. So we didn't have to roll our own to put automations into the product. So all of our automatic refactoring agents didn't have to be these hand rolled control loops.It has been really fantastic to be, in a way, un anchored to the product development of Frontier and Codex and just very quickly try to figure out what works and then later find the scalable thing that can be deployed widely. It's been a very fun way to operate. It's certainly chaotic. I have lost track very often of what the actual state of the code looks like.‘cause I'm not in the loop. There was. One point where we had wired playwright directly up to the Electron app. With MCPM CCPs, I'm pretty bearish on because the harness forcibly injects all those tokens in the [00:36:00] context, and I don't really get a say over it. They mess with auto compaction. The agent can forget how to use the tool.There's probably only what three calls in playwright that I actually ever want to use. So I pay the cost for a ton of things. Somebody vibed a local Damon that boots playwright and exposes a tiny little shim CLI to drive it. And I had zero idea that this had occurred because to me, I run Codex and it's able to, it's oh, it's better.Yeah. Like no knowledge of this at all. Uhhuh.[00:36:30] Multi Human ChaosRyan Lopopolo: So we have had like in human space to spend a lot of time doing synchronous knowledge sharing. We have a daily standup that's 45 minutes long because we almost have to. Fan out the understanding of the current state.swyx: Yeah, I was gonna say this is good for a single human multi-agent, but multi human, multi-agent is a whole like po like explosion of stuff.Ryan Lopopolo: Yeah. And that this is fundamentally why we have such a rigid, like 10,000 [00:37:00] engineer level architecture in the app because we have to find ways to carve up the space so people are not trampling on each other.swyx: Sorry, I don't get the 10,000 thing. Did I miss that?Ryan Lopopolo: The structure of the repository is like 500 NPM packages.It's like architecture to the excess for what you would consider, I think normal for a seven person team. But if every person is actually like 10 to 50. Then the like numbers on being super, super deep into decomposition and sharding and like proper interface boundaries make a lot more sense.swyx: Yeah. To me, that's why I talked about Microfund ends and I, an anex is from that world, but Cool. It is just coming back to, to, to this I dunno if you have other, thoughts on. Orchestrating so much work coin going through this. Is this enough? Is this like any aha moments?Vibhu: It'll be interesting to see like where, okay, so right now you pick linear as your issue tracker, right?swyx: Or it's like a is it actually linear? This is actually linear.[00:37:55] Linear vs Slack WorkflowVibhu: Oh, that's linear. It's linear.swyx: Oh I never looked atVibhu: video. The demo video I had to download to [00:38:00] run.swyx: So I, because I'm a Slack maxie, but Yeah, linear. Linear is also really good. Yes,Ryan Lopopolo: we do make a good use of Slack. We we fire off codex to do all these lotion, elasticity, fix ups, the things that like sync that knowledge into the repository.It's super cheap. Yeah.swyx: Yeah.Ryan Lopopolo: Just do it in Codex.swyx: My biggest plug is OpenAI needs to build Slack. You need to own Slack. Build yours. Turn this into Slack.Ryan Lopopolo: I did read about it. Youswyx: did?Ryan Lopopolo: Yeah.[00:38:25] Collaboration Tools for AgentsRyan Lopopolo: I would say that if we think that we want these agents to do economically valuable work, which is like this is the mission, right?We want AI to be deployed widely, to do economically valuable work, then we need to find ways for them to naturally collaborate with humans, which means collaboration tooling, I think, is an interesting space to explore.swyx: Yeah, totally. Yeah. GitHub, slack, linear.Vibhu: Yeah, that was my thing. Okay, where do we see right now Codex has started Codex Model, then CLI, now there's an app, app can let me shoot off multiple Codex is in parallel, but there's no great team collaboration for Codex.And it [00:39:00] seems like your team had some say into what comes out, right? So you talked to ‘em, codex kind of was a thing. From there, if you guys are on the bound, what stuff that like, you might not focus on, but what do you expect other people to be building, right? So people that are like five x 50 Xing.Should you build stuff that's like very niche for your workflow, for your team? Should it be more general so other people can adopt? Is there a niche there? ‘Cause part of it is just okay, is everything just internal tooling? Do we have everything our own way? Like the way our team operates has our own ways that we like to communicate or is there a broader way to do it?Is it something like a issue tracker? Just thoughts if you wanna riff on that.[00:39:35] Standardizing Skills and CodeRyan Lopopolo: I think TBD we have not figured this out in a general way. I do think that there is leverage to be had in making the code and the processes as much the same as possible. If you think that code is context, code is prompts, it's better from the agent behavior perspective to be able to look in a package in directory X, Y, Z, and it not to have to page so [00:40:00] deeply into directory if you C, because they have the same structure, use the same language, they have the same patterns internally.And that same like leverage comes from aligning on a single set of skills that you're pouring every engineer's taste into to make sure that the agent is effective. So like in our code base, we have, I think, six skills. That's it. And if some part of the software development loop is not being covered, our first attempt is to encode it in one of the existing setup skills, which means that we can change the agent behavior.Yeah. More cheaply than changing the human driver behavior.swyx: Yeah.[00:40:39] Self Improvement via Logsswyx: Have you ever, have you experimented with agents changing their own behavior?Ryan Lopopolo: We do.swyx: Yeah. Or parent agent changing a subagents, behavior or something like that.Ryan Lopopolo: We have some bits for skill distillation. So for example, there's one neat thing you can do with Codex, which is just point it at its own session logs to ask it to tell you how you can use [00:41:00] the tool pedal better.swyx: It's like introspectionRyan Lopopolo: or ask it to do things. I useVibhu: this session better. What skills should Iswyx: high? I like the modification of, you can do, just do things to you can just ask agent to do things.Ryan Lopopolo: Yeah. You can just codex things. This is like a, this is like a silly emoji that we have, right? You can just codex things, you can just prompt things.It's really glorious future we live in, but okay, you can do that one-on-one. But we're actually slurping these up for the entire team into blob storage and. Running agent loops over them every day to figure out where as a team can we do better and how do we reflect that back into the repositories?Yes, though everybody benefits from everybody else's behavior for free. Same for like PR comments, right? These are all feedback. That means the code as written, deviated from what was good, a PR comment, a failed build. These are all signals that mean at some point the agent was missing context. We gotta figure out how toswyx: Yeah.Ryan Lopopolo: Slurp it up and put it back in the reboot.swyx: By the way, I do this exactly right. I used to, when I use cloud code for [00:42:00] knowledge work, cloud cowork is like a nice product, right? Yes. In I think you would agree. I always have it tell me what do I do better next time? And that's the meta programming reflection thing.So I almost think like you have six reflection extraction levels in symphony and almost like the zero of layer. So the six levels are PO policy, configuration, coordination, execution, integration, observability. We've talked about a couple of these, but the zero layer is like the, okay, are we working well?Can we improve how we work? Yes. Can I modify my own workflow without MD or something? I don't know.Ryan Lopopolo: Yeah, of course. Yeah, of course you can. Like this thing is also able to cut its own tickets ‘cause we give it full access.Yeah. Make it a ticket to have it cut. Tickets you can.Put in the ticket that you expect it to file as on follow up work,swyx: like Yeah. Self-modifying. Yeah.Ryan Lopopolo: Yeah.[00:42:44] Tool Access and CLI FirstRyan Lopopolo: Put, don't put the agent in a box. Give the agent full accessibility over it. Domain.swyx: I had a mental reaction when you said don't put the agent in a box. So I think you should put it in a box. Like it's just that you're giving the box everything it needs.Ryan Lopopolo: Yeah. Context and tools.swyx: But we're like, as developers, we're used to calling [00:43:00] out to different systems, but here you use the open source things like the Prometheus, whatever, and you run it locally so that you can have the full loop. I assume.Ryan Lopopolo: Yep.Vibhu: I think likeRyan Lopopolo: another, you wanna minimize cloud, cloud dependencies.Vibhu: You also want to make sure that you think about what the agent has access to. What does it see? Does it go back into the loop, like from the most basic sense of you let it see its own like calls, traces it can determine where it went wrong. But are you feeding that back in? So you know, just the most basic level of you wanna see exactly what's input output, like does the agent have access to.What is being outputted, right? It can self-improve a lot of these things. It's allRyan Lopopolo: text, right? My job is to figure out ways to funnel text from one agent to the other.swyx: It's so strange like way back at the start of this whole AI wave Andre was like, English is the hottest day programming language.It's here, it's just Yeah. The feature as well.Vibhu: A lot of, okay. Like a lot of software, a lot of stuff. There's a gui, it's made for the human. We're seeing the evolution of CLI for everything, right? All tools have CLIs. Your agents can use [00:44:00] them well, do we get good vision? Do we get good little sandboxes?Like right now? It's a really effective way, right? Models love to use tools. They love the best. They love to read through text. So slap a CLI let it go loose. That works for everything.Ryan Lopopolo: It does. Yeah. Yeah.[00:44:14] UI Perception and RasterizingRyan Lopopolo: We've also been adapting nont, textual things to that shape in order to improve model behavior in some ways, right?We want the agent to be able to see the UI agents do not perceive visually in the same way that we do. They don't see a red box, they see red box button, right? They see these things in latent space. So if we want, Hey, yeah, I do. We haveswyx: a ding if that goes off every time. Alien spaceRyan Lopopolo: ding.Anyway if we wanna actually make it see the layout, it's almost easier to rasterize that image to ask EOR and feed it in to the agent. Ha. And there's no reason you can't do both, right? To like further refine how the model perceives the object it's [00:45:00] manipulating.swyx: Cool. Could we, you wanna talk about a couple more of these layers that might bear more introspection or that you have personal passion for?[00:45:07] Coordination Layer with ElixirRyan Lopopolo: I will say that the coordination layer here was a really tricky piece to get right.swyx: Let's do it. Yep. I'm all about that. And this is Temporal core.Ryan Lopopolo: This is where when we turn the spec into Elixir, where like the model takes a shortcut, right? Like it's oh, I have all these primitives that I can make use of in this lovely runtime that has native process supervision.Which is I think, a neat way to have taken the spec and made it more choices achievable by making choices that naturally mapswyx: Yeah.Ryan Lopopolo: To the domain, right? In the same way that like you would prefer to have a TypeScript model repo if you are doing full stack web development, right? Because the ability to share types across the front end and backend reduces a lot of complexity.And becauseswyx: that's what graph kill used to be.Ryan Lopopolo: That's right. Andswyx: I don't know if it's still alive, butRyan Lopopolo: [00:46:00] no humans in the loop here. So like my own personal ability to write or not write elixir. Doesn't really have to bias us away from using the right tool for the job. It is just wild.swyx: Love it. I love it.Yeah. I wonder if any languages struggle more than others because of this? I feel like everyone has their own abstractions. That would make sense. But maybe it might be slower, it might be more faulty where like you'd have to just kick the server every now and then. I, I don't know. I think observability layer is really well understood.Integration layer, CP is dead. I think all these just like a really interesting hierarchy to travel up and down. It's common language for people working on the system to understandRyan Lopopolo: The policy stuff is really cool, right? Yeah. You don't really have to build a bunch of code to make sure the system wait for the, to passswyx: it's institutional knowledge.Ryan Lopopolo: Yeah. You just give it the G-H-C-L-I with some text that say CI has to pass. It makes the maintenance of these systems a lot easier.[00:46:57] Agent Friendly CLI Outputswyx: Do you think that CLI maintainers need to be [00:47:00] do anything special for agents or just as is? It's good because like I don't think when people made the G GitHub, CLI, they anticipated this happening.Ryan Lopopolo: That's correct. The GH CLI is fantastic. It's great super industry.swyx: Everyone go try GH repo create GH pull and then pull request number, right? GH HPR, like 1 53, whatever. And then it like pullsRyan Lopopolo: basically my only interaction with the GitHub web UI at this point is GH PR view dash web.Exactly. Glanceswyx: at the diffRyan Lopopolo: and be like Sure thing. Send it. Yeah. But the CLI are nice ‘cause they're super token efficient and they can be made more token efficient really easily. Like I'm sure you all have seen like I go to build Kite or Jenkins and I could just get this massive wall of build output.And in order to unblock the humans, your developer productivity team is almost certainly gonna write some code that parses the actual exception out of the build logs and sticks it in a sticky note at the top of the page. And you basically [00:48:00] want CLI to be structured in a similar way, right? You're gonna want to patch dash silent to prettier because the agent doesn't care that every file was already formatted.Just wants to know it's either formatted or not. So it can then go run a right command. Similarly, like in our PNPM distributed script runner, when we had one, when you do dash recursive, like it produces a absolute mountain of text. But all of that is for passing. Test suites. So we ended up wrapping all of this in another scriptswyx: to suppress the,Ryan Lopopolo: which you can vibe the channel only output the failing parts of the tests.swyx: You make a pipe errors versus the standard, standard out. I don't know. Okay. Whatever. Too much thinking have to do that. The CII used to maintain SCLI for my company and yeah, this is like core, very core to my heart. But you're vibing my job.Ryan Lopopolo: That's right.swyx: Cool. Any other things?This is a long spec. [00:49:00] I appreciate that. It's got a lot of strong opinions in here. Any other things that we should highlight? I think obviously you can spend the whole day going through some of these, but I do think that some of these have a lot of care or some of this you might wanna tell people, Hey, take this, but, make it your own.[00:49:15] Blueprint Spec and GuardrailsRyan Lopopolo: Fundamentally, software is made more flexible when it's able to adapt to the environment in which it is deployed, which means that things like linear or GitHub even are specified within the spec, but not required pieces of it. There's like a more platonic ideal of the thing that you could swap in like Jira or Bitbucket, for example.But being able to tightly specify things like the ID formats or how the Ralph Loop works for the individual agents. Basically means you can get up and running with a fully specified system quickly that you then evolve later on. I think we never intended for this to be a static spec that you can [00:50:00] never change.It's more like a blueprint to get something worth a starting point up and running.swyx: Yeah.Ryan Lopopolo: For you then to vibe later to your heart's content,swyx: you have like code and scripts in here where it's oh, I think this is a really good prompt. It's just a very long prompt.Ryan Lopopolo: Fundamentally, the agents are good at following instructions, so give them instructions.And it will, improve the reliability of the result. We, much like the way we use Symphony, we don't want folks to have to monitor the agent as it is vibing the system into existence. So being very opinionatedVery strict around what these success criteria are means that our deployment success rate goes up. Yeah. It means we don't have to get tickets on this thing.Vibhu: Think it all goes back to that like code to disposable, right? Like early on when you had CLI or you'd kick off a Codex run, it would take two hours. You would wanna monitor okay, I'm in the workflow of just using one.I don't want it to go down the wrong path. I'll cut it off and, just shoot off four, like that was my favorite thing of the Codex app, right? Yeah. Just Forex it like, [00:51:00] it's okay. One of them will probably be right, one of them might be better. Stop overthinking it. Like my first example was probably like deep research.When you put out deep research and I'd ask it something like, I asked it something about LLM, it thought it was legal something and spent an hour, came back with a report completely off the rails. And I was like, okay, I gotta monitor this thing a bit. No don't monitor it. Just you want to build it so it's that it, it goes the right way.And you don't wanna, you don't wanna sit there and babysit, right? You don't want to babysit your agentsRyan Lopopolo: with that deep research query that you made. Looking at the bad result, you probably figured out you needed to tweak your prompt Yeah. A bit, right? That's that guardrail that you fed back into the code base for the task, your prompt to further align the agent's execution.Same sort of concept supply there too.swyx: When you talk, how are the customers feelingRyan Lopopolo: for Symphony? I think we have none, right? This is a thing we have put out into theswyx: world. Symphony's internal, right? As long as you are happy, you are the customer. That'
Security problems aren't changing very much even though security teams are. We catch up on the implications of the Claude Code source leak, the very human lessons from the axios NPM compromise, and what secure design looks like when it involves agents, humans, or both. AppSec has always celebrated interesting and impactful vulns. And LLMs are now a favored tool for finding flaws. We shouldn't forget the success and effectiveness of fuzzers like OSS-Fuzz, which has improved security for over 1,000 projects and found over 50,000 bugs. But we can't ignore the ease of prompting an agent to go find -- and exploit -- a vuln when the UX and overhead of doing so is hardly more than writing some markdown. The SDLC Blind Spot: Why Breaches Start with Identity, Not Code Developers have access to source code, CI/CD pipelines, and cloud infrastructure — and attackers know it. Target lost 860GB of source code through a single compromised credential. Recruitment fraud campaigns have pivoted from a compromised developer to cloud admin in under 10 minutes. As agents join human developers, contractors, and service accounts in the SDLC, the attack surface is expanding faster than static security tools can track. Security teams need real-time visibility beyond code and into who has access and what they're actually doing. This segment is sponsored by Apiiro. To lean more, visit https://securityweekly.com/apiirorsac. How AI-Driven Development is Reshaping the Application Risk Landscape Agent coding assistants are accelerating software development, generating more code and more change than security teams were built to handle. In this interview, Idan Plotnik discusses how AI-driven development is reshaping the application risk landscape and why traditional vulnerability management models can't keep up. Make sure to schedule a free SDLC Risk Assessment with BlueFlag Security - 30 minutes to deploy. 48 hours to results. Please visit https://securityweekly.com/blueflagrsac. Show Notes: https://securityweekly.com/asw-377
Security problems aren't changing very much even though security teams are. We catch up on the implications of the Claude Code source leak, the very human lessons from the axios NPM compromise, and what secure design looks like when it involves agents, humans, or both. AppSec has always celebrated interesting and impactful vulns. And LLMs are now a favored tool for finding flaws. We shouldn't forget the success and effectiveness of fuzzers like OSS-Fuzz, which has improved security for over 1,000 projects and found over 50,000 bugs. But we can't ignore the ease of prompting an agent to go find -- and exploit -- a vuln when the UX and overhead of doing so is hardly more than writing some markdown. The SDLC Blind Spot: Why Breaches Start with Identity, Not Code Developers have access to source code, CI/CD pipelines, and cloud infrastructure — and attackers know it. Target lost 860GB of source code through a single compromised credential. Recruitment fraud campaigns have pivoted from a compromised developer to cloud admin in under 10 minutes. As agents join human developers, contractors, and service accounts in the SDLC, the attack surface is expanding faster than static security tools can track. Security teams need real-time visibility beyond code and into who has access and what they're actually doing. This segment is sponsored by Apiiro. To lean more, visit https://securityweekly.com/apiirorsac. How AI-Driven Development is Reshaping the Application Risk Landscape Agent coding assistants are accelerating software development, generating more code and more change than security teams were built to handle. In this interview, Idan Plotnik discusses how AI-driven development is reshaping the application risk landscape and why traditional vulnerability management models can't keep up. Make sure to schedule a free SDLC Risk Assessment with BlueFlag Security - 30 minutes to deploy. 48 hours to results. Please visit https://securityweekly.com/blueflagrsac. Visit https://www.securityweekly.com/asw for all the latest episodes! Show Notes: https://securityweekly.com/asw-377
SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
Team PCP Update and Axios Post Mortem https://isc.sans.edu/diary/32864 https://github.com/axios/axios/issues/10636 Strapi NPM Packages Compromised https://safedep.io/malicious-npm-strapi-plugin-events-c2-agent/ Fortinet CVE-2026-35616 exctively exploited https://fortiguard.fortinet.com/psirt/FG-IR-26-099
The Chopping Block crew and Wintermute's Evgeny Gaevoy debate whether Canton is truly permissionless, if Ethereum Foundation should double down on cypherpunk ideals or embrace institutions, and how AI-driven attacks are forcing everyone in crypto and open source to rethink security models. Welcome to The Chopping Block — where crypto insiders Haseeb Qureshi, Tom Schmidt, Tarun Chitra, and Robert Leshner chop it up about the latest in crypto. This week we've got Evgeny Gaevoy, Founder of Wintermute, known for sharp takes and sharper trades. First up, the group unpacks the Twitter war over enterprise chain Canton—does it deserve to be called “permissionless”, or is it just TradFi with extra steps? Cue the Solana–Ethereum truce, and a rare moment where every old-school degenerate finds a common enemy. Evgeny makes a strong case for why, despite years of jokes at the Ethereum Foundation's expense, he thinks they're finally ahead of the curve by doubling down on cypherpunk roots—even if it makes ETH a little more Linux and a little less Nasdaq. But does decentralization matter if stablecoins and institutions now control the fork-choice? Haseeb and Evgeny spar over whether Ethereum's “world computer” vision means inviting in the corporate crowd or keeping the punk sanctuary alive. The mood shifts as the hosts dig into crypto's unfolding security meltdown: AI-written hacks, NPM supply chain fiascos, and what that means for the future of open source in crypto. Plus, a fresh new hack (RIP Drift), and predictions on how defensive tech (or lack thereof) will shape the next cycle. Barstool banter, spicy takes, and zero investment advice as always—let's get into it. Listen to the episode on Apple Podcasts, Spotify, Pods, Fountain, Podcast Addict, Pocket Casts, Amazon Music, or on your favorite podcast platform. Show highlights
(Presented by TLPBLACK: High-fidelity threat intelligence and research tools for modern security teams. From curated Passive DNS and real-time C2 monitoring to actionable IOC feeds and daily malware samples, we help defenders detect, hunt, and disrupt threats faster, with seamless integration into SIEM and SOAR workflows.) Three Buddy Problem - Episode 92: Costin walks through real-world ransomware incident response while Juanito makes the case for AI-generated operating systems that never run anyone else's code. Plus, debates on whether vulnerability research is cooked, why nobody should pay ransoms, and what the security industry looks like after the massive AI flood. Cast: Juan Andres Guerrero-Saade, Ryan Naraine and Costin Raiu. 0:00 – Introductory banter 2:00 – Costin's ransomware incident response work 3:30 – How attackers break in: Fortinet vulnerabilities everywhere 6:30 – Hunting for ransomware decryption keys 9:00 – Breaking into ransomware C2s and monitoring leak sites 12:00 – The ransom payment debate: should you ever pay? 16:00 – Why "don't pay the ransom" is overgeneralized 21:00 – How ransomware gangs price their demands 24:00 – The AI-pilling of the security industry 28:30 – Nicholas Carlini, Ptacek, and "vulnerability research is cooked" 35:00 – Towards a generative-first operating system 41:00 – Code factories, trusted computing, and killing dependencies 48:00 – Microsoft and Apple's AI positioning 56:00 – Chris St. Myers' "Cognitive Rust Belt" essay 1:18:00 – Choice, The Matrix, and the illusion of control 1:38:00 – Supply chain attacks, North Korea, and dependency sprawl
Iran's cyber campaign continues. North Korea targets the axios NPM package. Cisco suffers a Trivy-related breach. Claude's code leak unveils broad capabilities. The DOD's zero-trust efforts are slow-going. A proposed class action suit accuses Perplexity of oversharing. Google patches another Chrome zero-day. The FBI warns against using foreign-developed mobile apps. Christy Wyatt, CEO from Absolute Security, discussing why cyber risk is now a business continuity problem. A city circulates cameras to cultivate crime control. Remember to leave us a 5-star rating and review in your favorite podcast app. Miss an episode? Sign-up for our daily intelligence roundup, Daily Briefing, and you'll never miss a beat. And be sure to follow CyberWire Daily on LinkedIn. CyberWire Guest We will be sharing a series of interviews we held at RSAC 2026 over the next few weeks. Christy Wyatt, CEO from Absolute Security, discussing why cyber risk is now a business continuity problem. If you enjoyed this conversation, tune in here to listen to the full interview. Selected Reading Iran's hackers are on the offensive against the US and Israel (Ars Technica) Cisco Source Code and AWS Keys Stolen in Trivy Supply Chain Attack (Beyond Machines) Claude Code's source reveals extent of system access (The Register) Pentagon's Zero Trust Push Faces a 2027 Reality Check (GovInfo Security) Perplexity AI Machine Accused of Sharing Data With Meta, Google (Bloomberg) Google fixes fourth Chrome zero-day exploited in attacks in 2026 (Bleeping Computer) FBI warns against using Chinese mobile apps due to privacy risks (Bleeping Computer) North Korea-Nexus Threat Actor Compromises Widely Used Axios NPM Package in Supply Chain Attack (Google Cloud Blog) Silicon Valley city to give residents doorbells equipped with cameras (The Guardian) Share your feedback. What do you think about CyberWire Daily? Please take a few minutes to share your thoughts with us by completing our brief listener survey. Thank you for helping us continue to improve our show. Want to hear your company in the show? N2K CyberWire helps you reach the industry's most influential leaders and operators, while building visibility, authority, and connectivity across the cybersecurity community. Learn more at sponsor.thecyberwire.com. The CyberWire is a production of N2K Networks, your source for strategic workforce intelligence. © N2K Networks, Inc. Learn more about your ad choices. Visit megaphone.fm/adchoices
SANS Internet Stormcenter Daily Network/Cyber Security and Information Security Stormcast
Application Control Bypass for Data Exfiltration https://isc.sans.edu/diary/Application%20Control%20Bypass%20for%20Data%20Exfiltration/32850 Axios NPM Module Supply Chain Compromise https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan https://www.linkedin.com/events/7444763050819092480/ TeamPCP vs. Cloud Resources https://www.wiz.io/blog/tracking-teampcp-investigating-post-compromise-attacks-seen-in-the-wild
Cisco Source Code Stolen in Trivy Fallout, Axios Supply Chain Attack, and Active Exploitation of Fortinet and Citrix Flaws David Shipley reports multiple major security incidents: attackers used credentials stolen in the Trivy supply-chain attack via a malicious GitHub action to breach Cisco's internal development environment, clone 300+ GitHub repos, steal source code (including AI products) and AWS keys, and impact customer-related code; Cisco contained the breach, re-imaged systems, and rotated credentials. A separate supply-chain attack hit the widely used JavaScript library Axios after its maintainer account was compromised, pushing poisoned NPM versions that installed a dropper/RAT via a fake dependency; users are told to downgrade affected versions, remove the dependency, rotate credentials, and review CI/CD logs. Active exploitation is confirmed for a Fortinet FortiClient EMS SQL injection (CVE-2026-21643) and for critical Citrix NetScaler flaws (CVE-2026-3055, possibly alongside CVE-2026-4368). Anthropic accidentally exposed details of a new model, "Code Mythos," described as highly capable in reasoning, coding, and cybersecurity. Finally, TechCrunch reports escalating allegations that compliance startup Delve helped fabricate audit evidence and worked with weak auditors. The episode also marks show episode 1,500. 00:00 Headlines and Sponsor 00:54 Cisco Trivy Breach 02:28 Axios NPM Attack 04:12 Fortinet SQLi Exploited 06:24 Citrix Bleed Returns 08:05 Anthropic Model Leak 10:24 Fake Compliance Scandal 12:30 Episode 1500 Milestone 14:03 Sponsor Closing Message
An airhacks.fm conversation with Brian Vermeer (@BrianVerm) about: growing up with a Commodore 64 and gaming, inheriting a 486 DX2 with Windows 3.1, first "enterprise" migration from Windows 3.1 to 3.11, early experiments with Turbo Pascal and Basic, curiosity-driven programming and disassembling electronics, building computers from parts in the early PC era, high school informatics classes and the transition from hobby to career, bachelor's degree in software engineering, master's degree at Utrecht University focusing on Formal methods and compiler construction, mathematical proofs of program correctness, abstract syntax trees and program analysis, Haskell and pure functional programming, recursion vs loops and thinking in different paradigms, the influence of functional programming on Java development, first professional Java job at a temperature sensor monitoring company, building systems for vaccine transport temperature verification, enterprise service-based architecture, JavaServer Faces for frontend development, transitioning to consultancy at Blue4IT working for banks and government, community involvement and knowledge sharing, joining Snyk as a hybrid engineer and developer advocate, Snyk's origins as an NPM dependency scanner, supply chain security and NPM package vulnerabilities, expansion from Node.js to Java and other ecosystems, static code analysis and container analysis and AI flow analysis, security as part of the development lifecycle not an afterthought, vibe coding and AI assistant security checks, MCP server toxic flow risks, Java vs python for scripting and automation, JBang for Java scripting, modern Java simplicity vs legacy enterprise verbosity, Java developers thinking about production from the start, Java and C# as the main languages for large backends, JVM optimization over time, Leslie Lamport and formal verification of concurrent programs, outsourcing expertise vs doing everything Brian Vermeer on twitter: @BrianVerm
On this week's show, Patrick Gray, Adam Boileau and James WIlson discuss the week's cybersecurity news. They talk through: TeamPCP's supply chain attack on Github, and they threw in an anti-Iran wiper, because why not?! Anthropic hooks up its models to just… use your whole computer After Stryker's Very Bad Day, CISA says maybe add some more controls around your Intune? Another iOS exploit kit shows up in the cyber bargain-bin The FTC decides to ban… all new home routers?! U wot m8?! Supermicro founder was personally sanction-busting Nvidia GPUs into China?! This week's episode is sponsored by enterprise browser maker, Island. Chief Customer Officer Bradon Rogers joins Pat to explain how its customers are using Island to control the use of personal AI services in regulated industries. This episode is also available on Youtube. Show notes ‘CanisterWorm' Springs Wiper Attack Targeting Iran TeamPCP deploys CanisterWorm on NPM following Trivy compromise Andrej Karpathy on X: "Software horror: litellm PyPI supply chain" attack Checkmarx KICS GitHub Action Compromised: Malware Injected in All Git Tags Felix Rieseberg on X: "Today, we're releasing a feature that allows Claude to control your computer" A Top Google Search Result for Claude Plugins Was Planted by Hackers Lockheed Martin targeted in alleged breach by pro-Iran hacktivist CISA urges companies to secure Microsoft Intune systems after hackers mass-wipe Stryker devices FBI seems to seize website tied to Iranian cyberattack on Stryker Stryker confirms cyberattack is contained and restoration underway Hundreds of Millions of iPhones Can Be Hacked With a New Tool Found in the Wild Someone has publicly leaked an exploit kit that can hack millions of iPhones Russia-linked hackers use advanced iPhone exploit to target Ukrainians Apple rolls out first 'background security' update for iPhones, iPads, and Macs to fix Safari bug Post by @wartranslated.bsky.social — Bluesky Signal's Creator Is Helping Encrypt Meta AI Hacker says they compromised millions of confidential police tips held by US company Millions of 'anonymous' crime tips exposed in massive Crime Stoppers hack Feds Disrupt IoT Botnets Behind Huge DDoS Attacks FCC bans import of consumer-grade routers amid national security concerns White House pours cold water on cyber ‘letters of marque' speculation Google launches threat disruption unit, stops short of calling it ‘offensive' Supermicro's cofounder was just arrested for allegedly smuggling $2.5 billion in GPUs to China Cyberattack on vehicle breathalyzer company leaves drivers stranded across the US Man pleads guilty to $8 million AI-generated music scheme Two Israelis AI generated "intelligence" and sold it to Iran
Claude Cowork came out of an accident.Felix and the Anthropic team noticed something interesting with Claude Code: many users were using it primarily for all kinds of messy knowledge work instead of coding. Even technical builders would use it for lots of non-technical work.Even more shocking, Claude cowork wrote itself. With a team of humans simply orchestrating multiple claude code instances, the tool was ready after a brief week and a half.This isn't Felix's first rodeo with impactful and playful desktop apps. He's helped ship the Slack desktop app and is a core maintainer of Electron the open-source software framework used for building cross-platform desktop applications, even putting Windows 95 into an Electron app that runs on macOS, Windows, and Linux.In this episode, Felix joins us to unpack why execution has suddenly become cheap enough that teams can “just build all the candidates” and why the real frontier in AI products is no longer better chat, but trusted task execution.He also shares why Anthropic is betting on local-first agent workflows, why skills may matter more than most people realize, and how the hardest questions ahead are about autonomy, safety, portability, and the changing shape of knowledge work itself.We discuss* Felix's path: Slack desktop app, Electron, Windows 95 in JavaScript, and now building Claude Cowork at Anthropic* What Claude Cowork actually is: a more user-friendly, VM-based version of Claude Code designed to bring agentic workflows to non-terminal-native users* Why “user-friendly” does not mean “less powerful”: Cowork as a superset product, much like how VS Code initially looked simpler than Visual Studio but became more hackable and extensible* Anthropic's prototype-first culture: why Cowork was built in 10 days using many pre-existing internal pieces, and how internal prototypes shaped the final product* Why execution is getting cheap: the shift from long memos, specs, and debate toward rapidly building multiple candidates and choosing based on reality instead of theory* The local debate: why Felix thinks Silicon Valley is undervaluing the local computer, and why putting Claude “where you work” is often more powerful* Why Claude gets its own computer: the VM as both a safety boundary and a capability unlock, letting Claude install tools, run scripts, and work more independently without constant approval* Safety through sandboxing: why “approve every command” is not a real long-term UX, and how virtual machines create a middle ground between uselessly safe and dangerously autonomous* How Cowork differs from Claude Code: coding evals vs. knowledge-work evals, different system-prompt tradeoffs, longer planning horizons, and heavier use of planning and clarification tools* Why skills matter: simple markdown-based instructions as a lightweight abstraction layer for reusable workflows, personalized automation, and portable agent behavior* Skills vs. MCPs: why Felix is increasingly interested in file-based, text-native interfaces that tell the model what to do, rather than forcing everything through rigid tool schemas* The portability problem: why personal skills should move across agent products, and the unresolved tension between public reusable workflows and private user-specific context* Real use cases already happening today: uploading videos, organizing files, handling taxes, managing calendars, debugging internal crashes, analyzing finances, and automating repetitive browser workflows* Why AI products should work with your existing stack: Anthropic's bias toward integrating with Chrome, Office, and existing workflows instead of rebuilding every app from scratch* Computer use one year later: how much better it has gotten, why vision plus browser context is such a superpower, and why letting Claude see the thing it is working on changes everything* Why many “AI verticals” may get compressed: specialized wrappers may matter in the short term, but better general models and stronger primitives could absorb a lot of narrow use cases* The future of junior work: Felix's concerns about entry-level roles, labor-market disruption, and whether AI can compress early-career learning into denser simulated experience* Why Waterloo grads stand out: internships, shipping experience, and learning how real teams build products versus purely theoretical academic preparation* The agentic future of the desktop: what it means for Claude to have its own computer, whether AI should act on your machine or a remote one, and how intimacy with personal data changes the product design space* Why Electron still mattered: shipping Chromium as a controlled rendering stack, the limits of OS-native webviews, and why browser engines remain one of the great software abstractions* Anthropic's Labs mentality: wild internal experiments, half-broken future-looking prototypes, and the broader effort to move users from asking questions to delegating increasingly long and valuable tasks* Why the endgame is not just more capability, but more independence: teaching users to trust AI with bigger scopes of work, for longer durations, with fewer interventionsFelix Rieseberg* X: https://x.com/felixrieseberg* LinkedIn: https://www.linkedin.com/in/felixrieseberg* Website: https://felixrieseberg.com/Anthropic* Website: http://anthropic.comFull Video PodTimestamps00:00 — Cheap execution and building all the candidates00:44 — Intro in the new Kernel studio02:47 — What Claude Cowork is04:18 — Why user-friendly can be more powerful05:33 — How Anthropic built Cowork07:09 — Prototype-first product development08:00 — Why local computers still matter09:20 — Skills, primitives, and platform leverage12:13 — Cowork's architecture: VM + Chrome + system prompt15:38 — Felix's own bug-fixing Cowork workflows17:38 — Local-first agents20:16 — Evals, planning, and knowledge-work optimization23:14 — What Anthropic means by evals24:21 — Scaffolding, tools, and why skills matter27:44 — Demo: YouTube uploads and self-generated skills31:03 — Calendar automation and cleaning your desktop34:47 — Browser context and why DOM access matters37:47 — Skills portability and plugins44:36 — Which AI categories survive?46:19 — Junior jobs, simulated work, and labor disruption52:00 — Gradual takeoff vs big-bang takeoff53:42 — Finance, taxes, and enterprise verticals56:24 — Vision and the improvement in computer use57:31 — Why Claude writes its own scripts58:06 — Should Claude have its own computer?1:01:26 — Windows 95 in JavaScript1:03:19 — VM tradeoffs and sandbox design1:07:23 — Approval fatigue and safe delegation1:11:18 — The future of Cowork1:12:27 — What comes next for agentic knowledge work1:15:13 — Electron, Chromium, and desktop software lessons1:22:16 — Multiplayer agents and coworker-to-coworker workflows1:26:05 — Anthropic Labs and closing thoughtsTranscriptAlessio: Hey everyone. Welcome to the Latent Space Podcast, our first one in the new studio. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space.swyx: Yeah, so nice to be here. Thanks to, uh, TJ, Alessio, Allen helping to set everything up. It looks beautiful. We even have the logo outside.Yeah, kind.Felix: It's like really nice, right? When you walk in here as a guest, you're like, ah, this is a serious production. You're like, feel it immediately.swyx: Yeah. Felix, you've been, you're, you're currently a product manager of Cowork or,Felix: uh, really Technicswyx: Eng. Yeah. The, the identities are kind of vague member technical staff.Felix: I know member staff is like, the official title will carry around forever.swyx: Yeah. I basically kind of wanted, like we've been. Kinda obsessed. I, I've been using it a lot, even for managing latent space. Like, uh, cowork helps me upload videos and like title things and like edit and everything. It's, it's like really amazing.Alessio: Cool. He said multiple times Cowork has said gi in the group track.swyx: Yeah, yeah, yeah. So, so we have a second, uh, we have a second channel, uh, for latent space tv. Uh, and I, uh, and uh, we basically, this is our Discord meetup. Um, and I I, we have like Claude Coworks, it might be a GI, I don't know if we, we have, uh, uploaded it yet, but one of the sessions was like a, like a Claude cowork thing.Felix: I, you have to see, I would love to see it. Like, I'm so curious, like one of the most fun parts of my job is like constantly see the weird things people use Cowork for because it's obviously like very hard for us to actually design for specific use cases we do. But like every single person who's like most amazed is usually amazed about a thing that I didn't even expect cowork would be good at.Um, we have a new designer and it's one of the first small tasks. I was like, Hey, we need like a new emoji for cowork for our internal stock. It's like a pretty small thing. I like, can you please do it? And he drew an SVG and just gave it to coworker was like, can you animate this emoji? And now it has like this beautiful loopy animation.Um, and I mean, I think obviously this goes down to like, it turns out you can do more things with code than you expected, but it, it's like that kind of stuff that is really fun to me. So, long story short, I would love to see like, the kind of things you're doing.swyx: I'll pull it up. I'll pull it up.Felix: Yeah. Yeah.swyx: Uh, but before we get into it, I, I think always wanna start with like a top level. What is Claude Cowork for people who haven't heard of it? Haven't tried it out.Felix: Okay. Uh, real quick, Claude Cowork is a user friendly version of Claude Code. So the way it basically works is we have Claude Code and for us, fairly impressive agent harness that over December we noticed more and more people are using either, even though they're not technical, they, they're not at home in the terminal or they are at home in the terminal, but they started using Claude Code for non-coding workloads, right?Like managing expenses or like filling out receipts or organizing a knowledge base. Like there was a big obsidian moment that a lot of people liked and we wanted to capitalize on that, but also bring, bring this capability to people who are not terminal native and who might not know how to like brew and store something.So cowork is Claude Code running in original machine with a little bit of padding, a little bit more guardrails, making it a little safer and a little bit more convenient for people who don't wanna first open up the terminal when they go to work.swyx: It's interesting, uh, that is kind of. Pitch that way as a more user friendly thing because I always feel like it, it, to me, I I treat it as like why I'm familiar with Claude Code.Like we, we did a Claude Code episode Yeah. A year ago. But this one is like even more power user tools ‘cause it, uh, it kind of integrates much better with like clotting Chrome and, uh, in all the, all the other tooling. But like, maybe, maybe that's like a perception thing, right? LikeFelix: No, honestly, I don't think you're wrong.This is like a, a thing I've been thinking a lot about for like the last two weeks. So,swyx: but when they say user friendly, it's like, oh, it's the dumb down version. But no, actually this is the superset.Felix: Yeah. Like, I think a similar thing happened, A similar thing happened to me about 10 years ago, like maybe 12 years ago when I was at Microsoft and we started working on, on Electron and like browser-based technologies and cross-platform stuff.And one of the first use cases was Visual Studio Code, which used to be a website. And the initial narrative was, or Visual Studio Code is, is like a more user-friendly version of Visual Studio. But in a similar vein, I think there was some voices saying, oh, this is. For serious developers, like, we're not gonna use this.Right? For like anything. And I think in the end what happened is people have different stories about why Visual Studio Code became such a big thing. But my personal, my personal belief is that the Hackability and the extendability has like played a pretty big role, right? You can hook in Visual Studio Code that like almost any workload, it's so easy to hack on, so easy to put extensions for it.And I think cowork might be hitting a similar thing where it's very easy to extend and it's very easy to bring into your workflows. Uh, so the convenience I think is a bit of a, it's obviously the thing we strive for as developers, but I think the way people find value in it then is by probably mapping it onto whatever they actually have to do in their job.Alessio: So end of last year, you see the spike of like non-technical usage and clock code. What's the design process to say we should make clock code work? Because I mean, you built it in only 10 days. Um, I'm sure there was some discussion before on whether it's easier to use mean. You know, like making, making like a desktop GUI is obviously one way to do it, but like there's a lot of nuance in the product.Like maybe talk people through what was like the trigger of like, we should build a separate thing. We should not build like a different plot code thing. And then maybe some of the more interesting design decisions that maybe you didn't take.Felix: Yeah, I think philanthropic, we've been thinking about ways to move people who are comfortable with using Claude to answer questions and bring more of the power of like this thing to now like, execute tasks for you.I can like solve problems for you can like build things for you. How do we bring that capability to people who are currently mostly comfortable with like a like question answer paradigm within the chat. And we've had a lot of prototypes around that. Just going back as far as like easily a year and a half.Like we had a lot of people working on that. Um, and internally philanthropic is a very prototype demo, first culture. We have a lot of like internal prototypes that don't reach the public. What Cowork actually became is like we sort of picked the right pieces out of the many prototypes that we had.Right. And that's, that's maybe also like, I think an important qualifier whenever people mention this like 10 day number. I do think it's important to me to mention that within Double Scratch there was like a lot of stuff already happening, right? Like, and I think it's important for people to remember that when you build a website, you use React, you use like a bunch of other things.And this is like a similar scenario with like a lot of pieces we already had. Um, and in terms of decision path, I think we live in like an interesting new world where execution is actually quite cheap.swyx: Mm-hmm.Felix: So maybe, maybe what you would do That's so crazy. The year. I know it's wild.swyx: You should be, ideas are cheap.Execution is the hard part. IFelix: know. And like the, we, we used to live in this world maybe where you would take a product manager and the product manager would go to a number of potential customers and in this like very low bandwidth way, would try to. Try to like tease out what are the problems they're having, what are they willing to buy?Um, and then maybe what can you build to like drive out that need and then you go back and you like draft a spec and you think about it and then like you make a design and you execute it. We internally philanthropic app, not pretty much closer to the point where we're like, don't even write a memo, just like build, like let's build all the candidates very quickly.Let's just build all of them and then pick the best ones. I think the, the decision that is most impactful both for the product as well for the users right now is like the way we put value on your local computer. I think that's a big decision point a lot of people have thought about. Should this thing, whatever it is, should it ultimately run into computer or should it run in the cloud?‘cause they're big trade offs, right?Alessio: I guess like if we solve auth, it would be easy to do in the cloud. But I think like the fact that I can just download any file from anywhere and then put it and cowork there, it's like a big unlock. Um, I mean it's interesting you mentioned reusing certain pieces. I think this is something I've been thinking about even with Claude Code, right?The price of like writing code is going to zero, blah, blah, blah. But it actually seems like the value of having some sort of platform substrate is like increasing because as you build these new things, you can kind of plug them together.Felix: Yeah.Alessio: So I almost feel like when people are saying, oh, the value of a lot of software is gonna zero because you can recreate it, to me it's almost like the opposite.It's like having an existing platform to build on top of. It's like even more valuable because you can kind of bolt things on.Felix: Yeah.Alessio: You have obviously mcps, you have skills, you have like obviously the models, which is a big part. All these things kind of come together. Do you feel like that's a valid way to think about it, where people should invest even more in kind of like primitives.To rebuild on or are you like recreating a lot of it each time because like things change and it's easier to rewrite than reuse?Felix: You know, I think, I think you're right. I think you're right that the holistic platform is really useful. And this is maybe a whole like a somewhat contrarian view to a lot of people in ai.I actually don't think that the future is going to be hyper personalized software down to the point where everyone is running their own version. Like, I actually think it's going to be quite hard for all of us to have our own internal chat tool and like, if I wanna talk to you, likeswyx: howFelix: is that gonna work, right?In the, in the context of cowork and how we build it, I think it's a bit of a combination. Like what the, the execution that gets cheap is not necessarily rebuilding all the primitives. I think our priori, there's also not a lot of value in it. So for instance, my team did not think about rebuilding clock code.We're like very much started with the. The core thesis of this should be Claude Code.Mm-hmm.Felix: And then we'll like build things on top of it. The part of the execution that gets a little cheaper is like, how do you take all of these Lego pieces and put them together in a way that makes sense for users?It's like actually valuable. You have so many different approaches now in terms of what kind of, what kind of things do you actually elevate to a primitive, do you strongly believe that all your products should be built by just combining primitive that the public also has available? Do you keep some things internal?Um, and I think that's still evolving, but I think what's probably gonna go away is like, I'm not sure if it's gonna fully go away, but I'm gonna say, I think for me personally, I will probably no longer try to come up with a really good product without testing up with people. This is not a new concept, but wherever you used to have to make costly decisions around, do we pick technology A or technology B, or do we like, um, build it this way, build it the other way.I really strongly believe now you just build all of them and try them out with a small focus group and then whatever, whatever is better is what you go with. Right. And that, that is probably quite different even from how we maybe worked a year ago. Right. Like, I think, I think this happened very recently.Alessio: Yeah. I started building something in on Electron since you're here. Coincidence. Uh, but then Electron and like SQL Light are like, there's like some issues that like between development and like, uh, building anyway. And I was like, let's just rebuild the whole thing in Swift and just recreated the whole thing in Swift.And it's like, I. It's done.swyx: You know, I didn't take any effort. I, I, I don't even know Swift.Alessio: Yeah, exactly. I was like, I'm the, I'm not reviewing it anyway, whatever. You can write in whatever language you pick, but the important stuff that I did was not write the electron bindings. Yeah. It was like the logic of what happens in the app, you know, and then the model is like, yeah, I can just recreate the same thing as withswyx: Yeah.I, I think you still want, especially for people who are doing like high performance software or like very complex software, uh, you still want like, some view of the architecture. Uh, but you can use markdown for that,Felix: right? Yeah.swyx: Uh, you don't actually have to read the code again. I, I'm still like on a sort of like a definitional thing.Um, can we build a good mental model of Claude Cowork? Um, this is what I have, right? Like you you said it's like fundamentally cloud co. We don't wanna touch it. There's the cloud app, there's clouding Chrome. I think you guys do something different in planning, but, uh, I've been talking with Tariq who is on the cloud co team, and you guys are, he's like, no, we just exposed planning.Maybe we can clarify like, what are the major pieces. That people should be aware. It goes into cowork, like,Felix: okay, I think you basically have them. So really, um, you can, you can take planning more or less out. I think there's a few things that are really valuable in cowork. Um, the virtual machine is probably the most powerful thing.So we currently run like a, we currently run like a lightweight VM and we put clocked out into the vm and we do that for, for, um, a number of reasons. Safety and security is a big one, but even if you, even if you ignore for a second safety and security and you're just like, okay, Yolo, I want this thing to do whatever.It is quite powerful to give Claus on computer that is like generally a good idea. And in terms of architecture and UX and everything else that we've been working on, philanthropic, it often is quite useful for you to like anthropomorphize, um, clot aggressively and just be like, this is a person. What will you do if you give a, if you had a person, right?Yeah. And the analogy I've given my dad this morning who is still like quite insistent on using chat even for like coding things, is if you were a developer and your employer told you that you don't need a computer, they're just gonna like, send you emails with a code and you send emails with code back like that, maybe work for Patrick Miles in the back, but that it's not very effective.Um, so what we can do with the VM is because it's a, it's a Linux system, Claude Code has more or less free reign to install whatever needs to install. It can install Python, it can install no js. We do have strict network ingress and egress controls. So you can still, as, as a user in like plain human language, make it clear to, to the entire system what you're okay with and what you're not okay with.But at no point do we have to ask a real person, like a, like a person who might be in marketing or a lawyer. I'd have to go to a lawyer and be like, are you okay with me installing Homebrew?Alessio: Yeah, yeah.Felix: Right. Because the implications of the question and the answer are complex and nuanced and like, not, not easy to reason about.This gives us a lot of distraction that makes Cloud very powerful. Now then around it, we, we do probably have a number of things that also keeps growing almost every single week that you're probably noticing that make cowork maybe better for certain tasks than just cloud. Cloud on its own. Yeah. But most of those actually live in the system prompt.They're about like, what can we infer about the work that you do? What can we, what can we intru in the system prompt to make that more effective? It's of course the like very tight integration with Cloud and Chrome. You're noticing that a lot of people, especially as the models get better, a lot of people throw up their hands when it comes to MCP connectors in this area.I'm not gonna, I'm not gonna go through like 25 M CCP connectors, click off everywhere and then like half of them don't let me do the things anyway. So Cloud and Chrome is quite powerful because we can just talk to the cloud and Chrome sub agent and that will just do things for you.swyx: Yeah, so, so one example right in MCPI, honestly, I think that the state of MCP is kind of, kind of.Really hard to integrate. Um, I need to, I needed to add, uh, Figma MCP to the coding agent that I use.Felix: Yeah.swyx: Uh, and, but I didn't wanna read the docs, so I just had caught to it. And it's, it's great at reading docs and the same, same way I had to set up like a Google Cloud, um, account for some project I was working on and get some API keys somewhere.And Google Cloud is famously super hard to navigate, so I just didn't wanna deal with any of it. I just used Claude CoworkFelix: within the first week of developing on Core. This happened very, very quickly. Um, I caught myself by starting to use cowork for coding tasks, which is not ostensibly what we built it for, right?We don't need to. But I found myself, um, I found myself like on our internal, internal tool that we have for, to collect crashes and just like debugging information and I found myself sort like picking out the ones that I think we can easily fix versus the ones that might be like kernel corruption or something else on the operating system.And I found myself sort of picking these out and then just telling Clark, go fix this bug. I was like, what am I doing here? Go one level up, tell a cowork, I want you to go to all these crash tools. I want you to find all the bugs that you think are fixable and not like an operating system crash. And then I want you to tell another cloud to like fix all of that.Um, and that's, that's, that's sort of another cloud,swyx: just so it can spin up another instance or,Felix: uh, it, currently what I do is, um, and this is a bit of a hack, but I tell it to use clockwork remote to which website itself? Yeah, that's interesting. So you basically take, if you, if you imagine like a dashboard with like 20 bucks, you, this is remote control or clock or remote, or, sorry, I just wanted to confirm what, the way I'm using it is.I have cowork running and I'm telling cowork, here's where I normally go every morning to find the latest bugs. Go read the entire bug list, separate out which ones are fixable, which ones are, are fixable, and then for the fixable ones, four is this almost loop. For each bug, write a markdown file with a prompt.And then for each markdown v, that is a prompt. Start of a cloud set. So natively Claude Code hasswyx: this concept of subagents. Mm-hmm. And this is basically a subagent, but you're not using the subagent functionality.Felix: I'm not using the subagent functionality. And the reason I'm not is because I'm firing that off as a Claude Code remoteswyx: task.Felix: Yes. That's kind of nice. ‘cause then I can just fire it off. I can go to my next meeting and in Claude Code remote. Now the work is happening.swyx: Mm-hmm. Yeah. You, you see like you're already starting to use the cloud over your local machine. And I think this is one of those things where like. Shouldn't just everything just be cloud first, right?Felix: Ah, this is such a good group. I'm like solely bad about this. I have so many thoughts about that. Okay. So I generally believe that Silicon Valley overall is undervaluing the local computer. And my default argument for that is always how come we're all using MacBooks and not like an iPad or a Chromebook?Um, that there is like still value in, in having a local machine. And now when I think about Clot, it's this entity that is supposed to be very useful to you, like it tremendously useful to you. I think that entity needs to have access to all the same tools you have access to. Otherwise it's gonna be hamstrung in like all these complex ways.And there's, there's sort of two approaches we could take. We could say, okay, we're gonna like one by one chip away at everything that is at your computer and move it into the cloud. That's, that's one way to do it. Um, and I think other products have taken that path. I personally, this is a very personal opinion, but I personally, for the amount of tools that I use.Just don't have the patience to give another tool like permissions to every single thing and keep those permissions up to date. The second thing that I'm still grappling with, and I don't have a good answer for anyone just yet, but the second thing I'm still grappling with is what does it look like for someone to slurp up your entire work and put that in the cloud?Like if I, just as an example, like if you could click a button and it just clone your entire computer into the cloud, is that something that you would want? I'm not totally convinced yet that all everyone will. Mm-hmm. And that is sort of like upstream of all the technical issues we're gonna have. ‘cause like in general, I think the world is not ready for this kind of stuff.Like, I'll give you one quick example that would probably be very easy for us. So as a desktop app, we in theory with your permission, can do a lot of things on your computer, including reading your Chrome cookies. If we really want to do right, we could take your Chrome cookies, you would have to decrypt them for us.We could put those on the cloud if we really felt like it. Pretty easy solution. That would be super cool. We could just be like, oh, we can do all your tasks in the cloud now. Um, a lot of websites, thanks, include it. If, if they see the same authentication from like two different locations, we'll just lock down your account and now you have to go to the branch and be like, okay, I, I'm here with my passport.You actually know that. Wow. Yeah. As tired as well are of the term agent for the age agent future, I think there's a lot of stuff that sort of slowly needs to catch up and until that's the case, the way I, as someone's working on clock and make Cloud most effective is to like put it where you are working.swyx: Anything else? I thought with our mental model, so like, basically like, uh, part of me also just want, like the more I understand how it works, the more I can use it to its full potential. Right?Felix: Yeah.swyx: And so what I'm get hearing from you is you told me to delete the planning thing. You're not doing anything special on, on the, that's only exclusive to Qua cowork.Felix: We have some tricks for this sort of like change week over week. We eval cowork maybe against different use cases than he would evil clock code, right? If you think about it this way. Okay, so like clock code is our eval clock cowork. Yeah. So clock code is like quite optimized for coding tasks and we mostly value it whether or not we're getting better or worse depending on how good it is at like a typical suite job.And Clark Cowork on the other hand, we evaluate more against typical knowledge work, the kind of stuff he would find in finance or in like maybe a, like in like a legal office. Um, my personal use case is always like managing my things, like managing my personal mortgage or something like that, right? Or like wealth planning for me and my family.Those are the kinds of use cases we eval, clock cowork on. And what you might be picking up on is like the subtle changes we make to the system. Prompt what we put in the system, prompt how we steer, clot with the tools we give it. Um, like either it'd be better in one or the other direction and whether there's a trade off, try us exist a lot.CLO code will be better of a code and Claude Cowork will be better. For non-coding tasks, will those gaps still exist in the next three generations of models? It's like a little unclear to me though.swyx: Yeah,Felix: because right now these like hyper optimizations we make, I'm not sure for how long they're still be relevant.swyx: I think what I was referring to was also, it, it just, uh, it qualitatively felt different when I probably, it's just all prompting and I'm reading too much into it, but like the, the fact that it comes out with like a nine step plan, I can edit the plan and give feedback and, and, and see it execute the plan.Yeah. It felt more long range than in Claude Code, but maybe that already existed in Claude Code and you just build a nicer UI for it.Felix: It's kind of both. Um, like if the Clark Code people who build the planning functionalities would city, they probably say yes, we have all of those things in Clark code and they do.Um, I think people tend to give cowork. Tasks that are maybe of longer time horizon, I thought isswyx: so long. Yeah.Felix: That's like one thing, right? It's just like that the, the chunk of work tends to be maybe a little bigger. And then the second thing is that because the work, when it gets longer, it gets a little bit more ambiguous.We do tell co-work to make heavy use of the planning tool or to make heavy use of the ask user question tool, right? We do want it to come up with like. Different scenarios of, okay, tease out what the user actually wants. Don't go off to work for like four hours and then come back with the wrong thing.And you're probably picking up on that.swyx: Yeah.Felix: Um, I wish I could tell you I like built this magical thing and it's like, there's some secret sauce,swyx: but No, no, no. I mean, it's, it's just clarity is good that, you know, engineers just want to know. Yeah. They can, they can plan around it. And then I think also for me, um, I am realizing I have to switch to my, my other machine because this is a new machine that doesn't have my session.But, uh, yeah, the, the, the planning is really important for, for me to like approve or like to see whether it's like, it's right. The ask is, the question is so beautifully presented. I mean, it also, it also available in like cursor and, and in Claude Code. But like, I, I think like it's so nice to see that it, like it's kind of for me like to understand that it gets me, it gets what I want to do.Felix: Yeah.swyx: Yeah.Felix: It probably very hardswyx: just on the topical evals. Mm-hmm. When you say eval, I think people are very vague about what it means. Is it just like vibe testing or do you have like automated programmatic evals of Claude Cowork?Felix: When we say eval, uh, what we really mean is that we essentially take the entire transcript, including all the tools that clot has available ultimately to it, and we then measure what are the outputs, depending on what we tweak, right?So we do run that a lot. We use that in training. Um, we use that in, in like, if you sort of separate out post training from like the scaffolding around it. Cowork sort of exists in the scaffolding space, but obviously we also train on it a little bit. Um, so when we say eval, we mean given the certain transcript, what do the outputs look like?Including the file outputs as well as like the actual token outputs, like the ones that you see in the chat window.Alessio: I'm curious, um, how much of the failure modes are the model intelligence versus like the usage of the end tool to put the intelligence in? Like the well planning is like a good example, right?It's like one thing is to come up with a plan. The other thing is like make a nice spreadsheet. Yeah. That kind of runs you through the plan. Like how have you seen that? Well,Felix: the thing that I grapple with a lot is that whatever scaffolding you come up with, I think we still have a bit of sort of like model overhang where the model is dramatically more capable than right.Users end up using it for. And I think part of that is that we're just not getting the model all the tools to do all the things that's theory capable of, right? There's like one thing, um, however, whenever you do build the scaffolding, I'm sort of wondering at what point, at what point will that scaffolding go away and like how much you invest in figuring out what the right scaffolding is.It's kind of up to, it's a little bit of a bet. And one thing that I as an NJ quite enjoy is that like working in philanthropic and working at a frontier lab, I maybe have a little bit more insight into what's coming, coming down the chute in terms of like, what's the next model, what is the model capable of?What is good at, what is it bad at? And I'm, I'm increasingly wondering, is the right thing for us to like really invest too much in sort of these like scaffolding corrections where the model might otherwise not misbehave, but just not do the thing that you want?Alessio: Yeah.Felix: Or is it to just like give it as many capabilities as possible, try to make those safe so there's the worst case scenarios, likeno status might be otherwise.And then just simply wait a second for the next model drop. I'm personally, currently more leaning into the ladder. I think we're gonna see a lot of like applications and companies that do very impressive things with ai that in the short term might seem very effective ‘cause they're very specialized to individual use cases.But I think once models get better generalization and get better at like those specific use cases without being super guided on those, I'm not sure how long that's gonna stick around. And you can kind of, kind of already see this in like skills and NCP servers, right? Mm-hmm. We've, we've already seen sort of this like slow shift from MCP service to skills.And like, maybe a good example is Barry who made skills. He was initially hacking on something that honestly looked a lot, looked, looked a lot like what Cowork does today. It was sort of thinking about what if cowork, but for like people who don't wanna build code. Mm-hmm. And, um, he too did that as a prototype inside the desktop app.One of the first use cases we thought of were, okay, what, what are like coding like use cases that could really benefit from graphical interfaces and like from being a little separated from the actual underlying code. And everyone comes with the same answers. Data analysis,Alessio: right?Felix: Yeah. Or saying how many users do we have today?How many, like, it's always data analysis. And I think the thing that ultimately led to skills is that we wanted to connect this little prototype to our data warehouse and. The team very quickly discovered that like instead of building a custom tool for the thing to talk our data warehouse, they just like meet and embarked on follow like mm-hmm.Dear Claude, if you want to get data, here's the end point. Here's what the API looks like. You'll figure it out.swyx: Ah.Felix: And then it be hand over control. Yeah, yeah. Also just like maybe go one step up in the layer of abstractions, right. Just, yeah. Instead of, instead of telling the thing, here's ACL I, please call the CLI, or here's an MCP.Please call this ECT shape. Just like this is the end point. If you wanna know something, if you post here, maybe you can do post sql. It's gonna be okay. And that ended up being so effective that they started trying the same pattern of like just giving the model a markdown file that describes whatever it needs to do.That the whole thing eventually became skills and we're like. We should package this up. This is a good idea.swyx: Yeah. Um, we've had Barry Mahesh, uh, on, on our conference and uh, he's uh, definitely got a good idea there.Felix: Yeah.swyx: I wanted to show you the, how I've been using Claude Cowork.Felix: Uh, this is was my favorite part.swyx: This is this. So this is like me, uh, this is how we run the Discord. Uh, we literally, uh, at first I didn't trust Cloud Core. This was my very first usage.Felix: Okay.swyx: Right. So then I was like, okay, I will just try to manually download from Zoom all my recordings and upload it to YouTube. Yeah. Because this is a very laborious process.I got a click, click, click YouTube, um, isn't super user friendly. Uh, and it just did it. And then I was like, actually, you know, even the download from Zoom part, I should also. Put into Claude Cowork, and then I did it right. Here's a bunch of, and it starts compacting here, and it, and it, it starts to even be able to do things like look through the individual frames of the video to name the video so I can upload it auto automatically.Oh, that is, and this replaces my job as a YouTuber. We will forever appreciate your creative Yes. You know, and so that's great. Uh, but then by the way, it compacts and makes, makes like a new thing, right? So I, I don't, I don't have the initial, initial thing, but then I asked it to make its own skills so that it, so that something that's repetitive and one-off and human guided becomes more automated and I can use the skills independently and reuse them.Uh, and it obviously you can write skills and that goes into context and skills at the bottom here, which is, which is so nice. Um, so I have all these skills that, that I now sort of do on a weekly basis. Uh, I know you've released scheduled Coworks, which I haven't done yet, butFelix: course I should try them. I, I think this is like so wonderful and fun for me to see because.One thing that is very fun for me about skills in particular is that they're so easy to make. Like anyone can make a skill, like a text message, could be a skill, and they can be so hyper personalized to you. And this is like sort of the subtraction layer, right? Like, um, I, I'm just guessing, but I assume, heck, you are very good at your job.You're probably given this thing some guidance about how to do it, right? I,swyx: I just said, wrap everything up into, into a skill, right?Felix: Yeah.swyx: And then, uh, and then I was like, actually, sometimes I might need to break, uh, things apart because some parts fail or some parts might be needed in individually. So I told it to split one skill into three skills.So it's like a skill splitting thing, and then there's like a parent skill that just orchestrates all of them if I want to use that. You know, like, um, I think that's, that's like really good. Uh, and, and, uh, there's, there's one more part, which is the, uh, Google Chrome thing that I told you about.Felix: Yeah.swyx: Where I'm like, okay, you know, what's better than uploading, using Claude Coworks to YouTube?Like actually. Looking at the docs to like programmatically upload to YouTube and then putting that in a skill. And I've never done that before. I don't want to deal with Google Cloud. Yeah. So Claude Cowork does it for me.Felix: That is really cool.swyx: So, so I, I just, I don't care. I just, like, I do a thing. I don't, it doesn't really matter.Felix: That is really cool. And then you've, I assume paired the skill just with the script that it's built.swyx: Yeah, no, I just update, update the skills.Felix: Oh, that is beautiful. Yeah. That's wonderful.swyx: It's kind of like a skill, like, uh, uh, basically I think like the way that people ease into Claude Cowork is like take a knowledge work task that you would normally be clicking around for and then, uh, try to turn, turn that, and then you do the, okay, well what if you went further?Okay. And then when, if you went further, when, if you, and it sort of expand the scope of cowork as you gain trust with it and, and also teach it how to replace you.Felix: Yeah. It's like a little bit like playing factorial, but for your own life. Uh, like you say, you start really small.swyx: Yeah.Felix: You start automating something really tiny and like.Once it clicks, you keep adding onto this like automation empire. Just like make your life easier and easier. My favorite skill has been, um, every single morning Kohlberg starts looking at my calendar and make sure that there's conflicts because people tend to schedule a lot of meetings, sometimes last minute, sometimes miss it soft and painful.And a lot of products have existed like that A lot. I've written in the custom prompt there. I haven't made it a skill, um, honestly should.swyx: Yeah.Felix: But I've given it like pretty clear instructions about okay, here are some people, if they book over other meetings, I'm probably gonna go to their meeting. Like if Dario schedules a meeting.swyx: Right.Felix: Not try to reschedule down. Right. Um, and I think there's some other rules in there about like what kind of meetings I care more about what kind of meetings I care less about. What is okay to like, maybe pun like when I want to be, when I want to be working, when I don't want to be working. And it's those really small things that I can think kind of click with people.Right. When we launch co-work, I think one of the US races that went most viral on Twitter. X was clean up your desktop, which is stuff, because silly, that's such a smart thing, right? Like you don't need to model to clean up your desktop. Not really. Um,swyx: like this, like clean up my desktop.Felix: Yeah, exactly. Yeah.swyx: I need to, I need to choose my desktop, right? I guess give it access to my desktop.Felix: Yeah.swyx: Okay. Uh, okay. This is very scary. Oh, we'll do it.Alessio: I did, I did it with my downloads folder. It was like, you have so many term sheets and there's like eight copies of your rental lease for your office. I was like, all right.Like, don't yell at me.Felix: It's like, it's not such a small task. And then like, I, I would never go out there and normally otherwise and tell people I've pulled a product. It can organize your folder. Right. Um, because it feels small. But I think to your point like,swyx: oh, here's, here's the, here's the ask user questions.Felix: Yeah.swyx: Uh,Felix: beautiful. Right. Elite obvious junk. You probably shouldn't click that.Alessio: No.Felix: If he's not done right.swyx: As long as it's reversible, I don'tAlessio: make up blend to,swyx: yeah. Uh, yeah. No, I, I have a, I have a typical, everything is super messy folder. So, yes. I think this, this is super helpful. So this is a pretty simple task.Mm-hmm. But I've, okay, here it is. Right. Here's the progress. I don't see this in, that's why I'm like, this gotta be something different than, uh, than Claude Code, because I'm like, weFelix: do. Yeah. That's, we do system prompt that. We're like, all right. We want you to think about like, this task Yeah. Methodology.Yeah.swyx: And then I can, I can, I can do like little suggestions for, for, for these things. It's beautiful. Look at this. I, I can, I can like say like, oh, don't do that. Don't do this. It's amazing.Felix: I'm so happy. You like it. Um, I mean, the other way around, like we're part of the Clark core team, if you would like this in Clark COVID.swyx: Yeah. Yeah. Yeah. Uh, so, so yeah, I mean, uh, this is really good. Obviously I, I'm like kind of raving about it. Uh, you know, I have other things like sign up for pg e so if you can do phone calls for me, that'd be great. Um, I, I do, peopleFelix: have done that. Obviously you can't do that natively, but people have done that with like, various other providers.swyx: Yeah. Uh, and then this is like signing up for the Figma MCP. Um, I, I really am trying to do like everything, um, data analysis as well. I do think, um, oh, design to code, uh, very, very good. Right? So like, here's a Figma file, take it. And then this is where like a lot of other tasks is like knowledge work, like replace my manual clicking, but this is no, I would normally use Claude Code or uh, Claude Code for this, but because I perceive that you have better Chrome integrationFelix: mm-hmm.swyx: I, I think you can actually do a better job of this. And I, this, this is one shot at my, uh, conference website.Felix: That's pretty cool. Like at some point I would love to like, hear how you feel about code. In the desktop apps, which is like I never use, which is the, the same team. Same team.swyx: So I use the call code in terminal, which I, I perceive to be the default way of cloud coding.Felix: So one thing this has,swyx: sorry, I'm just like, I'm notFelix: here, I'm not here. All products. Can I talk about other stuff? Like I, I'm not sure if people out there wanna like hear me advertise my stuff for like an hour. Please do that. Um, this thing is like a builtin browser, which is a thing a lot of products have said.Yeah, it's a builtin browser. And I think giving cloud eyes into like what you're actually working on makes it so much more effective. And that's probably what you've seen in cohort because it can see Chrome, it can like debug the dom, it can like see things. Um, that does make it more powerful.swyx: Yeah. So, so I think, uh, my mental model was kind broken.‘cause I only use this cowork because I thought it had a, a browser thing in it. But I understand that the Claude Code app. The app version of Claude Code does have a built-in browser. I've seen, I've seen this preview thing.Felix: Yeah.swyx: I just, I've never used it.Felix: But in the end, in the end, you sort of have it by hard.Yeah. You basically get the same thing. Right? Like the, the, the additional skill that you're describing is chart is better if we can see what it's working on. Right. That's, that's sort of like the summary here and like whether it's using your Chromeswyx: Yeah.Felix: Or it's just like making up its own little like browser.It doesn't really make a big difference because either way it's gonna see what it's working on and that just makes it much better. And then you don't have to run QA for your cloud.swyx: Why doesn't it pick up my existing Claude Code sessions? ‘cause I, I mean, obviously I've used Claude Code, but Excellent question.Um, don't have a good answer other than like, we're honest. Just haven't Yeah. This is what the Open AI team does. Okay. Uh, cool. I I I don't have other, like, I, I just, I, I do wanna expand people's minds and also maybe show people if they haven't really done it, but like, I, I think it's very interesting how I sometimes use this more than I use, I mean, I use dia, right?Yeah. Um, I, and I use, uh, I've used like all the other agentic browsers and philanthropic didn't have to build an agentic browser because you just had Claude Cowork and that's enough.Felix: Yeah. I also think like maybe integrating with number of excellent browsers out there, it's like currently on my personal priority list, a little higher than like trying to rebuild a browser from scratch.Yeah. You know, never say never, but I think going back to this idea of like, we wanna plug this into an entire existing workflow, I think our goal is actually to not replace any of the applications we have in your computer. But instead of like, work really well within a new workflow,Alessio: make the new one. Yeah.Are, it seems that nowadays, especially on the browser, most of the innovation is like user ergonomics. It's not really like the underlying browser engine. So I feel like to call it, it doesn't really matter if it's like the, uh, or Chrome or Alice, whatever.Felix: Yeah. We wanna, we wanna meet you wherever you are.Which is like, like obviously I would say that, but it's also just generally true because I don't wanna shrink my potential user base artificially by saying, okay, like, I'm gonna start building for the people who are willing to switch browsers.Alessio: Right.Felix: That's such a, like, you know, like many lawsuits have been filed over who gets to review the browser and like a lot of money has switched hands over the question of like, which browser is default and which search engine is default within the browser.Um, I just wanna build for, yeah, I wanna build for swyx essentially. Like, I wanna, I wanna, I wanna build for people who have a number of annoying tasks that they feel like. Maybe clock could do it. Could do it for them.Alessio: Yeah. What do you think about skills portability? I think there's been one thing, I use another thing called zo, which is kinda like a cloud computer plus agent.And I have a skill to add visitors to the office. Yeah. So whenever somebody has to come in after hours, they need to check in downstairs. Um, but I wanna like text the thing, so it doesn't really work in, in cowork, but now that skill is in the zone harness and it's not in my cowork thing. And then if I make a change, it's gotta, I gotta sync them.How do you see that going? Like I see memory as like. Cloud personal, kinda like, I don't necessarily want my memories to be cross thing.Felix: Yeah.Alessio: But I do want my skills to be cross agent that I use. I think with MTPs, people do the same thing. It's like, oh, Mt. P Gateway. Mt P registry. I don't really know if that's like a business.So I'm curious like if you've had any thoughts in the area.Felix: I think for me, this is sort of where I go back to the really basic primitives for our skills are file-based instead of like this complicated thing that exists inside a place somewhere that is like super proprietary. I'm really leaning into the idea of like, it's all just files and vultures, and that makes it very portable on its own.Right. We do have skills as part of this container format, which was just called plugins.Alessio: Mm-hmm.Felix: And plugins are available both for Claude Code and Claude Code work the same format, and you can install plugins. This works in cowork today. You can basically say, I'm gonna add a whole, like just a GitHub repo as a.Skills marketplace or like a plugin marketplace. And that's how we're doing portability. I think we have a lot of room left to grow in. How do we make it easy for people to know that they can write skills? How do we make it easy for them to just like, share a skill with you? Because obviously all the words I just said, right?Like I'm losing most of the knowledge worker base out there, right. And start by saying, oh, you can connect to GitHub repo. It's not exactly how most people will end up working in like a general knowledge worker space. Um, but I think there's something there. And another thing that's there that I think has not really been properly explored is the, the, the combination of which part of the skill is very portable and then which part of the skill is like very personal to you.Right. And I think that's something we haven't really solved as an industry. Hmm.swyx: It's like, which, how you wanna introduce more structure to the skill or have always have like. Public skill, private skill, you know, pair. Yeah, yeah. Kind of. I think there'sFelix: like a, like the easiest way to do this, which is we do like use string interpolation or something.Right, right. Yeah, yeah. Insert username here, insert like phone number, insert, like known folder, locations, that kind of stuff. Um, that's probably clunky. That's why we haven't built it. Um, but I do think someone is going to come up with like an interesting way to keep everything we like about skills. The portability is just a file, it's just marked down.It's just text, honestly. Right. Like a text file words. The complete lack of structure, which means you don't need any kind of tutorial to write a skill. Just like explain it to Claude the way he would explain it to me and Claude will probably get it before I work. Mm-hmm. Right? You're just like, for booking a flight, tell Claude how to book a flight the same way we tell him somewhere.I just started working here today. But combine that with a very like, personal thing. Um, maybe we'll stick with a booking a flight example. I don't actually think. AI should be booking flights. I think the tools we have is yes.swyx: Yeah. Finally, somebody says it. It's the default demo that everyone's making.Felix: I'mswyx: like, I even against like booking demos, it is not a good showcase.Felix: Yeah. I'm like, I just wanna book my flight myself. But, um, I think there's a lot of things that have a personal and a non-personal component and that's maybe why people reach for flight booking because some things are very universal. Yeah. Super flight is usually better, right? Like few people try to book the most expensive flight.And then some things are quite personal about like what times you prefer, which seat you prefer, which airports you prefer. Combining that and like a skill format that is actually portable, compatible, easy to understand for people. I think that would be very exciting. We just haven't figured it out yet.Alessio: Yeah, I think the text part every, I think everybody by now has some sort of like cloud file thing. Either Dropbox, Google Drive, whatever. So it feels like in a way it should basically like sim link. My skills into all my agent harnesses. Yeah. Just keep those ing like we have internally this like valuable tokens repo, which is like all the commands sub agents.It's good. Uh, and then I build like a TUI where you can start it and be like, you know, install this command and this three sub agents into this agent in this folder and just copy paste this. It doesn't do anything. It literally cp the file into that. But I feel like there should be something similar where like whenever I go into a new thing, it's like, hey, here's like the link to exactly the cloud folder and just bring down these skills into this.Yeah. Like today it doesn't quite work like that. Like if I install a new agent, I cannot, I have to like copy paste all the skills and I don't even know where they are.Felix: Yeah.Alessio: That's like the big problem. It's like where do I find them?Felix: Yeah.Alessio: Um, so I'm curious like in the future like that, that almost feels like my personal productivity thing will be my skills.Felix: Yeah.Alessio: Is not really the product that I use. Everybody has access to the same product. But today there's, that just looks like copy pasting ME files, IFelix: think so many things I, I really like thinking about agents and LLMs just as like another coworker. So many attempts have made to build documentation companies that are like, oh, we're gonna solve oil documentation problems.Um, I myself, like spend a little bit of time working in notion, right? I'm like deeply familiar with the concept of let's get everyone on the same page. Mm-hmm. Right? And what you're basically saying here is you want all your agents to be on the same page about your preferences, about the skills, about the way they ought to work and like how they ought to execute.And I'm not sure what the right thing is going to be if it's going to be some, some company that can say, all right, we're as an independent body, we're not trying to like, push into any particular product. It's our job to be like the skill authority, and we provide, I don't know, we're gonna be the Dropbox of skills and we can just sim link us into all the products we want to use.I'm not sure that's gonna be viable business, but as, as an idea, it would be cool.Alessio: Yeah. Yeah. I think so many things are just going away as businesses. It's like, how am I supposed to do it? I'm not even asking somebody to make a product about it. Like yeah. I wanna personally know. And there's things like you said, it's like you almost wanna skill and then interpolate it between personal and work.So if I'm booking a fly for work, it's different than I'm booking a flight personally.Felix: Yeah.Alessio: In some ways, yeah. But like a lot of the scaffolding is the same, you know? Cool.Felix: I mean, as an engineer I will tell you like, you know, technic a person to technic a person. I will just be like siblings.Alessio: Well that's what, that's what I do.We call that MD and agents that MD's just the same how sim length. And so it is like, that works, but it feels like, yeah, I don't know. MaybeFelix: you can always go one, you can always tell cowork problem and then cowork will solve it for you. Just make the siblings. That's like one way to do it.Alessio: That's true.That's true. All right. Everything is called cowork.Felix: Uh, potentially spicy. Question for both of you.swyx: Uh, which of these industries will go away?Alessio: Okay, so what Felix was saying before is interesting. There's busy like. The short term pressure of like, we need to turn these tokens into valuable things, which is I should build the last mile product that harness the model.And then there's the question of like, long term, which ones are gonna still be valuable? And I think you're kind of seeing this today with like, uh, you know, the coding space in a way is kind of like everybody's moving up and up in stack because you need more than just turning tokens into code. I think search, like enterprise search is kind of saying the same thing.Like with G Clean and like all these different companies is like, at the end of the day, if Cowork is the one doing all the work, the search itself is like such a small part that like, I don't know if I'm really gonna pay that much money just to do search. It's almost like everything is like a cowork vertical.So like how much can cowork first party support?swyx: Mm-hmm.Alessio: And how much can it not? I think for a lot of these things, the planning thing that you were showing do Which one? The planning. The planning.swyx: Okay. Yeah. Yeah.Alessio: That's one thing where like most of the value that these agents provide is like they're better at planning for specific tasks.Yeah. And have better tools for it.swyx: Yeah.Alessio: But I think the models are now moving in that direction and they have the right harnesses and they're on your computer. So for me it's almost like if for the end customer trusts your startup to be the provider of that task result, then I think that works. This is, uh, something that, this is a shortswyx: spike that we're, we're working on.Uh, yeah.Felix: I think, look, I'll, I'll, I'll tell you this, like I don't think I'm the best person to like actually estimate which industry is going to be hit the hardest. But I do think that at philanthropic as a group of people, we're deeply worried about the impact. That the tools are going to have on the labor market, especially for like junior employees that, because I think, I think it's only honest to say that when we talk about automating a lot away, a lot of the work that we personally find annoying that we maybe think's not the best use of our time.In a lot of industries, that kind of work would've been given to a junior entry level employee. Yeah. Right. And I think it's, it's only, it's only right to be really worried about that and like worry what that's going to do in particular to people like enter the shop market.Alessio: Mm-hmm. I have a solution for that.Which you make them, you create simulative jobs for them.Felix: Okay.Alessio: So this is, this is like half joke, half true. So if you think about software engineering, when you're like a junior engineer, you work like 1, 2, 3 years. And in those three years there's like maybe like a handful of moments where like you really learn something.And then a bunch of other days where like you're not really progressing.Felix: Yeah.Alessio: I think now we can use AI and these models to actually like shortcut these careers and almost like simulate the early years of your work and like just make them like super dense and like these learnings, it's like, hey, we're working on this feature, which is like a distributed system and you need to learn this thing that might take three months at a company.And so you take three months here, it's like we're just simulating the whole thing. It's actually not a real thing. And in one week we kind of speed run through the whole thing and you kind of learn your lesson from there. And we kind of repeat that in like one year. You basically get like three years worth of like projects and experience.Yeah. I think it's harder for like things like sales or for things like, you know, marketing because you don't really have a way to get the feedback loop. But I think a lot of it, it sounds kind of silly, it's like you're making the new effect job, but it's almost like you go to college, right? People pay to learn how to do it, and this might feel similar where it's like, hey, we have the.Jane Street Simulator is like, you wanna come work at Jane Street? We'll just put you in the simulator for like three months.Felix: Wow.Alessio: And you'll come out of it. It's like, you know, I'm ready.Felix: So there, there is an aspect here. I'm not an expert enough to like actually know what, what is going to happen to marketing or legal or finance, right?Like, I don't work in those jobs and I, I don't think I should talk about them, but I am an engineer and I think I have a pretty good idea of what engineering is like. And I think one thing we're sort of seeing is that as a company and also as, as the public, we're like deeply worried about entry level, but we're also seeing more senior engineers accelerate it.If like they're more productive. They, they actually increase the value they provide. And the thing that I'm thinking about a lot is the fact that even before all of this happened, um, I've always had a lot of respect for the University of Waterloo and the, the new grads that have joined my teams as from coming from the University of Waterloo always felt like.More ready than new grads will like literally spend their entire time at the university regardless of how good, but never actually had to work inside an environment where you have to ship things that eventually will be used by users. And I'm, I'm, I'm German. I like initially went to German University and I think the, the, the like information systems programs, there tend to be very theoretical, right?Like I often give people the example of like trying
AI Agent Hacks McKinsey Chatbot in 2 Hours, NPM Phantom Raven, Router Malware & Trojaned AI Models This episode covers how researchers at CodeWall used an autonomous AI security agent to gain read/write access to McKinsey's internal chatbot Lilli database in about two hours by chaining exposed APIs and an SQL injection, potentially exposing 46.5 million chats, 728,000 files, 57,000 accounts, and 95 system prompts, with McKinsey saying the issues were fixed and no unauthorized access was found. It also reports on the Phantom Raven supply-chain campaign that published 88 malicious NPM packages using a runtime-downloaded payload to steal developer system data like SSH keys and host details. A study warns that 83% of 800 million compromised passwords still meet complexity rules, highlighting credential-stuffing risk and the need for breach checks and MFA. The show notes 14,000+ routers infected with persistent malware often requiring factory resets plus hardening, and discusses Trojan backdoors embedded in AI models that trigger misbehavior under specific inputs, calling for new AI security testing and validation. Cybersecurity Today would like to thank Meter for their support in bringing you this podcast. Meter delivers a complete networking stack, wired, wireless and cellular in one integrated solution that's built for performance and scale. You can find them at Meter.com/cst 00:00 Sponsor Meter Intro 00:20 Headlines And Welcome 00:55 AI Agent Hacks McKinsey Bot 03:44 Phantom Raven NPM Malware 05:55 Strong Passwords Still Leaked 07:55 Router Malware That Persists 09:36 Trojan Backdoors In AI Models 12:01 Call For AI Backdoor Research 12:30 Sponsor Meter Outro 13:13 Sign Off
Interview with Anna Pham Breaking in with ClickFix: Anatomy of a modern endpoint attack Cybersecurity company Huntress just published a report on a new ClickFix variant they've discovered, which they've dubbed CrashFix. This technique was developed by KongTuke to serve as the primary lure within a new custom malicious browser extension also created by the group. In short, the team observed the threat actors using KongTuke's malicious browser extension to display a fake security warning, claiming the browser had “stopped abnormally” and prompting users to run a “scan” to remediate the threats. Upon “running the scan,” the user is presented with a fake “Security issues detected” alert and instructed to manually “fix” the issue by opening the Windows Run dialog, pasting from their clipboard, and pressing Enter. The malicious extension silently copies a PowerShell command to the clipboard, disguised as a legitimate repair command. From there, they execute the malicious command. Segment Resources: BLOG - Dissecting CrashFix: KongTuke's New Toy Interview with David Zendzian Continuous compliance and real security lifecycle management Supply chain attacks are not just on the rise; attackers are learning from the past, making these attacks even more effective and dangerous than before. It was just over a month ago when the Shai-Hulud attack first impacted NPM packages, forcing enterprises around the world into lockdown. While only 187 packages were compromised in that initial incident, it served as a wake-up call for many: an accurate inventory of systems is good, but a clear, real-time Software Bill of Materials (SBOM) for applications is non-negotiable. In this world of manifest based infrastructure and container based applications with (real) "devsecops", the dream of continuous upgrades of OS/Runtime/Stack/App and App Dependencies is very mature and there are solid examples of companies and federal entities managing this at scale without thousands of teams and people. Segment Resources: BLOG - Supply Chain Security: How accurate SBOMs can deliver proactive threat mitigation Interview with Jacob Horne CMMC Phase 1 Enforcement — What the November 10 Deadline Means for the Defense Supply Chain With the upcoming CMMC Phase 1 enforcement on November 10, cybersecurity teams across the defense and federal supply chain are facing new compliance requirements that directly affect contract eligibility and data-protection standards. Jacob Horne, Chief Cybersecurity Evangelist at Summit 7, can break down what this milestone means for enterprise security leaders, MSPs/MSSPs, and contractors preparing for audits. Visit https://www.securityweekly.com/esw for all the latest episodes! Show Notes: https://securityweekly.com/esw-449