Podcasts about SDLC

  • 242PODCASTS
  • 515EPISODES
  • 37mAVG DURATION
  • 5WEEKLY NEW EPISODES
  • Jun 25, 2026LATEST

POPULARITY

20192020202120222023202420252026


Best podcasts about SDLC

Latest podcast episodes about SDLC

Code Story
S12 E24: Volodymyr & Vitalii Sydorenko, Gearheart - Part 2

Code Story

Play Episode Listen Later Jun 25, 2026 15:08 Transcription Available


Volodymyr Sydorenko lives in London, and collects mechanical keyboards. His most unusual hobby is that he does clay sculptures of characters, or random people at times. He has 2 cats, and likes to spend time outdoors. In fact, in 3 weeks time from this recording, he will traveling to Switzerland to do the Via Ferrata. To add to all of this, he has started to write children's books and hopes to publish them someday.Vitalii Sydorenko currently lives in Lisbon, Portugal. He is into sports, loves to hit the gym and regularly tracks his calories. Last year he started playing tennis and finds that he can't stop. He enjoy hiking, which is great in Lisbon. And in the past, he spent many years building startups, exiting, and also in venture capitalYou may have noticed that Volodymyr and Vitalii have the same last name... that is because they are brothers. As kids growing up, they did a lot of boxing together, as well as cling to classic films like Back to the Future.Fourteen years ago, Volodymyr got interesting in building solutions, and realized he could only get so far by himself... so he decided to build a team to deliver these solutions. Two years ago, Vitalii and Volodymyr started to consider all the of the shifts in the SDLC, and what that meant for the current business. Vitalii decided to bring his prior startup and VC experience and join the team.This is the creation story of Gearheart.SponsorsUnblockedTECH DomainsMezmoBraingrid.aiLinkshttps://gearheart.io/https://codestory.co/podcast/e6-jon-darbyshire-smartsuite/https://www.linkedin.com/in/gearheart/https://www.linkedin.com/in/vitalii-sydorenko-%F0%9F%92%AA%F0%9F%87%BA%F0%9F%87%A6-24b4ba35/Our Sponsors:* Check out Cash App and use my code CASHAPP10 for a great deal: https://click.cash.app/ui6m/mt82fpxl #CashAppPod. Cash App is a financial services platform, not a bank. Banking services provided by Cash App's bank partner(s). Prepaid debit cards issued by Sutton Bank, Member FDIC. See terms and conditions at https://cash.app/legal/us/en-us/card-agreement. Cash App Green, overdraft coverage, borrow, cash back offers and promotions provided by Cash App, a Block, Inc. brand. Visit http://cash.app/legal/podcast for full disclosures.* Check out Plaud AI and use my code CODESTORY for a great deal: https://plaud.aiAdvertising Inquiries: https://redcircle.com/brandsPrivacy & Opt-Out: https://redcircle.com/privacy

Code Story
S12 E24: Volodymyr & Vitalii Sydorenko, Gearhart - Part 1

Code Story

Play Episode Listen Later Jun 23, 2026 22:50


Volodymyr Sydorenko lives in London, and collects mechanical keyboards. His most unusual hobby is that he does clay sculptures of characters, or random people at times. He has 2 cats, and likes to spend time outdoors. In fact, in 3 weeks time from this recording, he will traveling to Switzerland to do the Via Ferrata. To add to all of this, he has started to write children's books and hopes to publish them someday.Vitalii Sydorenko currently lives in Lisbon, Portugal. He is into sports, loves to hit the gym and regularly tracks his calories. Last year he started playing tennis and finds that he can't stop. He enjoy hiking, which is great in Lisbon. And in the past, he spent many years building startups, exiting, and also in venture capitalYou may have noticed that Volodymyr and Vitalii have the same last name... that is because they are brothers. As kids growing up, they did a lot of boxing together, as well as cling to classic films like Back to the Future.Fourteen years ago, Volodymyr got interesting in building solutions, and realized he could only get so far by himself... so he decided to build a team to deliver these solutions. Two years ago, Vitalii and Volodymyr started to consider all the of the shifts in the SDLC, and what that meant for the current business. Vitalii decided to bring his prior startup and VC experience and join the team.This is the creation story of Gearhart.SponsorsUnblockedTECH DomainsMezmoBraingrid.aiLinkshttps://gearheart.io/https://beyondthewow.iohttps://codestory.co/podcast/e6-jon-darbyshire-smartsuite/https://www.linkedin.com/in/gearheart/https://www.linkedin.com/in/vitalii-sydorenko-%F0%9F%92%AA%F0%9F%87%BA%F0%9F%87%A6-24b4ba35/Our Sponsors:* Check out Cash App and use my code CASHAPP10 for a great deal: https://click.cash.app/ui6m/mt82fpxl #CashAppPod. Cash App is a financial services platform, not a bank. Banking services provided by Cash App's bank partner(s). Prepaid debit cards issued by Sutton Bank, Member FDIC. See terms and conditions at https://cash.app/legal/us/en-us/card-agreement. Cash App Green, overdraft coverage, borrow, cash back offers and promotions provided by Cash App, a Block, Inc. brand. Visit http://cash.app/legal/podcast for full disclosures.* Check out Plaud AI and use my code CODESTORY for a great deal: https://plaud.aiAdvertising Inquiries: https://redcircle.com/brandsPrivacy & Opt-Out: https://redcircle.com/privacy

Dev Interrupted
Your developers are the attack surface now and vibe coding as a vulnerability | Tanya Janca

Dev Interrupted

Play Episode Listen Later Jun 23, 2026 46:06


Developers are like water: if you make your security protocols too difficult, they will find a way to flow right around them. This week on Dev Interrupted, bestselling author and OWASP Top 10 Project Leader Tanya Janca returns to unpack why vibe coding has officially made the list of the most critical security risks in software development. Tanya breaks down the psychology of bad code, explains why the modern software engineer has become the primary attack surface, and shares actionable strategies for shifting security left directly into your AI prompts. Finally, she provides practical, behavioral solutions for building a golden path that makes secure coding the easy choice for your engineering team. Register here: for the June 25th workshop, Life Beyond Tokenmaxxing, to learn how to measure real AI impact and ROI across the SDLC.Follow the show:Subscribe to our Substack Follow us on LinkedInSubscribe to our YouTube ChannelLeave us a ReviewFollow the hosts:Follow AndrewFollow BenFollow DanFollow today's guest:SheHacksPurple: Learn secure coding from Tanya at shehackspurple.caDevSec Station: Listen to Tanya's bite-sized security podcast for developers at devsecstation.comSecure My Vibe: Download Tanya's free AI secure coding prompt library at securemyvibe.ca The Psychology of Bad Code: Read Tanya's insightful blog series on behavioral economics and application security on the SheHacksPurple BlogOWASP Top 10: Learn more about the most critical security risks to web applications at owasp.orgTanya's Newsletter: Sign up for Tanya's newsletter at  newsletter.shehackspurple.ca  Connect with Tanya: LinkedIn | Twitter/XOFFERSStart Free Trial: Get started with LinearB's AI productivity platform for free.Book a Demo: Learn how you can ship faster, improve DevEx, and lead with confidence in the AI era.LEARN ABOUT LINEARBAI Code Reviews: Automate reviews to catch bugs, security risks, and performance issues before they hit production.AI & Productivity Insights: Go beyond DORA with AI-powered recommendations and dashboards to measure and improve performance.AI-Powered Workflow Automations: Use AI-generated PR descriptions, smart routing, and other automations to reduce developer toil.MCP Server: Interact with your engineering data using natural language to build custom reports and get answers on the fly.

Risky Business News
Sponsored: Trail of Bits and OpenAI patch the planet

Risky Business News

Play Episode Listen Later Jun 23, 2026 18:27


In this sponsored interview James Wilson chats with Trail of Bits founder and CEO Dan Guido about its newly announced partnership with OpenAI. Together, they've started a new initiative called “Patch the Planet” to support open source maintainers. Being an open source maintainer is more difficult than ever. Just using frontier models to keep up with all the bug reports isn't enough. Trail of Bits wants to help maintainers by combining its deep cybersecurity expertise with OpenAI's GPT 5.5 Cyber. As Dan points out in this interview, this isn't just about helping maintainers find and fix bugs. They're spending just as much time on SDLC improvements, architecture changes, and the foundations needed to make open source sustainable in the AI era. Show notes

Dev Interrupted
Microsoft's wandering eyes, data labeling duties for senior devs at Meta, and prod is the new source code

Dev Interrupted

Play Episode Listen Later Jun 19, 2026 38:02


This week on the Friday Deploy, Ben and Andrew unpack the sudden disappearance of Fable 5 and discuss whether Meta's aggressive pivot to AI data labeling is destroying its legendary engineering culture. The hosts also explore the rise of highly capable open source Chinese models like GLM 5.2 and why tech giants are considering them to slash skyrocketing inference bills. Finally, they dive into new research proving that as AI execution takes over, human domain expertise and strict production observability are more critical than ever. Register here: for the June 25th workshop, Life Beyond Tokenmaxxing, to learn how to measure real AI impact and ROI across the SDLC.Follow the show:Subscribe to our Substack Follow us on LinkedInSubscribe to our YouTube ChannelLeave us a ReviewFollow the hosts:Follow AndrewFollow BenFollow DanFollow today's stories:KayfableWhy is Meta destroying its engineering organization?AI demands more engineering discipline. Not lessZ.ai's open-weights GLM-5.2 beats GPT-5.5 on multiple long-horizon coding benchmarks for 1/6th the costMicrosoft Mulls China's DeepSeek for Copilot, Probably to Trump's ChagrinAgentic coding and persistent returns to expertiseOFFERSStart Free Trial: Get started with LinearB's AI productivity platform for free.Book a Demo: Learn how you can ship faster, improve DevEx, and lead with confidence in the AI era.LEARN ABOUT LINEARBAI Code Reviews: Automate reviews to catch bugs, security risks, and performance issues before they hit production.AI & Productivity Insights: Go beyond DORA with AI-powered recommendations and dashboards to measure and improve performance.AI-Powered Workflow Automations: Use AI-generated PR descriptions, smart routing, and other automations to reduce developer toil.MCP Server: Interact with your engineering data using natural language to build custom reports and get answers on the fly.

The Tech Trek
Yahoo CTO on AI, Engineering Velocity, and Why the SDLC Has To Change

The Tech Trek

Play Episode Listen Later Jun 18, 2026 25:45


Yahoo is not just adding AI on top of existing products. It is using AI across product experiences, internal tools, engineering workflows, and modernization efforts.In this episode of The Tech Trek, Lee Zen, CTO at Yahoo, joins Amir Bormand to talk about modernizing at massive scale, moving from on prem infrastructure to the cloud, rebuilding internal tools with AI, and how engineering organizations need to rethink process when agents can move faster than people.Lee also shares how Yahoo views AI as a coworker, not just a tool, and why the next bottleneck in software delivery may be human judgment.Practical Takeaways• Modernization at scale often means operating in two worlds at once, keeping proven systems running while new cloud based services move faster.• AI can help teams move past legacy tools by reverse engineering requirements and rebuilding modern versions from scratch.• The real unlock is not only code generation. It is connecting agents to documents, chats, emails, production context, and internal knowledge with the right permissions.• As agents speed up execution, engineering teams need to rethink where human approval, judgment, and review should live.• The build versus buy equation is changing because some tools that were too expensive to build before may now be realistic to create internally.Timestamped Highlights00:31, Yahoo's mission and why the internet still feels hard to navigate02:01, Where AI fits across Yahoo products and engineering work03:30, The challenge of moving from on prem data centers to cloud based infrastructure05:27, How Yahoo has used AI to rebuild internal tools and leave technical debt behind07:25, Why agents need access to engineering context, not just code10:20, AI as a coworker and the shift from human speed to machine speed16:27, Why parts of the SDLC may need to change as AI increases delivery speedOne Line That Stuck“AI as a coworker, not just as a tool.”The Tech Trek is for technical leaders thinking through how teams build, operate, modernize, and adapt as AI changes the work. Subscribe or follow for more conversations with engineering, product, data, and technology leaders.

DevOps Paradox
DOP 355: Why AI Coding Slows Down Code Review

DevOps Paradox

Play Episode Listen Later Jun 17, 2026 55:50


#355: Picture your engineering team a year from now. A coding agent doing the coding. A testing agent on tests. A security agent on security. An infrastructure agent on infrastructure. All of them wired into GitHub and Jira, all of them working right alongside the humans. Not science fiction either - Atlassian and GitHub are already shipping these features. So out come the stats everyone loves to quote. AI code introduces 1.7 times more issues. Half of it ships with security holes. Code duplication is through the roof. AI-assisted PRs take four to five times longer to review. The response to most of it: so what? If you have a way to detect the issue and feed it back, that is just the SDLC doing its job. Couldn't care less if it is 1.7x or 50x more issues - what matters is what is left at the end, per feature shipped. Security holes? You have scanners. Detect, fix, ship. The only real problem is when you skip the detection or sit on the fix for months, and that has nothing to do with AI. Here is the one stat that actually sticks: PR reviews backing up. Speed up coding and leave everything downstream at human speed, and you have not sped up delivery - you have just moved the pile from Jira tickets to pull requests. The review pipeline was built for human speed, and now it is the bottleneck. The blunt fix: stop letting AI write 10,000-line PRs, work in smaller chunks, and accept that the job is about to get mentally harder. Delegate the tedious work and what is left is the demanding work - architecture, taste, is this even the feature we should ship. The silly stuff, does every function have a comment, is it camel case, goes to the machine. Spend your time there and you are wasting your talent. Offshoring never worked when the only goal was cheaper - chase the cheapest engineers, then chase even cheaper ones, and you end up dragging the work back in house. Same trap with AI. Offshore to Opus, then Sonnet, then Haiku, then Llama on a laptop. If cheaper is your primary motivation, you are doing it wrong. The win is qualitative, not the price tag. Where does it land? Three people per product, end to end - frontend, backend, database, deployments. Augmented at every stage, not autonomous. A human still pushes the final button to prod, the way you never let a Jenkins pipeline deploy straight to production without a check. Full autonomy is coming the way self-driving cars came: not in a year, not everywhere at once, and not by flipping it on at 4pm on a Friday. Even when the technology is ready, you are not. And if you think none of this touches your job, there is a story here about a textile factory built in the eighties that ran on five people. Knowledge work is next. The only exception is a monopoly, and you probably do not have one.   YouTube channel: https://youtube.com/devopsparadox   Review the podcast on Apple Podcasts: https://www.devopsparadox.com/review-podcast/   Slack: https://www.devopsparadox.com/slack/   Connect with us at: https://www.devopsparadox.com/contact/

FINOS Open Source in Fintech Podcast
Governing the Pipeline: Fusing CALM with Open SDLC | Karl Moll, FINOS

FINOS Open Source in Fintech Podcast

Play Episode Listen Later Jun 17, 2026 32:57


Karl Moll (Technical Project Advocate at FINOS) sits down with Grizz Griswold to discuss how CALM (Common Architecture Language Model) is acting as the structural glue connecting compliance projects across the banking ecosystem. He breaks down the momentum behind the Open SDLC Controls Framework and how these tools together build a secure, governable pipeline for unpredictable AI deployments.

Life Accelerated
Reshaping the Future of Insurance Operations in the AI Era

Life Accelerated

Play Episode Listen Later Jun 17, 2026 49:18


In this episode, host Olivier Lafontaine is joined by four industry leaders from the Life Accelerated Summit: Richard Wiedenbeck, Principal Consultant at Avatar Solutions LLC; John Brabazon, CFO at SBLI; Matthew Busbee, Chief Data Officer at Pan-American Life Insurance; and Sheriff Balogun Jr., Head of Insurance and Wealth Technology at MassMutual. Together, they explore the current and future impact of artificial intelligence on the insurance industry. From navigating legacy systems to overcoming regulatory hurdles, the panelists dive deep into AI's role in business transformation. They discuss how AI can drive operational efficiencies, improve customer experience, and shape workforce dynamics. The conversation also highlights how insurance companies are preparing their teams for AI adoption, the challenges of managing AI agents, and the evolving leadership structures needed in the age of AI.   Key Takeaways: AI is not just a tool but a key player in reshaping insurance workforces and operational models. Talent management and knowledge transfer are critical as AI continues to automate processes. Balancing governance, AI responsibility, and data privacy is central to successful AI adoption. Organizational structures need to evolve, with a focus on AI ownership and oversight. Building an AI-enabled culture requires a shift in mindset and deep collaboration across teams. Jump Into the Conversation:(00:00) Intro and context - what this episode covers (02:15) Meet the panelists - Richard Wiedenbeck, John Brabazon, Matthew Busbee, and Sherriff Balogun Jr. (05:33) MassMutual's top technology priorities - talent, AI productivity, and legacy modernization (08:46) Pan-American Life's AI strategy across 20+ countries and regulatory complexity (12:46) SBLI's approach - small company, big transformation, policy admin overhaul (14:41) Ameritas' AI playbook - unit cost reduction, AI workers, and workforce mindset (20:02) The real challenges of technology transformation - legacy systems, buy-in, and change management (25:11) Knowledge management - capturing expertise before your best people retire (32:04) Who owns AI in the C-suite - CIO, COO, or something new entirely (36:08) AI workers in practice - the Herbie and Susie story from Ameritas (40:01) Rethinking the SDLC - why "deploy and learn" beats "test until perfect" (43:30) What happens when human expertise retires and AI is all that's left (46:02) AlphaGo, gamification, and how humans will learn from AI in the future (48:30) Takeaways - be intentional, build governance muscle, don't wait for perfection Resources: Connect with Richard Wiedenbeck: https://www.linkedin.com/in/richard-wiedenbeck-6a752/ Connect with John Brabazon: https://www.linkedin.com/in/john-brabazon-cpa-3a81901b/ Connect with Matthew Busbee: https://www.linkedin.com/in/mbusbee/ Connect with Sherriff Balogun Jr.: https://www.linkedin.com/in/sherriffbalogunjr/  

Dev Interrupted
Your SDLC needs a productivity context engine

Dev Interrupted

Play Episode Listen Later Jun 16, 2026 39:44


What if the secret to fixing your overwhelmed SDLC is not a better AI coding model, but a smarter productivity context engine? This week on Dev Interrupted, LinearB founders Ori Keren and Dan Lines join the show to discuss the messy middle of AI adoption and the painful transition from the traditional SDLC to the Agentic Development Life Cycle. They unpack why the era of cheap AI experimentation is over, how rising token costs are forcing engineering leaders to prioritize strict business ROI, and how autonomous tools are fundamentally changing the daily workflow of developers.Register here: for the June 25th workshop, Life Beyond Tokenmaxxing, to learn how to measure real AI impact and ROI across the SDLC.Follow the show:Subscribe to our Substack Follow us on LinkedInSubscribe to our YouTube ChannelLeave us a ReviewFollow the hosts:Follow AndrewFollow BenFollow DanFollow today's guest:LinearB: Learn how to transform your SDLC and build an engineering context engine at linearb.iogitStream: Explore LinearB's workflow automation tool for routing pull requests at linearb.io/platform/gitstreamFollow Ori and Dan on LinkedIn: Ori Keren | Dan LinesOFFERSStart Free Trial: Get started with LinearB's AI productivity platform for free.Book a Demo: Learn how you can ship faster, improve DevEx, and lead with confidence in the AI era.LEARN ABOUT LINEARBAI Code Reviews: Automate reviews to catch bugs, security risks, and performance issues before they hit production.AI & Productivity Insights: Go beyond DORA with AI-powered recommendations and dashboards to measure and improve performance.AI-Powered Workflow Automations: Use AI-generated PR descriptions, smart routing, and other automations to reduce developer toil.MCP Server: Interact with your engineering data using natural language to build custom reports and get answers on the fly.

Remotely Curious
Why don't more AI tools understand what matters to you?

Remotely Curious

Play Episode Listen Later Jun 16, 2026 29:43


How do you build AI that actually understands you and the work you do? It all starts with having the right context.  We talk with Dropbox staff product manager Noorain Noorani and principal engineer Sean-Michael Lewis about the art of context engineering and how Dropbox connects to all the tools your team needs for work—so you get AI that works wherever you do.  ~ ~ ~  Working Smarter is brought to you by Dropbox. Find, organize, and share your work—all in one place—with context-aware AI from Dropbox. You can listen to more episodes of Working Smarter on Apple Podcasts, Spotify, YouTube, Amazon Music, or wherever you get your podcasts. To read more stories and past interviews, visit workingsmarter.ai This show would not be possible without the talented team at Cosmic Standard: producer Ben Montoya, sound engineer Aja Simpson, technical director Jacob Winik, and executive producer Eliza Smith. Special thanks to our illustrator Fanny Luor, marketing consultant Meggan Ellingboe, and editorial support from Catie Keck. Our theme song was composed by Doug Stuart.  Working Smarter is hosted by Matthew Braga. Thanks for listening!

HLTH Matters
How CGI Is Helping Federal Health Agencies Get AI-Ready

HLTH Matters

Play Episode Listen Later Jun 15, 2026 23:44


In this episode, host Sandy Vance chats with Brad Schoffstall, Vice President of Health and Compliance Programs at CGI, and Dr. James Peake, Senior Vice President and former Secretary of Veterans Affairs and Army Surgeon General. They have a wide-ranging and practical conversation about what it actually takes to modernize data infrastructure at federal health agencies. With Brad's 35 years at CGI and Dr. Peake's 16 years, this is a conversation grounded in hard-won experience rather than theory. Today's conversation is a refreshingly honest and deeply practical perspective for anyone working at the intersection of government, healthcare, and AI.  In this episode, they talk about: Federal health agencies are running some of the largest healthcare operations in the world, with the VA equivalent in size to a Fortune 5 company Data silos created by contract-by-contract procurement are the primary barrier to AI-ready infrastructure at federal agencies Federated data platforms allow data to stay in its own repositories while being discoverable, mappable, and usable across the organization Policy is often the biggest obstacle to data sharing, and changing it requires executive-level support and shared governance Technology is the third most important factor in transformation; policy and business understanding come first and second CGI improved NHS Spine performance tenfold while reducing infrastructure to a tenth of its original size, saving a million euros in annual expenses Improper payments across federal health programs run into billions of dollars annually and represent one of the highest-impact areas for AI-driven improvement AI for AI's sake is not the answer; start with the business problem and work backward to the data strategy Start small with two or three systems, demonstrate value, and build from there rather than attempting a massive all-at-once implementation A Little About Brad and James: Brad Schoffstall has wide-ranging experience, deep knowledge, and skills in information technology. He has led multiple digital transformation efforts. He has 37 years of experience with a diverse set of architectures, operating systems, languages, and technologies. His experience includes enterprise architecture, cloud migration, and hands-on development. He also has significant experience in business development and project management. He has implemented large, complex systems on platforms ranging from mainframes to Microservices. He has successfully performed many solution architecture and SDLC engagements that include characteristics like high-volume processing, DevOps, and automation. He demonstrates expertise in multiple service-based secure architectures utilizing multiple application and enterprise solution sets, e.g., Data Driven, Microservices, Cloud, etc. Dr. James Peake is an American politician and former lieutenant general who served as the sixth Secretary of Veterans Affairs from 2007 to 2009. In 2004, he retired from a 38-year United States Army career, having served as the 40th Surgeon General of the United States Army. After retiring from the Army, Peake served as Executive Vice President and Chief Operating Officer of Project Hope,[4][5] a non-profit international health foundation operating in more than 30 countries. While at Project HOPE, he helped to orchestrate the use of civilian volunteers aboard the Navy Hospital Ship Mercy as it responded to the tsunami disaster in Indonesia and also as part of the Hurricane Katrina response aboard the Hospital Ship Comfort. Just before he was nominated Secretary of Veterans Affairs, Peake served as Chief Medical Officer and Chief Executive Officer for QTC, one of the largest private providers of government-outsourced occupational health and disability examination services in the nation. 

The Engineering Enablement Podcast
From AI experiments to organizational shift: Lessons from Mercari's transformation (Michael Galloway and Snehal Shinde)

The Engineering Enablement Podcast

Play Episode Listen Later Jun 15, 2026 44:38


Michael Galloway leads Platform Engineering at Mercari, while Snehal Shinde leads Cost and Performance Engineering. Together, they have been at the center of Mercari's effort to become an AI-native company.In this session from DX Annual, Michael and Snehal share what happened after Mercari's CEO mandated 100% AI adoption across the organization. While AI accelerated code generation and increased engineering output, the team quickly discovered that their existing dashboards could not answer a simple question: was AI actually improving productivity?They discuss how Mercari built new visibility into AI usage and software delivery, the bottlenecks they uncovered across the SDLC, why faster coding did not automatically translate into faster delivery, and the lessons they learned rolling out AI at scale. They also share how Mercari is rethinking software development around agents, feedback loops, and new ways of working.In this episode, we cover:(00:00) Intro(01:46) Mercari's scale and engineering culture(02:51) DX awards at Mercari(03:44) Mercari's push to become AI-native(06:34) The mandate to rethink everything(08:02) Mercari's AI visibility problem and how they solved it(11:30) Mercari's early findings on AI implementation(18:47) Closing the AI awareness gap at Mercari(21:11) Mapping AI opportunities across Mercari(31:32) Unpacking the results from the second rollout(34:14) Agent spec-driven development and what's next(37:37) A multi-loop SDLC(40:50) Some hard lessons(42:55) Closing thoughtsReferenced:• Mercari• Cursor• Devin• Claude Code | Anthropic's agentic coding system• GitHub• Datadog• Tim Bozarth - Microsoft | LinkedIn• Airbnb• Jim Collins - Concepts - The Stockdale Paradox

Scrum Master Toolbox Podcast
BONUS Why More Code Doesn't Mean Better Software — And Where AI Actually Helps Your SDLC With Mooly Beeri

Scrum Master Toolbox Podcast

Play Episode Listen Later Jun 10, 2026 40:00


BONUS: Why More Code Doesn't Mean Better Software — And Where AI Actually Helps Your SDLC Most teams are adopting AI to write code faster. But what if code generation isn't your bottleneck? Mooly Beeri has spent 25 years diagnosing where software organizations actually underperform — from Microsoft to Philips to automotive — and his message is clear: measure before you automate, and tie every AI investment to a business KPI. The Pattern Debugger's Origin Story "I've been identifying patterns way before AI was doing that. One of my first jobs was Microsoft, and I got the opportunity to work in engineering excellence. Every single simple improvement would make the lives of so many people better and the code better and the products better."   Mooly's career started at Microsoft in engineering excellence, where he discovered his passion for finding process areas that need improvement. From there he built the first software centre of excellence for Philips, spawned it into a separate business, and has been doing the same process excellence work across healthcare, telecom, and automotive ever since. His framework: understand where you're bleeding quality, revenue, or budget — then intervene there, not everywhere. Improvement Doesn't Mean Progress "There are too many efforts to improve too many things that don't really matter. The ability to tie a specific improvement to what actually means progress for a business — that, for me, is one critical component that's missing in many transformations."   Mooly's core insight applies directly to AI adoption: everyone has an improvement plan, but few can answer "how does this improvement improve business performance?" If you ask that one additional question, you can probably cancel half your improvement projects — the ones that make people feel good but don't move the needle on time to market, quality, or cost. The Code Generation Trap "It's like saying a book author is more productive because they write more words. The unit of work is not the number of lines of code they produce. The unit of work is a piece of code that works, that is tested, that is fully reliable, that meets a customer expectation, and eventually generates revenue."   Data from Faros AI shows individual developer PRs went up 98% with AI tools — but organizational delivery actually dropped 1.5%. More code, same or worse outcomes. Mooly explains why: most organizations invest in code generation not because it's the most effective thing to improve, but because it's the easiest step to automate. There are 35 steps in the SDLC. Picking code generation gives you a 1-in-35 chance of striking gold. As the saying goes: hope is not a strategy. Where AI Actually Works in the SDLC "The best usages would be in areas of the SDLC where there is a lot of data that needs processing and needs some detection of patterns — where AI is really, really good."   The most successful AI applications Mooly has seen with clients:   Defect root cause analysis — training AI agents on thousands of Jira bugs to find patterns humans can't see. In one healthcare client, AI analysis revealed that "false positive" bugs were actually compromised requirements — the dev team was closing real deviations as unimportant because they didn't have time to fix them Code review enhancement — AI scans incoming defects and generates a live, evolving checklist so reviewers spend their limited time checking for the most probable problems Test generation — unit, component, and functional test creation where AI can leverage existing test patterns and requirement data Requirements review — correlating requirements against strategic objectives, OKRs, and historical defect patterns to find contradictions before coding begins The Thinking Process You Can't Automate "The developers going through the process of converting requirements into code — it's actually a thinking process. It creates a lot of discussions with the product managers, a lot of back and forth, which help refine the requirement. This entire exchange is gone out the window when you have AI generate the code in 5 minutes."   When AI generates code instantly from requirements, it eliminates the human feedback loop that catches contradictions and incomplete specifications. The FDA has recognized this: every AI-assisted step in medical device software must be guardrailed by human activity. If you generate code quickly but still need a human review, the speed gain disappears. The value of coding was never just the code — it was the thinking. Map Every Investment to a Business KPI "If your uncle ran a bicycle repair shop and you said, let's advertise in the local newspaper, the first question he'd ask is: how many new customers will we get? The business logic hasn't evolved so much. If you want to do something — how will this impact your revenue, your customer retention, or your cost of producing goods? If you can't answer these things, don't invest."   Mooly's advice is deceptively simple: before adopting any AI tool in your SDLC, ask yourself which of three business outcomes it will improve — faster time to market, higher quality (fewer customer issues), or better margins (lower execution cost). If you can't draw a direct line from the AI investment to one of those outcomes, you're doing improvement theatre. About Mooly Beeri Mooly Beeri is CEO and co-founder of BetterSoftware, a consulting firm with over 25 years helping companies across healthcare, telecom, and automotive transform how they build software. His work focuses on diagnosing where software organizations underperform and designing targeted interventions — not blanket transformations.   You can link with Mooly Beeri on LinkedIn.

The Engineering Enablement Podcast
The current impact of AI on engineering velocity: What 400 companies are seeing (Abi Noda & Brian Houck)

The Engineering Enablement Podcast

Play Episode Listen Later Jun 8, 2026 26:56


Recorded live at DX Annual, Abi Noda, co-founder and CEO of DX, joins Brian Houck of Microsoft to share an early look at DX's new research on AI's impact on engineering velocity.Drawing on data from a sample of DX customers, they discuss what companies are actually seeing as AI adoption matures. Most organizations in the study saw pull request throughput increase by 10 to 15 percent—far more modest than the 10x gains often promised in industry headlines.They explore why coding remains only a small part of developer work, where time saved by AI may be going, and the unintended consequences of moving faster, from shifting bottlenecks to “false velocity.” Abi also shares how engineering leaders are applying AI beyond coding and how DX is evolving its measurement framework to account for both human and agent productivity.Where to find Brian Houck: • LinkedIn: https://www.linkedin.com/in/brianhouck/ Where to find Abi Noda:• LinkedIn: https://www.linkedin.com/in/abinoda In this episode, we cover:(00:00) Intro(00:53) What motivated DX's research into AI's impact on engineering velocity(02:36) How DX designed the study and selected companies(04:54) What DX's data reveals about AI's impact on engineering throughput(06:31) Why PR throughput was the most practical metric to publish(08:21) Why AI productivity gains are lower than many leaders expected(10:24) How an all-in culture can amplify AI productivity gains(12:35) Why it's hard to track where AI-generated time savings are going(15:04) Unintended consequences of AI-driven productivity gains(17:12) Why leaders should look beyond coding to the rest of the SDLC(19:43) Cognitive debt and the human costs of AI-assisted development(21:33) How DX's AI measurement framework is evolving(24:42) How to make agents more effectiveReferenced:• DX Core 4 Productivity Framework • DORA, SPACE, and DevEx: Which framework should you use?• Time Warp: The Gap Between Developers' Ideal vs Actual Workweeks in an AI-Driven Era - Microsoft • Research• How Generative and Agentic AI Shift Concern from Technical Debt to Cognitive Debt• Measuring AI code assistants and agents

The Engineering Enablement Podcast
Mapping the new SDLC at BNY: Codifying AI into every step of the delivery lifecycle (Jason Valentino)

The Engineering Enablement Podcast

Play Episode Listen Later Jun 8, 2026 33:24


Jason Valentino is Head of Software Engineering Strategy at BNY, where he oversees developer tooling, DevEx, platform workflows, and software delivery governance across more than 8,000 engineers.In this session from DX Annual, Jason shares how BNY moved beyond AI coding assistants to rethink the entire software delivery lifecycle. He explains how his team identified bottlenecks across the SDLC, prioritized automation opportunities, and applied AI to planning, peer review, testing, change management, and compliance workflows.Jason also discusses what it takes to scale AI inside a highly regulated enterprise, including rewriting policies, partnering closely with risk and audit teams, and building a culture that encourages experimentation and rapid sharing of ideas.Where to find Jason Valentino:• LinkedIn: https://www.linkedin.com/in/jasonvalentinoIn this episode, we cover:(00:00) Intro (01:20) Early results from AI coding tools at BNY(04:08) The 3X stress test: What breaks if engineering throughput triples?(06:56) Three ways to apply AI across the SDLC: IDE and CLI tools(08:07) Using autonomous AI agents for repetitive engineering tasks(09:16) Embedding AI directly into SDLC workflows(12:27) Why leaders should encourage experimentation and “start saying yes”(15:00) Q&A: How platform and productivity teams are evolving to support AI(16:33) Q&A: Rewriting policies and controls for AI-assisted software delivery(17:52) Q&A: How AI is affecting software quality and test ownership(19:00) Q&A: What Jason is most proud of: Practical examples of AI across the SDLC(20:30) Q&A: How BNY handles duplicated work across AI initiatives(22:30) Q&A: How BNY uses AI to support regulatory and compliance work(23:30) Q&A: Automating code reviews and change tickets(25:55) Q&A: How increased AI-driven throughput is affecting on-call and reliability(27:11) Q&A: How BNY works with risk and audit partners to move quickly with AI(29:01) Q&A: How BNY scales successful AI use cases across the organization(30:42) Q&A: What Jason is most proud of after BNY's busiest year with AIReferenced:• AI-assisted engineering: Q4 impact report• Measuring AI code assistants and agents• Measuring developer productivity with the DX Core 4• Windsurf• Claude Code by Anthropic | AI Coding Agent, Terminal, IDE• Codex | AI Coding Agent

The Engineering Enablement Podcast
Designing the AI‑native engineering organization with 1Password, Microsoft and Atlassian

The Engineering Enablement Podcast

Play Episode Listen Later Jun 8, 2026 43:31


Abi Noda is joined live at DX Annual by three engineering leaders shaping AI adoption at scale: Tim Bozarth, Corporate Vice President in Microsoft's CoreAI division; Nancy Wang, CTO of 1Password; and Taroon Mandhana, CTO of AI and Teamwork at Atlassian. Together, they discuss how AI is changing engineering organizations, from team structures and planning cycles to hiring, governance, and measurement.The panel explores how the profile of a great engineer is evolving, why smaller cross-functional teams are becoming more effective, and what happens when product managers, designers, and customer support teams start contributing code. They also share why they are encouraging AI adoption through enablement, training, and local champions rather than mandates, and how AI is shifting more of the software development lifecycle toward planning and validation.Finally, they discuss where human judgment remains essential, how to measure adoption and manage token usage, and how to connect AI investments to business outcomes while preserving room for experimentation and learning.Where to find Nancy Wang: • LinkedIn: https://www.linkedin.com/in/wangnancyWhere to find Taroon Mandhana:• LinkedIn: https://www.linkedin.com/in/taroonmWhere to find Tim Bozarth: • LinkedIn: https://www.linkedin.com/in/tbozarthWhere to find Abi Noda:• LinkedIn: https://www.linkedin.com/in/abinoda In this episode, we cover:(00:00) Intro(01:08) Introducing the panelists(02:16) AI's impact on engineering team structures and planning cycles(05:00) How the role of the engineer is changing and what makes a great engineer(10:11) The opportunities and challenges of non-engineers writing code(15:26) Encouraging AI adoption without mandating it(21:25) What an AI-native SDLC looks like and why human judgment still matters(30:56) Measuring AI adoption, token usage, and ROI(37:06) How to tie AI investments to business outcomesReferenced:• DX Core 4 Productivity Framework• Microsoft • 1Password• Atlassian• Jira• Confluence• Loom• Rovo • Amazon Operating Cadence - Working Backwards

CISSP Cyber Training Podcast - CISSP Training Program
CCT 356: Supply Chain Attacks Are Exploding in 2026 — Here's What the NCSC Wants You to Do

CISSP Cyber Training Podcast - CISSP Training Program

Play Episode Listen Later Jun 8, 2026 41:38 Transcription Available


Send us Fan MailYour software is only as trustworthy as the dependencies you quietly inherit and attackers know it. Today I break down the NCSC warning on software supply chain security and why open source package ecosystems have become a high-value target for real-world compromises that spread fast through CI/CD pipelines.I walk through the attack patterns that keep showing up in incidents: maintainer account compromise, expired domain takeover, typosquatting, and credential chaining. We connect each technique to the CISSP mindset so you can spot it in scenario questions and, more importantly, recognise it in your own environment. Along the way, I explain why Node.js, Python, and Rust projects are especially exposed, how automation can turn “latest version” convenience into an enterprise incident, and why developer environments often become an overlooked attack surface.Then we get practical with controls you can actually implement: pausing automatic dependency updates when compromise is suspected, adding human approval for critical packages, rotating credentials immediately, enforcing MFA on developer and registry accounts, and using private or trusted registries to mirror and vet dependencies. I also zoom out to show how to build supply chain security into the secure SDLC with software composition analysis (SCA), code signing, checksum verification, audit logging, continuous monitoring, and an SBOM so you can respond fast when a package turns toxic.If this helps you tighten your dependency management and level up your CISSP prep, subscribe, share this with a teammate, and leave a quick review so more security pros can find the show.Gain exclusive access to 360 FREE CISSP Practice Questions at FreeCISSPQuestions.com and have them delivered directly to your inbox!  Don't miss this valuable opportunity to strengthen your CISSP exam preparation and boost your chances of certification success. Join now and start your journey toward CISSP mastery today!

Open Source Startup Podcast
E196: Shifting Developer Portals to Agent Portals with Port

Open Source Startup Podcast

Play Episode Listen Later Jun 3, 2026 38:41


This Open Source Startup Podcast episode has our co-hosts Robby and Tim in conversation with Zohar Einy the Co-Founder of agentic SDLC platform Port.They have a few open source projects including ocean which allows third-party systems to integrate with their developer portal.Port is positioning itself as the infrastructure layer for the agentic era, evolving the traditional internal developer portal into what its founders describe as a “system of record for agents.” The company believes the future belongs not to vertical point solutions, but to flexible platforms that organizations control themselves, enabling anyone, from developers to non-technical employees, to become builders. Rooted in the founders' experience of overwhelming developer workflows and ticket volumes, Port aims to centralize engineering context while making both humans and AI agents more self-sufficient. Their hybrid approach combines openness and commercial software, with public roadmaps, community contributions, and open-source integrations helping customers extend the platform while maintaining governance and control.The conversation also explored how AI is reshaping engineering organizations. Port is focused on creating the infrastructure around agents rather than building the agents themselves, providing visibility, permissions, governance, and a unified “context lake” for agent activity. As companies deploy increasing numbers of coding, security, SRE, and product agents, leaders need a control plane to understand what agents are doing and ensure they operate safely. The team is already seeing customers use Port to automate large portions of engineering support workflows, and they believe enterprises are adopting AI-driven workflows as quickly as, or faster than, mid-market companies. Internally, this pace of change requires constant adaptation, particularly across go-to-market teams, where education and flexibility have become more important than rigid playbooks.

TestTalks | Automation Awesomeness | Helping YOU Succeed with Test Automation
AI Agents in QA: How to Keep Up with AI-Driven Dev Velocity with Vilhelm von Ehrenheim

TestTalks | Automation Awesomeness | Helping YOU Succeed with Test Automation

Play Episode Listen Later Jun 2, 2026 35:23


AI coding tools promised to make development faster — and they delivered. But here's the problem nobody talks about enough: when you speed up coding, you don't eliminate the bottleneck in the SDLC. You just move it. And for most teams, it lands squarely in QA. In this episode, Joe sits down with Vilhelm von Ehrenheim, Co-founder and Chief AI Officer of QA.tech, to dig into how agentic AI is reshaping software testing from the ground up. Vilhelm brings serious ML credibility, he helped build Motherbrain, one of the earliest production LLM systems in venture capital, and he's now applying that experience to one of the hardest problems in software delivery: testing at AI development velocity. You'll learn how QA.tech's behavioral knowledge graph gives AI agents the context they need to actually understand your application, why validating user intent beats checking element identifiers every time, how autonomous agents can review PRs, reproduce bugs from Slack messages, and generate targeted tests without a single line of test code ,and what the tester's role actually looks like when agents do the heavy lifting. If you're wondering whether your QA practice can survive the pace of AI-driven development, this one's required listening.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

I'm excited to work with Microsoft once again as the presenting sponsors of the AI Engineer World's Fair! We'll streaming live from MS Build today for a special crossover pod with our friends at No Priors and the one and only Satya Nadella. However we did not hold back with this interview - we asked all the burning questions about uptime and Copilot that we know you have in your minds. Lets go!For almost two decades, GitHub has been the home of software, where both open source and closed flow, through commits, pull requests, reviews, actions, etc.This ecosystem flourished as open-source maintainers and contributors would continue shipping code for the benefit of the community. However as coding agents began to ship mass quantities of code - growing 1400% in 2026, it marked a new era that was both extremely exciting and challenging for GitHub.While these agents help more people ship more projects, they also significantly increase the floor of how much code is shipped, how often it is shipped, how many people commit code, and basically orders of magnitude multiples in every dimension of GitHub infrastructure:Now GitHub inevitably experiences more pressure on their infrastructure which was originally designed around human developers moving at human speed. This has resulted in a very publicly notable uptime story:So it begs the question of whether current systems around code can absorb what AI produces. Can CI/CD keep up when every idea becomes a build? Can open source maintainers survive floods of AI-generated slop contributions? Can GitHub preserve the human social contract of software while becoming the operating layer for agents?Which brings us to the perfect person to answer these questions: GitHub COO Kyle Daigle. In this episode, he joins swyx to unpack what happens when AI doesn't just autocomplete code, but starts changing how companies operate, how open source works, how pull requests get reviewed, and how GitHub itself has to scale. We go deep on GitHub's internal AI workflows: micro-skills, WorkIQ, MCP, Slack, Teams, email, Copilot workflows, the new Copilot desktop app, CLI, cloud agents, and how Kyle uses agents to look backwards across company context before deciding what to do next. Kyle also reflects on GitHub's history building webhooks, APIs, Actions, npm, Dependabot, and Semmle, why the AI era is breaking GitHub in new ways, how Actions became a general-purpose compute layer, and what Copilot becomes after code completion.Full Video PodWe discuss:* Kyle's expanded role across GitHub* How AI got Kyle coding again after years in leadership* Why GitHub rolls out AI through existing workflows instead of forcing new tools* WorkIQ, MCP, Slack, Teams, email, and GitHub as company context* Why massive “mega-skills” are giving way to small, atomic micro-skills* How AI changes summarization, communications, marketing, and analyst work* Why former developers in leadership may have a unique advantage in the AI era* Kyle's “15 agents on Saturday” workflow* How Kyle built an AI-generated executive presentation for CRO/CFO teams* Why AI changes the chief of staff role without removing the human work* GitHub Actions, webhooks, arbitrary code execution, and secure agent compute* The npm acquisition, supply-chain security, 2FA, and token invalidation* Slop forks, vendoring, and whether AI agents change dependency management* What pull requests become when most PRs come from agents* Prompt requests, vouching, AI review, and trust in open source* What counts as a “developer” when AI lowers the barrier to building* GitHub Spark, low-code, and why GitHub refuses to hide the code* 14x commit growth, Actions load, databases, monorepos, and availability* Copilot's evolution from completion to CLI, desktop app, cloud agents, and SDK* Context, memory, rules, and making GitHub “act like Kyle wants it to act”* Ambient AI, OpenClaw, enterprise security, and the new operating system for agents* What swyx should ask Satya Nadella about Microsoft's AI futureKyle Daigle* LinkedIn: https://www.linkedin.com/in/kyledaigle* X: https://x.com/kdaigleTimestamps00:00:00 Introduction00:03:36 Why AI Got Kyle Coding Again00:07:04 Running GitHub with AI: WorkIQ, MCP, Slack, Teams, and Skills00:15:39 The Golden Age for Former Developers in Leadership00:17:31 15 Agents on Saturday and AI-Generated Executive Work00:20:20 How AI Changes the Chief of Staff Role00:21:45 GitHub's History: Actions, npm, Webhooks, and Open Source00:28:45 Slop Forks, Vendoring, and AI Dependency Management00:33:57 Pull Requests, Prompt Requests, and Trust in Agent-Generated Code00:41:21 GitHub Stars, 200M+ Developers, and the New AI Builder Wave00:45:15 GitHub Spark, Low-Code, and Why GitHub Still Shows the Code00:47:38 GitHub's Hardest Era: 14x Growth, Reliability, and Scale00:59:21 Actions as the Compute Layer for CI/CD and Automation01:02:04 The State and Future of GitHub Copilot01:08:24 Ambient AI, Background Agents, and the Future of the SDLC01:13:09 OpenClaw, Enterprise Security, and the New OS for Agents01:18:03 Build Announcements, WorkIQ, FoundryIQ, and Microsoft Context01:21:41 What Should swyx Ask Satya?TranscriptIntroduction: Kyle Daigle's Expanded Role at GitHub and MicrosoftSwyx [00:00:00]: We're here with Kyle Daigle, COO of GitHub. Welcome.Kyle [00:00:07]: Hey, thanks for having me.Swyx [00:00:08]: You're not just CEO of GitHub. People know you as that. You have a new role.Kyle [00:00:11]: So I have an expanded role now. I've been working at GitHub for thirteen years and doing all things developer. Joined as a developer myself. And now, I'm also responsible as the CMO of Developer for Microsoft. And so all the kind of learnings and passion for developers and how we work with them and how we communicate and how we bring our products to market, we're also bringing that expertise to the broader Microsoft ecosystem and helping every developer that uses a Microsoft product or would like to have a sort of similar experience that they've had with GitHub over the years. So it's a different role in some ways, but it's also just building on the experience that I've had at GitHub of just sort of tell the truth, be authentic, show people how to use it and then let the products speak for themselves. Now just doing that with, all of Microsoft.Swyx [00:01:09]: We'll be releasing this in conjunction with Build. You got lots of stuff planned, and we can sort of touch on that whenever it's appropriate. I think one of the interesting things is I rarely meet a COO who's also a CMO. I think you're a very outward facing and you're very confident publicly. That's rare. Do you actually view yourself as COO? What's What is your thing?From GitHub Developer to COO/CMO: Building the Platform and Operating GitHubKyle [00:01:33]: I think for me, it's been funny. The titles have always been, a— have always felt a little strange to me. I joined GitHub as a developer? I wrote so much of theSwyx [00:01:46]: Let's bring that up. You wrote the back ends?Kyle [00:01:48]: I was going through, I was going through, some old photos, when folks were talking about how things were being built or how there was a build GitHub. I built, webhooks and worked with teams building the API, built the platform layer. Anything that integrated with GitHub, up until really twenty eighteen, I built or ran the engineering teams. And that's kind of where my the beginning of my passion always was helping people build things, deliver them to, their customers. And so being a developer, building for developers was always super unique. In a— I think as my role expanded, it became my ability to talk to not just developers, but also enterprise customers or business leaders and have this translation layer. And then through all those years, GitHub has always operated pretty uniquely. Post-pandemic, working remotely was not as novel as it was when GitHub started in two thousand and eight. But all that expertise of running remote teams, doing it well, became this sort of bigger role, ultimately turning into the COO role of how do we operate GitHub in the way that GitHub's always operated after the Microsoft acquisition. And kind of so on from there. So like for me, I think the— I've, I still code. I love coding but the problem has always been, people. It's a much harder problem to both support our own employees, a harder problem to communicate to developers and enterprise buyers what we're building why it matters, ‘cause those are two very different messages. And so getting to work in the mix of COO, CMO, also just being a dev, I think is what's kept me at GitHub for so long.AI Workflows for Leadership: Commits, Retrospectives, and ContextSwyx [00:03:40]: Apparently, you have— your commits have gone up. What's this? What's going on?Kyle [00:03:45]: Rui's called me out pretty aggressively. So I think— as you can imagine, right, you can see my normal era of being a dev In the twenty thirteen, twenty fourteen era, and then moving into management, and then ultimately the COO role. I think what you see there is me, really getting back to coding thanks to AI. I— similar to, attaching problems between how to market and how to operate a business and how to code, I find, building agents and workflows that are connecting very disparate problems to be what's driving this. So that's, some of it's writing software. A lot of it is, connecting a ton of a different data sources to, help me out. But that is completely me really diving in on the AI side in trying out our tools, trying out everyone's tools, But building for me, building for the non-technical leader, though I'm technical and how we're, able to use these tools more than just the simple, call and response that I think a lot of the non-technical, your employers, you have to get— you have to use AI, and so everyone uses, ChatGPT or Copilot or Claude or whatever. To really get into, how is this going to help me out, it— I find that it's not the I need to write a blog post, I need to those simple examples. Helping people find the workflows of, “Okay, I need you to go through all the PRs today. I need you to go through everything that we've posted online. I need you to go through what we did the last three months. Go through all of my Obsidian notes for any mentions of this then go through my transcripts at work.” We use, Teams, so, using WorkIQ, go call that MCP server, grab all the transcripts, go through all the Slack, and then build me out the plan of, what this week's messaging actually was. That's something that was, impossible because for me, I find AI in a what most of this launch here is actually, less building forward. It's actually, a recursive loop backwards. I'm always looking at what had happened first. Go back through the week and tell me what we did, what worked, what didn't work? And then tell me in the next three or four days-What would you tweak based on this sort of like looking backwards and then looking ahead a little bit? I find that to be so much more valuable, especially for like non-technical, because that retrospection is actually LLMs are very good at that. Like finding all the patterns, pulling them out, and then applying that retrospection to just a couple of days or just like a short period of time. Is all a bunch of apps that I've built and launched a bunch of, internal tools. I use the new, GitHub Copilot app, the desktop app with workflows. Every time I crack open my laptop, it's running workflows for me. It's just a ton of different stuff and of course, it all ends up on, it all ends up on GitHub.Swyx [00:06:47]: Of course. That's where, that's where, stuff is hosted. Man, there's so much to ask you. I was going to leave the how do you run a company with AI thing at the end. I have to ask one— double click one thing. You said, you are looking back at the week. You're, you're understanding what happens. When you say we That's three thousand people. How?Rolling Out AI Internally: Skills, CLIs, and Company ContextKyle [00:07:09]: I think when we started rolling out AI internally beyond engineering, right? One of the things that I was really, passionate about is like we have to do this in a way where no one has to change how they work. I don't want to have to teach you a tool. I don't want to have to teach you something new. And so for us, we tried out a few tools. Most of them don't work because I got to get you on board? I got to teach you how to use it. What we've actually ended up doing is we've built like a set of skills internally. We have we each have our set of skills, and we've just been distributing even to the non-technical folks, the CLI. And then effectively, we're just giving it access to like read about everything that we're writing. So that's for us, that's usually GitHub, Teams, Email, and Slack. So Teams for, video chat, generally speaking.Swyx [00:08:03]: Teams and Slack?Kyle [00:08:04]: so we use Teams for video communication, but we don't use it for chat. W-we— GitHub for a long history, right? We're alwaysSwyx [00:08:13]: Also SlackKyle [00:08:14]: Talking about ChatOps and like everything is built into Slack. Like every command, every flow.Swyx [00:08:18]: So even though you have been acquired for I don't know, eight years nowKyle [00:08:22]: we stillSwyx [00:08:23]: You still use Slack?Kyle [00:08:23]: it's a purpose-built tool for us, and I think the reality is that moving off of it would be so bluntly expensive? Simply because all the tooling is, baked in with that paradigm. And they both have their pros and cons but they don't work the same way at all. We still use a bunch of different tools Because it's the purpose-built tools that We need. And thenSwyx [00:08:47]: Well, the same doesn't go for the rest of Microsoft, presumably.Kyle [00:08:50]: like the like various teams like operateSwyx [00:08:53]: They make their own decisionsKyle [00:08:54]: Various ways. I think it just matters what you're trying to what you're trying to do. But we do we do work across kind of every tool that we use, and then by giving everyone access to all of that context and the new WorkIQ MCP server, which is quite cool if you do live in the M365 like world. I can ask it all these backwards-facing questions, and it's incredibly important for our teams that are working remotely. There's a lot of stuff you miss when you're not in an office, and we are spread out all over the world. So most of that is looking back. And then we post, we post either auto-automatically into GitHub issues or discussions, these sorts of like findings or like our industry reports. Like what's happening this morning, today, yesterday. A little automation gets run. We'll use the app. We might use GitHub Actions like with, our agentic workflows just to go do that run, and then we push it into GitHub, and w-we keep having a conversation. So usually for us, it's about that sort of like looking back, looking forward on the non-technical side. And then of course for a lot of those folks, it's also building an app, pushing it to GitHub pages or pushing it somewhere to host it et cetera. But it's just like enabling everyone with that power of it's going to take me a week to figure this out. Instead, we're going “Okay I built a skill. Let's put it into a repo. We'll all share that skill together, and then we'll use the CLI or now the app-” “just to run it.”Micro Skills vs. Mega Skills: How GitHub Uses AI at WorkSwyx [00:10:26]: All right. I think, I think we're going straight into like the team management and productivity thing. I think a lot of people are getting various levels of LLM psychosis. How do you manage the bloat of skills? Like everyone Has their thing, and they're Like trying to promote it to the rest of their peers in their org, right? And obviously, whoever becomes a skill influencer internally becomes like an AI leader, right? Of sorts. I assume you have those.Kyle [00:10:50]: like I think we haveSwyx [00:10:52]: And I assume it's a mess a Yeah.Kyle [00:10:54]: there's like I— like I think the reality is there's two pieces. Like first is I think that we're ending the era of these like massive, beautiful, perfect skills that are just like not any of those things. ‘cause for a while, right every tweet every day is like go download the skills, the perfectly managed thing to do this entire workflow. And I think that like what we've found and what— I was just with my team, this week, and we were talking about the skill side, and we're really talking about these like incredibly micro skills that are just doing one thing for us very well Versus a skill that's going to do I said, that full report. That doesn't really exist on our side anymore. It's usually how do— like a single skill that's going to identify the most important marketing information given any MCP server. Like this is the most important thing. Less about stitch a bunch of tools together and have it produce this mega output because then weeks go by, months go by, things change, and you want to tweakSwyx [00:11:58]: It's brittleKyle [00:11:58]: Your mega skill and you're screwed? You can't do that. And so now we're really just talking about the Legos we're using and just letting the instruction book be something we're all putting together. Whereas I think a lot of AI skills for a while have been that mega instruction book style.Swyx [00:12:15]: I've, thought a lot about Postel's law. I don't know if that's a term that is, means things to folks. It's the idea that you should be liberal in what you accept and strict in what you output, right? And I think that's like a good framing principle for skills. This is my skills, obviously on GitHub. I feel like everyone should have like how like some repos In GitHub are special repos? I feel like we should sort of reify the slash skills and everyone like give it some kind of special presentation. Anyway, so, yeah, this is one of those like download Download anything, transcribe anything, and then you can string together the atomic skills that do one thing well Into like some kind of orchestration skill that calls other skills. I assume, does that match?Kyle [00:12:56]: I like I think so. I think that theSwyx [00:13:00]: Summarize anything.Kyle [00:13:01]: Like I think the- For me, summarizing something for I do communications and PR and analyst relations and marketing and customer activities, and so my summarize everything is very different for each one of those like Contexts. What ‘Cause if I'm summarizing something for an analyst, that's a very different thing than, probably how I'm going to summarize something for like a customer meeting or an engagement. So that's I think like the difference when we're talking about the like the tools I might use on Saturday or the skills I might use on a Saturday when it's just for Kyle. Yeah, those are kind of like they have an atomic actual tool underneath or maybe skill, and then Kyle cares about X. But I think when we're talking about work and enabling the the marketers, communicators there, it's the atomic, this is what good summarization is, and then this is what I care about as for marketing for communications For whatever. And that I think is like the interesting matrix problem when we go from like a developer set of concerns to all kinds of different professions, is that what that word means to me is different than it means to you is different than it means to the analyst or the salesperson, and that's where I think the matrix mess is that we're starting to like still starting to find. It's about these mega skills but they're all just slight permutations, but those permutations are really important. It's the difference between someone reading this and going “Did AI make this?” what Or “This makes total sense, and I would expect this when I'm giving a briefing to Gartner,” or like whatever else.Swyx [00:14:37]: I think the beauty of it maybe is that you don't have to be that careful about what goes in there. It doesn't have to exactly fit as long as it like roughly is contained in there. I used to complain about plugin hell, basically. Like when you have a framework and then you have a hundred things that you need to integrate, everyone does like the GitHub used to be bloated full of these things. And now we don't need them anymore ‘cause now you just use skills.Former Developers in Leadership: AI as a Creation MultiplierKyle [00:15:00]: And like I think the most magical thing is the just that like I can just also crack it open. Like Like yes, I could go like change the how the plugin is coded, or like I could go do that now with AI, but I think there's just something more magical about getting a response back and being “That's not right,” and then you just crack the skill open, you just type English words and it's different. That building block is just, I think very unique. Once I get everyone to kind of understand how to best how to best make those changes to get the most power out of them.Swyx [00:15:36]: Is there a— you have a your peer group that Of people like you. Is there a common framing for Something I'm feeling is, which is true, is that is this a golden age for former developers who are now in leadership? Because you can wield the tools, you would know the right words, you're maybe not too close to the details. Doesn't matter. But like you're more effective than someone who doesn't come from that background.Kyle [00:15:59]: I think that like the secret has always been your ability to identify patterns and solve problems, and I think that for folks that like myself that don't code day to day anymore, that has made me successful as a developer, made me successful as a COO and now CMO. And so now that I have access to get and write code, I'm now applying that sort of like pattern finding and problem solving, and I know enough still about how to then go and say, “Oh, I want to make an app, but I don't want to break into jail or create something that's not going to be able to work or to be deployed scale or whatever.” that ability to apply all that additional business knowledge and still code I think is what makes that so interesting to me. Slightly different than I think some of the other like technical leaders that became business leaders and now are going back to their apps and updating them. Good for them? But I think the more, much more interesting thing is, well, now I have this whole new set of expertise over ten plus years. Why not take that and use that as a developer with these AI tools? So I definitely think that makes me more powerful, but I think that's true for like every dev as well. Most of the dev friends I still have also have some other underlying skill and passion. There's really talented, very kind of linear computer science software devs, absolutely. I just find that the folks that came from a different career, went to school for something else, went off and did this random thing, and then became a software dev, or were a dev, did a random thing, came back. Learning that extra set of information, learning those extra skills, and now having the power of an AI where I can crank up fifteen agents on Saturday while my kids are doing lacrosse, That's like really powerful. And I think it gets me back to that feeling of like creation, and it's very hard to replicate that in most other senses? That first time you build an app and you click it and you show someone that's magical. And so being able to do that not just in code, but across all kinds of different assets that's, that's huge. We were doing we're doing our every year we do our revenue planning. We talk about okay, what is it going to look like for next year? And of course as you imagine, there's, slideshows everywhere talking about what are we going to talk about, what's the narrative, et cetera. And so as you said I'm “Okay, well, I could probably just like build something to build this and then that way I don't have to go build the whole spreadsheet or I have to pass it to my team.” So we went through this process, and I got all the information and used the skills I mentioned. I built like a little app just to make it so I could look at some of the information in a SQLite database, more easily. And I ultimately built this entire presentation without touching any of it and I was “Okay, I'm just going to present this to our CRO, the CFO, their teams,” without mentioning I'd built it with AI. I like built a skill to make it look very much not AI driven. Just not pretty.AI-Generated Presentations, Human Taste, and the Changing Chief of Staff RoleSwyx [00:19:03]: Like a design. Yeah.Kyle [00:19:03]: Not pretty. But just like very clearly not AI. Kind of like don't do anything interesting.Swyx [00:19:08]: That's, yeah, that is valuable.Kyle [00:19:08]: Just go Exactly. We did the whole thing through. It used my notes from Obsidian, it used all the context I mentioned before, the plans, and Never came up once that it was AI generated.Swyx [00:19:20]: It didn't matter.Kyle [00:19:20]: Never once. D It didn't matter. And so now I takeSwyx [00:19:23]: This is a toolKyle [00:19:23]: I can take that tool and go, “Look, I don't want you to go build slideshows.” They're just helping us share information with each other. If this thing can do it With a little bit of crafting from you and then we can look at it together, awesome. There's no value in all that extra work. I think that the ability to, make it look humanly bad and and build a little app to, manipulate the data I think is part of, that upside for devs that are now in leadership roles. Because, the thing that I feel like I said before, this that's all a people, that's all a people problem. I know if you've used a coworker or not to build a slide deck, unless you spent a bunch of time to not do it.Swyx [00:20:07]: I know, but like it was so, I think there's a certain charm to just being blatantly AI. ‘Cause I think that you're well, you're just honest about There may be mistakes here that I cannot vouch for. So how much value is there? But anyway I think, actually the real question I want to ask is, there's a— You were a chief of staff To Thomas. And in the pre-AI world, the that job would've been a chief of staff job of like Can you prep me these slides and all that? And now you do it yourself.Kyle [00:20:35]: I still, I still have a chief of staff. Because, the difference is it's sort of the discussion every time we have some sort of technology evolution is it's not that the jobs the roles don't all go away, they just change? And so yeah, I don't have someone spending all their time building out slides for me and presentations ‘cause I don't need that anymore. But now I need that person that is able to go and find all the different connections between humans in those discussions to help me find out, okay, I should be meeting with this group and this team, and they have an opportunity, and I'm going to be in San Francisco today, I'm going to be in Seattle tomorrow. Those sorts of human connection aspects are still incredibly valuable and has always been a big part of that chief of staff role. But now just like chiefs of staff are not opening up, letters to process, they're doing emails. What It's the same thing. And now they're, they're not building out as many of these presentations because they have the the ability to have a AI take it on for, and share that with me and great. Let's keep moving ‘cause it's allowing us to go faster and make better decisions more quickly.Swyx [00:21:45]: Awesome. Well, so we can dive into more sort of, Productivity insights as you go. I did want to do a little bit of a brief history of colleague and hub. Because, we started here. And then you also involved the NPM acquisition. I did, I do want to touch upon that. And then more recently, I just want to bring up to present day where we're having uptime issues Which transparently we've already Addressed publicly, but we'll, we'll discuss in the pod. Did I miss anything? Like what, any other major highlights? Obviously, it's, it's a lot of years to cover.A Brief History of GitHub: Webhooks, Actions, Acquisitions, and Platform EvolutionKyle [00:22:15]: No the I think one of one highlight was right before the acquisition closed in twenty eighteen, I got to launch the first version of ActionsSwyx [00:22:27]: OhKyle [00:22:27]: At GitHub Universe. So it was OSwyx [00:22:29]: They're that young?Kyle [00:22:30]: It was October of twenty eighteen, I think. Yeah. Yeah.Swyx [00:22:33]: Gee, Jesus.Kyle [00:22:34]: I got to I was the engineering leader on that project and got to launch that. And then, yeah, we did acquisitions of NPM you said, Semmle, Dependabot Pul Panda a whole bunch of things. That was a bigSwyx [00:22:47]: Pul Panda.Kyle [00:22:48]: Abi is doing well.Swyx [00:22:51]: DX. Holy crap.Kyle [00:22:52]: Did well on DX. I and like that was a that was the big shift, after the acquisition. I had to join the sort of business side.Swyx [00:23:00]: So I need to hit you on some of these things ‘cause you were there. Right? And how often do I get to talk to someone who was there? But yeah, Actions. Is that the number one source of security issues on GitHub?Kyle [00:23:11]: Oh, sh I think that the number one source of, security issues is probably like all, the literal code in everyone's like underlying repositories. I would say back further than that is, if you remember I had to show in this graph was this is, I'm, didn't say this before, this is ultimately webhooks.Swyx [00:23:30]: You yeah.Kyle [00:23:31]: Like circa whatever it was.Swyx [00:23:32]: It says Hookshot in there.Kyle [00:23:32]: I forget. Yeah. Yeah, Hookshot's in there. And so like back then, it says GitHub Services. Do you see, it says Hookshot FE for front end, and then it says GitHub Services. GitHub Services back in the old days, right? You we had a repository that was Ruby code, and you could write any Ruby code in there, and then we would execute that On your behalf As a service, and then that way if an if you were trying to integrate with something, it didn't we would run it for you.Swyx [00:23:57]: And of course no containers ‘causeKyle [00:23:58]: No, ‘cause it wasSwyx [00:23:59]: Well, no containersKyle [00:24:00]: Twenty fourteen. And so there was some isolation obviously, but it was mostly the separations on the server level. That's like an example as long as the very old version of Pages, which ran on its own containerization infrastructure, not on Actions.Swyx [00:24:15]: Which like all-time great product.Kyle [00:24:16]: Pages powers the internet at this point to some degree. Those were places where like clearly there were no like issues like to my knowledge. But it was those things where I'm looking at and going “Okay, well we can't be running arbitrary Ruby code,” like on everyone's behalf. Then containerizing all of that up intoUh into actions now where yeah the containerization, is r-really good. The pinning most folks aren't pinning it the like to a particularSwyx [00:24:48]: ImagesKyle [00:24:48]: Sha, et cetera like their workflows, and so that's a big that's a big place Of pain for folks if they're just doing similar to any dependency management, just V1 or newest or latest, I think. But, that journey from that day to “Okay, we're just going to run all this arbitrary code, and, it'll basically be okay,” to now, no, we have, really good containerization. We have a new, underlying, ag-agent, containerization, service. It's like we're using it under the hood. It's through Azure. They recently announced it. The Azure, Dev Compute, but it's, very fast, very fast compute to be able to, spin up your own cloud agents, or whatnot. We're using it under the hood for some parts of the new,Swyx [00:25:36]: Microsoft Dev Box?Kyle [00:25:37]: No. Dev Compute, yeah.Swyx [00:25:41]: Hmm. Not finding it just yet.Kyle [00:25:44]: Oh, it's, it's in there somewhere.Swyx [00:25:46]: All right. Well, we'll cut that out.Kyle [00:25:47]: Sorry. But with, Dev Compute, you can, run, really fast, spin up really, small VMs really quickly, so you're doing a tool callSwyx [00:25:58]: Same conceptKyle [00:25:58]: Just do it containerize exact-exactly. So we're using that so definitely moving that direction to protect us from every every piece of code that we're ultimately running.Swyx [00:26:07]: look, that grows into the full SDLC? Code hosting was just the start and and then it's grown beyond that. Let's talk about NPM may-maybe ‘cause I think that's also, a very major point in the industry. I do think, it was looking for a home. It was, kind of struggling as a business, right? I don't know, I don't know how you would characterize that whole acquisition and how itNPM, Package Security, and Keeping the Internet RunningKyle [00:26:33]: like when we were talking to the team, I think the big thing for the both of us was to find a way to keep NPM, which was basically powering the internet then and way more so now to some degree running. Keep it going keep continuing to scale. It was having scaling problems, if I recall, back at that time. They were doing some rewrites. ItSwyx [00:27:00]: that's cute compared to now.Kyle [00:27:01]: Well, that's the thing is like when I'm talking to folks now, there's there's so many more underlying uses of NPM than there were back when we had them join in with GitHub. But that was ultimately the goal. It was really okay, we used to have pages. We have, the world's code. Let's make sure that we can keep NPM running well for the world. And we put a bunch of time and investment into fixing some of the underlying backend, changes, some of which we talked about some of the manifest work, et cetera. And then now, really trying to bring the the security posture of NPM up to speed. But, it is a unique challenge in that every move that we make to make it more secure will break a lot of people. And security is paramount. And also, we take it very seriously. We're, the any time that we have a problem with GitHub or we make a change that makes us more secure but hurts, there's, a snow day for developers or a really bad fire that they have to go put out. And so we've, have changed the 2FA policies. We've changed the way the tokens work. When we find tokens that have been exposed or potentially, exposed, we invalidate them, andSwyx [00:28:22]: I love that feature in GitHub. Yeah, it's greatKyle [00:28:23]: That creates issues, but, the but that's the thing is we're trying to push the community, forward without necessarily, doing something that is going to break the contract that's been for 15 years or close to it or some amount of years on NPM.Slop Forks, Vendoring, and the Future of Open Source Supply ChainsSwyx [00:28:43]: I think the— So now we're talking about, open source and publishing. And I think there's something here with what people are calling slop forks, which, I think Malta from Vercel is doing. And, part of me thinks, well, the way to get past any vulnerabilities, we just, let's just get rid of the concept of NPM. And we only publish source code. And anytime you want to import it you have your coding agent look at it and then adapt whatever subset you're going to use into your vendor it. But, the AI vendor it. Is that realistic? I don't know. Is it— Will that solve all our security issues? I don't know.Kyle [00:29:24]: I don't think it'll solve I so Mitchell was just talking Mitchell Hashimoto Was just talking about this today, and I think that I-in some ways, it's all all things, old or new again? Yeah, absolutely vendoring everything. Like I do I do remember twenty thirteen, twenty fourteen.Swyx [00:29:42]: This is Yeah. Let's, we must return toKyle [00:29:43]: That's what is We were vendoring everything. We were having actual discussions around, or at least I remember we were “Should we take this full thing?” “Why is this so big? We only need this one file.” And so I do think there's something true there where having either taking only what you need or the dependencies just getting incredibly small over time, I think will help to some degree, but it's not going to solve the fundamental problem, I don't think, because the vulnerabilities in an agent looking at them, there's time and time again, there's a million different ways in which we can convince an agent that this thing is, secure or not and pull it in. Or we can do static code analysis or runtime testing to say whether the code works or not. That is, I think, the step that needs to continue to be, invested in. The question is just on, how much scope. Should it be this enormous project that I'm pulling down, or should it be this piece? Either most companies are running some amount of security checking on the on the packages that they're bringing in or vendoring. That I think won't change. That's like what advanced security does to some degree, Socket does some degree. Like everyone is doing a piece of that. How we each do that like especially when we're talking to enterprise customers, is just like very different. No there's no one wants one single way to do it. And I think that's always been GitHub's, unique position in the world. I talk a lot to maintainers, I talk a lot to folks about this. It's we're— we rarely start like a process and a practice and like push it onto the community. We usually wait for the sort of like RFC process socially or literally, everyone agreeing, and then we'll cement something in. Because otherwise we'reMaintainers, RFCs, Vouching, and the Social Layer of TrustSwyx [00:31:35]: That fits your role in the ecosystem, yeahKyle [00:31:36]: We're GitHub. Yeah, we don't want to shape the whole thing. We want it to be figured out. But like how do you balance that like sort of Role in the industry to keep everything as secure as is possible and make sure that you're you're not going to be compromised as a human, ‘cause that's usually how it all happens. And Not not create a process or lock us into a flow that you're not going to or like Mitchell's not going to or other open source projects aren't going to like. That's always been a tricky balance for us, and I think that's something that we haven't talked about enough is we're not going to be able to fix everything for everyone in a way that everyone is going to like. So tell, help us, tell us what is working. When Mitchell was talking about, the Upvote, the upSwyx [00:32:22]: I was going to bring up his thing. Yeah.Kyle [00:32:23]: I forget what it Yeah. When he's talking to us, I was chatting with him and talking to him about this and I put it on Twitter and we talked to, also over DM, was “We're going to keep working.” but I think the important thing is I do actually want to hear what isn't working for you. And as, be as specific and clear for your project as is possible. And to every piece of credit over the many years that we've known each other through the industry, he's always done that and I appreciate that ‘cause there are places that we need to fix up, and we hear from him, and we'll fix up just like we do all other kinds of maintainers. But that that process between making those types of improvements and being more secure and like creating, I forget what he calls it's not the proof process, not the claims process. Do what I'm talking about? He has that he his projects have a way for you to kind of like,Swyx [00:33:13]: VouchKyle [00:33:13]: Vouch. Thank you. Yeah. He has like the vouch system for saying, “Hey, you should accept my PRs.” That's beenSwyx [00:33:20]: I just built this into GitHub. I don't know.Kyle [00:33:22]: Well, see, but that's the thing is that you say that and like he and his community really likes this and then I'll go talk to other maintainers and other maintainers, globally, and they're “No, this doesn't work for me.” And that is the tension, but also the kind of beauty of GitHub, depending on which way you look at it is we want to help maintainers, so we create all these tools to let you have more control over how much you take in from AI and PRs. But you can also use this. What You can go use this project, and if it takes off and becomes the kind of mostly standard, then yeah, we probably wouldn't enforce it but we would add it in because that's the flow that we tend to do?Swyx [00:34:02]: I hear a lot of people don't know the history of the pull request. And like like that's how, that's something that GitHub standardized basically.Kyle [00:34:08]: Yeah. It was a very messy process Like beforehand, and now the we have the benefit of it being the process? And now we have to go and Figure out the next best process or what adaptations change, or what does a pull request look like when eighty percent of your PRs are just coming from your agents and not From other devs?Swyx [00:34:31]: Do you like the prompt request idea from Peter?Kyle [00:34:34]: like I think that for each like each idea I think has its merits. I'm not, I'm not avoiding saying anything good or bad, but I feel like I've seen a version of we have that we have entire Thomas' store. Take all the assets of what you've built and put that in. I think that's got great ideas. There's all these various permutations of the PR flow, but I think the reason why there's not a single answer is ultimately we're trying to codify trust. We're trying to say “Okay, if Sean reviews this I'm going to trust it because you're Sean or you're the senior dev or you're the whatever.” And right now, when we are working in a flow where an agent writes code and another agent reviews code and then Kyle goes and looks at it the trust is kind of diffuse. And most of the tools that we're talking about are talking more about verification flows. We have more assets to look at, so I can probably say whether this is a good PR or not. But that still doesn't solve, I think, the human problem of I'm looking at a PR and I want to know if I can trust it. And we're still, we still tend to use human signals for that? Mitchell approving it or Kyle approving it or whatever. And so I think that's, I think that's why most of these options haven't really solved it is because, it's a social problem ultimately. It's a it's a human problem to review it and agree. Or you fully trust the tool and you're imbuing that tool with full trust Which I think in some cases that absolutely exists.AI-Generated PRs, Trust, and the Waymo AnalogySwyx [00:36:08]: And so like in the same way that there will be a tipping point in society when we don't allow humans to drive anymore Because machines are measurably better than Than humans. I'm looking for that tipping point, right? Like Mythos is ridiculously expensive. Someday we'll have Mythos on a desktop. I don't know. Will, does that change the equation?Kyle [00:36:30]: I think it's more I took a Waymo here, and I was on my phone and not looking around at all. There are other, self-driving, vehicles that I would not trust while, staring at the road. And I think that trust is something that isSwyx [00:36:48]: Is this a Zoox thing? What is itKyle [00:36:50]: I think that is both. I think that is both. LikeSwyx [00:36:53]: There's Zoox in this robo taxi. That's it. It'sKyle [00:36:56]: Well, depending on what level Of self-driving. But, my point is sort of that I think part of that is I strongly believe that's, a mixture of verifiable proof. Like how many accidents, how much data, and so on, and the human aspect of how I feel when I'm in this car, what it tells me, et cetera. And so that's why I think some of the like Some of these some of our AI tools tend to, imbue me with more of that feeling of trust, even if the data says this is 100% accurate. I feel like it takes more time for us to go, “Should I trust this or not?” And that's in the soft sense of, startups with high agency, weekend projects, and open source. And then there's enterprises and regulated industries and everything else, and that is an even harder problem to go solve because even when it is fully verified, not only do you have to have trust from the humans on the team, you probably have to have trust from multinational,Swyx [00:37:55]: Oh my GodKyle [00:37:55]: Multi governments around the world and regulating agencies. And so that's where I feel like until we tip over to your point on the sort of like human EQ side of it. I feel okay this feels okay I've been proven enough. Then the ball will start to roll a lot faster, where we'll end up getting to the “Okay, we can trust this,” and feel good about it in the Most difficult of cases.Reputation, Sponsors, Stars, and Bot Activity on GitHubSwyx [00:38:18]: If human trust is the thing that matters, I feel like GitHub as the developer social network could maybe do more there. Like vouchers are one system But, we have star counts, and then we have Contributor rights, and that's it. And I feel like there should be more in that space. I don't know if there's any other design decisions there.Kyle [00:38:37]: I think that one of the places that we don't really expose right now in this sort of way is, some degree of like hard trust and support, which would like for me is like sponsors is a good example of that.Swyx [00:38:49]: Ah.Kyle [00:38:49]: It like costs you something. To prove that I believe in your project and I trust you To some degree or I want to support you at the very least.Swyx [00:38:56]: Solve payments for open source. Why not?Kyle [00:38:58]: I think that I think that like as we keep moving forward, right, there's more and more projects where I'm, adding more and more dollars into sponsors personally because I want to like support them, but I also like know of I've probably never met them in person, but, I know of enough of their work that I want to support them. I think the thing that I don't love about stars or commit counts or anything else is ultimately, even with all of the various, abuse and de-spamming and deduplication work that we do or anti-abuse work that we do, these are all, not active social signals. They're passive ones that are ultimately gamifiable. And you may trust me, but another open source maintainer may not. And on what heuristic should you be, trusting me? That I think, is kind of where some of our thinking is right now. What signal from me is most important to you? You— If you can define that potentially, honestly in an agentic workflow that's what we see some of these open source projects do, where you have GitHub actions, and then you have like an agentic workflow that's calling AI, and you're setting these rules. Like if Kyle has submitted and gotten accepted PRs across any given project and has a social handle tied to his account in GitHub, and that social account's older than a certain amount. Really complex measures that matter to you ‘cause most open source projects have that heuristic built into their heads, if not written down in the contributing guidelines. You could take that and then go apply that and then just say, “Oh, we're not going to accept this PR.” Building something that is, I think, malleable to everyone's needs, is a little bit better, rather than going “Hmm, this account's too young.” Because what happens? The attackers just go and go and create a multitude of accounts, and they wait Until it ages up. Needs to have a certain amount of stars. That's how star inflation happens. Need to have a certain amount of reposSwyx [00:40:46]: Oh my God. YeahKyle [00:40:47]: With PRs. They all just create repos and submit PRs to each other, and then they come in and do something nefarious. And so, it's hard. It's hard to find the measure. So I think we're, we're looking more at how can we provide you tools so you can kind of choose what's best for you. And of course, we'll give you some standards. But the trust vector, gets down to I don't know, some version of like human digital ID like everyone's been talking about. Like how do I prove that it's meSwyx [00:41:13]: Give me your eyeballsKyle [00:41:14]: On the internet. Give me your eyeballs. Exactly.Swyx [00:41:18]: The I got to keep moving on Topics, but obviously I can go all day on this stuff because, I've been involved in GitHub and open source My entire professional career. Stars. Very superficial. Everyone knows it. But I think time to one hundred thousand stars is the fastest I've ever seen. Like people just reached that in I don't know, months. And then like at the same time I don't trust it right? Like how many of these are real or bot or like whatever. I don't know how to ask this but like what can we do about it? LikeKyle [00:41:49]: JustSwyx [00:41:49]: Is stars broken? Is stars fine?Kyle [00:41:51]: I think that there's kind of two, there's like two pieces. Obviously we're constantly like trying to find ways in which like your users are producing spam, which would, I would include like be like only doing star gamification. When we find them, we pluck ‘em out and we,Swyx [00:42:08]: But it's like a Whac-A-MoleKyle [00:42:10]: It's a hundred percent like a Whac-A-MoleSwyx [00:42:11]: There's no wayKyle [00:42:11]: Now, powered by AI to be helpful. But I think more so what I'm seeing is, a lot of the like fastest time to X tends to be because we're now inviting so many more people into like software development on GitHub That like the zeitgeist is just swarming? And it'sSwyx [00:42:32]: It's not just developers anymoreKyle [00:42:33]: And it's not you and I. Like like however you want to say like what a developer is it's not just folks who have been coding for a very long time. It's folks that have maybe started coding or only joined in since the AI era. And nowSwyx [00:42:44]: what's the latest Octoverse number? I know eighty million was my lastRem- member that a number of developers on GitHubKyle [00:42:50]: Oh, we're over 200 million now.Swyx [00:42:53]: Okay. Well, so you see?Kyle [00:42:55]: Like over 200 million developers now.Swyx [00:42:56]: But it's not developers, right? It's, it's people with a GitHub account.What Counts as a Developer in the AI Era?Kyle [00:43:00]: So, so this is, this is the biggest debate that I would say, everyone loves to have at GitHub at this point. From my perspective, right, I think that there's, there's clearly a difference between, professional enterprise developer and then developers. But I think that I think that the idea that we should be I don't know, splitting hairs or segmenting developers in the early era of software development is, not worth our not worth the time. SoSwyx [00:43:29]: When you get into gatekeepingKyle [00:43:31]: 100%Swyx [00:43:31]: What is a developer?Kyle [00:43:31]: 100%. ‘Cause I wasn't a developer when I started writing code? I was going toSwyx [00:43:36]: Oh, no. I made— I cloned a thing, seven years before I learned to code. And then I and then I wrote about my learning to code journey, and people Just called me a fraud ‘cause I had a GitHub account. And I'm “Well, no, I just use GitHub, but I don't know-” “I didn't know what I was doing.”Kyle [00:43:49]: I I remember that. I remember those sets of posts, and like that's, that's b******t. So I fight very clearly on the line of, if you create code, if you have an idea and you create it into some way of, I'm, I'm going to run it and use the app right now, you may still use AI in that moment, but that's okay. At some point you're going to do the next thing. You're going to create a big— You're going to have to learn about this database. You're going to fix a bug, whatever. We're all on some same journey, and those people are also hearing about the great new agent skill package or a new CLI tool or a new whatever. And those projects are going up because you want to be a part of this moment, just like I wanted to be a part of the Ruby community when Ruby was popping off when I started becoming a developer, and now I can just click the star button. And so I think that yes, there's clearly some amount of like spamming and game gamification that we're working against, but I really think we're just seeing this whole new cohort of folks that are moving from technology to technology because they're not working on a 20-year-old software application. They're working on a side app that they built on the weekend for their friends or for their new idea or whatever. And that's how you see these enormous charts going up and to the right with With stars.Swyx [00:44:59]: I think something that's remarkable is the persistence or, that GitHub extends to those folks. Usually when I see platforms go into a new audience, they usually have to, have like a second platform with a different name that wraps the main platform. But somehow GitHub has been able to sort of persist and extend, and it's friendly and whatever? So it's, it's nice.Spark, Low-Code, and Always Showing the CodeKyle [00:45:19]: I that's partially why I think as we've tried to move into I don't know, more like low-code-y things. We so we started working on Spark as like a way to, build an app and run it. I think that the reality is that we anytime we try to, kind of put even a veneer on top of it without when we put a veneer on top of something, we still always show you the code. That's kind of like a tenant. We're never going to, hide the code from you ever, because whatSwyx [00:45:52]: Why would you?Kyle [00:45:52]: That's, yeah, that's the whole point? However, I think that what we learned with things like Spark is that really the value of Spark for most devs is, easy runtime. And you may have a runtime or a host that you're going to use for that or you just build something and run it but, the package of making that even more simple isn't really needed for folks that are trying to build software and not just trying to build, an app, which is, slightly different, a slightly different goal. So I want to get you in, I want to get you comfortable. I think the best thing for me as, someone that did not traditionally come into software dev way back, I want anyone to be able to breach that chasm and not be in the I don't know, I feel like we're, we're still in an era of, STEM. I've got a 12-year-old and an eight-year-old, and it's “We got to get ‘em into STEM,”? Over and over. And I like I do, I do the things that good parents do. I was “Oh, you want to do coding?” “Yes, I want to do coding.” Do coding classes. But now they're just not afraid of doing software. And that's, I think, the thing that's honestly kept me at GitHub for so long. Anyone should be able to go and build a thing, just like I can go change a light switch in my house. I'm not going to go into the breaker box ‘cause I'll probably kill myself? But, I can go change that light switch. Everyone should be able to go and say, “This fricking app doesn't do what I want. I want it to work like this.” And that I think, is what's kind of kept us all connected with GitHub through the years and some and during the easiest of times or in the hard times because of that opportunity of, we're the home for all developers, and we want everyone to be able to have that feeling that we've had of, had an idea, I created it and holy s**t here it is.Swyx [00:47:37]: Here it is. All right, I'm going to try to do more spicy questions.GitHub's Hardest Scaling Moment: Growth, Agents, and UptimeKyle [00:47:42]: Great.Swyx [00:47:42]: Is it an easy time now or a hard time?Kyle [00:47:45]: Oh at GitHub? It's a hard time. Like, it's a hard time and also, I was just with my team and I said, “This is also, the best and most exciting time that I think I can remember at GitHub.” BecauseSwyx [00:47:57]: Best of times, worst of times. It's never oneKyle [00:47:59]: ‘cause we've we were talking about Octoverse reports and, usually we do an Octoverse report once a year, and we look at the numbers, and we say, “Oh my goodness.” I was at Universe in October saying, “This was the fastest year of growth that we've ever had,” right? And now we're doing more in a month than we did in a year last year.Swyx [00:48:20]: You're talking about PRs.Kyle [00:48:21]: Commits.Swyx [00:48:21]: Commits, yeah.Kyle [00:48:22]: PRs. Kind of like you name it by roughly every measure that we're looking at, there's some amount of sort of growth that is much bigger, and that is breaking our system in new ways, not old ways. Like webhooks were always notoriously, unreliable over the years?Swyx [00:48:38]: Whose fault is that?Kyle [00:48:39]: not anymore mine, but for a period of time, I'm sure you could pull up a tweet that was “It was me. I'm sorry.” but, now, that got rewritten at a scale level that is still working and is not having problems today. Now what we're finding isn't just the isn't the-The simple stuff that folks are on the sometimes on Twitter or on the internet are “Hey, why is this like this?” Sure. There's absolutely silly problems that we shouldn't exist. But now we're talking about, unique, novel permission problems that happen only at a scale across all different objects or whatever, that now we have to go rewrite this underlying system. And so it's, there are problems that yeah, caught us off guard, which I think I said. Like the growth is astronomical, but also we're making such material progress in that I'm excited once we're once we've kind of like reimagined the underlying foundation layer, or pieces of it at least, what's going to be possible when it's not just all of us and all the new people that are being developers and all of their agents and all the tools like working together. Because that'll still happen in that in that GitHub tool, that GitHub community. But it's a it's a hard day anytime we can't give you what you're looking for. We have the same problem internally. We operate through github. Com. Of course, we have backups when things go down and whatnot for our own operations but we feel it too. If it's not working it's not working for us, and that's kind of like the promise of dogfooding for GitHub. It's always been true. We're using the same tool you're using. We're not using a super secret version. We and so we also need it to be great for us for our customers of course for open source. And now an exponential growth of agents, Doing it too.Swyx [00:50:32]: I wanted to load for audio listeners who maybe haven't seen your tweets, whatever. So one billion commits in twenty-five. Now it's two hundred and seventy-five million per week on pace for fourteen billion this year, if growth remains linear. Is that still the pace? I don't know. It's been aKyle [00:50:48]: it's, it's speedingSwyx [00:50:50]: Roughly.Kyle [00:50:50]: It's still speeding up.Swyx [00:50:51]: It's, it's April, so yeah.Kyle [00:50:51]: Exactly. This was in April.Swyx [00:50:53]: All right. So basically you have fourteen x growth, right? Year on year on year. And I think that's a scaling issue. I think, I'm going to like try to really steel man this thing. People have experienced fourteen x growth. They haven't had your downtime. And that's like— C-can we go dig into that? Why? Like what's the— what broke? What are we doing to fix it? Like just anything for the community to reassure them.Why GitHub Reliability Is Breaking in New WaysKyle [00:51:18]: so there's a Like I was saying, there's a couple different places that we've seen the growth issues. Some of the growth issues, which is why we're t— I was talking about pushing hard on more CPUs is in actions in particular. More tools, more agents, more PRs mean more builds, more builds mean more CPUs. And so we are expanding through not just our data center, but obviously we were talking about moving to Azure and moving to, adding an additional cloud compute because we simply need more CPUs. Not as much GPUs. We definitely need GPUs too, but now CPUs are becoming a factor.Swyx [00:51:53]: It's very CPU heavy.Kyle [00:51:54]: Underneath the hood when it comes to some of the underlying services, we've been breaking up over the years our database infrastructure, so that way we have, more cognitive separation between our the various services. The place that we continue to have pain is in, permissioning. And so right now m-many of our permissioning layers sit into a database that we like internally call MySQL One, and old Hubbers will know what I'm talking about. And so we've been pulling things out of MySQL One for many years, because like and we use we use Vitess and we use other technologies to shard and we do it as one bigSwyx [00:52:31]: Famous thing, PlanetScale was born from this andKyle [00:52:32]: A hundred percent. Sam Old Hubber and friend. And so finding these opportunities to like break this out and then do that globally. The other thing that I think is interesting and both a unique opportunity and tricky is we also run everything I just talked about in a black box container with GitHub Enterprise Server for people that work on-prem. So we take everything I just said, and we also do it on-prem, and we also do all of that and we do it in a data residence setup for customers that need to have their data in a single location. Each of these has the unique characteristic around how we're sort of storing that data in MySQL or in a permissioning setup. That's where some of these outages have oc-occurred, where you're seeing it more like across the board rather than just like the one pieceSwyx [00:53:17]: Filling the databaseKyle [00:53:17]: Isn't quite working. Exactly. And so part of it is that. I think there's been some other places where agents are much more or more projects appear to be moving towards monorepo versus we were going the other direction for many years in the industry. Repos were smaller, but there were more of them, and now we're seeing the opposite. Repos are bigger, and there's, not fewer of them per se ‘cause there's new growth, but, we're just seeing many more big repos. Big repos, big monorepos have always had, a unique performance problem. Because each one, is slightly different if, particularly if the underlying blobs are incredibly big Inside the repos. And so we've done a ton of work that you pro— like most people haven't probably experienced, unless you're in this case of the monorepo. But that Git, infrastructure layer improvement does help the overall, system because, many of the improvements that make monorepos work better make all repo infrastructure work better. And so, I could kind of keep going down the line where it's another thing where we're moving out of, We're changing how we do j I'll just say job queuing for lack of a better, explanation changing the underlying technologies there.Swyx [00:54:32]: I spent two years being a job queuing guy, so.Kyle [00:54:34]: And so it's kind of a little bit of a little bit of piece by piece, and it's mostly because as we were— as it was built, we built everything in a way that assumed, I guess in some ways that the size of the pipe of work was going to remain the same. There's just going to be more people coming through each of those pipes. But instead now in places whereA git push was, generally a certain size for example, is now, no longer true.Swyx [00:55:03]: Oh, yeah.Kyle [00:55:03]: OrSwyx [00:55:05]: I push a thousandKyle [00:55:06]: On the average. 100%Swyx [00:55:06]: A thousand line commits like dailyKyle [00:55:07]: Same thing with PRs. Like PRs same thing. And like we've talked about optimizing that and making changes where, and there were technology choices that did not work there? And it got slow, and it didn't It was not fast. It did not do what the users wanted. And so we've been reeling that all out and going “Okay, that's just not right. Let's stop putting good money after bad and do it the do it the right way or the right way now.” So there's It's a it's a lot of things, not quite when I've experienced scale at GitHub historically, it's almost always two options that we've used. We go vertical scaling, particularly with databases, right? And we go horizontal scaling. Oh, we just have more people using this service. Great. We're going to add more servers, and we rack them in our data center, or we use it in a cloud. And now we're sort of in a like diagonal, where like vertical doesn't really work anymore. Horizontal isn't work either because we're all We all have some CPU or GPU constraints in the world now, and now we have to go in and like crack open services that have been running for 10 or 15 years and go, “Okay, the rules of this service have legitimately changed, and now we have to rewrite them.” None of this is an excuse. This is like we're We have to do the work. We have to make it better.Swyx [00:56:22]: actually as an infra guy, I'm “This is like one of the most fascinating scaling challenges I've ever seen.”Kyle [00:56:26]: That's that's, that's the thing that's the thing that it's hard for Like when we weren't talking about it publicly, and I was like I came out, and I was “Hey, I just want to explain what's going on.” Part of it comes from a very old GitHub ethos, which is it's our it's our uptime. It's down. W What I know you're a developer, so you're, you're inclined to want to understand more what's going on. But at the same time us going “Hey, this service didn't, perform the way we expected, and now we have to go change it,” we weren't We're not trying to hide anything from you i

Remotely Curious
Coming soon: Working Smarter season three

Remotely Curious

Play Episode Listen Later Jun 2, 2026 2:17


Modern work can be frustrating and chaotic—if you don't have the right tools. From context engineering to multimodal search, go behind the scenes and hear how Dropbox engineers are building AI that actually understands you, so you can focus on the work that matters most. If you're new to Working Smarter, we've travelled from the F1 track to the bottom of a lake, and heard real stories from chefs, doctors, lawyers, and founders about how AI is helping them do more of what they love about their jobs. But in our third season, we're talking to the people behind the tools—the engineers and product leaders building helpful, time-saving AI features into the Dropbox experience you already know and trust. You'll hear all about their work on agents, inference, security, and, of course, how the people building AI use AI themselves. ~ ~ ~  Working Smarter is brought to you by Dropbox. Find, organize, and share your work—all in one place—with context-aware AI from Dropbox. You can listen to more episodes of Working Smarter on Apple Podcasts, Spotify, YouTube, Amazon Music, or wherever you get your podcasts. To read more stories and past interviews, visit workingsmarter.ai This show would not be possible without the talented team at Cosmic Standard: producer Ben Montoya, sound engineer Aja Simpson, technical director Jacob Winik, and executive producer Eliza Smith. Special thanks to our illustrator Fanny Luor, marketing consultant Meggan Ellingboe, and editorial support from Catie Keck.  Our theme song was composed by Doug Stuart.  Working Smarter is hosted by Matthew Braga. Thanks for listening!

Open Source Security Podcast
Open source verification with Sal Kimmich

Open Source Security Podcast

Play Episode Listen Later Jun 1, 2026 31:54


Josh chats with Sal Kimmich about the current state of everything, and what we can expect next. Sal has some incredible insight into what we can expect to see due to the current wave of security bugs and incidents. There are some new features we will need in both our hardware and software to ward off the state of things. Since those features are years away, what we need in the short term is shoring up our SDLC programs. Sal has some really good medical examples and analogies for this one. It's a huge problem but not insurmountable. The show notes and blog post for this episode can be found at https://opensourcesecurity.io/2026/2026-06-verification-sal-kimmich/

Resilient Cyber
Securing the Agentic SDLC

Resilient Cyber

Play Episode Listen Later May 29, 2026 49:24


In this episode of Resilient Cyber, I sit down with Katie Norton, Research Manager for DevSecOps and Software Supply Chain Security at IDC, to unpack what application security looks like as AI moves from copilot to autonomous teammate across the software development lifecycle.We dive into:

Software Lifecycle Stories
Reviving Developer Passion with Paramu Kurumathur

Software Lifecycle Stories

Play Episode Listen Later May 29, 2026 33:39


My guest today is a good friend and colleague - and not to forget with whom I was a co-author for a book, Paramu Kurumathur.In this episode, Paramu discusses how his recent development work evolved from small Google Apps Script utilities copied and adapted from online examples to building AI-connected applications via APIs to tools like Gemini and ChatGPT, including enabling Q&A over his book content. He describes surprises from “conversing” with his books—especially that LLMs retain details he has forgotten—while noting key risks such as hallucinations and the need for precise prompts. He explains learning Cursor with guidance from our colleague, Raja, discovering that it can generate code, and rapidly producing a proof of concept that maps citizens to the Government welfare schemes using PDFs, Chroma DB, sentence transformers, and queues—work that took about a week instead of months. The conversation contrasts older development eras with today's dependency-heavy environments, argues many SDLC intermediate steps are compressed, and highlights transferable mid-career skills in requirements and problem translation, alongside concerns about limited debugging and testing depth.The timestamps are approximate and do not include the time for the intro. Add about 90 seconds to locate the section00:00 Welcome and Setup01:16 Rediscovering Coding via Apps Script02:02 Connecting Scripts to LLM APIs03:21 Talking to Your Own Book05:32 Hallucinations and Prompt Control06:50 Learning Cursor and Building a POC09:09 Old School Dev vs Modern Tooling12:00 AI Changes the SDLC13:36 Testing and Trusting AI Output15:10 Debugging Gaps and Assumptions16:34 Setting AI Standards17:20 Mid Career Transfer Skills18:48 Prompting Without Hallucinations20:21 Courses vs Learning by Doing23:25 Overcoming First Step Fear25:24 LLM Limits in Astronomy28:24 Cursor for Reliable Code29:37 Anybody Can Code Now31:13 Next Projects and Wrap Up

Software Engineering Radio - The Podcast for Professional Software Developers
SE Radio 722: Dwayne McDaniel on the Engineering Challenges of Secrets Management

Software Engineering Radio - The Podcast for Professional Software Developers

Play Episode Listen Later May 27, 2026 52:10


Dwayne McDaniel, developer advocate at GitGuardian.com, joins host Priyanka Raghavan to talk about the engineering challenges of secrets management. They explore what "secrets" really are in modern systems—far beyond passwords—including API keys, tokens, certificates, and machine identities, and how "secret sprawl" emerges across the SDLC. Drawing on reports from GitGuardian and Verizon, they discuss the growing scale of secret leaks and why credential abuse and phishing remain dominant attack vectors. They examine common leak points—from code repos and logs to CI/CD pipelines, containers, and SaaS integrations—and how cloud, DevOps, and AI tooling are amplifying risks. Priyanka quizzes Dwayne about recent supply chain attacks from pyPi and trivy ecosystems, highlighting recurring root causes like poor access control, long-lived credentials, and weak security hygiene. Finally, they consider detection, response, and modern solutions—short-lived credentials, secret scanning, and identity-based approaches like OWASP NHIR and SPIFFE/SPIRE—ending with practical advice for engineers to reduce blast radius and design for secure secret lifecycle management.

Paul's Security Weekly
AppSec Conversations on Agents, LLMs, and OWASP from RSAC - Merritt Maxim, Scott Clinton, Janet Worthington - ASW #384

Paul's Security Weekly

Play Episode Listen Later May 26, 2026 59:40


We showcase recordings from this year's RSAC. At RSAC Conference 2026, Scott Clinton, Co-Chair and co-founder of the OWASP GenAI Security Project, shares insights from the project's latest research, including new landscape guides and evolving approaches to securing generative and agentic AI systems. The conversation explores critical gaps in GenAI data security, the rise of AI-assisted development, and the immense growth of the OWASP community and sponsor ecosystem. Looking ahead, he outlines the most urgent risks and priorities shaping AI and agentic security in 2026. Then Merritt Maxim discusses how AI is affecting Identity and Access Management. Expect to hear this topic a lot throughout 2026, especially as the industry tries to figure out what's different or special about securing agent identities. We close with a chat with Janet Worthington about the impact of agents on the SDLC and how orgs are updating their controls to deal with code generated by humans and LLMs alike. Segment Resources: https://genai.owasp.org https://genai.owasp.org/resources/ https://www.scworld.com/podcast-episode/3905-keeping-up-with-the-owasp-genai-project-scott-clinton-asw-381 This segment is sponsored by The OWASP GenAI Security Project. Visit https://securityweekly.com/owasp to learn more about them! Visit https://www.securityweekly.com/asw for all the latest episodes! Show Notes: https://securityweekly.com/asw-384

Paul's Security Weekly TV
AppSec Conversations on Agents, LLMs, and OWASP from RSAC - Scott Clinton, Janet Worthington, Merritt Maxim - ASW #384

Paul's Security Weekly TV

Play Episode Listen Later May 26, 2026 59:40


We showcase recordings from this year's RSAC. At RSAC Conference 2026, Scott Clinton, Co-Chair and co-founder of the OWASP GenAI Security Project, shares insights from the project's latest research, including new landscape guides and evolving approaches to securing generative and agentic AI systems. The conversation explores critical gaps in GenAI data security, the rise of AI-assisted development, and the immense growth of the OWASP community and sponsor ecosystem. Looking ahead, he outlines the most urgent risks and priorities shaping AI and agentic security in 2026. Then Merritt Maxim discusses how AI is affecting Identity and Access Management. Expect to hear this topic a lot throughout 2026, especially as the industry tries to figure out what's different or special about securing agent identities. We close with a chat with Janet Worthington about the impact of agents on the SDLC and how orgs are updating their controls to deal with code generated by humans and LLMs alike. Segment Resources: https://genai.owasp.org https://genai.owasp.org/resources/ https://www.scworld.com/podcast-episode/3905-keeping-up-with-the-owasp-genai-project-scott-clinton-asw-381 This segment is sponsored by The OWASP GenAI Security Project. Visit https://securityweekly.com/owasp to learn more about them! Show Notes: https://securityweekly.com/asw-384

Application Security Weekly (Audio)
AppSec Conversations on Agents, LLMs, and OWASP from RSAC - Merritt Maxim, Scott Clinton, Janet Worthington - ASW #384

Application Security Weekly (Audio)

Play Episode Listen Later May 26, 2026 59:40


We showcase recordings from this year's RSAC. At RSAC Conference 2026, Scott Clinton, Co-Chair and co-founder of the OWASP GenAI Security Project, shares insights from the project's latest research, including new landscape guides and evolving approaches to securing generative and agentic AI systems. The conversation explores critical gaps in GenAI data security, the rise of AI-assisted development, and the immense growth of the OWASP community and sponsor ecosystem. Looking ahead, he outlines the most urgent risks and priorities shaping AI and agentic security in 2026. Then Merritt Maxim discusses how AI is affecting Identity and Access Management. Expect to hear this topic a lot throughout 2026, especially as the industry tries to figure out what's different or special about securing agent identities. We close with a chat with Janet Worthington about the impact of agents on the SDLC and how orgs are updating their controls to deal with code generated by humans and LLMs alike. Segment Resources: https://genai.owasp.org https://genai.owasp.org/resources/ https://www.scworld.com/podcast-episode/3905-keeping-up-with-the-owasp-genai-project-scott-clinton-asw-381 This segment is sponsored by The OWASP GenAI Security Project. Visit https://securityweekly.com/owasp to learn more about them! Visit https://www.securityweekly.com/asw for all the latest episodes! Show Notes: https://securityweekly.com/asw-384

Application Security Weekly (Video)
AppSec Conversations on Agents, LLMs, and OWASP from RSAC - Scott Clinton, Janet Worthington, Merritt Maxim - ASW #384

Application Security Weekly (Video)

Play Episode Listen Later May 26, 2026 59:40


We showcase recordings from this year's RSAC. At RSAC Conference 2026, Scott Clinton, Co-Chair and co-founder of the OWASP GenAI Security Project, shares insights from the project's latest research, including new landscape guides and evolving approaches to securing generative and agentic AI systems. The conversation explores critical gaps in GenAI data security, the rise of AI-assisted development, and the immense growth of the OWASP community and sponsor ecosystem. Looking ahead, he outlines the most urgent risks and priorities shaping AI and agentic security in 2026. Then Merritt Maxim discusses how AI is affecting Identity and Access Management. Expect to hear this topic a lot throughout 2026, especially as the industry tries to figure out what's different or special about securing agent identities. We close with a chat with Janet Worthington about the impact of agents on the SDLC and how orgs are updating their controls to deal with code generated by humans and LLMs alike. Segment Resources: https://genai.owasp.org https://genai.owasp.org/resources/ https://www.scworld.com/podcast-episode/3905-keeping-up-with-the-owasp-genai-project-scott-clinton-asw-381 This segment is sponsored by The OWASP GenAI Security Project. Visit https://securityweekly.com/owasp to learn more about them! Show Notes: https://securityweekly.com/asw-384

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Take the 2026 AI Engineering Survey and get >$2k in credits and AIE WF tickets!This was recorded before Railway suffered a major GCP outage on May 19, despite being a multi-AZ, multi-zone mesh ring, with HA fiber interconnects between their Metal GCP AWS, because workload discoverability was unintentionally still tied to GCP. All has been resolved with a post-mortem.Railway did not start as an AI infrastructure company.It was founded in 2020 years before agents became the default way people thought about deploying software. Jake Cooper, formerly at Bloomberg and Uber, started Railway with a simple obsession: the activation energy to ship something to production should be near zero. Push code, get a URL, iterate. No Docker files, no Kubernetes manifests, no Ansible scripts stacked on Ansible scripts.For years, this was a slow grind. Railway spent its first 18 months hand-acquiring its first 100 users with Jake personally greeting every Discord signup on a second monitor.Today, Railway has raised $124m and is growing very fast. A 35-person team supports 3 million users, adding roughly 100,000 signups a week. Their bare metal data centers have a 3-month payback period vs. renting in the cloud, with 70% margins funding aggressive cloud bursting when needed. The servers they own have actually appreciated in value as RAM prices have climbed basically meaning the value of their hardware now exceeds the capital they've raised.From rebuilding Railway's network overlay over a weekend to moving the vast majority of workloads onto its own bare metal data centers, Jake Cooper is trying to build a new cloud for an agent-native world. In this episode, Railway's founder and “conductor” joins swyx and Alessio to unpack why the next era of software infrastructure is not just “Heroku but newer,” what agents need that humans did not, and why the old deployment loop of Git, PRs, CI/CD, and static cloud resources may be heading for a rewrite.We go deep on Railway's infrastructure stack: own-metal data centers, three-month cloud payback periods, cloud bursting, data center debt, Railpack, Nixpacks, Temporal, feature flags, Central Station, content-addressable filesystems, agent-safe production forks, and why the CLI may become more important than the canvas in an agent world. Jake also shares the founder journey behind Railway, how the company survived losing $500K/month, why it now serves millions of users with only 35 people, and why he believes the pull request is dying.We discuss:* How Railway went from a slow six-year grind to adding 100,000 users a week* How Railway thinks about agents as the next dominant software species* Why agents need version control, observability, compute, storage, and orchestration at 1000x scale* The economics of Railway's own-metal data centers and three-month payback* How Railway uses cloud bursting while scaling its own infrastructure* Why data center debt can be a better tool than venture debt for infra startups* Central Station, Railway's internal system for clustering customer feedback and incidents* Why responsible disclosure and over-communication matter for platforms* Why feature flags, progressive rollouts, and shadow traffic are essential for agents* Temporal's strengths, pain points, and why workflows matter for agents* Railpack, Nixpacks, Nix, and lazy-loaded content-addressable filesystems* Why “cattle, not pets” may change if you can clone the pets* Why Railway is building a new cloud from scratch instead of copying hyperscalers* The solo founder path, focus, writing, and how Jake thinks about company buildingRailway:* Website: https://railway.com/* X: https://x.com/RailwayJake Cooper:* LinkedIn: https://www.linkedin.com/in/thejakecooper/* X: https://x.com/JustJakeTimestamps00:00:00 Introduction: What Is Railway?00:02:07 Jake's Path to Railway00:06:13 Railway's Six-Year Growth Story00:08:52 Rebuilding the Business After the Free Tier00:11:17 Agents as the Next Software Platform00:13:29 Railway's Infrastructure Philosophy00:15:42 Bare Metal, Cloud Economics, and the Compute Crunch00:17:22 Cloud Bursting and Five-Cloud Networking00:20:20 Data Center Debt and Infra Financing00:23:31 Data Centers in Space00:25:24 What Agents Need From Infrastructure00:28:24 CLIs, Canvas, and Agent-Native UX00:35:15 Central Station, Incidents, and Responsible Disclosure00:40:30 Safe Rollouts, SRE Agents, and Production Forks00:45:00 AI SRE, Specs, Code, and Tests00:48:24 Self-Replicating Infrastructure and the New Serverless00:53:18 Heroku, Temporal, and Workflow Engines01:04:07 Railpack, Nixpacks, and Lazy-Loaded Filesystems01:06:01 Coding Agents, Token Spend, and Roadmap Acceleration01:10:56 The Pull Request Is Dying01:12:28 Feature Flags and the Agent-Era SDLC01:16:15 Cattle, Pets, and Cloning Machines01:19:29 Solo Founder Lessons01:24:12 Focus, GPUs, and Building a New Cloud01:28:20 Closing ThoughtsTranscriptAlessio [00:00:00]: Hey, everyone. Welcome to the Latent Space Podcast. This is Alessio, founder of Kernel Labs, and I'm joined by Swyx, editor of Latent Space.Swyx [00:00:10]: Hey, hey, hey. Today we're in the studio with Jake Cooper of Railway.Alessio [00:00:14]: Conductor of Railway.Swyx [00:00:15]: Conductor at Railway. Yeah.Alessio [00:00:16]: Choo-choo.Swyx [00:00:17]: Do you actually have that anywhere, like on your business card?Jake [00:00:20]: We call some of our volunteer moderators conductors. I don't have a business card. We're not that big yet. At some point I will. I got handed a nice business card from the Supermicro folks, and I was like, “Damn, this is pretty official.”Swyx [00:00:30]: Business cards are coming back.Jake [00:00:32]: They're cool. They're hip. The conductor thing is good. We're trying to figure out what we want to call each other internally. Some people think it's super cringe and say, “You don't need a name for people internally.” Some people want to call each other something. We still don't have a really good one.Jake [00:00:55]: We've got New Railcrews, Trainiacs. Nothing has stuck yet.Swyx [00:01:00]: I like Trainiac. Trainiac sounds good. Railwayians. For those who don't know, what is Railway? Let's give people a crisp definition up front.Jake [00:01:09]: Railway is the easiest way to ship anything. You go to the canvas, or you talk with Claude, and you say, “Deploy a Postgres instance, deploy my GitHub repository, run this code,” and you're off to the races.Swyx [00:01:22]: You've got a nice animation on the landing page.Jake [00:01:24]: Thank you. None of my work, by the way. They don't let me touch the design stuff anymore.Jake [00:01:25]: We want to make it trivially easy not just to deploy things, but to evolve applications over time. Most tooling right now stacks entropy on top of entropy: Docker, Kubernetes, Ansible scripts, and all these other things. If we can version all of your software and keep track of all the changes, then we can make it trivial to clone environments, fork into a parallel universe, get copies of production data, get copies of any services, make changes, validate them, and collapse them back in without reproducing everything across a staging environment.The Railway Origin Story: From Uber Systems to a New CloudSwyx [00:02:07]: I was looking at your background: Bloomberg, Uber. Nothing immediately stands out as, “This guy is going to found the next great platform as a service.” What prepared you for Railway?Jake [00:02:21]: It was curiosity to keep going deeper. I started out on front-end stuff, working on Wolfram Mathematica and porting it over. Then I briefly moved to Bloomberg, then toward Uber and distributed systems, taking the Jump Bikes systems and moving them to a distributed system built on top of Cadence, the pre-Temporal Temporal.Swyx [00:02:44]: Which, by the way, I'm happy to talk about, pros and cons.Jake [00:02:48]: Totally.Swyx [00:02:51]: But let's do the Railway story.Jake [00:02:52]: It has been a continual step of wanting an experience. Whether it's walking up to a bike, unlocking it, and having it work frictionlessly, or something else, the depth required to make that happen follows from the experience. A lot of the work I do, and a lot of the team does, is in service of that experience. We fundamentally don't care how deep we have to go. We will swim to the bottom of the swimming pool to get the experience.Jake [00:03:17]: I don't have a physics PhD. I did an EECS degree. It has always been about figuring out the next step: how do we get there? That's what led to starting Railway for that experience and then moving all the way to bare metal data centers. I was adding patches to the kernel this week to get the experience there because I can see how much better it can be.Swyx [00:03:49]: Other patches to the Linux kernel this week?Jake [00:03:51]: Yeah. Not upstream. Our fork.Swyx [00:03:52]: That's a flex. Railpack? No, this is different. This is the OS on top of Railpack?Jake [00:03:57]: No, this is an actual kernel patch. It's always literally: what do we have to do to get that experience? Then figure it out. Anything is figureoutable.Swyx [00:04:10]: Would you send the patch upstream, or does it not fit other use cases?Jake [00:04:13]: Maybe. We have to work out the experience internally. It has to do with the storage layer we're building for some of the agentic stuff. Maybe it'll be useful upstream, but it's deeply useful for us internally.Open Source, Forks, and Non-Deterministic VersioningSwyx [00:04:29]: You mentioned open source before. How do you think about starting from open source, and then coding agents letting you do a lot more from forks of it?Jake [00:04:38]: GitHub's original sin is that it's almost a series of broken pointers. You have this thing, then you clone it, and now you've lost the whole upstream. How do we make it trivial for people to modify really small pieces of it?Jake [00:04:51]: We think of Git in a discrete sense: I've either made a change and merged upstream, or I haven't. What would it look like if it were percentage-based, a little more non-deterministic, or a stream of changes that users traverse as a percentage rolled out in general and then rolled all the way up?Jake [00:05:13]: We have the open-source kickback program and let you deploy templates because we want to make it trivial for people to version these shards over time. It solves a large problem around authentication, authorization, and security. NPM has a way to define, “Don't take any new packages.” The ideal end state is that you roll out progressively to users with the minimum impact zone and continue rolling up. JPMorgan should probably be the last one on the patch line, for all our sakes, because our money and livelihoods are there.Jake [00:05:53]: It's okay if Johnny Vibe Coder gets a broken patch because there's so much entropy in the system that the rubber has to meet the road at some point. You have to test at varying levels.The Long Grind: First Users, Free Tier, and Making the Business WorkSwyx [00:06:13]: I wanted to pull up this glorious chart, which is your usage or number of daily signups?Jake [00:06:22]: Daily signups, I think.Swyx [00:06:24]: You started six years ago. It was a slow grind, and now you're on a rocket ship. You say, “Don't doubt your fight and don't quit.” Maybe pick out certain points that were key inflections for the company.Jake [00:06:40]: At the start, it's about getting your first 100 users, hell or high water. We had a website and a support link. The support link was the Discord channel. I had notifications on with two monitors: the monitor I was working on and the other monitor with Discord. If anybody came in, I was immediately like, “Hey, how's it going?” It was rare, so getting those first 100 users to come back was the start.Jake [00:07:14]: Then you build a consultancy factory because users want all these things. You have to go back to the board and ask, “What is the actual product offering I want to build on top of this?”Jake [00:07:28]: VCs want charts that always go up and to the right, but in reality you don't necessarily want charts that look like that. For us, there have been periods of expansion where we add features to test use cases, and periods of compaction where we ask, “If the experience we have is good, how do we make it significantly better?” Maybe we strip out features that don't fit our ICP anymore.Jake [00:07:57]: The boom from 2022 to 2023 came from the free tier. Everybody under the sun was using it.Swyx [00:08:09]: A lot of Reddit bots and Discord bots.Jake [00:08:12]: And crypto miners. When you build an open product on the internet where anybody can sign up, the internet is a horrible place with so many things. You go through periods of asking, “How do I reach as many people as possible?” Then, “How do I fit the exact use case for the people who really matter and are really excited about this specific thing?”Jake [00:08:39]: Then there was a two-year period of making the actual business work. During the free-tier era, we were losing about half a million dollars a month.Swyx [00:08:59]: On a $20 million bank account.Jake [00:09:02]: On a $20 million bank account with maybe $50,000 a month in revenue. That's a horrible business. I don't know how anybody invested. But you have to go through it and say, “We have an experience people love, but the business has to work.”Jake [00:09:17]: There are two schools of thought. You can run the horrible business all the way up with bad margins, or you can go back and make it work. We've always wanted a super lean team. We're 35 people right now. It's very small.Swyx [00:09:36]: Supporting three million already?Jake [00:09:38]: Yeah. We're adding 100,000 users a week right now, so it's growing fast. We don't want to add headcount for the sake of headcount or throw bodies at problems. We want to build systems. It's hard to build systems during expansion because you're adding things to the system because people are asking for them or things are breaking.Jake [00:10:00]: We had to cut off the free users for a little while, rebuild the business, and make sure it worked. We want to reach as many people as possible because software is important. It's become difficult to create things in the physical world, so it's important to make it easy for people to build in the virtual world and have access to creation. But there are legs to that journey.Jake [00:10:30]: You can see divots in the charts. If you follow between 2025 and 2026, it's either summer or winter. People go on holiday with family.Swyx [00:10:50]: It affects that much?Jake [00:10:51]: Yeah. It's kind of B2C and kind of B2B. People are shipping constantly, then they stop. Our activation curve now shows more people activating on weekdays because we have more business users, so it smooths out over time.Agents as the New Interface to DeploymentSwyx [00:11:17]: Was there a point where you started prioritizing AI development or agent development?Jake [00:11:24]: We've prioritized agentic as a top-of-funnel thing. Over the last six months, we've deeply prioritized agentic as a mechanism to build and deploy things because we believe the curve is so steep and that is how people will build and deploy software.Jake [00:11:42]: It almost fundamentally doesn't matter whether this is dot-com or not because we're all on the internet anyway. If agents are going to deploy a bunch of things and we hit an inference wall at some point, we'll fix those problems. The dominant species over the next 10 years is that we've moved from assembly to C to C++ to JavaScript to words. You're going to need to close that loop.Swyx [00:12:13]: When you say this is dot-com, did you mean buying the domain, or the general case?Jake [00:12:17]: I mean the dot-com era, when companies had a huge run-up because people understood the internet was important. Then they hit bottlenecks, fundamental laws of physics, math didn't work, and everybody came back down to earth. But it didn't matter because the internet became so impactful. If you operate on a long enough time horizon, you should build these things anyway because you can see where it's going.Jake [00:12:45]: That's where I think a lot of agent stuff is. You get to a point where you're running thousands of agents in parallel. What is the inference cost? What is the compute cost? How do you make that efficient? How do you coordinate all this? We have issues coordinating humans; we don't even have good tooling for that. Now we have to figure out how to get agents to coordinate, safely version changes, and know when to raise their hand for someone to intervene. Otherwise it becomes an interrupt factory.Railway's Infrastructure Thesis: Network, Compute, Storage, and MetalSwyx [00:13:19]: Let's go right into the technical side. What are the core infrastructure or architectural beliefs of Railway that allow you to do what you do?Jake [00:13:29]: The primitives matter a lot for us. We need network, compute, storage, and orchestration around it. You need control over a lot of those things. We've talked a lot about how we don't really use Kubernetes because we want higher-order control to place workloads in very specific places.Jake [00:13:48]: The reason is that you have to be very efficient with agents: memory reuse and all these other things, or you're going to massively blow up your cost structure. Being able to rack and stack your own servers and build your own metal unlocks performance and cost. Experiences where you're running 1,000 agents in parallel are not massively cost prohibitive.Jake [00:14:13]: Token use and compute use are blowing up. Over time, those things have to get a lot more efficient. You can get a lot of margin to make those experiences solid by building your own metal. That's all in service of offering a differentiated experience to as many people as humanly possible.Swyx [00:14:51]: You have a data center in Singapore.Jake [00:14:53]: Yeah. We have two in every other region now. In Singapore, we're adding a second one in Q3.Swyx [00:14:58]: What's it like? I've never built a data center. Do you go to Equinix and say, “I want some slots?”Jake [00:15:05]: Yeah. Equinix. You basically go and say, “I want power and I want a cage.” They say, “Great, here's what it's going to be.” You rent the cage for a period of time, fill it with racks and servers, and hook up internet to it. That's all the pieces.Swyx [00:15:36]: Then you handle everything else.Jake [00:15:37]: You handle everything else.Swyx [00:15:39]: What's the math versus clouds doing it for you?Jake [00:15:43]: If we rented in the cloud, our payback period when we go to metal is about three months.Swyx [00:15:50]: Which is crazy.Jake [00:15:51]: It's nuts. That's four years of depreciated hardware. You're going to see a lot of this compute crunch because hyperscalers are buying up a lot of stuff. We're working directly with OEMs, resellers, and people building these machines: Supermicro, Dell, and others.Jake [00:16:11]: Upstream, there's a bunch of supply pressure. When we raised our last round, between deploying capital for servers and now, the amount of money we've raised is less than the amount of money we have in the bank plus the value of the servers because the servers have appreciated as RAM has gone up. It's nuts how valuable hardware has become.Jake [00:16:50]: If you look at hyperscalers, they deployed around $80 billion of capital expenditures this year, and next year will be more. That's a massive infrastructure build-out. You look at that and think it's crazy that they're spending way more than the Manhattan Project. But if every person is going to run dozens or hundreds of agents in parallel, you have no conceptual idea how much compute is required to make that experience happen, even if you're deeply efficient and sharing resources. And that doesn't even count inference.Swyx [00:17:22]: How do you plan the build-out? The growth chart is so vertical. Are you usually at 100% utilization as soon as racks are live? How far ahead are you planning?Jake [00:17:33]: We still maintain cloud presence for bursting. We work with AWS, GCP, and a few other clouds. We can rent, and then the moment we get space or power, we compact those workloads off the cloud. We started on the clouds, then built a system to migrate to our own metal. There's nothing that says you can't continually do that again, and that's exactly what we do. We never want to be compute constrained.Jake [00:18:09]: At the start of the year, we actually became compute constrained because one upstream provider wasn't able to give us quota at the rate we needed, and the hardware was slower. I spent a weekend rebuilding our entire network overlay so we could straddle five clouds: Oracle, AWS, ourselves, GCP, and one other one. We can do more than that now.Jake [00:18:38]: We got into a spot where we were trying to pack instances tight because we couldn't get enough compute. That led to a few reliability issues, which are now past us. I made a tweet pointing out that it's becoming harder and harder to acquire compute at the rate these models need to acquire compute. We got bit by it.Swyx [00:19:15]: How do you think about pricing knowing you might not have your own metal available at all times? Are you pricing assuming you need extra margin if you end up going into the cloud?Jake [00:19:26]: Because we've built out our metal data centers, our margins on metal are around 70%. We can deeply subsidize the cloud business if we want to scale at a reasonable rate. We have a few levers: metal, which makes the margins; cloud burst; debt to buy servers; and venture capital. It's an interesting operational problem: how much cash do we have, how much should we raise, how quickly can we deploy it, and can we scale revenue as quickly as we scale compute?Jake [00:20:05]: If we continue making it trivially easy for people to build and deploy, then the faster we close that loop and the more operationally excellent we are with capital, the faster the business can scale. It's almost a straight linear deployment rate.Financing Infrastructure: Hardware Debt, VC, and Operational LeverageSwyx [00:20:20]: I think infra startups raising debt is a tool people don't utilize enough or know enough about. What can you tell us about that? Is it secured against your CPUs?Jake [00:20:32]: It's secured against our hardware.Swyx [00:20:37]: What rates do you get? Who are the lenders?Jake [00:20:39]: We pay prime plus a spread, and we can refinance any of the debt as rates go down. The terms are pretty good. The unfortunate thing is that Twitter has no nuance, so people say, “Venture debt bad.” But as with all things, there are specific tools and areas where you can be deliberate instead of using one tool as a hammer. Venture capital is not the hammer for everything. You have to explore and figure out what works.Swyx [00:21:12]: VC is usually the most expensive financing you can get.Jake [00:21:15]: Yeah. I also think people think about VC incorrectly from a capital-raising perspective. Most people think, “How do I raise as much money as possible from whoever is probably the best I can get at that time?” That's close to right, but what we've tried to do is figure out what unfair advantage we can buy with that equity.Jake [00:21:34]: It's the most expensive equity you're going to give away at that point in time, assuming the company keeps getting better. How do you use it to work with someone stellar who complements you? In the seed stage, I had never started a company. Ray Tonsing had good advice, and I could text him all the time. He was really fast. Awesome.Jake [00:22:01]: Then with John and Erica at Unusual, they said, “You roughly know what you're doing building a product. We'll mostly leave you alone and be available for advice.” Amazing. Then we got to Series A and the business was an operational tire fire because we didn't know how to scale a business. Work with Erica, and Jordan is over at Redpoint, so bonus.Jake [00:22:28]: Now we've raised from TQ and FPV as we're moving into enterprises. Every step of the way, we've asked: who can we partner with at this specific time to unlock the next section of the journey? I don't know enterprise sales. As an engineer, I can eyeball what features we might need, and we have wonderful people internally who can help. But you want boardroom dynamics where everyone is aligned and asking, “How do we win this?” instead of bickering about strategy.Data Centers in Space and the Physics of ComputeSwyx [00:23:31]: You had a tweet about data centers in space. Why no data centers in space?Jake [00:23:37]: It's not “no data centers in space.” My hot take is that I think it is solvable. I've just never seen anybody solve it.Swyx [00:23:49]: You said, “How are you going to dissipate that much heat in a vacuum?” You're making a physics claim.Jake [00:23:55]: I haven't seen anybody prove how you're going to dissipate that much heat in a vacuum. It doesn't mean it's not possible. It just means nobody has brought it up yet.Swyx [00:24:05]: Astrophage.Jake [00:24:06]: I don't know what that is.Swyx [00:24:07]: The Martian thing. Okay, you're very logical.Jake [00:24:09]: It could work. A lot of people are putting the cart before the horse. They say, “We're going to put data centers in space.” Okay, but how? “We have time to figure it out.” It's like in The Martian where they ask how they're going to intercept something and say, “We'll figure it out.”Swyx [00:24:36]: Making a bet on human invention is weird because you blind trust that it can be solved. But with physics, there are first-principles bounds you can put on it. Maybe not. Maybe you're asking to travel time or break a fundamental thermodynamic law.Jake [00:24:57]: I don't know how VCs do this either. How do you know what's not possible and a grift versus what's possible but sounds completely insane? “We're going to put data centers in space.” Coin flip as to which it is, and I guess you'll know in 10 years. That's one cycle.What Agents Need: Versioning, Observability, and 1,000x ScaleSwyx [00:25:23]: Moving back to agents. The branching, fast spin-up, and orchestration you do feels like pre-work that happened to be exactly what agents want. What do agents want differently than humans?Jake [00:25:37]: They want the ability to version things. It's not that different; it materializes slightly differently. Agents want a way to test changes incrementally. Engineers have feature flags. Is there a reason agents can't use feature flags? I don't think so.Jake [00:25:54]: They want version control. Can we use Git or not Git? That one is up in the air. I think something outside Git will emerge for how we version these things over time. They need observability. You need to query what happened, when it happened, which steps failed, traces, logs, metrics, and all the rest. They need network, compute, and storage. They need to write files, save files, iterate on files, and snapshot file systems.Jake [00:26:25]: A lot of what humans needed is in line with what agents need. Branching and forking are not different; we're just moving 1,000 times quicker. It can look like you need something massively different, but what you need is something massively better than what existed. You need orchestration massively better than Kubernetes. You need networking probably better than Envoy. It goes all the way down the stack.Jake [00:26:55]: If the workload profile doesn't change so much as it gets massively compressed because you need thousands of these things, what assumptions change? etcd is going to melt. You need to replace it with something. You can go all the way down the stack and say, “That part has to change, that part has to change, and that part has to change.”Jake [00:27:19]: The interesting thing about the super-exponential curve is that you have to build systems where you can rip out those parts at any time because a new bottleneck might emerge. You get good at parallel agents, and a different part of the system breaks. So it's similar to what humans needed, but at 1,000x scale.Jake [00:27:55]: How do you do code review in the age of agents?Swyx [00:28:00]: You throw more agents at it.Jake [00:28:01]: You don't. But then who reviews for CVEs and all these other things?Swyx [00:28:07]: More agents.Jake [00:28:08]: And that's how we hit the inference wall. You can continually throw agents at the problem, but I think there's a limit to the number of agents you can throw at a problem.CLI, Agent Handles, and Closing the LoopSwyx [00:28:24]: You already had a CLI before it was cool. How is the shape of what you're exposing changing, if at all?Jake [00:28:28]: CLIs have always been cool. The CLI changes because we think about how to give Claude, Codex, ChatGPT, or any model a handhold.Jake [00:28:50]: A CLI is a single command: deploy, get logs, and so on. Things that were prohibitively annoying to humans are not annoying to agents. They're nice. If I handed you a CLI with 40 arguments and 600 flags, you'd think, “I'm never going to use all of this.” But if you hand it to an agent, it says, “This is excellent. I have so many handles to work with.”Jake [00:29:24]: If you're going to expose things to agents that way, you want as many handles as possible where they can get information, query dynamic information, and close the loop quickly. Most problems right now are about how to close the loop as quickly as possible. Where does the agent get stuck, and how can you remove that?Jake [00:29:49]: Telemetry is important. If you can tell where the agent gets stuck from the CLI and say, “12% of people deviate from the happy path because of this, and now I add this argument and drive it down to 2%,” you massively increase the rate of loop closure.Jake [00:30:03]: That's how we think about not just the CLI, but every point in the dashboard. It's a user journey: I hear about Railway. I get something deployed. I get my first green build or aha moment. I see an endpoint, logs, whatever. Then I iterate. The iteration loop is indefinite. The user wants to deploy a new thing, a Postgres instance, change code, and keep iterating.Jake [00:30:36]: If you focus on the iteration loops and what's blocking them from closing quickly, one thing we say internally is: you never want to be waiting on compute anymore. You always want to be waiting on intelligence. If you're waiting on compute, there's a bottleneck that needs to be destroyed because eventually that bottleneck becomes so large that another workflow emerges to change it.Jake [00:31:04]: We've built a product where you push code, build it, and so on. But I fundamentally believe the push-pull loop is going away. We'll get to a point where you make a small change in production, that change is versioned across your infrastructure, you're working alongside copy-on-write versions of your database and infrastructure, and then you merge it in and it's instantaneously live. That's the holy grail of loops. The push-pull-rebuild thing is a point of friction that we're removing entirely.Canvas as Output: Dashboards, Context Anchors, and HyperstructuresSwyx [00:31:43]: It's incredibly fast. If anyone hasn't tried it, that fast feedback is great. My hot take is that Railway was famous for its canvas, which visualizes your infrastructure and lets you manipulate it visually. But that was for humans. For the next phase of growth, Railway CLI is more important than canvas.Jake [00:32:05]: The canvas is funny because it's a mechanism to show changes over time. You're right that previously we used it a lot as an input. Moving forward, its goal is more like an output. You would go to the canvas, make changes, see them, and watch your infrastructure evolve. Now agents have access to the CLI and can make those changes. So the canvas becomes an output: what information does the human need at this moment to make suitable decisions about control requests? Do I approve this or not?Jake [00:32:57]: It also has to be an anchor for your context, a port in the storm. Think of it like layers in a file system. You start with a project, then drill down into services, then into a function or code, because you want to represent the entire thing not just in your head, but in the canvas. Other people can share that representation, think on the same wavelength, and move quickly.Jake [00:33:33]: A lot of organizations get in trouble as they scale because all the context lives in someone's head. “How does this microservice work?” “I have no idea; go ask this person.” Then you have whole categories of products built around context discovery. A lot of that melts away if you have a solid hierarchy and can infinitely nest services, code, context, and everything else all the way down. That's what lets you build these structures over time.Jake [00:34:18]: It's also what lets us build what I've called hyperstructures: things that are way bigger. You look at the Golden Gate Bridge and ask, “How did we build that?” There's a meme that we lost the technology. To some extent, yes, because the coordination that built those things evolved and changed. We lost some of the art of building structure as we jammed everything into Slack.Swyx [00:34:52]: But you jam everything in Discord.Jake [00:34:53]: Same point. It doesn't matter. It's message passing and interrupts, message passing and interrupts.Swyx [00:35:00]: So you're arguing there should be something better and more structured than Slack?Jake [00:35:04]: Yeah. For sure. I think Slack is awful, and Discord is awful too.Central Station: Context Routing, Support, and Incident ClustersSwyx [00:35:09]: This is the equivalent of my mom test. What have you done that has your solution to this?Jake [00:35:15]: Internally, we've built a tool called Central Station that aggregates all the context from our users. Every piece of feedback, every customer support item, everything gets aggregated into clusters. If an incident is brewing, we can determine how many users are affected and break off a discussion based on that.Jake [00:35:40]: That is more helpful than long-running channels where you're trying to decide which channel to put something in. If you can dynamically aggregate information and dynamically route it to the right person based on context, it works better. We know internally that these four people are close to networking. If we see a networking thing, we can drill it down to those four people. If it's with this part, we can look at the commits. This is no longer a manual process internally.Jake [00:36:13]: If you go to station or help.railway.com, that's why we built it. We wanted to scale with a massive amount of leverage by aggregating feedback.Swyx [00:36:27]: This is built in-house?Jake [00:36:28]: Yep.Swyx [00:36:29]: I remember helping out on this one with Angelo in 2023. You scale a lot with a very small team.Jake [00:36:38]: Yeah. We're about 10 times bigger now.Swyx [00:36:40]: You have your full developer code here? Very cool.Jake [00:36:44]: If you go to railway.com/stats, we expose this as a pub-sub-able thing. It's all real-time metrics. There's a way to get it as JSON somewhere if you care.Jake [00:37:01]: We're big on trying to build everything in public and talk about what we're working on. We've had issues in the past, and we'll say, “Here's how we're fixing these things.” We've gotten compliments and flak for incident reports. We're always trying to make them better and talk with people.Incidents, Disclosure, and Progressive RolloutsSwyx [00:37:20]: You had a big one recently. I liked that it was scoped to 3,000. You presumably used Central Station. Talk through what happened and how you address it internally as a team.Jake [00:37:38]: Internally, this one really sucked. It had to do with an upstream provider that didn't do the behavior it said it documented, which is unfortunate given they wrote the RFC for how the behavior should work. We rolled those things out, and Central Station caught it initially when a couple users said caches weren't invalidating. We turned it off immediately.Jake [00:38:03]: When you roll out to a large user base of three million people, you get a lot of disparate behaviors. We tested in staging and had tests, but we hit an edge case. We've hardened those systems, and now we can make that better. But it was a tough one.Swyx [00:38:39]: I always wonder how private disclosure is supposed to work if people find an issue. Are they supposed to contact you first? When you run a platform, these things will happen. What channels should people pursue to quietly resolve it before it becomes a bigger incident?Jake [00:38:59]: There's responsible disclosure. We err on the side of over-disclosing and letting you know something is wrong versus having your provider gaslight you. We've erred on sharing those things more publicly, even if they impact a small subset of users. That's a decision we've made internally. We have four values. One is honor. The honorable thing is to notify people to the widest degree at which they may have been affected or there was an issue, and then confront it head-on: why did it happen, what can we do better?Swyx [00:39:45]: Not the whole user base. That's because of incremental rollouts and other things?Jake [00:39:50]: Yeah. Progressive rollouts.Swyx [00:39:54]: That should be the norm at all large platforms.Jake [00:39:58]: It should. A variety of companies do this. There's the quote that Meta runs 10,000 different versions of Meta. To our earlier point about agents, they need the same thing. They need shadow traffic and all these other things. We've built so much ceremony around production being sacred that we need to make it trivially easy to test different behaviors in a safe environment. Then you can make mistakes in a safe environment.Safe AI SRE: Customer Agents, Forked Environments, and Production ParityAlessio [00:40:30]: Do you see a world where these things get automatically caught, not necessarily by your agent, but by your customer's agent? The cache invalidation issue seems easy to check if you know to look for it.Jake [00:40:44]: It's hard because to determine it, we almost need to hook into your observability infrastructure. That's why we have the template loop on the platform: so you can roll things out progressively. You can roll out to Johnny Vibe Coder initially, or push a shard that someone consumes at their own leisure. Or you can roll it out over weeks: 0.1% of people, 1% of people, early adopters, then all the way up. That's the non-deterministic version control we talked about earlier.Jake [00:41:30]: I believe that's where most things should go, because most companies end up building staged rollout systems in-house. It's the same thing built again and again at every company. There's a massive opportunity to consolidate developer debt.Alessio [00:41:45]: You should have a free tier. Model providers give free tokens if you let them use the data. You could give free compute if someone is the number-one shard that goes out and lets you plug into their observability.Jake [00:41:55]: We do that. That's why we talked about the impact on 3,000 people. We start with lower-impact people. Larger companies on the platform are last to receive those rollouts so they have a version of the platform that's deeply stable.Alessio [00:42:16]: I have three services, so I'm sure I get the first rollout. You can nuke my thing at any time. There are all these SRE agent companies. Observability people also want agents that fix upstream problems. You have your own agent in the canvas now. How do you see that playing out?Jake [00:42:39]: It's the stacking entropy problem. If you don't have primitives to make iteration in production safe, it becomes difficult. If you're an observability provider saying, “Here's the fix to this error,” assume 80% are good and make sense. But in the last 20% long tail of complex issues, if you let somebody stamp it, you create an opportunity for an incident.Jake [00:43:08]: That's why forked environments are important. People have staging, but it always drifts from production. You need primitives, workflows, and experience built first-party on the platform so you can fork any service at any point in time.Jake [00:43:33]: I think of the canvas as a sheet of transparency paper. The agent is a little guy you push up into the canvas. It should say, “I need to copy that service and that service so I can test these two things.” It gets a read-only copy of production. Anything that's PII gets marked as a transform when we clone the database, create a copy-on-write version, or read from it. Then the agent makes changes and asks, “Does this actually work?” as close to production as possible.Jake [00:44:22]: That's how close you have to be, or you get massive drift. The system becomes unstable. You see this with massive systems built on Docker for local, Kubernetes for production, and a specific thing for something else. That complexity slows developers and becomes unstable at scale, making it hard to iterate. We want to compress that way down and say, “As close to prod as possible is where we want to be.”From AISRE Skeptic to Agent BelieverSwyx [00:45:00]: I was texting Erica for questions, and she says you were originally not a believer in AISRE. Have you come around on it?Jake [00:45:10]: I flipped, but I'm still not a believer in AISRE if you don't have the primitives to make it safe. If you unleash AISRE on production infrastructure without safe primitives for copying volumes and making sure things are fine, it's going to nuke your production database. It's not a matter of if, but when. I'm a big believer in making those loops safe.Jake [00:45:33]: I was a deep AI skeptic until 2023. In 2024, I thought, “Maybe I can roughly make this thing do it.” In 2025, I thought, “Now I can hold this.” Over winter break, everybody came back saying, “It's almost impossible to hold this.”Swyx [00:46:01]: Did you see this on the Claude docs? CloudBot? OpenCloud?Jake [00:46:06]: It's gotten to a point where it's harder to hold it wrong than to hold it right. There's a scene in Avengers where Vision picks up Thor's hammer and says it's terribly well-balanced. It self-balances and works well. I'm a deep believer at this point that this will be the dominant species: assembly, C, C++, JavaScript, words.Swyx [00:46:35]: It feels like a big jump.Jake [00:46:37]: It is. But it's not like you abandon CPU-based discrete logic and move straight to fuzzy logic. You need both. Your skills should call code or applications or some static structure. You can use skills to distill what the procedure should be or how the code should act.Jake [00:47:02]: I'm coming to a thesis: you need three points. You need a clear spec defining the system, the code, and the tests. When you say it out loud, if you've been in engineering long enough, you're like, “Of course. That's an RFC, tests, and code.” But they all matter. Having them together lets them reinforce each other: the spec and tests match, but the code doesn't, so reconcile it. Or the tests and code match but the spec doesn't, so reconcile that. That's the iteration loop.Jake [00:47:41]: That's why you're seeing people talk about software factories, docs, and reconciliation. Some of that is architectural astronomy if you don't implement it, but that loop is where most things will end up.Swyx [00:48:07]: For listeners, we've been talking about this on the pod for three years: the holy trinity of specs and tests. Itamar Friedman from Qodo is the reference if people want to look it up.Self-Modifying Infrastructure and the End of Push-Pull-RebuildSwyx [00:48:18]: One thing I want to mention on the OpenCloud idea is self-modification. I don't know how Railway would support it, but I have my OpenClaw, and I just tell it it has the Railway CLI and can do whatever. In theory, whatever capabilities or new infra it needs, it can call the Railway CLI, provision it, and add it to itself. The agent can modify its own infra.Jake [00:48:45]: It's nuts. I have a loop set up where you put the Railway CLI on top of something that runs on Railway. You're authenticated as whatever the current box is, and you can make any changes to it. Then you call Railway deploy, and it deploys itself.Jake [00:49:04]: It's like: “I need to spin up this instance of this environment. I already exist in this environment. Excellent, I have access to a Postgres instance now.” That's where we want to go with agentic, self-replicating infrastructure. That's your loop: iterate in production. You continue making changes. If it works, merge it upstream. If it doesn't, throw it away.Jake [00:49:37]: How do you make throwaway copies trivial to spin up and super cheap? The era of “I have an AWS instance with four vCPU and 16 gigs of RAM” is going to get destroyed. If you do that for agents, you need a thousand of those machines. It's prohibitively expensive compared with what we've spent a ton of time figuring out: the atomic unit of deploy, whether you call it isolates, sandboxes, or something else. Only pay for what you use, spin up instantaneously, and close the loop as quickly as possible.Jake [00:50:15]: If the system can self-replicate safely and say, “This is my environment, I'm making these changes,” it can come back with, “Does this look good? This is a new state of infrastructure given this prompt. I think I've solved it.” Then you go back and say, “Actually, it looks different.” It does the loop again. Then you say, “Cool. Apply.”Swyx [00:50:38]: That's retroactively obvious, which is the most useful kind. Any other comments on agent deployment on Railway?Jake [00:50:51]: It's getting better every day. I'm on X or Twitter. You can always yell at me about the parts not working as well as they should, because plenty of things should work way better.The New Serverless: Stateful, Long-Running, Pay-for-What-You-Use LinuxSwyx [00:51:04]: At this stage, when people want massively or embarrassingly parallel compute, they usually talk serverless. I feel like there's a new serverless compared to the previous five years of serverless. You're in that new bucket. Do you have comparisons or philosophical differences you want to call out?Jake [00:51:31]: It's somewhere in between. It's the ability to run stateful, long-running workflows or executions.Swyx [00:51:42]: Vercel has Fluid Compute, Cloudflare has some container thing, Google has App Runner and others.Jake [00:51:55]: That's where everything is roughly going, and it's why we've been working on this for six years. We believe users need access to a computer: a box that speaks Linux. They need to deploy what they want. Other systems change the surface area of what you can build. For us, users need a computer and need to deploy anything they truly want. That's why we've focused on the primitives: network, compute, storage. If we give you those and expose them so you can run things indefinitely, that's where we believe it's going.Jake [00:52:43]: Twitter has no nuance, so everyone says “servers” or “serverless.” It's always somewhere in the middle: I want to run it for a long time, but I don't want to provision the resource statically or pay for things I'm not using. That's been our thesis from day one: pay only for what you use, run it indefinitely, and it is full Linux.Swyx [00:53:12]: That's why I like the naming of Fluid. It's fluid. Flexible.Heroku, Focus, and Carrying the Torch Without Becoming the PastSwyx [00:53:18]: Another milestone is the Heroku official deprecation. You're one of the presumptive new Herokus. “New Heroku” has been a category for as long as I've been in developer tooling. It's finally happening. What was that like? Any behind-the-scenes of, “This is the moment”?Jake [00:53:42]: You have people where you're like, “You were running stuff on here? You, as this company?” It's crazy that names you would know are running on it and now coming to us saying, “We want to move a lot of this off.”Swyx [00:54:00]: Any behind-the-scenes on why Salesforce let Heroku stagnate?Jake [00:54:05]: I can only guess. It's hard when it's not your business. Salesforce's business is to build a great CRM. That's their focus. Then you acquire a compute business as an offshoot. A lot of early Meta people talk about focus. Boz has a write-up about how in the early days of Meta they had no money, so they were forced to focus. Then they turned on the money tree and had no reason not to split their focus.Jake [00:54:52]: But that dilutes your product. You get offshoots where you ask, “Is this the focus of the business?” If it's not core, it languishes. A lot of companies get in trouble when they split focus because they're fighting a multi-front war, not just externally but internally for alignment. Where are we going? What are we doing? What is our purpose?Jake [00:55:24]: If you're Salesforce-built and mission-driven, you want to work on Salesforce. Heroku is off to the side. It's not core to the business. Getting resources, budget, focus, and alignment internally becomes hard. It was a matter of time.Swyx [00:56:06]: Kudos for them to call it out instead of leaving it unknown.Jake [00:56:12]: Their release was a little odd. They called it out, but they didn't say they were shutting it down. Behind the scenes, I think they issued messages to people saying they should close accounts and that they were going to deprecate and remove things over time.Jake [00:56:30]: It's crazy because some of my first deployment experiences were on Heroku. You start with dragging things into an FTP server, then you try to get a deploy working, and then it's Heroku. It was the on-ramp for us. But the wheel turns. New things emerge. We're happy to carry the torch for a lot of that. But we don't want to be the new Heroku. We want to be the way people build and deploy software, and ultimately the way people monetize software over time.Swyx [00:57:19]: It's still a big crown to be the new Heroku. There are 50 companies that fought for that.Jake [00:57:23]: Everybody is holding some portion of it. We're happy to support people and companies. The platform works differently. The game loop is similar, but we've been dogmatic about where these things are going: primitives, agents, fan-out. Some things fit; some workflows need to change. We have an approximation of Heroku pipelines with the environment system. It's exciting. We've got a ton of people we can support, and it's growing a lot.Temporal, Workflow Engines, and State MachinesSwyx [00:58:12]: I have one more technical question about Temporal. I've sold my shares. You're a power user and one of our earliest customers. I met you through Temporal. You built on Temporal. You have complaints. This may be the most neutral and informed conversation anyone will hear about Temporal without someone working at the company.Jake [00:58:39]: That's fair. I've used Temporal for almost 10 years because of Cadence at Uber.Swyx [00:58:52]: Give people a sense of what Cadence was at Uber.Jake [00:58:57]: Cadence was the precursor to Temporal. It powers trip actions, rides, when you rent a Jump bike or scooter or car. You're running workflows for a period of time and saying, “This ride will run indefinitely until it finishes.” You attach information: you paused in this zone, so add this charge to the bill. When you end the trip, the workflow is done. That experience was powered by Cadence at the time.Swyx [00:59:34]: I used to say it's like programming the entire user journey top-down as one function.Jake [00:59:39]: It's a powerful idea and important. It's also important for the next phase of the agentic journey. You want an agent to do a specific task, be complete or incomplete on that task, and move on to the next thing. You need a way to manage workflows dynamically.Jake [00:59:59]: Temporal was always great in theory, and great when you got it working the way you wanted in production. But it required you to model the entire journey in your head. If you didn't, you could cause issues where replaying the state of the workflow causes non-determinism.Swyx [01:00:25]: Because it works on deterministic workflow history.Jake [01:00:28]: Exactly. I describe it as a jet engine. If you know how to operate it and run it, it's great. But you can't hand it to people trying to build complicated things if they don't have the whole state in their head.Jake [01:00:48]: We run our whole deployment pipeline on top of it. That's a reasonably complicated workflow: pre-commit hooks, signaling, queuing, and all the rest. We ran into the same thing at Uber. As you express a large workflow, it gets more complicated, with more states in the state machine that you have to map back to the workflow.Swyx [01:01:15]: It's a lot of ifs.Jake [01:01:16]: Exactly. At Uber, we built a system for doing the state machine and testing it. We've started to build some of those things here because it's grown heavily. It's not quite love-hate. When it works well, it works super well. But if someone who doesn't have full context puts something into the system that invalidates state or causes non-determinism, or spins off a ton of activities, you have to keep track of underlying SRE knobs like activity slots. Those should scale with memory, vCPU, and so on. It becomes a bear to scale.Swyx [01:02:10]: You need a capable sysadmin running things behind the scenes. If you moved off, what would you do?Jake [01:02:19]: We'd build our own workflow engine. We have a few internally that we've worked on.Swyx [01:02:27]: This is one of those classes of things you typically wouldn't vibe code, but I'm wondering if you can.Jake [01:02:33]: I still don't think you should vibe code it. You still want to run decent tests to make sure it works.Swyx [01:02:39]: Timo didn't invent that from scratch either. There are libraries you can run. On top of that, it's just a state machine that you have to map out. Ultimately, you define the instructions you want and run them through a state machine.Jake [01:03:00]: It's very doable. Workflow stuff is interesting. Restate is doing neat stuff here.Swyx [01:03:10]: You're tied into JavaScript. Are you a JavaScript maxi?Jake [01:03:13]: Internally, we have TypeScript, Rust, and Go. We don't add more languages. Actually, we have a little C because we write BPF code and hooks. But those are the languages.Swyx [01:03:28]: Is this for sidecars?Jake [01:03:32]: No. It's for the networking stack, volumes, and things like that. We use TypeScript a lot because it powers the dashboard, but we're moving a lot of workflow stuff off the dashboard stack and into the infrastructure stack.Railpack, Nixpacks, and Content-Addressable FilesystemsSwyx [01:04:00]: Cool. Any other technical infrastructure stuff? Railpacks?Jake [01:04:07]: We built an engine for determining dependencies based on source code. It's called Railpack. We built the first version, Nixpacks, on top of Nix, and then we moved.Swyx [01:04:17]: People have been trying to get me to adopt Nix and NixOS for four years. Is it ever going to be a thing?Jake [01:04:23]: I don't know. We're excited about it, but it has pain points. Think of it as a stack of versioned binaries at specific slices in time. If you want version X and version Y, you bloat the package space, which blows up image size and makes real-world workloads difficult.Swyx [01:04:53]: But you content-address it and cache it. In theory, there are optimizations.Jake [01:05:00]: In theory, yes. But with a large enough user base and disparate enough machines, you run into a problem Meta described in the XFAAS paper, their internal serverless system. It becomes difficult at scale unless you break out specific runtimes.Jake [01:05:24]: We didn't want to do that because we wanted to truly allow you to deploy anything. That was our initial thing with Nix. But we've moved toward interesting work around content-addressable file systems that can lazy-load anything from any point and page it into memory.Swyx [01:05:48]: Amazing.Jake [01:05:49]: The future is very bright. It's crazy, and it's going to be nuts.Coding Agent Spend, Roadmaps, and Token ROISwyx [01:05:54]: Founder journey stuff?Alessio [01:05:56]: Your cloud usage: you tweeted you're going to spend $300K this month?Jake [01:06:01]: I think we got to $200K.Alessio [01:06:02]: Coding agents?Jake [01:06:03]: Yeah.Swyx [01:06:04]: Across the company?Alessio [01:06:05]: You only have 35 people, so I'm sure they're not all spending $10K a month. What's the distribution?Jake [01:06:10]: I think I'm at about $25K. We have power users all the way down. We came back from winter break, and I basically said, “If you're writing code by hand, you're doing this wrong.” The tools are good enough now that you can move extremely quickly. There are issues and pain points, but you should be reviewing the code you are writing instead of writing it by hand.Jake [01:06:40]: Architectural patterns matter more now than ever, but you shouldn't spend your time generating code you would write. If you know how to write it, ask the agent to write it and reconcile it until it looks like you would have written it yourself.Jake [01:06:58]: People misconstrue my propensity to push people toward agents as connected to our growth and some reliability bumps. They're not necessarily related. The tools are good enough to move extremely quickly and build things way larger than you could before.Jake [01:07:19]: To the earlier point about cooling data centers in space: I don't know. But with software, you can ask, “How would I build block storage from scratch? How would I do these things?” I have ideas because I have history and have read papers. Let me work them out and build massive test benches with thousands of tests, because those are now free to author. If you're not using AI systems to speed-run your roadmap and reconcile your existing system onto the future, you're missing a large point of what's happening.Alessio [01:08:12]: What's the path to spending $3 million a month? Is it bound by ideas and things customers can absorb?Jake [01:08:19]: For most companies, it's bound by deployment at this point. That's why we've seen a massive boom in users and companies, from Fortune 50s down, asking how to get developers to move faster. You'll probably hit your CFO before any technical limits because they'll look at the eye-watering amount of money spent on tokens. Inference costs have to come down, but we're inference constrained now. There will be price discovery around what makes sense for an org to adopt.Jake [01:09:06]: I think you'll end up with the F1 driver concept. If someone is really adept at these things, it makes sense to put them in a $3 million car. If they're not, it probably doesn't make sense. You'll take a few people and say, “You can drive the F1 car. We need to go in this direction. Figure out if it works and prototype it.”Jake [01:09:33]: We've done some of that and vastly accelerated our roadmap. We thought we'd ship something in a few years; now we can probably ship it in a few months because we validated it and don't have to build it incrementally. We can skip steps and move toward our vision.Alessio [01:09:58]: A lot of people are realizing the roadmap doesn't always have a business impact, so they say tokens are too expensive. But if your roadmap were built to make more money by the time you built it, you'd have token pricing for it, the same way you do with sales. You'd spend a billion dollars on sales if you knew you would get $2 billion of revenue.Jake [01:10:19]: Exactly. A naive way to measure this is the percentage of tokens that end up in production. If you can measure impact because those tokens end up in production, that's awesome. But the burden of proof will rise. Internally, we have a growing number of pull requests that haven't merged. The question becomes: how do you get this into production? It's about how quickly you can build and deploy software, which is exciting because that's our whole thing.The SDLC Shift: Prompt Requests, Feature Flags, and Safe RolloutsSwyx [01:10:56]: The SDLC is changing. One thesis is that the pull request is dying. It's going to be the prompt request. Beyond that, code review is also kind of dying if you have all the other systems in place. What else is changing about the SDLC?Jake [01:11:19]: The AISRE and the tools to make it happen. AISRE is pie-in-the-sky aspirational. What does it take to get an AISRE? What tools do you need to build?Swyx [01:11:32]: You should expose your tooling to customers at some point. The Central Station command center.Jake [01:11:39]: We have it for template maintainers. Template maintainers can deploy and maintain templates, and they get feedback. We're going to expose those things incrementally.Swyx [01:11:51]: Clustering around incidents. Everyone has a version of that, but I don't think anyone has solved it.Jake [01:11:56]: I won't say we've solved it internally, but it's gotten so good that we can see incidents forming pretty quickly. At some point, those will be things either someone else builds or we build. We've always built things purpose-built for us. If it makes sense to make it useful for users, monetize it, or turn that loop into a profit center instead of a cost center, we want to do that.Jake [01:12:28]: Pull request is definitely dying.Swyx [01:12:29]: Do you do first-party feature flagging and incremental rollout stuff?Jake [01:12:34]: We have a feature-flagging engine we built internally and will eventually roll out.Swyx [01:12:38]: I don't see it as a user. How come you didn't give us what you have?Jake [01:12:43]: We have to beta test it. We care a lot about the quality of the things. There's plenty we've used internally that doesn't make it all the way through the journey because it fails. It works for one service but not multiple services. We'd have to build it for multiple services and know that if we released it, we'd rebuild it again and again. Some things are worth that, but many inform the roadmap.Jake [01:13:18]: We don't want to dilute the experience by saying, “This works, but only for this service,” unless it's a core initiative. Over the next few months, we'll roll out things that work for a single service, then multiple services, then multiple services across the environment. You have to be deliberate. Otherwise you create broken disparate experiences and support load because people ask how to use the feature.Jake [01:13:52]: It's the earlier expansion and compaction pattern. You expand the company to get features, then compact and smooth them out so the experience is stellar. You told me in the hallway, “It's gotten so much better.” Internally we're saying, “This part really sucks. We need to make it significantly better.”Swyx [01:14:11]: I can attest to that over the last three years watching you build Railway. For listeners, feature flagging is a huge part of Uber culture. So much so that they have too many feature flags and another thing to remove feature flags. Facebook has Gatekeeper. Agents are going to need this. It's fundamental to incremental rollouts. OpenAI acquired Statsig. GPT-5 is routing and flagging through different models.Jake [01:14:56]: It's super important. If the software development lifecycle is going to change because we're doing things 1,000 times faster and 1,000 times more concurrently, what becomes important at scale?Jake [01:15:16]: Before I started Railway, I built a feature-flagging product and tried to sell it. It was an easier version of LaunchDarkly. I ran into a problem: anyone small enough to adopt your technology doesn't care about feature flags, and anyone large enough to need feature flags needs so much scale that you have to build out all the infrastructure. I scrapped it.Jake [01:15:42]: But what is old is new again. Companies are trying to move quickly, but you can't YOLO a vibe-coded thing straight into production. You need to say, “Here's my blast radius, my impact, and I want to shadow it for these users.” Feature flags. You're going to need the tools larger companies built to maintain their structures. Everything gets compressed by 1,000x so everybody can build those structures quickly.Jake [01:16:07]: That's exactly where we are: compressing the software development lifecycle, then expanding it and adding more new things.Cattle, Pets, and Clonable InfrastructureSwyx [01:16:15]: Another term that comes to mind for newer developers is “cattle, not pets.” People treat production like a pet. It has a name. You baby it and keep it alive. With cattle, you can mass farm, roll out, portion parts out, and kill them.Jake [01:16:37]: I think that might change. You can move toward having pets as long as you have a cloning machine for your pets.Swyx [01:16:52]: Yeah.Jake [01:16:52]: If you can snapshot every single thing at every frame, it doesn't matter if something gets obliterated because you have a snapshot of it. The things we've built right now are designed to block changes from the hermetically sealed DevOps line. You have to write a Dockerfile because you nee

Interviews: Tech and Business
AI-Enabled Software Development: AI Coding at a Global Insurer, with Blitzy | CXOTalk #917

Interviews: Tech and Business

Play Episode Listen Later May 4, 2026 21:24


Autonomous software development creates a dilemma for leaders in regulated industries: adopt AI coding at scale or fall behind on product velocity without compromising auditability and code quality. In CXOTalk episode 917, Kris Tokarzewski, Group Chief Technology Information Officer at Vitality, describes how a 14,000-employee multinational insurer is rebuilding its software development life cycle around AI. This episode examines the impact of agentic AI on software development in the enterprise.Recorded at Blitzy's headquarters, the conversation examines deterministic code generation, Blitzy's infinite code context, context engineering, test-driven development, and the shifting bottlenecks that surface as throughput accelerates.YOU'LL DISCOVER✅ Why regulated industries require deterministic, auditable code rather than the probabilistic output most AI coding systems generate✅ How Blitzy's infinite code context (ingestion of codebases, engineering standards, and business rules) creates high-quality software aligned with compliance requirements✅ How Vitality reverse-engineers legacy systems with autonomous AI, achieving a measured 5x acceleration over manual methods✅ Why optimizing end-to-end SDLC throughput matters more than local efficiency at any single stage✅ How code review of 50,000 to 100,000-line pull requests becomes the next limiting factor, and how AI reviewers close the gap✅ How test-driven development pairs with autonomous code generation to raise quality and compliance pass rates✅ How the roles of requirements engineers, software engineers, and product teams converge inside an AI-native SDLC✅ How to instrument AI spend against velocity, quality, end-to-end throughput, and customer value rather than isolated gainsTIMESTAMPS0:00 Deterministic code vs. probabilistic AI output0:14 Meet Kris Tokarzewski, Group CTIO of Vitality0:32 Why Vitality is modernizing legacy insurance systems1:30 Event-driven architecture as agentic AI's natural partner3:00 Building an AI-native software development life cycle with Blitzy4:28 Throughput optimization versus local efficiency6:02 Reverse engineering legacy systems and deterministic code generation9:05 Infinite code context: ingesting codebases, standards, and rules10:00 Test-driven development with autonomous code generation10:49 Results: 5x faster legacy reverse engineering13:17 Product, engineering, and DevOps convergence15:04 Roles level up: requirements engineers and software engineers16:18 Reviewing 50,000 to 100,000-line pull requests17:56 Instrumenting AI spend against business outcomes19:16 Executive sponsorship for autonomous development20:16 Advice for CIOs and CTOs adopting AI-driven development

the csuite podcast
Show 297 - Google Cloud NEXT, Part 1 of 3 - AI, Security & the New Enterprise Architecture

the csuite podcast

Play Episode Listen Later May 1, 2026 42:51


The first of three episodes recorded at Google Cloud NEXT, Las Vegas in partnership with Kyndryl, the world's largest IT infrastructure services provider Host Russell Goldsmith was joined by: 1/ Kris Lovejoy, Global Head of Strategy, Kyndryl 2/ Vincenzo Forciniti, AI Adoption and Data Platform Leader, Fastweb & Vodafone 3/ Adrian Tatsch, VP AI Technology & Innovation, Equifax 4/ Patrick Bobrukiewicz, VP Data Services, Thrive Restaurant Group 5/ Kaapro Kanto, VP, Cybersecurity & Digital Platforms, DNA 6/ Brad Duff-Hudkins, VP Data Analytics, Next After Each of our guests offered a grounded, real‑world view of AI adoption at scale. The episode opens with Kris Lovejoy, Global Head of Strategy at Kyndryl, who outlines why digital sovereignty, geopolitical risk and regulatory pressure are reshaping enterprise architecture. She also breaks down the guardrails required for employee productivity tools versus mission‑critical agentic systems and why modernisation itself has become a security control. Next, Vincenzo Forciniti, AI Adoption & Data Platform Leader at Fastweb and Vodafone Italia, discusses the data‑unification challenges following Fastweb's acquisition of Vodafone Italia. He shares how the team built a shared data catalogue, why change management is often harder than technology, and how modernising legacy stacks is enabling scaled AI across SDLC optimisation, operations and customer‑facing processes. We then hear from Adrian Tatsch, VP of AI Technology & Innovation at Equifax, who explains how the company is connecting APIs to AI agents using Apigee MCP, and how Equifax's multi‑billion‑dollar cloud transformation has accelerated AI maturity. Adrian explains how Equifax is redefining human vs. non‑human work, upskilling, and measuring ROI across the organisation. Patrick Bobrukiewicz, VP of Data Services at Thrive Restaurant Group, shares a hospitality‑sector perspective on AI adoption. Kaapro Kanto, VP, Cybersecurity & Digital Platforms, DNA explains how DNA moved from traditional network operations to AI‑driven SecOps, enabling small businesses to benefit from enterprise‑grade detection, automation and response, and why the biggest barrier to AI maturity is shifting from pilot experiments to trusted, scalable operational models. And finally Brad Duff‑Hudkins, VP of Data Analytics at NextAfter, explains how his team used Google's data engineering agents to cut onboarding time from 2–3 weeks to just 72 hours, and why agentic AI is already unlocking faster, more personalised, more scalable data operations for lean teams. A fast, insight‑rich episode capturing the reality of AI transformation inside complex global enterprises, from security and sovereignty to data foundations, workflow automation and the future of human‑machine collaboration.

Product Momentum Podcast
186 / TiPS: AI-Enabled First Principles + Core Product Skills Spark Adoption

Product Momentum Podcast

Play Episode Listen Later Apr 29, 2026 24:29


Welcome to TiPS – the Topics in Product Series – a new podcast format powered by ITX and the team at Product Momentum. The TiPS mission is to engage the same important product space issues that you confront every day – but this time through the experiences of ITX product managers, UX researchers and designers, engineers, security analysts, and the rest of the team. In this inaugural TiPS episode, Dan Sharp is joined by Sean Murray and Andrew Knoblauch to reflect on a recent Product Leaders Breakfast, hosted by Prerna Singh. Together, they draw on insights from event attendees to discuss how AI is being applied inside real organizations. The central theme was clear: successful AI adoption depends less on hype and more on first principles and core product skills that drive disciplined product thinking, incremental progress, and strong decision-making. Here's what we learned: Top-Down ‘Do AI' Directive Is the Wrong Reason for Integrating AI The integration of AI into software development is no longer the proverbial “hammer in search of a nail.” The days of doing AI for AI's sake are behind us. Today's product leaders focus on making incremental improvements tied to bona fide business problems. As Sean points out, our response to the ‘do AI' directive should be: “’Where do you want to see improvement? What outcomes are you looking for?' I think back to our conversation with Teresa Torres, about applying best practices in the initiation and discovery phases of the SDLC so that when we actually get into building something, it’s gonna have some sort of relevant business value.” It's a more grounded approach that reflects a broader industry need to align AI efforts with tangible outcomes.. Building Stakeholder Trust Through Incremental Change Trust emerged as a critical factor in AI adoption, but not only in the technical sense. Instead, as attendees discussed, trust is built gradually through careful implementation and organizational alignment. Andrew explains that product teams build trust not by tackling the biggest, riskiest challenge – but by prioritizing low- to medium-risk opportunities while involving stakeholders early, especially those in Legal and Compliance. “This idea of building trust among others in your organization.” Andrew continues. “We do this every day with our clients and with our own teammates. We learn about people’s concerns, what they care about.” The conversation reinforces the idea that AI should be introduced as a collaborator within workflows, not as a replacement for human judgment. Decision Quality as the True Differentiator One of the key threads weaving through our conversation was a return to foundational product principles – specifically, the importance of decision-making. While AI fluency is valuable, it does not replace the need for strong judgment and clear thinking. Teams that succeed will be those that consistently make informed, high-quality decisions, Sean says. “The biggest differentiator moving forward is gonna be decision quality…your ability to consistently make good decisions.” In this context, AI becomes an enabler, not the driver, of product success. The conversation at the Product Leaders Breakfast (hosted by Prerna Singh) reinforces a familiar but essential message for all product leaders. AI does not replace core product skills; it amplifies them. Teams that stay focused on problem definition, stakeholder alignment, and disciplined execution will be best positioned to realize its full potential. The post 186 / TiPS: AI-Enabled First Principles + Core Product Skills Spark Adoption appeared first on ITX Corp..

Resilient Cyber
Securing the Vibe: Tanya Janca on AI-Generated Code, Mythos, and the New AppSec Reality

Resilient Cyber

Play Episode Listen Later Apr 27, 2026 38:24


A new episode of the Resilient Cyber Show just dropped, and this one is a conversation I've been looking forward to for a long time.I sat down with Tanya Janca, better known to most of the AppSec world as SheHacksPurple. Tanya is the best-selling author of Alice and Bob Learn Application Security and Alice and Bob Learn Secure Coding, an OWASP Lifetime Distinguished Member, CEO of She Hacks Purple Consulting, and one of the most recognized voices in application security and developer education on the planet.The timing of this conversation is hard to overstate. The OWASP Top 10 2025 was announced at the Global AppSec Conference last year, with two new categories, Software Supply Chain Failures and Mishandling of Exceptional Conditions, and SSRF folded into Broken Access Control. Recently, Anthropic released the Claude Mythos Preview system card, documenting a model that has already found thousands of high-severity zero-day vulnerabilities autonomously, including bugs in every major operating system and web browser, and a 27-year-old vulnerability in OpenBSD.In other words, AppSec is at a hinge moment, and Tanya is exactly the right person to think out loud with about it.Here's what we get into:What the OWASP Top 10 2025 got right, what it missed, and how teams should actually use itAI-generated code, “vibe coding,” and Tanya's brand-new free prompt library for secure coding with AI assistants, SecureMyVibe.caWhat Mythos-class capabilities mean for the offense/defense asymmetry AppSec has always lived withHow AI is genuinely changing the SDLC, where it creates lift, where it creates noise, and where it creates entirely new attack surfaceArchitecting real defenses at the prompt layer, across MCP servers, and inside RAG pipelines, not just bolting content filters onto the front doorWhy developers are the new attack surface, and why a lot of what gets labeled as “supply chain attacks” lately is really a developer compromise that cascaded into the supply chainTanya's threat model, defense framework, and maturity model for protecting developers themselvesDevSec Station, Tanya's new podcast delivering 5–10 minute secure coding lessons in a format built for how developers actually consume contentWhat she'd change tomorrow about how AppSec programs are built and run if she could change just one thingThis is one of those conversations that ranges from the practical (what to do Monday morning) to the philosophical (what does it even mean to “secure software” when an AI can find more zero-days in a weekend than a Red Team finds in a year). Tanya brings the rare combination of deep technical chops, real teaching ability, and genuine warmth that makes a hard subject feel approachable.If you lead an AppSec program, write code for a living, run a security team trying to keep up with AI-assisted development, or you're just trying to figure out where this whole industry is heading, this is the episode for you.Resources from the episode:SecureMyVibeDevSec Station Podcast (Tanya's new show)She Hacks Purple ConsultingAlice and Bob Learn Application Security and Alice and Bob Learn Secure CodingOWASP Top 10 2025 — https://owasp.org/Top10/2025/Claude Mythos Preview System Card — AnthropicThanks for being here. If this episode landed for you, the best thing you can do is share it with one person on your team who'd find it useful, that's how this newsletter and show grow.

Tech Lead Journal
FeatureOps: The Safety Net You Need When Shipping with AI

Tech Lead Journal

Play Episode Listen Later Apr 27, 2026 64:49


(05:00) Brought to you by MailtrapMailtrap is a modern email delivery for developers with native SDKs support along with security compliant API & SMTP. Plus, you get 4,000 emails a month completely on their free tier! It also provides 24/7 support where you actually talk to real people, not an AI chatbot. Try Mailtrap for free at mailtrap.io.What happens when AI ships code faster than your team can review it? As agentic development accelerates your SDLC, the guardrails matter more than ever — and most teams don't have them.In this episode, Egil Osthus, CEO of Unleash, makes the case for FeatureOps as a strategic capability — not just a developer convenience. He explains the shift from a project mindset to a product mindset, where releases are decoupled from deployments and business outcomes matter more than shipping scope. Egil breaks down the four pillars of FeatureOps — gradual rollout, full stack experimentation, surgical rollback, and lifecycle management — and why each one becomes even more critical as AI-generated code flows faster into production. He also warns against building your own feature flag solution in-house, and shares what the rise of agentic development means for engineers who must now act as guardians of an oversight layer.Key topics discussed:Project mindset vs. product mindset in software deliveryThe 4 pillars of FeatureOps and what each one solvesWhy feature flags scare executives — and how to win them overDecoupling deployment from release across Dev, PM, and MarketingThe danger of rolling your own feature flag solutionHow local evaluation keeps feature flags fast and privateBlast radius management in an AI-accelerated SDLCWhat vibe coders get wrong about day-two operationsTimestamps:(00:00) Trailer & Intro(02:36) What Is the Current State of Feature Flag Adoption Across the Industry?(05:32) Why Is Feature Flag Adoption So Challenging Despite Its Apparent Simplicity?(10:44) How Does FeatureOps Differ From CI/CD and Progressive Delivery?(12:26) What Are the Four Core Pillars of FeatureOps?(16:11) How Can Teams Shift the Perception of Feature Flags From Tactical to Strategic?(20:46) How Do Feature Flags Align the Needs of Developers, Product Managers, and Marketing?(25:09) How Do Organizations Effectively Define Responsibilities for Strategic Feature Flags?(28:03) Does Using Feature Flags Enable Your Team to Deploy on Fridays?(30:41) What Is Unleash and How Does It Scale for Enterprise Needs?(34:54) What Are the Hidden Dangers of Building Your Own Feature Flag Solution?(39:32) Why Are Local Evaluation and Privacy Core to Unleash's Design?(44:48) How Does the Rise of AI Impact the Evolution of FeatureOps?(52:02) What Specific Guardrails Does FeatureOps Provide to Improve Safety?(54:21) Can FeatureOps Platforms Use AI to Autonomously Manage Feature Rollouts?(55:33) What Essential FeatureOps Advice Should Every Vibe Coder Follow?(59:53) 3 Tech Lead Wisdom_____Egil Osthus's BioEgil Østhus is the co-founder and CEO of Unleash, the world's leading open-source feature management platform. As a seasoned enterprise technologist and product strategist, he operates at the cutting edge of business and software engineering.Egil's mission is to help technology leaders and businesses move beyond traditional DevOps by embracing FeatureOps, a new methodology that provides a critical safety net for the accelerating, and often risky, world of agentic software development. He has a unique ability to speak the language of both engineers and senior executives, making complex topics accessible and actionable.Follow Egil:LinkedIn – linkedin.com/in/egilconrUnleash – getunleash.ioLike this episode?Show notes & transcript: techleadjournal.dev/episodes/256.Follow @techleadjournal on LinkedIn, Twitter, and Instagram.Buy me a coffee or become a patron.

Revenue Builders
Why Sales Execution Wins in an AI-First World with Brian McCarthy, President of Global Revenue and Field Operations at Cursor

Revenue Builders

Play Episode Listen Later Apr 24, 2026 55:12


PLG can create explosive growth, but it can also mask fundamental gaps in execution, capacity, and long-term durability. As AI-native companies scale at unprecedented speed, revenue leaders face a new tension: how to convert bottom-up adoption into enterprise value without breaking the system that fueled growth. Brian McCarthy joins to unpack how Cursor is navigating this shift, why sales execution becomes the moat in a world of swappable technology, and what it takes to build a go-to-market machine that keeps pace with innovation while deepening customer trust. Brian McCarthy is President of Global Revenue and Field Operations at Cursor and former CRO at Rubrik, where he helped scale the company from $118M to $1.5B in ARR. He is known for building high-performance revenue organizations and execution-focused cultures in complex enterprise environments. Connect with Brian: LinkedIn Resources mentioned: Ep. 71 - What the Best Sales Leaders Do with Brian McCarthy All In Podcast with Chamath, Jason, Sacks & Friedberg Key takeaways from this episode: 03:30 – Why great leaders know when to step away, and how building a successor is the true test of an execution machine 09:50 – What to look for in a once-in-a-career opportunity, and why timing matters more than brand or hype 17:03 – How PLG success created a capacity crisis, and why too much demand can degrade customer experience 29:29 – The decision to radically reduce account load, and how focus enables better selling and better buying experiences 32:16 – Why champions, not features, drive revenue, and how to intentionally build them across the organization 38:19 – The required balance between bottom-up adoption and top-down value selling in technical markets 41:31 – Why “clock speed” is the defining trait of modern sellers, and how enablement must fuel continuous learning 49:09 – The shift from tools to AI factories, and what it means for the future of software development and selling 52:01 – Why culture, trust, and human relationships remain the durable moat in a world of rapidly changing technology Hosted by five-time CRO John McMahon and Force Management Co-Founder John Kaplan, the Revenue Builders podcast goes behind the scenes with the sales leaders who have been there, done that, and seen the results. This show is brought to you by Force Management. We help companies improve sales performance, executing their growth strategy at the point of sale. Connect with Us: LinkedInYouTubeForce Management

Management Blueprint
327: 3D Print Your Software with Piyush Jain

Management Blueprint

Play Episode Listen Later Apr 13, 2026 23:51


Piyush Jain, Founder and CEO of Simpalm and co-founder of Ducknowl, is on a mission to solve real-world challenges by combining technology and entrepreneurship. With over 15 years of experience building custom software solutions, Piyush helps businesses turn complex ideas into practical applications by blending technical depth, business acumen, and a strong problem-solving mindset. We explore Piyush's AI Ideation Framework—Validate idea, Proof of concept, Design, Competitor analysis, and Feature selection—a practical approach to building software in the post-AI era. Piyush explains how AI can help teams better understand user personas, validate product assumptions, and rapidly prototype ideas, while human expertise remains essential in design, architecture, and production-grade development. He also shares how prompt engineering, peer-reviewed prompting, and a right-shoring delivery model can help businesses build smarter, faster, and more cost-effectively. — 3D Print Your Software with Piyush Jain Good day, dear listeners. Steve Preda here with the Management Blueprint, and my guest today is Piyush Jain, the Founder and CEO of Simpalm, a custom software development company, and the co-founder of Ducknowl, a candidate screening and assessment application business for high-volume recruiting. Piyush, welcome to the show.  Thank you, Steve. Thanks for inviting me.  Well, I’m very curious about the stuff that you have to share with us, and I’d like to ask first about your personal purpose. What is your “why,” and how are you manifesting it in your business?  Yeah, so that’s a very interesting question. And I think for every entrepreneur or tech founder, really, that's the motivation—why you want to do certain things. So for me, if I look at it, my personal “why” is: why are we not solving challenges? Or why are we not solving them the right way? Why are we not transforming our lives? I grew up in India and then came to the US, so I've seen many different parts of the world—from Asia to North America. I see people face different challenges, but then we are not focusing on solving those problems. A lot of it I see is there’s a lot of challenges in the world because I believe there are not enough entrepreneurs. Because entrepreneurs are the ones who really take risks, combine everything, and create solutions. That was like me, right? That’s what I learned growing up, that I think I can do that, right? I can combine the technical knowledge and the business acumen and create solutions that people like, solve their challenges. Growing up, like I'm more on the technical side.Share on X I was inclined more toward science and technology, but then as I got into my undergrad and grad school, I realized that I have that entrepreneurship aspect, but it's still around science and technology. That’s when I realized that, you know what, I cannot be a pure scientist or maybe a pure entrepreneur, but I can be someone who can combine these two, because my main driving factor is problem-solving. I can combine these two and then live my life, be very happy with what I do. That has been my motivation. I like it. So solving challenges and being an entrepreneur, and kind of combining the two—being the technical expert and the entrepreneur in one. Now, one of the things that we always talk about on this podcast is frameworks. And you have developed a really good one for AI ideation, which I think is something that everyone needs to do these days or use these days, and it helps you create business apps and other business applications. Can you share with me how that framework works, and what are the steps in it?  Sure, yeah, definitely. So just to give you a brief background, we've been building software for the last 15 years. Some companies have used different frameworks, whether it's Agile or Waterfall in SDLC, in building the software, right? There are different methodology that companies have used, and they've been good, successful—they've played their role. But now, with the advent of AI, things have changed. We had to figure out, in our organization, how to use AI, and that's how this framework was built. My team helped me building this framework as well.Share on X But we realized that we were losing business—we were losing clients—since we didn't have an AI framework that would fit our clients. Again, for me, it's a challenge. So anytime I see a challenge, it create brain juice in me, right? So I said, okay, let's figure out how we create this framework. How did you do it?  So really, we built this framework—very interesting. A lot of the steps are similar, but then a lot of things are different.Share on X Whenever client comes to us and says, “Hey, we want to solve this challenge,” what we do is we do enough research. And now we use a lot of AI tools to really understand the problem better and understand the user persona. When you build any software application, there is a person who's going to use that. Sometimes we used to do user research or focus studies to understand that. Now, with the help of AI, we can get a lot of ideas about the user persona. For example, maybe we are building a healthcare application for an anesthesiologist. I don’t know much about that. I know, I mean, because I have been through some medical surgery and all that, but I can't fully understand their user persona or their requirements with respect to the application we're building. But now, with AI, I can actually ask different AI models, “Hey, we are building this app for anesthesiologists. What are their pain points? How would they see it?” So all that deeper mindset and psychology we can get using AI.  You are validating the idea by interrogating AI applications.  What users are going to like and all that. So I will always use this term earlier. In software engineering, now we have this pre-AI and post-AI, right? If you read history, we talk about before Christ and after Christ, right? Yeah. So it's a similar thing now. Yeah, exactly. Or before Covid, after Covid. Before AI, after we did all the user research and everything and created a requirements document, we would usually do design, create like a visual design of the software. But now, with the AI framework, we don't do that. That's not the next step. What we do instead is create a quick prototype using AI platforms.Share on X  So there are a lot of AI platforms—like Lovable, Claude. Now ChatGPT launched Codex for coding, and Replit. Depending on what kind of application you're building—for example, maybe if you're building a web-based application—then I recommend using Lovable or Replit. They're very good at creating that. Whatever software you want to build, whatever user personas that you’re addressing, you can feed into that and it’ll create like a prototype application. Okay.  So what that does is actually, then this prototype, clients can just take it to their customers or internal users and get feedback. A picture is better than a thousand words. Organizations discussing an idea is very different from when they actually see something. Then everybody starts chipping in—“Oh yeah, I see this in the prototype, but I don't want this,” or “I want to move things around,” or “This is what I want.” Basically, building a prototype on AI platforms is much faster than building wireframes and design prototypes like we used to do earlier. So that has changed. So you're 3D printing your software, right?  Yes, exactly. There you go. Well, that’s a very good way you put it together. Yeah. So, yeah, exactly. You’re just 3D printing the software, right? So you can see it, visualize it, and then once you go through that, it creates a lot of better ideas about the software in faster time. So once you have that, then you go into UI/UX design. So in that also, there are two steps. One is wireframing. Wireframing is like creating the flow in black and white. It's like creating a skeleton of your software. It does not have the color, the font, or the branding, but you just create all the different user journeys, the screens, the flow, and the fields that will be there on the screen. So we have integrated AI into that step as well. Earlier, it used to be created by a designer or a business analyst. Now we are using software like Uizard or UX Pilot, where we define what we want—what kind of user journey, flows, and screens—and it creates that. It spins out those wireframes in minutes. So really that has reduced now. The time it used to take to create wire frames is faster now.  So you're designing the wireframes with AI?  Yes, but it's just the wireframe part of it, and it's still guided by our expert VA or designer—someone who knows how to really visualize things and has done a lot of wireframes and sketches. So they know what to tell the AI. Prompting is very important. It's very important that you know how to prompt—what to ask for—so that you can get variations and differentiation in the wireframes. You don't want a standard AI-created wireframe. Everybody can recognize AI-generated images now, right? If I show you one, you'd say, “Oh yeah, it's AI-generated.” I know that, right? Yeah. So again, we keep the human intelligence. We're not asking AI to create the full software end-to-end. It never works—it'll never work. It just doesn't. I know that's a strong statement, but I'm saying that based on experience and an understanding of human behavior and psychology. So AI agents will not be able to code software, in your opinion?  No, they can do the coding, but they cannot build the whole software end-to-end—a production-deployed software. Because these software are being used by humans. You have to have human intelligence to understand and define what you need and how it works.Share on X You can maybe create some software, but it doesn't work very well. Even if you use all these platforms, you can cut down your production time and cost by 30%, 40%, 50%, right? That's the number we are seeing—30 to 50% reduction, depending on the software you're building and the objectives. So just to recap—you validate the idea by interrogating Claude and ChatGPT, asking about the needs of that customer, the psychology of the customer—that's step number one. Step number two is 3D printing the software with Lovable or Replit—so proof of concept. And then you design the wireframes. And then what's next after you design the wireframes? What's the next step?  So that’s a good thing. That’s it. Now I'm going to talk about the human element—some people listening to this podcast will be surprised. Now it comes to visual design, right? So you've created the skeleton, and now you have to add the skin, the tone, the color, the emotion to the design, to the workflow. Now, we have tried AI, but it doesn't work. It's very monotonous. So we use an experienced visual designer, a UX designer, for that step—to give it emotion. When you use AI—I wish I could show you some examples—it creates very similar kinds of designs for apps and software. So what we did is we gave it three different apps with very different objectives and everything, and the designs it came up with were very similar—blocks, buttons—very monotonous. So there's no differentiation. And design is the main thing that becomes the differentiator, right?  Yeah.  So that's what we learned from our experience. And I say that very categorically in all of my talks—that visual design, final UX, has to be human, not AI.Share on X Because you are communicating emotions, right? And AI is still not there to communicate emotions.  Yeah. It doesn’t have emotions.  Well, some people will argue with you and say, “No, it can understand if you're sad or unhappy.” But my response to that is—it's because we've programmed it that way. But things change based on situation, context, ethnicity, culture, fear—how people express nervousness, fear, and all that—it's very different. So there was this AI video interviewing company five or six years ago. They were sued by the Department of Justice because they were trying to detect emotions of people like anxious, nervous, when the interview was happening.  It turned out their model was trained only on one race—they didn't account for other races or ethnicities. So their model failed, and they were sued by Department of Justice for that. So yeah, emotions is something—maybe they have unlimited dimensions, we don't know. So it's hard to program that. So basically: ideation, prototype, wireframe, and then final visual design—that's the discovery and design framework. Now, when it comes to development framework, this is where AI has been a game changer—the coding part. But again, you have to be very careful about how you use AI in your coding pattern with your coding team. It depends on the application, it depends on the tech stack, right? Every platform has its own strengths and weaknesses. For example, if you want to build a web-based application in the React JS framework, then Lovable is great. That's very good—very efficient and cost-effective. Then Claude is there. Claude has been really good in software engineering. I would say it has been built and designed mostly for coding, right? Anthropic—their idea, their starting point—was coding, how to make coding and software engineering better.  So they've been a front runner in the race. ChatGPT is trying to catch up using Codex, and Copilot is great. Copilot is mostly used by enterprises who are on the Microsoft stack. They use Copilot a lot for coding in .NET and enterprise-level applications. They’re used to co-pilot. It’s because they feel comfortable with Microsoft security policies and all that. That’s fine. But in general, we see Claude to be at the top—from our perspective. We've also built a framework for software coding. In software development, there's a popular process called peer review. So when you create source code, you get it reviewed by your peer—your colleague.Share on X  Is this what happens on GitHub?  Yeah, yes. So basically anywhere—any source code repository—you can do that. So your team members can help you make your code better and more efficient.  Yeah, I understand. But now we have a step called prompt peer review. When you're using prompts to build software, those prompts get reviewed by team members. Because if your prompts are not very specific or good enough all the way through the SDLC, you can run into a lot of challenges trying to fix the code. Because now you have a situation where you have code that you have not written fully, and when you ask AI to change something in the code, sometimes it ends up changing a lot of things that you don't want it to change. Yeah.  That's what we've seen, and that's why we evolved. Before we build any software, we create maybe a 10-, 20-, 30-page prompt document, where we go through each screen and function and write it out. It's very sophisticated—it has evolved really well. But the thing is, it takes a few days to do that within the team, because we know if we do it right, the next step is faster and more accurate. So really, the prompt document—think of it more like an architecture document. Earlier, we used to create a solution architecture document, defining all the tools, the design, everything.  But now it's more like an AI-driven solution architecture document with prompts, which get reviewed by team members. So we do that, and then we run that, and we get the code and everything. So I have a CTO club—I run a CTO Club in Maryland—and I was talking to CTOs. They're all using this, but some of them are so advanced that they actually define the test cases in the beginning. They define, “Okay, this is what I want, this is the function I want, and these are the test cases I want it to pass.” That's even more advanced. If you can do that, you can have very efficient code.  Yeah, I love it.  So is that the end? You have your test cases, you design the prompt, you peer-review the prompt, and you already had the prototype, so now you're coding the software—what's the last step?  Yeah. Then there’s an integration as well. So AI doesn’t do the integration so well. You can do the front-end coding, you can do the back-end coding, you can probably create the APIs. APIs require a lot more human intervention. But once you have that, then you have to connect it, right? You have to connect the front end with the backend. A lot of that is still done by the programmer. It's hard to rely on AI for doing that. And again, it depends on the application. Maybe if it's a smaller application, maybe you can have AI do that. But if it's a bigger application—we mostly build bigger applications—then integration, then final QA and testing, and deployment.  So all that is there. But in each of these steps, you can use some sort of AI tool to speed up the process. But the key is you still have to have your architecture, the process. You have to know the steps more. You have to be a good, experienced developer to use AI efficiently if you want to build a production-ready application. You can build a prototype. Anybody can build a prototype on Replit or Lovable, but it's not going to be production-ready that you can give to your customer and charge them money. So that’s the differentiator.  Yeah, I understand. So Piyush, I’d like to switch gears here. I understand the AI ideation framework—that's great. We talked about the technical part of it, the curiosity, the technical challenges. Let’s talk about the entrepreneurship part, which is also part of your profile. So what drives the growth of your business? What would you say drives it?  For us, there are multiple factors that drive the growth of our business. The first is, again, our problem-solving attitude. Any client that comes to us we communicate in that modelShare on X The problem, the challenge, the solution, the business part, the value proposition we bring. And the second factor is our location. We are here in Maryland, and we have another office in Chicago. So being here, we have a global shoring model—that's a main driving factor of our business from the entrepreneurship perspective. So what the global shoring model is: our client-facing team, the senior team, is here—solution architects, sales engineers, designers, project managers, business analysts—they are here in the US, client-facing. And our dev team and testers are in our offshore locations.  Some people call it hybrid shoring. I call it right shoring. The reason I call it right shoring is because in this model, you have the right people at the right shore, so you get the most value. Here, you have people who understand the culture, the product, the context—because products are used by people in a certain culture. And if you are not in that culture, if you haven't experienced it, it's always harder to design the right software solution. I was one of the first people to start that model here in the DMV area for mid-size and smaller companies. This model existed before, but mostly for large enterprise companies. They have used that. But I started to offer that 16 years ago to smaller companies. Either companies were just going offshore, or they were doing onshore, right? I introduced this hybrid—or right-shoring—model, and it has been well received by our customers. So that’s it.  So what is one thing that you’re trying to figure out in your business right now?  Right now, what I'm trying to figure out in my business is scaling. I mean, we have built solutions for many different industries. We have built solutions for different clients in fintech, healthcare, education, nonprofit, startups, IoT, construction. But now what we are trying to figure out is how do we create some off-the-shelf solutions for different industries? Because one challenge we see is that, from the client's perspective, getting custom software built takes time and money. But in certain use cases, we can have off-the-shelf, industry-specific solutions, and then customize those based on the client's needs.  So that's what we are trying to figure out—across different industries, what those solutions can be—so we can scale and also make it easier. And these are more like AI-driven, off-the-shelf solutions that are customizable. So think of it like Salesforce—its core is off-the-shelf, but then you can customize the front end and a lot of other things. Not exactly like Salesforce, but more like industry-specific solutions for different use cases—nonprofit, construction, right? With those, overall, we can build solutions faster.  That’s fascinating. So how has the offshoring—or right shoring, as you call it—model evolved over the past 10 years? Is it different now than it was 10 or 20 years ago?  Yeah, I think that's a great question. It has evolved and changed. Earlier—maybe 10, 12 years ago—when we were talking about hybrid shoring, we were mostly talking about the US and Asia. But now we have different players. We have the nearshore model, which has become quite popular as well—like South America. We have team members in nearshore locations as well, in South America, because we want to leverage different time zones, resources, and culture. And we've seen very positive results. Then you have Eastern Europe. We have competition from countries like Ukraine, Belarus, Romania, Poland. I think it’s the part of the globalized world, right? It's like energy flowing in different spaces—it's not limited to one place, which is great. That's one way it has evolved.  I also know some companies working in Kenya—there are developers there. Some companies are setting up in East Africa, West Africa. So different places are playing roles now. That’s one thing I see. And now, with the help of AI, what's going to happen is it will play two roles. One— in many situations, with AI, you can do more things onshore. That’s one aspect of it. And second—with AI, someone sitting offshore who knows how to use AI can become very competitive as well. We don't have enough data yet to fully see how this will evolve, but maybe in a year or so, we'll see how it plays out.  But I also find that with these simultaneous translation tools—like Apple, I think an iPhone can now translate in all languages. Essentially, another barrier falls that if the language and knowledge of your offshore contractor is not perfect, they can understand things much more clearly because of simultaneous translation. Even on Zoom, you can now flip a switch and they can read what's being said in their own language during a conversation. So that's amazing, I think.  Yeah. That’s amazing. That’s amazing. They can understand more about the culture and mindset. So that's something have to see. Again, I think it depends on the use case, the application, the problem we're solving. But in some cases, it might be great news for onshore—we can keep more dollars here. But keeping dollars here with AI also means a lot of that spend is going to AI, right? So that's one thing—we have to be very careful. Yesterday, in our tech breakfast, our presentation was about how to optimize your AI tokens. There are some companies spending $150,000 per year per employee on tokens. Wow.  That's like the salary of one employee.  Yeah.  A mid-level developer—$150K—they're spending that much. And then they’re trying to figure out how to optimize it. And on top of that, they have cloud costs, right? AWS, Azure—those costs are still there—and then you add AI. So it's a lot of money. You really have to be very smart about understanding and optimizing it. That’s why the prompting is so important, right? It's not just about getting the right software—it's also about getting the cost down.  Yeah. Again, you need expert people who can prompt well, because it's about being able to communicate well. Prompting is about communication—it's about clarity, brevity, security, all that stuff. So, Piyush, we're coming close to the end of the recording. If someone would like to learn more about the applications you develop, how you're using AI, and how you can help their business develop technology, where can they find you? What's the best way to get in touch with you? Sure, there are many ways people can reach out to me. They can go to my website, www.simpalm.com—we have a contact form there. They can submit the form, or they can reach out to me via email directly at contact@simpalm.com. They can also connect with me on LinkedIn. I'm on LinkedIn—message me there if somebody needs anything. I always like discussing problems and what the solutions can be. If anybody reaches out to me, I'm always very quick to respond.  That's awesome. So Piyush Jain, the CEO of Simpalm—and we didn't even talk about your other business, Ducknowl—thank you for coming, and thank you for sharing your insights and your framework on how to build an ideation framework for AI. So thanks for sharing that. And if you're listening and you enjoyed this conversation, then stay tuned, because every week we have another entrepreneur sharing their insights and frameworks with you. So make sure you follow us on YouTube, subscribe, and give us a review on Apple Podcasts. So thanks for coming. Thank you, Steve. It was a pleasure talking to you. Important Links: Piyush's LinkedIn Piyush's website

Paul's Security Weekly
AppSec News Roundup on Claude Code Leak, Axios NPM Compromise, Secure Design - Idan Plotnik, Raj Mallempati - ASW #377

Paul's Security Weekly

Play Episode Listen Later Apr 7, 2026 68:42


Security problems aren't changing very much even though security teams are. We catch up on the implications of the Claude Code source leak, the very human lessons from the axios NPM compromise, and what secure design looks like when it involves agents, humans, or both. AppSec has always celebrated interesting and impactful vulns. And LLMs are now a favored tool for finding flaws. We shouldn't forget the success and effectiveness of fuzzers like OSS-Fuzz, which has improved security for over 1,000 projects and found over 50,000 bugs. But we can't ignore the ease of prompting an agent to go find -- and exploit -- a vuln when the UX and overhead of doing so is hardly more than writing some markdown. The SDLC Blind Spot: Why Breaches Start with Identity, Not Code Developers have access to source code, CI/CD pipelines, and cloud infrastructure — and attackers know it. Target lost 860GB of source code through a single compromised credential. Recruitment fraud campaigns have pivoted from a compromised developer to cloud admin in under 10 minutes. As agents join human developers, contractors, and service accounts in the SDLC, the attack surface is expanding faster than static security tools can track. Security teams need real-time visibility beyond code and into who has access and what they're actually doing. This segment is sponsored by Apiiro. To lean more, visit https://securityweekly.com/apiirorsac. How AI-Driven Development is Reshaping the Application Risk Landscape Agent coding assistants are accelerating software development, generating more code and more change than security teams were built to handle. In this interview, Idan Plotnik discusses how AI-driven development is reshaping the application risk landscape and why traditional vulnerability management models can't keep up. Make sure to schedule a free SDLC Risk Assessment with BlueFlag Security - 30 minutes to deploy. 48 hours to results. Please visit https://securityweekly.com/blueflagrsac. Visit https://www.securityweekly.com/asw for all the latest episodes! Show Notes: https://securityweekly.com/asw-377

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Extreme Harness Engineering for Token Billionaires: 1M LOC, 1B toks/day, 0% human code, 0% human review — Ryan Lopopolo, OpenAI Frontier & Symphony

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Apr 7, 2026 72:43


We're proud to release this ahead of Ryan's keynote at AIE Europe. Hit the bell, get notified when it is live! Attendees: come prepped for Ryan's AMA with Vibhu after.Move over, context engineering. Now it's time for Harness engineering and the age of the token billionaires.Ryan Lopopolo of OpenAI is leading that charge, recently publishing a lengthy essay on Harness Eng that has become the talk of the town:In it, Ryan peeled back the curtains on how the recently announced OpenAI Frontier team have become OpenAI's top Codex users, running a >1m LOC codebase with 0 human written code and, crucially for the Dark Factory fans, no human REVIEWED code before merge. Ryan is admirably evangelical about this, calling it borderline “negligent” if you aren't using >1B tokens a day (roughly $2-3k/day in token spend based on market rates and caching assumptions):Over the past five months, they ran an extreme experiment: building and shipping an internal beta product with zero manually written code. Through the experiment, they adopted a different model of engineering work: when the agent failed, instead of prompting it better or to “try harder,” the team would look at “what capability, context, or structure is missing?”The result was Symphony, “a ghost library” and reference Elixir implementation (by Alex Kotliarskyi) that sets up a massive system of Codex agents all extensively prompted with the specificity of a proper PRD spec, but without full implementation:The future starts taking shape as one where coding agents stop being copilots and start becoming real teammates anyone can use and Codex is doubling down on that mission with their Superbowl messaging of “you can just build things”.Across Codex, internal observability stacks, and the multi-agent orchestration system his team calls Symphony, Ryan has been pushing what happens when you optimize an entire codebase, workflow, and organization around agent legibility instead of human habit.We sat down with Ryan to dig into how OpenAI's internal teams actually use Codex, why the real bottleneck in AI-native software development is now human attention rather than tokens, how fast build loops, observability, specs, and skills let agents operate autonomously, why software increasingly needs to be written for the model as much as for the engineer, and how Frontier points toward a future where agents can safely do economically valuable work across the enterprise.We discuss:* Ryan's background from Snowflake, Brex, Stripe, and Citadel to OpenAI Frontier Product Exploration, where he works on new product development for deploying agents safely at enterprise scale* The origin of “harness engineering” and the constraint that kicked off the whole experiment: Ryan deliberately refused to write code himself so the agent had to do the job end to end* Building an internal product over five months with zero lines of human-written code, more than a million lines in the repo, and thousands of PRs across multiple Codex model generations* Why early Codex was painfully slow at first, and how the team learned to decompose tasks, build better primitives, and gradually turn the agent into a much faster engineer than any individual human* The obsession with fast build times: why one minute became the upper bound for the inner loop, and how the team repeatedly retooled the build system to keep agents productive* Why humans became the bottleneck, and how Ryan's team shifted from reviewing code directly to building systems, observability, and context that let agents review, fix, and merge work autonomously* Skills, docs, tests, markdown trackers, and quality scores as ways of encoding engineering taste and non-functional requirements directly into context the agent can use* The shift from predefined scaffolds to reasoning-model-led workflows, where the harness becomes the box and the model chooses how to proceed* Symphony, OpenAI's internal Elixir-based orchestration layer for spinning up, supervising, reworking, and coordinating large numbers of coding agents across tickets and repos* Why code is increasingly disposable, why worktrees and merge conflicts matter less when agents can resolve them, and what it really means to fully delegate the PR lifecycle* “Ghost libraries”, spec-driven software, and the idea that a coding agent can reproduce complex systems from a high-fidelity specification rather than shared source code* The broader future of Frontier: safely deploying observable, governable agents into enterprises, and building the collaboration, security, and control layers needed for real-world agentic workRyan Lopopolo* X: https://x.com/_lopopolo* Linkedin: https://www.linkedin.com/in/ryanlopopolo/* Website: https://hyperbo.la/contact/Timestamps00:00:00 Introduction: Harness Engineering and OpenAI Frontier00:02:20 Ryan's background and the “no human-written code” experiment00:08:48 Humans as the bottleneck: systems thinking, observability, and agent workflows00:12:24 Skills, scaffolds, and encoding engineering taste into context00:17:17 What humans still do, what agents already own, and why software must be agent-legible00:24:27 Delegating the PR lifecycle: worktrees, merge conflicts, and non-functional requirements00:31:57 Spec-driven software, “ghost libraries,” and the path to Symphony00:35:20 Symphony: orchestrating large numbers of coding agents00:43:42 Skill distillation, self-improving workflows, and team-wide learning00:50:04 CLI design, policy layers, and building token-efficient tools for agents00:59:43 What current models still struggle with: zero-to-one products and gnarly refactors01:02:05 Frontier's vision for enterprise AI deployment01:08:15 Culture, humor, and teaching agents how the company works01:12:29 Harness vs. training, Codex model progress, and “you can just do things”01:15:09 Bellevue, hiring, and OpenAI's expansion beyond San FranciscoTranscriptRyan Lopopolo: I do think that there is an interesting space to explore here with Codex, the harness, as part of building AI products, right? There's a ton of momentum around getting the models to be good at coding. We've seen big leaps in like the task complexity with each incremental model release where if you can figure out how to collapse a product that you're trying to.Build a user journey that you're trying to solve into code. It's pretty natural to use the Codex Harness to solve that problem for you. It's done all the wiring and lets you just communicate in prompts. To let the model cook, you have to step back, right? Like you need to take a systems thinking mindset to things and constantly be asking, where is the Asian making mistakes?Where am I spending my time? How can I not spend that time going forward? And then build confidence in the automation that I'm putting in place. So I have solved this part of the SDLC.swyx: [00:01:00] All right.[00:01:03] Meet Ryan swyx: We're in the studio with Ryan from OpenAI. Welcome.Ryan Lopopolo: Hi,swyx: Thanks for visiting San Francisco and thanks for spending some time with us.Ryan Lopopolo: Yeah, thank you. I'm super excited to be here.swyx: You wrote a blockbuster article on harness engineering. It's probably going to be the defining piece of this emerging discipline, huh?Ryan Lopopolo: Thank you. It is it's been fun to feel like we've defined the discourse in some sense.swyx: Let's contextualize a little bit, this first podcast you've ever done. Yes. And thank you for spending with us. What is, where is this coming from? What team are you in all that jazz?Ryan Lopopolo: Sure, sure.Ryan Lopopolo: I work on Frontier Product Exploration, new product development in the space of OpenAI Frontier, which is our enterprise platform for deploying agents safely at scale, with good governance in any business. And. The role of VMI team has been to figure out novel ways to deploy our models into package and products that we can sell as solutions to enterprises.swyx: And you have a background, I'll just squeeze it in there. Snowflake, brick, [00:02:00] stripe, citadel.Ryan Lopopolo: Yes. Yes. Same. Any kind of customerswyx: entire life. Yes. The exact kind of customer that you want to,Vibhu: so I'll say, I was actually, I didn't expect the background when I looked at your Twitter, I'm seeing the opposite.Stuff like this. So you've got the mindset of like full send AI, coding stuff about slop, like buckling in your laptop on your Waymo's. Yes. And then I look at your profile, I'm like, oh, you're just like, you're in the other end too. Oh, perfect. Makes perfect.Ryan Lopopolo: I it's quite fun to be AI maximalist if you're gonna live that persona.Open eye is the place to do it. And it'sswyx: token is what you say.Ryan Lopopolo: Yeah. Certainly helps that we have no rate limits internally. And I can go, like you said, full send at this stay.swyx: Yeah. Yeah. So the Frontier, and you're a special team within O Frontier.Ryan Lopopolo: We had been given some space to cook, which has been super, super exciting.[00:02:47] Zero Code ExperimentRyan Lopopolo: And this is why I started with kind of a out there constraint to not write any of the code myself. I was figuring if we're trying to make agents that can be deployed into end to enterprises, they should be [00:03:00] able to do all the things that I do. And having worked with these coding models, these coding harnesses over 6, 7, 8 months, I do feel like the models are there enough, the harnesses are there enough where they're isomorphic to me in capability and the ability to do the job.So starting with this constraint of I can't write the code meant that the only way I could do my job was to get the agent to do my job.Vibhu: And like a, just a bit of background before that. This is basically the article. So what you guys did is five months of working on an internal tool, zero lines of code over a mi, a million lines of code in the total code base.You say it was cenex, more like it was cenex faster than you would've. If you had done it by end. SoRyan Lopopolo: yeah, thatVibhu: was the mindset going into this, right?Ryan Lopopolo: That's right.[00:03:46] Model Upgrades LessonsRyan Lopopolo: Started with some of the very first versions of Codex CLI, with the Codex Mini model, which was obviously much less capable than the ones we have today.Which was also a very good constraint, right? Quite a visceral feeling to ask the [00:04:00] model to build you a product feature. And it just not being able to assemble the pieces together.Which kind of defined one of the mindsets we had for going into this, which is whenever the model just cannot, you always pop open at the task, double click into it, and build smaller building blocks that then you can reassemble into the broader objective.And it was quite painful to do this. Honestly, the first month and a half was. 10 times slower than I would be. But because we paid that cost, we ended up getting to something much more productive than any one engineer could be because we built the tools, the assembly station for the agent to do the whole thing.[00:04:43] Model Generations, Build Systems & Background ShellsRyan Lopopolo: But yeah, so onward to G BT 5, 5, 1, 5, 2, 5, 3, 5 4. To go through all these model generations and see their kind of corks and different working styles also meant we had to adapt the code base to change things up when the model was revved. [00:05:00] One interesting thing here is five two, the Codex harness at the time did not have background shells in it, which means we were able to rely on blocking scripts to perform long horizon work.But with five, three and background shells, it became less patient, less willing to block. So we had to retool the entire build system to complete in under a minute and. This is not a thing I would expect to be able to do in a code base where people have opinions. But because the only goal was to make the Asian productive over the course of a week, we went from a bespoke make file build to Basil, to turbo to nx and just left it there because builds were fast at that point.swyx: Interesting. Talk more about Turbo TenX. That's interesting ‘cause that's the other direction that other people have been doing.Ryan Lopopolo: Ultimately I have. Not a lot of experience with actual frontend repo architecture.swyx: You're talking that Jessica built the sky. So I'm like, I know the NX team. I know Turbo from Jared [00:06:00] Palmer.And I'm like, yeah, that's an interesting comparison.[00:06:02] One Minute Build LoopRyan Lopopolo: The hill we were climbing right, was make it fast.swyx: Is there a micro front end involved? Is it how how complex reactRyan Lopopolo: electron base single app sort of thingswyx: And must be under a minute. That's an interesting limitation. I'm actually not super familiar with the background shelf stuff.Probably was talked about in the fight three release.Ryan Lopopolo: BA basically means that codex is able to spawn commands in the background and then go continue to work while it waits for them to finish. So it can spawn an expensive build and then continue reviewing the code, for example.swyx: Yeah.Ryan Lopopolo: And this helps it be more time efficient for the user invoking the harness.swyx: And I guess and just to really nail this, like what does one minute matter? Like why not five, okay, good. We want no. WeRyan Lopopolo: want the inner loop to be as fast as possible. Okay. One minute was just a nice round number and we were able to hit it.swyx: And if it doesn't complete, it kills it or some something,Ryan Lopopolo: No.We just take that as a signal that we need to stop what we're doing, double click, decompose a build graph a bit to get us to high back under so that we [00:07:00] can able the agent continue to operate.swyx: It's almost like you're, it's like a ratchet. It's like you're forcing build time discipline, because if you don't, it'll just grow and grow.That's right. And you mentioned that my current, like the software I work on currently is at 12 minutes. It sucks.Ryan Lopopolo: This has been my experience with platform teams in the past, where you have an envelope of acceptable build times and you let it go up to breach and then you spend two, three weeks to bring it back down to the lower end of the average low bed stop.But because tokens are so cheap Yeah. And we're so insanely parallel with the model, we can just constantly be gardening this thing to make sure that we maintain these in variants, which means. There's way less dispersion in the code and the SDLC, which means we can simplify in a way and rely on a lot more in variance as we write the software.[00:07:45] Observability, Traces & Local Dev StackVibhu: Lovely.[00:07:46] Humans Are BottleneckVibhu: You mentioned in your article, like humans became the bottleneck, right? You kicked off as a team of three people. You're putting out a million line of code, like 1500 prs, basically. What's the mindset there? So as much as code is disposable, you're doing a lot of review. A lot [00:08:00] of the article talks about how you wanna rephrase everything is prompting everything, is what the agent can't see.It's kind of garbage, right? You shouldn't have it in there. So what's like the high level of how you went about building it, and then how you address okay, humans are just PR review. Like how is human in the loop for this?Ryan Lopopolo: We've moved beyond even the humans reviewing the code as well.[00:08:19] Human Review, PR Automation & Agent Code ReviewRyan Lopopolo: Most of the human review is post merge at this point.But post, post merge, that's not even reviewed. That's justswyx: Oh, let's just make ourselves happy by YouRyan Lopopolo: haven't used fundamentally. The model is trivially paralyzable, right? As many GPUs and tokens as I am willing to spend, I can have capacity to work with my hood base.The only fundamentally scarce thing is the synchronous human attention of my team. There's only so many hours in the day we have to eat lunch. I would like to sleep, although it's quite difficult to, stop poking the machine because it makes me want to feed it. You have to step back, right?Like you need to take a systems thinking mindset to things and [00:09:00] constantly be asking where is the agent making mistakes? Where am I spending my time? How can I not spend that time going forward? And then build confidence in the automation that I'm putting in place. So I have solved this part of the SDLC, and usually what that has looked like is like we started needing to pay very close attention to the code because the agent did not have the right building blocks to produce.Modular software that decomposed appropriately that was reliable and observable and actually accrued a working front end in these things, right?[00:09:35] Observability First SetupRyan Lopopolo: So in order to not spend all of our time sitting in front of a terminal at most, doing one or two things at a time, invested in giving the model that observability, which is that that graph in the post here.swyx: Yeah. Let's walk through this traces and which existed firstRyan Lopopolo: we started with just the app and the whole rest of it. From vector through to all these login metrics, APIs was, I dunno, half an [00:10:00] afternoon of my time. We have intentionally chosen very high level fast developer tools. There's a ton of great stuff out there now.We use me a bunch, which makes it trivial to pull down all these go written Victoria Stack binaries in our local development. Tiny little bit of python glue to spin all these up. And off you go. One neat thing here is we have tried to invert things as much as possible, which is instead of setting up an environment to spawn the coding agent into, instead we spawn the coding agent, like that's the entry point.It's just Codex. And then we give Codex via skills and scripts the ability to boot the stack if it chooses to, and then tell it how to set some end variables. So the app and local Devrel points at this stack that it has chosen to spin up. And this I think is like the fundamental difference between reasoning models and the four ones and four ohs of the past, where these models could not think so you had to put them in [00:11:00] boxes with a predefined set of state transitions.Whereas here we have the model, the harness be the whole box. And give it a bunch of options for how to proceed with enough context for it to make intelligent choices. SoVibhu: sales, so like a lot of that is around scaffolding, right? Yes. Previous agents, you would define a scaffold. It would operate in that.Lube, try again. That's pivoted off from when we've had reasoning models. They're seeming to perform better when you don't have a scaffold, right? That's right.[00:11:28] Docs Skills GuardrailsVibhu: And you go into like niches here too, like your SPEC MD and like having a very short agent MG Agent md.swyx: Yes. Yes.Vibhu: Yeah. So you even lay out what it is here, but I likeswyx: the table contents.Vibhu: Yeah.swyx: Like stuff like this, it really helps guide people because everyone's trying to do this.Ryan Lopopolo: This structure also makes it super cheap to put new content into the repository to steer both the humans and the agents.swyx: You, you reinvented skills, right?Vibhu: One big agents andswyx: skills from first princip holdsRyan Lopopolo: all skills did not exist when we started doing this.Vibhu: You have a short [00:12:00] one 100 line overall table of contents and then you have little skills, right? Core beliefs, MD tech tracker. Yeah. Yeah. The scale is overRyan Lopopolo: The tech jet tracker and the quality score are pretty interesting because this is basically a tiny little scaffold, like a markdown table, which is a hook for Codex to review all the business logic that we have defined in the app, assess how it matches all these documented guardrails and propose follow up work for itself.Before beads and all these ticketing systems, we were just tracking follow up work as notes in a markdown file, which, we could spa an agent on Aron to burn down. There's this really neat thing that like the models fundamentally crave text. So a lot of what we have done here is figure out ways to inject textswyx: intoRyan Lopopolo: the system right when we get a page, because we're missing a timeout, for example.I can just add Codex in Slack on that page and say, I'm gonna fix this by adding a timeout. Please update our reliability documentation. To require that all network calls have [00:13:00] timeouts. So I have not only made a point in time fix, but also like durably encoded this process knowledge around what good looks like.swyx: Yeah.Ryan Lopopolo: And we give that to the root coding agent as it goes and does the thing. But you can also use that to distill tests out of, or a code review agent, which is pointed at the same things to narrow the acceptable universe of the code that's produced.swyx: I think one of the concerns I have with that kind of stuff is you think you're making the right call by making, it's persisted for all time across everything.Yes. But then you didn't think about the exceptions that you need to make, right? And that you have to roll it back.Vibhu: Part of it isswyx: also sometimes it can follow your s instructions too.Vibhu: It's somewhat a skill, right? So it determines when it uses the tools, right? Like it's not like it'll run outta every call.It'll determine when it wants to check quality score, right?Ryan Lopopolo: Yeah. And we do in the prompts we give these agents, allow them to push back,[00:13:51] Agent Code Review RulesRyan Lopopolo: When we first started adding code review agents to the pr, it would be Codex, CLI. Locally writes the change, pushes up a PR on [00:14:00] those PR synchronizations of review agent fires.It posts a comment. We instruct Codex that it has to at least acknowledge and respond to that feedback. And initially the Codex driving the code author was willing to be bullied by the PR reviewer, which meant you could end up in a situation where things were not converging. So yeah, we had to,swyx: he's just a thrash.Ryan Lopopolo: We had to add more optionality to the prompts on both of these things, right? The reviewer agents were instructed to bias toward merging the thing to not surface anything greater than a P two in priority. We didn't really define P two, but we gave it, youswyx: did define P two.Ryan Lopopolo: We gave it a framework within which to score its outputswyx: and then greater than P zero is worse, right?Yes. P two is very good.Ryan Lopopolo: P zero is you will mute the code place ifswyx: you merch thisRyan Lopopolo: thing, right?swyx: Yeah.Ryan Lopopolo: But also on the code authoring agent side, we also gave it the flexibility to either defer or push back against review feedback, right? This happens all the time, right? Like I happen to notice something and leave a code review, [00:15:00] which.Could blow up the scope by a factor of two. I usually don't mean for that to be addressed Exactly. In the moment. It's more of an FYI file it to the backlog, pick it up in the next fix it week sort of thing. And without the context that this is permissible, the coding agents are gonna bias toward what they do, which is following instructions.swyx: Yeah.[00:15:19] Autonomous Merging Flowswyx: I do wanted to check in on a couple things, right? Sure. All the coding review agent, it can merge autonomously. I think that's something that a lot of people aren't comfortable with. And you have a list here of how much agents do they do Product code and tests, CI configuration and release tooling, internal Devrel tools, documentation eval, harness review, comments, scripts that manage the repository itself, production dashboard definition files, like everything.Yes. And so they're just all churning at the same time, is there like a record that, that any human on the team pulls to stop everythingRyan Lopopolo: Because we are building a native application here. We're not doing continuous deploy. So there's still a human in the loop for cutting the release branch.I see. We require a blessed [00:16:00] human approved smoke test of the app before we promote it to distribution, these sort of things.swyx: So you're working on the app, you're not building like infrastructure where you have like nines of reliability, that kinda stuff?Ryan Lopopolo: That's correct. That's correct. Okay. And also like full recognition here that all of this activity took in a completely greenfield repository.There's. Should be no script that this applies generally toswyx: this is a production thing, you're gonna shipRyan Lopopolo: toswyx: customers. Of course. Yeah, of course. So this is realVibhu: And like one of the things there is, you mentioned you started this as a repo from scratch. The onboarding first month or so was pretty, it was like working backwards, right?Yeah. And then you had to work with the system and now you're at that point where you know, you're very autonomous. I'm curious like, okay, so what, how human in the loop is it? So what are the bottlenecks that you wish you could still automate? And part of that is also like, where do you see the model trajectory improving and offloading more human in the loop?We just got 5.4. It's a really good,Ryan Lopopolo: fantastic model, by the way.Vibhu: Yeah. Yeah. It's the first one that's merged. Top tier coding. So it's codex level coding and reasoning. So general reasoning both in one model. SoRyan Lopopolo: andVibhu: computer [00:17:00] use vision.Ryan Lopopolo: Now we now with five four, I can just have Codex write the blog post, whereas for this one I had to balance between chat.swyx: Oh, I need to, I might be out of a job. Oh my God.Ryan Lopopolo: Oh,swyx: I know. You just gave me an idea for a completely AI newsletter that five four could do. Yeah, I get it Now.Ryan Lopopolo: This sort of thing is just one example of closing the loop, right? Like the dashboard thing you mentioned. We have Codex authoring the Js ON, for the Grafana dashboards and publishing them and also responding to the pages, which means when it gets the page, it knows exactly which dashboards are defined and what alerts.What alert was triggered by which exact log in the code base. ‘cause all of this stuff is collated together.swyx: It has to own everything.Yes. Yeah. Yeah.Ryan Lopopolo: And it means that if we have an outage that did not result in a page. It has the existing set of dashboards available to it. It has the existing set of metrics and logs and can figure out where the gaps in the dashboard are or [00:18:00] in the underlying metrics and fix them in one go.In the same way, you would have a full stack engineer be able to drive a feature from the backend all the way to the front end.Vibhu: So it, it seems like a lot of the work you guys had to do was you as a small team are fully working for a way that the model wants the software to be written. It's like less human legible for better. Code legibility, agent legibility. How do you think that affects broader teams? So one at OpenAI, do liaison, like this is how software should be written. Like I can imagine, say you join a new team with this methodology, this mindset there's ways that, teams do code review, teams write code, like teams are structured and a lot of it is for human legibility.So should we all swap? Like how does this play back one broader into OpenAI and then like broader into the software engineering, right? Is it like teams that pick this up will it's pretty drastic, right? You have to make a pretty big switch. Should they just full send Yeah.Ryan Lopopolo: The mindset is very much that I'm removed from the process, right? I can't really have deep code level opinions about [00:19:00] things. It's as if I'm. Group tech leading a 500 person organization.Vibhu: Yeah.Ryan Lopopolo: Like it's not appropriate for me to be in the weeds on every pr. This is why that post merge code review thing is like a good analog here, right?Like I have some representative sample of the code as it is written, and I have to use that to infer what the teams are struggling with, where they could use help, where they're already moving quickly and I can pivot my focus elsewhere.Vibhu: Yeah.Ryan Lopopolo: So I don't really have too many opinions around the code as it is written.I do, however, have a command based class, which is used to have repeatable chunks of business logic that comes with tracing and metrics and observability for free. And the thing to focus on is not how that business logic is structured, but that it uses this primitive ‘cause I know that's gonna give leverage by default.Vibhu: Yeah.Ryan Lopopolo: Yeah, back to that sort of systems stinking,Vibhu: and you have part of that in your blog post, enforcing architecture and ta taste how you set boundaries for what's used. There's also a section on redefining [00:20:00] engineering and stuff, but yeah, it's just, it's interesting to hear,Ryan Lopopolo: and as the models have gotten better, they have gotten better at proposing these abstractions to unblock themselves, which again, lets me move higher and higher up the stack to look deeper into the future on what ultimately blocked the team from shipping.swyx: Yeah. You mentioned so you, this is primarily a, it is like a 1 million line of code base electron app. But it manages its own services as well, so it's like a backend for front end type thing.Ryan Lopopolo: We do have a backend in there, but that's hosted in the cloud.Yeah. This sort of structure is actually within the separate main and render processesWithin theswyx: electric.That's just how electronic works.Ryan Lopopolo: Yeah, of course. So have also treated like. MVC style decomposition with the same level of rigor, which has been very fun.swyx: I have a fun pun. This is a tangent, NVC is model view controller. Any sort of full stack web Devrel knows that.But my AI native version of this is Model view Claw, the clause the harness.Ryan Lopopolo: That's right. That's right. I do think that there is an interesting space to [00:21:00] explore here with Codex, the harness as part of building AI products, right? There's a ton of momentum around getting the models to be good at coding.We've seen big leaps in like the task complexity with each incremental model release where if you can figure out how to collapse a product that you're trying to build, a user journey that you're trying to solve into code, it's pretty natural to use the Codex Harness to solve that problem for you. It's done all the wiring and lets you just communicate and prompts to let the model cook.Yeah. It's been very fun. And there's also a very engineering legible way of increasing capabil. It's fantastic, right? Yeah. Just give you, just give the model scripts, the same scripts you would already build for yourself.swyx: Yeah.Yeah. So for listeners, this is Ryan saying that software engineering or coding against will eat knowledge work like the non-coding parts that you would normally think.Oh, you have to build a separate agent for it. No, start a coding agent and go out from there. Which open Claw has like it's pie Underhood.Ryan Lopopolo: [00:22:00] Yes.Vibhu: Basically define your task in code. Everything is a codingswyx: agent by the way. Since I brought it up, it's probably the only place we bring it up. Is any open claw usage from you?Any?Ryan Lopopolo: No. No. Not for me. I don't have any spare Mac Minis rattling around my house.swyx: You can afford it? No. I just, I'm curious if it's changed anything in opening eye yet, but it's probably early days. And then the other, the other thing I, I wanna pull on here is like you mentioned ticketing systems and you mentioned prs and I'm wondering if both those things have to go away or be reinvented for this kind of coding.So the git itself and is like very hostile to multi-agent.Ryan Lopopolo: Yeah. We make very heavy use of work trees.swyx: But like even then, like I just did a, dropped a podcast yesterday with Cursors saying, and they said they're getting rid of work trees ‘cause it still has too many merge conflicts.It's still un too un unintuitive. But go ahead.Ryan Lopopolo: The models are really great at resolving merge conflicts. Yeah. And to get to a state where I'm not synchronously in the loop in my terminal, I almost don't care that there are mergeswyx: with disposable.[00:23:00] Yeah.Ryan Lopopolo: We invoke a dollar land skill and that coaches codex to push the PR Wait for human and agent reviewers Wait for CI to be green.Fix the flakes if there are any merged upstream. If the PR comes into conflict, wait for everything to pass. Put it in the merge queue. Deal with flakes until it's in Maine. End. This is what it means to delegate fully, right? This is in a, very large model re probably a significant tax on humans to get PRS merged, but the agent is more than capable of doing this and I really don't have to think about it other than keep my laptop open.swyx: Yeah. I used to be much more of a control freak, but now I'm like, yeah, actually you could do a better job of this than me. Yeah. With the right context. Yes.[00:23:47] Encoding Requirementsswyx: Anything else in harness in general? Just this piece, I just wanna make sure we,Ryan Lopopolo: I think one thing that I maybe didn't make super clear in the article that I heard on Twitter as an interesting, that's respond [00:24:00]swyx: to them.What's the chatter and then what's your response?Ryan Lopopolo: Ultimately, all the things that we have encoded in docs and tests and review agents and all these things are ways to put all the non-functional requirements of building high scale, high quality, reliable software into a space that prompt injects the agent.We either write it down as docs, we add links where the error messages tell how to do the right thing. So the whole meta of the thing is to basically tease out of the heads of all the engineers on my team, what they think good looks like, what they would do by default, or what they would coach a new hire on the team to do to get things to merch.And that's why we pay attention to all the mistakes, mistakes that the agent makes, right? This is code being written that is misaligned with some as yet not written down, non-functional requirement.swyx: Sorry, what? Did the online people misunderstand orRyan Lopopolo: No,swyx: whatyouRyan Lopopolo: responded to? Somebody just literally said that.I was like, oh yeah,swyx: okay,Ryan Lopopolo: This is the [00:25:00] thing. This is what I've been doing. Oh, youswyx: agree? Yeah. I see. Interesting.Ryan Lopopolo: One other neat thing, which I did totally did not expect is folks were just. Taking the link to the article and giving it to pi or Codex and say, make my repo this,Vibhu: you achi a whole recursion.Ryan Lopopolo: And it was wildly effective. Really? It was wildly effective. NoVibhu: way. It just actually is something I tried with five, four yesterday. I didn't have time. Last time I was like out speaking of something, and this is one of my things, I was like, okay, I have this article. Can we just scaffold out what it would be like to run this?And I, I did it first as that and then I was like, okay, let me take another little side repo and say okay, if I was to fully automate this like this because I haven't written a line of code, it'sRyan Lopopolo: like over full, setVibhu: it right. The side thing I'm doing of voice. TTS I'm just like, slobbing out, whatever.It's nothing production. I'm like, how would I make this like this? And it's actually like a really good way. It's like a good way to learn what could be changed, what could be like, it's just a good analyzing, right? You give it all the codes, you give it all the context, you give it the article and it walks you through it very well.That's right. That's right.[00:25:57] Inlining Dependencies[00:25:57] Dependencies Going Away & Brett Taylor's Responseswyx: I guess one more thing before we go to Symphony is I wanted to cover [00:26:00] Brett Taylor's response. We had him on the show. He is your chairman, which is wild. Yeah. That he's reading your articles as well and like getting engaged in it. He says software dependencies are going away.Basically they can just be like vendored. Yes. Response.Ryan Lopopolo: Aswyx: hundred percent. A hundred percent agree. You still pro qr, you still pay Datadog. You still pay Temporal. Thank you.Ryan Lopopolo: Yep. The level of complexity of the dependencies that we can internalize is, I would say low, medium right now. Just based on model capability.What does the,swyx: what is medium?Ryan Lopopolo: I would say like a. A couple thousand line dependency is a thing that we could in-house No problem. Call in an afternoon of time. One neat thing about it is like probably most of that code you don't even need. Like by in-house and abstraction, you can strip away all the generic parts of it and only focus on what you need to enable the specific thing.Yes. You're building,swyx: I've been calling this the end of b******t plugins.Ryan Lopopolo: Yeah.swyx: Because there's so much when I published an open source thing, I want to accept everything, be liberal. I want to accept, this is post's law, but that means there's so much bloat. Yes. There's so much overhead.Ryan Lopopolo: One other neat thing about [00:27:00] this too is when we deploy Codex Security on the repo, it is able to deeply review and change. The internalized dependencies in a much lower friction way than it would be to like, push patches upstream, wait for them to be released, pull them down, make sure that's compatible with all the transitive I have in my repo and things like that.So it's also much lower friction to internalize some of these things if code is free. ‘cause the tokens are cheap sort of thing.swyx: Yeah. Yeah. I think like the only argument I have against this is basically scale testing, which obviously the larger pieces of software like Linux, MySQL, he calls up even the Datadog and Temporals and then maybe security testing where Yes.Classically, I think, is it linis tos, it said security open source is the best disinfectant.Ryan Lopopolo: Many eyes.swyx: Many eyes. And if inline your dependencies and code them up, you're gonna have to relearn mistakes from other people that Yep.Ryan Lopopolo: Yep. And to internalize that dependency, you're back to zero and you have to start.Reassembling all those bits and pieces to Yeah. Have [00:28:00] high confidence in the code as it is written. Yeah.Vibhu: Even part of the first intro of this, you basically mentioned like everything was written by codex, including internal tooling, right? So internal tooling, like when you're visualizing what's going on it's writing it for itself.swyx: Yeah. I'm built internal tools way I now, and like I just show them off and they're like, how long did you spend? And I didn't spend any time. I just prompted it,Ryan Lopopolo: very funny story here.swyx: Yeah, go ahead.Ryan Lopopolo: We had deployed our app to the first dozen users internally had some performance issues, so we asked them to export a trace for us get a tar ball, gave it to our on-call engineer, and he did a fantastic job of working with Codex to build this beautiful local Devrel tool, next JS app, the drag and drop the tar ball in, and it visualizes the entire trace.It's fantastic. Took an afternoon, but none of this was necessary. Because you could just spin up codex and give it the tar ball and ask the same thing and get the response immediately. So in a way, optimizing for human [00:29:00] legibility of that debugging process was wrong. It kept him in the loop unnecessarily when instead he could have just like Codex cooked for five minutes and gotten this same.swyx: Yeah, you verify your instincts here of this is how we used to do it. Or this is how I would have used to solve it.Ryan Lopopolo: Yeah. In this local observability stack. Like sure, you can de deploy Yeager to visualize the traces, but I wouldn't expect to be looking at the traces in the first place because I'm not gonna write the code to fix them.swyx: Yeah. So basically there needs to be like this kind of house stack and owning the whole loop. I think that is very well established. And it sounds like you might be like sharing more about that in the future, right?Ryan Lopopolo: Yeah. I think we're excited to do[00:29:36] Ghost Libraries Specs[00:29:36] Ghost Libraries & Distributing Software as SpecsRyan Lopopolo: We're gonna talk about Symphony in a little bit, but like the way we distribute it as a spec, which I think folks are calling Ghost Libraries on Twitter.This is like a such a cool name. It does mean it becomes much cheaper to share software with the world, right? You define a spec, how you could build your own specifying as much as is required for a coding agent to reassemble it [00:30:00] locally. The flow here is very cool. Like we have taken. All the scaffolding that has existed in our proprietary repo spun up a new one.Ask Codex with our repo as a reference. Write the spec. We tell it. Spin up a team ox spawn a disconnected codex to implement the spec. Wait for it to be done. Spawn another codex and another team ox to review the spec com or review the implementation compared to upstream and update the spec so it diverges less.And then you just loop over and over Ralph style until you get a spec that is with high fidelity able to reproduce the system as it is. It's fantastic.Vibhu: And you're basically, you're not really adding any of your human bias in there, right? That's correct. A lot of times people write a spec and be like, okay, I think it should be done this way, and you'll riff on something.And it's no, the agent could have just handled it like you're still scaffolding in a sense, right? I want it done this way. It can determine its spec better.swyx: That's right. That's right. Part of me it, I'm, I've been working a lot on evals recently, and part of me is wondering if [00:31:00] an agent can produce a spec that it cannot solve.Is it always capable of things that he can imagine or can you imagine things that it is impossible to do?Ryan Lopopolo: I think with Symphony, we, there's like this there's this axis where you have things that are easier, hard, or established or new, right? And I think things that are hard and new is still something that the models need humans.Yeah. Drive.swyx: Yeah. Yeah.Ryan Lopopolo: But I think those other quadrants are largely salt. Given the right scaffold and the right thing that's gonna drive the agent to completion,swyx: it's crazy that it solved,Ryan Lopopolo: but it means that the humans, the ones with limited time and attention get to work on the hardest stuff, like the problems where it's pure white space out in front. Or like the deepest refactorings where you don't know what the proper shape of the interfaces are. And this is where I wanna spend my time. ‘cause it lets me set up for the next level of scale.swyx: Yeah. Yeah. Amazing. Let's introduce Symphony.I think we've been mentioning it every now and then. Elixir. Interesting option.Ryan Lopopolo: Yeah.swyx: Yeah. I'm not,Ryan Lopopolo: again, like the [00:32:00] elixir manifestation here is just a derivative. Is it a modelswyx: chosen? Yeah.Ryan Lopopolo: Yeah. Yeah. And it chose that because the process supervision and the gen servers are super amenable to the type of process orchestration that we're doing here.You are essentially spinning up little Damons for every task that is in execution and driving it to completion, which. Means the mall gets a ton of stuff for free by using Elixir and the Beam.swyx: I had to go do a crash course in Beam and Elixir, and I think most people are not operating at that scale of concurrency where you need that.But it is a good mental model for Resum ability and all those things. And these are things I care about. But tell me the story, the origin story of Symphony. What do you use it for? Is this, how did it form maybe any abandoned paths that you didn't take?[00:32:46] Terminal Free Orchestration[00:32:46] Symphony: Removing Humans from the LoopRyan Lopopolo: At the end of December we were at about three and a half PRS per engineer per day.This was before five two came out in the beginning of January. Everyone gets back from holiday with five two and no other work [00:33:00] on the repository. We were up in the five to 10 PRS per day per engineer. And I don't know about y'all, but like it's very taxing to constantly be switching like that. Like I was pretty tapped out at the end of the day, again, where are the humans spending their time? They're spending their time context switching between all these active tmox pains to drive the agent forward.swyx: Yeah. No way. Yeah.Ryan Lopopolo: So let's again, build something to remove ourselves from the loop. And this is what frantic sprinted adapt here to find a way to remove the need for the human to sit in front of their terminal.So a lot of experimentation with Devrel boxes and, automatically spinning up agents, like it seems like a fantastic end state here, where my life is beach. I open live twice a day and say yes no to these things. Yeah. And this is again, a super, super interesting framing for how the work is done.Because I become more latency and sensitive. I have [00:34:00] way less attachment to the code as it is written. Like I've had close to zero investment in the actual authorship experience. So if it's garbage. I can just throw it away and not care too much about it. In Symphony, there's this like rework state where once the PR is proposed and it's escalated to the human for review, it should be a cheap review.It is either mergeable or it is not. And if it's not, you move it to rework. The elixir service will completely trash the entire work tree NPR and start it again from scratch. Okay. And this is that opportunity again to say, why was it trash right? What did the agent do that wasswyx: bad. Yeah.Ryan Lopopolo: Fix that before moving the ticket toswyx: endRyan Lopopolo: of progress again.swyx: Yeah. Why is this not in codex app? I guess this, you guys are ahead of Codex app,Ryan Lopopolo: yeah, so the way the team has been working is basically to be as AI pilled as possible and spread ahead. And a lot of the things we have worked on have fallen out [00:35:00] into a lot of the products that we have.Like we were in deep consultation with the Codex team to. Have the Codex app be a thing that exists, right? To have skills be a thing that Codex is able to use. So we didn't have to roll our own to put automations into the product. So all of our automatic refactoring agents didn't have to be these hand rolled control loops.It has been really fantastic to be, in a way, un anchored to the product development of Frontier and Codex and just very quickly try to figure out what works and then later find the scalable thing that can be deployed widely. It's been a very fun way to operate. It's certainly chaotic. I have lost track very often of what the actual state of the code looks like.‘cause I'm not in the loop. There was. One point where we had wired playwright directly up to the Electron app. With MCPM CCPs, I'm pretty bearish on because the harness forcibly injects all those tokens in the [00:36:00] context, and I don't really get a say over it. They mess with auto compaction. The agent can forget how to use the tool.There's probably only what three calls in playwright that I actually ever want to use. So I pay the cost for a ton of things. Somebody vibed a local Damon that boots playwright and exposes a tiny little shim CLI to drive it. And I had zero idea that this had occurred because to me, I run Codex and it's able to, it's oh, it's better.Yeah. Like no knowledge of this at all. Uhhuh.[00:36:30] Multi Human ChaosRyan Lopopolo: So we have had like in human space to spend a lot of time doing synchronous knowledge sharing. We have a daily standup that's 45 minutes long because we almost have to. Fan out the understanding of the current state.swyx: Yeah, I was gonna say this is good for a single human multi-agent, but multi human, multi-agent is a whole like po like explosion of stuff.Ryan Lopopolo: Yeah. And that this is fundamentally why we have such a rigid, like 10,000 [00:37:00] engineer level architecture in the app because we have to find ways to carve up the space so people are not trampling on each other.swyx: Sorry, I don't get the 10,000 thing. Did I miss that?Ryan Lopopolo: The structure of the repository is like 500 NPM packages.It's like architecture to the excess for what you would consider, I think normal for a seven person team. But if every person is actually like 10 to 50. Then the like numbers on being super, super deep into decomposition and sharding and like proper interface boundaries make a lot more sense.swyx: Yeah. To me, that's why I talked about Microfund ends and I, an anex is from that world, but Cool. It is just coming back to, to, to this I dunno if you have other, thoughts on. Orchestrating so much work coin going through this. Is this enough? Is this like any aha moments?Vibhu: It'll be interesting to see like where, okay, so right now you pick linear as your issue tracker, right?swyx: Or it's like a is it actually linear? This is actually linear.[00:37:55] Linear vs Slack WorkflowVibhu: Oh, that's linear. It's linear.swyx: Oh I never looked atVibhu: video. The demo video I had to download to [00:38:00] run.swyx: So I, because I'm a Slack maxie, but Yeah, linear. Linear is also really good. Yes,Ryan Lopopolo: we do make a good use of Slack. We we fire off codex to do all these lotion, elasticity, fix ups, the things that like sync that knowledge into the repository.It's super cheap. Yeah.swyx: Yeah.Ryan Lopopolo: Just do it in Codex.swyx: My biggest plug is OpenAI needs to build Slack. You need to own Slack. Build yours. Turn this into Slack.Ryan Lopopolo: I did read about it. Youswyx: did?Ryan Lopopolo: Yeah.[00:38:25] Collaboration Tools for AgentsRyan Lopopolo: I would say that if we think that we want these agents to do economically valuable work, which is like this is the mission, right?We want AI to be deployed widely, to do economically valuable work, then we need to find ways for them to naturally collaborate with humans, which means collaboration tooling, I think, is an interesting space to explore.swyx: Yeah, totally. Yeah. GitHub, slack, linear.Vibhu: Yeah, that was my thing. Okay, where do we see right now Codex has started Codex Model, then CLI, now there's an app, app can let me shoot off multiple Codex is in parallel, but there's no great team collaboration for Codex.And it [00:39:00] seems like your team had some say into what comes out, right? So you talked to ‘em, codex kind of was a thing. From there, if you guys are on the bound, what stuff that like, you might not focus on, but what do you expect other people to be building, right? So people that are like five x 50 Xing.Should you build stuff that's like very niche for your workflow, for your team? Should it be more general so other people can adopt? Is there a niche there? ‘Cause part of it is just okay, is everything just internal tooling? Do we have everything our own way? Like the way our team operates has our own ways that we like to communicate or is there a broader way to do it?Is it something like a issue tracker? Just thoughts if you wanna riff on that.[00:39:35] Standardizing Skills and CodeRyan Lopopolo: I think TBD we have not figured this out in a general way. I do think that there is leverage to be had in making the code and the processes as much the same as possible. If you think that code is context, code is prompts, it's better from the agent behavior perspective to be able to look in a package in directory X, Y, Z, and it not to have to page so [00:40:00] deeply into directory if you C, because they have the same structure, use the same language, they have the same patterns internally.And that same like leverage comes from aligning on a single set of skills that you're pouring every engineer's taste into to make sure that the agent is effective. So like in our code base, we have, I think, six skills. That's it. And if some part of the software development loop is not being covered, our first attempt is to encode it in one of the existing setup skills, which means that we can change the agent behavior.Yeah. More cheaply than changing the human driver behavior.swyx: Yeah.[00:40:39] Self Improvement via Logsswyx: Have you ever, have you experimented with agents changing their own behavior?Ryan Lopopolo: We do.swyx: Yeah. Or parent agent changing a subagents, behavior or something like that.Ryan Lopopolo: We have some bits for skill distillation. So for example, there's one neat thing you can do with Codex, which is just point it at its own session logs to ask it to tell you how you can use [00:41:00] the tool pedal better.swyx: It's like introspectionRyan Lopopolo: or ask it to do things. I useVibhu: this session better. What skills should Iswyx: high? I like the modification of, you can do, just do things to you can just ask agent to do things.Ryan Lopopolo: Yeah. You can just codex things. This is like a, this is like a silly emoji that we have, right? You can just codex things, you can just prompt things.It's really glorious future we live in, but okay, you can do that one-on-one. But we're actually slurping these up for the entire team into blob storage and. Running agent loops over them every day to figure out where as a team can we do better and how do we reflect that back into the repositories?Yes, though everybody benefits from everybody else's behavior for free. Same for like PR comments, right? These are all feedback. That means the code as written, deviated from what was good, a PR comment, a failed build. These are all signals that mean at some point the agent was missing context. We gotta figure out how toswyx: Yeah.Ryan Lopopolo: Slurp it up and put it back in the reboot.swyx: By the way, I do this exactly right. I used to, when I use cloud code for [00:42:00] knowledge work, cloud cowork is like a nice product, right? Yes. In I think you would agree. I always have it tell me what do I do better next time? And that's the meta programming reflection thing.So I almost think like you have six reflection extraction levels in symphony and almost like the zero of layer. So the six levels are PO policy, configuration, coordination, execution, integration, observability. We've talked about a couple of these, but the zero layer is like the, okay, are we working well?Can we improve how we work? Yes. Can I modify my own workflow without MD or something? I don't know.Ryan Lopopolo: Yeah, of course. Yeah, of course you can. Like this thing is also able to cut its own tickets ‘cause we give it full access.Yeah. Make it a ticket to have it cut. Tickets you can.Put in the ticket that you expect it to file as on follow up work,swyx: like Yeah. Self-modifying. Yeah.Ryan Lopopolo: Yeah.[00:42:44] Tool Access and CLI FirstRyan Lopopolo: Put, don't put the agent in a box. Give the agent full accessibility over it. Domain.swyx: I had a mental reaction when you said don't put the agent in a box. So I think you should put it in a box. Like it's just that you're giving the box everything it needs.Ryan Lopopolo: Yeah. Context and tools.swyx: But we're like, as developers, we're used to calling [00:43:00] out to different systems, but here you use the open source things like the Prometheus, whatever, and you run it locally so that you can have the full loop. I assume.Ryan Lopopolo: Yep.Vibhu: I think likeRyan Lopopolo: another, you wanna minimize cloud, cloud dependencies.Vibhu: You also want to make sure that you think about what the agent has access to. What does it see? Does it go back into the loop, like from the most basic sense of you let it see its own like calls, traces it can determine where it went wrong. But are you feeding that back in? So you know, just the most basic level of you wanna see exactly what's input output, like does the agent have access to.What is being outputted, right? It can self-improve a lot of these things. It's allRyan Lopopolo: text, right? My job is to figure out ways to funnel text from one agent to the other.swyx: It's so strange like way back at the start of this whole AI wave Andre was like, English is the hottest day programming language.It's here, it's just Yeah. The feature as well.Vibhu: A lot of, okay. Like a lot of software, a lot of stuff. There's a gui, it's made for the human. We're seeing the evolution of CLI for everything, right? All tools have CLIs. Your agents can use [00:44:00] them well, do we get good vision? Do we get good little sandboxes?Like right now? It's a really effective way, right? Models love to use tools. They love the best. They love to read through text. So slap a CLI let it go loose. That works for everything.Ryan Lopopolo: It does. Yeah. Yeah.[00:44:14] UI Perception and RasterizingRyan Lopopolo: We've also been adapting nont, textual things to that shape in order to improve model behavior in some ways, right?We want the agent to be able to see the UI agents do not perceive visually in the same way that we do. They don't see a red box, they see red box button, right? They see these things in latent space. So if we want, Hey, yeah, I do. We haveswyx: a ding if that goes off every time. Alien spaceRyan Lopopolo: ding.Anyway if we wanna actually make it see the layout, it's almost easier to rasterize that image to ask EOR and feed it in to the agent. Ha. And there's no reason you can't do both, right? To like further refine how the model perceives the object it's [00:45:00] manipulating.swyx: Cool. Could we, you wanna talk about a couple more of these layers that might bear more introspection or that you have personal passion for?[00:45:07] Coordination Layer with ElixirRyan Lopopolo: I will say that the coordination layer here was a really tricky piece to get right.swyx: Let's do it. Yep. I'm all about that. And this is Temporal core.Ryan Lopopolo: This is where when we turn the spec into Elixir, where like the model takes a shortcut, right? Like it's oh, I have all these primitives that I can make use of in this lovely runtime that has native process supervision.Which is I think, a neat way to have taken the spec and made it more choices achievable by making choices that naturally mapswyx: Yeah.Ryan Lopopolo: To the domain, right? In the same way that like you would prefer to have a TypeScript model repo if you are doing full stack web development, right? Because the ability to share types across the front end and backend reduces a lot of complexity.And becauseswyx: that's what graph kill used to be.Ryan Lopopolo: That's right. Andswyx: I don't know if it's still alive, butRyan Lopopolo: [00:46:00] no humans in the loop here. So like my own personal ability to write or not write elixir. Doesn't really have to bias us away from using the right tool for the job. It is just wild.swyx: Love it. I love it.Yeah. I wonder if any languages struggle more than others because of this? I feel like everyone has their own abstractions. That would make sense. But maybe it might be slower, it might be more faulty where like you'd have to just kick the server every now and then. I, I don't know. I think observability layer is really well understood.Integration layer, CP is dead. I think all these just like a really interesting hierarchy to travel up and down. It's common language for people working on the system to understandRyan Lopopolo: The policy stuff is really cool, right? Yeah. You don't really have to build a bunch of code to make sure the system wait for the, to passswyx: it's institutional knowledge.Ryan Lopopolo: Yeah. You just give it the G-H-C-L-I with some text that say CI has to pass. It makes the maintenance of these systems a lot easier.[00:46:57] Agent Friendly CLI Outputswyx: Do you think that CLI maintainers need to be [00:47:00] do anything special for agents or just as is? It's good because like I don't think when people made the G GitHub, CLI, they anticipated this happening.Ryan Lopopolo: That's correct. The GH CLI is fantastic. It's great super industry.swyx: Everyone go try GH repo create GH pull and then pull request number, right? GH HPR, like 1 53, whatever. And then it like pullsRyan Lopopolo: basically my only interaction with the GitHub web UI at this point is GH PR view dash web.Exactly. Glanceswyx: at the diffRyan Lopopolo: and be like Sure thing. Send it. Yeah. But the CLI are nice ‘cause they're super token efficient and they can be made more token efficient really easily. Like I'm sure you all have seen like I go to build Kite or Jenkins and I could just get this massive wall of build output.And in order to unblock the humans, your developer productivity team is almost certainly gonna write some code that parses the actual exception out of the build logs and sticks it in a sticky note at the top of the page. And you basically [00:48:00] want CLI to be structured in a similar way, right? You're gonna want to patch dash silent to prettier because the agent doesn't care that every file was already formatted.Just wants to know it's either formatted or not. So it can then go run a right command. Similarly, like in our PNPM distributed script runner, when we had one, when you do dash recursive, like it produces a absolute mountain of text. But all of that is for passing. Test suites. So we ended up wrapping all of this in another scriptswyx: to suppress the,Ryan Lopopolo: which you can vibe the channel only output the failing parts of the tests.swyx: You make a pipe errors versus the standard, standard out. I don't know. Okay. Whatever. Too much thinking have to do that. The CII used to maintain SCLI for my company and yeah, this is like core, very core to my heart. But you're vibing my job.Ryan Lopopolo: That's right.swyx: Cool. Any other things?This is a long spec. [00:49:00] I appreciate that. It's got a lot of strong opinions in here. Any other things that we should highlight? I think obviously you can spend the whole day going through some of these, but I do think that some of these have a lot of care or some of this you might wanna tell people, Hey, take this, but, make it your own.[00:49:15] Blueprint Spec and GuardrailsRyan Lopopolo: Fundamentally, software is made more flexible when it's able to adapt to the environment in which it is deployed, which means that things like linear or GitHub even are specified within the spec, but not required pieces of it. There's like a more platonic ideal of the thing that you could swap in like Jira or Bitbucket, for example.But being able to tightly specify things like the ID formats or how the Ralph Loop works for the individual agents. Basically means you can get up and running with a fully specified system quickly that you then evolve later on. I think we never intended for this to be a static spec that you can [00:50:00] never change.It's more like a blueprint to get something worth a starting point up and running.swyx: Yeah.Ryan Lopopolo: For you then to vibe later to your heart's content,swyx: you have like code and scripts in here where it's oh, I think this is a really good prompt. It's just a very long prompt.Ryan Lopopolo: Fundamentally, the agents are good at following instructions, so give them instructions.And it will, improve the reliability of the result. We, much like the way we use Symphony, we don't want folks to have to monitor the agent as it is vibing the system into existence. So being very opinionatedVery strict around what these success criteria are means that our deployment success rate goes up. Yeah. It means we don't have to get tickets on this thing.Vibhu: Think it all goes back to that like code to disposable, right? Like early on when you had CLI or you'd kick off a Codex run, it would take two hours. You would wanna monitor okay, I'm in the workflow of just using one.I don't want it to go down the wrong path. I'll cut it off and, just shoot off four, like that was my favorite thing of the Codex app, right? Yeah. Just Forex it like, [00:51:00] it's okay. One of them will probably be right, one of them might be better. Stop overthinking it. Like my first example was probably like deep research.When you put out deep research and I'd ask it something like, I asked it something about LLM, it thought it was legal something and spent an hour, came back with a report completely off the rails. And I was like, okay, I gotta monitor this thing a bit. No don't monitor it. Just you want to build it so it's that it, it goes the right way.And you don't wanna, you don't wanna sit there and babysit, right? You don't want to babysit your agentsRyan Lopopolo: with that deep research query that you made. Looking at the bad result, you probably figured out you needed to tweak your prompt Yeah. A bit, right? That's that guardrail that you fed back into the code base for the task, your prompt to further align the agent's execution.Same sort of concept supply there too.swyx: When you talk, how are the customers feelingRyan Lopopolo: for Symphony? I think we have none, right? This is a thing we have put out into theswyx: world. Symphony's internal, right? As long as you are happy, you are the customer. That'

Paul's Security Weekly TV
AppSec News Roundup on Claude Code Leak, Axios NPM Compromise, Secure Design - Idan Plotnik, Raj Mallempati - ASW #377

Paul's Security Weekly TV

Play Episode Listen Later Apr 7, 2026 68:42


Security problems aren't changing very much even though security teams are. We catch up on the implications of the Claude Code source leak, the very human lessons from the axios NPM compromise, and what secure design looks like when it involves agents, humans, or both. AppSec has always celebrated interesting and impactful vulns. And LLMs are now a favored tool for finding flaws. We shouldn't forget the success and effectiveness of fuzzers like OSS-Fuzz, which has improved security for over 1,000 projects and found over 50,000 bugs. But we can't ignore the ease of prompting an agent to go find -- and exploit -- a vuln when the UX and overhead of doing so is hardly more than writing some markdown. The SDLC Blind Spot: Why Breaches Start with Identity, Not Code Developers have access to source code, CI/CD pipelines, and cloud infrastructure — and attackers know it. Target lost 860GB of source code through a single compromised credential. Recruitment fraud campaigns have pivoted from a compromised developer to cloud admin in under 10 minutes. As agents join human developers, contractors, and service accounts in the SDLC, the attack surface is expanding faster than static security tools can track. Security teams need real-time visibility beyond code and into who has access and what they're actually doing. This segment is sponsored by Apiiro. To lean more, visit https://securityweekly.com/apiirorsac. How AI-Driven Development is Reshaping the Application Risk Landscape Agent coding assistants are accelerating software development, generating more code and more change than security teams were built to handle. In this interview, Idan Plotnik discusses how AI-driven development is reshaping the application risk landscape and why traditional vulnerability management models can't keep up. Make sure to schedule a free SDLC Risk Assessment with BlueFlag Security - 30 minutes to deploy. 48 hours to results. Please visit https://securityweekly.com/blueflagrsac. Show Notes: https://securityweekly.com/asw-377

Application Security Weekly (Audio)
AppSec News Roundup on Claude Code Leak, Axios NPM Compromise, Secure Design - Idan Plotnik, Raj Mallempati - ASW #377

Application Security Weekly (Audio)

Play Episode Listen Later Apr 7, 2026 68:42


Security problems aren't changing very much even though security teams are. We catch up on the implications of the Claude Code source leak, the very human lessons from the axios NPM compromise, and what secure design looks like when it involves agents, humans, or both. AppSec has always celebrated interesting and impactful vulns. And LLMs are now a favored tool for finding flaws. We shouldn't forget the success and effectiveness of fuzzers like OSS-Fuzz, which has improved security for over 1,000 projects and found over 50,000 bugs. But we can't ignore the ease of prompting an agent to go find -- and exploit -- a vuln when the UX and overhead of doing so is hardly more than writing some markdown. The SDLC Blind Spot: Why Breaches Start with Identity, Not Code Developers have access to source code, CI/CD pipelines, and cloud infrastructure — and attackers know it. Target lost 860GB of source code through a single compromised credential. Recruitment fraud campaigns have pivoted from a compromised developer to cloud admin in under 10 minutes. As agents join human developers, contractors, and service accounts in the SDLC, the attack surface is expanding faster than static security tools can track. Security teams need real-time visibility beyond code and into who has access and what they're actually doing. This segment is sponsored by Apiiro. To lean more, visit https://securityweekly.com/apiirorsac. How AI-Driven Development is Reshaping the Application Risk Landscape Agent coding assistants are accelerating software development, generating more code and more change than security teams were built to handle. In this interview, Idan Plotnik discusses how AI-driven development is reshaping the application risk landscape and why traditional vulnerability management models can't keep up. Make sure to schedule a free SDLC Risk Assessment with BlueFlag Security - 30 minutes to deploy. 48 hours to results. Please visit https://securityweekly.com/blueflagrsac. Visit https://www.securityweekly.com/asw for all the latest episodes! Show Notes: https://securityweekly.com/asw-377

The Engineering Enablement Podcast
Measuring AI impact, assessing readiness, and new data trends

The Engineering Enablement Podcast

Play Episode Listen Later Apr 3, 2026 38:13


In this episode of Engineering Enablement, Jesse Adametz joins Abi Noda, this time to host. Together, they explore how AI is showing up across the SDLC, not just in code generation, and how it is shifting bottlenecks across the development process. They unpack what “AI readiness” actually means in practice, and why it often comes down to developer experience fundamentals like documentation, environments, and feedback loops.They also discuss why enablement matters more than tool choice, how teams are thinking about measuring ROI, and what changes as background agents become more common. Finally, they explore how the role of the engineer may evolve, the open questions teams are still grappling with, and the challenges of non-engineers contributing to codebases.Where to find Jesse Adametz: • LinkedIn: https://www.linkedin.com/in/jesseadametz • X: https://x.com/jesseadametz • Website: https://www.jesseadametz.com/Where to find Abi Noda:• LinkedIn: https://www.linkedin.com/in/abinoda In this episode, we cover:(00:00) Intro(02:12) Where AI is showing up across the SDLC(05:53) AI readiness and its link to developer experience(08:23) Why enablement, education, and experimentation matter more than tool choice(13:05) The case for a dedicated enablement team(14:50) Measuring AI ROI: challenges and tradeoffs(19:46) Background agents and token spend(24:12) Measuring agent output with PR throughput(26:58) How the engineer role might change(31:01) Specs and documentation in the age of AI(33:11) Non-engineers writing code(35:30) What's changing in the SDLC and open questionsReferenced:• Measuring AI code assistants and agents• Lessons from Twilio's multi-year platform consolidation• The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win• How Claude remembers your project - Claude Code Docs• specIsJustCode : r/ProgrammerHumor

Dev Interrupted
Retrofit or reimagine? Developer environments for humans and agents | Ona's Matt Boyle

Dev Interrupted

Play Episode Listen Later Mar 31, 2026 37:03


AI agents have officially arrived on an internet that simply wasn't built for them. So how do we build the infrastructure to keep them safe, productive, and contained? This week, Andrew sits down with Matt Boyle, Head of Product, Design and Engineering at Ona (formerly Gitpod), to discuss evolving cloud development environments into secure, enterprise-grade "agent jails." They explore the mechanics of Project Veto's kernel-level security, the slow death of the traditional IDE, and how the rise of AI is transforming developers into full-stack, T-shaped product owners. Finally, Matt shares his vision for the future of the SDLC, detailing how organizations can safely balance strict compliance with the bleeding edge of autonomous software factories.Download the APEX FrameworkFollow the show:Subscribe to our Substack Follow us on LinkedInSubscribe to our YouTube ChannelLeave us a ReviewFollow the hosts:Follow AndrewFollow BenFollow DanFollow today's guest:Learn more about Ona (formerly Gitpod) and read their latest blog announcements, including Veto.Connect with Matt on LinkedIn | X (Twitter)OFFERSStart Free Trial: Get started with LinearB's AI productivity platform for free.Book a Demo: Learn how you can ship faster, improve DevEx, and lead with confidence in the AI era.LEARN ABOUT LINEARBAI Code Reviews: Automate reviews to catch bugs, security risks, and performance issues before they hit production.AI & Productivity Insights: Go beyond DORA with AI-powered recommendations and dashboards to measure and improve performance.AI-Powered Workflow Automations: Use AI-generated PR descriptions, smart routing, and other automations to reduce developer toil.MCP Server: Interact with your engineering data using natural language to build custom reports and get answers on the fly.

Absolute AppSec
Episode 317 - (Post-RSAC/BSidesSF), Supply Chain Security, Future of SDLC

Absolute AppSec

Play Episode Listen Later Mar 31, 2026


Ken Johnson and Seth Law reflect on the 2026 RSA Conference and BSidesSF, noting an industry-wide "awakening" regarding the high costs and engineering complexities of operationalizing AI security tools. A major focus is the recent "supply chain attack hell," specifically the compromise of the Axios HTTP client through dual-account breaches that allowed attackers to bypass legitimate OIDC deploy setups via a misconfigured NPM CLI. The malware used was particularly evasive, deleting itself and replacing its package.json with a clean version post-execution. The hosts also discuss the emergence of the "Agentic Development Lifecycle" (ADLC), where engineering teams are increasingly "committing on time" rather than features, creating a volume of code that traditional security gates cannot manage. They debate Thomas Ptacek's thesis that AI agents will soon "supplant" human vulnerability research for common bug classes, shifting the human role toward high-level governance and "context infusion". Economically, they highlight how Anthropic's security announcements contributed to nearly half a trillion dollars in market value loss for traditional security firms, as investors increasingly bet on frontier models to consume established security domains.

Software Defined Talk
Episode 562: Bureaucracy: Still Unsolved

Software Defined Talk

Play Episode Listen Later Mar 6, 2026 66:25


This week. we discuss Claude Code's momentum, Cursor's identity crisis, and the SDLC's uncertain future. Plus, Coté finally explains how Markdown is destroying the economy. Watch the YouTube Live Recording of Episode 562 Runner-up Titles Demos over Memos Products over Prose Software written by the many for the few USB is flaky Do you get a Code of Conduct for prison? I thought I had typed it somewhere Markdown is taking down the economy Claude, Take the Wheel Sticking with month-to-month Precious Tokens Rip Van Winkle this whole AI thing The ants have won They have infinite tokens Is SLDC Dead? Rundown The SaaS-Apocalypse was based on markdown files The Software Development Lifecycle Is Dead The Third Era of Software Development Intelligence, Subtracted Anthropic rejects Pentagon's AI demands Exclusive-Anthropic investors push to de-escalate Pentagon clash over AI safeguards ‘Incoherent': Hegseth's Anthropic ultimatum confounds AI policymaker Anthropic leads Enterprise AI Spend Anthropic took >50% of spend on enterprise AI subscriptions $110 Billion in Name Only OpenAI reveals more details about its agreement with the Pentagon OpenAI changes deal with US military after backlash Relevant to your Interests McKinsey and AWS launch Amazon McKinsey Group Polymarket defends its decision to allow betting on war as ‘invaluable' The Supreme Court doesn't care if you want to copyright your AI-generated art Distinguished Eng On Stack Ranking, Competing with Bezos, Regrets WIZ: My Personal AI Agent OpenAI changes deal with US military after backlash Tech Publications Lost 58% of Google Traffic Since 2024 Ramp AI Index Nonsense Callers to Washington state hotline press 2 for Spanish and get accented AI English instead Anyone Else Have Those Weird Dreams Where Sobbing Future Generations Beg You To Change Course? Conferences Austin Meetup, March 10th, Listener Steve Anness speaking on Grafana KubeCon EU, March 23rd to 26th, 2026 - Coté will be there on a media pass. DevOpsdays Atlanta 2026, April 21-22, 2026 DevOpsDays Austin, May 5-6, 2026 WeAreDevelopers, July 8th to 10th, Berlin, Coté speaking. VMware User Groups (VMUGs): Amsterdam (March 17-19, 2026) - Coté speaking. Minneapolis (April 7-9, 2026) Toronto (May 12-14, 2026) Dallas (June 9-11, 2026) Orlando (October 20-22, 2026) SDT News & Community Join our Slack community Email the show: questions@softwaredefinedtalk.com Free stickers: Email your address to stickers@softwaredefinedtalk.com Follow us on social media: Twitter, Threads, Mastodon, LinkedIn, BlueSky Watch us on: Twitch, YouTube, Instagram, TikTok Book offer: Use code SDT for $20 off "Digital WTF" by Coté Sponsor the show Sponsor more podcasts with Failover Media Recommendations Brandon: Failover Media Newsletter Milestone 1.1 Ski Quiver Matt: IKEA MYGGSPRAY motion sensor, TRADFRI LED and RODRET Coté: The “Anime Wow” sound. And, related to Brandon's modernization talk last week.

Software Engineering Radio - The Podcast for Professional Software Developers
SE Radio 710: Marc Brooker on Spec-Driven AI Dev

Software Engineering Radio - The Podcast for Professional Software Developers

Play Episode Listen Later Mar 4, 2026 63:27


Marc Brooker, VP and Distinguished Engineer at AWS, joins host Kanchan Shringi to explore specification-driven development as a scalable alternative to prompt-by-prompt "vibe coding" in AI-assisted software engineering. Marc explains how accelerating code generation shifts the bottleneck to requirements, design, testing, and validation, making explicit specifications the central artifact for maintaining quality and velocity over time. He describes how specifications can guide both code generation and automated testing, including property-based testing, enabling teams to catch regressions earlier and reason about behavior without relying on line-by-line code review. The conversation examines how spec-driven development fits into modern SDLC practices; how AI agents can support design, code review, documentation, and testing; and why managing context is now one of the hardest problems in agentic development. Marc shares examples from AWS, including building drivers and cloud services using this approach, and discusses the role of modularity, APIs, and strong typing in making both humans and AI more effective. The episode concludes with guidance on rollout, evaluation metrics, cultural readiness, and why AI-driven development shifts the engineer's role toward problem definition, system design, and long-term maintainability rather than raw code production. Brought to you by IEEE Computer Society and IEEE Software magazine.

The Tech Trek
The CPTO Role Explained, How Product and Engineering Move Faster Together

The Tech Trek

Play Episode Listen Later Feb 23, 2026 25:10


Arnie Katz has been running product and engineering under one roof since before most companies even considered combining the roles. As CPTO at GoFundMe, he oversees the teams behind a platform processing over 2.5 donations every second, with more than $40 billion in help facilitated worldwide. Arnie breaks down why the CPTO title keeps gaining traction, how he thinks about the role like a portfolio manager, and where the real trade offs live when one person holds both the product and technology reins.Key TakeawaysThe CPTO role works like a portfolio manager. Arnie manages the company's largest investment center by balancing short term business wins against long term platform bets, knowing when to take on technical debt and when to pay it down.Velocity, coordination, and alignment are the three biggest wins. When product and engineering report to one leader, decisions happen faster, roadmap conflicts get resolved without executive tug of war, and technical investments stay tied to business outcomes.The disadvantages are real. Without separate CPO and CTO voices at the executive table, certain perspectives can get muted. His fix: build a leadership bench strong enough to create the right tension underneath him.AI is changing what small teams can deliver. GoFundMe's eight person team behind Giving Funds is shipping at a pace that would have been impossible five years ago.Timestamped Highlights[00:38] The scale most people don't realize about GoFundMe, including 2.5 donations per second and GoFundMe Pro for nonprofits.[02:02] How Arnie first landed the CPTO title at StubHub seven years ago, and why it clicked.[09:11] The real downside of collapsing two C suite roles into one, and how Arnie designs around it.[13:57] His portfolio approach to technical debt, sequencing re platforming in areas like identity and payments while other teams ship business value.[18:38] AI reshaping engineering velocity, the future of the SDLC, and product teams prototyping without writing code.[23:06] Where the CPTO model is headed as the industry evolves.The Line That Stuck"I often think of myself as a portfolio manager. My job is to invest money where the company gets the best returns, where the mission gets the best return, where the shareholder gets the best returns."Pro TipsSequence your bets instead of spreading them thin. GoFundMe gave their identity and payments teams nine months of runway to re platform with no feature expectations while other squads picked up the pace on near term results.Build leadership that creates productive friction. Without CPO vs. CTO tension at the exec level, let your VPs and SVPs push back against each other. That tension is where the best decisions come from.Think in time horizons, not just priorities. Short term moves for 0.1% to 0.5% metric lifts. Midterm bets for 1% to 5% gains. Long term swings that could transform the business. Allocate across all three.If this conversation changed how you think about product and engineering working together, share it with someone on your team. Subscribe to The Tech Trek so you never miss an episode, and connect with Arnie on LinkedIn to keep the conversation going.GoFundMe is offering listeners of The Tech Trek a chance to open their own Giving Fund. For the first 50 people who open a Giving Fund and add $25 or more to their Giving Fund, GoFundMe will add an additional $25 to that Giving Fund. If you have a Giving Fund but have never contributed into it, you can also participate. The deadline for this incentive is March 13. To get this incentive, click here to start your Giving Fund.

Cloud Security Podcast
Why AI Infrastructure is Harder to Secure Than Cloud

Cloud Security Podcast

Play Episode Listen Later Feb 20, 2026 34:03


Is AI security just "Cloud Security 2.0"? Toni De La Fuente, creator of the open-source tool Prowler, joins Ashish to explain why securing AI workloads requires a fundamentally different approach than traditional cloud infrastructure.We dive deep into the "Shared Responsibility Gap" emerging with managed AI services like AWS Bedrock and OpenAI. Toni spoke about the hidden dangers of default AI architectures, why you should never connect an MCP (Model Context Protocol) directly to a database.We discuss the new AI-driven SDLC, where tools like Claude Code can generate infrastructure but also create massive security blind spots if not monitored.Guest Socials -⁠ ⁠⁠⁠⁠⁠Toni's LinkedinPodcast Twitter - ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠@CloudSecPod⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠If you want to watch videos of this LIVE STREAMED episode and past episodes - Check out our other Cloud Security Social Channels:-⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Cloud Security Podcast- Youtube⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠- ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Cloud Security Newsletter ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠If you are interested in AI Security, you can check out our sister podcast -⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ AI Security Podcast⁠Questions asked:(00:00) Introduction(02:50) Who is Toni De La Fuente? (Creator of Prowler)(03:50) AI Security vs. Cloud Security: What's the Difference? (07:20) The Shared Responsibility Gap in AI Services (Bedrock, OpenAI) (11:30) The "Fifth Party" Risk: Managed AI Access (13:40) AI Architecture Best Practices: Never Connect MCP to DB Directly (16:40) Prowler's AI Pillars: Generating Dashboards & Detections (22:30) The New SDLC: Securing Code from Claude Code & Lovable (25:30) The "Magic" Trap: Why AI Doesn't Know Your Security Context (28:30) Top 3 Priorities for Security Leaders (Infra, LLM, Shadow AI) (30:40) Future Predictions: Why Predicting 12 Months Out is Impossible

The Changelog
It's a renaissance woman's world (Friends)

The Changelog

Play Episode Listen Later Feb 6, 2026 103:08


Amal Hussein returns to tell us all about her new role at Istari, what life is like outside the web browser, how she's helping ambitious orgs in aerospace, what the SDLC looks like in 2026, and a whole lot more. Wait, moon vacuums?!

friends renaissance woman sdlc istari adam stacoviak jerod santo