Podcasts about Jira

  • 701 PODCASTS
  • 1,415 EPISODES
  • 41m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Jun 12, 2026 LATEST

POPULARITY (chart by year, 2019 to 2026)


Latest podcast episodes about Jira

The top AI news from the past week, every ThursdAI

Hey folks, Alex here, and welcome to a BIG MODEL week! We finally got Mythos (well almost)! Let me catch you up! This week started with WWDC26 from Apple, and Max Weinbach, who was in the room at Apple Park and actually has access to some of the new features including an all-new SIRI AI, joined us to break down what could be the most used AI in the world very soon. At first I was skeptical, but he convinced me that the new Siri is actually good! Then, we saw the ultimate model drop: Anthropic finally shipped Mythos (X, my system card thread, benchmarks). Same weights, two names: Mythos 5 is the unrestricted version that only Project Glasswing partners get, Fable 5 is what the rest of us get, wrapped in the heaviest guardrails I've ever seen ship on a frontier model. It's state of the art on nearly every benchmark. The model that was “too dangerous to release” is now... well, released, but with the heaviest guardrails we've seen. More on this later. Peter Gostev from Arena.ai joined us to break down the new model. Last but definitely not least, Google released a real-time translation model that our friend Thor Schaeff from DeepMind demoed live, while we all spoke in different languages and it translated us in REAL TIME. It was really cool, definitely check that out. There are quite a few more things, like Loop Engineering Alpha, Swyx came by to talk about FrontierCode, OpenAI confirmed our suspicions that the anti-datacenter social media posts could be a concerted effort by groups linked to the Chinese government, and much more. Let's dive in! ThursdAI - Let me catch you up, every week!

The Daily Standup
Jira Turned Agile Into a Micromanagement Tool

Jun 11, 2026 · 7:50


There was a time when Agile felt liberating. Teams owned their work, conversations mattered more than documentation, and progress was measured by outcomes, not activity. Then somewhere along the way, tools stepped in to “support” the process. What followed in many organizations was not support but substitution. Jira did not break Agile by design. It became the easiest place for organizations to quietly reintroduce control, visibility, and ultimately micromanagement under the label of transparency.

- [website] https://www.agiledad.com/
- [instagram] https://www.instagram.com/agile_coach/
- [facebook] https://www.facebook.com/RealAgileDad/
- [Linkedin] https://www.linkedin.com/in/leehenson/

The Jira Life
Jira Admin Woes (with Peter Kerrigan)

Jun 11, 2026 · 68:16


Being a Jira admin has never been more complex or more critical. In this episode, we sit down with Peter Kerrigan, Head of Customer Success at Solcoro, for a no-holds-barred conversation about the real struggles, hard decisions, and thankless work that goes into keeping an Atlassian environment healthy, scalable, and sane.

Whether you're managing a scrappy single-instance setup or a sprawling enterprise layout with dozens of sites, this one is going to hit home.

We dig into the topics that don't get talked about enough, the ones that keep admins up at night and that no Atlassian certification ever fully prepares you for.

Thank you to ikuTeam for connecting and collaborating with The Jira Life. https://ikuteam.com

The Jira Life
=====================================
Having trouble keeping up with when we are live? Sign up for our Atlassian Community Group!
https://ace.atlassian.com/the-jira-life/
Or Follow us on LinkedIn!
https://www.linkedin.com/company/the-jira-life/
Become a member on YouTube to get access to perks:
https://www.youtube.com/@thejiralife/join

Hosts:
- Alex "Dr. Jira" Ortiz https://www.linkedin.com/in/alexortiz89/ https://www.youtube.com/@ApetechTechTutorials
- Rodney "The Jira Guy" Nissen https://www.linkedin.com/in/rgnissen/ https://thejiraguy.com
- Sarah Wright https://www.linkedin.com/in/satwright/

Producer:
- "King Bob" Robert Wen https://www.linkedin.com/in/robert-wen-csm-spc6-a552051/

Executive Producer:
- Lina Ortiz

Music provided by Monstercat:
=====================================
Intro: Nitro Fun - Cheat Codes
https://www.youtube.com/c/monstercat
Outro: Fractal - Atrium
https://www.youtube.com/c/monstercatinstinct

Scrum Master Toolbox Podcast
BONUS Why More Code Doesn't Mean Better Software — And Where AI Actually Helps Your SDLC With Mooly Beeri

Jun 10, 2026 · 40:00


Most teams are adopting AI to write code faster. But what if code generation isn't your bottleneck? Mooly Beeri has spent 25 years diagnosing where software organizations actually underperform, from Microsoft to Philips to automotive, and his message is clear: measure before you automate, and tie every AI investment to a business KPI.

The Pattern Debugger's Origin Story

"I've been identifying patterns way before AI was doing that. One of my first jobs was Microsoft, and I got the opportunity to work in engineering excellence. Every single simple improvement would make the lives of so many people better and the code better and the products better."

Mooly's career started at Microsoft in engineering excellence, where he discovered his passion for finding process areas that need improvement. From there he built the first software centre of excellence for Philips, spun it off into a separate business, and has been doing the same process excellence work across healthcare, telecom, and automotive ever since. His framework: understand where you're bleeding quality, revenue, or budget, then intervene there, not everywhere.

Improvement Doesn't Mean Progress

"There are too many efforts to improve too many things that don't really matter. The ability to tie a specific improvement to what actually means progress for a business: that, for me, is one critical component that's missing in many transformations."

Mooly's core insight applies directly to AI adoption: everyone has an improvement plan, but few can answer "how does this improvement improve business performance?" If you ask that one additional question, you can probably cancel half your improvement projects, the ones that make people feel good but don't move the needle on time to market, quality, or cost.

The Code Generation Trap

"It's like saying a book author is more productive because they write more words. The unit of work is not the number of lines of code they produce. The unit of work is a piece of code that works, that is tested, that is fully reliable, that meets a customer expectation, and eventually generates revenue."

Data from Faros AI shows individual developer PRs went up 98% with AI tools, but organizational delivery actually dropped 1.5%. More code, same or worse outcomes. Mooly explains why: most organizations invest in code generation not because it's the most effective thing to improve, but because it's the easiest step to automate. There are 35 steps in the SDLC. Picking code generation gives you a 1-in-35 chance of striking gold. As the saying goes: hope is not a strategy.

Where AI Actually Works in the SDLC

"The best usages would be in areas of the SDLC where there is a lot of data that needs processing and needs some detection of patterns, where AI is really, really good."

The most successful AI applications Mooly has seen with clients:

- Defect root cause analysis: training AI agents on thousands of Jira bugs to find patterns humans can't see. In one healthcare client, AI analysis revealed that "false positive" bugs were actually compromised requirements: the dev team was closing real deviations as unimportant because they didn't have time to fix them.
- Code review enhancement: AI scans incoming defects and generates a live, evolving checklist so reviewers spend their limited time checking for the most probable problems.
- Test generation: unit, component, and functional test creation where AI can leverage existing test patterns and requirement data.
- Requirements review: correlating requirements against strategic objectives, OKRs, and historical defect patterns to find contradictions before coding begins.

The Thinking Process You Can't Automate

"The developers going through the process of converting requirements into code, it's actually a thinking process. It creates a lot of discussions with the product managers, a lot of back and forth, which help refine the requirement. This entire exchange is gone out the window when you have AI generate the code in 5 minutes."

When AI generates code instantly from requirements, it eliminates the human feedback loop that catches contradictions and incomplete specifications. The FDA has recognized this: every AI-assisted step in medical device software must be guardrailed by human activity. If you generate code quickly but still need a human review, the speed gain disappears. The value of coding was never just the code; it was the thinking.

Map Every Investment to a Business KPI

"If your uncle ran a bicycle repair shop and you said, let's advertise in the local newspaper, the first question he'd ask is: how many new customers will we get? The business logic hasn't evolved so much. If you want to do something: how will this impact your revenue, your customer retention, or your cost of producing goods? If you can't answer these things, don't invest."

Mooly's advice is deceptively simple: before adopting any AI tool in your SDLC, ask yourself which of three business outcomes it will improve: faster time to market, higher quality (fewer customer issues), or better margins (lower execution cost). If you can't draw a direct line from the AI investment to one of those outcomes, you're doing improvement theatre.

About Mooly Beeri

Mooly Beeri is CEO and co-founder of BetterSoftware, a consulting firm with over 25 years helping companies across healthcare, telecom, and automotive transform how they build software. His work focuses on diagnosing where software organizations underperform and designing targeted interventions, not blanket transformations.

You can connect with Mooly Beeri on LinkedIn.
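The "live, evolving checklist" idea from the code review item in this episode can be illustrated with a rough sketch. This is not the approach Mooly or his clients use: plain frequency counting stands in for the AI pattern detection, and the defect records, the `root_cause` field, and the `build_checklist` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical defect records. In practice these might be exported from an
# issue tracker such as Jira; the "root_cause" labels are invented here.
defects = [
    {"key": "BUG-1", "root_cause": "missing null check"},
    {"key": "BUG-2", "root_cause": "unclear requirement"},
    {"key": "BUG-3", "root_cause": "missing null check"},
    {"key": "BUG-4", "root_cause": "race condition"},
    {"key": "BUG-5", "root_cause": "unclear requirement"},
    {"key": "BUG-6", "root_cause": "missing null check"},
]


def build_checklist(records, top_n=3):
    """Return the top_n most frequent root causes as (cause, count) pairs."""
    counts = Counter(record["root_cause"] for record in records)
    return counts.most_common(top_n)


# Re-running this as new defects arrive keeps the checklist current.
checklist = build_checklist(defects)
for cause, count in checklist:
    print(f"[ ] Check for: {cause} (seen in {count} past defects)")
```

The point of the sketch is only the shape of the workflow: past defects go in, a short ranked list of things to look for comes out, so reviewers start with the most probable problems.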

Conversations on Careers and Professional Life
AI Ready: Ahmad Ghabboun Discovers His Interest in AI

Jun 10, 2026 · 42:12


Ahmad Ghabboun built a Demo Day–winning AI product during his MSIS program after arriving with no plans to work in AI at all. He breaks down how his mindset shifted, how his design background made him a stronger prompter, and how to build AI fluency that actually holds up in interviews. Useful for students and early-career professionals trying to get AI-ready without faking it.

Ahmad Ghabboun is a Master of Science in Information Systems (MSIS) 2026 Graduate at the UW Foster School of Business. Before Foster, he spent roughly fifteen years in UX and product design, building web applications for startups. At Foster he built several generative-AI tools in his coursework, including Synapse, which won Best Business and Tech Product at the MSIS Demo Day. He is targeting product management and technical product roles.

What you'll learn

- Why naming the specific AI model you use, and justifying it, matters more in interviews than saying "I use AI"
- How a design background translates into sharper, more technical prompts
- How to keep a human in the loop so AI assists your judgment instead of replacing it
- Why AI's tendency to agree with you makes human and second-model pushback essential
- How to stay current with fast-moving tools without trying to learn everything
- The difference between a productivity mindset and a learning mindset in school

Key moments

- The third-quarter AI classes that moved AI from "not on my list" to his career focus
- The origin of Synapse: manually juggling answers across Gemini, Claude, and a third model
- How Synapse runs a dual-model validation and a judge step to flag gaps for technical PMs
- Why interview proctoring now detects AI use, and what a "perfect" AI answer signals to interviewers
- Ethan Mollick's "jagged edge" and why it shifts with every model release

Resources mentioned

Lovable; Replit; Gemini; Claude; ChatGPT; Jira; Azure DevOps; GitHub; Ethan Mollick's "jagged frontier" of AI capability.
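The show notes mention a dual-model validation followed by a judge step but give no implementation details, so the following is only a toy sketch of that general pattern and not how Synapse works. The two models are stubbed with canned answers, the judge is a naive word-set comparison, and every name here (`model_a`, `model_b`, `judge`, `validate`) is hypothetical.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for two different models. A real setup would call
# two separate model APIs here; these stubs just return canned answers.
def model_a(question: str) -> str:
    return "Use OAuth 2.0 with refresh tokens and rate limiting."


def model_b(question: str) -> str:
    return "Use OAuth 2.0 with refresh tokens and audit logging."


def judge(answer_a: str, answer_b: str) -> Dict[str, List[str]]:
    """Naive judge: report which words only one of the two answers contains."""
    words_a = set(answer_a.lower().rstrip(".").split())
    words_b = set(answer_b.lower().rstrip(".").split())
    return {
        "agreed": sorted(words_a & words_b),
        "only_a": sorted(words_a - words_b),
        "only_b": sorted(words_b - words_a),
    }


def validate(
    question: str,
    first: Callable[[str], str],
    second: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Ask both models the same question, then hand both answers to the judge."""
    return judge(first(question), second(question))


report = validate("How should we secure the public API?", model_a, model_b)
print("Only model A mentioned:", report["only_a"])
print("Only model B mentioned:", report["only_b"])
```

Anything that appears in only one answer is a candidate gap for a human to review, which matches the episode's emphasis on keeping a person in the loop rather than trusting a single model's agreement.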

Revenue Engine Podcast
Building a CMO+ Mindset for Modern Marketing With Catherine Solazzo

Jun 5, 2026 · 35:57


Catherine Solazzo is the Chief Marketing Officer at Appfire, a leading provider of software applications that help developers optimize their efficiency on platforms like Jira, Salesforce, and monday.com. With over 20 years of experience in marketing and digital transformation, Catherine has led high-performing teams at global tech giants and was recently recognized on the CRN Women of the Channel 2026 list. Under her marketing leadership, Appfire reaches over a million users every quarter through its technical documentation site.

In this episode…

Marketing today goes beyond just awareness, demand generation, or a polished campaign. It extends into the product experience, customer feedback, partner enablement, and every handoff that shapes revenue. So what does it really take to build a modern CMO mindset?

According to Catherine Solazzo, a seasoned marketing leader with deep experience across developer ecosystems, the answer lies in thinking beyond the traditional marketing boundaries. She explains that elements like technical documentation, release notes, product pages, partner programs, and customer feedback are not secondary functions; they are critical touchpoints that help buyers understand, adopt, and expand with a product. Catherine emphasizes that by taking ownership of this entire journey, marketing teams can move faster, use data more effectively, and create a more cohesive go-to-market motion.

In this episode of the Revenue Engine Podcast, Alex Gluz is joined by Catherine Solazzo, Chief Marketing Officer at Appfire, to discuss building a CMO mindset for modern marketing. Catherine explains the CMO+ model, how to align product and marketing, and why demand generation should go beyond lead volume. She also shares advice on partner enablement and creating repeatable operating models.

The Jira Life
Sustainable Leadership at Atlassian (featuring Jessica Hyman)

Jun 4, 2026 · 62:10


What does it actually look like to embed social impact into the DNA of a leading tech company, not as a PR footnote but as a core operating principle? This week, we sit down with Jessica Hyman, the person at Atlassian tasked with answering exactly that question.

As a key part of the Atlassian Foundation and the company's Chief Sustainability Officer, Jessica oversees everything from Atlassian's environmental commitments to its global social impact programs. We dig into how she navigates the tension between ambitious sustainability goals and the realities of running a high-growth software company, what the Atlassian Foundation is building for the long term, and why she thinks the tools teams use every day (yes, including Jira) can shape a more equitable future of work.

This discussion looks beyond the usual topics, such as new Rovo features, to understand how one of the world's biggest collaboration platforms approaches its obligations to the planet and its communities. If you're ready to look at Atlassian from a different perspective, this conversation is for you.

Thank you to ikuTeam for connecting and collaborating with The Jira Life. https://ikuteam.com

Podcast Radio Penyiaran Polimedia
“KAMISTERI” Ep. 03 Season 8… Possession #1

Jun 4, 2026 · 30:21


Hello, creative friends… in this episode of KamisTeri, we have a special guest, Jira, who shares an experience from their school days.

According to Jira, a possession incident once broke out at the school after someone passed through the top floor of the school building. Strangely, something happened almost every afternoon, and the possession kept spreading. It started with one person, then others became possessed too, until the whole school was in chaos.

So what is really on that top floor? Listen to the full story on KamisTeri. Because on KamisTeri… you're not alone.

Host: Fikri
Guest: Jira
Operator: Billy
Music Director: Fattaah
Editor: Ilham

Don't forget to follow us!
Instagram: @polimedia_radio
TikTok: @polimediaradio

Scrum Master Toolbox Podcast
The Team That Gave Up — When Green Reports Mask a Sinking Ship | Maria Skvortsova

Jun 2, 2026 · 15:14


Read the full Show Notes and search through the world's largest audio library on Agile and Scrum directly on the Scrum Master Toolbox Podcast website: http://bit.ly/SMTP_ShowNotes.

"They said, 'Yeah, we know, but no one will listen to us.' And they just gave up, waiting for the ship to sink so they could swim away." (Maria Skvortsova)

Maria walked into a 20-person migration team where the PowerPoint reports glowed green but the reality on the ground was covered in red flags. Developers were building features against requirements that had already changed; nobody had told them. The scope was impossibly large, and when Maria asked the team why they hadn't raised a red flag, the answer shook her: "No one will listen to us." The team had given up. They were waiting for the project to fail so they could leave.

Maria's first instinct was to observe: spend weeks understanding the dynamics, the communication patterns, the culture. But she learned the hard way that when a team is already drowning, there's no time for a slow ramp-up. She needed to act immediately. Her breakthrough came from a simple technique: replacing some daily standups with an async RAG (Red-Amber-Green) status system in Jira. Team members just chose a color for each story, no explanation needed. It gave them psychological safety to signal problems without speaking up in a 20-person meeting. From there, Maria broke the team into smaller cross-functional groups (one QA, one developer, one consultant) so they could actually discuss features instead of hiding behind silence.

In this episode, we refer to Zombie Scrum Survival Guide by Christiaan Verwijs, Johannes Schartau, and Barry Overeem. Also check out the episode with Barry and Christiaan, authors of the book, on the podcast.

Self-reflection Question: When you join a new team and sense that something is deeply wrong, how long do you wait before acting, and is that waiting period serving the team or just your own comfort?

Featured Book of the Week: Zombie Scrum Survival Guide by Christiaan Verwijs, Johannes Schartau, and Barry Overeem

Maria chose Zombie Scrum Survival Guide because, as she puts it, "Most Scrum Masters learn by the happy path. We all know how it should be. But we rarely think about how it should not be." The book focuses on detecting anti-patterns early, before they become entrenched behaviors that are much harder to break. Maria finds it especially valuable because it provides concrete experiments you can try with your team to shake off the zombie symptoms. Her advice: start here, because understanding what bad looks like is just as important as knowing the ideal.
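The async RAG status idea from this episode can be pictured with a small sketch. The episode does not say how the statuses were stored or read in Jira, so this assumes they have already been collected into a plain mapping; the story keys, the `rag_by_story` data, and the `summarize` helper are hypothetical.

```python
from collections import Counter

# Hypothetical snapshot: one color per story, chosen by the team member,
# with no explanation required.
rag_by_story = {
    "STORY-101": "green",
    "STORY-102": "red",
    "STORY-103": "amber",
    "STORY-104": "green",
    "STORY-105": "red",
}


def summarize(rag):
    """Count stories per color and list the ones that need a conversation."""
    counts = Counter(rag.values())
    flagged = sorted(key for key, color in rag.items() if color in ("red", "amber"))
    return counts, flagged


counts, flagged = summarize(rag_by_story)
print(f"green={counts['green']} amber={counts['amber']} red={counts['red']}")
print("Stories to discuss in small groups:", ", ".join(flagged))
```

A summary like this gives a facilitator the list of stories to raise in the smaller cross-functional groups, without anyone having to speak up first in a large meeting.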

The Jira Life
3 is a Magic Number! - Happy Birthday, TJL!

May 28, 2026 · 66:21


Product for Product Management
EP 155 - Reshaping product development for AI's impact with Gil Broza

May 27, 2026 · 51:15


This conversation with Gil Broza goes straight at a question many product leaders are quietly wrestling with: how do you bring AI into product development without breaking everything that already works? Gil, author, coach, and long-time agility expert, returns to talk with Matt and Moshe about “reshaping product development for AI's impact,” focusing not on building AI features, but on how AI is changing the way product and engineering teams work day to day. He argues that while AI massively increases speed and output, it doesn't change the fundamentals of good product development: clear direction, evidence-based judgment, solid technical foundations, and healthy teams.

Join Matt and Moshe as they explore with Gil:

- Why AI is a “turbo engine in a car with old brakes” if you drop it into a system designed for human speed.
- How leaders confuse more output with more value, and why faster code can just mean “legacy code years ahead of schedule”.
- The difference between real agility and “performative Agile” (ceremonies, Jira theater) when AI tools are doing more of the work.
- How to think in systems: what you're actually optimizing for (predictability, innovation, time-to-value) and how AI changes the constraints and feedback loops in your org.
- Practical blind spots leaders miss with AI adoption:
  - Treating AI as an implementation, not a transformation
  - Ignoring cognitive load and burnout when people work all day with agents
  - Shrinking teams for “efficiency” and accidentally increasing isolation
- The three main ways to use AI in product development, and why you should be explicit about each:
  - As a pairing partner (thinking, coding, design)
  - As an autonomous agent
  - As “just” automation (summaries, note-taking, etc.)
- Why skipping prototyping and experiments is now “less excusable” when AI can create testable prototypes in hours instead of weeks.
- What changes (and doesn't) in roles like PM, engineer, and scrum master when AI becomes a real team member.
- Concrete steps leaders can take: apply systems thinking, revisit mindset and values, redesign ways of working for AI-speed conditions, and invest in continuous improvement again.
- How Gil's new courses (“Reshaping Product Development for AI's Impact” and “Leading AI-Enabled Product Teams”) help product and engineering leaders do this work intentionally.

Want to go deeper or work with Gil?

- Website & courses: https://3pvantage.com/
- Newsletter & articles: https://3pvantage.com/subscrib.../
- LinkedIn: https://www.linkedin.com/in/gi.../

You can also connect with us and find more episodes:

- Product for Product Podcast: http://linkedin.com/company/pr...-podcast
- Matt Green: https://www.linkedin.com/in/ma...
- Moshe Mikanovsky: http://www.linkedin.com/in/mikanovsky

Note: Any views mentioned in the podcast are the sole views of our hosts and guests, and do not represent the products mentioned in any way.

Please leave us a review and feedback ⭐️⭐️⭐️⭐️⭐️

SBS Somali - SBS Afomali
Millions of pilgrims perform the Hajj rites amid extreme heat in Mecca

May 26, 2026 · 7:34


Millions of pilgrims from across the Muslim world have gathered in the holy city of Mecca, Saudi Arabia, to perform this year's Hajj, as temperatures reached 48 degrees.

The Tech Blog Writer Podcast
Cisco's AI Transformation Journey From Fragmented Systems To Smarter Workflows

May 25, 2026 · 23:53


What does AI transformation actually look like inside one of the world's largest engineering organizations? At Team '26 in Anaheim, I recently sat down with Jason Andrews to unpack how Cisco transformed decades of fragmented tooling, disconnected workflows, and spreadsheet-driven operations into a unified system of work built around Jira, Confluence, Jira Service Management, automation, and AI-ready workflows. And honestly, this conversation felt refreshingly practical. Jason oversees engineering operations across Cisco Networking, a business unit with around 22,000 engineers and product managers representing roughly $40 billion in annual revenue. So when he talks about transformation, this isn't theory. This is operational change happening at enterprise scale. We discuss how Cisco consolidated more than 85 Jira instances, reduced tooling spend by 54%, and accelerated reporting by 40x while creating a far more scalable engineering organization. But as Jason explains throughout the conversation, the real challenge was never the technology itself. It was getting teams to rethink how they wanted to work moving forward rather than simply migrating years of technical debt into modern systems. One of the strongest themes in this episode is the difference between transformation and migration. Jason explains why organizations often fail when they focus only on moving systems rather than changing workflows, behaviors, and operational culture at the same time. We also dive deep into AI adoption inside engineering organizations. Jason shares how Cisco is already seeing significant productivity gains from AI-assisted development, why organizational context matters so much for enterprise AI success, and why he believes the industry is still massively underestimating how much structured data and workflow consistency AI systems actually require. 
Along the way, we unpack scenario planning in the AI era, why annual planning cycles are becoming increasingly fragile, and how leaders can move from rigid long-term roadmaps toward more agile operational playbooks capable of adapting to constant disruption. There's also a fascinating discussion around the so-called "SaaS apocalypse," the limits of AI-generated software, and why Jason believes humans will remain central to enterprise operations for years to come, especially in organizations managing millions of lines of legacy code and decades of accumulated institutional knowledge. If your organization is currently navigating modernization, operational complexity, AI adoption, or large-scale systems transformation, this episode is packed with lessons learned from the front lines of enterprise change. And perhaps most importantly, Jason offers a reminder that AI alone is not the strategy. The real opportunity comes from reducing friction, improving context, and helping teams spend more time solving meaningful problems instead of manually stitching systems together.

ASCII Anything
S11E14: Jira, Karaoke & Connections: Latisha Guinn, Marc Brickley & Malinda Lowder Recap the Build It Together and Atlassian Team 26 Conferences

May 25, 2026 · 36:00


Tish Guinn hosts this look back at the Build It Together and Atlassian Team 26 conferences held this month in Anaheim, California. She's joined by Marc Brickley, Moser's Director of Application Development, and Malinda Lowder, our Director of Sales and Marketing, who were also at the conferences.

We cover what stood out at both conferences, recap some of the conversations we had around Clear Path, the importance of building authentic connections, and how organizations can turn conference interactions into long-term partnerships and opportunities. There's also plenty of reminiscing about karaoke with a live band and the skating rink at the post-Team 26 bash.

The Jira Life
The Trello Life...On the Road!

May 23, 2026 · 71:28


As spring turns into summer, many people's thoughts turn to hitting the open road and soaking in the majestic grandeur of the United States. This week on The Jira Life, we're celebrating that spirit of adventure with a grand reunion of the Trello Trailblazers, featuring the Trello queen herself, Brittany Joiner, along with fellow Trello enthusiasts Rob Hean and Mike Day. Brittany takes us behind the scenes of her cross-country motor home journey, sharing how she planned every mile of the trip using Trello. From route mapping to packing lists, tune in as she walks us through her actual board and the tips and tricks that kept her adventure organized and on track.

But the fun doesn't stop there. Rob and Mike bring their own Trello expertise to the table, diving into the ways they've harnessed the power of Trello in their own lives and work. Whether you're a seasoned Trello power user or just getting started, this episode is packed with inspiration, practical takeaways, and plenty of good company. Don't miss this lively, road-trip-fueled conversation with three of the most passionate voices in the Trello community, only on The Jira Life.

Thank you to ikuTeam for connecting and collaborating with The Jira Life. https://ikuteam.com

Patoarchitekci
MVP, POC, POT - stop mixing up these three things

May 22, 2026 · 25:37


“The POC must deliver a fully functional production capable MVP.” A genuine quote from a client, which Łukasz pulls out of Teams like a piece of evidence in a case about the collective confusion of terms. Sound familiar?

DevOps Diaries
074 — Make SOX audits easy with Salesforce DevOps!

May 21, 2026 · 39:33


Jack sits down with Tapan Patel, Gearset DevOps Leader for 2026 and DevOps Lead for the Salesforce practice at Braze, a publicly traded omnichannel platform where every change management decision is subject to SOX audit scrutiny. Tapan brings a rare blend of project delivery experience, release management rigour, and genuine passion for building DevOps not just as a set of processes, but as a culture.

The episode is a masterclass in phased, people-first DevOps rollout. Tapan walks through exactly how he's taken Braze from change sets and manual deployments to a governed, audit-ready CI/CD pipeline over the past year and a half, breaking it down into four distinct phases and sharing what actually worked, what took longer than expected, and where he's headed next. Tapan shares his rounded take on AI, including where it's already adding value in the pipeline today, why agentic autonomy in prod is still a way off, and how Claude, Jira and Gearset's reporting API are becoming a powerful combination for DevOps KPI tracking.

00:01 – Intro & Meet Tapan Patel
00:40 – Tapan's Journey: From Data & Analytics to Salesforce DevOps
02:12 – What DevOps Actually Means as an Organisational Culture
04:10 – DevOps in a SOX-Audited, Publicly Traded Company
05:10 – The State of DevOps at Braze When Tapan Joined
08:14 – Shifting Mindsets From Change Sets to a DevOps Tool
10:32 – Precision Deployments: Why Page Layouts Break Everything
11:49 – Stakeholder Visibility & the Value of Issue Tracking Integration
13:36 – What Tapan Values Most About Gearset
15:53 – The Four Phases of CI/CD Rollout at Braze
19:16 – Phase Two: Stabilisation & SOX Integration
20:30 – Phase Three: Automation Layers & QA Integration
21:18 – Phase Four: Maturity & Minimal Intervention
22:55 – The Admin Learning Curve for DevOps Adoption
25:25 – Continuous Improvement as a Practice, Not a Project
28:34 – Where AI Fits Into the DevOps Pipeline Right Now
31:07 – Supplementary vs. Agentic AI: Why Tapan Is Taking It Slow
33:14 – Using Claude + Gearset Data for Sprint Analysis & KPI Tracking
36:00 – The DevOps KPIs That Matter at Braze
37:24 – Closing Advice for Anyone Starting Their DevOps Journey

Irish Tech News Audio Articles
Xtremepush Introduces 'XpertOS', the Future Operating System for CRM Teams

Irish Tech News Audio Articles

Play Episode Listen Later May 21, 2026 4:53


Xtremepush, the category leader in igaming CRM and loyalty marketing, has announced the launch of XpertOS, a seamlessly integrated and fully-live AI platform that has the potential to totally upend the industry and change how CRM teams operate in future. Just as the internet search industry saw a seismic shift when ChatGPT arrived, the arrival of XpertOS heralds a similar transformation for CRM and how marketing professionals will work in regulated markets, including financial services, insurance, as well as igaming. Xtremepush customers will be instantly able to take a leap forward and exponentially increase their CRM output, moving from ineffective bolt-ons to an integrated, bespoke AI-powered solution featuring relevant content and brand tools. Personalisation at scale will be at their fingertips, layered with the assurance that comes with governance built into the architecture by long-time experts in their field. XpertOS' compliance-first architecture is a three-tiered product solution – adaptable for operational readiness – including, the 'Xpert Assistant', which will be familiar to those with AI chat experience and acts as an entry point for managers looking for ideation and broader strategy planning. 'Xpert Flows', sits above, integrating with key workflow tools such as Jira and Slack, and operationalises the CRM system with human-in-the-loop governance and transparent campaign execution. 'Xpert Crews' offer the most sophisticated assistance to CRM teams, with brief or goal-driven simulated agent teams acting autonomously with specific roles across functions from compliance to copywriting. These trusted 'teams' produce draft campaigns and iterate in real-time, utilising Xtremepush's unique, unified data architecture, ensuring optimal outcomes that CRM managers can monitor, unpick logic, and steer. 
Together, these allow human teams to execute their strategic visions and focus on core objectives of lifetime value uplift, churn reduction, and reduced cost per conversion, while delivering more without losing execution quality. From now on, creativity is the only ceiling, not a business's capacity. The tiered adoption path is specific to Xtremepush. It enables customers to find effective solutions that work best with existing tech stacks, optimises headcount output and resource, guarantees relevance, for example by carrying out detailed research into the latest news and odds via custom feeds and thereby fosters greater focus on high-level strategy goals. Unique to XpertOS, and a key guardrail for CRM teams, is the fact that compliance is enforced by the platform's engine, not by the AI, ensuring intelligence and governance operate in separate architectural layers. "This is the end of today's CRM as we know it," said Tommy Kearns, CEO and co-founder at Xtremepush. "XpertOS is the sector's first, fully functioning agentic operating system and marks a shift as fundamental as any we've seen in the space for a couple of decades. "Thanks to the governance layer built into its core, we firmly believe this will replace the traditional, step-by-step, and manual platforms currently used in martech with existing teams doing the judgement, while the AI does the work. What will be ubiquitous as a work process in 18 months is here now. "XpertOS automation replaces the slowly evolving campaign execution of old, elevating CRM executives into strategic architects, backed by a hard-coded governance and compliance layer that empowers the human-in-the-loop and supercharges personalisation, as well as engagement and retention metrics." Built with a visual 'Control Room' which demonstrates approval gates and interaction logging, XpertOS is primed for usage in heavily regulated industries, where compliance and transparency are key. 
XpertOS Takeaways: Find: Locates commercially valuable players that CRM teams currently don't have the tools to reach. Govern: Checks every campaign for compliance before they go live. Scale: Lets CRM teams run 10x m...

La French Connection
Episode 0x295 - Why GitGuardian Is Seeing Secret Leaks Explode

La French Connection

Play Episode Listen Later May 19, 2026 56:12


Synopsis: This week, a special episode sponsored by GitGuardian, with founder and CEO Eric Fourrier joining Patrick, Steve and Francis from New York. We walk through the annual State of Secrets Sprawl report, and the verdict is brutal: the number of secrets exposed on GitHub is exploding, and AI is not helping. Users of coding agents leak twice as many secrets as traditional developers, and a new category of "vibe coders" with no security training produces code full of API keys placed directly in config files. Eric details the new attack economy in which stealing secrets has become the number one objective: whether through phishing, social engineering or supply chain compromises such as Shai-Hulud, LiteLLM or the recent attack on NX, everything converges on the same goal. Developer machines have become the preferred target because they concentrate access, secrets and now coding agents connected to everything (GitHub, Jira, CRM, databases) via MCP and other integrations. The conversation then turns to the Axios case and the North Koreans' method of exploiting social engineering to trap maintainers of open source packages: a fake business call, a fake attachment, and the entire downstream ecosystem ends up compromised. Patrick and Steve bring the debate back to the reality of Quebec SMBs that do not even know where their secrets are, let alone how to manage a vault. The episode ends on an essential reflection: security is not an absolute value, it is a trade-off. The most perfect approaches are often the ones that discourage teams and end up never being implemented. Eric also announces GitGuardian's recent $50M funding round and its active recruiting.

Crew: Patrick Mathieu, Steve Waterhouse, Francis Coats, Eric Fourrier (founder and CEO of GitGuardian)

Links and resources
Eric Fourrier / GitGuardian: GitGuardian official site; State of Secrets Sprawl 2026 report; State of Secrets Sprawl 2025 report; Series C funding round ($50M); GitGuardian Careers
Supply chain attacks mentioned: Shai-Hulud npm worm; LiteLLM compromise; Axios compromise via a maintainer
Concepts covered: MCP (Model Context Protocol); principle of least privilege
Shameless plug: Join Discord securite.fm; Hackfest 2026 (Québec); iHack 2026
Credits: Audio editing by Hackfest Communication; virtual studio by Streamyard; episode sponsored by GitGuardian

The Tech Blog Writer Podcast
Atlassian's Chief Design Officer on AI, Creativity, and the Future of Work

The Tech Blog Writer Podcast

Play Episode Listen Later May 15, 2026 29:53


What happens when AI stops being a feature and starts reshaping the very craft of design itself? Live from, I sat down with Charlie Sutton for a conversation that went far beyond product interfaces and pixels. As Atlassian unveiled its latest AI ambitions around agents, context, and the Teamwork Graph, Charlie offered a fascinating look at the human side of that transformation and why design may become even more important as AI becomes embedded into the way we work. Charlie shared how Atlassian approaches design at scale across products like Jira, Confluence, Loom, and Rovo, explaining why every interaction should feel intentional and cohesive, even when built by hundreds of people across dozens of teams. But this conversation quickly moved into much bigger territory. We explored how AI is changing the relationship between designers, developers, and business teams, and why the traditional barriers between idea and execution are rapidly disappearing. One of the most thought-provoking parts of the discussion centered around democratization. Charlie argued that while AI tools have dramatically lowered the floor for creativity, they have also raised the ceiling for what users now expect from software experiences. Anyone can prototype an app today, but expectations around quality, coherence, trust, and usability are climbing just as quickly. We also unpacked the growing shift from prompting AI to delegating work to AI agents. Charlie explained why assigning work to agents increasingly resembles managing human teammates, from defining goals and success criteria to understanding strengths, limitations, and context. That naturally led us into a deeper conversation about trust, transparency, and why users must always feel they can "pop the bonnet" and understand what AI systems are doing on their behalf. Another major theme throughout the episode was context. 
Charlie shared why Atlassian sees organizational context as one of the defining challenges of the AI era and how the Teamwork Graph is helping connect people, projects, conversations, and knowledge across the company. He compared this moment to the first time many of us used Google search and suddenly realized the scale of what was possible. We also discussed how AI adoption is unfolding differently from previous technology waves. Instead of adoption trickling down from hardcore technical users, Charlie is seeing rapid experimentation from marketing, HR, and design teams looking to reduce repetitive work and communicate ideas more effectively. Even his own mother, he joked, has become an AI power user before he has. From AltaVista nostalgia and Ask Jeeves memories to serious conversations about the future of human creativity, this episode captures a rare and honest perspective on where design, collaboration, and AI may be heading next. How will organizations balance personalization with shared experiences as AI becomes embedded into every workflow, and what role will human creativity play when everyone suddenly has access to the same powerful tools? Please check the partners of the Tech Tech Talks Network Learn more about the NordLayer Browser Visit Denodo.com

The Tech Blog Writer Podcast
AI, Engineering, And Formula One: The Tech Driving the Atlassian Williams F1 Team

The Tech Blog Writer Podcast

Play Episode Listen Later May 14, 2026 28:31


What happens when one of the most iconic teams in Formula One decides to rethink how work gets done behind the scenes completely? Last year, Atlassian Williams Racing made headlines when Atlassian entered Formula One as both title partner and technology partner. At the time, many people saw the partnership as another high-profile sponsorship deal. But over the last twelve months, something much bigger has been unfolding inside the Williams organization. At Team '26 in Anaheim, I sat down with Andrew Boyagi and Matt Harman to unpack how AI, data, workflows, and organizational transformation are reshaping life both at the factory and on the grid. This conversation goes far beyond racing. Matt explains how Williams is reducing the time between "idea to track," compressing development cycles so upgrades arrive at race weekends weeks earlier than before. One striking example involves reducing front wing lead times by a factor of three through parallel workflows and better collaboration, allowing performance gains to reach the circuit three race weekends sooner. Andrew shares how Atlassian's system-of-work philosophy is being applied in one of the most data-intensive environments on earth. We explore how tools like Jira, Confluence, Loom, Rovo, and Teamwork Graph are helping engineers, strategists, operations teams, and factory staff make faster decisions with less operational friction. We also discuss how AI is changing engineers' roles, why organizational context matters more than raw intelligence, and how Formula One teams balance human instinct with AI-driven precision in race strategy decisions. Matt offers fascinating insight into how AI helps teams process decades of historical race data in real time while still relying on human judgment in critical moments. Along the way, we explore the cultural transformation underway at Williams, including the shift away from endless meetings toward faster, outcome-focused collaboration. 
Matt explains how tools like Loom and Confluence are helping teams make decisions more efficiently while spreading knowledge more effectively across specialist departments. Andrew also reveals some eye-opening metrics from the partnership so far. Since rolling out Atlassian's Teamwork Collection, teams have reportedly increased throughput by 83%, while low-value meetings have been reduced by 863 hours in a single month across 200 people. Perhaps the biggest takeaway from this episode is that Formula One may actually be a perfect reflection of the challenges facing every modern business. As Andrew puts it during our conversation, Formula One is ultimately "an enterprise performance problem," just operating at 300 kilometers an hour with millions of people watching every weekend. If you've ever wondered what enterprise transformation looks like when milliseconds matter, this episode offers a fascinating look inside one of the most ambitious AI and workflow transformation journeys happening anywhere in business today.

The Jira Life
...And Now a Word from Our Sponsor (with Nelson Pereira)

The Jira Life

Play Episode Listen Later May 14, 2026 62:54


TEAM '26 was a whirlwind, and we're still buzzing from it. But before we move on to the next thing, we wanted to take a moment to shine a light on the people who helped make this season possible. This quarter, that means ikuTeam, and we couldn't be more grateful for the partnership.

So what better way to say thank you than to hand them the mic?

In this episode, we sit down with Nelson Pereira, co-founder and VP of Product at ikuTeam, for a wide-ranging conversation about how ikuTeam came to be, what they've built, and where they're headed. Whether you've been hearing about ikuTeam all season and want to finally understand what they actually do, or you're already a fan of their products, this one's worth a listen.

Nelson walks us through the origin story of ikuTeam: the problem they set out to solve and how that vision has evolved over time. We dig into their current product lineup, what makes it tick for Jira users specifically, and the roadmap ahead. If you've ever wondered how a tool gets purpose-built for the way modern teams actually work inside Atlassian's ecosystem, Nelson has a lot of thoughtful things to say about that.

It's a fun, honest conversation with someone who clearly loves what he's building, and we think you'll come away with a much clearer picture of why ikuTeam has been a natural fit for the Atlassian community.

Give it a listen, and as always, thanks for being here!

Thank you to ikuTeam for connecting and collaborating with The Jira Life. https://ikuteam.com

The Jira Life
=====================================
Having trouble keeping up with when we are live? Sign up for our Atlassian Community Group!
https://ace.atlassian.com/the-jira-life/
Or Follow us on LinkedIn!
https://www.linkedin.com/company/the-jira-life/
Become a member on YouTube to get access to perks:
https://www.youtube.com/@thejiralife/join

Hosts:
- Alex "Dr. Jira" Ortiz
https://www.linkedin.com/in/alexortiz89/
https://www.youtube.com/@ApetechTechTutorials
- Rodney "The Jira Guy" Nissen
https://www.linkedin.com/in/rgnissen/
https://thejiraguy.com
- Sarah Wright
https://www.linkedin.com/in/satwright/

Producer:
- "King Bob" Robert Wen
https://www.linkedin.com/in/robert-wen-csm-spc6-a552051/

Executive Producer:
- Lina Ortiz

Music provided by Monstercat:
=====================================
Intro: Nitro Fun - Cheat Codes
https://www.youtube.com/c/monstercat
Outro: Fractal - Atrium
https://www.youtube.com/c/monstercatinstinct

Develpreneur: Become a Better Developer and Entrepreneur
Software Delivery Clarity: Why Visibility Beats More Process

Develpreneur: Become a Better Developer and Entrepreneur

Play Episode Listen Later May 12, 2026 35:23


Software delivery clarity has become one of the most important competitive advantages for engineering organizations. Teams are shipping faster, AI-assisted development is compressing implementation timelines, and traditional project management systems are struggling to keep pace with modern software delivery realities. During the conversation with Alex Polyakov, one idea surfaced repeatedly: most project management systems promise visibility but fail to provide actual operational clarity. Teams still discover delays too late. Executives still receive bad news at the last possible moment. Developers still spend excessive time updating systems rather than building software. That disconnect is exactly what inspired Alex to rethink how engineering organizations manage software delivery. About Alex Polyakov Alex Polyakov is the founder of Project Simple AI, a platform focused on improving transparency and discipline across software delivery workflows. With more than 25 years of experience spanning software engineering, architecture, product management, entrepreneurship, and startup leadership, Alex brings a deeply practical perspective to modern development operations. He has worked as an Application Developer, Senior Engineer, Tech Lead, Software Architect, Solutions Architect, Product Manager, Entrepreneur, and Startup Founder. Today, his focus is helping engineering teams gain visibility and operational discipline without adding unnecessary complexity. Alex also hosts the "Let's Talk Agile" podcast on YouTube, where he discusses modern software development challenges and Agile transformation realities. LinkedIn: https://www.linkedin.com/in/alexpolyakov/ Why Software Delivery Clarity Still Doesn't Exist Most organizations believe they have visibility because they use Jira, Azure DevOps, or similar tools. In reality, they have tracking systems, not visibility systems. Alex described modern project management tools as "glorified Excel sheets." 
That description lands because many engineering teams recognize the pattern immediately. Endless ticket hierarchies, fields, statuses, and sprint rituals often create administrative complexity without improving confidence. The core issue is simple: status updates depend on human behavior. Developers forget to update tickets. Teams delay reporting problems. Managers discover schedule risks only when deadlines are already compromised. The tooling creates an illusion of control while actual delivery risk remains hidden. That creates a dangerous operating environment for leadership. A founder or executive can solve a delivery problem early. They can reduce scope, renegotiate timelines, allocate additional staff, or re-sequence priorities. But once a team waits until the final week to communicate delays, most strategic options disappear. Visibility is not the same thing as documentation. Visibility means understanding delivery risk early enough to respond. Software Delivery Clarity Requires Behavioral Design One of the most interesting concepts from the discussion was the idea that project management is partly behavioral science. Most tools allow teams to skip critical disciplines. Teams can start work before decomposition. They can mark tasks complete without validating outcomes. They can carry partially defined requirements into implementation. Alex's approach flips that model entirely. Instead of giving teams unlimited flexibility, the system enforces operational readiness. Work cannot begin without decomposition. Timelines cannot exist without estimates. Completion cannot happen without verifying a definition of done. This is important because software organizations often assume process problems are communication problems. In reality, many are workflow design problems. If a system permits ambiguity, ambiguity becomes normalized. If a system requires clarity, clarity becomes operational behavior. 
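The three gates described above (no start without decomposition, no timeline without an estimate, no completion without a verified definition of done) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Project Simple AI or any other tool implements it; the `WorkItem` class, the `GateError` exception, and every field name are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class GateError(Exception):
    """Raised when a work item tries to skip a required discipline (hypothetical)."""


@dataclass
class WorkItem:
    # All names here are illustrative assumptions, not taken from any real tool.
    title: str
    subtasks: List[str] = field(default_factory=list)          # decomposition
    estimate_hours: Optional[float] = None                     # estimate
    done_checks: Dict[str, bool] = field(default_factory=dict)  # definition of done
    status: str = "backlog"

    def start(self) -> None:
        # Gate 1: work cannot begin without decomposition.
        if not self.subtasks:
            raise GateError("decompose the work before starting")
        self.status = "in_progress"

    def timeline_hours(self) -> float:
        # Gate 2: a timeline cannot exist without an estimate.
        if self.estimate_hours is None:
            raise GateError("add an estimate before asking for a timeline")
        return self.estimate_hours

    def complete(self) -> None:
        # Gate 3: completion requires every definition-of-done check to pass.
        if self.status != "in_progress":
            raise GateError("only started work can be completed")
        if not self.done_checks or not all(self.done_checks.values()):
            raise GateError("verify every definition-of-done check first")
        self.status = "done"
```

The design choice is that each rule is a hard failure rather than a warning: an item with no subtasks simply cannot move to `in_progress`, which is the difference the episode draws between a tracking system and one that enforces readiness.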
Why AI Makes Software Delivery Clarity More Important AI-assisted development changes the economics of software delivery. Implementation cycles are shrinking dramatically. Tasks that previously required days may now take hours. Boilerplate code generation, scaffolding, testing support, and architectural suggestions accelerate execution speed. That acceleration creates a new challenge. If implementation becomes faster, bottlenecks move upstream and downstream. Requirements gathering, coordination, prioritization, testing, and validation suddenly become the limiting factors. This means organizations can no longer rely on heavyweight process management structures built for slower delivery cycles. When implementation speeds increase but operational visibility stays static, delivery chaos accelerates instead of improving. The transcript discussion highlighted a critical reality many organizations are only beginning to recognize: AI amplifies existing operational weaknesses. A disorganized engineering team using AI becomes a faster disorganized engineering team. That is why delivery clarity matters more now than it did during earlier Agile transformations. The Simplicity Principle Behind Better Delivery Alex outlined several operational principles that simplify software execution dramatically. Software Delivery Clarity Starts with Prioritization Teams should know exactly what matters most. Priority order should not be vague or political. If only one item can ship, teams must know which item wins. That sounds obvious, but many organizations operate with dozens of simultaneous "critical" initiatives. Clear sequencing eliminates organizational confusion. Software Delivery Clarity Depends on Finishable Work Teams should not start work that they cannot complete. This principle directly attacks excessive work in progress — one of the most common hidden inefficiencies in software organizations. 
Partially completed work creates coordination overhead, testing delays, context switching, and reporting confusion. Smaller, decomposed work creates measurable progress. Software Delivery Clarity Improves Team Accountability Alex also challenged pre-assigned work structures. When work is individually distributed too early, collaboration weakens. Teams lose shared ownership. Visibility becomes fragmented across individuals instead of remaining centralized around delivery goals. That perspective aligns closely with modern product-oriented engineering cultures where collaboration and flow matter more than rigid task ownership. Before adding new process layers, evaluate whether your current workflow already contains unnecessary coordination overhead. Why Simpler Engineering Systems Scale Better Many organizations assume maturity means adding process. The conversation suggested the opposite. Mature engineering organizations often remove unnecessary friction instead of introducing more operational complexity. Simplicity improves adoption, consistency, and decision-making speed. This becomes especially important in high-growth environments. As teams scale, communication overhead compounds rapidly. Every unnecessary workflow step multiplies across developers, product managers, QA engineers, architects, and leadership stakeholders. Simple systems reduce cognitive load. That reduction creates operational focus. The goal of project management is not to track work forever. The goal is to deliver valuable software predictably. Conclusion Software delivery clarity is not about more dashboards, more ceremonies, or more ticket customization. It is about creating operational confidence. Alex Polyakov's perspective challenges many assumptions that modern engineering organizations accept as normal. Teams do not necessarily need more process. They need better behavioral systems, clearer visibility, stronger prioritization, and simpler operational structures. 
As AI continues accelerating implementation speed, organizations that simplify coordination and improve transparency will gain a meaningful competitive advantage. The future of software delivery may not belong to the teams with the most process sophistication. It may belong to the teams with the clearest operational discipline. Stay Connected: Join the Develpreneur Community

Tech Disruptors
Atlassian CEO on Human-AI Agent Collaboration

Tech Disruptors

Play Episode Listen Later May 12, 2026 33:05


AI agents are reshaping enterprise workflows, increasing the importance of organizational context and connected data. Atlassian CEO and co-founder Mike Cannon-Brookes joins Bloomberg Intelligence senior software analyst Sunil Rajgopal to discuss how Atlassian is embedding AI across Jira, Confluence and service-management tools through its Rovo platform and Teamwork Graph. “The future is about human and agent collaboration,” Cannon-Brookes says. The discussion also covers enterprise AI adoption, developer productivity and API-driven software infrastructure.

Ecomm Breakthrough
Claude, OpenClaw & Custom GPTs: The New AI Stack Winning in 2026

Ecomm Breakthrough

Play Episode Listen Later May 11, 2026 40:04


Oren Michels is the founder and CEO of Barndoor.ai, the first and only Control Plane for the agentic enterprise. Previously, he co-founded Mashery in 2006 and served as CEO until Intel acquired the company in 2013. When it was acquired, Mashery-powered APIs were used by over 350,000 active developers in over 100,000 active applications, and counted among its customers many of the largest e-commerce, media, and data companies in the world. He is an entrepreneur, investor, board member, and advisor to technology startups in the US and Europe and has made angel investments in several successful companies including Uber, Pebble Post, Addy, Navdy, and eero.

Here's a glimpse of what you'll learn:
- Rapid evolution of AI agents in e-commerce and business operations.
- Definition and functionality of AI agents that perform actions on behalf of users.
- Importance of governance and trust in deploying AI agents to prevent errors and misuse.
- Introduction of Barndoor AI and its role in providing connectivity and governance for AI agents.
- Practical use cases of AI agents in managing tasks across various platforms (e.g., Shopify, JIRA, QuickBooks).
- The necessity of setting strict policies to control AI actions and ensure safety.
- Integration of AI tools with existing software systems and the potential for low-code/no-code solutions.
- The significance of problem-solving and process design skills in effectively utilizing AI agents.
- Recommendations for starting small with AI and learning through practical application.
- Continuous evolution of AI tools and the importance of staying informed and adaptable.

In this episode of the Ecomm Breakthrough podcast, host Josh Hadley speaks with Oren Michels, founder and CEO of Barndoor AI, about the growing role of AI agents in business operations. Oren explains how AI agents can autonomously perform tasks within systems like Shopify, Amazon, and Slack, while emphasizing the critical need for governance and trust. He introduces Barndoor AI as a control plane that enables secure connectivity and policy-based guardrails, preventing unintended actions. Practical use cases include email management, JIRA ticket handling, and financial forecasting. Oren advises listeners to start small, experiment with multiple AI tools, and develop strong problem-solving skills.

Here are the 3 action items that Josh identified from this episode:
- Start with low-risk automation: Deploy AI agents on simple, non-critical workflows first (e.g., email summaries, reporting) to test value and build internal trust before scaling.
- Enforce strict governance from day one: Define clear permissions, rules, and guardrails, and never give blanket access. Every AI action should be controlled, logged, and auditable.
- Design processes before deploying AI: Break workflows into clear steps and craft precise prompts. Strong process design + prompt clarity = better, safer AI performance.

Timestamps:
00:00:00 The Problem of AI Governance - Oren discusses lack of governance in current AI systems and the risks of AI agents forgetting instructions.
00:00:30 Podcast Introduction & Guest Background - Podcast is introduced, and Oren Michels' background and achievements are highlighted.
00:00:44 The Rise of AI Agents in E-commerce - Josh frames the future of e-commerce as dominated by AI agents and introduces Oren as the guest.
00:02:06 Oren's Perspective on AI Agent Adoption - Oren explains the rapid and slow pace of AI agent adoption, especially beyond coding tasks.
00:03:02 What is Barndoor AI? - Oren introduces Barndoor AI, focusing on connectivity and trust for AI agents in business systems.
00:03:40 How Barndoor AI Works - Details on how Barndoor AI enables granular control and governance over AI agent actions.
00:05:45 Security and Guardrails for AI Agents - Discussion on security risks, both from bad actors and unintended consequences by legitimate users.
00:06:33 Difference Between Barndoor and Other AI Tools - Oren explains how Barndoor adds governance missing from tools like OpenClaw and Claude.
00:09:24 Use Case: Email Management with AI Agents - Oren shares how he uses AI agents to manage and triage his daily email load efficiently.
00:12:04 Why Governance Matters in AI Actions - Explains the importance of restricting AI actions to prevent mistakes, especially in sensitive tasks.
00:13:00 Custom Rules and Granular Policies - Barndoor allows highly specific rules for AI actions, such as price-based restrictions in e-commerce.
00:13:58 Use Case: JIRA and Finance Automation - Examples of using AI agents for JIRA ticket management and automated financial reporting via Slack.
00:16:48 Enterprise Use Cases & E-commerce Optimization - Barndoor's enterprise clients use AI for handling sensitive data and optimizing Amazon listings seasonally.
00:19:08 Customer Service and Contextual Communication - AI agents help draft personalized emails by pulling context from Salesforce and previous communications.
00:20:40 AI Agent Adoption is Still Early - Oren emphasizes that AI agent use is in its infancy and encourages experimentation in low-risk areas.
00:22:40 Personal Use Cases for AI Agents - Josh and Oren discuss personal productivity applications, like sports team management and scheduling.
00:24:14 The Evolving AI Tool Landscape - Discussion on the rapid evolution of AI tools, the importance of using multiple models, and specialization.
00:27:47 Future of AI in Business Operations - Speculation on the future: specialized AI tools for each business function, governed by platforms like Barndoor.
00:31:00 The Importance of Problem-Solving and Prompt Engineering - Success with AI depends on defining problems and giving clear instructions, akin to prompt engineering.
00:33:46 Actionable Takeaways for Listeners - Josh summarizes three action items: start experimenting, document processes, and stay flexible with tools.
00:36:44 Book Recommendation: Why Computers Think - Oren recommends a book that explains the probabilistic nature of AI and why it sometimes fails.
00:37:34 Favorite AI Tool and Personal Use - Oren shares his favorite AI tools and how he uses them for both work and personal learning.
00:38:49 Who to Follow: Aaron Levie - Oren recommends following Aaron Levie for insightful commentary on AI and business.
00:39:28 Where to Learn More About Barndoor AI - Oren directs listeners to Barndoor AI's website and their personal product, Zenni, for hands-on experience.
00:39:45 Podcast Wrap-Up - Podcast concludes with thanks and a call to subscribe and leave a review.

Resources mentioned in this episode:
- Josh Hadley on LinkedIn
- eComm Breakthrough Consulting
- eComm Breakthrough Podcast
- Email Josh Hadley: Josh@eCommBreakthrough.com

Tools and Websites
"OpenClaw": "00:00:00"
"Barndoor AI": "00:03:14"

Talos Takes
The trust paradox: How attackers weaponize legitimate SaaS platforms

May 7, 2026 20:51


In this episode of Talos Takes, Amy Ciminnisi sits down with researcher Diana Brown to discuss the rise of "platform-as-a-proxy" (PAP) attacks. We explore how threat actors are weaponizing legitimate SaaS platforms like GitHub and Jira to deliver phishing campaigns that bypass traditional security filters. By leveraging the platforms' own infrastructure to send authenticated emails, attackers are exploiting the inherent trust employees place in these essential business tools. We break down the mechanics of these campaigns and provide actionable strategies for security teams to move beyond binary trust and implement contextual awareness to better protect their organizations.
Blog: https://blog.talosintelligence.com/weaponizing-saas-notification-pipelines/

Project Management Happy Hour
124: Drowning in Tasks: How Successful PMs Organize the Chaos

May 6, 2026 56:57


If your to-do list is 47 items long, your Slack won't shut up, and you ended the day thinking, "Cool… but what did I actually accomplish?"—welcome. You're among friends. In this episode, Kim and Kate take on the very real, very unsexy side of project management: figuring out how to manage your own work when everything (and everyone) is demanding your attention. This isn't about finding the perfect tool or building a prettier dashboard. It's about surviving—and actually functioning—in an interrupt-driven world where emails breed overnight, notifications multiply, and every task somehow feels urgent. They get into what actually works: setting a North Star for your week (yes, only a few priorities), getting tasks out of your brain before they haunt you at 10 PM, and why some tasks are secretly just traps that create even more work (looking at you, boomerang tasks). Also: a gentle reality check—you're not supposed to do everything. Grab a drink, ignore your inbox for a bit, and let's figure out how to organize the chaos without losing your mind.  

Built to Sell Radio
Ep 544 Why He Regrets Selling for 3.5X EBITDA

May 1, 2026 48:56


Boris Berenberg bootstrapped Atlas Authority, an Atlassian partner that resold Jira and Confluence to mid-market companies and built apps on top of the platform, to high seven figures in revenue with 18% net margins, then sold to private equity in May 2022.  A year later he wrote a blog post titled "I regret selling my startup" that went viral inside the exited founder community. 

The Jira Life
TEAM '26 Predictions - What's Actually Coming for the Atlassian Platform?

May 1, 2026 62:41


Atlassian Team '26 Conference kicks off May 5th in Anaheim, which means it's time to pull out our crystal ball and see what's coming. We've been digging through the Atlassian Roadmap, the Cloud release notes, recent partner announcements, and a few tea leaves to put together our best predictions for what you'll be reading about next week — because if the pattern holds, there's going to be a lot to unpack. Will Rovo finally get the showcase moment it's been building toward? What does the new Google Cloud partnership mean for your stack? Will Mike Cannon-Brookes order a pizza on stage and have it delivered before the keynote wraps? And is there a surprise or two lurking that nobody saw coming? What's your boldest Team '26 prediction? Drop it in the comments — we'll call out the best ones on the stream.
Tune in Thursday as we go prediction by prediction — then follow us to Anaheim: we'll be dropping our Keynote Response the afternoon of May 6th and going live from the Expo Hall floor on May 7th at our normal time to close out the show.
Thank you to ikuTeam for connecting and collaborating with The Jira Life. https://ikuteam.com
The Jira Life
=====================================
Having trouble keeping up with when we are live? Sign up for our Atlassian Community Group! https://ace.atlassian.com/the-jira-life/
Or follow us on LinkedIn! / the-jira-life
Become a member on YouTube to get access to perks: https://www.youtube.com/@thejiralife/...
Hosts:
Alex "Dr. Jira" Ortiz / alexortiz89 / @apetechtechtutorials
Rodney "The Jira Guy" Nissen / rgnissen https://thejiraguy.com
Sarah Wright / satwright
Producer: "King Bob" Robert Wen / robert-wen-csm-spc6-a552051
Executive Producer: Lina Ortiz
Music provided by Monstercat:
=====================================
Intro: Nitro Fun - Cheat Codes / monstercat
Outro: Fractal - Atrium / monstercatinstinct

INNOQ Podcast
AI Features for Jira Data Center

Apr 30, 2026 35:05


In this episode of the INNOQ Podcast, Gil Breth and Nicolas Inden discuss a recipe for Digital Independence Day (DI.DAY): how to build AI-powered features for Jira, completely locally and without depending on the Atlassian Cloud. Nicolas describes how he combined a local AI model, an MCP server, and a self-hosted Jira instance into a workflow that helps him turn information from customer conversations, Slack conversations, and calls into structured Jira tickets.
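As a rough illustration of the last step in such a workflow (turning an extracted note into a ticket), the sketch below builds the JSON body for Jira's issue-creation endpoint, `POST /rest/api/2/issue`. The episode description does not say how Nicolas's MCP server actually talks to Jira, so the `Note` type, the `build_issue_payload` helper, the project key `SUP`, and the 255-character cap are all assumptions made for illustration; only the field names (`project`, `summary`, `description`, `issuetype`) follow Jira's documented issue format.

```python
import json
from dataclasses import dataclass


@dataclass
class Note:
    """Hypothetical container for one item a local model extracted from a call or chat."""
    title: str
    details: str
    source: str  # e.g. "customer call" or "Slack"


def build_issue_payload(note: Note, project_key: str, issue_type: str = "Task") -> dict:
    """Build the request body for POST /rest/api/2/issue. No network call is made."""
    summary = note.title.strip()
    if not summary:
        raise ValueError("an issue needs a non-empty summary")
    return {
        "fields": {
            "project": {"key": project_key},
            # Assumed cap: Jira rejects overly long summaries, so truncate defensively.
            "summary": summary[:255],
            "description": f"{note.details.strip()}\n\nSource: {note.source}",
            "issuetype": {"name": issue_type},
        }
    }


if __name__ == "__main__":
    note = Note("Export fails for large projects", "Customer reports a timeout.", "customer call")
    print(json.dumps(build_issue_payload(note, "SUP"), indent=2))
```

Keeping payload construction separate from the HTTP call means the same function could sit behind an MCP tool, a script, or a test without needing a running Jira instance.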

Developer Tea
AI-Proofing Your Skillset - High-Meaning, High-Specificity Vocabulary is the Path to Growth

Apr 29, 2026 31:10


Why I'm Not "Picking a Fight" on AI: A listener asked if I'm intentionally stoking a flame war by treating agentic coding as a foregone conclusion. The honest answer is that I've used it, the data points one direction, and a show built around pretending otherwise would slowly drift away from reality — and away from being useful to you. Respecting the Misgivings, Without Getting Stuck in Them: Ethical concerns, skill atrophy worries, and questions about long-term effects are all legitimate. But the goal of this show is practical applicability, so we focus on mental models you can use Monday morning rather than litigating every angle of the debate. The "Minecraft" Principle: If I ask you to "build Minecraft," I've handed you several chapters of specification in a single word. That's meaning-rich abstraction — language that points at a huge amount of shared context with very little token cost. Meaning-Rich AND Specific: "Human history" is meaning-rich but uselessly broad. "Block-building game" is specific but loses fidelity. The sweet spot is vocabulary that is both compact and unambiguous — sitting in the top right of the meaning-density / specificity graph. A Real Example — Strategy Pattern: When working on authorization rules, I didn't want a pipeline. Instead of describing base classes, shared interfaces, and parallel execution to the LLM, I used the words "strategy pattern." Three words did the work of three paragraphs, and the output landed where I wanted it. Vocabulary as Leverage: Named patterns, named algorithms (Monte Carlo, etc.), named architectural concepts — these act like compressed pointers. The more of them you genuinely understand, the higher the leverage of every prompt you write and every conversation you have with another engineer. How to Build This Vocabulary: Have conversations with senior engineers. Ask an LLM what patterns are at play in a codebase, which ones you're using incorrectly, and which ones you're tricked into thinking you're using. 
Learn the abstraction layer that sits one step above your day-to-day implementation work. The Asterisk — Shared Context Required: This only works when both sides know the term. Public, well-documented concepts (patterns, papers, algorithms) translate immediately to LLMs. Private or organization-specific concepts need to be loaded into context — via CLAUDE.md, AGENTS.md, or skills — before that compression kicks in. Episode Homework: Pick one area of your current codebase. Ask an LLM to name the patterns in play, the patterns you're using incorrectly, and the ones you might be missing. Use that conversation to add at least one new piece of meaning-rich vocabulary to your working set.
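To make the "strategy pattern" example concrete, here is a minimal sketch of what that phrase might expand to for authorization rules. The episode does not show any code, so every name here (`Request`, `AuthRule`, `AdminRule`, `OwnerRule`, `Authorizer`) is hypothetical, and the "any rule grants access" policy is an assumption rather than the host's actual design.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Request:
    """Hypothetical request shape; the episode does not describe the real one."""
    user_role: str
    user_id: int
    owner_id: int


class AuthRule(Protocol):
    """The shared interface that every interchangeable strategy implements."""

    def allows(self, request: Request) -> bool: ...


class AdminRule:
    def allows(self, request: Request) -> bool:
        return request.user_role == "admin"


class OwnerRule:
    def allows(self, request: Request) -> bool:
        return request.user_id == request.owner_id


class Authorizer:
    """Context object: holds the rules and delegates the decision to them."""

    def __init__(self, rules: list[AuthRule]) -> None:
        self._rules = rules

    def is_allowed(self, request: Request) -> bool:
        # Each rule is evaluated independently (no pipeline); any match grants access.
        return any(rule.allows(request) for rule in self._rules)
```

A caller would write something like `Authorizer([AdminRule(), OwnerRule()]).is_allowed(request)`; adding a new rule means adding a class, not editing `Authorizer`, which is the structure the two words "strategy pattern" compress.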

The People Managing People Podcast
How Great Leaders Prioritize in a World Where Everything Feels Urgent

Apr 28, 2026 47:10


Everything is urgent—until it isn't. When every ticket is a fire, teams don't move faster; they burn out. In this episode, Barbara Nicholas (CEO at Polly) borrows a lesson from search and rescue: urgency only matters when it actually changes outcomes. Most white-collar work isn't life or death, but we've built cultures that pretend it is—and people are paying for it in cognitive overload and constant distraction.
Barbara walks through how she operationalizes a triage system across her company—embedding shared language into tools like Slack, Notion, and Jira, and empowering teams to challenge urgency instead of blindly accepting it. From AI experimentation to customer demands and internal comms, this is a conversation about cutting through noise, making better calls under pressure, and remembering what actually matters.
Related Links:
Join the People Managing People Community
Subscribe to the newsletter to get our latest articles and podcasts
Connect with Barbara on LinkedIn
Visit Polly
Support the show

The Jira Life
Balancing Governance & Customization in your Atlassian Environment (with Nick Turner)

Apr 25, 2026 66:06


Every Atlassian admin has been there — you're either locking everything down so tightly that users are filing tickets just to breathe, or you've said "yes" one too many times and now you've got 47 duplicate fields and a different workflow in every single space. So where's the line?
This episode is less of an interview and more of a conversation — and we want you to be part of it. As we sit down with Nick Turner of the Turner Advisory Group, we're pulling in perspectives straight from the JiraAmigos community, because this is exactly the kind of topic where everyone has a war story, an opinion, or a hard-learned lesson worth hearing.
We'll dig into what it actually looks like when an instance tips too far in either direction. On one end, the over-governed environment built more for the admin than the everyday user — where guardrails have become gates and friction is everywhere. On the other, the anything-goes instance where customization has quietly spiraled into chaos, half the fields don't work, and no two teams are operating the same way.
Nick brings his perspective from the Turner Advisory Group, the hosts bring theirs, and we want you bringing yours too. As you listen, we're challenging you to think honestly about your own instance — which direction are you leaning? How much of that is intentional, and how much just... happened?
We also get into how much context matters here, because what works at a few hundred users can fall apart fast at a few thousand.
Pull up a chair. This one's a group discussion.
Thank you to ikuTeam for connecting and collaborating with The Jira Life. https://ikuteam.com
The Jira Life
=====================================
Having trouble keeping up with when we are live? Sign up for our Atlassian Community Group! https://ace.atlassian.com/the-jira-life/
Or follow us on LinkedIn! https://www.linkedin.com/company/the-jira-life/
Become a member on YouTube to get access to perks: https://www.youtube.com/@thejiralife/join
Hosts:
- Alex "Dr. Jira" Ortiz https://www.linkedin.com/in/alexortiz89/ https://www.youtube.com/@ApetechTechTutorials
- Rodney "The Jira Guy" Nissen https://www.linkedin.com/in/rgnissen/ https://thejiraguy.com
- Sarah Wright https://www.linkedin.com/in/satwright/
Producer:
- "King Bob" Robert Wen https://www.linkedin.com/in/robert-wen-csm-spc6-a552051/
Executive Producer:
- Lina Ortiz
Music provided by Monstercat:
=====================================
Intro: Nitro Fun - Cheat Codes https://www.youtube.com/c/monstercat
Outro: Fractal - Atrium https://www.youtube.com/c/monstercatinstinct

The SaaS CFO
AI and Finance Unite: A Bold Vision for Tax Credit Automation with Taxnova

Apr 23, 2026 27:42


Join Ben Murray on The SaaS CFO Podcast as he interviews George Nichkov, CEO and founder of TaxNova. Hear how George Nichkov's journey from consulting and tech inspired him to solve R&D tax credit challenges for SaaS companies. George Nichkov shares how TaxNova uses AI to automate finance operations—streamlining tax claims, connecting with tools like JIRA and GitHub, and simplifying compliance across global markets. Say goodbye to spreadsheets and manual processes. If you're a SaaS founder or finance leader curious about leveraging global tax credits and AI-powered FinOps, this episode is full of insights you won't want to miss. Show Notes: 00:00 Considering a career change 03:50 Translating projects for accounting 07:24 Easy data integration for companies 12:14 Choosing venture capital for growth 14:04 Building early brand trust 16:57 Discussing data-driven tiered pricing 22:49 Building Pack Snow from scratch 26:05 Integrating JIRA with global offices 26:53 Promoting TaxNova resources online Links: SaaS Fundraising Stories: https://www.thesaasnews.com/news/taxnova-raises-1-million-in-funding George Nichkov's LinkedIn: https://www.linkedin.com/in/nichkov/ Taxnova's LinkedIn: https://www.linkedin.com/company/taxnova-ai/ Taxnova's Website: https://taxnova.ai/ To learn more about Ben check out the links below: Subscribe to Ben's daily metrics newsletter: https://saasmetricsschool.beehiiv.com/subscribe Subscribe to Ben's SaaS newsletter: https://mailchi.mp/df1db6bf8bca/the-saas-cfo-sign-up-landing-page SaaS Metrics courses here: https://www.thesaasacademy.com/ Join Ben's SaaS community here: https://www.thesaasacademy.com/offers/ivNjwYDx/checkout Follow Ben on LinkedIn: https://www.linkedin.com/in/benrmurray

Developer Tea
Building Real Skills During the AI Boom - No, Not That Kind of Skill

Apr 22, 2026 30:16


The Coding-Is-My-Value Trap: For years, we've treated the ability to write code as the flagship skill of software engineering. It's concrete, it's teachable, it's the thing big box stores sell kits for. But conflating "what I enjoy about the job" with "what I'm actually valuable for" is dangerously reductive — and AI is now exposing that gap. The Skills You've Been Discounting: Domain expertise, systems thinking, risk and bottleneck analysis, organizational design, tech-lead-level sequencing of work, relational skills that unblock hard moments in a company's life. These have always been where a lot of your real value lived. You probably just weren't writing them down. The Three-Part Framework — Valuable, Durable, Transferable: A skill worth investing in hits as many of these as possible. Valuable means it meets a clear business need. Durable means it survives industry shifts. Transferable means it applies across domains and scales up as you grow more senior. What "Durable" Actually Means: Ask yourself: what would have to change for this skill to become obsolete? Coding, on its own, has a lower durability answer than it used to. Relationship building, architectural thinking, and the ability to reason about complexity require much bigger shifts before they stop mattering. Transferability Is Vertical, Not Just Lateral: Don't just ask whether a skill moves across industries. Ask whether it keeps paying off as you move into more senior, higher-leverage roles. Soft skills, systems thinking, and mental models like compound interest compound themselves the further up you go. Episode Homework: Make your own list. Which of your skills are valuable, durable, and transferable? Every engineer's list looks different — and the ones you've been quietly discounting are often the ones that matter most going forward.

Dear Nikki - A User Research Advice Podcast
Claude CoWork + UXR Deliverables

Apr 20, 2026 28:58


I hate making deliverables. So I made Claude do it and accidentally built something better than any journey map I've ever made.
Research deliverables f*cking suck. I'm a words person. I do not have a design bone in my body, not even the tip of a pinky bone. I will write you a beautiful report. I will not make you a beautiful journey map and yet somehow half my job is making beautiful journey maps.
So when clients started asking for dynamic deliverables lately, my honest reaction was that I don't even like static deliverables, how am I going to make dynamic ones?!?!?!? I'm not even dynamic.
Then I sat down with Claude Cowork one evening, uploaded two screenshots of a bogstandard journey map I made years ago at a now-defunct travel company, wrote an embarrassingly bad prompt (“how can we make this more dynamic”), and watched something kind of amazing happen. This video is me playing around, live, with very little plan. Things break. I yell at Claude (who, for the record, is a dude). I burn through some Lovable credits I was saving for actual work. And by the end I've got something I genuinely wish I'd had ten years ago.
What I cover:
* From two flat screenshots to an interactive, segmented, three-persona journey map in one prompt. I fed Claude the world's most standard journey map with goals, tasks, pain points, quotes, the usual, and a prompt that I am not proud of. Claude came back with a toggleable map across three user types (occasional, first-time, power), moments of truth, a backstage layer, channel dimensions, impact-vs-effort opportunities, and pain points ranked high, medium, and low. Not perfect but already more useful than anything I've ever built by hand.
* The real journey is a mess and Claude can finally show that. Every journey map I've ever made has been a lie. Real people don't walk in straight lines from Awareness to Conversion. They loop, they abandon, they come back three days later on a different device. We simplify, since showing the mess is genuinely hard and takes forever. I told Claude I wanted the realistic version. It gave me a more real scenario: 12 days, 7 sessions, 5 backtrack loops. You can scrub through it on a timeline. That sentence alone is more interesting to a stakeholder than my entire old journey map.
* Pain points linked straight to Jira tickets. Opportunities linked to prototype prompts. The whole point of a journey map is to make someone do something. So I had Claude wire each pain point and opportunity to backlog tickets (manual now, Jira-connected later) and then try to link out to Lovable and Figma Make with a prompt pre-built to prototype the fix. Some of it 404'd. All of it pointed at something I couldn't have dreamed of making myself a year ago.
* This is a genuinely bad prompt and I want you to see it anyway. I'm a huge fan of prompt engineering. The prompts in this video are not that. I left them in on purpose. When you're experimenting, perfect prompts aren't the point, throwing stuff at the wall is. You do not need a flawless prompt to get something useful. You need to start.
* The thing we used to beg others to build, we can now build ourselves. When I was making journey maps early in my career, I printed personas and taped them in bathrooms so people would actually look at them. That was the bar. Now I can hand a stakeholder something interactive they can click through, segment, and give feedback on without a wait. It's not “AI replaces researchers,” it's “researchers finally have the means, the power, and the resources to make the things we've always wanted to make.”
This is me experimenting live. A proper walkthrough with real prompt engineering is coming. I wanted to get this out now anyway, since it's the most fun I've had with research deliverables in years, and I want you to go try it too.
Watch the full thing above then go make something ugly and cool.
Want Claude working like this without the “oh god what's my prompt” moment?
I built the UXR Claude Skills Bundle for exactly this reason, 52 research skills installed directly into Claude, so the right framework shows up the moment you need it. Journey maps, TEDW interview guides, the insight formula, the Pyramid Principle for reports, the whole toolkit with no re-explaining, no starting from a blank chat every time.
One-time $49, lifetime access (updates included), and installs in 5 minutes.
Get the Bundle →
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.userresearchstrategist.com/subscribe

Scrum Master Toolbox Podcast
BONUS From 3,000 Scripts to 3 Tools - Building AI-Last Software With Peter Swimm

Apr 18, 2026 31:28


BONUS: From 3,000 Scripts to 3 Tools - Building AI-Last Software With Conversational AI Pioneer Peter Swimm In this special BONUS episode, Peter Swimm—conversational AI veteran, creator of BotKit (the open-source chatbot framework that powered Slack and Teams bots), and former Principal Product Manager at Microsoft Copilot Studio—shares what 25+ years in tech taught him about working with AI. From his brutal experiment of running an entire business on voice-based AI for a week, to why he treats AI more like R2-D2 than C-3PO, Peter offers a grounded, practical perspective on where AI fits in software development teams. From BotKit to Copilot Studio: A Front-Row Seat to the AI Evolution "We had the number one bot in the Slack app store, because there were only 8 bots, and ours used regex. To show you how far we've come."   Peter's journey into conversational AI started with a newspaper ad and a creative writing background. When Slack launched its API, Peter and BotKit co-creator Ben Brown immediately saw that building bots wasn't just a technical challenge—it was a social and creative one, like writing scripts for plays that interface with people in their daily lives. That insight powered BotKit into becoming the backbone of Slack and Teams bots, and eventually led to Microsoft acquiring the company. Peter spent years inside Microsoft shaping Copilot Studio, working on connectors that bridge the gap between APIs and real-world work. But the experience also gave him a healthy dose of perspective: he can show you slide decks from 2016 that promise the same things today's AI pitches promise, always saying "within 5 years." That pattern recognition shapes his practical, no-hype approach. The 3,000 Scripts Experiment: Why AI-Last Beats AI-First "At the end of the day, if I've been prompting all day, I should have a computer program that works offline, that works without a subscription. Otherwise, I didn't really make anything."   
Peter ran a week-long experiment trying to run his entire business using only voice-based conversational AI. The result: 3,000 generated scripts. After static code analysis, he discovered it was really only 5 programs made thousands of times—and those 5 programs were really just 2 or 3 core abilities. He deleted 36 gigabytes of generated code and kept 50 megabytes of what actually worked. This brutal compression led him to an "AI-last" philosophy: build reliable runtime software that works confidently in one click, then use AI only for exploration, connection-making, and creative riffing. The payoff is striking—within 3 weeks of a given application, his team sees a 90% reduction in AI usage in the first week, dropping to 0% within 13 days, because once a computer program does everything you need, you don't need AI anymore. R2-D2, Not C-3PO: How to Think About AI on Your Team "I think of our AI use more like R2-D2 than C-3PO. R2-D2 doesn't talk—bonus points. He doesn't interject his fear. He saves your butt. He's silent until you need him, and visible when you need him."   Peter's Star Wars analogy captures his team's philosophy on AI integration. AI should be like a smarter linter—a quiet, capable tool that handles the boring, repetitive tasks so humans can focus on creativity and shipping. His team treats AI as a "super junior" with infinite time: set it up as if it invented Python, have it write by-the-book code with unit tests, and then a human reviews and accepts (or rejects) the output. The tooling isn't consistent enough to ship autonomously or commit directly into the codebase—even frontier providers don't fully understand what their models do. The practical benefit is enormous for setup and configuration: what used to be a painful, arcane process of tracking down dozens of AWS or Azure docs becomes a 20-minute "hello world" that's actually a working proof of concept. Your job isn't to become an expert at cloud services—it's to ship product. 
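The static-analysis step in the 3,000-scripts experiment above (finding that thousands of generated scripts were really a handful of programs) could be approximated along these lines. This is a sketch under stated assumptions, not Peter's actual tooling: it assumes the scripts are Python, uses the standard-library `ast` module, and treats two scripts as "the same program" when they match after local names and literal values are erased. The helpers `fingerprint` and `group_duplicates` are hypothetical names.

```python
import ast
import builtins
import hashlib
from collections import defaultdict

_BUILTINS = set(dir(builtins))


class _Normalizer(ast.NodeTransformer):
    """Erase local identifiers and literal values so only code structure remains."""

    def visit_Name(self, node: ast.Name) -> ast.AST:
        # Keep builtins such as sum/len so genuinely different calls stay different.
        if node.id not in _BUILTINS:
            node.id = "_"
        return node

    def visit_arg(self, node: ast.arg) -> ast.AST:
        node.arg = "_"
        self.generic_visit(node)
        return node

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.AST:
        node.name = "_"
        self.generic_visit(node)
        return node

    def visit_Constant(self, node: ast.Constant) -> ast.AST:
        node.value = None
        return node


def fingerprint(source: str) -> str:
    """Hash of the normalized syntax tree; equal hashes mean structurally equal code."""
    tree = _Normalizer().visit(ast.parse(source))
    return hashlib.sha256(ast.dump(tree).encode()).hexdigest()


def group_duplicates(scripts: dict[str, str]) -> list[list[str]]:
    """Group script names by fingerprint, largest group first."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, source in scripts.items():
        groups[fingerprint(source)].append(name)
    return sorted(groups.values(), key=len, reverse=True)
```

Running `group_duplicates` over a directory's worth of sources would yield a few large groups if the scripts are near-copies, which is the shape of result the episode describes. Structural hashing is deliberately crude: it will miss copies that differ by an extra statement, so a real audit would likely combine it with fuzzier similarity measures.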
The Biggest Mistake: Automating Broken Processes at AI Speed "All it does is automate all the mistakes you made, all the way, at AI speed."   When asked about the most common mistake organizations make with AI, Peter is blunt: they port their existing infrastructure into AI-governed systems instead of rebuilding from the ground up. Companies with a self-inflated opinion of their processes think AI is just a million-person force multiplier—so they'll ship faster. But if your process was broken before AI, you'll just generate broken output at unprecedented scale. That 3,000-script experiment proved this firsthand. Peter's recommendation: rebuild from the bolts up. Start with AI-last architecture where reliable, offline-capable software handles the core, and AI is reserved for the edges—filling gaps, translating between systems, and making connections that don't exist yet. SaaS Is Bloated: The Case for AI Transformation Layers "The one thing AI is good at is transforming between boundaries."   Peter's team has been divesting from SaaS providers, replacing the patchwork of middleware subscription plans that forced everyone to copy and paste between CMS, Excel, meeting notes, and email. His approach: product people use Notion, developers use GitHub, and the two cross-sync without needing Jira as an arbitration layer. Everyone tracks work in the tool they already live in. AI's real superpower here is translation—between APIs, between languages, between formats. Peter sees a future where small translation layers between CRUD operations replace the bloated, one-size-fits-all SaaS tools that are "built for 99% of users with generalized features nobody uses." His team also freed themselves from tools like Figma: the designer works in their preferred graphics program, the developer in their preferred IDE, and AI arbitrates the differences. 
Teams, Velocity, and Reinvesting the AI Dividend "5 to 7 people is still good, because you need a diverse set of people who are intensely focused on certain areas. But they should be allotted that savings in time to ship all the things that get cut."   Peter pushes back on the idea that AI changes the ideal team size. The 5-to-7 person team still works—what should change is what those people do with the time they save. Instead of loading teams onto more projects or increasing portfolio velocity, reinvest the AI productivity dividend into quality: ship with unit tests from day one, ship WCAG-compliant from day one, and stop cutting features to hit deadlines. Version 1.0 should no longer need an immediate 1.1 follow-up. Peter also challenges the notion that AI eliminates the need for experienced practitioners—velocity metrics become meaningless when a 6-week coding plan finishes in 25 minutes. What matters is using the saved time to make software genuinely better. The Future: Demo-First Development and Solid Releases "I can show you a working demo of the thing at the first meeting, and you can pay for it. And then we can make it better than your dreams."   Peter sees AI transforming the consulting and product development lifecycle from "launch, listen, and learn" to "listen, iterate, and launch." As a consultant, he now brings working demos to first meetings instead of $20,000 six-week proposals. Clients see the product in motion and immediately identify improvements—before money changes hands. This shifts the power dynamic: products iterate toward quality before launch, not after. Peter envisions a future where we ship solid releases that iterate in quality, with interfaces that show users only what's relevant to them instead of "90,000 buttons that don't apply to me."   
About Peter Swimm   Peter Swimm is a conversational AI veteran with 25+ years in tech — from managing data centers to building Botkit (the open-source chatbot framework that powered Slack and Teams bots), to serving as Principal Product Manager at Microsoft Copilot Studio. He's the founder of Toilville, a consultancy helping businesses build conversational AI solutions.   You can link with Peter Swimm on LinkedIn and visit his website at peterswimm.com.

Techmeme Ride Home
Reed Hastings Rides Into The Sunset

Apr 17, 2026 22:15


Netflix beat on revenue and income but dropped 10%+ on weak Q2 guidance as Reed Hastings exits the board. Anthropic launches Claude Design, OpenAI overhauls Codex Desktop with computer control, and DeepSeek seeks its first outside funding at $10B+.
Netflix reports Q1 revenue up 16% YoY to $12.25B, vs. $12.2B est., net income up 83% YoY to $5.28B, and forecasts Q2 EPS and revenue below est.; NFLX drops 10%+ (Bloomberg)
Anthropic launches Claude Design, a new experimental product that lets users create visuals like prototypes, slides, one-pagers, and more using Claude (TechCrunch)
Sources: Dario Amodei is set to meet with WH Chief of Staff Susie Wiles on Friday, a breakthrough in Anthropic's effort to resolve its fight with the Pentagon (Axios)
OpenAI updates its Codex desktop app with features like computer control, an in-app browser, image generation, automation memory, plugin support, and more (ZDNet)
Sources: DeepSeek is in talks to raise outside capital for the first time, seeking at least $300M at a valuation of at least $10B (The Information)
Longreads:
India produces 1.5M+ CS graduates annually, but AI coding tools are forcing its $315B IT outsourcing industry into an existential reckoning (Bloomberg)
Doug Liman's $70M movie Bitcoin: Killing Satoshi uses AI for sets, lighting, and more in post-production, cutting costs from an estimated $300M (The Wrap)
Defunct startups are being liquidated for their Slack archives, Jira tickets, and email threads—operational exhaust that AI labs now treat as premium training data (Forbes)

The Jira Life
Examining Atlassian Certification Changes (with Kristjan Mathiesen)

Apr 16, 2026 65:49


In this episode, we sit down with Kristjan Mathiesen, who recently joined Atlassian's Certification team as a Technical Subject Matter Expert, to take a comprehensive look at one of the most valuable — and often underutilized — resources in the Atlassian ecosystem: the certification program.Kristjan walks us through the full history of Atlassian credentialing, from the early days of the Atlassian Certified Professional (ACP) exams — high-stakes, proctored assessments designed to validate deep product knowledge — through the evolution of the program into a broader ecosystem of credentials. We cover the early ACP exams, how they came to be, why they were built the way they were, and who they were designed to serve. We also dig into the Atlassian Certified Associate (ACA) and Atlassian Certified Hands-on (ACH) credentials — the latter being a lower-stakes, non-proctored, open-book format aimed at a different audience entirely.Then we look at where the program stands today. All exams now live on the Certiverse platform, making the entire certification journey fully online — no more trekking to a testing center. Other changes are afoot this year, rolling out with the release of the updated ACP-520.We close the episode with a bigger-picture conversation about the value of credentials in today's job market. Kristjan shares his perspective on why certifications matter more than ever in a world where learning moves fast — where a college degree still carries weight, but hands-on, specialized credentials are increasingly what validates real-world expertise to employers.Whether you're already certified, considering your first exam, or just curious about where the Atlassian credentialing program is headed, this episode is for you.Subscribe, leave a review, and let us know — are you certified?Thank you to ikuTeam for connecting and collaborating with The Jira Life. 
https://ikuteam.com

The Jira Life
=====================================
Having trouble keeping up with when we are live? Sign up for our Atlassian Community Group!
https://ace.atlassian.com/the-jira-life/
Or follow us on LinkedIn!
https://www.linkedin.com/company/the-jira-life/
Become a member on YouTube to get access to perks:
https://www.youtube.com/@thejiralife/join

Hosts:
- Alex "Dr. Jira" Ortiz
  https://www.linkedin.com/in/alexortiz89/
  https://www.youtube.com/@ApetechTechTutorials
- Rodney "The Jira Guy" Nissen
  https://www.linkedin.com/in/rgnissen/
  https://thejiraguy.com
- Sarah Wright
  https://www.linkedin.com/in/satwright/

Producer:
- "King Bob" Robert Wen
  https://www.linkedin.com/in/robert-wen-csm-spc6-a552051/

Executive Producer:
- Lina Ortiz

Music provided by Monstercat:
=====================================
Intro: Nitro Fun - Cheat Codes
https://www.youtube.com/c/monstercat
Outro: Fractal - Atrium
https://www.youtube.com/c/monstercatinstinct

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Notion's Token Town: 5 Rebuilds, 100+ Tools, MCP vs CLIs and the Software Factory Future — Simon Last & Sarah Sachs of Notion

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Apr 15, 2026 77:17


For all those who missed out on London, see you in Miami next week!

Notion, the knowledge work decacorn, has been building AI tooling since before ChatGPT, with many hits, from Q&A in 2023 to unified AI in 2024 and Meeting Notes in 2025. At the end of their last Make user conference, Ryan Nystrom teased Notion 3.0's Custom Agents - and they are finally embracing the Agent Lab playbook!

Sarah Sachs and Simon Last of Notion join us for a deep dive into how Notion built Custom Agents, why it took years and multiple rebuilds to get right, and what it means to turn a productivity tool into an agent-native system of record for enterprise work.

We go inside the product, engineering, evals, pricing, and org design decisions behind one of the most ambitious AI product efforts in software today, from early failed tool-calling experiments in 2022 to agent harnesses, progressive tool disclosure, meeting notes as data capture, and the long-term vision for software factories and agentic work.

We discuss:

* Sarah and Simon's path to launching Notion Custom Agents, and why the feature was rebuilt four or five times before it was ready for production
* Why early agent attempts failed: no tool-calling standard, short context windows, unreliable models, and too much complexity exposed to the model
* The “Agent Lab” thesis: not just wrapping a model, but understanding how people collaborate and building the right product system around frontier capabilities
* How Notion thinks about roadmap timing: not swimming upstream against model limitations, but also building early enough that the product is ready when the models are
* Why coding agents feel like the kernel of AGI, and how Notion is thinking about “software factories” made up of agents that spec, code, test, debug, review, and maintain codebases together
* How Sarah runs AI engineering at Notion (“notes from Token Town”): objective-setting over idea ownership, low-ego teams comfortable deleting their own work, and a culture designed to swarm around fast-changing opportunities
* The “Simon Vortex,” company hackathons, and why security gets pulled in early rather than late
* How Notion organizes AI: core AI capabilities and infrastructure, product packaging teams, and a broader company mandate that every product surface must increasingly work for both humans and agents
* Why prototypes have become much easier to build internally, and how “demos over memos” changes product development inside a tool the whole company already uses every day
* Notion's eval philosophy: regression tests, launch-quality evals, and “frontier/headroom” evals that intentionally only pass ~30% of the time so the company can see where model capabilities are going
* What a “Model Behavior Engineer” is, and why Notion treats eval writing, failure analysis, and model understanding as a distinct function rather than just software engineering
* The changing role of software engineers in the age of coding agents, and why the new job looks less like typing code and more like supervising a rigorous outer system of agents, PRs, and verification loops
* How the “software factory” should work: specs, self-verification, bug flows, subagents, and minimizing human intervention while preserving the invariants that matter
* A live walkthrough of a Notion Custom Agent handling coworking space tenant applications by triaging email, enriching applicants with web search, and writing structured data into a Notion database
* How agents compose inside Notion: shared databases as primitives, agents invoking other agents, “manager agents” supervising dozens of specialized agents, and memory implemented simply as pages and databases
* Notion's take on MCP vs CLI: why Simon is bullish on CLI's self-debugging nature, where MCP still makes sense, and how Sarah thinks about capability, determinism, permissioning, and pricing alignment
* The evolution of Notion's internal agent harness: from early JavaScript coding agents, to custom XML, to Markdown and SQL-like abstractions, to tool definitions, progressive disclosure, and a much shorter system prompt
* Why Notion cares about teaching “the top of the class,” building for sophisticated operators rather than abstracting away too much capability for everyone
* How agent setup works today: agents that can configure themselves, inspect their own failures, and edit their own instructions, with guardrails around permissions
* How Notion prices Custom Agents: credits as an abstraction over tokens, model type, serving tier, web search, and future sandbox costs; why usage-based pricing was necessary; and how “auto” tries to match the right model to the right task
* Why Notion is not eager to train a foundation model, where they do fine-tune and optimize today, and why retrieval/ranking is one of the most important investment areas as more searches come from agents rather than humans
* Why Meeting Notes became one of Notion's strongest growth loops: not just as transcription, but as high-signal data capture that powers search, custom agents, follow-up workflows, and the broader system of record for company collaboration
* Why Notion is more interested in being the place where collaboration data lives than in building hardware themselves, and how wearables or other capture devices may eventually feed into that system

Sarah Sachs
LinkedIn: https://www.linkedin.com/in/sarahmsachs
X: https://x.com/sarahmsachs

Simon Last
LinkedIn: https://www.linkedin.com/in/simon-last-41404140
X: https://x.com/simonlast

Timestamps

* 00:00:00 Introduction and launching Notion Custom Agents
* 00:01:17 Why Notion rebuilt agents four or five times
* 00:03:35 Building for where models are going, not just where they are
* 00:05:32 The Agent Lab thesis, wrappers, and product intuition
* 00:08:07 User journeys, leadership, and low-ego AI teams
* 00:13:16 The Simon Vortex, hackathons, and bringing security in early
* 00:16:39 Team structure, demos over memos, and building for agents
* 00:20:25 Evals, Notion's Last Exam, and the Model Behavior Engineer role
* 00:27:37 Evals as an agent harness and the changing role of software engineers
* 00:30:42 The software factory: specs, verification, and agent workflows
* 00:32:18 Live demo: a custom agent for coworking space applications
* 00:35:08 Composing agents, manager agents, and memory as pages
* 00:38:15 Notion Mail, Gmail, native integrations, and tools
* 00:39:43 MCP vs CLI and the cost of capability
* 00:44:13 When Notion uses MCP vs building its own integrations
* 00:47:43 The history of Notion's agent harness rebuilds
* 00:55:35 Power users, public tools, and the setup agent
* 00:58:01 Self-fixing agents, permissions, and “flippy”
* 01:01:13 Pricing, credits, and choosing the right model automatically
* 01:09:01 Why Notion isn't training its own frontier model
* 01:14:07 Retrieval, ranking, and search built for agents
* 01:17:27 Meeting Notes as data capture and workflow automation
* 01:21:18 Wearables, hardware, and Notion as the system of record
* 01:23:45 Outro

Transcript

[00:00:00] Alessio: Hey everyone. Welcome to the Latent Space podcast. This is Alessio, founder of Kernel Labs, and I'm joined by swyx, editor of Latent Space.

[00:00:11] swyx: Hello. Hello. We're back in the beautiful studio that, uh, Alessio has set up for us with Simon and Sarah from Notion. Welcome.

[00:00:18] Sarah Sachs: Thanks for having us.

[00:00:19] Alessio: Thanks for having us. Yeah.

[00:00:20] swyx: Congrats on the launch recently. The custom agents, finally it's here. How's it feel?

[00:00:26] Sarah Sachs: We ship things slowly. So it had been in alpha for a little bit, and at the point at which it's in alpha, um, there's a group of people that are making sure it's ready for prod, and then there's a group of people working on the next thing. So sometimes some of these launches are a bit delayed satisfaction, so it's quite nice to remind yourself all the work you did, because we do have a habit of, like, being two or three milestones ahead.
Uh, just 'cause you have to be, you know, you can't get complacent. Um, but it's been great that people understood how this is helpful. And I think that's just easier in general building AI tools today than it was two, three years ago. People kind of get it and so that user education, um, there's just, it was our most successful launch in terms of free trials and converting people and things like that. It was really successful, so yeah. But there's a lot to build.

[00:01:12] swyx: Making it free for three months helps.

[00:01:16] Sarah Sachs: Yep.

[00:01:17] Simon Last: It was definitely super exciting for me because it's probably the fourth or fifth time that we rebuilt that.

[00:01:22] swyx: Yes.

[00:01:23] Simon Last: And I mean,

[00:01:24] swyx: you've been building this since like 2022.

[00:01:26] Simon Last: Yeah, I mean, like, it was even right when we got access to, like, GPT-4 in late 2022, one of the first ideas we had is like, oh, okay, let's make an agent. We used the word assistant at the time; there wasn't really the word agent yet. But, oh, we'll give it access to all the tools that Notion can do, and then it will run in the background and, like, do work for us. And then we just tried that many times and it just was too early. Um,

[00:01:48] swyx: I need to force you to like double click on that. What is too early? What didn't work?

[00:01:52] Sarah Sachs: We were fine-tuning, like, before function calling came out. We were trying to fine-tune with the frontier labs and with Fireworks, like, a function-calling model on Notion functions. This is right when I joined. I joined because, um, we needed a manager, as Simon needed to be able to go on vacation.
So, uh, that's, that's around when I joined, so you can speak much more to it.

[00:02:11] Simon Last: Yeah, we did partnerships with both Anthropic and OpenAI at different times, uh, to try to, at the time the, I mean, when we first tried, there wasn't even a concept of, like, tools yet. We, we sort of designed our own, like, tool calling framework and then we tried to fine-tune the models to, uh, to use it over multiple turns. Um, and because it, it didn't work well out of the box, I think. Yeah. The models are just too dumb and the context thing was also way too short.

[00:02:37] Alessio: Yeah.

[00:02:37] Simon Last: Um, and yeah, we just kind of banged our head against it for a long time. Uh, unfortunately it was always like, there was always, like, sort of glimmers that it was working, but, um, it never felt quite robust enough to be like a useful, delightful thing. Um, until I would say, uh, the big unlock was probably like Sonnet 3.6 or 3.7, uh, early last year. And that's when we started working on our agent, which we shipped last year. Um, and then, and then, uh, custom agents, kinda a similar capability, and that, that one just took longer because we, we just wanted to get the reliability up a lot higher. 'Cause it's actually running in the background.

[00:03:14] Sarah Sachs: And the product interface of, like, permissions and understanding, you know, this custom agent is shared in a Slack channel with X group of people and has access to documents that are surfaced to Y group of people. And the intersection of X and Y might not be whole. And so how do you build the product around making sure administrators understand that? Permissioning took multiple swings.

[00:03:35] Alessio: Everything is RBAC at the end of the day. Yeah.
I'm curious, like when the models are not working, how do you inform the product roadmap of like, okay, we should probably build, expecting the models to be better at some reasonable pace, but at the same time we need to, you know, you had a lot of customers in 2022. It's not like you were a new company or like no user base.

[00:03:54] Simon Last: Yeah, I mean I think there's always the balance of, you know, like you want to be AGI-pilled and thinking ahead and building for where things are going. Uh, but also you wanna be, like, shipping useful things. And so we always try to, like, keep a balance there. You know, we try to take, like, a portfolio approach. You know, we're always working on multiple projects and, and we're always trying to work on, you know, maintaining things that have already shipped, like, shipping new things that are, like, eminently working well and make them really good. And, and then we wanna always have a few projects that are a little bit crazy. Um,

[00:04:23] Alessio: and what are the AGI-pilled projects that you have today? I'm curious about, uh, you don't have to share exactly what you're working on, but I'm curious what are things today that maybe in 18 months people will be like, oh, obviously this was gonna work.

[00:04:35] Sarah Sachs: 18 months.

[00:04:37] Alessio: Yeah, 18 months is, you know,

[00:04:37] Sarah Sachs: it's a long time and Yeah. Yeah.

[00:04:39] Simon Last: I mean, there's a number of things happening. I think one thing that's becoming more clear is I think, like, uh, coding agents are the kernel of AGI. Sort of, everything is a coding agent. Mm-hmm. I think that's, that's sort of one, one direction. Um, and then, yeah, the exciting thing about that is sort of your agent can sort of bootstrap its own software and capabilities and actually debug and maintain them. And so yeah, we're, we're, we're thinking a lot about that.
And then, yeah, like, another category of things that I'm, I'm really excited about is, like, uh, what we call the software factory. Also, people are using this, uh, this, this sort of word. Um, basically it just means can you create sort of like a, as automated as possible, a workflow for developing, debugging, merging, reviewing, and maintaining a code base and a service where there's a bunch of agents working together inside, and, like, how does that work?

[00:05:28] Sarah Sachs: If you think back to your initial question, like, why did this take so long? I think something,

[00:05:32] swyx: I didn't say that, but Yes. Okay. Go ahead.

[00:05:34] Sarah Sachs: Why, what, what changed over the three and a half years of trying it?

[00:05:37] swyx: Exactly. Right. Because most people always say like, it didn't work yet. Then reasoning models came, then it worked. I was like, okay, let's go a little bit.

[00:05:43] Sarah Sachs: That's, I mean, that's part of it, but I think the other part of it that I actually think is really what will set Notion apart for every new capability is we have, like, two skills that are crucial when it comes to frontier capabilities. One is not letting yourself swim upstream. So like quickly realizing if you're just pressing against model capabilities versus not exposing the model to the right information, not having the right infrastructure set up. That in and of itself is the skill of intuition. And the second is to see, okay, you're not swimming upstream. Which direction is the river flowing and what is like, how do we think ahead about the product and start building it even if it's not great yet, so that when it is there, we're ready for it. Right? And like those can sometimes feel like counterintuitive things. Like we can be trying to fine-tune a tool calling model when they don't exist yet. And that the trick is to not do that for too long, but realize that there was something there.
And we've had a lot of things which, like, um, we're just, like, not swimming in the right direction with the streams. I think we had multiple versions of transcription before we got meeting notes, right?

[00:06:39] swyx: Oh, I gotta talk about that. Yeah.

[00:06:40] Sarah Sachs: Yeah. Um, and so, I, I, I think that, like, we, we really closely partner with the frontier labs on capabilities and we also have to have strong conviction on, as those capabilities move, Notion is about being the best place for you to collaborate and do your work. And how does that narrative change if the way that we work changes? Yeah.

[00:06:58] swyx: Yeah. You told me you were a fan of the Agent Lab thesis, and this is, this is kind of it, right?

[00:07:02] Sarah Sachs: Right. I show that thesis to so many candidates. Like, I have it as, like, my Chrome autofill. Um, at this point, like, it's one of my most visitations

[00:07:10] swyx: because like, is this the, here's why you should work at Notion and not OpenAI. I, it's like,

[00:07:14] Sarah Sachs: here's, here's what's different about it.

[00:07:16] swyx: Yeah.

[00:07:16] Sarah Sachs: And here's why. It's not just a wrapper. I actually think more and more people understand it's not just a wrapper.

[00:07:21] swyx: Yeah.

[00:07:22] Sarah Sachs: Um, and by the way, like in the beginning, parts of what we build are wrappers on functionality that works well, of course, but that's not really the most, um, I would say that's not the product that, that drives revenue. And that's not necessarily always what users need.

[00:07:35] swyx: I mean, you know, Notion is the AWS wrapper, but like the, the wrapper is very beautiful and like very, very well polished. So

[00:07:40] Sarah Sachs: like the analogy,

[00:07:41] swyx: like

[00:07:42] Sarah Sachs: the analogy that I've been coming back to is Datadog and AWS.

[00:07:45] swyx: Yeah.

[00:07:46] Sarah Sachs: So, uh, Datadog could not exist with, without cloud storage. Right.
That, it's kind of fundamental that that works. Um, and AWS has like a CloudWatch product, but Datadog is an expert on understanding how people want observability on the products they launch. And we're experts in understanding how people wanna collaborate, and that's really where our expertise lies.

[00:08:04] swyx: Totally.

[00:08:04] Sarah Sachs: Um, regardless of the tools that we use,

[00:08:07] Alessio: I'm kind of curious how you think about implicit versus explicit expertise. I feel like Datadog is half and half implicit and explicit. It's like they understand across markets and industries what engineering teams usually look for. With Notion, it's almost like more of the expertise is at the edge because you as a platform, you're like so horizontal that the end user is not really the same. Mm-hmm. Like with Datadog, the end user is always like, yeah, an engineering lead, a kinda like SRE-related person. With Notion, it can be anything. So I'm curious how you put that expertise into a product versus, you know, obviously AWS cannot build Notion. It's, that doesn't quite work in this case, but

[00:08:44] Simon Last: it's, it's a little bit differently shaped. I think, you know, a classic vertical SaaS, like Datadog, is kind of like that. They understand their individual customer very deeply. It's kind of a narrow slice. Um, Notion has always been super horizontal. And our, our task has always been to sort of balance these two somewhat opposing forces of, like, we're listening to our customers and what they want us to build. It's a broad slice. And then also we're thinking about, like, okay, how do we decompose what they want into, uh, nice primitives that are, that are really nice to use and will get us, like, as much bang for the buck as possible. And then, you know, maintain the whole system, make it all, like, super clean and nice to use.

[00:09:22] Sarah Sachs: We still have user journeys. I mean, we still focus on like core.
I actually think the failure of our team is when we focus too much on what are cools that are, what are tools that are

[00:09:31] Simon Last: mm-hmm.

[00:09:31] Sarah Sachs: Cool tools. I actually think that's when we have the least velocity because you still need some sort of focus on a user journey. So like for instance, we'll all sit down every Friday and look at the P99 of like the most token exhaustive custom agent transcript and just look at why it didn't do well and cut a bunch of tasks. Like we still focus on like, this has, like this should work. Email triaging should work. Mm-hmm. Right. And similarly, like when we're talking about before building, um, chatting, um, before we started filming about, okay, how can I do PDF export? Well that's functionality that then merits, maybe we should build a tool that has access to a computer sandbox in a file system and the ability to write code. Right? Right. Um, but it's because we're thinking about the fact that our users, to do their, to do their daily work, need to export PDFs, not because we're like, hmm, I think a computer tool could be cool. Like, let's just see what happens. Mm-hmm. Like we, we have to focus on some user journeys, otherwise we just don't have, like, enough strategy to, to prioritize.

[00:10:29] swyx: I think there's a lot of like really strong opinions that you've had. Do you have like sort of like a Tao of Sarah Sachs? Like, you know, like what, how do you run your team? Like I feel like you just have accumulated all these strong opinions. Obviously part, part of this is your, your Token Town thing.

[00:10:43] Sarah Sachs: I think the Tao of working with Sarah Sachs is, um, you'd have to, it depends who you ask. Um, I think it depends if you're on my team or a partner, right, or a vendor.

[00:10:54] swyx: Yeah. There are other people who want to run their teams the way that you're, yeah, you're like bringing these things.
And then also similarly, uh, Simon, when you did the custom agents demo, you had like, well, we've been using custom agents and here's the super long list of everything that we do. No humans ever read it. Right? That's what you said. I was like,

[00:11:07] Sarah Sachs: yeah. So I think for, for me, um, something that I learned very quickly and became very comfortable with was that my job was not to be the ideas person or the technical expert. My job was to make it so that everybody understood the objective, had a resource to help prioritize what they should work on, and had an avenue to prioritize what they thought was important. And I think that's true with all, all leadership, but I think especially on the AI team. Almost all of our best ideas come from prototypes, from people that have a cool idea because they saw a user problem, and it's a huge disservice if all of those ideas have to pass, like, the sniff test of what me and a product partner or Simon and Ivan decided were the direction, right? Because a lot of what we're doing is leaning into capabilities. So I think that's the first thing, is like, I don't really view, like, the role of engineering leadership as, like, uh, hierarchical, nor has it ever been, but especially now, like, very willing to change direction based on, um, like, proof is in the pudding. Yeah. And like, and I think we have rebuilt our harness three or four times. And when you do that, then the second rule of engineering leadership is, like, you need to build a team that's comfortable deleting their own code and is very low ego and is driven by what's best for the company. And, um, doesn't write design docs because they think it's their promotion packet. Right. And that's a culture that Notion had long before I joined, but like our willingness to just swarm on different problems and, um, redo things that we've built before because something has changed. Like, there's a lot of friction that can happen at companies when you do that.
And it doesn't happen at Notion. And because it doesn't happen, when new people join, like, they don't wanna be the ones that are saying, we shouldn't do this, I wrote that code. So then it's, you know, you, you create a culture that everyone thoughts and that culture comes directly, I think, from Simon and Ivan though, um, because they're very open-minded.

[00:12:50] swyx: Anything that you, you'd add?

[00:12:50] Simon Last: I'm not a manager, like, like, like Sarah is. Um, a lot of my role is really to try to think a little bit ahead, make sure that we're, we're building on the right capabilities and then like the prototyping stuff. And yeah, it's really, really critical to always just be starting again. It's like, okay, this is a new thing. What does this mean? What if we just rethought everything or rewrote everything? And so I, I'm, I'm basically just doing that in a loop every six months.

[00:13:16] swyx: Yeah. Do you believe in internal hackathons for this stuff?

[00:13:19] Sarah Sachs: I think there's like two different versions. So one is like, we just have a, a, a solid bench of senior engineers that come and go on what we call the Simon Vortex and productionizing what we built, right? Because when you're in the Simon Vortex, the velocity is super high. The direction changes daily, and it's meant to be like the equivalent of a Skunk Works lab. We don't need to do hackathons for that. We need to have senior engineers that we trust to come in and out of those projects. For instance, like management boundaries are really loose. Like you report to him, but you work for her right now. Yeah. That's something that when we hire managers, it's important they don't care about because we tend to form more structures.

[00:13:54] swyx: Yeah. Don't be too territorial.

[00:13:55] Sarah Sachs: We form more. It's after we ship things, not, not before, just historically.
Um, the second thing is we do have companywide hackathons. Actually we just had our demos day for the hackathon we had last week this morning. That's more for people that aren't directly working on the project, feeling like they have the time to pause and learn how to make themselves more productive or how they would use Notion custom agents to build something. Or part of the hackathon was actually encouraging everyone across the company to build their own agentic tool-calling loop from scratch. Follow, like, an Every blog post on how to do it, I think, because we want

[00:14:26] swyx: just with the compound engineering one. Yeah.

[00:14:28] Sarah Sachs: We want everyone to use Claude Code in the company, or whatever the coding agent they please, and understand that fundamental. So we set aside a day and a half where all leadership encouraged everyone on their teams across the company to do it. So we have hackathons like that. I would say, like, kind of facetiously, like, everything we build is a little bit like a hackathon until it graduates and puts on big boy pants and has a product ops rollout leader and has assigned data scientists and stuff like that,

[00:14:54] swyx: security review, enterprise stuff,

[00:14:56] Sarah Sachs: actually security review's one of the things that we bring in first because it just slows us down way more and, um, causes a lot of tension and they build better product if they're involved early. So, um, that is probably the first person to get involved in something.

[00:15:09] swyx: That's the right PR approved answer.

[00:15:10] Sarah Sachs: No, but it's not just PR approved. It like, um, um, it's

[00:15:13] swyx: actually real. It's actually real. It's like, um, I'm just saying

[00:15:15] Sarah Sachs: Scar tissue.

[00:15:15] swyx: Yeah,

[00:15:16] Sarah Sachs: because like, you know, my background's also, I worked at Robinhood for a number of years. Yes.
So like, uh, compliance and things like that, um, are a little bit more, you learn the hard way when it doesn't come naturally.

[00:15:26] Simon Last: Yeah. I think the, the hackathon is really important for uplifting the general population, but like, if that's the only way you can build new things, you're kind of toast. I mean, it, it has to be like the daily processes, like, you know, building these new things. Um, and it has to be about, I think like, I think in the AI era a lot more leverage accumulates to the most curious and excited people. And so it's like we're all about just like activating that energy. You know, like if someone's prototyping something on the weekend that they're excited about and it's important, that should be the main thing that we're doing. Yeah. Um, it's not a hackathon that we schedule once a quarter, it's just like, yeah. Daily process. Part of the culture.

[00:16:02] Sarah Sachs: I mean, that's how we shipped image generation in Notion now. It was always this thing that would be kind of nice to have, but it wasn't really clear where that was necessarily aligned in product priorities. It'd be a lot of work. And we had someone on the database collections team, Jimmy, who was like, I really wanna do image generation for cover photos and inside Notion. And we're like, if you wanna build it, like it's, do it please. Like we encourage you. We gave 'em all the resources of working directly with Gemini and being able to like track the token usage and it working through endpoints. We gave them eval support, everything, and then it became a, a full project.

[00:16:34] Alessio: Yeah.

[00:16:35] Sarah Sachs: That's why you can't have like ego as a, a leader. Like that's, that's how we work.

[00:16:39] Alessio: What's the size of the team today, both engineering and overall?

[00:16:43] Sarah Sachs: I manage, uh, the team that's, what we'll call it, core AI capabilities and infrastructure. That's about 50 people.
But then we have partner teams that do packaging. So how it shows up in the corner chat versus custom agents versus meeting notes, that's another 30, 40 people. And, and then every team that has a product surface at Notion that a user can interface with owns the tool that the agent interfaces with. The editor team, the team that did CRDT for offline mode, is the same team that handles how two agents, um, edit competing blocks. Mm-hmm. Right? It's the same problem. The team that built the underlying SQL engine is the same team that owns how the agent asks it to run a SQL query, and it does it performantly. And so from that regard, anyone working on product engineering is tasked with making them work for customers that are humans and agents because over time the majority of our traffic will be coming from agents using our interface, not humans. And so our objective is to make it so that the whole product org is building for agents.

[00:17:40] Alessio: Yeah. How has it changed internally? The activation bar is kind of lowered a lot. Like anybody can kind of create a prototype very, somewhat easily, especially if you're in, like, an existing code base. Have you raised the bar on, like, what type of prototype people need to bring forward to, kinda, be taken, not like seriously, but like, you know what I mean?

[00:17:58] Simon Last: Yeah. I think the bar is lowered in many ways. Like, one thing our, uh, our team built that is really cool is our, uh, our, our design team made a whole separate GitHub repo, uh, called the, the design playground. And it's basically just to create a bunch of, like, helper components and, uh, for, for quickly throwing together UIs. And it's become like actually quite sophisticated. Like it has like an agent in there and like, uh, that's pretty fun. So like, we pretty much, like, they don't do mocks, they just make, like, full, full prototypes.

[00:18:27] swyx: Here it is. It works.

[00:18:28] Simon Last: They give you like a URL.
They're like, okay, all right. So we have to make the, like the real production version of that.Um, and then for engineers. A prototype looks like just making it a feature flag that actually works. Like that's sort of the bar.[00:18:39] Sarah Sachs: Something to understand that's really unique about notion. One of the reasons I joined we're super lucky is no one uses Notion in their job as much as people that work at Notion.[00:18:46] Simon Last: Of course.[00:18:47] Sarah Sachs: So I think there's very few companies, maybe if you worked on Chrome I guess, but like everything that we ship, we ship internally first and get a lot of really quick feedback. And also sometimes our dev instance is totally borked and you have to change a bunch of flags to get things done. And that's kind of like, but everyone, so people that do it ticketing, people that do supply chain procurement, recruiting, everyone is using the same instance of notion with like a lot of flags on for these prototypes people build.Um, and so we have this, Brian Levin, one of the designers on our team, I think evangelize this concept of demos over memos.[00:19:18] swyx: Ooh, too[00:19:20] Sarah Sachs: good. Um, which has been, uh, very good for building demos, and I think it's put a big pressure point on us to have really strong product conviction, because if anything can be demoed, you really need a strong filter of making sure that if you know, you're doing X amount of work, you're making the, you're, you're focusing on one tower, you're not just building a really flat hill.Right. That's actually where I think there has to be more conviction from our PMs, um, and our designers and, and well, the company really to have conviction of what journey we're going on.[00:19:52] Simon Last: But overall, I feel like it works pretty well. 
Like people, almost all the engineers have good enough taste to realize that like, this prototype doesn't actually make sense in the product, or, or it does.So it's not that common that I would see a prototype. It's like, oh, this makes no sense. Mm-hmm. It's like, you know, people are doing reasonable things and, and, and then it's just a matter of. Which things we build first and then often just, just figuring out how to turn it on and off. There's our, in the, in our like experimental chat ui, there's this, there's probably like, like a hundred check boxes in there.[00:20:22] Sarah Sachs: Kills me[00:20:23] Simon Last: the things you could turn on and off.[00:20:25] Sarah Sachs: Uh, but I think that, okay, so that is kind of true, Simon, but like being the person that manages the evals team, like there is a level of intensity that it adds to the platform team. So, you know, if we're gonna do image generation and notion, all of a sudden the way that we do attachments and the way that we, um, our LLM completion like cortex talks and expects tokens back and now it's getting images back.Like there's a lot of platform work that we do need to, like solidify a little bit. So sometimes it'll be in dev for a couple weeks before it makes it to prod just because we still have to like, make it robust, make it HIPAA compliant, ZDR compliant, figure out the right contracting with the vendor, whatever it is.And we need to eval it because we want the team. To still maintain what they build. That's the one thing is like if we have a bunch of prototypes, it can't just be like a small group of people that then maintain whatever end prototypes. So we have invested a lot of people in an eval and model behavior understanding teams that, we call it agent dev velocity.So your dev velocity building agents can be faster if we invest in that platform. 
And so we have a whole org dedicated to agent, um, platform velocity so that you can build your own eval and then maintain it once you ship it. So if a new model release comes out and we... [00:21:38] swyx: Every team maintains their own eval? [00:21:40] Sarah Sachs: We maintain the eval framework. Every team owns their own evals, and a lot of them we've integrated, opt-in, to CI, or we run them nightly, and we have, uh, a custom agent that triggers a team to look at the major failures. That's really critical because we have all these different surfaces now. A lot of it's on the same agent harness, so it's easier to maintain. It's just packaging of different agent harnesses, but new functionality of the agent. Let's say that we wanna update, like, uh, you know, they deprecated Sonnet, um, 4 or whatever it is, and we need to auto update. [00:22:11] swyx: Are they already? That's so... okay. Yeah. Actually, wasn't that long ago. [00:22:14] Alessio: They were just 3.5. [00:22:15] Sarah Sachs: 3.5, 3.7 just got deprecated. [00:22:18] swyx: 3.7, 5.2 or, yeah. [00:22:20] Sarah Sachs: No, it's not 5.2. It's five point... no. Yeah, 5.4 is 40% more expensive than 5.2. So if they deprecated 5.2, you would hear from me about that one. Um, but, uh, another conversation to have. [00:22:35] swyx: I have a cheeky evals question for you. Have you noticed any secret degradation from any of the major model providers? [00:22:40] Sarah Sachs: Secret degradation? [00:22:42] swyx: Like, during the workday, when it's high traffic, it suddenly gets dumber. [00:22:47] Sarah Sachs: Yeah. I mean, we definitely notice flakiness. We've definitely noticed, particularly for some providers, that things are slower during working hours and... [00:22:57] swyx: That's a latency argument, yes. Not a quality argument. [00:22:59] Sarah Sachs: No. 
I think the quality difference that's interesting is, um, even though companies that say they're selling the same, a, it's really into like quanti quantization, but like companies that say they're selling the same model through different vendors, whether it be through first party or Bedrock, Azure, et cetera.We do see different qualities sometimes, and that's not necessarily what's advertised.[00:23:21] swyx: Yeah. Kidney went to the point of like, if we, they shipped like this, like eval across all the providers and it was like very obvious we were secret equalizing and it was very,[00:23:28] Sarah Sachs: yeah. But[00:23:29] swyx: that's very embarrassing.[00:23:30] Sarah Sachs: You know, um, we hire Subprocess to figure that out for us.So we just wanna understand where it's regressing or where it's optimized. And sometimes we're okay with regressions that optimize latency if they're the appropriate regressions. Our job is to make sure we have the evals to understand the changes that are important to us. And even like when we're partnering with labs on pre-releasees of models, they'll send us multiple snapshots.And this is less about quantization, but more just regressions. Like they have shipped models that were not the snapshots that we wanted, and they have changed the snapshots that they shipped based on the feedback that we give. Because our feedback tends to be more enterprise work focused and not coding agent focused.And definitely those can be bummers, like, you know, uh, we know that this wasn't the version you wanted, but we'll help you make it work. I mean, we always make it work, but that definitely happens.[00:24:16] Alsesio: Yeah. Do you have, um, failing evals that you're just hoping, oh, that will have success eventually when a good model comes out?[00:24:23] Sarah Sachs: Uh, I mean, yeah. So I think. I mean, I could talk about this for 60 minutes, so I will limit myself. 
I think it's a real issue when people say evals and it's just like, that's quality, that's like unit, I mean, it's like saying testing. It's not just unit tests, right? So we have the equivalent of unit tests and regression tests. Those live in CI; those have to pass a certain percent, you know, within some stochastic error rate. Then we have, as you're building a product, evals of: these aren't passing right now, and this is launch quality. So we have a report card and we need to, on these categories, you know, be at 80 or 90% of all of these user journeys to launch. And then we have what we call frontier or headroom evals, where we actively wanna be at a 30% pass rate. And that's actually been an effort that we took in partnership with Anthropic and OpenAI in the past maybe two or three months, because we actually hit a point where our evals were saturated and we weren't able to really give insightful feedback other than it wasn't worse. And not only is that not helpful for our partners, it's not helpful for us to understand where the stream is going. You know, going back to that analogy. And so we spent a lot of time thinking about what Notion's Last Exam looks like, right? Mm-hmm. Not just Humanity's Last Exam. Ooh, Notion's Last Exam. Mm-hmm. And, um, there's a lot of, you know, dreams about what that would look like. I know we've talked a lot about benchmarking, um, swyx, but, uh, yeah. Notion's Last Exam is a big thing inside the company and we have people staffed to it full-time, exclusively. Mm. We have a data scientist, a model behavior engineer, and a full-time, um, evals engineer just dedicated to the evals that we pass 30% of the time. [00:25:56] swyx: What you're hiring for? [00:25:57] Sarah Sachs: MBEs? I am hiring. [00:25:58] swyx: What is an MBE? [00:25:59] Sarah Sachs: A model 
Behavior engineers started with a title data specialist before I joined when they were working with Simon on like, uh, Google Sheets and like Simon just needed someone to look through Google Sheets and say, yes, no, this looks bad. This looks good. Right? And so we hired people with kind of diverse linguistics background.We had like a linguistics PhD dropout. Mm-hmm. And a Stanford ate new grad. And they're amazing. And they formed a new function basically. And over time we've built a whole team, um, with a manager who's now kind of reinventing what that role is with coding agents. So they used to be kind of manually inspecting code.Now they're primarily building agents that can write evals for themselves or LLM judges. There's a really funny day I can send you the picture where Simon, about a year and a half ago, was teaching them how to use GitHub. Um, and they're on the whiteboard and it was like, okay, I think it would be so much faster if our data specialists learned how to use GitHub and like learned how to commit these things in Dakota.And, and that was then and now I think, you know, coding has been a lot more accessible. Um, but moving forward it's this mix of like data scientist PM and prompt engineer because there's craft in understanding like even like what models can and can't do things. How do we define like that headroom? How do we define like what a good journey is?Um, is this model better or not? Why is this failing? There's some qualitative work, but then there's also like a lot of instinct and taste to it, and that's not necessarily software engineering. And so we have like very firm conviction and we have had for a number of years now that that is its own career path and we have always welcomed the misfits, so to speak.So we really firmly believe that you don't need an engineering background to be the best at this job. 
And that's what's quite unique about this particular role. [00:27:37] Simon Last: Yeah, this is something that I've been pretty excited about recently: we made an effort basically to treat the eval system as like an agent harness. So if you think about it, like, you know, you should be able to have an agent, end-to-end, download a dataset, run an eval, iterate on a failure, debug, and then implement a fix. And ultimately you should be able to, you know, drive the full process with a human sort of observing the, you know, the outer, uh, system. So yeah, we went pretty hard on that. And that's worked extremely well so far. It's like basically just to turn it into a coding agent, uh, problem. [00:28:11] swyx: Your coding agent or just whatever harness? [00:28:13] Simon Last: No, any coding agent. Yeah, Codex, Claude Code. It should be totally general. Yeah. I think it would be a mistake to, like, fix it on any particular coding agent. At the end of the day, it's just CLI tools. [00:28:21] Sarah Sachs: It's like the same way that you would have a coding agent write the unit test. You should have a coding agent write the eval. [00:28:26] swyx: Yeah. [00:28:26] Sarah Sachs: But there's a lot of supervision in that still. We just don't believe that supervision has to come from software engineers, because a lot of it is like, um, kind of UXR-y and whatever, and these are the people that also triage failures and tell us where we should be investing next. [00:28:40] swyx: Yeah. I'm gonna go ahead and ask a spicy question. Is there a day where there are no software engineers at Notion? [00:28:46] Simon Last: Um, [00:28:46] Sarah Sachs: What does it mean to be a software engineer? [00:28:47] swyx: Exactly. [00:28:48] Simon Last: I mean, I think the way things are going is like we're on some continuum where, 
if you look back three years ago, humans were typing all the code, and then we had autocomplete, so you're typing less of the code. Then we had sort of like agents filling in lines, and now we're getting into agents doing longer-range tasks where you can debug and implement a fix and then verify it works and, you know, get your PR even, like, merged and deployed. I think we're sort of just moving up the abstraction ladder, and then the human role becomes more about observing and maintaining the outer system. There's a stream of agents flowing through, like, merging PRs. What's going off the rails? Like, what do I need to approve? Is there a learning or memory mechanism that works? So it's kind of a hard engineering problem. There's, you know, there's a lot to do there. I think we're just sort of moving up the stack. [00:29:34] Sarah Sachs: The same transition machine learning engineers have made, right? Like, I haven't looked at a PR curve in a while. [00:29:39] swyx: Yeah. You used to do this stuff and now, um, auto research can do it. [00:29:42] Sarah Sachs: Right? Like, I think it depends on what you define as a software engineer. [00:29:46] swyx: Yes. That's changing for sure. [00:29:49] Sarah Sachs: I think every software engineer in Notion this summer went through like this, um, sheer, um, one of our engineering leads of the company called it, like, every software engineer is going through the, uh, identity crisis that every manager goes through, where all of a sudden they realize their ability to write code is less important than their ability to delegate and context switch. And I think that is a transition out of being a software engineer. But... [00:30:12] Simon Last: Yeah. Yeah, there's a critical difference to being a manager, which is that, like, it is actually very deeply technical. 
The problem, you know, humans are very, like, fuzzy, and you can't treat a team of humans like a rigorous system where, you know, PRs flow through and can be in like a blocked status, and then what happens when they're blocked, right? With a set of agents, you actually can do that. And I think there's a lot of interesting technical rigor that goes into that. It's a technical design problem, ultimately. [00:30:42] Alessio: What is the design of the software factory that you're building? [00:30:46] Simon Last: Yeah, I mean, I think we're trying a lot of different things. I mean, ultimately you want to design a system that requires as little human intervention as possible, but still maintains the invariants that you care about. So yeah, we're exploring a lot of different ideas there. I mean, I could talk about a few things I think are important there. Like, one thing I think is really important is, um, having some kind of specification layer. You can just commit markdown files. Mm-hmm. That works pretty well, but... [00:31:15] swyx: It's nice to be Notion, man. I'm just saying, like, the spec, like, yeah. The natural home for specs is Notion. [00:31:21] Simon Last: Yeah. Right. It can be a database of pages. Yeah. I mean, it needs to be something that is, you know, human-readable and reviewable, and I think that's pretty key. Another really key component is the self-verification loop. Yes. You need really, really good testing layers, basically. And that's a really deep, uh, problem. But by getting that right, you know, and then it's kinda like the workflow of, like, what happens when there's a bug? How does it flow into the system? Like, is it a subagent working on it? How does it make a PR, and how does that get reviewed and merged? And then, you know, so there's like the flow or process. [00:31:56] swyx: Yeah. Cool. 
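To make the "PRs flow through and can be in a blocked status" idea concrete, here is a minimal sketch of an agent change moving through explicit statuses. Everything in it is an assumption for illustration: the status names, the `ALLOWED` transition table, and the `advance` and `after_verification` helpers are hypothetical and are not described in the conversation as Notion's actual system.

```python
from enum import Enum


class Status(Enum):
    OPEN = "open"
    VERIFYING = "verifying"
    BLOCKED = "blocked"
    IN_REVIEW = "in_review"
    MERGED = "merged"


# Hypothetical transition table: which status may follow which.
ALLOWED = {
    Status.OPEN: {Status.VERIFYING},
    Status.VERIFYING: {Status.IN_REVIEW, Status.BLOCKED},
    Status.BLOCKED: {Status.VERIFYING},  # an unblocked change is re-verified
    Status.IN_REVIEW: {Status.MERGED, Status.BLOCKED},
    Status.MERGED: set(),  # terminal state
}


def advance(current: Status, target: Status) -> Status:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def after_verification(tests_passed: bool) -> Status:
    """Route a verified change: passing work goes to review, failing work is blocked."""
    target = Status.IN_REVIEW if tests_passed else Status.BLOCKED
    return advance(Status.VERIFYING, target)
```

The point of writing the transitions down as data is that the invariant (nothing merges without passing verification and review) is enforced by the table rather than by each agent's judgment, which is roughly the kind of rigor the speakers say is possible with agents but not with a team of humans.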
Uh, you know, one thing we did work out before you guys came in was this demo or this[00:32:01] Simon Last: agents[00:32:02] swyx: agent demo.Uh,[00:32:03] Simon Last: so every,[00:32:04] Alsesio: every time we do an episode, we try the product. Right. I don't think there's ever been an episode that I haven't tried. Yeah. Um,[00:32:11] swyx: and we, we try, try is a, a big word. Like since day one lane space has been on Notion, but this is the, this is the net new thing. Yes.[00:32:18] Alsesio: So this is for Nel Labs, which is the space we're in.So next week we're opening applications for tenants. So there's a web form, let me, we got this form done here. Uh, so, uh, before. Uh, the workflow would be I get an email, then I look at the person. It was like, should I spend time talking to this person? Then I respond, they respond back. So I build this. So the name it came up for on its own.Can you maybe h how do, how does it come up with its own name?[00:32:43] Simon Last: Yeah, that's a pretty app name. It's, it, it is just a random, it's a random, a name generator.[00:32:47] Alsesio: Oh, that's funny. It just came,[00:32:49] Simon Last: the fact that it picked that is, is kind of hilarious. I'm pretty sure it's just determined,[00:32:54] Sarah Sachs: resilient collector. I, I think I've never looked at the code for that.I've never second guessed it. I think it's kind of like a madlib situation.[00:33:00] Simon Last: Yeah, I think you're right. Yeah. It's, it's totally a, a deterministic. Oh, I thought it was great. Yes. Although, although when the, if you use the AI to set itself up, it can update its own name, so. Okay. Um,[00:33:11] Sarah Sachs: how did you create it? It, did you just do[00:33:12] Alsesio: classroom?I,[00:33:13] Sarah Sachs: okay.[00:33:13] Alsesio: I did, yeah. I'll say just check my inbox for applications for a coworking space. Keep a people, so it created the database for me. Which I have here. 
And I guess database is like an notion table because everything is notion. Um, and then whenever um, an email comes in, like here, it just creates a new role for the person.Mm-hmm. And then it uses web search to enrich the mm-hmm. The profile. So it kind of like searches the web and it's like, this is who this person is, this is when they say they wanna move in and kind of updates everything else. This is, I mean, it's not a GI, but to me, I don't wanna do this work. So it feels like, I mean, it took me maybe like 15 minutes to set up the whole thing.Um, and I really like that most of the information should live here. You know, it is not like some other tool asking me[00:34:01] Sarah Sachs: Yeah.[00:34:01] Alsesio: To like, bring my stuff there. It's like I would've probably already created an ocean thing.[00:34:06] Sarah Sachs: Mm-hmm.[00:34:06] Alsesio: So[00:34:07] Sarah Sachs: most of our biggest use cases and gains are from. That extra layer of human involvement in the process to make it so right.And so like one of our biggest use cases is bug triaging. So if someone posts something in Slack, can you just have a custom agent that lives there that has its own routing constitution of what team this belongs to, creates a task in your task database and then posts in that Slack channel, right? Like that's like one of the first things that we built internally, I think.And it's completely changed the way that notion functions as a company. Nothing falls through, well, most things don't fall through the crack. We don't know what we don't know. But it's not replacing people, it's replacing processes.[00:34:44] Alsesio: Yeah.[00:34:44] Sarah Sachs: Right.[00:34:45] Alsesio: And I'm curious how you think about composability of these things.So the other one I was working on is like a. These filler. So whenever somebody signs up as a tenant, kind of he'll sell the lease for them. There should probably some agent that is like office manager agent mm-hmm. 
That can handle the request, make the lease, and then, uh, give them a ADA access to the office and all of that.How do you think about that feature?[00:35:08] Simon Last: Yeah, so I mean, there's, there's two ways you can compose. One way is by using like the data primitives. So you can, you know, you, you could give, you have one agent, uh, be writing to the database and there's another agent that's walked in the database. So that's, that's one way that they, they can coordinate that's like a little bit more decoupled and mm-hmm.Works really well. Or you, you can couple them. So I, I think it's actually not released yet. Releasing it like next week is, uh, in the settings for an agent, you can give access to invoke any other agent.[00:35:34] swyx: Hmm.[00:35:34] Simon Last: So you can have them just. Just, uh, uh, talk directly. So[00:35:37] swyx: you, was there a limit on like, number of recursions or just,[00:35:40] Simon Last: um, probably,[00:35:42] swyx: you know what I mean?Like, you can just get an infinite loop that way there's[00:35:45] Simon Last: some kind of Yeah,[00:35:46] Sarah Sachs: I think it's, there is actually a number somewhere.[00:35:49] swyx: I believe I'm just, you know, like, you're, you're, someone's gonna screw up. You[00:35:51] Simon Last: should you try to see[00:35:53] swyx: Yeah. I mean, everything's gonna be paperclips.[00:35:55] Simon Last: Oh, yeah. Yeah. But, uh, but, but that's really useful.Yeah. So we, you know, like I just, I, I helped, uh, someone internally the other day, they had, they had built like over 30 custom agents for, uh, for our go to market team doing all kinds of different things. You know, for example, like researching, you know, like, like filling information about, about a customer or like, like triaging customer feedback or like, uh, something like that.Literally over 30 of them. 
And then he even made like a database of all the agents, and then he is like, okay, and now I'm getting over 70 notifications per day with just the agents being blocked on various things. Uh, and then I was like, oh, okay, cool. You know, the obvious thing to do there is to make a manager agent, [00:36:32] Sarah Sachs: Right? [00:36:33] Simon Last: That's gonna sort of be another abstraction layer in between you and your, uh, 30 agents. Uh, so yeah, we ended up with like a manager agent that has access to invoke all the other agents, and it's sort of watching and observing them, and it just creates a layer of abstraction. So instead of 70 notifications per day, it's like five. And then the manager agent can help, like, uh, debug and fix any problems with the... [00:36:54] swyx: Is there a concept of like an inbox or something? Like, you're basically saying that they can message each other? [00:37:00] Simon Last: Yeah. [00:37:01] Sarah Sachs: Well... [00:37:01] swyx: They use the system of record, which is... [00:37:02] Sarah Sachs: Notion, so we... [00:37:03] Simon Last: Actually, yeah, we didn't make any special concepts at all. [00:37:06] swyx: They're interested in the Notion notifications that I would've got? [00:37:09] Sarah Sachs: They can just write a task to a database that the other agent's tasked with listening to, or they can actually call a webhook to the agent, like, they can just @ the agent. Okay. [00:37:17] Simon Last: Yeah, I mean, this is something that we're still working on. I think, you know, generally the way we do these things is, you first make it possible, maybe in a sort of janky way. So I think the way I set ‘em up is like, you know, we created a new database that was sort of like issues mm-hmm. 
That the custom agents were, were experiencing, and then gave them all access to file an issue and then the manager has access to, to read the issues.Um, and that works pretty well, essentially like, like give it its own like internal issue tracker just for the agents. And then, you know, if that becomes a, a concept that seems useful, generally maybe we will think of how to package it in. But I mean, generally we try to just keep it to composing the primitive if we can.You know, another example of this is we have no built-in memory concept. Memory is, is just pages and databases. And so if you wanna give a memory, just give it a page and give it. Edit access to that page and the[00:38:03] swyx: human can edit it. Agent can edit[00:38:04] Simon Last: it. Yeah. And so that works, that pattern works extremely well on it.And you know, depending this case, you can have it be just a page or it could be an entire database with, you know, or, you know, I can have sub pages is is pretty on what you can do with that.[00:38:15] Alsesio: So when I was setting this up, uh, I connected my inbox and it was like, do you wanna use Gmail or Notion Mail? And I'm like, I don't wanna use Eater, I just want you to do it.I'm curious how you think about, you know, notion, mail, notion, calendar, all of these kind of ui ux interfaces, full stack[00:38:29] Simon Last: notion.[00:38:30] Alsesio: Yeah. When like at the same time you have the agents abstracting them away from you in a way, you know, how do you spend like the product calories so to speak?[00:38:37] Simon Last: Yeah, I mean, I think it's pretty important that you don't have to use, not your mail to connect to the mail capability.So we can just connect to Gmail or, or whatever you want, uh, to use. And we're thinking of the mail service as being really great to the extent that it's really agent built, right? 
So maybe the mail app is just sort of a prepackaged agent that helps you automate your, your inbox.[00:39:00] Alsesio: Yeah, the auto labeling is great.Think[00:39:03] Sarah Sachs: the, when we, um, integrate with Gmail for instance, we have a series of tools available that are available via MCP or API to Gmail. When we integrate with Notion Mail, we have the Notion Mail engineering team to build us the, um, exact right tools that optimize latency, optimize performance and quality.They own that quality. Um, there's product leads there. They're directly thinking about the user problems that happen in mail. So it tends to be when we build integrations and connections, we build natively first. Um, and then think about, um, extending them generally just because it's also easier. Mm-hmm. Um, um, to build natively first.Um, so that tends to be how we phase things out.[00:39:43] swyx: Talking about integrations, you prompted me, so I gotta ask. M-C-P-C-L-I. What's going on? What's the[00:39:48] Simon Last: Yeah. Opinion. I think, I mean, I'm, I'm definitely bullish and excited about cli. I think there's a few really cool things about cli. So one really cool thing is like, um, is that it's in the terminal environment, so it gets a bunch of extra power.So it, you know, for example, it can like, like paginating and cursor through like long outputs. Um, and it has a progressive disclosure inherently. Uh, so, you know, you don't see all the tools at once. It's just, you see the CLI wrapper and you can like use the, the help commands and, and, and read files. And then I think the most important thing that's, that's super cool is that there, it's also inherently a, a bootstrapped.So if there's an issue, uh, the agent can debug and fix itself within the same environment that it uses the tool.[00:40:30] swyx: Mm.[00:40:30] Simon Last: Right. Like, you know, I think I saw a tweet this morning. 
Someone said, you know, my agent didn't have a browser, so I asked it to make a little browser tool, and within a hundred lines of code it gave itself a little browser, like, wrapping the Chromium API. Um, that's pretty incredible. And then if there was a bug, it would just immediately try to fix it. Mm-hmm. Right. On the other hand, if you use, like, the Chrome DevTools MCP, I've had this issue where sometimes the transport gets messed up. If it gets messed up, the agent has no way to fix itself. It no longer has a browser; it's just broken. Right. I think that's pretty fundamental, but I would say a lot of the bad things about it can be fixed. Uh, so I think, as for progressive disclosure, that can be fixed with the right harness. Like, it obviously doesn't make sense to show it all the tools all the time. That's not really inherent to the MCP protocol. It's just how you wrap it and use it. [00:41:16] swyx: There's many poorly built MCPs because we didn't know. [00:41:19] Simon Last: Yeah, yeah. I mean, it was just early. Like, the obvious thing to start with is to just show it all the tools, and it's like, okay, now we have a hundred tools. Yeah. And the tool calling actually works. So let's... [00:41:28] swyx: Of your success. [00:41:29] Simon Last: ...give it a way to, like, filter or search the tools. So yeah, I would say, broadly speaking, I'm really bullish on CLI. I'm still bullish on MCPs in certain environments. I think, in particular, MCP is really great for when you want sort of a narrow, lightweight agent. I think there's definitely a lot of use cases where you don't want a full coding agent with a compute runtime. And also you want it to be more tightly permissioned. MCP inherently has a really strong permission model, like, all you can do is call the tools. 
A CLI is a little bit murkier. It's like, can it access the API token? Are you, like, properly, sort of, re-encrypting the token so it can't exfiltrate it? It introduces a lot of new issues, which are real and hard to solve. And MCP is just like the dumb simple thing that works, and that's pretty good. [00:42:12] Sarah Sachs: I'll add two more perspectives, not from it working well for Notion, but how Notion, like, commits to both platforms. Notion is dedicated to being the best system of record for where people do their enterprise work. So we will always support our MCP insofar as other people are using MCPs, right? So regardless of our perspective, we've put a lot of effort into our MCP and we have a fantastic team that we're building, um, to do more there. And the second thing I'll say, I think, um, we all think a lot, but lately I've been thinking a lot about making sure there's a value alignment in pricing, um, with capability. [00:42:43] swyx: Literally our next question. [00:42:44] Sarah Sachs: And needing a language model to execute deterministic tasks feels wasteful, and relying on a language model to interface with third-party providers seems wasteful for tasks that don't require it. And particularly because our custom agents are using usage-based pricing. We think of pricing as like the barrier of entry for use of our product, and we're quite committed to making sure that it's not wasteful. Um, not just because it's a bad deal for our customers, but it's also bad business. 
We wanna have as many buyers, like, there's an elasticity of demand. And so if we can have our agents properly execute code that calls a CLI deterministically, it's a one-time cost, right? Versus constantly having a language model integrate with an MCP over and over and over and paying those repeated token fees. And if it's happening outside the cache window, then you're paying for it over and over and over, and it's just kind of unnecessary and less deterministic when it doesn't have to be. [00:43:36] Alessio: Yeah, the open-endedness I think is the main thing. Like, well, if I go write code to just call an API, I would never use an MCP. But then you need an MCP sometimes when you know what to call, but you don't want it to restart. Versus, I think the "it built a browser from scratch" is like, it's great when you're doing it on your own, but if your customers were having your AI write a browser from scratch every time and you had to pay the token cost of that, yeah, you'd be like, no, no, the Chrome DevTools MCP is actually pretty great. Just use that. I'm curious, how do you make that decision? Like, should it be just a straight API call, very narrow? Should it be an MCP? Should it be super open-ended? [00:44:10] Sarah Sachs: Do you mean for when we ship Notion capabilities or when we add capabilities to... [00:44:13] Alessio: Notion. [00:44:14] Sarah Sachs: ...AI or... [00:44:14] Alessio: I mean, you might have a capability that the only way to do is an open-ended agent, like an agent with a coding sandbox. [00:44:21] Sarah Sachs: Yeah. In Notion AI they're not explicit, no. We also ship an MCP. [00:44:24] Alessio: Yeah. Yeah. In B, [00:44:25] Sarah Sachs: Yeah. [00:44:26] Alessio: Internally. Okay. Like, is there ever a discussion of like, we're not gonna ship it because we're not able to tie it down? Or are you happy to just like... [00:44:33] Sarah Sachs: Um, no. 
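The fix mentioned here, letting the agent filter or search the tool list instead of being shown every tool up front, might look roughly like the sketch below. It is only an illustration under stated assumptions: the `Tool` record, the four registry entries, and `search_tools` are hypothetical, the scoring is a naive keyword match, and no real MCP SDK or client API is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


# Hypothetical registry; a real deployment could have a hundred or more entries.
REGISTRY = [
    Tool("gmail_search", "Search messages in a Gmail inbox"),
    Tool("gmail_send", "Send an email through Gmail"),
    Tool("calendar_create_event", "Create a calendar event"),
    Tool("db_query", "Run a SQL query against a database"),
]


def search_tools(query: str, registry=REGISTRY, limit: int = 3) -> list[Tool]:
    """Return up to `limit` tools whose name or description mentions a query term."""
    terms = query.lower().split()
    scored = []
    for tool in registry:
        haystack = f"{tool.name} {tool.description}".lower()
        # Score is the number of query terms found as substrings.
        score = sum(term in haystack for term in terms)
        if score:
            scored.append((score, tool))
    # Stable sort: ties keep their registry order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:limit]]
```

With this shape, the harness would expose only `search_tools` (plus whatever it returns) to the model, so the context holds a handful of relevant tool descriptions rather than the whole catalog.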
I mean, there are a lot of things where we choose not to use MCP because we wanna add more high touch to quality. I think search an agent to find is like the largest instance of that, where we have, um, Slack and Linear and Jira search in Notion that is not necessarily using the search MCP functionality that is provided by those companies. And that's because we think it's quite critical to how our agent trajectories work for us to have a little bit more control over the functionality of the search journey. And so it usually comes from quality, and there's a long tail of things, and that's why we built an MCP client, or an MCP server, excuse me, so that people can connect whatever they want. There's that long tail, right? But for search particularly, I would say that's like the primary entry point, but there are other connections as well that it's a little bit of secret sauce a
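
Editor's sketch: Sarah's pricing argument in this excerpt (a one-time cost for code that calls a CLI deterministically, versus token fees paid again on every model-mediated call) can be put as simple arithmetic. The function names, token counts, and the price per million tokens below are all hypothetical placeholders, not figures from Notion or from the episode.

```python
import math


def llm_mediated_cost(runs: int, tokens_per_run: int, usd_per_million_tokens: float) -> float:
    """Total cost when every run pays for model tokens again (no cache hits assumed)."""
    return runs * tokens_per_run * usd_per_million_tokens / 1_000_000


def scripted_cost(runs: int, authoring_tokens: int, usd_per_million_tokens: float,
                  usd_per_execution: float = 0.0) -> float:
    """One-time token cost to have the agent write the script, plus plain execution cost per run."""
    authoring = authoring_tokens * usd_per_million_tokens / 1_000_000
    return authoring + runs * usd_per_execution


def break_even_runs(tokens_per_run: int, authoring_tokens: int) -> int:
    """Smallest run count at which the script is no more expensive (execution cost ignored)."""
    return math.ceil(authoring_tokens / tokens_per_run)


if __name__ == "__main__":
    # Hypothetical numbers: 5k tokens per model-mediated call, 50k tokens to author the script once.
    print(llm_mediated_cost(1000, 5_000, 2.0))
    print(scripted_cost(1000, 50_000, 2.0))
    print(break_even_runs(5_000, 50_000))
```

The only point of the sketch is the shape of the two curves: the model-mediated cost grows linearly with the number of runs, while the scripted cost is dominated by a fixed authoring term.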

Arguing Agile Podcast
AA256 - The AI PM Competency Trap: Why AI Tools Won't Save Your Product Career


Apr 8, 2026 · 56:38


Every PM is scrambling to learn AI tools, but is that a trap? In this episode of Arguing Agile, hosts Brian Orlando and Om Patel summarize Shreyas Doshi's provocative article "Why Product Sense Is the Only Product Skill That Will Matter in the AI Age." Using the article as background for our discussion, we explore whether AI tools like Claude, Cursor, and NotebookLM are genuine superpowers for product managers or just the new baseline that everyone will have access to.

https://shreyasdoshi.substack.com/p/why-product-sense-is-the-only-product

We've structured this episode around several key debates, including:

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Extreme Harness Engineering for Token Billionaires: 1M LOC, 1B toks/day, 0% human code, 0% human review — Ryan Lopopolo, OpenAI Frontier & Symphony


Apr 7, 2026 · 72:43


We're proud to release this ahead of Ryan's keynote at AIE Europe. Hit the bell, get notified when it is live! Attendees: come prepped for Ryan's AMA with Vibhu after.

Move over, context engineering. Now it's time for harness engineering and the age of the token billionaires.

Ryan Lopopolo of OpenAI is leading that charge, recently publishing a lengthy essay on Harness Eng that has become the talk of the town.

In it, Ryan peeled back the curtains on how the recently announced OpenAI Frontier team have become OpenAI's top Codex users, running a >1m LOC codebase with 0 human written code and, crucially for the Dark Factory fans, no human REVIEWED code before merge. Ryan is admirably evangelical about this, calling it borderline “negligent” if you aren't using >1B tokens a day (roughly $2-3k/day in token spend based on market rates and caching assumptions).

Over the past five months, they ran an extreme experiment: building and shipping an internal beta product with zero manually written code.
Through the experiment, they adopted a different model of engineering work: when the agent failed, instead of prompting it better or to “try harder,” the team would look at “what capability, context, or structure is missing?”

The result was Symphony, “a ghost library” and reference Elixir implementation (by Alex Kotliarskyi) that sets up a massive system of Codex agents all extensively prompted with the specificity of a proper PRD spec, but without full implementation.

The future starts taking shape as one where coding agents stop being copilots and start becoming real teammates anyone can use, and Codex is doubling down on that mission with their Super Bowl messaging of “you can just build things”.

Across Codex, internal observability stacks, and the multi-agent orchestration system his team calls Symphony, Ryan has been pushing what happens when you optimize an entire codebase, workflow, and organization around agent legibility instead of human habit.

We sat down with Ryan to dig into how OpenAI's internal teams actually use Codex, why the real bottleneck in AI-native software development is now human attention rather than tokens, how fast build loops, observability, specs, and skills let agents operate autonomously, why software increasingly needs to be written for the model as much as for the engineer, and how Frontier points toward a future where agents can safely do economically valuable work across the enterprise.

We discuss:

* Ryan's background from Snowflake, Brex, Stripe, and Citadel to OpenAI Frontier Product Exploration, where he works on new product development for deploying agents safely at enterprise scale
* The origin of “harness engineering” and the constraint that kicked off the whole experiment: Ryan deliberately refused to write code himself so the agent had to do the job end to end
* Building an internal product over five months with zero lines of human-written code, more than a million lines in the repo, and thousands of PRs across multiple Codex model
generations
* Why early Codex was painfully slow at first, and how the team learned to decompose tasks, build better primitives, and gradually turn the agent into a much faster engineer than any individual human
* The obsession with fast build times: why one minute became the upper bound for the inner loop, and how the team repeatedly retooled the build system to keep agents productive
* Why humans became the bottleneck, and how Ryan's team shifted from reviewing code directly to building systems, observability, and context that let agents review, fix, and merge work autonomously
* Skills, docs, tests, markdown trackers, and quality scores as ways of encoding engineering taste and non-functional requirements directly into context the agent can use
* The shift from predefined scaffolds to reasoning-model-led workflows, where the harness becomes the box and the model chooses how to proceed
* Symphony, OpenAI's internal Elixir-based orchestration layer for spinning up, supervising, reworking, and coordinating large numbers of coding agents across tickets and repos
* Why code is increasingly disposable, why worktrees and merge conflicts matter less when agents can resolve them, and what it really means to fully delegate the PR lifecycle
* “Ghost libraries”, spec-driven software, and the idea that a coding agent can reproduce complex systems from a high-fidelity specification rather than shared source code
* The broader future of Frontier: safely deploying observable, governable agents into enterprises, and building the collaboration, security, and control layers needed for real-world agentic work

Ryan Lopopolo

* X: https://x.com/_lopopolo
* Linkedin: https://www.linkedin.com/in/ryanlopopolo/
* Website: https://hyperbo.la/contact/

Timestamps

00:00:00 Introduction: Harness Engineering and OpenAI Frontier
00:02:20 Ryan's background and the “no human-written code” experiment
00:08:48 Humans as the bottleneck: systems thinking, observability, and agent workflows
00:12:24 Skills, scaffolds,
and encoding engineering taste into context
00:17:17 What humans still do, what agents already own, and why software must be agent-legible
00:24:27 Delegating the PR lifecycle: worktrees, merge conflicts, and non-functional requirements
00:31:57 Spec-driven software, “ghost libraries,” and the path to Symphony
00:35:20 Symphony: orchestrating large numbers of coding agents
00:43:42 Skill distillation, self-improving workflows, and team-wide learning
00:50:04 CLI design, policy layers, and building token-efficient tools for agents
00:59:43 What current models still struggle with: zero-to-one products and gnarly refactors
01:02:05 Frontier's vision for enterprise AI deployment
01:08:15 Culture, humor, and teaching agents how the company works
01:12:29 Harness vs. training, Codex model progress, and “you can just do things”
01:15:09 Bellevue, hiring, and OpenAI's expansion beyond San Francisco

Transcript

Ryan Lopopolo: I do think that there is an interesting space to explore here with Codex, the harness, as part of building AI products, right? There's a ton of momentum around getting the models to be good at coding. We've seen big leaps in the task complexity with each incremental model release, where if you can figure out how to collapse a product that you're trying to build, a user journey that you're trying to solve, into code, it's pretty natural to use the Codex harness to solve that problem for you. It's done all the wiring and lets you just communicate in prompts. To let the model cook, you have to step back, right? Like you need to take a systems thinking mindset to things and constantly be asking, where is the agent making mistakes? Where am I spending my time? How can I not spend that time going forward? And then build confidence in the automation that I'm putting in place. So I have solved this part of the SDLC.

swyx: [00:01:00] All right.

[00:01:03] Meet Ryan

swyx: We're in the studio with Ryan from OpenAI.
Welcome.

Ryan Lopopolo: Hi.

swyx: Thanks for visiting San Francisco and thanks for spending some time with us.

Ryan Lopopolo: Yeah, thank you. I'm super excited to be here.

swyx: You wrote a blockbuster article on harness engineering. It's probably going to be the defining piece of this emerging discipline, huh?

Ryan Lopopolo: Thank you. It's been fun to feel like we've defined the discourse in some sense.

swyx: Let's contextualize a little bit. This is the first podcast you've ever done. Yes. And thank you for spending it with us. Where is this coming from? What team are you in, all that jazz?

Ryan Lopopolo: Sure, sure. I work on Frontier Product Exploration, new product development in the space of OpenAI Frontier, which is our enterprise platform for deploying agents safely at scale, with good governance, in any business. And the role of my team has been to figure out novel ways to deploy our models into package and products that we can sell as solutions to enterprises.

swyx: And you have a background, I'll just squeeze it in there: Snowflake, Brex, [00:02:00] Stripe, Citadel.

Ryan Lopopolo: Yes. Yes. Same. Any kind of customer

swyx: entire life. Yes. The exact kind of customer that you want to,

Vibhu: So I'll say, I actually didn't expect the background. When I looked at your Twitter, I'm seeing the opposite. Stuff like this. So you've got the mindset of like full send AI, coding stuff about slop, like buckling in your laptop on your Waymo's. Yes. And then I look at your profile, I'm like, oh, you're in the other end too. Oh, perfect. Makes perfect.

Ryan Lopopolo: It's quite fun to be AI maximalist. If you're gonna live that persona, OpenAI is the place to do it. And it's

swyx: token is what you say.

Ryan Lopopolo: Yeah. Certainly helps that we have no rate limits internally. And I can go, like you said, full send at this stay.

swyx: Yeah. Yeah.
So the Frontier, and you're a special team within OpenAI Frontier.

Ryan Lopopolo: We had been given some space to cook, which has been super, super exciting.

[00:02:47] Zero Code Experiment

Ryan Lopopolo: And this is why I started with kind of an out-there constraint to not write any of the code myself. I was figuring if we're trying to make agents that can be deployed into enterprises, they should be [00:03:00] able to do all the things that I do. And having worked with these coding models, these coding harnesses, over 6, 7, 8 months, I do feel like the models are there enough, the harnesses are there enough, where they're isomorphic to me in capability and the ability to do the job. So starting with this constraint of I can't write the code meant that the only way I could do my job was to get the agent to do my job.

Vibhu: And just a bit of background before that. This is basically the article. So what you guys did is five months of working on an internal tool, zero lines of code written by hand, over a million lines of code in the total code base. You say it was 10x, more like it was 10x faster than you would've been if you had done it by hand. So

Ryan Lopopolo: yeah, that

Vibhu: was the mindset going into this, right?

Ryan Lopopolo: That's right.

[00:03:46] Model Upgrades Lessons

Ryan Lopopolo: Started with some of the very first versions of Codex CLI, with the Codex Mini model, which was obviously much less capable than the ones we have today. Which was also a very good constraint, right? Quite a visceral feeling to ask the [00:04:00] model to build you a product feature, and it just not being able to assemble the pieces together. Which kind of defined one of the mindsets we had going into this, which is whenever the model just cannot, you always pop open the task, double click into it, and build smaller building blocks that you can then reassemble into the broader objective. And it was quite painful to do this. Honestly, the first month and a half was
10 times slower than I would be. But because we paid that cost, we ended up getting to something much more productive than any one engineer could be, because we built the tools, the assembly station, for the agent to do the whole thing.

[00:04:43] Model Generations, Build Systems & Background Shells

Ryan Lopopolo: But yeah, so onward to GPT-5, 5.1, 5.2, 5.3, 5.4. To go through all these model generations and see their quirks and different working styles also meant we had to adapt the code base to change things up when the model was revved. [00:05:00] One interesting thing here is with 5.2, the Codex harness at the time did not have background shells in it, which means we were able to rely on blocking scripts to perform long horizon work. But with 5.3 and background shells, it became less patient, less willing to block. So we had to retool the entire build system to complete in under a minute. This is not a thing I would expect to be able to do in a code base where people have opinions. But because the only goal was to make the agent productive, over the course of a week we went from a bespoke Makefile build to Bazel, to Turbo, to Nx, and just left it there because builds were fast at that point.

swyx: Interesting. Talk more about Turbo to Nx. That's interesting ‘cause that's the other direction that other people have been doing.

Ryan Lopopolo: Ultimately I have not a lot of experience with actual frontend repo architecture.

swyx: You're talking that Jessica built the sky. So I'm like, I know the Nx team. I know Turbo from Jared [00:06:00] Palmer. And I'm like, yeah, that's an interesting comparison.

[00:06:02] One Minute Build Loop

Ryan Lopopolo: The hill we were climbing, right, was make it fast.

swyx: Is there a micro front end involved? Is it, how complex? React?

Ryan Lopopolo: Electron-based, single app sort of thing.

swyx: And it must be under a minute. That's an interesting limitation.
I'm actually not super familiar with the background shell stuff. Probably was talked about in the 5.3 release.

Ryan Lopopolo: It basically means that Codex is able to spawn commands in the background and then continue to work while it waits for them to finish. So it can spawn an expensive build and then continue reviewing the code, for example.

swyx: Yeah.

Ryan Lopopolo: And this helps it be more time efficient for the user invoking the harness.

swyx: And I guess, just to really nail this, why does one minute matter? Like why not five? Okay, good. We want no. We

Ryan Lopopolo: want the inner loop to be as fast as possible. Okay. One minute was just a nice round number and we were able to hit it.

swyx: And if it doesn't complete, it kills it or something?

Ryan Lopopolo: No. We just take that as a signal that we need to stop what we're doing, double click, decompose the build graph a bit to get us back under so that we [00:07:00] can enable the agent to continue to operate.

swyx: It's almost like a ratchet. It's like you're forcing build time discipline, because if you don't, it'll just grow and grow. That's right. And you mentioned that. The software I work on currently is at 12 minutes. It sucks.

Ryan Lopopolo: This has been my experience with platform teams in the past, where you have an envelope of acceptable build times and you let it go up to breach and then you spend two, three weeks to bring it back down to the lower end of the average low bed stop. But because tokens are so cheap, yeah, and we're so insanely parallel with the model, we can just constantly be gardening this thing to make sure that we maintain these invariants, which means
There's way less dispersion in the code and the SDLC, which means we can simplify in a way and rely on a lot more invariants as we write the software.

[00:07:45] Observability, Traces & Local Dev Stack

Vibhu: Lovely.

[00:07:46] Humans Are Bottleneck

Vibhu: You mentioned in your article that humans became the bottleneck, right? You kicked off as a team of three people. You're putting out a million lines of code, like 1500 PRs, basically. What's the mindset there? So as much as code is disposable, you're doing a lot of review. A lot [00:08:00] of the article talks about how you wanna rephrase everything is prompting everything, is what the agent can't see. It's kind of garbage, right? You shouldn't have it in there. So what's the high level of how you went about building it, and then how you address, okay, humans are just PR review? Like how is human in the loop for this?

Ryan Lopopolo: We've moved beyond even the humans reviewing the code as well.

[00:08:19] Human Review, PR Automation & Agent Code Review

Ryan Lopopolo: Most of the human review is post merge at this point. But post, post merge, that's not even reviewed. That's just

swyx: Oh, let's just make ourselves happy by You

Ryan Lopopolo: haven't used fundamentally. The model is trivially parallelizable, right? As many GPUs and tokens as I am willing to spend, I can have capacity to work with my codebase. The only fundamentally scarce thing is the synchronous human attention of my team. There's only so many hours in the day, we have to eat lunch, I would like to sleep, although it's quite difficult to stop poking the machine because it makes me want to feed it. You have to step back, right? Like you need to take a systems thinking mindset to things and [00:09:00] constantly be asking, where is the agent making mistakes? Where am I spending my time? How can I not spend that time going forward? And then build confidence in the automation that I'm putting in place.
So I have solved this part of the SDLC, and usually what that has looked like is we started out needing to pay very close attention to the code because the agent did not have the right building blocks to produce modular software that decomposed appropriately, that was reliable and observable and actually accrued a working front end in these things, right?

[00:09:35] Observability First Setup

Ryan Lopopolo: So in order to not spend all of our time sitting in front of a terminal, at most doing one or two things at a time, we invested in giving the model that observability, which is that graph in the post here.

swyx: Yeah. Let's walk through this, traces and which existed first.

Ryan Lopopolo: We started with just the app, and the whole rest of it, from Vector through to all these logging and metrics APIs, was, I dunno, half an [00:10:00] afternoon of my time. We have intentionally chosen very high level, fast developer tools. There's a ton of great stuff out there now. We use mise a bunch, which makes it trivial to pull down all these Go-written Victoria stack binaries in our local development. Tiny little bit of Python glue to spin all these up, and off you go. One neat thing here is we have tried to invert things as much as possible, which is instead of setting up an environment to spawn the coding agent into, we spawn the coding agent, like that's the entry point. It's just Codex. And then we give Codex, via skills and scripts, the ability to boot the stack if it chooses to, and then tell it how to set some env variables so the app and local dev points at this stack that it has chosen to spin up. And this I think is like the fundamental difference between reasoning models and the 4.1s and 4os of the past, where these models could not think, so you had to put them in [00:11:00] boxes with a predefined set of state transitions. Whereas here we have the model, the harness, be the whole box.
And give it a bunch of options for how to proceed, with enough context for it to make intelligent choices. So

Vibhu: sales, so like a lot of that is around scaffolding, right? Yes. Previous agents, you would define a scaffold. It would operate in that loop, try again. That's pivoted off from when we've had reasoning models. They're seeming to perform better when you don't have a scaffold, right? That's right.

[00:11:28] Docs Skills Guardrails

Vibhu: And you go into niches here too, like your SPEC.md and having a very short AGENTS.md.

swyx: Yes. Yes.

Vibhu: Yeah. So you even lay out what it is here, but I like

swyx: the table of contents.

Vibhu: Yeah.

swyx: Like stuff like this, it really helps guide people because everyone's trying to do this.

Ryan Lopopolo: This structure also makes it super cheap to put new content into the repository to steer both the humans and the agents.

swyx: You reinvented skills, right?

Vibhu: One big agents and

swyx: skills from first principles.

Ryan Lopopolo: Skills did not exist when we started doing this.

Vibhu: You have a short, [00:12:00] 100 line overall table of contents and then you have little skills, right? Core beliefs MD, tech debt tracker. Yeah. Yeah. The scale is over

Ryan Lopopolo: The tech debt tracker and the quality score are pretty interesting because this is basically a tiny little scaffold, like a markdown table, which is a hook for Codex to review all the business logic that we have defined in the app, assess how it matches all these documented guardrails, and propose follow up work for itself. Before Beads and all these ticketing systems, we were just tracking follow up work as notes in a markdown file, which we could spawn an agent on a cron to burn down. There's this really neat thing that the models fundamentally crave text.
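
Editor's sketch: the transcript does not show what the team's markdown tracker actually contains, so the table below is invented for illustration. Assuming a tracker with hypothetical columns `Area`, `Guardrail`, `Score`, and `Follow-up`, a few lines of Python are enough to turn it into a worklist that a scheduled agent run could burn down.

```python
from __future__ import annotations

# Hypothetical tracker contents; the column names and rows are illustrative only.
TRACKER = """\
| Area | Guardrail | Score | Follow-up |
| --- | --- | --- | --- |
| network | all calls have timeouts | 2 | add timeout to sync client |
| logging | structured logs only | 5 | |
"""


def parse_tracker(markdown: str) -> list[dict[str, str]]:
    """Parse a pipe-delimited markdown table into one dict per data row."""
    header: list[str] | None = None
    rows: list[dict[str, str]] = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # ignore prose around the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first table row is the header
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # skip the |---|---| separator row
        rows.append(dict(zip(header, cells)))
    return rows


def open_followups(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Rows that still have follow-up work, lowest quality score first."""
    pending = [row for row in rows if row.get("Follow-up")]
    return sorted(pending, key=lambda row: int(row["Score"]))


if __name__ == "__main__":
    for item in open_followups(parse_tracker(TRACKER)):
        print(f"{item['Area']}: {item['Follow-up']} (score {item['Score']})")
```

Keeping the tracker as plain text is the design choice being illustrated: the same file is readable by humans, editable by the agent, and trivially machine-parsed.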
So a lot of what we have done here is figure out ways to inject text

swyx: into

Ryan Lopopolo: the system, right? When we get a page because we're missing a timeout, for example, I can just add Codex in Slack on that page and say, I'm gonna fix this by adding a timeout. Please update our reliability documentation to require that all network calls have [00:13:00] timeouts. So I have not only made a point in time fix, but also durably encoded this process knowledge around what good looks like.

swyx: Yeah.

Ryan Lopopolo: And we give that to the root coding agent as it goes and does the thing. But you can also use that to distill tests out of, or a code review agent, which is pointed at the same things to narrow the acceptable universe of the code that's produced.

swyx: I think one of the concerns I have with that kind of stuff is you think you're making the right call by making it persisted for all time across everything. Yes. But then you didn't think about the exceptions that you need to make, right? And that you have to roll it back.

Vibhu: Part of it is

swyx: also sometimes it can follow your s instructions too.

Vibhu: It's somewhat a skill, right? So it determines when it uses the tools, right? Like it's not like it'll run outta every call. It'll determine when it wants to check quality score, right?

Ryan Lopopolo: Yeah. And we do, in the prompts we give these agents, allow them to push back.

[00:13:51] Agent Code Review Rules

Ryan Lopopolo: When we first started adding code review agents to the PR, it would be Codex CLI locally writes the change, pushes up a PR. On [00:14:00] those PR synchronizations, a review agent fires. It posts a comment. We instruct Codex that it has to at least acknowledge and respond to that feedback. And initially the Codex driving the code author was willing to be bullied by the PR reviewer, which meant you could end up in a situation where things were not converging.
So yeah, we had to,

swyx: he's just a thrash.

Ryan Lopopolo: We had to add more optionality to the prompts on both of these things, right? The reviewer agents were instructed to bias toward merging the thing, to not surface anything greater than a P2 in priority. We didn't really define P2, but we gave it, you

swyx: did define P2.

Ryan Lopopolo: We gave it a framework within which to score its output

swyx: and then greater than P0 is worse, right? Yes. P2 is very good.

Ryan Lopopolo: P0 is you will nuke the codebase if

swyx: you merge this

Ryan Lopopolo: thing, right?

swyx: Yeah.

Ryan Lopopolo: But also on the code authoring agent side, we gave it the flexibility to either defer or push back against review feedback, right? This happens all the time, right? Like I happen to notice something and leave a code review, [00:15:00] which could blow up the scope by a factor of two. I usually don't mean for that to be addressed exactly in the moment. It's more of an FYI: file it to the backlog, pick it up in the next fix-it week sort of thing. And without the context that this is permissible, the coding agents are gonna bias toward what they do, which is following instructions.

swyx: Yeah.

[00:15:19] Autonomous Merging Flow

swyx: I do want to check in on a couple things, right? Sure. All the coding review agent, it can merge autonomously. I think that's something that a lot of people aren't comfortable with. And you have a list here of how much agents do. They do product code and tests, CI configuration and release tooling, internal dev tools, documentation, eval harness, review comments, scripts that manage the repository itself, production dashboard definition files, like everything. Yes. And so they're just all churning at the same time. Is there like a cord that any human on the team pulls to stop everything?

Ryan Lopopolo: Because we are building a native application here, we're not doing continuous deploy.
So there's still a human in the loop for cutting the release branch. I see. We require a blessed, [00:16:00] human-approved smoke test of the app before we promote it to distribution, these sort of things.

swyx: So you're working on the app, you're not building like infrastructure where you have like nines of reliability, that kinda stuff?

Ryan Lopopolo: That's correct. That's correct. Okay. And also, full recognition here that all of this activity took place in a completely greenfield repository. There's. Should be no script that this applies generally to

swyx: this is a production thing, you're gonna ship

Ryan Lopopolo: to

swyx: customers. Of course. Yeah, of course. So this is real.

Vibhu: And like one of the things there is, you mentioned you started this as a repo from scratch. The onboarding first month or so was pretty, it was like working backwards, right? Yeah. And then you had to work with the system, and now you're at that point where, you know, you're very autonomous. I'm curious, okay, so how human in the loop is it? What are the bottlenecks that you wish you could still automate? And part of that is also, where do you see the model trajectory improving and offloading more human in the loop? We just got 5.4. It's a really good,

Ryan Lopopolo: fantastic model, by the way.

Vibhu: Yeah. Yeah. It's the first one that's merged. Top tier coding. So it's Codex-level coding and reasoning. So general reasoning, both in one model. So

Ryan Lopopolo: and

Vibhu: computer [00:17:00] use, vision.

Ryan Lopopolo: Now, with 5.4, I can just have Codex write the blog post, whereas for this one I had to balance between chat.

swyx: Oh, I need to, I might be out of a job. Oh my God.

Ryan Lopopolo: Oh,

swyx: I know. You just gave me an idea for a completely AI newsletter that 5.4 could do. Yeah, I get it now.

Ryan Lopopolo: This sort of thing is just one example of closing the loop, right? Like the dashboard thing you mentioned.
We have Codex authoring the JSON for the Grafana dashboards and publishing them and also responding to the pages, which means when it gets the page, it knows exactly which dashboards are defined and what alert was triggered by which exact log in the code base, ‘cause all of this stuff is collated together.

swyx: It has to own everything. Yes. Yeah. Yeah.

Ryan Lopopolo: And it means that if we have an outage that did not result in a page, it has the existing set of dashboards available to it. It has the existing set of metrics and logs and can figure out where the gaps in the dashboard are, or [00:18:00] in the underlying metrics, and fix them in one go. In the same way, you would have a full stack engineer be able to drive a feature from the backend all the way to the front end.

Vibhu: So it seems like a lot of the work you guys had to do was you, as a small team, are fully working for a way that the model wants the software to be written. It's like less human legible for better code legibility, agent legibility. How do you think that affects broader teams? So one, at OpenAI, do liaison, like this is how software should be written. Like I can imagine, say you join a new team with this methodology, this mindset. There's ways that teams do code review, teams write code, like teams are structured, and a lot of it is for human legibility. So should we all swap? Like how does this play back, one, broader into OpenAI, and then broader into software engineering, right? Is it like teams that pick this up will, it's pretty drastic, right? You have to make a pretty big switch. Should they just full send? Yeah.

Ryan Lopopolo: The mindset is very much that I'm removed from the process, right? I can't really have deep code level opinions about [00:19:00] things. It's as if I'm group tech leading a 500 person organization.

Vibhu: Yeah.

Ryan Lopopolo: Like it's not appropriate for me to be in the weeds on every PR.
This is why that post merge code review thing is a good analog here, right? Like I have some representative sample of the code as it is written, and I have to use that to infer what the teams are struggling with, where they could use help, where they're already moving quickly and I can pivot my focus elsewhere.

Vibhu: Yeah.

Ryan Lopopolo: So I don't really have too many opinions around the code as it is written. I do, however, have a command base class, which is used to have repeatable chunks of business logic that come with tracing and metrics and observability for free. And the thing to focus on is not how that business logic is structured, but that it uses this primitive, ‘cause I know that's gonna give leverage by default.

Vibhu: Yeah.

Ryan Lopopolo: Yeah, back to that sort of systems thinking.

Vibhu: And you have part of that in your blog post, enforcing architecture and taste, how you set boundaries for what's used. There's also a section on redefining [00:20:00] engineering and stuff, but yeah, it's just, it's interesting to hear.

Ryan Lopopolo: And as the models have gotten better, they have gotten better at proposing these abstractions to unblock themselves, which again lets me move higher and higher up the stack to look deeper into the future on what ultimately blocked the team from shipping.

swyx: Yeah. You mentioned, so this is primarily, it is like a 1 million line of code base Electron app. But it manages its own services as well, so it's like a backend-for-frontend type thing.

Ryan Lopopolo: We do have a backend in there, but that's hosted in the cloud. Yeah. This sort of structure is actually within the separate main and render processes within the

swyx: Electron. That's just how Electron works.

Ryan Lopopolo: Yeah, of course. So we have also treated like MVC style decomposition with the same level of rigor, which has been very fun.

swyx: I have a fun pun. This is a tangent. MVC is model view controller.
Any sort of full stack web dev knows that. But my AI native version of this is Model View Claw; the claw is the harness.

Ryan Lopopolo: That's right. That's right. I do think that there is an interesting space to [00:21:00] explore here with Codex, the harness, as part of building AI products, right? There's a ton of momentum around getting the models to be good at coding. We've seen big leaps in the task complexity with each incremental model release, where if you can figure out how to collapse a product that you're trying to build, a user journey that you're trying to solve, into code, it's pretty natural to use the Codex harness to solve that problem for you. It's done all the wiring and lets you just communicate in prompts to let the model cook. Yeah. It's been very fun. And there's also a very engineering-legible way of increasing capability. It's fantastic, right? Yeah. Just give the model scripts, the same scripts you would already build for yourself.

swyx: Yeah. Yeah. So for listeners, this is Ryan saying that software engineering or coding agents will eat knowledge work, like the non-coding parts where you would normally think, oh, you have to build a separate agent for it. No, start with a coding agent and go out from there. Which OpenClaw has, like, it's Pi under the hood.

Ryan Lopopolo: [00:22:00] Yes.

Vibhu: Basically define your task in code. Everything is a coding agent.

swyx: By the way, since I brought it up, it's probably the only place we bring it up. Is there any OpenClaw usage from you? Any?

Ryan Lopopolo: No. No. Not for me. I don't have any spare Mac Minis rattling around my house.

swyx: You can afford it? No. I just, I'm curious if it's changed anything in OpenAI yet, but it's probably early days.
And then the other thing I wanna pull on here is, you mentioned ticketing systems and you mentioned PRs, and I'm wondering if both those things have to go away or be reinvented for this kind of coding. So Git itself is very hostile to multi-agent.

Ryan Lopopolo: Yeah. We make very heavy use of worktrees.

swyx: But even then, like, I just dropped a podcast yesterday with Cursor, and they said they're getting rid of worktrees 'cause it still has too many merge conflicts. It's still too unintuitive. But go ahead.

Ryan Lopopolo: The models are really great at resolving merge conflicts. Yeah. And to get to a state where I'm not synchronously in the loop in my terminal, I almost don't care that there are merge conflicts.

swyx: With disposable. [00:23:00] Yeah.

Ryan Lopopolo: We invoke a $land skill, and that coaches Codex to push the PR, wait for human and agent reviewers, wait for CI to be green, fix the flakes if there are any, merge upstream if the PR comes into conflict, wait for everything to pass, put it in the merge queue, and deal with flakes until it's in main. End. This is what it means to delegate fully, right? In a very large monorepo this is probably a significant tax on humans to get PRs merged, but the agent is more than capable of doing this, and I really don't have to think about it other than keep my laptop open.

swyx: Yeah. I used to be much more of a control freak, but now I'm like, yeah, actually you could do a better job of this than me. Yeah. With the right context. Yes.

[00:23:47] Encoding Requirements

swyx: Anything else in harness in general?
Just this piece, I just wanna make sure we,

Ryan Lopopolo: I think one thing that I maybe didn't make super clear in the article, that I heard on Twitter as an interesting [00:24:00]

swyx: Let's respond to them. What's the chatter and then what's your response?

Ryan Lopopolo: Ultimately, all the things that we have encoded in docs and tests and review agents and all these things are ways to put all the non-functional requirements of building high scale, high quality, reliable software into a space that prompt injects the agent. We either write it down as docs, or we add lints where the error messages tell how to do the right thing. So the whole meta of the thing is to basically tease out of the heads of all the engineers on my team what they think good looks like, what they would do by default, or what they would coach a new hire on the team to do to get things to merge. And that's why we pay attention to all the mistakes that the agent makes, right? This is code being written that is misaligned with some as yet not written down non-functional requirement.

swyx: Sorry, what did the online people misunderstand, or what was it you responded to?

Ryan Lopopolo: No, somebody just literally said that. I was like, oh yeah, this is the [00:25:00] thing. This is what I've been doing.

swyx: Okay. Oh, you agree? Yeah. I see. Interesting.

Ryan Lopopolo: One other neat thing, which I totally did not expect, is folks were just taking the link to the article and giving it to Pi or Codex and saying, make my repo this.

Vibhu: You achieve a whole recursion.

Ryan Lopopolo: And it was wildly effective. Really, it was wildly effective.

Vibhu: No way. It actually is something I tried with 5.4 yesterday. I didn't have time. Last time I was out speaking at something, and this is one of my things, I was like, okay, I have this article.
Can we just scaffold out what it would be like to run this? And I did it first as that, and then I was like, okay, let me take another little side repo and say, okay, if I was to fully automate this like this, because I haven't written a line of code, it's

Ryan Lopopolo: like over full, set

Vibhu: it right. The side thing I'm doing of voice TTS, I'm just like, slopping out, whatever. It's nothing production. I'm like, how would I make this like this? And it's actually a really good way to learn what could be changed. It's just a good analyzing, right? You give it all the code, you give it all the context, you give it the article, and it walks you through it very well. That's right. That's right.

[00:25:57] Inlining Dependencies
[00:25:57] Dependencies Going Away & Bret Taylor's Response

swyx: I guess one more thing before we go to Symphony is I wanted to cover [00:26:00] Bret Taylor's response. We had him on the show. He is your chairman, which is wild. Yeah. That he's reading your articles as well and getting engaged in it. He says software dependencies are going away. Basically they can just be, like, vendored. Yes. Response.

Ryan Lopopolo: A hundred percent. A hundred percent agree.

swyx: You still pro qr, you still pay Datadog. You still pay Temporal. Thank you.

Ryan Lopopolo: Yep. The level of complexity of the dependencies that we can internalize is, I would say, low to medium right now. Just based on model capability.

swyx: What does the, what is medium?

Ryan Lopopolo: I would say a couple thousand line dependency is a thing that we could in-house no problem. Call it an afternoon of time. One neat thing about it is, probably most of that code you don't even need. By in-housing an abstraction, you can strip away all the generic parts of it and only focus on what you need to enable the specific thing. Yes.
You're building.

swyx: I've been calling this the end of b******t plugins.

Ryan Lopopolo: Yeah.

swyx: Because there's so much, when I publish an open source thing, I want to accept everything, be liberal in what I accept, this is Postel's law, but that means there's so much bloat. Yes. There's so much overhead.

Ryan Lopopolo: One other neat thing about [00:27:00] this too is when we deploy Codex Security on the repo, it is able to deeply review and change the internalized dependencies in a much lower friction way than it would be to, like, push patches upstream, wait for them to be released, pull them down, make sure that's compatible with all the transitive dependencies I have in my repo, and things like that. So it's also much lower friction to internalize some of these things if code is free, 'cause the tokens are cheap sort of thing.

swyx: Yeah. Yeah. I think the only argument I have against this is basically scale testing, which obviously the larger pieces of software like Linux, MySQL, he calls up even the Datadogs and Temporals, and then maybe security testing, where, yes, classically, I think, is it Linus Torvalds who said, for security, open source is the best disinfectant.

Ryan Lopopolo: Many eyes.

swyx: Many eyes. And if you inline your dependencies and code them up, you're gonna have to relearn mistakes from other people that

Ryan Lopopolo: Yep. Yep. And to internalize that dependency, you're back to zero and you have to start reassembling all those bits and pieces to, yeah, have [00:28:00] high confidence in the code as it is written. Yeah.

Vibhu: Even part of the first intro of this, you basically mentioned everything was written by Codex, including internal tooling, right? So internal tooling, like when you're visualizing what's going on, it's writing it for itself.

swyx: Yeah. I build internal tools this way now, and I just show them off and they're like, how long did you spend? And I didn't spend any time.
I just prompted it.

Ryan Lopopolo: Very funny story here.

swyx: Yeah, go ahead.

Ryan Lopopolo: We had deployed our app to the first dozen users internally and had some performance issues, so we asked them to export a trace for us, got a tarball, gave it to our on-call engineer, and he did a fantastic job of working with Codex to build this beautiful local dev tool, a Next.js app, where you drag and drop the tarball in and it visualizes the entire trace. It's fantastic. Took an afternoon, but none of this was necessary. Because you could just spin up Codex and give it the tarball and ask the same thing and get the response immediately. So in a way, optimizing for human [00:29:00] legibility of that debugging process was wrong. It kept him in the loop unnecessarily when instead he could have just let Codex cook for five minutes and gotten the same.

swyx: Yeah, you verify your instincts here of, this is how we used to do it. Or this is how I would have used to solve it.

Ryan Lopopolo: Yeah. In this local observability stack, sure, you can deploy Jaeger to visualize the traces, but I wouldn't expect to be looking at the traces in the first place because I'm not gonna write the code to fix them.

swyx: Yeah. So basically there needs to be like this kind of house stack and owning the whole loop. I think that is very well established. And it sounds like you might be sharing more about that in the future, right?

Ryan Lopopolo: Yeah. I think we're excited to do

[00:29:36] Ghost Libraries Specs
[00:29:36] Ghost Libraries & Distributing Software as Specs

Ryan Lopopolo: We're gonna talk about Symphony in a little bit, but the way we distribute it is as a spec, which I think folks are calling Ghost Libraries on Twitter. This is such a cool name. It does mean it becomes much cheaper to share software with the world, right? You define a spec for how you could build your own, specifying as much as is required for a coding agent to reassemble it [00:30:00] locally.
The flow here is very cool. We have taken all the scaffolding that has existed in our proprietary repo and spun up a new one. Ask Codex, with our repo as a reference, to write the spec. We tell it: spin up a tmux, spawn a disconnected Codex to implement the spec, wait for it to be done. Spawn another Codex in another tmux to review the implementation compared to upstream, and update the spec so it diverges less. And then you just loop over and over, Ralph style, until you get a spec that is, with high fidelity, able to reproduce the system as it is. It's fantastic.

Vibhu: And you're basically not really adding any of your human bias in there, right? That's correct. A lot of times people write a spec and be like, okay, I think it should be done this way, and you'll riff on something. And it's, no, the agent could have just handled it. You're still scaffolding in a sense, right? I want it done this way. It can determine its spec better.

swyx: That's right. That's right. I've been working a lot on evals recently, and part of me is wondering if [00:31:00] an agent can produce a spec that it cannot solve. Is it always capable of things that it can imagine, or can it imagine things that are impossible for it to do?

Ryan Lopopolo: I think with Symphony, there's this axis where you have things that are easy or hard, established or new, right? And I think things that are hard and new are still something where the models need humans to, yeah, drive.

swyx: Yeah. Yeah.

Ryan Lopopolo: But I think those other quadrants are largely solved, given the right scaffold and the right thing that's gonna drive the agent to completion.

swyx: It's crazy that it's solved.

Ryan Lopopolo: But it means that the humans, the ones with limited time and attention, get to work on the hardest stuff, like the problems where it's pure white space out in front.
Or like the deepest refactorings where you don't know what the proper shape of the interfaces is. And this is where I wanna spend my time, 'cause it lets me set up for the next level of scale.

swyx: Yeah. Yeah. Amazing. Let's introduce Symphony. I think we've been mentioning it every now and then. Elixir. Interesting option.

Ryan Lopopolo: Yeah.

swyx: Yeah. I'm not,

Ryan Lopopolo: Again, the [00:32:00] Elixir manifestation here is just a derivative.

swyx: Is it model chosen? Yeah.

Ryan Lopopolo: Yeah. Yeah. And it chose that because the process supervision and the GenServers are super amenable to the type of process orchestration that we're doing here. You are essentially spinning up little daemons for every task that is in execution and driving it to completion, which means the model gets a ton of stuff for free by using Elixir and the BEAM.

swyx: I had to go do a crash course in BEAM and Elixir, and I think most people are not operating at that scale of concurrency where you need that. But it is a good mental model for resumability and all those things. And these are things I care about. But tell me the origin story of Symphony. What do you use it for? How did it form? Maybe any abandoned paths that you didn't take?

[00:32:46] Terminal Free Orchestration
[00:32:46] Symphony: Removing Humans from the Loop

Ryan Lopopolo: At the end of December we were at about three and a half PRs per engineer per day. This was before 5.2 came out in the beginning of January. Everyone gets back from holiday with 5.2, and with no other work [00:33:00] on the repository, we were up in the five to 10 PRs per day per engineer. And I don't know about y'all, but it's very taxing to constantly be switching like that. I was pretty tapped out at the end of the day. Again, where are the humans spending their time? They're spending their time context switching between all these active tmux panes to drive the agent forward.

swyx: Yeah. No way.
Yeah.

Ryan Lopopolo: So let's, again, build something to remove ourselves from the loop. And this is what we frantically sprinted at here: to find a way to remove the need for the human to sit in front of their terminal. So a lot of experimentation with dev boxes and automatically spinning up agents. It seems like a fantastic end state here, where my life is a beach. I open Linear twice a day and say yes or no to these things. Yeah. And this is again a super, super interesting framing for how the work is done, because I become more latency insensitive. I have [00:34:00] way less attachment to the code as it is written. I've had close to zero investment in the actual authorship experience. So if it's garbage, I can just throw it away and not care too much about it. In Symphony, there's this rework state where once the PR is proposed and it's escalated to the human for review, it should be a cheap review. It is either mergeable or it is not. And if it's not, you move it to rework. The Elixir service will completely trash the entire worktree and PR and start it again from scratch. Okay. And this is that opportunity again to say, why was it trash, right? What did the agent do that was

swyx: bad. Yeah.

Ryan Lopopolo: Fix that before moving the ticket to in progress again.

swyx: Yeah. Why is this not in the Codex app? I guess you guys are ahead of the Codex app.

Ryan Lopopolo: Yeah, so the way the team has been working is basically to be as AI-pilled as possible and sprint ahead. And a lot of the things we have worked on have fallen out [00:35:00] into a lot of the products that we have. Like we were in deep consultation with the Codex team to have the Codex app be a thing that exists, right? To have skills be a thing that Codex is able to use, so we didn't have to roll our own. To put automations into the product.
So all of our automatic refactoring agents didn't have to be these hand-rolled control loops. It has been really fantastic to be, in a way, unanchored to the product development of Frontier and Codex and just very quickly try to figure out what works and then later find the scalable thing that can be deployed widely. It's been a very fun way to operate. It's certainly chaotic. I have lost track very often of what the actual state of the code looks like, 'cause I'm not in the loop. There was one point where we had wired Playwright directly up to the Electron app with MCP. MCPs I'm pretty bearish on, because the harness forcibly injects all those tokens in the [00:36:00] context, and I don't really get a say over it. They mess with auto compaction. The agent can forget how to use the tool. There's probably only, what, three calls in Playwright that I actually ever want to use. So I pay the cost for a ton of things. Somebody vibed a local daemon that boots Playwright and exposes a tiny little shim CLI to drive it. And I had zero idea that this had occurred, because to me, I run Codex and it's able to, it's, oh, it's better. Yeah. Like no knowledge of this at all. Uh-huh.

[00:36:30] Multi Human Chaos

Ryan Lopopolo: So we have had, in human space, to spend a lot of time doing synchronous knowledge sharing. We have a daily standup that's 45 minutes long because we almost have to fan out the understanding of the current state.

swyx: Yeah, I was gonna say this is good for single human, multi-agent, but multi human, multi-agent is a whole, like, explosion of stuff.

Ryan Lopopolo: Yeah. And this is fundamentally why we have such a rigid, like 10,000 [00:37:00] engineer level architecture in the app, because we have to find ways to carve up the space so people are not trampling on each other.

swyx: Sorry, I don't get the 10,000 thing.
Did I miss that?

Ryan Lopopolo: The structure of the repository is like 500 npm packages. It's architecture to excess for what you would consider, I think, normal for a seven person team. But if every person is actually like 10 to 50, then the numbers on being super, super deep into decomposition and sharding and proper interface boundaries make a lot more sense.

swyx: Yeah. To me, that's why I talked about micro-frontends, and Nx is from that world, but cool. It's just coming back to this. I dunno if you have other thoughts on orchestrating so much work going through this. Is this enough? Any aha moments?

Vibhu: It'll be interesting to see where, okay, so right now you pick Linear as your issue tracker, right?

swyx: Or it's like a, is it actually Linear? This is actually Linear.

[00:37:55] Linear vs Slack Workflow

Vibhu: Oh, that's Linear. It's Linear.

swyx: Oh, I never looked at

Vibhu: the video. The demo video I had to download to [00:38:00] run.

swyx: So I, because I'm a Slack maxi, but yeah, Linear is also really good.

Ryan Lopopolo: Yes, we do make good use of Slack. We fire off Codex to do all these lotion, elasticity, fix ups, the things that sync that knowledge into the repository. It's super cheap. Yeah.

swyx: Yeah.

Ryan Lopopolo: Just do it in Codex.

swyx: My biggest plug is OpenAI needs to build Slack. You need to own Slack. Build yours. Turn this into Slack.

Ryan Lopopolo: I did read about it.

swyx: You did?

Ryan Lopopolo: Yeah.

[00:38:25] Collaboration Tools for Agents

Ryan Lopopolo: I would say that if we think that we want these agents to do economically valuable work, which is the mission, right? We want AI to be deployed widely, to do economically valuable work. Then we need to find ways for them to naturally collaborate with humans, which means collaboration tooling, I think, is an interesting space to explore.

swyx: Yeah, totally. Yeah.
GitHub, Slack, Linear.

Vibhu: Yeah, that was my thing. Okay, where do we see, right now Codex started as the Codex model, then the CLI, now there's an app. The app can let me shoot off multiple Codexes in parallel, but there's no great team collaboration for Codex. And it [00:39:00] seems like your team had some say into what comes out, right? So you talked to 'em, Codex kind of was a thing. From there, if you guys are on the bound, what stuff that you might not focus on do you expect other people to be building, right? So people that are, like, 5x, 50x-ing. Should you build stuff that's very niche for your workflow, for your team? Should it be more general so other people can adopt? Is there a niche there? 'Cause part of it is just, okay, is everything just internal tooling? Do we have everything our own way? Like the way our team operates has our own ways that we like to communicate, or is there a broader way to do it? Is it something like an issue tracker? Just thoughts, if you wanna riff on that.

[00:39:35] Standardizing Skills and Code

Ryan Lopopolo: I think TBD; we have not figured this out in a general way. I do think that there is leverage to be had in making the code and the processes as much the same as possible. If you think that code is context, code is prompts, it's better from the agent behavior perspective to be able to look in a package in directory X, Y, Z, and for it not to have to page so [00:40:00] deeply into directory A, B, C, because they have the same structure, use the same language, they have the same patterns internally. And that same leverage comes from aligning on a single set of skills that you're pouring every engineer's taste into, to make sure that the agent is effective. So in our codebase, we have, I think, six skills. That's it. And if some part of the software development loop is not being covered, our first attempt is to encode it in one of the existing set of skills, which means that we can change the agent behavior. Yeah.
More cheaply than changing the human driver behavior.

swyx: Yeah.

[00:40:39] Self Improvement via Logs

swyx: Have you experimented with agents changing their own behavior?

Ryan Lopopolo: We do.

swyx: Yeah. Or a parent agent changing a subagent's behavior, or something like that.

Ryan Lopopolo: We have some bits for skill distillation. So for example, there's one neat thing you can do with Codex, which is just point it at its own session logs and ask it to tell you how you can use [00:41:00] the tool better.

swyx: It's like introspection

Ryan Lopopolo: or ask it to do things. I use

Vibhu: this session better. What skills should I

swyx: high? I like the modification of, you can just do things, to, you can just ask the agent to do things.

Ryan Lopopolo: Yeah. You can just Codex things. This is like a silly emoji that we have, right? You can just Codex things, you can just prompt things. It's a really glorious future we live in. But okay, you can do that one-on-one. But we're actually slurping these up for the entire team into blob storage and running agent loops over them every day to figure out where, as a team, we can do better, and how do we reflect that back into the repositories? Yes, so everybody benefits from everybody else's behavior for free. Same for PR comments, right? These are all feedback. That means the code as written deviated from what was good. A PR comment, a failed build, these are all signals that mean at some point the agent was missing context. We gotta figure out how to

swyx: Yeah.

Ryan Lopopolo: slurp it up and put it back in the repo.

swyx: By the way, I do this exactly, right. When I use Claude Code for [00:42:00] knowledge work, Claude Cowork is a nice product, right? Yes. I think you would agree. I always have it tell me, what do I do better next time?
And that's the meta programming reflection thing. So I almost think, you have six abstraction levels in Symphony, and then almost like the zeroth layer. So the six levels are policy, configuration, coordination, execution, integration, observability. We've talked about a couple of these, but the zeroth layer is like, okay, are we working well? Can we improve how we work? Yes. Can I modify my own workflow.md or something? I don't know.

Ryan Lopopolo: Yeah, of course. Yeah, of course you can. This thing is also able to cut its own tickets, 'cause we give it full access. Yeah. Make it a ticket to have it cut tickets. You can put in the ticket that you expect it to file follow-up work.

swyx: Like, yeah. Self-modifying. Yeah.

Ryan Lopopolo: Yeah.

[00:42:44] Tool Access and CLI First

Ryan Lopopolo: Don't put the agent in a box. Give the agent full accessibility over its domain.

swyx: I had a mental reaction when you said don't put the agent in a box. So I think you should put it in a box. It's just that you're giving the box everything it needs.

Ryan Lopopolo: Yeah. Context and tools.

swyx: But we're, as developers, used to calling [00:43:00] out to different systems, but here you use the open source things like the Prometheus, whatever, and you run it locally so that you can have the full loop, I assume.

Ryan Lopopolo: Yep.

Vibhu: I think like

Ryan Lopopolo: another, you wanna minimize cloud dependencies.

Vibhu: You also want to make sure that you think about what the agent has access to. What does it see? Does it go back into the loop? Like from the most basic sense of, you let it see its own calls and traces, it can determine where it went wrong. But are you feeding that back in? So, you know, just the most basic level of, you wanna see exactly what's input and output. Does the agent have access to what is being outputted, right? It can self-improve a lot of these things. It's all

Ryan Lopopolo: text, right?
My job is to figure out ways to funnel text from one agent to the other.

swyx: It's so strange. Like way back at the start of this whole AI wave, Andrej was like, English is the hottest new programming language. It's here, it's just, yeah. The future as well.

Vibhu: A lot of, okay. Like a lot of software, a lot of stuff, there's a GUI, it's made for the human. We're seeing the evolution of CLI for everything, right? All tools have CLIs. Your agents can use [00:44:00] them well. Do we get good vision? Do we get good little sandboxes? Like right now, it's a really effective way, right? Models love to use tools. They love the best. They love to read through text. So slap a CLI on it, let it go loose. That works for everything.

Ryan Lopopolo: It does. Yeah. Yeah.

[00:44:14] UI Perception and Rasterizing

Ryan Lopopolo: We've also been adapting non-textual things to that shape in order to improve model behavior in some ways, right? We want the agent to be able to see the UI. Agents do not perceive visually in the same way that we do. They don't see a red box, they see "red box button", right? They see these things in latent space. So if we want, hey, yeah, I do.

swyx: We have a ding that goes off every time. Latent space.

Ryan Lopopolo: Ding. Anyway, if we wanna actually make it see the layout, it's almost easier to rasterize that image to ASCII and feed it in to the agent. Ha. And there's no reason you can't do both, right? To further refine how the model perceives the object it's [00:45:00] manipulating.

swyx: Cool. You wanna talk about a couple more of these layers that might bear more introspection or that you have personal passion for?

[00:45:07] Coordination Layer with Elixir

Ryan Lopopolo: I will say that the coordination layer here was a really tricky piece to get right.

swyx: Let's do it. Yep. I'm all about that. And this is Temporal core.

Ryan Lopopolo: This is where, when we turn the spec into Elixir, the model takes a shortcut, right?
Like it's, oh, I have all these primitives that I can make use of in this lovely runtime that has native process supervision. Which is, I think, a neat way to have taken the spec and made it more achievable by making choices that naturally map

swyx: Yeah.

Ryan Lopopolo: to the domain, right? In the same way that you would prefer to have a TypeScript monorepo if you are doing full stack web development, right? Because the ability to share types across the front end and backend reduces a lot of complexity.

swyx: That's what GraphQL used to be. I don't know if it's still alive.

Ryan Lopopolo: That's right. And because there are [00:46:00] no humans in the loop here, my own personal ability to write or not write Elixir doesn't really have to bias us away from using the right tool for the job. It is just wild.

swyx: Love it. I love it. Yeah. I wonder if any languages struggle more than others because of this? I feel like everyone has their own abstractions that would make sense. But maybe it might be slower, it might be more faulty, where you'd have to just kick the server every now and then. I don't know. I think the observability layer is really well understood. Integration layer, MCP is dead. I think all these are just a really interesting hierarchy to travel up and down. It's a common language for people working on the system to understand.

Ryan Lopopolo: The policy stuff is really cool, right? Yeah. You don't really have to build a bunch of code to make sure the system waits for the CI to pass.

swyx: It's institutional knowledge.

Ryan Lopopolo: Yeah. You just give it the gh CLI with some text that says CI has to pass. It makes the maintenance of these systems a lot easier.

[00:46:57] Agent Friendly CLI Output

swyx: Do you think that CLI maintainers need to [00:47:00] do anything special for agents or just as is?
It's good because, like, I don't think when people made the GitHub CLI, they anticipated this happening.

Ryan Lopopolo: That's correct. The gh CLI is fantastic. It's great super industry.

swyx: Everyone go try gh repo create, gh pull and then pull request number, right? gh pr, like, 153, whatever. And then it like pulls

Ryan Lopopolo: Basically my only interaction with the GitHub web UI at this point is gh pr view dash web.

swyx: Exactly.

Ryan Lopopolo: Glance at the diff and be like, sure thing, send it. Yeah. But the CLIs are nice 'cause they're super token efficient and they can be made more token efficient really easily. Like I'm sure you all have seen, I go to Buildkite or Jenkins and I just get this massive wall of build output. And in order to unblock the humans, your developer productivity team is almost certainly gonna write some code that parses the actual exception out of the build logs and sticks it in a sticky note at the top of the page. And you basically [00:48:00] want CLIs to be structured in a similar way, right? You're gonna want to pass dash silent to Prettier, because the agent doesn't care that every file was already formatted. It just wants to know it's either formatted or not, so it can then go run a write command. Similarly, in our pnpm distributed script runner, when we had one, when you do dash recursive, it produces an absolute mountain of text. But all of that is for passing test suites. So we ended up wrapping all of this in another script

swyx: to suppress the,

Ryan Lopopolo: which you can vibe, to then only output the failing parts of the tests.

swyx: You make it pipe errors versus the standard, standard out. I don't know. Okay. Whatever. Too much thinking to have to do that. I used to maintain a CLI for my company and yeah, this is very core to my heart. But you're vibing my job.

Ryan Lopopolo: That's right.

swyx: Cool. Any other things? This is a long spec. [00:49:00] I appreciate that.
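
[Editor's sketch, not from the episode: the wrapper script Ryan describes, the one that only outputs the failing parts of the tests, might look roughly like the Python below. The `PASS `/`FAIL ` line prefixes and the indented detail lines are an assumed log format chosen for illustration, not the real output of pnpm or any particular test runner, and `failing_only` and `summarize` are hypothetical names.]

```python
from typing import Iterable, List


def failing_only(lines: Iterable[str]) -> List[str]:
    """Keep each FAIL line plus the indented detail lines directly under it.

    Assumed format (hypothetical): one "PASS <name>" or "FAIL <name>" line
    per test, with failure details indented on the lines that follow.
    """
    kept: List[str] = []
    in_failure = False
    for line in lines:
        if line.startswith("FAIL "):
            in_failure = True
            kept.append(line)
        elif in_failure and line.startswith((" ", "\t")):
            kept.append(line)  # detail belonging to the current failure
        else:
            in_failure = False  # passing test or unrelated noise: drop it
    return kept


def summarize(lines: Iterable[str]) -> List[str]:
    """Return only the failures, or a single short line when nothing failed."""
    failures = failing_only(lines)
    return failures if failures else ["ok: all tests passed"]


if __name__ == "__main__":
    sample = [
        "PASS packages/a unit",
        "FAIL packages/b unit",
        "  expected 1, received 2",
        "PASS packages/c unit",
    ]
    print("\n".join(summarize(sample)))
```

[The design choice is the one made in the conversation: passing suites cost tokens and carry no signal for the agent, so the wrapper collapses them to nothing, or to one line when everything passed.]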
It's got a lot of strong opinions in here. Any other things that we should highlight? I think obviously you can spend the whole day going through some of these, but I do think that some of these have a lot of care, or some of this you might wanna tell people, "Hey, take this, but make it your own."
[00:49:15] Blueprint Spec and Guardrails
Ryan Lopopolo: Fundamentally, software is made more flexible when it's able to adapt to the environment in which it is deployed, which means that things like Linear or even GitHub are specified within the spec, but are not required pieces of it. There's a more platonic ideal of the thing, where you could swap in Jira or Bitbucket, for example. But being able to tightly specify things like the ID formats, or how the Ralph loop works for the individual agents, basically means you can get up and running with a fully specified system quickly that you then evolve later on. I think we never intended for this to be a static spec that you can [00:50:00] never change. It's more like a blueprint to get something worth using as a starting point up and running.
swyx: Yeah.
Ryan Lopopolo: For you then to vibe later to your heart's content.
swyx: You have code and scripts in here where it's, oh, I think this is a really good prompt. It's just a very long prompt.
Ryan Lopopolo: Fundamentally, the agents are good at following instructions, so give them instructions, and it will improve the reliability of the result. Much like the way we use Symphony, we don't want folks to have to monitor the agent as it is vibing the system into existence. So being very opinionated and very strict around what these success criteria are means that our deployment success rate goes up. Yeah. It means we don't have to get tickets on this thing.
Vibhu: I think it all goes back to that idea that code is disposable, right? Early on, when you had the CLI or you'd kick off a Codex run, it would take two hours.
You would wanna monitor: okay, I'm in the workflow of just using one. I don't want it to go down the wrong path. I'll cut it off and just shoot off four. That was my favorite thing of the Codex app, right? Yeah. Just 4x it. [00:51:00] It's okay. One of them will probably be right, one of them might be better. Stop overthinking it. My first example was probably deep research. When you put out deep research, I asked it something about LLM, it thought it was legal something and spent an hour, and came back with a report completely off the rails. And I was like, okay, I gotta monitor this thing a bit. No, don't monitor it. You want to build it so that it goes the right way. And you don't wanna sit there and babysit, right? You don't want to babysit your agents
Ryan Lopopolo: With that deep research query that you made, looking at the bad result, you probably figured out you needed to tweak your prompt a bit, right? That's that guardrail that you fed back into the code base, or for the task, your prompt, to further align the agent's execution. The same sort of concept applies there too.
swyx: When you talk, how are the customers feeling
Ryan Lopopolo: for Symphony? I think we have none, right? This is a thing we have put out into the
swyx: world. Symphony's internal, right? As long as you are happy, you are the customer. That'
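
The point made in the "Agent Friendly CLI Output" segment above, wrapping a noisy test runner so that only the failing parts reach the agent, can be sketched roughly as follows. This is a minimal illustration, not the script described in the conversation: the failure markers (`FAIL`, `FAILED`, `ERROR`, `Error`), the function names, and the one-line context window are all assumptions, and real runners format their output differently.

```python
import re
import subprocess

# Assumed failure markers; adjust to whatever your test runner actually prints.
FAILURE_PATTERN = re.compile(r"\b(?:FAIL(?:ED)?|ERROR|Error)\b")


def failing_lines(output: str, context: int = 1) -> list[str]:
    """Keep lines that look like failures, plus `context` lines after each."""
    lines = output.splitlines()
    keep: set[int] = set()
    for index, line in enumerate(lines):
        if FAILURE_PATTERN.search(line):
            keep.update(range(index, min(len(lines), index + context + 1)))
    return [lines[i] for i in sorted(keep)]


def summarize(output: str, exit_code: int) -> str:
    """Collapse a passing run to 'ok'; otherwise return only the failing parts."""
    if exit_code == 0:
        return "ok"
    kept = failing_lines(output)
    # If nothing matched the markers, fall back to the full output
    # rather than hiding a failure from the reader.
    return "\n".join(kept) if kept else output


def run_quietly(command: list[str]) -> str:
    """Run a command and return a token-efficient summary of its output."""
    result = subprocess.run(command, capture_output=True, text=True)
    return summarize(result.stdout + result.stderr, result.returncode)
```

A wrapper like this would be called as, for example, `run_quietly(["pnpm", "--recursive", "test"])`; the fallback to the full output when no marker matches is a deliberate choice so that an unrecognized failure format is never silently swallowed.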

Building Better Games
E124: AI Can't Replace Game Producers. So Why Are They Getting Cut?

Apr 7, 2026 · 23:07


If you're a leader in game dev who feels stuck, able to spot problems but struggling to make a real difference, there is a path forward that levels up your leadership and accelerates your team, game, and career. Sign up here to learn more: https://forms.gle/nqRTUvgFrtdYuCbr6

Is your job being replaced by an LLM, or are you just doing the wrong job? The claim has been made that AI can handle 85% of management tasks. For game producers, this sounds like a death knell, but only if you believe your value lies in shuffling Jira tickets and taking meeting notes. In this episode, we break down the fundamental misunderstanding of "productivity" in game dev. We explore why LLMs are masters of the "passing high school grade" and why the most vital, indirect value a producer provides remains entirely out of reach for even the most sophisticated AI.

What You'll Learn in This Episode:
  • Why "being busy" can actually hurt your career
  • How to find the invisible value you uniquely bring
  • The risk of using AI without real expertise
  • Why producers must evolve beyond task tracking

Scrum Master Toolbox Podcast
BONUS #NoEstimates, Throughput, and the Superstition of Project Management With Felipe Engineer-Manriquez

Apr 4, 2026 · 50:47


BONUS: Why Your Plan Is Lying to You — #NoEstimates, Throughput, and the Superstition of Project Management This episode is a cross-post from The EBFC Show, Felipe Engineer-Manriquez's podcast exploring Lean and Agile in construction. In this conversation, Felipe interviews Vasco about the #NoEstimates movement, throughput-based planning, and why traditional project management is still stuck in the middle ages of managing creative work. The Human Side of Scrum That the Scrum Guide Doesn't Cover "When you go into a daily meeting and you start looking at the people in that room, maybe they are the exact same people that were there yesterday, but the team is totally different. Somebody might have had a bad night's sleep, somebody might have had an argument with their spouse. These are human beings. These are not machines that you can just distribute work to."   Vasco's path to agile coaching started with a realization that most practitioners eventually reach: the problems in software development aren't technological. They're about people — getting agreements, sharing information at the right time, making the collective brain of a team actually function. The Scrum Guide gives you organizing principles — how many meetings, who's in them — but it says almost nothing about the real-time feedback cycle between humans that makes or breaks a team. That's why the Scrum Master role exists: to be the lubricant for human interactions, to break down complex ideas into items the collective mind can process. It's the piece that makes Scrum work, and it's the piece that's hardest to teach. From Project Manager to #NoEstimates — The Bet That Changed Everything "The PM wanted 15 items per sprint, and the team said 'yeah, we can do 15.' I said, this is not gonna happen. The team had been delivering between five and eight items per sprint. I said, I'm gonna be positive — I'm gonna say seven. And no surprise, by the end of the sprint, they delivered seven."   
Vasco started as a project manager — and not the easy certification kind. He went through IPMA, which means six months of training, a four-hour written exam, and an expert interview, just for the entry level. Planning and estimating was the job. Then he ran his first Scrum project, specifically to prove it couldn't work. By the second month, he couldn't understand how anything else could work. The team delivered something to show every single sprint — something that never happened with traditional project management. The turning point came when he made a bet with a product manager: the PM needed 15 items per sprint, the team committed to 15, but historical throughput was 5-8 items. Reality delivered seven. That moment crystallized the #NoEstimates insight: we can't fight reality, but we can choose which seven items to deliver. Reality Is a Bitch — Why Linear Predictive Planning Fails "Never believe the plan. Or as in Scarface — never get high on your own supply. It's so unbelievable how project managers still today believe their freaking plans."   At Nokia, Vasco managed a program of 500 people across 100 teams on four continents. No way to get everyone in a room. So he tracked system-level throughput — features delivered to integration per week. Six months into a twelve-month project, the data said they'd be at least six months late. He told the program manager: cut scope now. The program manager did what every PMI-trained program manager does — sent an email asking all 100 teams if they'd deliver on time. Every single team said yes. Nobody wants to be first to admit they're late. Twelve months in, they discovered they were six months late. The project got canceled. 500 people, millions of euros, all because somebody believed the plan. Linear predictive planning is useful for exploring what might be possible if nothing goes wrong. It is not reality. The only tool that reflects reality is throughput — the number of items completed per unit of time. 
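
To make the throughput idea concrete, here is a minimal sketch of a throughput-based forecast: divide the remaining work by the average number of items actually completed per week. The function names and all numbers are hypothetical illustrations, not data from the Nokia program described above, and a plain average is only one of several ways to project throughput.

```python
import math


def forecast_weeks_remaining(completed_per_week: list[float], remaining_items: int) -> int:
    """Weeks still needed if the team keeps delivering at its observed average rate."""
    if not completed_per_week:
        raise ValueError("need at least one week of throughput history")
    average = sum(completed_per_week) / len(completed_per_week)
    if average <= 0:
        raise ValueError("average throughput must be positive to forecast")
    return math.ceil(remaining_items / average)


def weeks_late(completed_per_week: list[float], remaining_items: int, weeks_left_in_plan: int) -> int:
    """How far past the planned end date the forecast lands (0 if on time)."""
    needed = forecast_weeks_remaining(completed_per_week, remaining_items)
    return max(0, needed - weeks_left_in_plan)


# Hypothetical example: 4 items per week on average, 208 items left, 26 weeks left in the plan.
history = [3, 5, 4, 4]
print(weeks_late(history, remaining_items=208, weeks_left_in_plan=26))
```

The forecast needs no estimates at all, only a count of finished items per period, which is why it can be run against data a team already has.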
Earned Value Management — George Orwell at His Best "It's not earned, it's spent. It's not value, it's cost. It's not management, it's just observation. Monty Python could not have come up with a better name."   Felipe shares a story that mirrors the absurdity: an industrial project with a dedicated 35-person earned value management department. Before the meeting even started, the department head announced, "Let's all acknowledge that earned value management is more an art than a science." Their charts were made up, the contractor's charts were made up, and the goal of the meeting was to agree that the project would finish on time — regardless of what any data said. This is where traditional project management ends up when it disconnects from throughput: a $30 million scope addition with zero additional time, defended by charts that a mediocre attorney can invalidate in the first week of litigation. Felipe knows — he spent a year being cross-examined by forensic schedulers whose full-time job is proving that construction schedules are fiction. One Small Experiment to Test #NoEstimates "Never convince anyone. Convince yourself. Once you're convinced, whatever other people say, it doesn't really matter because you're not gonna take them seriously anyway."   Here's how to validate throughput-based planning with your own data: take the last 10 sprints (or periods). Calculate the average throughput and control limits from the first five. Then check whether the next five sprints fall within that range. They will. If you're in software and using Jira, you already have this data. You don't need anyone's permission. You don't need to change anything. Just look at what your team actually delivers versus what they planned to deliver. The gap between those two numbers is the gap between superstition and reality. 
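
The experiment described above can be sketched in a few lines. The episode notes do not say how the control limits are computed, so this sketch assumes an XmR (individuals) chart, where the limits are the mean plus or minus 2.66 times the average moving range; the sprint numbers and function names are made up for illustration.

```python
from statistics import mean


def control_limits(baseline: list[float]) -> tuple[float, float, float]:
    """Return (lower, center, upper) limits using the XmR-chart rule (an assumption)."""
    if len(baseline) < 2:
        raise ValueError("need at least two periods to compute a moving range")
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    spread = 2.66 * mean(moving_ranges)
    # Throughput cannot be negative, so the lower limit is floored at zero.
    return max(0.0, center - spread), center, center + spread


def within_limits(baseline: list[float], later: list[float]) -> list[bool]:
    """Check each later period against the limits derived from the baseline."""
    lower, _, upper = control_limits(baseline)
    return [lower <= value <= upper for value in later]


# Hypothetical items-completed-per-sprint for the last 10 sprints.
throughput = [6, 7, 5, 8, 7, 6, 8, 7, 5, 7]
first_five, next_five = throughput[:5], throughput[5:]
print(control_limits(first_five))
print(within_limits(first_five, next_five))
```

Swapping in a different limit rule, such as the mean plus or minus three standard deviations, only changes `control_limits`; the check on the later five sprints stays the same.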
About Felipe Engineer-Manriquez Felipe Engineer-Manriquez is a best-selling author, international keynote speaker, Project Delivery Services Director at The Boldt Company, host of The EBFC Show podcast, and a proven construction change-maker implementing Lean and Agile practices on projects from millions to billions of dollars worldwide. He is a Registered Scrum Trainer™ (RST), Registered Scrum Master™ (RSM), and recipient of the Lean Construction Institute Chairman's Award. His book Construction Scrum is the first practical guide for applying Scrum in construction.   You can link with Felipe Engineer-Manriquez on LinkedIn.

Developer Tea
Useful Illusions and Exploiting Heuristics

Apr 1, 2026 · 27:20


When Good Thinking Becomes Overthinking: Discover why the pursuit of perfect analysis often undermines good decision-making. Loading every caveat, every exception, and every alternative into your working memory doesn't produce better outcomes — it produces paralysis. Heuristics as a Feature, Not a Bug: Your brain is an efficiency machine that creates shortcuts — cached concepts, stored routines, snap judgments. These heuristics are always incomplete, but they let you move through complex problems quickly. The opportunity is to deliberately choose which heuristics to exploit. "All Models Are Wrong, Some Models Are Useful": Useful illusions don't need to be perfectly true. They need to be true enough that acting on them produces better outcomes than endlessly debating their accuracy. Useful Illusion: Coding by Hand Is Going Away: Whether or not this is literally true in every case, the engineer who acts as if it is will invest in agentic workflows, LLMs, and new tooling — while the engineer who picks the argument apart risks being labeled a skeptic and falling behind. Useful Illusion: Hard Work Pays Off: You can poke holes in this all day — wrong direction, burnout, culture-dependent — but people who follow this heuristic tend to build reputations as reliable and capable. Few of us want to be known for the opposite. Useful Illusion: As Long As I'm Learning, I'm Growing: Learning becomes less directly correlated with career advancement over time, but continuing to act on this belief keeps you flexible, curious, and in a growth mindset. More Useful Illusions for Your List: Clean code is better. Always think about the user's experience. Go with the tool you know. Volume of delivered work correlates with career success — especially during performance review season. The Key Insight: You don't have to believe any of these things literally. You're exploiting your own heuristic system to drive efficient action and avoid wasting time on low-utility debates. 
The result is a more decisive, action-oriented version of yourself.

ITSPmagazine | Technology. Cybersecurity. Society
Vulnerability Management in the Age of AI: From Data Overload to Decisive Action | A Brand Spotlight at RSAC Conference 2026 with Daniel DeCloss, Founder & CTO of PlexTrac

Apr 1, 2026 · 19:37


Security teams have always struggled with the gap between finding vulnerabilities and fixing the right ones. DeCloss built PlexTrac after seeing that gap firsthand as a penetration tester -- watching critical findings disappear into static PDFs and manual spreadsheets with no real tracking, no accountability, and no way to demonstrate improvement. The platform was designed from the ground up to close that loop. The conversation gets specific about what contextual risk scoring actually means. A CVE rated 10.0 in the National Vulnerability Database may be irrelevant to a given organization; a lower-severity finding may be critical given the systems that organization actually runs. PlexTrac's newly launched MCP server correlates vulnerability data against real-world environmental context, making that distinction automated and actionable -- not something an analyst has to puzzle out manually every time. DeCloss walks through what the before state looks like for most teams: an annual pentest PDF, weekly scanner output, no unified view, and spreadsheet-based assignment that makes it nearly impossible to track who is working on what or whether anything is actually getting resolved. PlexTrac replaces that with a normalized, integrated platform that connects to Jira, ServiceNow, and Azure DevOps -- keeping workflows intact while adding the visibility that was always missing. On AI's role in the industry, DeCloss is measured but direct. AI is a force multiplier, not a job eliminator. Security has always operated with a talent shortage, and automation fills that gap. But AI also expands the attack surface -- and organizations that adopt it without a security framework create new exposure. The human in the loop, with real subject matter expertise, remains essential. This is a Brand Spotlight. A Brand Spotlight is a ~15 minute conversation designed to explore the guest, their company, and what makes their approach unique. 
Learn more: https://www.studioc60.com/creation#spotlight

GUEST
Daniel DeCloss, Founder & CTO, PlexTrac
https://www.linkedin.com/in/ddecloss/

RESOURCES
PlexTrac: https://plextrac.com

KEYWORDS
Daniel DeCloss, PlexTrac, Sean Martin, vulnerability management, penetration testing, pentest reporting, risk prioritization, CVE scoring, MCP server, AI in cybersecurity, blue team, remediation tracking, CTEM, continuous threat exposure management, RSAC Conference 2026, brand spotlight, brand marketing, marketing podcast, brand story
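
The idea of contextual risk scoring mentioned in the notes above, where a 10.0-rated CVE can matter less than a lower-severity finding on a system the organization actually depends on, can be illustrated with a toy model. This is not PlexTrac's scoring method or API; the `Finding` type, the criticality weights, and the multiply-by-weight rule are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    name: str
    base_score: float       # generic severity, e.g. a CVSS base score from 0.0 to 10.0
    affected_product: str   # hypothetical identifier for the affected system


def contextual_score(finding: Finding, asset_criticality: dict[str, float]) -> float:
    """Weight the generic score by how much the affected system matters here.

    Products the organization does not run get weight 0.0 (an assumed rule).
    """
    weight = asset_criticality.get(finding.affected_product, 0.0)
    return finding.base_score * weight


def prioritize(findings: list[Finding], asset_criticality: dict[str, float]) -> list[Finding]:
    """Order findings by contextual score, highest first."""
    return sorted(
        findings,
        key=lambda finding: contextual_score(finding, asset_criticality),
        reverse=True,
    )


# Hypothetical environment: only the payments API is in use, and it is critical.
criticality = {"payments-api": 1.0}
findings = [
    Finding("critical-rated issue in unused software", 10.0, "legacy-ftp"),
    Finding("medium-rated issue in payments API", 6.5, "payments-api"),
]
print([finding.name for finding in prioritize(findings, criticality)])
```

In this sketch the medium-rated finding sorts first because the critical-rated one affects software the organization does not run.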

Developer Tea
Decision Making is Your New Core Skill, So it's Critical to Avoid These Two Traps of Collaborative Decision-Making

Mar 24, 2026 · 38:30


The Bottleneck Is Moving: Borrowing from traditional manufacturing theory, the coding step used to define your team's total throughput. AI tooling hasn't incrementally improved that bottleneck — it has drastically shrunk it, which means the constraint is now upstream in product decisions, specifications, and prioritization. Engineers who recognize this shift early will redirect their energy accordingly. Sharing Your Opinion Is Not a Free Action: Every time you weigh in on a decision, you're making a transaction. You're asking others to consider your input, and in return, they will update their beliefs about your judgment based on whether you turn out to be right. This means your credibility is a finite resource that appreciates or depreciates over time. Trap #1 — Arguing About Things You Don't Care About: Engineers often feel an intellectual itch to engage when they hear an argument they disagree with, even when the outcome doesn't matter to them. If the only utility of sharing your opinion is your own self-satisfaction, the risk to your social capital almost never justifies the reward. Pick your battles so that when something does matter to you, people actually listen. The Watchful Waiting Approach: If you predict a decision will lead to a bad outcome, sometimes the most effective move is to wait and let the result speak for itself. You get the learning for free without putting your reputation on the line — especially for decisions outside your core responsibilities. Trap #2 — Arguing on the Wrong Axis: When you do engage, make sure your argument is aligned with what the decision-maker actually cares about. A product manager asking engineers to delay optimization work is not going to be moved by arguments about on-call load. An engineering manager introducing a systems design interview won't be swayed by the fact that you personally dislike them. If your reasoning doesn't connect to their criteria, it lands as noise. 
Naive Realism and the Alignment Fix: We all default to believing our perspective is the balanced, unbiased one. This tendency causes us to assume anyone who disagrees must be missing information. The fix is to start by understanding what the other person is optimizing for. Once you know their criteria, you can either recognize their decision is perfectly reasonable — or reframe your argument in terms they actually care about. The One Takeaway: Understand what the other person wants, what they care about, and why. Decision-making in a collaborative environment is fundamentally negotiation, and the best negotiators optimize for multiple axes rather than treating every disagreement as zero-sum.

Scrum Master Toolbox Podcast
Product Owner Anti-Patterns, From Team Owner to Product Owner, And The PO Who Got It Right

Mar 13, 2026 · 16:07


Junaid Shaikh: Product Owner Anti-Patterns, From Team Owner to Product Owner, And The PO Who Got It Right Junaid opens with a line that cuts straight to the most common PO anti-pattern: "You are the product owner, not the team owner." When he sees a PO slipping into command-and-control mode, he asks them one question: "What is your role?" They say "Product Owner." He says: "Exactly. You own the product, not the team. If you were meant to own the team, we'd call you a project manager." The worst case he witnessed: a PO who was so possessive of "his" team that he required approval on everything — processes, tools, even holiday requests. In sprint planning, he would assign stories to individual team members ("Mr. X, you take this one"). He'd estimate the work himself, and when developers pushed back, he'd override them: "I was a developer, I know how long this takes." For approaching PO anti-patterns, Junaid has a deliberate style: he doesn't confront upfront. He observes, takes notes, and starts by solving a smaller impediment to demonstrate he's there to help. Once trust is built, he brings in coaching tools — first teaching the basics ("this is what the PO role is in Scrum"), then gradually coaching on specific anti-patterns observed in practice. He targets 10-15% improvement at a time. Six months later, you've already achieved 30-40% improvement. The best PO Junaid has worked with had four qualities: clear, concise communication; an open mindset willing to be coached; courage to say "no" when needed; and the discipline to define the "what" and leave the "how" to the team. This PO started with five sources of truth — Excel tabs, whiteboards, JIRA, and other tools. When Junaid pointed out that five sources of truth is the opposite of transparency (one of Scrum's three pillars), the PO asked for help. Junaid's response: "I can't do the push-ups for you." Together, they consolidated everything into one tool. 
The team was happier, and the PO managed the backlog much better. The key lesson: great product owners trust their team, communicate clearly, prioritize ruthlessly, and have the courage to say no. And they don't try to own the team. You can link with Junaid Shaikh on LinkedIn.

Scrum Master Toolbox Podcast
The "Death of Agile" and Why It's Really the Death of Empowerment That Should Frighten Us | Nigel Baker

Mar 4, 2026 · 18:32


Nigel Baker: The "Death of Agile" and Why It's Really the Death of Empowerment That Should Frighten Us Read the full Show Notes and search through the world's largest audio library on Agile and Scrum directly on the Scrum Master Toolbox Podcast website: http://bit.ly/SMTP_ShowNotes.   "It's not so much the death of Agile that's killing me, or death of Scrum. It's the death of things like empowerment, the death of things like empiricism. Those are the things that frighten me in work." - Nigel Baker   Nigel brings a challenge that resonates across the entire Agile community: the so-called "death of Agile." But he quickly reframes the conversation in a way that cuts much deeper. The real issue isn't whether teams call what they do Scrum or Agile—it's that the industry is decaying back past waterfall to what Nigel calls feudalism, where a single "great man" dictates and everyone else follows.  He distinguishes between two kinds of popularity: the number of people saying they're doing Agile versus the number of people actually liking what they're doing—a gap he compares to Jira's massive subscriber base versus its actual user satisfaction. Through this lens, Nigel introduces his famous "Nigel Scale"—a joke he made on a Scrum Alliance forum 20 years ago that people took entirely seriously. The scale separates Scrum into three levels: core practices that break things if you skip them (like a surgeon disinfecting hands), contextual good practices that may or may not apply (like story points), and persistent anti-patterns that never work no matter how many times people try (like normalizing team measurements across teams).  Vasco and Nigel converge on an experiment: treat Scrum adoption itself as a backlog of changes, introducing practices incrementally based on feedback—but always with a compelling vision of why the change matters.   
Self-reflection Question: When you hear "Agile is dead," are you defending a framework, or are you advocating for the underlying principles of empowerment and empiricism that teams genuinely need?

Developer Tea
AI Moves the Bottleneck - Are You Ready for What That Means For Your Career?

Mar 3, 2026 · 29:52


AI is bringing massive changes to our industry, but it's not just about how fast you can write code or use agentic flows. In this episode, I explore how AI is fundamentally shifting the economic bottleneck of software development, and how you can use your systems-thinking engineering mindset to adapt and thrive in this new era.