I'm setting out to meet interesting people doing awesome work in the world of DevOps. From the creators of your favorite tools to the organizers of amazing conferences, from the authors of great books to fantastic public speakers. I want to introduce you to the most interesting people I can find.
When I heard about a company doing observability on robots in the physical world, I was hooked--I had to know more! Thankfully, my guest this week was all too happy to talk about what he and his team at Formant.io are doing, how it works, and the challenges they run into. Some stuff you can look forward to in this episode: the robot just got stuck in a puddle--how does it know? What considerations do you make for the safety of humans around the robot, so they don’t end up getting whacked by it? And, of course, a whole lot more.
A mentor is one of those things that everyone wants but no one seems to know how to get. My guest this episode is Aaron Sachs, who cares quite a bit about mentorship in the tech world and has quite a lot of advice on finding a mentor, being a good mentee (and mentor!), plus thoughts on what mentorship is and isn’t.
One of the most interesting conversations I’ve had was the day I learned about Andrew Rodgers using Graphite to monitor a million-dollar furnace in a manufacturing plant. Since then, he’s gone on to start a company that uses the same observability tooling you know and love (Influx, Kafka, Cassandra, Grafana, etc.) to solve observability challenges in the physical world, such as tracking energy consumption in the hundreds of government buildings in Washington, D.C. If you’re at all interested in unique uses of software tooling, this episode is a fun one.
Many people view the relationships they have with their vendors, whether SaaS or service companies, somewhere along a line between disdain and indifference. But it doesn’t need to be that way. In fact, vendors can be a crucial component of your team and help make your company’s vision a reality--provided you know how to work with your vendors. My guest this week is an expert on this very thing, and he’s joined me to talk through it all.
Building communities is harder than it sounds, but they’re really nothing new. Humans have been forming communities since we’ve existed, so it’s only natural that technical communities are a thing, but they certainly have their unique challenges. How do you grow a technical community? How do you keep everyone happy and resolve arguments peacefully? How do you keep a community from stagnating? If you’re an organization trying to start a community around your product, how do you measure results? Mary Thengvall and I chat about her experiences with technical community growth and management on this week’s episode.
My guest this week is Kelly Shortridge, VP of Product Strategy at Capsule8, and we’re talking about infosec. We get into some interesting discussion: threat modeling, foundational security defense, why you’re totally screwed if a nation-state is after you (tip: they’re probably not), and why chaos engineering and ephemeral infrastructure are actually great for security. Also, we totally crap on security vendor FUD for a bit and talk about how to choose security tools that actually work.
Monitoring and observability is something near-and-dear to my own heart, so this week’s episode is exciting: Christine Yen, Cofounder & CEO of Honeycomb, joins me to talk about observability, why dashboards aren’t as helpful as you think, and the value of being able to ask questions of your own application and infrastructure when you’re troubleshooting.
Probably the most common question I received when I told people I was writing a book about monitoring was, “Have you read James Turnbull’s book?” I’m putting that to rest with a delightful conversation with James Turnbull on a variety of topics, including which of his own books is his favorite, some not-so-subtle digs at Kubernetes, and why James thinks DevOps is dead.
We’re happy to have VM (Vicky) Brasseur join us to talk about open source and dispel a few myths. We talk about what it means to properly license your code -- it’s not difficult -- and what it means for businesses that use open source code in their projects. Of course, we’ll dive deep into some tips on building a community around your open source project and talk about some ways to help continue to sustain open source projects and culture.
Ryn Daniels joins me in this episode to talk about building resilient cultures, particularly in Engineering teams. Ryn’s knowledge comes to us through two different incidents they experienced at two different companies, how the response differed at each, and what both teach us about building safe, learning environments and a resilient culture.
Baron Schwartz joins me for a delightful conversation about his technique for writing books, his thoughts on culture change, and wrapping up with a very real conversation about bias, privilege, and empathy that you won’t want to miss.
Dr. Nicole Forsgren, whom you may know from her incredible book Accelerate: Building and Scaling High Performing Technology Organizations, and The State of DevOps Report, joins me to talk about academic rigor and why the State of DevOps Report is so much more than most industry surveys floating around. We dig into what goes into the Report, why it matters to all of us, and some of her favorite findings from it.
Thai Wood, editor of Resilience Roundup, joins me to discuss on-call, incident management, and the latest in systems resiliency research. We get into incident command structures, how to make it work on a small team, on-call in the Emergency Medical Services (EMS) world and the parallels to DevOps, and a bunch of fun stuff with the academic research on resiliency.
This week, I talk with Cory Watson, Technical Director in the Office of the CTO at SignalFx. Cory formerly ran Observability at both Stripe and Twitter, and we discuss his switch from being the customer to working for the enemy--er, a vendor. What has the transition been like? How can you work effectively with your vendors? We also dig into Cory’s current and ongoing research into accessibility (a11y) in monitoring tools.
The average engineer has, what, two dozen jobs in their lifetime? Compensation has a compounding effect over your career--$10k left on the table early in your career can easily become a million lost later in your career. That’s why I went to my good friend Josh Doody to get advice on effectively negotiating job offers.
I’ve always looked up to the Arrested DevOps Podcast since it started so many years ago, so I’m super excited to have this week’s guest: Matty Stratton. We talk about all sorts of fun stuff, such as how career progression isn’t linear, how we’ve accidentally fallen into doing interesting work, and much more.
Trying to change directions in a 150-year-old organization is no easy feat. My guest this week is Chief Architect at the National Association of Insurance Commissioners, a nonprofit body charged with helping organize and enable US state-level insurance commissioners. To stay on top of the industry, they’re doing a full technical overhaul of everything and it’s pretty interesting stuff.
Join Emily Freeman and me for a discussion about what it’s like to write a technical book (hint: it’s awful) and the role of DevRel in the industry.
Ever thought hard about your company’s observability strategy and the challenges you’re facing? What about if your company spanned 70 countries, 90,000+ employees, and you were a bank? My guest certainly thinks about this regularly. In this episode, I speak with Greg Parker, the head of the Enterprise Monitoring Services team at Standard Chartered Bank, about what it takes to design and implement a global monitoring strategy in a complex environment.
Yan Cui, creator of the Production-Ready Serverless course and serverless consultant, joins us this episode to school Mike on serverless, talk about the real business value behind why an organization should be interested, and a whole lot of intricate details around this new paradigm.
Compliance and risk management get a bad reputation in engineering circles: there’s the “it’s just unnecessary overhead!” camp and also the “risk management is just the Department of No” camp. It doesn’t have to be that way, though. In this episode of Real World DevOps, I’m joined by Elliot Murphy, CEO of KindlyOps, to talk about how compliance and risk management can be forces of good and how to do that work without the stress-inducing toil and headache.
Burnout, depression, and other mental health struggles are rampant in the Ops profession thanks to our long hours and intense job pressure, so I set out to talk to a professional clinical psychologist on the topic. Dr. Sherry Walling and I discuss what burnout is, how it happens, and what steps you can take to avoid it or bring yourself (or a friend!) out of its clutches.
Dave Mangot joins Mike to give more thoughts and depth on his idea of “ops smells”: like the infamous “code smell,” Dave has identified a number of ops smells through his lengthy career in Ops/SRE. This episode covers a range of wonderful topics, including the dangers of outsourced ops teams, testing in production, and the value of consistency in your infrastructure.
Wherein Corey Quinn and Mike Julian pontificate about the dangers of perfect infrastructure, why multi-cloud is (probably) a dumb idea, and why your biggest risk in a large-scale disaster is your entire team quitting to help your competitor for 10x more money.
The wild world of systems in China may be different — and smaller — than you’d think. This episode’s guest is Steve Mushero, CEO of ChinaNetCloud and Siglos, who joins Mike to discuss the challenges and evolution of systems infrastructure in China. They also dig into what it could look like to standardize troubleshooting methods and the challenge of teaching troubleshooting to people.
The Database: the final frontier in the DevOps journey. Losing your company’s data would suck, but hand-crafted, artisanal database servers also suck. What do you do? This episode’s guest is Silvia Botros, Principal DBA at SendGrid, who joins Mike to talk about the DBA silo, better tooling, the woes of schema management, and more.