About MattMatt is an AWS DevTools Hero, Serverless Architect, Author and conference speaker. He is focused on creating the right environment for empowered teams to rapidly deliver business value in a well-architected, sustainable and serverless-first way.You can usually find him sharing reusable, well architected, serverless patterns over at cdkpatterns.com or behind the scenes bringing CDK Day to life.Links: AWS CDK Patterns: https://cdkpatterns.com The CDK Book: https://thecdkbook.com CDK Day: https://www.cdkday.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Yankins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And not, that is not me telling you to go away, it is: goteleport.com.Corey: This episode is sponsored in part by our friends at Rising Cloud, which I hadn't heard of before, but they're doing something vaguely interesting here. They are using AI, which is usually where my eyes glaze over and I lose attention, but they're using it to help developers be more efficient by reducing repetitive tasks. So, the idea being that you can run stateless things without having to worry about scaling, placement, et cetera, and the rest. They claim significant cost savings, and they're able to wind up taking what you're running as it is in AWS with no changes, and run it inside of their data centers that span multiple regions. I'm somewhat skeptical, but their customers seem to really like them, so that's one of those areas where I really have a hard time being too snarky about it because when you solve a customer's problem and they get out there in public and say, “We're solving a problem,” it's very hard to snark about that. Multus Medical, Construx.ai and Stax have seen significant results by using them. And it's worth exploring. So, if you're looking for a smarter, faster, cheaper alternative to EC2, Lambda, or batch, consider checking them out. Visit risingcloud.com/benefits. That's risingcloud.com/benefits, and be sure to tell them that I said you because watching people wince when you mention my name is one of the guilty pleasures of listening to this podcast.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined today by Matt Coulter, who is a Technical Architect at Liberty Mutual. You may have had the privilege of seeing him on the keynote stage at re:Invent last year—in Las Vegas or remotely—that last year of course being 2021. But if you make better choices than the two of us did, and found yourself not there, take the chance to go and watch that keynote. It's really worth seeing.Matt, first, thank you for joining me. I'm sorry, I don't have 20,000 people here in the audience to clap this time. They're here, but they're all remote as opposed to sitting in the room behind me because you know, social distancing.Matt: And this left earphone, I just have some applause going, just permanently, just to keep me going. [laugh].Corey: That's sort of my own internal laugh track going on. It's basically whatever I say is hilarious, to that. So yeah, doesn't really matter what I say, how I say it, my jokes are all for me. It's fine. So, what was it like being on stage in front of that many people? It's always been a wild experience to watch and for folks who haven't spent time on the speaking circuit, I don't think that there's any real conception of what that's like. Is this like giving a talk at work, where I just walk on stage randomly, whatever I happened to be wearing? And, oh, here's a microphone, I'm going to say words. What is the process there?Matt: It's completely different. For context for everyone, before the pandemic, I would have pretty regularly talked in front of, I don't know, maybe one, two hundred people in Liberty, in Belfast. So, I used to be able to just, sort of, walk in front of them, and lean against the pillar, and use my clicker, and click through, but the process for actually presenting something as big as a keynote and re:Invent is so different. For starters, you think that when you walk onto the stage, you'll actually be able to see the audience, but the way the lights are set up, you can pretty much see about one row of people, and they're not the front row, so anybody I knew, I couldn't actually see.And yeah, you can only see, sort of like, the from the void, and then you have your screens, so you've six sets of screens that tell you your notes as well as what slides you're on, you know, so you can pivot. But other than that, I mean, it feels like you're just talking to yourself outside of whenever people, thankfully, applause. It's such a long process to get there.Corey: I've always said that there are a few different transition stages as the audience size increases, but for me, the final stage is more or less anything above 750 people. Because as you say, you aren't able to see that many beyond that point, and it doesn't really change anything meaningfully. The most common example that you see in the wild is jokes that work super well with a small group of people fall completely flat to large audiences. It's why so much corporate numerous cheesy because yeah, everyone in the rehearsals is sitting there laughing and the joke kills, but now you've got 5000 people sitting in a room and that joke just sounds strained and forced because there's no longer a conversation, and no one has the shared context that—the humor has to change. So, in some cases when you're telling a story about what you're going to say on stage, during a rehearsal, they're going to say, “Well, that joke sounds really corny and lame.” It's, “Yeah, wait until you see it in front of an audience. It will land very differently.” And I'm usually right on that.I would also advise, you know, doing what you do and having something important and useful to say, as opposed to just going up there to tell jokes the whole time. I wanted to talk about that because you talked about how you're using various CDK and other serverless style patterns in your work at Liberty Mutual.Matt: Yeah. So, we've been using CDK pretty extensively since it was, sort of, Q3 2019. At that point, it was new. Like, it had just gone GA at the time, just came out of dev preview. And we've been using CDK from the perspective of we want to be building serverless-first, well-architected apps, and ideally we want to be building them on AWS.Now, the thing is, we have 5000 people in our IT organization, so there's sort of a couple of ways you can take to try and get those people onto the cloud: You can either go the route of being, like, there is one true path to architecture, this is our architecture and everything you want to build can fit into that square box; or you can go the other approach and try and have the golden path where you say this is the paved road that is really easy to do, but if you want to differentiate from that route, that's okay. But what you need to do is feed back into the golden path if that works. Then everybody can improve. And that's where we've started been using CDK. So, what you heard me talk about was the software accelerator, and it's sort of a different approach.It's where anybody can build a pattern and then share it so that everybody else can rapidly, you know, just reuse it. And what that means is effectively you can, instead of having to have hundreds of people on a central team, you can actually just crowdsource, and sort of decentralize the function. And if things are good, then a small team can actually come in and audit them, so to speak, and check that it's well-architected, and doesn't have flaws, and drive things that way.Corey: I have to confess that I view the CDK as sort of a third stage automation approach, and it's one that I haven't done much work with myself. The first stage is clicking around in the console; the second is using CloudFormation or Terraform; the third stage is what we're talking about here is CDK or Pulumi, or something like that. And then you ascend to the final fourth stage, which is what I use, which is clicking around in the AWS console, but then you lie to people about it. ClickOps is poised to take over the world. But that's okay. You haven't gotten that far yet. Instead, you're on the CDK side. What advantages does CDK offer that effectively CloudFormation or something like it doesn't?Matt: So, first off, for ClickOps in Liberty, we actually have the AWS console as read-only in all of our accounts, except for sandbox. So, you can ClickOps in sandbox to learn, but if you want to do something real, unfortunately, it's going to fail you. So.—Corey: I love that pattern. I think I might steal that.Matt: [laugh]. So, originally, we went heavy on CloudFormation, which is why CDK worked well for us. And because we've actually—it's been a long journey. I mean, we've been deploying—2014, I think it was, we first started deploying to AWS, and we've used everything from Terraform, to you name it. We've built our own tools, believe it or not, that are basically CDK.And the thing about CloudFormation is, it's brilliant, but it's also incredibly verbose and long because you need to specify absolutely everything that you want to deploy, and every piece of configuration. And that's fine if you're just deploying a side project, but if you're in an enterprise that has responsibilities to protect user data, and you can't just deploy anything, they end up thousands and thousands and thousands of lines long. And then we have amazing guardrails, so if you tried to deploy a CloudFormation template with a flaw in it, we can either just fix it, or reject the deploy. But CloudFormation is not known to be the fastest to deploy, so you end up in this developer cycle, where you build this template by hand, and then it goes through that CloudFormation deploy, and then you get the failure message that it didn't deploy because of some compliance thing, and developers just got frustrated, and were like, sod this. [laugh].I'm not deploying to AWS. Back the on-prem. And that's where CDK was a bit different because it allowed us to actually build abstractions with all of our guardrails baked in, so that it just looked like a standard class, for developers, like, developers already know Java, Python, TypeScript, the languages off CDK, and so we were able to just make it easy by saying, “You want API Gateway? There's an API Gateway class. You want, I don't know, an EC2 instance? There you go.” And that way, developers could focus on the thing they wanted, instead of all of the compliance stuff that they needed to care about every time they wanted to deploy.Corey: Personally, I keep lobbying AWS to add my preferred language, which is crappy shell scripting, but for some reason they haven't really been quick to add that one in. The thing that I think surprises me, on some level—though, perhaps it shouldn't—is not just the adoption of serverless that you're driving at Liberty Mutual, but the way that you're interacting with that feels very futuristic, for lack of a better term. And please don't think that I'm in any way describing this in a way that's designed to be insulting, but I do a bunch of serverless nonsense on Twitter for Pets. That's not an exaggeration. twitterforpets.com has a bunch of serverless stuff behind it because you know, I have personality defects.But no one cares about that static site that's been a slide dump a couple of times for me, and a running joke. You're at Liberty Mutual; you're an insurance company. When people wind up talking about big enterprise institutions, you're sort of a shorthand example of exactly what they're talking about. It's easy to contextualize or think of that as being very risk averse—for obvious reasons; you are an insurance company—as well as wanting to move relatively slowly with respect to technological advancement because mistakes are going to have drastic consequences to all of your customers, people's lives, et cetera, as opposed to tweets or—barks—not showing up appropriately at the right time. How did you get to the, I guess, advanced architectural philosophy that you clearly have been embracing as a company, while having to be respectful of the risk inherent that comes with change, especially in large, complex environments?Matt: Yeah, it's funny because so for everyone, we were talking before this recording started about, I've been with Liberty since 2011. So, I've seen a lot of change in the length of time I've been here. And I've built everything from IBM applications right the way through to the modern serverless apps. But the interesting thing is, the journey to where we are today definitely started eight or nine years ago, at a minimum because there was something identified in the leadership that they said, “Listen, we're all about our customers. And that means we don't want to be wasting millions of dollars, and thousands of hours, and big trains of people to build software that does stuff. We want to focus on why are we building a piece of software, and how quickly can we get there? If you focus on those two things you're doing all right.”And that's why starting from the early days, we focused on things like, okay, everything needs to go through CI/CD pipelines. You need to have your infrastructure as code. And even if you're deploying on-prem, you're still going to be using the same standards that we use to deploy to AWS today. So, we had years and years and years of just baking good development practices into the company. And then whenever we started to move to AWS, the question became, do we want to just deploy the same thing or do we want to take full advantage of what the cloud has to offer? And I think because we were primed and because the leadership had the right direction, you know, we were just sitting there ready to say, “Okay, serverless seems like a way we can rapidly help our customers.” And that's what we've done.Corey: A lot of the arguments against serverless—and let's be clear, they rhyme with the previous arguments against cloud that lots of people used to make; including me, let's be clear here. I'm usually wrong when I try to predict the future. “Well, you're putting your availability in someone else's hands,” was the argument about cloud. Yeah, it turns out the clouds are better at keeping things up than we are as individual companies.Then with serverless, it's the, “Well, if they're handling all that stuff for you on their side, when they're down, you're down. That's an unacceptable business risk, so we're going to be cloud-agnostic and multi-cloud, and that means everything we build serverlessly needs to work in multiple environments, including in our on-prem environment.” And from the way that we're talking about servers and things that you're building, I don't believe that is technically possible, unless some of the stuff you're building is ridiculous. How did you come to accept that risk organizationally?Matt: These are the conversations that we're all having. Sort of, I'd say once a week, we all have a multi-cloud discussion—and I really liked the article you wrote, it was maybe last year, maybe the year before—but multi-cloud to me is about taking the best capabilities that are out there and bringing them together. So, you know, like, Azure [ID 00:12:47] or whatever, things from the other clouds that they're good at, and using those rather than thinking, “Can I build a workload that I can simultaneously pay all of the price to run across all of the clouds, all of the time, so that if one's down, theoretically, I might have an outage?” So, the way we've looked at it is we embraced really early the well-architected framework from AWS. And it talks about things like you need to have multi-region availability, you need to have your backups in place, you need to have things like circuit breakers in place for if third-party goes down, and we've just tried to build really resilient architectures as best as we can on AWS. And do you know what I think, if [laugh] it AWS is not—I know at re:Invent, there it went down extraordinarily often compared to normal, but in general—Corey: We were all tired of re:Invent; their us-east-1 was feeling the exact same way.Matt: Yeah, so that's—it deserved a break. But, like, if somebody can't buy insurance for an hour, once a year, [laugh] I think we're okay with it versus spending millions to protect that one hour.Corey: And people make assumptions based on this where, okay, we had this problem with us-east-1 that froze things like the global Route 53 control planes; you couldn't change DNS for seven hours. And I highlighted that as, yeah, this is a problem, and it's something to severely consider, but I will bet you anything you'd care to name that there is an incredibly motivated team at AWS, actively fixing that as we speak. And by—I don't know how long it takes to untangle all of those dependencies, but I promise they're going to be untangled in relatively short order versus running data centers myself, when I discover a key underlying dependency I didn't realize was there, well, we need to break that. That's never going to happen because we're trying to do things as a company, and it's just not the most important thing for us as a going concern. With AWS, their durability and reliability is the most important thing, arguably compared to security.Would you rather be down or insecure? I feel like they pick down—I would hope in most cases they would pick down—but they don't want to do either one. That is something they are drastically incentivized to fix. And I'm never going to be able to fix things like that and I don't imagine that you folks would be able to either.Matt: Yeah, so, two things. The first thing is the important stuff, like, for us, that's claims. We want to make sure at any point in time, if you need to make a claim you can because that is why we're here. And we can do that with people whether or not the machines are up or down. So, that's why, like, you always have a process—a manual process—that the business can operate, irrespective of whether the cloud is still working.And that's why we're able to say if you can't buy insurance in that hour, it's okay. But the other thing is, we did used to have a lot of data centers, and I have to say, the people who ran those were amazing—I think half the staff now work for AWS—but there was this story that I heard where there was an app that used to go down at the same time every day, and nobody could work out why. And it was because someone was coming in to clean the room at that time, and they unplugged the server to plug in a vacuum, and then we're cleaning the room, and then plugging it back in again. And that's the kind of thing that just happens when you manage people, and you manage a building, and manage a premises. Whereas if you've heard that happened that AWS, I mean, that would be front page news.Corey: Oh, it absolutely would. There's also—as you say, if it's the sales function, if people aren't able to buy insurance for an hour, when us-east-1 went down, the headlines were all screaming about AWS taking an outage, and some of the more notable customers were listed as examples of this, but the story was that, “AWS has massive outage,” not, “Your particular company is bad at technology.” There's sort of a reputational risk mitigation by going with one of these centralized things. And again, as you're alluding to, what you're doing is not life-critical as far as the sales process and getting people to sign up. If an outage meant that suddenly a bunch of customers were no longer insured, that's a very different problem. But that's not your failure mode.Matt: Exactly. And that's where, like, you got to look at what your business is, and what you're specifically doing, but for 99.99999% of businesses out there, I'm pretty sure you can be down for the tiny window that AWS is down per year, and it will be okay, as long as you plan for it.Corey: So, one thing that really surprised me about the entirety of what you've done at Liberty Mutual is that you're a big enterprise company, and you can take a look at any enterprise company, and say that they have dueling mottos, which is, “I am not going to comment on that,” or, “That's not funny.” Like, the safe mode for any large concern is to say nothing at all. But a lot of folks—not just you—at Liberty have been extremely vocal about the work that you're doing, how you view these things, and I almost want to call it advocacy or evangelism for the CDK. I'm slightly embarrassed to admit that for a little while there, I thought you were an AWS employee in their DevRel program because you were such an advocate in such strong ways for the CDK itself.And that is not something I expected. Usually you see the most vocal folks working in environments that, let's be honest, tend to play a little bit fast and loose with things like formal corporate communications. Liberty doesn't and yet, there you folks are telling these great stories. Was that hard to win over as a culture, or am I just misunderstanding how corporate life is these days?Matt: No, I mean, so it was different, right? There was a point in time where, I think, we all just sort of decided that—I mean, we're really good at what we do from an engineering perspective, and we wanted to make sure that, given the messaging we were given, those 5000 teck employees in Liberty Mutual, if you consider the difference in broadcasting to 5000 versus going external, it may sound like there's millions, billions of people in the world, but in reality, the difference in messaging is not that much. So, to me what I thought, like, whenever I started anyway—it's not, like, we had a meeting and all decided at the same time—but whenever I started, it was a case of, instead of me just posting on all the internal channels—because I've been doing this for years—it's just at that moment, I thought, I could just start saying these things externally and still bring them internally because all you've done is widened the audience; you haven't actually made it shallower. And that meant that whenever I was having the internal conversations, nothing actually changed except for it meant external people, like all their Heroes—like Jeremy Daly—could comment on these things, and then I could bring that in internally. So, it almost helped the reverse takeover of the enterprise to change the culture because I didn't change that much except for change the audience of who I was talking to.Corey: This episode is sponsored by our friends at Oracle HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the worlds most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don't ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.Corey: One thing that you've done that I want to say is admirable, and I stumbled across it when I was doing some work myself over the break, and only right before this recording did I discover that it was you is the cdkpatterns.com website. Specifically what I love about it is that it publishes a bunch of different patterns of ways to do things. This deviates from a lot of tutorials on, “Here's how to build this one very specific thing,” and instead talks about, “Here's the architecture design; here's what the baseline pattern for that looks like.” It's more than a template, but less than a, “Oh, this is a messaging app for dogs and I'm trying to build a messaging app for cats.” It's very generalized, but very direct, and I really, really like that model of demo.Matt: Thank you. So, watching some of your Twitter threads where you experiment with new—Corey: Uh oh. People read those. That's a problem.Matt: I know. So, whatever you experiment with a new piece of AWS to you, I've always wondered what it would be like to be your enabling architect. Because technically, my job in Liberty is, I meant to try and stay ahead of everybody and try and ease the on-ramp to these things. So, if I was your enabling architect, I would be looking at it going, “I should really have a pattern for this.” So that whenever you want to pick up that new service the patterns in cdkpatterns.com, there's 24, 25 of them right there, but internally, there's way more than dozens now.The goal is, the pattern is the least amount to code for you to learn a concept. And then that way, you can not only see how something works, but you can maybe pick up one of the pieces of the well-architected framework while you're there: All of it's unit tested, all of it is proper, you know, like, commented code. The idea is to not be crap, but not be gold-plated either. I'm currently in the process of upgrading that all to V2 as well. So, that [unintelligible 00:21:32].Corey: You mentioned a phrase just now: “Enabling architect.” I have to say this one that has not crossed my desk before. Is that an internal term you use? Is that an enterprise concept I've somehow managed to avoid? Is that an AWS job role? What is that?Matt: I've just started saying [laugh] it's my job over the past couple of years. That—I don't know, patent pending? But the idea to me is—Corey: No, it's evocative. I love the term, I'd love to learn more.Matt: Yeah, because you can sort of take two approaches to your architecture: You can take the traditional approach, which is the ‘house of no' almost, where it's like, “This is the architecture. How dare you want to deviate. This is what we have decided. If you want to change it, here's the Architecture Council and go through enterprise architecture as people imagine it.” But as people might work out quite quickly, whenever they meet me, the whole, like, long conversational meetings are not for me. What I want to do is teach engineers how to help themselves, so that's why I see myself as enabling.And what I've been doing is using techniques like Wardley Mapping, which is where you can go out and you can actually take all the components of people's architecture and you can draw them on a map for—it's a map of how close they are to the customer, as well as how cutting edge the tech is, or how aligned to our strategic direction it is. So, you can actually map out all of the teams, and—there's 160, 170 engineers in Belfast and Dublin, and I can actually go in and say, “Oh, that piece of your architecture would be better if it was evolved to this. Well, I have a pattern for that,” or, “I don't have a pattern for that, but you know what? I'll build one and let's talk about it next week.” And that's always trying to be ahead, instead of people coming to me and I have to say no.Corey: AWS Proton was designed to do something vaguely similar, where you could set out architectural patterns of—like, the two examples that they gave—I don't know if it's in general availability yet or still in public preview, but the ones that they gave were to build a REST API with Lambda, and building something-or-other with Fargate. And the idea was that you could basically fork those, or publish them inside of your own environment of, “Oh, you want a REST API; go ahead and do this.” It feels like their vision is a lot more prescriptive than what yours is.Matt: Yeah. I talked to them quite a lot about Proton, actually because, as always, there's different methodologies and different ways of doing things. And as I showed externally, we have our software accelerator, which is kind of our take on Proton, and it's very open. Anybody can contribute; anybody can consume. And then that way, it means that you don't necessarily have one central team, you can have—think of it more like an SRE function for all of the patterns, rather than… the Proton way is you've separate teams that are your DevOps teams that set up your patterns and then separate team that's consumer, and they have different permissions, different rights to do different things. If you use a Proton pattern, anytime an update is made to that pattern, it auto-deploys your infrastructure.Corey: I can see that breaking an awful lot.Matt: [laugh]. Yeah. So, the idea is sort of if you're a consumer, I assume you [unintelligible 00:24:35] be going to change that infrastructure. You can, they've built in an escape hatch, but the whole concept of it is there's a central team that looks to what the best configuration for that is. So, I think Proton has so much potential, I just think they need to loosen some of the boundaries for it to work for us, and that's the feedback I've given them directly as well.Corey: One thing that I want to take a step beyond this is, you care about this? More than most do. I mean, people will work with computers, yes. We get paid for that. Then they'll go and give talks about things. You're doing that as well. They'll launch a website occasionally, like, cdkpatterns.com, which you have. And then you just sort of decide to go for the absolute hardest thing in the world, and you're one of four authors of a book on this. Tell me more.Matt: Yeah. So, this is something that there's a few of us have been talking since one of the first CDK Days, where we're friends, so there's AWS Heroes. There's Thorsten Höger, Matt Bonig, Sathyajith Bhat, and myself, came together—it was sometime in the summer last year—and said, “Okay. We want to write a book, but how do we do this?” Because, you know, we weren't authors before this point; we'd never done it before. We weren't even sure if we should go to a publisher, or if we should self-publish.Corey: I argue that no one wants to write a book. They want to have written a book, and every first-time author I've ever spoken to at the end has said, “Why on earth would anyone want to do this a second time?” But people do it.Matt: Yeah. And that's we talked to Alex DeBrie, actually, about his book, the amazing Dynamodb Book. And it was his advice, told us to self-publish. And he gave us his starter template that he used for his book, which took so much of the pain out because all we had to do was then work out how we were going to work together. And I will say, I write quite a lot of stuff in general for people, but writing a book is completely different because once it's out there, it's out there. And if it's wrong, it's wrong. You got to release a new version and be like, “Listen, I got that wrong.” So, it did take quite a lot of effort from the group to pull it together. But now that we have it, I want to—I don't have a printed copy because it's only PDF at the minute, but I want a copy just put here [laugh] in, like, the frame. Because it's… it's what we all want.Corey: Yeah, I want you to do that through almost a traditional publisher, selfishly, because O'Reilly just released the AWS Cookbook, and I had a great review quote on the back talking about the value added. I would love to argue that they use one of mine for The CDK Book—and then of course they would reject it immediately—of, “I don't know why you do all this. Using the console and lying about it is way easier.” But yeah, obviously not the direction you're trying to take the book in. But again, the industry is not quite ready for the lying version of ClickOps.It's really neat to just see how willing you are to—how to frame this?—to give of yourself and your time and what you've done so freely. I sometimes make a joke—that arguably isn't that funny—that, “Oh, AWS Hero. That means that you basically volunteer for a $1.6 trillion company.”But that's not actually what you're doing. What you're doing is having figured out all the sharp edges and hacked your way through the jungle to get to something that is functional, you're a trailblazer. You're trying to save other people who are working with that same thing from difficult experiences on their own, having to all thrash and find our own way. And not everyone is diligent and as willing to continue to persist on these things. Is that a somewhat fair assessment how you see the Hero role?Matt: Yeah. I mean, no two Heroes are the same, from what I've judged, I haven't met every Hero yet because pandemic, so Vegas was the first time [I met most 00:28:12], but from my perspective, I mean, in the past, whatever number of years I've been coding, I've always been doing the same thing. Somebody always has to go out and be the first person to try the thing and work out what the value is, and where it'll work for us more work for us. The only difference with the external and public piece is that last 5%, which it's a very different thing to do, but I personally, I like even having conversations like this where I get to meet people that I've never met before.Corey: You sort of discovered the entire secret of why I have an interview podcast.Matt: [laugh]. Yeah because this is what I get out of it, just getting to meet other people and have new experiences. But I will say there's Heroes out there doing very different things. You've got, like, Hiro—as in Hiro, H-I-R-O—actually started AWS Newbies and she's taught—ah, it's hundreds of thousands of people how to actually just start with AWS, through a course designed for people who weren't coders before. That kind of thing is next-level compared to anything I've ever done because you know, they have actually built a product and just given it away. I think that's amazing.Corey: At some level, building a product and giving it away sounds like, “You know, I want to never be lonely again.” Well, that'll work because you're always going to get support tickets. There's an interesting narrative around how to wind up effectively managing the community, and users, and demands, based on open-source maintainers, that we're all wrestling with as an industry, particularly in the wake of that whole log4j nonsense that we've been tilting at that windmill, and that's going to be with us for a while. One last thing I want to talk about before we wind up calling this an episode is, you are one of the organizers of CDK Day. What is that?Matt: Yeah, so CDK Day, it's a complete community-organized conference. The past two have been worldwide, fully virtual just because of the situation we're in. And I mean, they've been pretty popular. I think we had about 5000 people attended the last one, and the idea is, it's a full day of the community just telling their stories of how they liked or disliked using the CDK. So, it's not a marketing event; it's not a sales event; we actually run the whole event on a budget of exactly $0. But yeah, it's just a day of fun to bring the community together and learn a few things. And, you know, if you leave it thinking CDK is not for you, I'm okay with that as much as if you just make a few friends while you're there.Corey: This is the first time I'd realized that it wasn't a formal AWS event. I almost feel like that's the tagline that you should have under it. It's—because it sounds like the CDK Day, again, like, it's this evangelism pure, “This is why it's great and why you should use it.” But I love conferences that embrace critical views. I built one of the first talks I ever built out that did anything beyond small user groups was “Heresy in the Church of Docker.”Then they asked me to give that at ContainerCon, which was incredibly flattering. And I don't think they made that mistake a second time, but it was great to just be willing to see some group of folks that are deeply invested in the technology, but also very open to hearing criticism. I think that's the difference between someone who is writing a nuanced critique versus someone who's just [pure-on 00:31:18] zealotry. “But the CDK is the answer to every technical problem you've got.” Well, I start to question the wisdom of how applicable it really is, and how objective you are. I've never gotten that vibe from you.Matt: No, and that's the thing. So, I mean, as we've worked out in this conversation, I don't work for AWS, so it's not my product. I mean, if it succeeds or if it fails, it doesn't impact my livelihood. I mean, there are people on the team who would be sad for, but the point is, my end goal is always the same. I want people to be enabled to rapidly deliver their software to help their customers.If that's CDK, perfect, but CDK is not for everyone. I mean, there are other options available in the market. And if, even, ClickOps is the way to go for you, I am happy for you. But if it's a case of we can have a conversation, and I can help you get closer to where you need to be with some other tool, that's where I want to be. I just want to help people.Corey: And if I can do anything to help along that axis, please don't hesitate to let me know. I really want to thank you for taking the time to speak with me and being so generous, not just with your time for this podcast, but all the time you spend helping the rest of us figure out which end is up, as we continue to find that the way we manage environments evolves.Matt: Yeah. And, listen, just thank you for having me on today because I've been reading your tweets for two years, so I'm just starstruck at this moment to even be talking to you. So, thank you.Corey: No, no. I understand that, but don't worry, I put my pants on two legs at a time, just like everyone else. That's right, the thought leader on Twitter, you have to jump into your pants. That's the rule. Thanks again so much. I look forward to having a further conversation with you about this stuff as I continue to explore, well honestly, what feels like a brand new paradigm for how we manage code.Matt: Yeah. Reach out if you need any help.Corey: I certainly will. You'll regret asking. Matt [Coulter 00:33:06], Technical Architect at Liberty Mutual. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, write an angry comment, then click the submit button, but lie and say you hit the submit button via an API call.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:01:17] Cert-manager now supports Private CA ACM (no public ACM yet) https://aws.amazon.com/about-aws/whats-new/2022/01/acm-kubernetes-cert-manager-plugin-production/https://github.com/aws/containers-roadmap/issues/904[00:04:32] Huge PR for Maintenance on Beanstalk Modulehttps://github.com/cloudposse/terraform-aws-elastic-beanstalk-environment/pull/203[00:05:25] SQL Migrations with Terraform (via Oliver)https://registry.terraform.io/providers/paultyng/sql/latest/docs/resources/migrate[00:09:21] Checkout our #jobs Channel for new postings[00:10:02] Ready to do things the Cloud Posse way? Take our quiz.https://cloudposse.com/quiz[00:11:41] Is updating a securitygroup with lambda really the only way to protect endpoints behind Cloudfront from other traffic?[00:16:35] Any insights on provisioning cdns that are optimized to minimize http 2 response delays?[00:30:30] CloudTrail lake announced https://aws.amazon.com/blogs/mt/announcing-aws-cloudtrail-lake-a-managed-audit-and-security-lake/ [00:31:55] Anyone working with VPC IPAM? [00:36:30] Do you have any suggestions to prevent creation of resources without cost allocation tags?[00:39:00] High CVE in containerdhttps://github.com/containerd/containerd/security/advisories/GHSA-mvff-h3cj-wj9chttps://nvd.nist.gov/vuln/detail/CVE-2021-43816[00:42:18] Why would we move from ECS on EC2 to Kubernetes? [00:53:19] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
Unedited live recording of this show on YouTube (Ep 114) Topics and Links Guide to GitOps from weaveworks GitOps origins; a blog from weaveworks: What DevOps is to the Cloud, GitOps is to Cloud Native Flux CD Argo CD Swarm Sync for Docker Swarm GitOps YouTube Live Show (Ep 113) where Bret talks about, and gives a demo of, Crossplane YouTube Live Show (Ep 142) where Bret and Viktor Farcic cover DevOps Automation with Crossplane Control GitHub GitOps through "branch protection rules" DevOpsDays ★ Support this podcast on Patreon ★
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:01:18] Happy New Years! [00:01:57] What is the current best practice for a cold start?[00:10:44] How to organize Terraform modules in a large enterprise? [00:21:22] Do we have a demo? [00:25:14] Should I write my own providers? [00:34:44] Call for proposals for HashiTalks 2022 is open[00:40:52] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
It's no secret that suboptimal web and mobile user experiences can negatively impact the bottom line. The problem is that many organizations' UX, accessibility, security, and performance testing occur separately from functional testing. Often, this type of testing is done until after the code is complete. Rick Broker, a Solutions Architect at Digital.ai, shares why this needs to change in this episode. Discover the value of engaging with all customers, even those with accessibility needs. The challenges of performing non-functional and functional testing separately. And how to add non-functional tests to your CI/CD pipelines, and much more.
Unedited live recording on YouTube (Ep #150) Log4Shell info from SANS Institute on YouTube Log4Shell info from Docker blog HashiCorp IPO Bill Gates Year in Review "Reasons for Optimism After a Difficult Year" GitHub blog "GitHub Actions: Reusable Workflows are Generally Available" Dig into your Docker images contains.dev WebAssembly in 100 seconds on YouTube Modern Finance podcast "Side Chain Scaling with Sandeep Nailwal, Co-Founder of Polygon" My First Million podcast Docker blog: "Faster Multi-Platform Builds: Dockerfile Cross-Compilation Guide" ★ Support this podcast on Patreon ★
In this episode of AWS TechChat, we take a journey into EC2 Mac. I interviewed two EC2 Mac Specialists, Muhammad and Scott, who helped us deep dive into the depths of EC2 and supporting services and features. We start the show by setting foundations as we talked about the single tenancy model and how that relates to billing. We then discussed the differences between instances and hosts and EBS storage as well as building a CI/CD pipeline w/ EC2 macs for your build servers. And we wrapped that all up with some use cases we've heard and by looking at where customers should start their EC2 Mac journey. Speakers Shai Perednik - Sr. Solutions Architect, AWS Muhammad Mansoor - Sr. Solutions Architect, AWS Scott Malki - Sr EC2/Graviton Specialist Resources New – Use Amazon EC2 Mac Instances to Build & Test macOS, iOS, iPadOS,... (https://aws.amazon.com/blogs/aws/new-use-mac-instances-to-build-test-macos-ios-ipados-tvos-and-watchos-apps/)
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:01:25] Lambda@Edge support for S3 CDN Module (inline Lambdas!)https://github.com/cloudposse/terraform-aws-cloudfront-s3-cdn/pull/204[00:04:45] MWAA Airflow Module Coming Soonhttps://github.com/cloudposse/terraform-aws-mwaa/pull/3[00:06:03] Atmos Help (and README coming soon)https://github.com/cloudposse/atmos/pull/94[00:07:23] Atlantis adds GH allowlist support (after 3 years!)https://github.com/cloudposse/atlantis/releases/tag/0.8.0[00:11:45] YAAO!!! (Yet Another AWS Outage) https://status.aws.amazon.com/https://www.datacenterdynamics.com/en/news/aws-has-another-east-coast-cloud-outage/[00:12:21] PSA If you are using Terraform CLI v1.1.0 or v1.1.1, please upgrade to this new version as soon as possiblehttps://github.com/hashicorp/terraform/releases/tag/v1.1.2[00:14:21] We're looking for a service to check DNS registration expiration and SSL certs, across registrars and CAs, for only about a dozen domains. Any recommendations?[00:20:19] Has anyone played with Control Tower Customizations?[00:21:17] Start a discussion regarding various Ingress Controllers[00:36:46] How are people running spark on kubernetes? [00:37:25] I have 3 different resource-usage profiles among the K8s services and jobs that I run. I want to isolate the pods with erratic resource usage from the front-end pods, and also run jobs on spot instances. Should I use node groups to do this? Should resource limits be enough to manage this?[00:51:23] Is there any way you can restrict IO for each pod? [00:55:03] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
On The Cloud Pod this week, Oracle finally has some news to share. Plus Log4j is ruining everyone's lives, AWS suffers a massive outage post re:Invent, and Google CAT releases its first threat report. A big thanks to this week's sponsors: Foghorn Consulting, which provides full-stack cloud solutions with a focus on strategy, planning and execution for enterprises seeking to take advantage of the transformative capabilities of AWS, Google Cloud and Azure. JumpCloud, which offers a complete platform for identity, access, and device management — no matter where your users and devices are located. This week's highlights
Eric Anderson (@ericmander) and Kyle Quest (@kcqon) discuss DockerSlim, the open-source optimization and security tool for Docker container images. Kyle initially created DockerSlim as a humble hackathon project, and now supports it with his company, Slim.AI. Tune in to learn how DockerSlim is redefining DevOps with application intelligence and a backwards compatible vision of the future. In this episode we discuss: Bridging the gap between application and infrastructure Emerging from the cloud native stone age Application intelligence rather than artificial intelligence in Slim.AI DockerSlim integrated into CI/CD pipelines, embedded systems, and robots How Slim.AI aims to become ‘Google for containers' Links: DockerSlim Slim.AI Terraform Serverless Sigstore
Artificial intelligence (AI) and machine learning (ML) have seen a surge in adoption and advances for IT applications, especially for database management, CI/CD support and other functionalities. Robotics, meanwhile, is largely relegated to factory-floor automation. In this The New Stack Makers podcast, Pieter Abbeel, co-founder, president, chief scientist at covariant.ai, a supplier of “universal AI” for robotics, discusses why and how the potential of robotics can evolve beyond just serving as pre-programmed devices thanks to advances in IT. Abbeel also draws on his background to offer his perspective, as a professor at the University of California, Berkeley and a podcast host at The Robot Brains Podcast.Alex Williams, founder and publisher of The New Stack, hosted this podcast.
About FrankFrank Chen is a maker. He develops products and leads software engineering teams with a background in behavior design, engineering leadership, systems reliability engineering, and resiliency research. At Slack, Frank focuses on making engineers' lives simpler, more pleasant, and more productive, in the Developer Productivity group. At Palantir, Frank has worked with customers in healthcare, finance, government, energy and consumer packaged goods to solve their hardest problems by transforming how they use data. At Amazon, Frank led a front-end team and infrastructure team to launch AWS WorkDocs, the first secure multi-platform service of its kind for enterprise customers. At Sandia National Labs, Frank researched resiliency and complexity analysis tooling with the Grid Resiliency group. He received a M.S. in Computer Science focused in Human-Computer Interaction from Stanford. Frank's thesis studied how the design / psychology of exergaming interventions might produce efficacious health outcomes. With the Stanford Prevention Research Center, Frank developed health interventions rooted in behavioral theory to create new behaviors through mobile phones. He prototyped early builds of Tiny Habits with BJ Fogg and worked in the Persuasive Technology Lab. He received a B.S. in Computer Science from UCLA. Frank researched networked systems and image processing with the Center for embedded Networked Systems. With the Rand Corporation, he built research systems to support group decision-making.Links: Slack: https://slack.com “Infrastructure Observability for Changing the Spend Curve”: https://slack.engineering/infrastructure-observability-for-changing-the-spend-curve/ “Right Sizing Your Instances Is Nonsense”: https://www.lastweekinaws.com/blog/right-sizing-your-instances-is-nonsense/ Personal webpage: https://frankc.net Twitter: @frankc TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Yankins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And not, that is not me telling you to go away, it is: goteleport.com. Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself all while gaining the networking load, balancing and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free that's snark.cloud/oci-free.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Several people are undoubtedly angrily typing, and part of the reason they can do that, and the fact that I know that is because we're all using Slack. My guest today is Frank Chen, senior staff software engineer at Slack. So, I guess, sort of… [sales force 00:00:53]. Frank, thanks for joining me.Frank: Hey, Corey, I have been a longtime listener and follower, and just really delighted to be here.Corey: It's one of the weird things about doing a podcast is that for better or worse, people don't respond to it in the same way that they do writing a newsletter, for example, because you receive an email, and, “Oh, well, I know how to write an email. I can hit reply and send an email back and give that jackwagon a piece of my mind,” and people often do. But with podcasts, I feel like it's much more closely attuned to the idea of an AM radio talk show. And who calls into a radio talk show? Lunatics, and most people don't self-describe as lunatics, so they don't want to do that.But then when I catch up with people one-on-one or at events in person, I find out that a lot more people listen to this show than I thought they did. Because I don't trust podcast statistics because lies, damn lies, and analytics are sort of how I view this world. So, you've worked at a bunch of different companies. You're at Slack now, which, of course, upsets some people because, “Slack is ruining the way that people come and talk to me in the office.” Or it's making it easier for employees to collaborate internally in ways their employers wish they wouldn't. But that's neither here nor there.Before this, you were at Palantir, and before this, you're at Amazon, working on Amazon WorkDocs of all things, which is supposedly rumored to have at least one customer somewhere, but I've never seen them. Before that you were at Sandia National Labs, and you've gotten a master's in computer science from Stanford. You've done a lot of things and everything you've done, on some level, seems like the recurring theme is someone on Twitter will be unhappy at you for a career choice you've made. But what is the common thread—in seriousness—between the different places that you've been?Frank: One thing that's been a driver for where I work is finding amazing people to work with and building something that I believe is valuable and fun to keep doing. The thing that brought me to Slack is I became my own Slack admin, [laugh] when I met a girl and we moved in together into a small apartment in Brooklyn. And she had a cat that, you know, is a sweetheart, but also just doesn't know how to be social. Yes, you covered that with ‘cat.' Part of moving it together, I became my own Slack admin and discovered well, we can build a series of home automations to better train and inform our little command center for when the cat lies about being fed, or not fed, clipping his nails, and discovering and tracking bad behaviors. In a lot of ways this was like the human side of a lot of the data work that I had been doing at my previous role. And it was like a fun way to use the same frameworks that I use at work to better train and be a cat caretaker.Corey: Now, at some point, you know that some product manager at Amazon is listening to this and immediately sketching notes because their product strategy is, “Yes,” and this is going to be productized and shipping in two years as Amazon Prime Meow. But until then we'll enjoy the originality of having a Slack bot more or less control the home automation slash making your house seem haunted for anyone who didn't write the code themselves. There's an idea of solving real world problems that I definitely understand. I mean, and again, it might not even be a fair question entirely. Just because I am… for better or worse, staggering through my world, and trying—and failing most days—to tell a narrative that, “Oh, why did I start my tech career at a university, and then spend time in ad tech, and then spend time in consulting, and then FinTech, and the rest?” And the answer is, “Oh, I get fired an awful lot, and that sucked.”So, instead of going down that particular rabbit hole of a mess, I went in other directions. I started finding things that would pay me and pay me more money because I was in debt at the time. But that was the narrative thread that was the, “I have rent to pay and they have computers that aren't behaving properly.” And that's what dictated the shape of my career for a long time. It's only in retrospect that I started to identify some of the things that aligns with it. But it's easy to look at it with the shine of hindsight and not realize that no, no, that's sort of retconning what happened in the past.Frank: Yeah, I have a mentor and my former adviser had this way of describing, building out the jankiest prototype you can to prove out an idea. And this manifested in his class in building out paper prototypes, or really, really janky ideas for what helping people through technology might look like. And I feel like it a lot of ways, even when those prototypes fail, like, in a career or some half baked tech prototype I put together, it might succeed and great, we could keep building upon that, but when it fails, you actually discover, “Oh, this is one way that I didn't succeed.” And even in doing so, you discover things about yourself, your way of building, and maybe a little bit about your infrastructure, or whatever it is that you build on a day-to-day basis. And wrapping that back to the original question, it's like, well, we think we're human beings, right, we're static, but in a lot of ways we're human becomings. We think we know what the future might look like with our careers, what we're building on a day-to-day basis, and what we're building a year from now, but oftentimes, things change if we discover things about ourselves, the people we work with, and ultimately, the things that we put out into the world.Corey: Obviously, I've been aware of who Slack is, for a long time; I've been a paying customer for years because it basically is IRC with reaction gifs, and not having to teach someone how to sign into IRC when they work in accounting. So, the user experience alone solved the problem.Frank: And you've actually worked with us in the past before. [laugh]. Slack, it's the Searchable Log for all Content and Knowledge; I think that backronym, that's how it works. And I was delighted when I had mentioned your jokes and you're trolling [a folk 00:07:00] on Twitter and on your podcast to my former engineering manager, Chris Merrill, who was like, oh, you should search the Slack. Corey actually worked with us and he put together a lot of cool tooling and ideas for us to think about.Corey: Careful. If we talk too much, or what I did when I was at Slack years ago, someone's going to start looking into some of the old commits and whatnot and start demanding an apology, and we don't want that. It's, “Wow, you're right. You are a terrible engineer.” “Told you.” There's a reason I don't do that anymore.Frank: I think that's all of us. [laugh]. An early career mentor of mine, he was like, “Hey, Frank, listen. You think you're building perfect software at any point in time? No, you're building future tech debt.” And yeah, we should put much more emphasis on interfaces and ideas we're putting out because the implementation is going to change over time, and likely your current implementation is shit. And that is, okay.Corey: That's the beautiful part about this is that things grow and things evolve. And it's interesting working with companies, and as a consultant, I tend to build my projects in such a way that I start on day one and people know that I'm leaving with usually a very short window because I don't want to build a forever job for myself; I don't want to show up and start charging by the hour or by the day, if I can possibly avoid it. Because then it turns into eternal projects that never end because I'm billing and nothing's ever done. No, no, I like charging fixed fee and then getting out at a predetermined outcome, but then you get to hear about what happens with companies as they move on.This combines with the fact that I have a persistent alert for my name, usually because I'm looking for various ineffective character assassination from enterprise marketing types because you know, I dish it out, I should certainly be able to take it. But I found a blog post on the Slack engineering blog that mentioned my name, and it's, “Aw, crap. Are they coming after me for a refund?” No, it was not. It was you writing a fairly sizable post. Tell me more about that.Frank: Yeah, I'm part of an organization called Developer Productivity. And our goal is to help folk at Slack deliver services to their customers, where we build, test, and release high quality software. And a lot of our time is spent thinking about internal tooling and making infrastructure bets. As engineers, right, it's like, we have this idea for what the world looks like, we have this idea for what our infrastructure looks like, but what we discover using a set of techniques around observability of just asking questions—advanced questions, basic questions, and hell, even dumb questions—we discover hey, the things that we think our computers are doing aren't actually doing what they say they're doing. And the question is like, great. Now, what? How can we ask better questions? How can we better tune, change, and equip engineers with tooling so that they can do better work to make Slack customers have simple, pleasant, and productive experiences?Corey: And I have to say that there's a lot that Slack does that is incredibly helpful. I don't know that I'm necessarily completely bought into the idea that all work should happen in Slack. It's, well, on some level, I—like people like to debate the ‘should people work from home? Should people all work in an office?' Discussion.And, on some level, it seems if you look at people who are constantly fighting that debate online, it's, “Do you ever do work at all?” on some level. But I'm not here to besmirch others; I'm here to talk about, on some level, what you alluded to in your blog post. But I want to start with a disclaimer that Slack as far as companies go is not small, and if you take a look around, most companies are using Slack whether they know it or not. The list of side-channel Slack groups people have tend to extend massively.I look and I pare it down every once in a while, whenever I cross 40 signed-in Slacks on my desktop. It is where people talk for a wide variety of different reasons, and they all do different things. But if you're sitting here listening to this and you have a $2,000 a month AWS bill, this is not for you. You will spend orders of magnitude more money trying to optimize a small cost. Once you're at significant points of scale, and you have scaled out to the point where you begin to have some ability to predict over months or years, that's what a lot of this stuff starts to weigh in.So, talk to me a bit about how you wound up—and let me quote directly from the article, which is titled, “Infrastructure Observability for Changing the Spend Curve,” and I will, of course, throw a link to this in the [show notes 00:11:38]. But you talk in this about knocking, I believe it was orders of magnitude off of various cost areas within your bill.Frank: Yeah. The article itself describes three big-ish projects, where we are able to change the curve of the number of tests that we run, and a change in how much it costs to run any single test.Corey: When you say test, are you talking CI/CD infrastructure test or code test, to make sure it goes out, or are you talking something higher up the stack, as far as, “Huh, let's see how some users respond when, I don't know, we send four notifications on every message instead of the usual one,” to give a ridiculous example?Frank: Yeah, this is in the CI/CD pipelines. And one of these projects was around borrowing some concepts from data engineering: oversubscription and planning your capacity to have access capacity at peak, where at peak, your engineers might have a 5% degradation in performance, while still maintaining high resiliency and reliability of your tests in order to oversubscribe, either CPU or memory and keep throughput on the overall system stable and consistent and fast enough. I think, with spend in developer productivity, I think, both, like, the metrics you're trying to move and why you're optimizing for it at any given time are, like, this, like, calculus. Or it's like, more art than science in that there's no one right answer, right? It's like, oh, yeah—very naively—like, yeah, let's throw the biggest machines most expensive machines we can at any given problem. But that doesn't solve the crux of your problem. It's like, “Hey, what are the things in your system doing?” And what is the right guess to capitalize around how much to spend on your CI/CD [unintelligible 00:13:39] is oftentimes not precise, nor is this blog article meant to be prescriptive.Corey: Yeah, it depends entirely on what you're doing and how because it's, on some level, well, we can save a whole bunch of money if we slow all of our CI/CD runs down by 20 minutes. Yeah, but then you have a bunch of engineers sitting idle and I promise you, that costs a hell of a lot more than your cloud bill is going to be. The payroll is almost always a larger expense than your infrastructure costs, and if it's not, you should seriously consider firing at least part of your data science team, but you didn't hear it from me.Frank: Yeah. And part of the exploration on profiling and performance and resiliency was, like, around interrogating what the boundaries and what the constraints were for our CI/CD pipelines. Because Slack has grown in engineering and in the number of tests we were running on a month-to-month basis; for a while from 2017 to mid 2020, we were growing about 10% month-over-month in test suite execution numbers. Which means on a given year, we doubled almost two times, which is quite a bit of strain on internal resources and a lot of dependent services where—and internal systems, we oftentimes have more complexity and less understood changes in what dependencies your infrastructure might be using, what business logic your internal services are using to communicate with one another than you do your production.And so, by, like, performing a series of curiosity-driven development, we're able to both answer, at that point in time, what our customers internally were doing, and start to put together ideas for eliminating some bottlenecks, and hell, even adding bottlenecks with circuit breakers where you keep the overall throughput of your system stable, while deferring or canceling work that otherwise might have overloaded dependencies.Corey: There's a lot to be said for understanding what the optimization opportunities are, in an environment and understanding what it is you're attempting to achieve. Having those test for something like Slack makes an awful lot of sense because let's be very clear here, when you're building an application that acts as something people use to do expense reports—to cite one of my previous job examples—it turns out you can be down for a week and a majority of your customers will never know or care. With Slack, it doesn't work that way. Everyone more or less has a continuous monitor that they're typing into for a good portion of the day—angrily or otherwise—and as soon as it misses anything, people know. And if there's one thing that I love, on some level, seeing change when I know that Slack is having a blip, even if I'm not using Slack that day for anything in particular, because Twitter explodes about it. “Slack is down. I'm now going to tweet some stuff to my colleagues.” All right. You do you, I suppose.And credit where due, Slack doesn't go down nearly as often as it used to because as you tend to figure out how these things work, operational maturity increases through a bunch of tests. Fixing things like durability, reliability, uptime, et cetera, should always, to some extent, take precedence priority-wise over let's save some money. Because yeah, you could turn everything off and save all the money, but then you don't have a business anymore. It's focused on where to cut, where to optimize in the right way, and ideally as you go, find some of the areas in which, oh, I'm paying AWS a tax for just going about my business. And I could have flipped a switch at any point and saved—“How much money? Oh, my God, that's more than I'll make in my lifetime.”Frank: Yeah, and one thing I talk about a little bit is distributed tracing as one of the drivers for helping us understand what's happening inside of our systems. Where it helps you figure out and it's like this… [best word 00:17:24] to describe how you ask questions of deployed code? And there a lot of ways it's helped us understand existing bottlenecks and identify opportunities for performance or resiliency gains because your past janky Band-Aids become more and more obvious when you can interrogate and ask questions around what is it performing like it used to? Or what has changed recently?Corey: This episode is sponsored in part by something new. Cloud Academy is a training platform built on two primary goals. Having the highest quality content in tech and cloud skills, and building a good community the is rich and full of IT and engineering professionals. You wouldn't think those things go together, but sometimes they do. Its both useful for individuals and large enterprises, but here's what makes it new. I don't use that term lightly. Cloud Academy invites you to showcase just how good your AWS skills are. For the next four weeks you'll have a chance to prove yourself. Compete in four unique lab challenges, where they'll be awarding more than $2000 in cash and prizes. I'm not kidding, first place is a thousand bucks. Pre-register for the first challenge now, one that I picked out myself on Amazon SNS image resizing, by visiting cloudacademy.com/corey. C-O-R-E-Y. That's cloudacademy.com/corey. We're gonna have some fun with this one!Corey: It's also worth pointing out that as systems grow organically, that it is almost impossible for any one person to have it all in their head anymore. I saw one of the most overly complicated architecture flow trees that I think I've seen in recent memory, and it was on the Slack engineering blog about how something was architected, but it wasn't the Slack app itself; it was simply the [decision tree for ‘Should we send a notification?' 00:18:17] and it is more complicated than almost anything I've written, except maybe my newsletter content publication pipeline. It is massive. And I'll throw a link to that in the [show notes 00:18:31] as well, just because it is well worth people taking a look at.But there is so much complexity at scale for doing the right thing, and it's necessary because if I'm talking to you on Slack right now and getting notifications every time you reply on my phone, it's not going to take too long before I turn off notifications everywhere, and then I don't notice that Slack is there, and it just becomes useless and I use something else. Ideally, something better—which is hard to come by—moderately worse, like, email or completely worse, like, Microsoft Teams.Frank: I tell all my close collaborators about this. I typically set myself away on Slack because I like to make time for deep, focused work. And that's very hard with a constant stream of notifications. How people use Slack and how people notify others on Slack is, like, not incumbent on the software itself, but it's a reflection of the work culture that you're in. The expectation for an email-driven culture is, like, oh, yeah, you should be reading your email all the time and be able to respond within 30 minutes. Peace, I have friends that are lawyers, [laugh] and that is the expectation at all times of day.Corey: I married one of those. Oh, yeah, people get very salty. And she works with a global team spread everywhere, to the point where she wakes up and there's just a whole flurry of angry people that have tried to reach her in the middle of the night. Like, “Why were you sleeping at 2 a.m.? It's daytime here.” And yeah, time zones. Not everyone understands how they work, from my estimation.Frank: [laugh]. That's funny. My sweetheart is a former attorney. On our first international date, we spent an entire day-and-a-half hopping between WiFi spots in Prague so that she could answer a five minute question from a partner about standard deviations.Corey: So, one thing that you link to that really is what drew my notice to this—because, again, if you talk about AWS cost optimization, I'm probably going to stumble over it, but if you mention my name, that's sort of a nice accelerator—and you linked to my article called Why “Right Sizing Your Instances Is Nonsense.” And that is a little overblown, to some extent, but so many folks talk about it in the cost optimization space because you can get a bunch of metrics and do these things programmatically, and somewhat without observability into what's going on because, “Well, I can see how busy the computers are and if it's not busy, we could use smaller computers. Problem solved,” versus, the things that require a fair bit of insight into what is that thing doing exactly because it leads you into places of oh, turn off that idle fleet that's not doing anything is all labeled ‘backup,' where you're going to have three seconds of notice before it gets all the traffic.There's an idea of sometimes things are the way they are for a reason. And it's also not easy for a lot of things—think databases—to seamlessly just restart the thing and have it scale back up and run on a different instance class. That takes weeks of planning and it's hard. So, I find that people tend to reach for it where it doesn't often make sense. At your level of scale and operational maturity, of course, you should optimize what instance classes things are using and what sizes they are, especially since that stuff changes over time as far as what AWS has made available. But it's not the sort of thing that I suggest as being the first easy thing to go for. It's just what people think is easy because it requires no judgment and computers can do it. At least that's their opinion.Frank: I feel like you probably have a lot more experience than me, and talked about war stories, but I recall working with customers where they want to lift-and-shift on-prem hardware to VMs on-prem. I'm like, “It's not going to be as simple as you're making it out to be.” Whereas, like, the trend today is probably oh, yeah, we're going to shift on-prem VMs to AWS, or hell, like, let's go two levels deeper and just run everything on Kubernetes. Similar workloads, right? It's not going to be a huge challenge. Or [laugh] everything serverless.Corey: Spare me from that entire school of thought, my God.Frank: [laugh].Corey: Yeah, but it's fun, too, because this came out a month ago, and you're talking about using—an example you gave was a c5.9xlarge instance. Great. Well, the c6i is out now as well, so are people going to look at that someday and think, “Oh, wow. That's incredibly quaint.”It's, you wrote this a month ago, and it's already out of date, as far as what a lot of the modern story instances are. From my perspective, one of the best things that AWS has done in this space has been to get away from the reserved instance story and over into savings plans, where it's, “I know, I'm going to run some compute—maybe it's Fargate, maybe it's EC2; let's be serious, it's definitely going to be EC2—but I don't want to tie myself to specific instance types for the next three years.” Great, well, I'm just going to commit to spending some money on AWS for the next three years because if I decide today to move off of it, it's going to take me at least that long to get everything out. So okay, then that becomes something a lot more palatable for an awful lot of folks.Frank: One thing you brought up in the article I linked to is instance types. You think upgrading to the newest instance type will solve all your challenges, but oftentimes it's not obvious that it won't all the time, and in fact, you might even see degraded resiliency and degraded performance because different packages that your software relies upon might not be optimized for the given kernel or CPU type that you're running against. And ultimately, you go back to just asking really basic questions and performing some end-to-end benchmarking so that you can at least get a sense for what your customers are doing today, and maybe make a guess for what they're going to do tomorrow.Corey: I have to ask because I'm always interested in what it is that gives rise to blog posts like this—which, that's easy; it's someone had to do a project on these things, and while we learn things that would probably apply to other folks—like, you're solving what is effectively a global problem locally when you go down this path. It's part of the reason I have a consulting business is things I learned at one company apply almost identically to another company, even though that they're in completely separate industries and parts of the world because AWS billing is, for better or worse, a bounded problem space despite their best efforts to, you know, use quantum computers to fix that. What was it that gave rise to looking at the CI/CD system from an optimization point of view?Frank: So internally, I initially started writing a white paper about, hey, here's a simple question that we can answer, you know, without too much effort. Let's transition all of our C3 instances to C5 instances, and that could have been the one and done. But by thinking about it a little more and kind of drawing out, while we can actually borrow a model for oversubscription from another field, we could potentially decrease our spend by quite a bit. That eventually [laugh] evolved into a 70 page white paper—no joke—that my former engineering manager said, “Frank, no one's going to [BLEEP] read this.” [laugh].Corey: Always. Always, always. Like, here's a whole bunch of academically research and the rest. It's like, “Great. Which of these two buttons do I press?” is really the question people are getting at. And while it's great to have the research and the academic stuff, it's also a, “Great we're trying to achieve an outcome which, what is the choice?” But it's nice to know that people are doing actual research on the back end, instead, “Eh, my gut tells me to take the path on the left because why not? Left is better; right's tricky friend.”Frank: Yeah. And it was like, “Oh, yeah. I accidentally wrote a really long thing because there was, like, a lot of variables to test.” I think we had spun up 16-plus auto-scaling groups. And ran something like the cross-section of a couple of representative test suites against them, as well as configurations for a number of executors per instance.And about a year ago, I translated that into a ten page blog article that when I read through, I really didn't enjoy. [laugh]. And that template blog article is ultimately, like, about a page in the article you're reading today. And the actual kick in the butt to get this out the door was about four months ago. I spoke at o11ycon rescources which you're a part of.And it was a vendor conference by Honeycomb, and it was just so fun to share some of the things we've been doing with distributed tracing, and how we were able to solve internal problems using a relatively simple idea of asking questions about what was running. And the entire team there was wonderful in coaching and just helping me think through what questions people might have of this work. And that was, again, former academic. The last time I spoke at a conference was about a decade earlier, and it was just so fun to be part of this community of people trying to all solve the same set of problems, just in their own unique ways.Corey: One of the things I loved about working with Honeycomb was the fact that whenever I asked them a question, they have instrumented their own stuff, so they could tell me extremely quickly what something was doing, how it was doing it, and what the overall impact on this was. It's very rare to find a client that is anywhere near that level of awareness into what's going on in their infrastructure.Frank: Yeah, and that blog article, right, it's like, here's our current perspective, and here's, like, the current set of projects we're able to make to get to this result. And we think we know what we want to do, but if you were to ask that same question, “What are we doing for our spend a year from now?” the answer might be very different. Probably similar in some ways, but probably different.Corey: Well, there are some principles that we'll never get away from. It's, “Is no one using the thing? Turn that shit off.” That's one of those tried and true things. “Oh, it's the third copy of that multiple petabyte of data thing? Maybe delete it or stuff in a deep archive.” It's maybe move data less between various places. Maybe log things fewer times, given that you're paying 50 cents per gigabyte ingest, in some cases. Et cetera, et cetera, et cetera. There's a lot to consider as far as the general principles go, but the specifics, well, that's where it gets into the weeds. And at your scale, yeah, having people focus on this internally with the context and nuance to it is absolutely worth doing. Having a small team devoted to this at large companies will pay for itself, I promise. Now, I go in and advise in these scenarios, but past a certain point, this can't just be one person's part-time gig anymore.Frank: I'm kind of curious about that. How do you think about working with a company and then deprecating yourself, and allowing your tools and, like, the frameworks you put into place to continue, like, thrive?Corey: We're advisory only. We make no changes to production.Frank: Or I don't know if that's the right word, deprecate. I think… that's my own word. [laugh].Corey: No, no, it's fair. It's a—what we do is we go in and we are advisory. It's less of a cost engagement, more of an architecture engagement because in cloud, cost and architecture are the same thing. We look at what's going on, we look at the constraints of why we've been brought in, and we identify things that companies can do and the associated cost savings associated with that, and let them make their own decision. Because it's, if I come in and say, “Hey, you could save a bunch of money by migrating this whole subsystem to serverless.”Great, I sound like a lunatic evangelist because yeah, 18 months of work during which time the team doing that is not advancing the state of the business any further so it's never going to happen. So, why even suggest it? Just look at things that are within the bounds of possibility. Counterpoint: when a client says, “A full re-architecture is on the table,” well, okay, that changes the nature of what we're suggesting. But we're trying to get away from what a lot of tooling does, which is, “Great. Here's 700 things you can adjust and you'll do none of them.” We come back with a, “Here's three or four things you can do that'll blow 20% off the bill. Then let's see where you stand.” The other half of it, of course, is large scale enterprise contract negotiation, that's a bit of a horse of a different color. I want to thank you so much for taking the time to speak with me today. I really do appreciate it. If folks want to hear more about what you're up to, and how you think about these things. Where can they find you?Frank: You can find me at frankc.net. Or at me at @FrankC on Twitter.Corey: Oh, inviting people to yell at you at Twitter. That's never a great plan. Yeash. Good luck. Thanks again. We've absolutely got to talk more about this in-depth because I think this is one of those areas that you have the folks above a certain point of scale, talk about these things semi-constantly and live in the space, whereas folks who are in relatively small-scale environments are listening to this and thinking that they've got to do this.And no. No, you do not want to spend millions of dollars of engineering effort to optimize a bill that's 80 grand a year, I promise. It's focus on the thing that's right for your business. At a certain point of scale, this becomes that. But thank you so much for being so generous with your time. I appreciate it.Frank: Thank you so much, Corey.Corey: Frank Chen, senior staff software engineer at Slack. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that seems to completely miss the fact that Microsoft Teams is free because it sucks.Frank: [laugh].Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:01:35] Terraform v1.1.0 released (with state migrations)https://github.com/hashicorp/terraform/releases/tag/v1.1.0[00:04:07] Terraform Provider for ArgoCD (anyone use it?)https://github.com/oboukili/terraform-provider-argocd[00:05:15] GitHub Stars Organized by Category[00:05:42] Apple debuts new Open Source website here https://opensource.apple.com/https://appleinsider.com/articles/21/12/08/apple-debuts-new-open-source-website-will-release-projects-on-github[00:06:25] Summary of the AWS Service Event in the Northern Virginia (US-EAST-1) Regionhttps://aws.amazon.com/message/12721/[00:07:22] AWS postmortem: Internal ops teams' own monitoring tools went down, had to comb through logshttps://news.google.com/articles/CAIiEABlFVLsTs1vNb3RZcT9Q3YqMwgEKioIACIQyKvy0DxcsbRLQKQhOygtHCoUCAoiEMir8tA8XLG0S0CkITsoLRww7bLrBg?uo=CAUiANIBAA&hl=en-US&gl=US&ceid=US%3Aen[00:11:13] The Log4j bug exposes a bigger issue: Open-source fundinghttps://twitter.com/GovCERT_CH/status/1470097783407398928/photo/1https://news.google.com/articles/CAIiEE6zlfaKB0ztQmVBFP97U44qFggEKg0IACoGCAow8KsBMMBFMOzkzwU?hl=en-US&gl=US&ceid=US%3Aen[00:18:00] The USB kill cord for your laptophttps://www.buskill.in/[00:18:53] GitHub Projects for Open Source Roadmaps (Example)https://github.com/orgs/github/projects/4247/views/2?filterQuery=label%3Aactions[00:20:32] Has anyone had any success in configuring Persistent Storage when running EKS on Fargate? @Michael Holt[00:22:37] Are environment variables inferior to config files in Kubernetes? @sheldonh[00:34:25] How turn key is EC2 Image Builder with Terraform? @erik [00:40:22] Maintain all the boilerplate for packaging and building projects separate from the codehttps://github.com/cruft/cruft[00:48:06] Where does Concourse CI fit in with flux/argo cd and GHA, etc.? @DaniC [00:58:18] Any thoughts/opinions on kitchen-terraform? @jonjitsu[01:03:15] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
Unedited live recording on YouTube (Ep #108) An earlier YTL show about Docker Hub (Ep #89) Brandon on StackOverflow regclient on GitHub registry spec on GitHub Brandon's videos and presentations on GitHub ★ Support this podcast on Patreon ★
About RobertR2 advocates for Liquibase customers and provides technical architecture leadership. Prior to co-founding Datical (now Liquibase), Robert was a Director at the Austin Technology Incubator. Robert co-founded Phurnace Software in 2005. He invented and created the flagship product, Phurnace Deliver, which provides middleware infrastructure management to multiple Fortune 500 companies.Links: Liquibase: https://www.liquibase.com Liquibase Community: https://www.liquibase.org Liquibase AWS Marketplace: https://aws.amazon.com/marketplace/seller-profile?id=7e70900d-dcb2-4ef6-adab-f64590f4a967 Github: https://github.com/liquibase Twitter: https://twitter.com/liquibase TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Yankins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And not, that is not me telling you to go away, it is: goteleport.com. Corey: You know how Git works right?Announcer: Sorta, kinda, not really. Please ask someone else.Corey: That's all of us. Git is how we build things, and Netlify is one of the best ways I've found to build those things quickly for the web. Netlify's Git-based workflows mean you don't have to play slap-and-tickle with integrating arcane nonsense and web hooks, which are themselves about as well understood as Git. Give them a try and see what folks ranging from my fake Twitter for Pets startup, to global Fortune 2000 companies are raving about. If you end up talking to them—because you don't have to; they get why self-service is important—but if you do, be sure to tell them that I sent you and watch all of the blood drain from their faces instantly. You can find them in the AWS marketplace or at www.netlify.com. N-E-T-L-I-F-Y dot com.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This is a promoted episode. What does that mean in practice? Well, it means the company who provides the guest has paid to turn this into a discussion that's much more aligned with the company than it is the individual.Sometimes it works, Sometimes it doesn't, but the key part of that story is I get paid. Why am I bringing this up? Because today's guest is someone I met in person at Monktoberfest, which is the RedMonk conference in Portland, Maine, one of the only reasons to go to Maine, speaking as someone who grew up there. And I spoke there, I met my guest today, and eventually it turned into this, proving that I am the envy of developer advocates everywhere because now I can directly tie me attending one conference to making a fixed sum of money, and right now they're all screaming and tearing off their headphones and closing this episode. But for those of you who are sticking around, thank you. My guest today is the CTO and co-founder of Liquibase. Please welcome Robert Reeves. Robert, thank you for joining me, and suffering the slings and arrows I'm about to hurled directly into your arse, as a warning shot.Robert: [laugh]. Man. Thanks for having me. Corey, I've been looking forward to this for a while. I love hanging out with you.Corey: One of the things I love about the Monktoberfest conference, and frankly, anything that RedMonk gets up to is, forget what's on stage, which is uniformly excellent; forget the people at RedMonk who are wonderful and I aspire to do more work with them in different ways; they're great, but the people that they attract are invariably interesting, they are invariably incredibly diverse in terms of not just demographics, but interests and proclivities. It's just a wonderful group of people, and every time I get the opportunity to spend time with those folks I do, and I've never once regretted it because I get to meet people like you. Snark and cynicism about sponsoring this nonsense aside—for which I do thank you—you've been a fascinating person to talk to you because you're better at a lot of the database-facing things than I am, so I shortcut to instead of forming my own opinions, I just skate off of yours in some cases. You're going to get letters now.Robert: Well, look, it's an occupational hazard, right? Releasing software, it's hard so you have to learn these platforms, and part of it includes the database. But I tell you, you're spot on about Monktoberfest. I left that conference so motivated. Really opened my eyes, certainly injecting empathy into what I do on a day-to-day basis, but it spurred me to action.And there's a lot of programs that we've started at Liquibase that the germination for that seed came from Monktoberfest. And certainly, you know, we were bummed out that it's been canceled two years in a row, but we can't wait to get back and sponsor it. No end of love and affection for that team. They're also really smart and right about a hundred percent of the time.Corey: That's the most amazing part is that they have opinions that generally tend to mirror my own—which, you know—Robert: [laugh].Corey: —confirmation bias is awesome, but they almost never get it wrong. And that is one of the impressive things is when I do it, I'm shooting from the hip and I already have an apology half-written and ready to go, whereas when dealing with them, they do research on this and they don't have the ‘I'm a loud, abrasive shitpostter on Twitter' defense to fall back on to defend opinions. And if they do, I've never seen them do it. They're right, and the fact that I am as aligned with them as I am, you'd think that one of us was cribbing from the other. I assure you that's not the case.But every time Steve O'Grady or Rachel Stephens, or Kelly—I forget her last name; my apologies is all Twitter, but she studied medieval history, I remember that—or James Governor writes something, I'm uniformly looking at this and I feel a sense of dismay, been, “Dammit. I should have written this. It's so well written and it makes such a salient point.” I really envy their ability to be so consistently on point.Robert: Well, they're the only analysts we pay money to. So, we vote with our dollars with that one. [laugh].Corey: Yeah. I'm only an analyst when people have analyst budget. Other than that, I'm whatever the hell you describe me. So, let's talk about that thing you're here to show. You know, that little side project thing you found and are the CTO of.I wasn't super familiar with what Liquibase does until I looked into it and then had this—I got to say, it really pissed me off because I'm looking at it, and it's how did I not know that this existed back when the exact problems that you solve are the things I was careening headlong into? I was actively annoyed. You're also an open-source project, which means that you're effectively making all of your money by giving things away and hoping for gratitude to come back on you in the fullness of time, right?Robert: Well, yeah. There's two things there. They're open-source component, but also, where was this when I was struggling with this problem? So, for the folks that don't know, what Liquibase does is automate database schema change. So, if you need to update a database—I don't care what it is—as part of your application deployment, we can help.Instead of writing a ticket or manually executing a SQL script, or generating a bunch of docs in a NoSQL database, you can have Liquibase help you out with that. And so I was at a conference years ago, at the booth, doing my booth thing, and a managing director of a very large bank came to me, like, “Hey, what do you do?” And saw what we did and got angry, started yelling at me. “Where were you three years ago when I was struggling with this problem?” Like, spitting mad. [laugh]. And I was like, “Dude, we just started”—this was a while ago—it was like, “We just started the company two years ago. We got here as soon as we could.”But I struggled with this problem when I was a release manager. And so I've been doing this for years and years and years—I don't even want to talk about how long—getting bits from dev to test to production, and the database was always, always, always the bottleneck, whether it was things didn't run the same in test as they did, eventually in production, environments weren't in sync. It's just really hard. And we've automated so much stuff, we've automated application deployment, lowercase a compiled bits; we're building things with containers, so everything's in that container. It's not a J2EE app anymore—yay—but we haven't done a damn thing for the database.And what this means is that we have a whole part of our industry, all of our database professionals, that are frankly struggling. I always say we don't sell software Liquibase. We sell piano recitals, date nights, happy hours, all the stuff you want to do but you can't because you're stuck dealing with the database. And that's what we do at Liquibase.Corey: Well, you're talking about database people. That's not how I even do it. I would never call myself that, for very good reason because you know, Route 53 remains the only database I use. But the problem I always had was that, “Great. I'm doing a deployment. Oh, I'm going to put out some changes to some web servers. Okay, what's my rollback?” “Well, we have this other commit we can use.” “Oh, we're going to be making a database schema change. What's your rollback strategy,” “Oh, I've updated my resume and made sure that any personal files I had on my work laptop been backed up somewhere else when I immediately leave the company when we can't roll back.” Because there's not really going to be a company anymore at that point.It's one of those everyone sort of holds their breath and winces when it comes to anything that resembles a schema change—or an ALTER TABLE as we used to call it—because that is the mistakes will show territory and you can hope and plan for things in pre-prod environments, but it's always scary. It's always terrifying because production is not like other things. That's why I always call my staging environment ‘theory' because things work in theory but not in production. So, it's how do you avoid the mess of winding up just creating disasters when you're dealing with the reality of your production environments? So, let's back up here. How do you do it? Because it sounds like something people would love to sell me but doesn't exist.Robert: [laugh]. Well, it's real simple. We have a file, we call it the change log. And this is a ledger. So, databases need to be evolved. You can't drop everything and recreate it from scratch, so you have to apply changes sequentially.And so what Liquibase will do is it connects to the database, and it says, “Hey, what version are you?” It looks at the change log, and we'll see, ehh, “There's ten change sets”—that's what components of a change log, we call them change sets—“There's ten change sets in there and the database is telling me that only five had been executed.” “Oh, great. Well, I'll execute these other five.” Or it asks the database, “Hey, how many have been executed?” And it says, “Ten.”And we've got a couple of meta tables that we have in the database, real simple, ANSI SQL compliant, that store the changes that happen to the database. So, if it's a net new database, say you're running a Docker container with the database in it on your local machine, it's empty, you would run Liquibase, and it says, “Oh, hey. It's got that, you know, new database smell. I can run everything.”And so the interesting thing happens when you start pointing it at an environment that you haven't updated in a while. So, dev and test typically are going to have a lot of releases. And so there's going to be little tiny incremental changes, but when it's time to go to production, Liquibase will catch it up. And so we speak SQL to the database, if it's a NoSQL database, we'll speak their API and make the changes requested. And that's it. It's very simple in how it works.The real complex stuff is when we go a couple of inches deeper, when we start doing things like, well, reverse engineering of your database. How can I get a change log of an existing database? Because nobody starts out using Liquibase for a project. You always do it later.Corey: No, no. It's one of those things where when you're doing a project to see if it works, it's one of those, “Great, I'll run a database in some local Docker container or something just to prove that it works.” And, “Todo: fix this later.” And yeah, that todo becomes load-bearing.Robert: [laugh]. That's scary. And so, you know, we can help, like, reverse engineering an entire database schema, no problem. We also have things called quality checks. So sure, you can test your Liquibase change against an empty database and it will tell you if it's syntactically correct—you'll get an error if you need to fix something—but it doesn't enforce things like corporate standards. “Tables start with T underscore.” “Do not create a foreign key unless those columns have an ID already applied.” And that's what our quality checks does. We used to call it rules, but nobody likes rules, so we call it quality checks now.Corey: How do you avoid the trap of enumerating all the bad things you've seen happen because at some point, it feels like that's what leads to process ossification at large companies where, “Oh, we had this bad thing happen once, like, a disk filled up, so now we have a check that makes sure that all the disks are at least 20, empty.” Et cetera. Great. But you keep stacking those you have thousands and thousands and thousands of those, and even a one-line code change then has to pass through so many different tests to validate that this isn't going to cause the failure mode that happened that one time in a unicorn circumstance. How do you avoid the bloat and the creep of stuff like that?Robert: Well, let's look at what we've learned from automated testing. We certainly want more and more tests. Look, DevOp's algorithm is, “All right, we had a problem here.” [laugh]. Or SRE algorithm, I should say. “We had a problem here. What happened? What are we going to change in the future to make sure this doesn't happen?” Typically, that involves a new standard.Now, ossification occurs when a person has to enforce that standard. And what we should do is seek to have automation, have the machine do it for us. Have the humans come up and identify the problem, find a creative way to look for the issue, and then let the machine enforce it. Ossification happens in large organizations when it's people that are responsible, not the machine. The machines are great at running these things over and over again, and they're never hung over, day after Super Bowl Sunday, their kid doesn't get sick, they don't get sick. But we want humans to look at the things that we need that creative energy, that brain power on. And then the rote drudgery, hand that off to the machine.Corey: Drudgery seems like sort of a job description for a lot of us who spend time doing operation stuff.Robert: [laugh].Corey: It's drudgery and it's boring, punctuated by moments of sheer terror. On some level, you're more or less taking some of the adrenaline high of this job away from people. And you know, when it comes to databases, I'm kind of okay with that as it turns out.Robert: Yeah. Oh, yeah, we want no surprises in database-land. And that is why over the past several decades—can I say several decades since 1979?Corey: Oh, you can s—it's many decades, I'm sorry to burst your bubble on that.Robert: [laugh]. Thank you, Corey. Thank you.Corey: Five, if we're being honest. Go ahead.Robert: So, it has evolved over these many decades where change is the enemy of stability. And so we don't want change, and we want to lock these things down. And our database professionals have become changed from sentinels of data into traffic cops and TSA. And as we all know, some things slip through those. Sometimes we speed, sometimes things get snuck through TSA.And so what we need to do is create a system where it's not the people that are in charge of that; that we can set these policies and have our database professionals do more valuable things, instead of that adrenaline rush of, “Oh, my God,” how about we get the rush of solving a problem and saving the company millions of dollars? How about that rush? How about the rush of taking our old, busted on-prem databases and figure out a way to scale these up in the cloud, and also provide quick dev and test environments for our developer and test friends? These are exciting things. These are more fun, I would argue.Corey: You have a list of reference customers on your website that are awesome. In fact, we share a reference customer in the form of Ticketmaster. And I don't think that they will get too upset if I mention that based upon my work with them, at no point was I left with the impression that they played fast and loose with databases. This was something that they take very seriously because for any company that, you know, sells tickets to things you kind of need an authoritative record of who's bought what, or suddenly you don't really have a ticket-selling business anymore. You also reference customers in the form of UPS, which is important; banks in a variety of different places.Yeah, this is stuff that matters. And you support—from the looks of it—every database people can name except for Route 53. You've got RDS, you've got Redshift, you've got Postgres-squeal, you've got Oracle, Snowflake, Google's Cloud Spanner—lest people think that it winds up being just something from a legacy perspective—Cassandra, et cetera, et cetera, et cetera, CockroachDB. I could go on because you have multiple pages of these things, SAP HANA—whatever the hell that's supposed to be—Yugabyte, and so on, and so forth. And it's like, some of these, like, ‘now you're just making up animals' territory.Robert: Well, that goes back to open-source, you know, you were talking about that earlier. There is no way in hell we could have brought out support for all these database platforms without us being open-source. That is where the community aligns their goals and works to a common end. So, I'll give you an example. So, case in point, recently, let me see Yugabyte, CockroachDB, AWS Redshift, and Google Cloud Spanner.So, these are four folks that reached out to us and said, either A) “Hey, we want Liquibase to support our database,” or B) “We want you to improve the support that's already there.” And so we have what we call—which is a super creative name—the Liquibase test harness, which is just genius because it's an automated way of running a whole suite of tests against an arbitrary database. And that helped us partner with these database vendors very quickly and to identify gaps. And so there's certain things that AWS Redshift—certain objects—that AWS Redshift doesn't support, for all the right reasons. Because it's data warehouse.Okay, great. And so we didn't have to run those tests. But there were other tests that we had to run, so we create a new test for them. They actually wrote some of those tests. Our friends at Yugabyte, CockroachDB, Cloud Spanner, they wrote these extensions and they came to us and partnered with us.The only way this works is with open-source, by being open, by being transparent, and aligning what we want out of life. And so what our friends—our database friends—wanted was they wanted more tooling for their platform. We wanted to support their platform. So, by teaming up, we help the most important person, [laugh] the most important person, and that's the customer. That's it. It was not about, “Oh, money,” and all this other stuff. It was, “This makes our customers' lives easier. So, let's do it. Oop, no brainer.”Corey: There's something to be said for making people's lives easier. I do want to talk about that open-source versus commercial divide. If I Google Liquibase—which, you know, I don't know how typing addresses in browsers works anymore because search engines are so fast—I just type in Liquibase. And the first thing it spits me out to is liquibase.org, which is the Community open-source version. And there's a link there to the Pro paid version and whatnot. And I was just scrolling idly through the comparison chart to see, “Oh, so ‘Community' is just code for shitty and you're holding back advanced features.” But it really doesn't look that way. What's the deal here?Robert: Oh, no. So, Liquibase open-source project started in 2006 and Liquibase the company, the commercial entity, started after that, 2012; 2014, first deal. And so, for—Nathan Voxland started this, and Nathan was struggling. He was working at a company, and he had to have his application—of course—you know, early 2000s, J2EE—support SQL Server and Oracle and he was struggling with it. And so he open-sourced it and added more and more databases.Certainly, as open-source databases grew, obviously he added those: MySQL, Postgres. But we're never going to undo that stuff. There's rollback for free in Liquibase, we're not going to be [laugh] we're not going to be jerks and either A) pull features out or, B) even worse, make Stephen O'Grady's life awful by changing the license [laugh] so he has to write about it. He loves writing about open-source license changes. We're Apache 2.0 and so you can do whatever you want with it.And we believe that the things that make sense for a paying customer, which is database-specific objects, that makes sense. But Liquibase Community, the open-source stuff, that is built so you can go to any database. So, if you have a change log that runs against Oracle, it should be able to run against SQL Server, or MySQL, or Postgres, as long as you don't use platform-specific data types and those sorts of things. And so that's what Community is about. Community is about being able to support any database with the same change log. Pro is about helping you get to that next level of DevOps Nirvana, of reaching those four metrics that Dr. Forsgren tells us are really important.Corey: Oh, yes. You can argue with Nicole Forsgren, but then you're wrong. So, why would you ever do that?Robert: Yeah. Yeah. [laugh]. It's just—it's a sucker's bet. Don't do it. There's a reason why she's got a PhD in CS.Corey: She has been a recurring guest on this show, and I only wish she would come back more often. You and I are fun to talk to, don't get me wrong. We want unbridled intellect that is couched in just a scintillating wit, and someone is great to talk to. Sorry, we're both outclassed.Robert: Yeah, you get entertained with us; you learn with her.Corey: Exactly. And you're still entertained while doing it is the best part.Robert: [laugh]. That's the difference between Community and Pro. Look, at the end of the day, if you're an individual developer just trying to solve a problem and get done and away from the computer and go spend time with your friends and family, yeah, go use Liquibase Community. If it's something that you think can improve the rest of the organization by teaming up and taking advantage of the collaboration features? Yes, sure, let us know. We're happy to help.Corey: Now, if people wanted to become an attorney, but law school was too expensive, out of reach, too much time, et cetera, but they did have a Twitter account, very often, they'll find that they can scratch that itch by arguing online about open-source licenses. So, I want to be very clear—because those people are odious when they email me—that you are licensed under the Apache License. That is a bonafide OSI approved open-source license. It is not everyone except big cloud companies, or service providers, which basically are people dancing around—they mean Amazon. So, let's be clear. One, are you worried about Amazon launching a competitive service with a dumb name? And/or have you really been validated as a product if AWS hasn't attempted and failed to launch a competitor?Robert: [laugh]. Well, I mean, we do have a very large corporation that has embedded Liquibase into one of their flagship products, and that is Oracle. They have embedded Liquibase in SQLcl. We're tickled pink because that means that, one, yes, it does validate Liquibase is the right way to do it, but it also means more people are getting help. Now, for Oracle users, if you're just an Oracle shop, great, have fun. We think it's a great solution. But there's not a lot of those.And so we believe that if you have Liquibase, whether it's open-source or the Pro version, then you're going to be able to support all the databases, and I think that's more important than being tied to a single cloud. Also—this is just my opinion and take it for what it's worth—but if Amazon wanted to do this, well, they're not the only game in town. So, somebody else is going to want to do it, too. And, you know, I would argue even with Amazon's backing that Liquibase is a little stronger brand than anything they would come out with.Corey: This episode is sponsored by our friends at Oracle HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the worlds most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don't ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense. Corey: So, I want to call out though, that on some level, they have already competed with you because one of database that you do not support is DynamoDB. Let's ignore the Route 53 stuff because, okay. But the reason behind that, having worked with it myself, is that, “Oh, how do you do a schema change in DynamoDB?” The answer is that you don't because it doesn't do schemas for one—it is schemaless, which is kind of the point of it—as well as oh, you want to change the primary, or the partition, or the sort key index? Great. You need a new table because those things are immutable.So, they've solved this Gordian Knot just like Alexander the Great did by cutting through it. Like, “Oh, how do you wind up doing this?” “You don't do this. The end.” And that is certainly an approach, but there are scenarios where those were first, NoSQL is not a acceptable answer for some workloads.I know Rick [Horahan 00:26:16] is going to yell at me for that as soon as he hears me, but okay. But there are some for which a relational database is kind of a thing, and you need that. So, Dynamo isn't fit for everything. But there are other workloads where, okay, I'm going to just switch over. I'm going to basically dump all the data and add it to a new table. I can't necessarily afford to do that with anything less than maybe, you know, 20 milliseconds of downtime between table one and table two. And they're obnoxious and difficult ways to do it, but for everything else, you do kind of need to make ALTER TABLE changes from time to time as you go through the build and release process.Robert: Yeah. Well, we certainly have plans for DynamoDB support. We are working our way through all the NoSQLs. Started with Mongo, and—Corey: Well, back that out a second then for me because there's something I'm clearly not grasping because it's my understanding, DynamoDB is schemaless. You can put whatever you want into various arbitrary fields. How would Liquibase work with something like that?Robert: Well, that's something I struggled with. I had the same question. Like, “Dude, really, we're a schema change tool. Why would we work with a schemaless database?” And so what happened was a soon-to-be friend of ours in Europe had reached out to me and said, “I built an extension for MongoDB in Liquibase. Can we open-source this, and can y'all take care of the care and feeding of this?” And I said, “Absolutely. What does it do?” [laugh].And so I looked at it and it turns out that it focuses on collections and generating data for test. So, you're right about schemaless because these are just documents and we're not going to go through every single document and change the structure, we're just going to have the application create a new doc and the new format. Maybe there's a conversion log logic built into the app, who knows. But it's the database professionals that have to apply these collections—you know, indices; that's what they call them in Mongo-land: collections. And so being able to apply these across all environments—dev, test, production—and have consistency, that's important.Now, what was really interesting is that this came from MasterCard. So, this engineer had a consulting business and worked for MasterCard. And they had a problem, and they said, “Hey, can you fix this with Liquibase?” And he said, “Sure, no problem.” And he built it.So, that's why if you go to the MongoDB—the liquibase-mongodb repository in our Liquibase org, you'll see that MasterCard has the copyright on all that code. Still Apache 2.0. But for me, that was the validation we needed to start expanding to other things: Dynamo, Couch. And same—Corey: Oh, yeah. For a lot of contributors, there's a contributor license process you can go through, assign copyright. For everything else, there's MasterCard.Robert: Yeah. Well, we don't do that. Look, you know, we certainly have a code of conduct with our community, but we don't have a signing copyright and that kind of stuff. Because that's baked into Apache 2.0. So, why would I want to take somebody's ability to get credit and magical internet points and increase the rep by taking that away? That's just rude.Corey: The problem I keep smacking myself into is just looking at how the entire database space across the board goes, it feels like it's built on lock-in, it's built on it is super finicky to work with, and it generally feels like, okay, great. You take something like Postgres-squeal or whatever it is you want to run your database on, yeah, you could theoretically move it a bunch of other places, but moving databases is really hard. Back when I was at my last, “Real job,” quote-unquote, years ago, we were late to the game; we migrated the entire site from EC2 Classic into a VPC, and the biggest pain in the ass with all of that was the RDS instance. Because we had to quiesce the database so it would stop taking writes; we would then do snapshot it, shut it down, and then restore a new database from that RDS snapshot.How long does it take, at least in those days? That is left as an experiment for the reader. So, we booked a four hour maintenance window under the fear that would not be enough. It completed in 45 minutes. So okay, there's that. Sparked the thing up and everything else was tested and good to go. And yay. Okay.It took a tremendous amount of planning, a tremendous amount of work, and that wasn't moving it very far. It is the only time I've done a late-night deploy, where not a single thing went wrong. Until I was on the way home and the Uber driver sideswiped a city vehicle. So, there we go—Robert: [laugh].Corey: —that's the one. But everything else was flawless on this because we planned these things out. But imagine moving to a different provider. Oh, forget it. Or imagine moving to a different database engine? That's good. Tell another one.Robert: Well, those are the problems that we want our database professionals to solve. We do not want them to be like janitors at an elementary school, cleaning up developer throw-up with sawdust. The issue that you're describing, that's a one time event. This is something that doesn't happen very often. You need hands on the keyboard, you want people there to look for problems.If you can take these database releases away from those folks and automate them safely—you can have safety and speed—then that frees up their time to do these other herculean tasks, these other feats of strength that they're far better at. There is no silver bullet panacea for database issues. All we're trying to do is take about 70% of DBAs time and free it up to do the fun stuff that you described. There are people that really enjoy that, and we want to free up their time so they can do that. Moving to another platform, going from the data center to the cloud, these sorts of things, this is what we want a human on; we don't want them updating a column three times in a row because dev couldn't get it right. Let's just give them the keys and make sure they stay in their lane.Corey: There's something glorious about being able to do that. I wish that there were more commonly appreciated ways of addressing those pains, rather than, “Oh, we're going to sell you something big and enterprise-y and it's going to add a bunch of process and not work out super well for you.” You integrate with existing CI/CD systems reasonably well, as best I can tell because the nice thing about CI/CD—and by nice I mean awful—is that there is no consensus. Every pipeline you see, in a release engineering process inherently becomes this beautiful bespoke unicorn.Robert: Mm-hm. Yeah. And we have to. We have to integrate with whatever CI/CD they have in place. And we do not want customers to just run Liquibase by itself. We want them to integrate it with whatever is driving that application deployment.We're Switzerland when it comes to databases, and CI/CD. And I certainly have my favorite of those, and it's primarily based on who bought me drinks at the last conference, but we cannot go into somebody's house and start rearranging the furniture. That's just rude. If they're deploying the app a certain way, what we tell that customer is, “Hey, we're just going to have that CI/CD tool call Liquibase to update the database. This should be an atomic unit of deployment.” And it should be hidden from the person that pushes that shiny button or the automation that does it.Corey: I wish that one day that you could automate all of the button pushing, but the thing that always annoyed me in release engineering was the, “Oh, and here's where we stop to have a human press the button.” And I get it. That stuff's scary for some folks, but at the same time, this is the nature of reality. So, you're not going to be able to technology your way around people. At least not successfully and not for very long.Robert: It's about trust. You have to earn that database professional's trust because if something goes wrong, blaming Liquibase doesn't go very far. In that company, they're going to want a person [laugh] who has a badge to—with a throat to choke. And so I've seen this pattern over and over again.And this happened at our first customer. Major, major, big, big, big bank, and this was on the consumer side. They were doing their first production push, and they wanted us ready. Not on the call, but ready if there was an issue they needed to escalate and get us to help them out. And so my VP of Engineering and me, we took it. Great. Got VP of engineering and CTO. Right on.And so Kevin and I, we stayed home, stayed sober [laugh], you know—a lot of places to party in Austin; we fought that temptation—and so we stayed and I'm texting with Kevin, back and forth. “Did you get a call?” “No, I didn't get a call.” It was Friday night. Saturday rolls around. Sunday. “Did you get a—what's going on?” [laugh].Monday, we're like, “Hey. Everything, okay? Did you push to the next weekend?” They're like, “Oh, no. We did. It went great. We forgot to tell you.” [laugh]. But here's what happened. The DBAs push the Liquibase ‘make it go' button, and then they said, “Uh-Oh.” And we're like, “What do you mean, uh-oh?” They said, “Well, something went wrong.” “Well, what went wrong?” “Well, it was too fast.” [laugh]. Something—no way. And so they went through the whole thing—Corey: That was my downtime when I supposed to be compiling.Robert: Yeah. So, they went through the whole thing to verify every single change set. Okay, so that was weekend one. And then they go to weekend two, they do it the same thing. All right, all right. Building trust.By week four, they called a meeting with the release team. And they said, “Hey, process change. We're no longer going to be on these calls. You are going to push the Liquibase button. Now, if you want to integrate it with your CI/CD, go right ahead, but that's not my problem.” Dev—or, the release team is tier one; dev is tier two; we—DBAs—are tier three support, but we'll call you because we'll know something went wrong. And to this day, it's all automated.And so you have to earn trust to get people to give that up. Once they have trust and you really—it's based on empathy. You have to understand how terrible [laugh] they are sometimes treated, and to actively take care of them, realize the problems they're struggling with, and when you earn that trust, then and only then will they allow automation. But it's hard, but it's something you got to do.Corey: You mentioned something a minute ago that I want to focus on a little bit more closely, specifically that you're in Austin. Seems like that's a popular choice lately. You've got companies that are relocating their headquarters there, presumably for tax purposes. Oracle's there, Tesla's there. Great. I mean, from my perspective, terrific because it gets a number of notably annoying CEOs out of my backyard. But what's going on? Why is Austin on this meteoric rise and how'd it get there?Robert: Well, a lot of folks—overnight success, 40 years in the making, I guess. But what a lot of people don't realize is that, one, we had a pretty vibrant tech hub prior to all this. It all started with MCC, Microcomputer Consortium, which in the '80s, we were afraid of the Japanese taking over and so we decided to get a bunch of companies together, and Admiral Bobby Inman who was director planted it in Austin. And that's where it started. You certainly have other folks that have a huge impact, obviously, Michael Dell, Austin Ventures, a whole host of folks that have really leaned in on tech in Austin, but it actually started before that.So, there was a time where Willie Nelson was in Nashville and was just fed up with RCA Records. They would not release his albums because he wanted to change his sound. And so he had some nice friends at Atlantic Records that said, “Willie, we got this. Go to New York, use our studio, cut an album, we'll fix it up.” And so he cut an album called Shotgun Willie, famous for having “Whiskey River” which is what he uses to open and close every show.But that album sucked as far as sales. It's a good album, I like it. But it didn't sell except for one place in America: in Austin, Texas. It sold more copies in Austin than anywhere else. And so Willie was like, “I need to go check this out.”And so he shows up in Austin and sees a bunch of rednecks and hippies hanging out together, really geeking out on music. It was a great vibe. And then he calls, you know, Kris, and Waylon, and Merle, and say, “Come on down.” And so what happened here was a bunch of people really wanted to geek out on this new type of country music, outlaw country. And it started a pattern where people just geek out on stuff they really like.So, same thing with Austin film. You got Robert Rodriguez, you got Richard Linklater, and Slackers, his first movie, that's why I moved to Austin. And I got a job at Les Amis—a coffee shop that's closed—because it had three scenes in that. There was a whole scene of people that just really wanted to make different types of films. And we see that with software, we see that with film, we see it with fashion.And it just seems that Austin is the place where if you're really into something, you're going to find somebody here that really wants to get into it with you, whether it's board gaming, D&D, noise punk, whatever. And that's really comforting. I think it's the community that's just welcoming. And I just hope that we can continue that creativity, that sense of community, and that we don't have large corporations that are coming in and just taking from the system. I hope they inject more.I think Oracle's done a really good job; their new headquarters is gorgeous, they've done some really good things with the city, doing a land swap, I think it was forty acres for nine acres. They coughed up forty for nine. And it was nine acres the city wasn't even using. Great. So, I think they're being good citizens. I think Tesla's been pretty cool with building that factory where it is. I hope more come. I hope they catch what is ever in the water and the breakfast tacos in Austin.Corey: [laugh]. I certainly look forward to this pandemic ending; I can come over and find out for myself. I'm looking forward to it. I always enjoyed my time there, I just wish I got to spend more of it.Robert: How many folks from Duckbill Group are in Austin now?Corey: One at the moment. Tim Banks. And the challenge, of course, is that if you look across the board, there really aren't that many places that have more than one employee. For example, our operations person, Megan, is here in San Francisco and so is Jesse DeRose, our manager of cloud economics. But my business partner is in Portland; we have people scattered all over the country.It's kind of fun having a fully-distributed company. We started this way, back when that was easy. And because all right, travel is easy; we'll just go and visit whenever we need to. But there's no central office, which I think is sort of the dangerous part of full remote because then you have this idea of second-class citizens hanging out in one part of the country and then they go out to lunch together and that's where the real decisions get made. And then you get caught up to speed. It definitely fosters a writing culture.Robert: Yeah. When we went to remote work, our lease was up. We just didn't renew. And now we have expanded hiring outside of Austin, we have folks in the Ukraine, Poland, Brazil, more and more coming. We even have folks that are moving out of Austin to places like Minnesota and Virginia, moving back home where their family is located.And that is wonderful. But we are getting together as a company in January. We're also going to, instead of having an office, we're calling it a ‘Liquibase Lounge.' So, there's a number of retail places that didn't survive, and so we're going to take one of those spots and just make a little hangout place so that people can come in. And we also want to open it up for the community as well.But it's very important—and we learned this from our friends at GitLab and their culture. We really studied how they do it, how they've been successful, and it is an awareness of those lunch meetings where the decisions are made. And it is saying, “Nope, this is great we've had this conversation. We need to have this conversation again. Let's bring other people in.” And that's how we're doing at Liquibase, and so far it seems to work.Corey: I'm looking forward to seeing what happens, once this whole pandemic ends, and how things continue to thrive. We're long past due for a startup center that isn't San Francisco. The whole thing is based on the idea of disruption. “Oh, we're disruptive.” “Yes, we're so disruptive, we've taken a job that can be done from literally anywhere with internet access and created a land crunch in eight square miles, located in an earthquake zone.” Genius, simply genius.Robert: It's a shame that we had to have such a tragedy to happen to fix that.Corey: Isn't that the truth?Robert: It really is. But the toothpaste is out of the tube. You ain't putting that back in. But my bet on the next Tech Hub: Kansas City. That town is cool, it has one hundred percent Google Fiber all throughout, great university. Kauffman Fellows, I believe, is based there, so VC folks are trained there. I believe so; I hope I'm not wrong with that. I know Kauffman Foundation is there. But look, there's something happening in that town. And so if you're a buy low, sell high kind of person, come check us out in Austin. I'm not trying to dissuade anybody from moving to Austin; I'm not one of those people. But if the housing prices [laugh] you don't like them, check out Kansas City, and get that two-gig fiber for peanuts. Well, $75 worth of peanuts.Corey: Robert, I want to thank you for taking the time to speak with me so extensively about Liquibase, about how awesome RedMonk is, about Austin and so many other topics. If people want to learn more, where can they find you?Robert: Well, I think the best place to find us right now is in AWS Marketplace. So—Corey: Now, hand on a second. When you say the best place for anything being the AWS Marketplace, I'm naturally a little suspicious. Tell me more.Robert: [laugh]. Well, best is, you know, it's—[laugh].Corey: It is a place that is there and people can find you through it. All right, then.Robert: I have a list. I have a list. But the first one I'm going to mention is AWS Marketplace. And so that's a really easy way, especially if you're taking advantage of the EDP, Enterprise Discount Program. That's helpful. Burn down those dollars, get a discount, et cetera, et cetera. Now, of course, you can go to liquibase.com, download a trial. Or you can find us on Github, github.com/liquibase. Of course, talking smack to us on Twitter is always appreciated.Corey: And we will, of course, include links to that in the [show notes 00:46:37]. Robert Reeves, CTO and co-founder of Liquibase. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment complaining about how Liquibase doesn't support your database engine of choice, which will quickly be rendered obsolete by the open-source community.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
Gil Hoffer (@gilhoffer, Co-Founder / CTO @salto_io) talks about the challenges of managing and integrating SaaS applications, the blurring line between business and technical engineer, and software supply-chain for SaaS integrations. SHOW: 575CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotwCHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"SHOW SPONSORS:Megaport - Network as a Service PlatformTry Megaport - Cloud Connectivity SimplifiedCBT Nuggets: Expert IT Training for individuals and teamsSign up for a CBT Nuggets Free Learner accountSHOW NOTES:Salto (homepage)Companies today have, on average, 800 SaaS applicationsSalto (OSS code) - manage business applications by codeTopic 1 - Welcome to the show. You have a broad background in engineering and entrepreneurship. Let's talk about your background and what led you to create Salto. Topic 2 - There is a big trend happening, where companies are using more and more technology outside of the “central IT” groups. This includes quite a bit of SaaS usage. But it also requires more than just vanilla SaaS. Let's talk about this concept of a “business engineer”.Topic 3 - How often do SaaS applications need customizations? How often does a business application require integrations across multiple SaaS services? Topic 4 - Walk us through a day in the life of a business engineer, and what's involved in some of the integrations? How much is being able to code, and how much Topic 5 - Integrations can be complicated, not only to build, but also to maintain. How does Salto simplify the creation and life cycle of these integrations?Topic 6 - Salto is available as OSS, software and SaaS. What seems to be the best usage model for certain use-cases or types of businesses?FEEDBACK?Email: show at the cloudcast dot netTwitter: @thecloudcastnet
Continuous integration and delivery (CI/CD) has seen some radical changes during the past few years, especially for continuous delivery. While not so long ago, application development and delivery was exclusively for monolithic stacks but delivering software for microservices and container environments is a very different animal.In this The New Stack Maker podcast, recorded at KubeCon+CloudNativeCon in October, guest Rob Zuber, chief technology officer at CircleCI, discusses the evolution of CI/CD from the perspective of CircleCI's experience for over a decade.Alex Williams, founder and publisher of The New Stack, hosted this podcast.
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:01:31] AWS outage =) What's your theory?https://aws.amazon.com/premiumsupport/technology/pes/[00:04:00] AWS WAF adds support for CloudWatch Log and logging directly to S3 buckethttps://aws.amazon.com/about-aws/whats-new/2021/12/awf-waf-cloudwatch-log-s3-bucket/[00:04:30] AWS announces Construct Hub general availabilityhttps://aws.amazon.com/about-aws/whats-new/2021/12/aws-construct-hub-availability/[00:08:28] Amazon DevOps Guru for RDS Aurora to Detect, Diagnose, and Resolve Issueshttps://aws.amazon.com/blogs/aws/new-amazon-devops-guru-for-rds-to-detect-diagnose-and-resolve-amazon-aurora-related-issues-using-ml/[00:10:48] Summary of re:Invent Announcements and this one, and security announcementshttps://acloudguru.com/blog/engineering/aws-reinvent-2021-the-biggest-announcementshttps://aws.amazon.com/blogs/aws/top-announcements-of-aws-reinvent-2021/https://venturebeat.com/2021/12/03/the-top-12-security-announcements-at-aws-reinvent-2021/[00:17:50] Cloud Posse API Gateway Module and AWS Airflow WIPhttps://github.com/cloudposse/terraform-aws-api-gatewayhttps://github.com/cloudposse/terraform-aws-mwaa[00:19:27] Service Mesh options? [00:36:24] AWS AppSync service — gotchas, pitfalls, etc.[00:39:18] Pain using Terraform to apply helm charts instead of helmfile [00:46:15] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
Writing your application's code is only half the battle. Getting it to run on your machine is a milestone, but it's far from your code running in a production environment. There are an increasing set of options application designers have for helping to manage deployment, environments, and CI/CD. Encore is a backend engine for the The post Building Go Apps Using Encore with André Eriksson appeared first on Software Engineering Daily.
Writing your application's code is only half the battle. Getting it to run on your machine is a milestone, but it's far from your code running in a production environment. There are an increasing set of options application designers have for helping to manage deployment, environments, and CI/CD. Encore is a backend engine for the
Writing your application's code is only half the battle. Getting it to run on your machine is a milestone, but it's far from your code running in a production environment. There are an increasing set of options application designers have for helping to manage deployment, environments, and CI/CD. Encore is a backend engine for the The post Building Go Apps Using Encore with André Eriksson appeared first on Software Engineering Daily.
Ajay Nair is the General Manager (AWS Lambda Experience) at AWS. Ajay is one of the founding members of the AWS Lambda team, in his current role, drives the serverless product strategy and leads a talented team driving the product roadmap, feature delivery, and business results. Throughout his career, Ajay has focused on building and helping developers build large scale distributed systems, with deep expertise in cloud native application platforms, big data systems, and streamlining development experiences. He is also a co-author of Serverless Architectures on AWS, which teaches you how to design, secure, and manage serverless backend APIs for web and mobile applications on the AWS platform. Twitter: @ajaynairthinks LinkedIn: https://www.linkedin.com/in/ajnair/ Serverless Land: https://serverlessland.com Talia Nassi is a Senior Developer Advocate at AWS Serverless and an international keynote speaker who delivers content on all things testing and quality. Previously, she worked at Split Software as a developer advocate and at WeWork as an engineer, and implemented Testing in Production from start to finish! She is passionate about feature flagging, canary launches, CI/CD, testing in production, and A/B testing. She has spoken at countless conferences internationally, ranging from audiences of 100 to 4000! Twitter: @talia_nassi LinkedIn: https://www.linkedin.com/in/talianassi/ Serverless Land: https://serverlessland.com
Unedited live recording on YouTube Ep 126Topics Compose V2 is a Docker Plug-in (written in Go) Compose Spec (no more yaml versions) Service Profiles Compose ls (list all running compose projects) BuildKit by default Compose cp (copy files in/out) Compose convert Compose up to ACI and ECS Compose up to Kubernetes Compose command aliases Links GitOps Days 2021 - Day 1 (June 9, 2021) video on YouTube GitOps Days 2021 - Day 2 (June 10, 2021) video on YouTube Compose CLI on GitHub Compose Spec on GitHub Docker Plugins installer (GitHub) My Shell Setup Building with Buildx ★ Support this podcast on Patreon ★
About ClintonClinton Herget is Principal Solutions Engineer at Snyk, where he focuses on helping our large enterprise and public sector clients on their journey to DevSecOps. A seasoned technologist, Clinton spent his 15+ year career prior to Snyk as a web software engineer, DevOps consultant, cloud solutions architect, and technical director in the systems integrator space, leading client delivery of complex agile technology solutions. Clinton is passionate about empowering software engineers and is a frequent conference speaker, developer advocate, and everything-as-code evangelist.Links:Try Snyk for free today at:https://app.snyk.io/login?utm_campaign=Screaming-in-the-Cloud-podcast&utm_medium=Partner&utm_source=AWS TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by my friends at ThinkstCanary. Most companies find out way too late that they've been breached. ThinksCanary changes this and I love how they do it. Deploy canaries and canary tokens in minutes and then forget about them. What's great is the attackers tip their hand by touching them, giving you one alert, when it matters. I use it myself and I only remember this when I get the weekly update with a “we're still here, so you're aware” from them. It's glorious! There is zero admin overhead to this, there are effectively no false positives unless I do something foolish. Canaries are deployed and loved on all seven continents. You can check out what people are saying at canary.love. And, their Kub config canary token is new and completely free as well. You can do an awful lot without paying them a dime, which is one of the things I love about them. It is useful stuff and not an, “ohh, I wish I had money.” It is speculator! Take a look; that's canary.love because it's genuinely rare to find a security product that people talk about in terms of love. It really is a unique thing to see. Canary.love. Thank you to ThinkstCanary for their support of my ridiculous, ridiculous non-sense. Corey: Writing ad copy to fit into a 30 second slot is hard, but if anyone can do it the folks at Quali can. Just like their Torque infrastructure automation platform can deliver complex application environments anytime, anywhere, in just seconds instead of hours, days or weeks. Visit Qtorque.io today and learn how you can spin up application environments in about the same amount of time it took you to listen to this ad.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted episode features Clinton Herget, who's a principal solutions engineer at Snyk. Or ‘Snick.' Or ‘Cynic.' Clinton, thank you for joining me, how the heck do I pronounce your company's name?Clinton: That is always a great place to start, Corey, and we like to say it is ‘sneak' as in sneaking around or a pair of sneakers. Now, our colleagues in the UK do like to say ‘Snick,' but that is because they speak incorrectly. We will accept it; it is still wrong. As long as you're not saying ‘Sink' because it really has nothing to do with plumbing and we prefer to avoid that association.Corey: Generally speaking, I try not to tell other people how to run their business, but I will make an exception here because I can't take it anymore. According to CrunchBase, your company has raised $1.4 billion. Buy a vowel for God's sake. How much could it possibly cost for a single letter that clarifies all of this? My God.Clinton: Yeah, but then we wouldn't spend the first 20 minutes of every sales conversation talking about how to pronounce the company name and we would need to fill that with content. So, I think we're just going to stay the course from here on out.Corey: I like that. So, you're a principal solutions engineer. First, what does that do? And secondly, I've known an awful lot of folks who I would consider problem engineers, but they never self-describe that way. It's always solutions-oriented?Clinton: Well, it's because I worked for Snyk, and we're not a problems company, Corey, we're a solutions company.Corey: I like that.Clinton: It's an interesting role, right, because I work with some of our biggest customers, a lot of our strategic partners here in North America, and I'm kind of the evangelist that comes out and says, “Hey, here's what sucks about being a developer. Here's how we could maybe be better.” And I want to connect with other engineers to say, “Look, I share your pain, there might be an easier way, if you, you know, give me a few minutes here to talk about Snyk.”Corey: So, I've seen Snyk around for a while. I've had a few friends who worked there almost since the beginning and they talk about this thing—this was before, I believe, you had the Dobermann logo back in the early days—and I keep periodically seeing you folks in a variety of different contexts and different places. Often I'll be installing something from Docker Hub, for example, and it will mention that, oh, there's a Snyk scan thing that has happened on the command line, which is interesting because I, to the best of my knowledge, don't pay Docker for things that I do because, “No, I'm going to build it myself out of popsicle sticks,” is sort of my entire engineering ethos. But I keep seeing you in different cases where as best I am aware, I have never paid you folks for services. What is it you do as a company because you're one of those folks that I just keep seeing again and again and again, but I can't actually put my finger on what it is you do.Clinton: Yeah, you know, most people aren't aware that popsicle sticks are actually a CNCF graduated project. So, you know, that's that—Corey: Oh, and they're load-bearing in almost every piece of significant technical debt over the last 50 years.Clinton: Absolutely. Look at your bill of materials; it's there. Well, here's where I can drop in the other fun fact about Snyk's name, it's actually an acronym, right, stands for So, Now You Know. So, now you know that much, at least. Popsicle sticks, key component to any containerized infrastructure. Look, Snyk is a developer security company, right? And people hear that and go, “I'm sorry, what? I'm a developer; I don't give a shit about security.” Or, “I'm a security person”—Corey: Usually they don't say that out loud as often as you would hope, but it's like, “That's not true. I say that I care about security an awful lot.” It's like, “Yeah, you say that. Therein lies the rub.”Clinton: Until you get a couple of drinks in them at the party at re:Invent and then the real stuff comes out, right? No, Snyk is always been historically committed to the open-source community. We want to help open-source developers every bit as much as, you know, we're helping the engineers at our top-tier customers. And that's because fundamentally, open-source is inextricably linked to the way software is developed today, right? There is nobody not using open-source.And so we, sort of, have to be supporting those communities at the same time. And that fundamentally is where the innovation is happening. And you know, my sales guys hate when I say this, right, but you can get an amazing amount of value out of Snyk by using the freemium solution, using the open-source tooling that we've put out in the community, you get full access to our vulnerability database, which is updated every day, and if you're working on public projects, that's going to be free forever, right? We're fundamentally committed to making that work. If you're an enterprise that happens to have money to spend, I guess we'll take that too, right, but my job is really talking to developers and figuring out, you know, how can we reduce the amount of pain in your life through better security tooling?Corey: The challenging part is that your business, although I confess is significantly larger than my business, we're sort of on some level solving the same problem. And that sounds odd to say because I focus on fixing AWS bills and you're focused on improving developer security. But I'm moving up about six levels to the idea that there are only two big problems in the world of technology, in the world of companies for that matter. And the problem that we're solving is the worst one of the two. And that is reducing risk exposure.It is about eliminating downside. It's cost optimization, it's security tooling, it is insurance, et cetera, et cetera, et cetera. And the other problem, the one that I've always found, that is the thing that will get people actually excited rather than something they feel obligated to do is speeding up time to market, improving feature velocity, being able to deliver the right things sooner. That's the problem companies are biasing towards investing in extremely heavily. They'll convene the board to come up with an answer there.That said, you stray closer into that problem space than most security companies that I'm aware of just because you do in fact, speed up the developer process. It let people move faster, but do it safely at least is my general understanding. If I'm completely wrong on this, and, “Nope, we are purely risk mitigation, then this is going to look fairly silly, but it wouldn't be the first time I put my foot in my mouth.”Clinton: Yeah, Corey, it sounds like you really read the first three words of the website, right? “Develop fast. Stay secure.” And I think that fundamentally gets at the traditional alignment, where security equals slow, right, because risk mitigation is all about preventing problematic things from going into production. But only doing that as a stop gate at the end of the process, right, by essentially saying we assume all developers are bad and want to do bad things, and so we're going to put up this big gate and generate an 1100 page PDF, and then throw it back to them and say, “Now, go figure out all of the bad things you did and how to fix them. And by the way, you're already overshooting your delivery target.” Right? So, there's no way to win in that traditional model unless you're empowering developers earlier with the right context they need to actually write more secure code to begin with, rather than remediating after the fact when those fixes are actually most expensive.Corey: It's the idea of the people who want to slow down and protect things and not break are on the operation side of the world, and then you have developers who want to ship things. And you have that natural tension, so we're going to smash them together and call it DevOps, which at least if nothing else, leads to interesting stories on stages. Whether it actually leads to lasting cultural transformation is another thing entirely. And then someone said, “Well, what about security?” And the answer is, “We have a security department?” And the answer is, “Yeah, you know, those grumpy people that say no all the time whenever we ask if we could do anything.” “Oh, that security department. I ignore them and go around them instead.” And it's, “All right, well, we need help on that so we're going to smash them in, too.” Welcome to DevSecOps, which is basically buzzword-driven cultural development. And here we are. But there is something to be said for you can no longer be the Department of No. I would argue that you couldn't do that successfully previously, but at least now we're a little more aware of it.Clinton: I think you could certainly do that when you were deploying software a couple times a year, right? Because you could build in all of the time to very expensively and time consumingly fix things after the fact, right? We're no longer in that world. I think when you're deploying every few seconds or a few minutes, what you need is tooling that, first of all, runs at that speed, that gives developers insights into what risk are they bringing on board with that application once it will be deployed, but then also give them the context they actually need to fix things, right? I mean, regardless of where those vulnerabilities are found, it still ultimately is a line of code that has to be written by a developer and committed and pushed through a pipeline to make it back into production.And that's true, whether we're talking about application security and proprietary code, we're talking about vulnerabilities in open-source, vulnerabilities in the container, infrastructure as code. I mean, it used to be that a network vulnerability was fixed by somebody going into the data center, unplugging a Cat 5 cable and plugging it in somewhere else, right? I mean, that was the definition of network security. It was a hardware problem. Now, networking is software-defined. I mean [laugh]—Corey: Oh, the firewall I trust is basically a wire cutter. Yeah, cut through the entire cable, and that is the only secure firewall. And it's like, oh, no, no, there are side-channel attacks. It's not completely going to solve things for you. Yeah.Clinton: You know, without naming names, there are certainly vendors in the security space that still consider mitigation to be shutting down access to a workload, right. Like, let's remediate by taking this off of the internet and allowing it to no longer be accessible.Corey: I don't think it's come from a security standpoint, but that does feel like it's a disturbing proportion of Google's product strategy.Clinton: [laugh]. Absolutely. But you know, I do think maybe we can take the forward-looking step of saying there are ways to fix issues while keeping applications online at the same time. For example, by arming engineers with the security intelligence they need when they're making decisions about what goes into those applications. Because those wire cutters now, that's a line in a YAML file, right?That's a Kubernetes deployment, that's a CloudFormation template, and that is living in code in the same repo with everything else, with all of the other logic. And so it's fundamentally indistinguishable at the point where all security is really now developer security, except the security tooling available doesn't speak to the developer, it doesn't integrate into their workflow, it doesn't enable them to make remediations, it's still slapping them on the wrist. And this is why I think when you talk about—to invoke one of the most overused buzzwords in the security industry—when you talk about shifting left, that's really only half the story. I mean, if you're taking a traditional solution that's designed to slow things down, and shifting that into the developer workflow, you're just slowing them down earlier, right? You're not enabling them with better decision-making capacity so they can say, “Oh, I now understand the risks that I'm bringing on board by not sanitizing a string before I dump it into a SQL, you know, query. But now I understand that better because Snyk is giving me that information at the right time when I don't have to context switch out of it, which is, as I'm writing that line of code to begin with.”Corey: When I look at your website—and I'm really, really hoping that your marketing folks don't turn me into a liar on this one between the time we have recorded this and the time it sees the light of day in a week or so—it's notable because you are a security vendor, but you almost wouldn't know that from your website. And that is a compliment because at no point, start to finish, on the landing page at snyk.io do I see anything that codes to, “Hackers are coming to kill you. Give us money immediately to protect yourself.”You're not slinging FUD. You're talking entirely about how to improve velocity. The closest it gets to even mentioning security stuff is, “Ship on time with peace of mind.” That is as close as it gets to talking about security stuff. There is no fear based on this, and you don't treat people like children and say, “Security is extremely important.” “Thank you, Professor, I really appreciate that helpful tip.”Clinton: Yeah, you know, again, I think we take the very controversial approach that developers are not bad people who want to make applications less secure, right? And I think again, when you go into that 40-year trajectory of that constant tension between the engineering and the security sides of the house, it really involves certain perceptions about what those other people are like: security are bad and want to shut everything down; developers are, you know, wild cowboys who don't care about standardization and are just introducing a bunch of risk, right? Where Snyk comes in is fundamentally saying, “Hey, we can actually all live together in a world where we recognize there's pain on both sides?” And look, Corey, I'm coming to you after essentially waking up every day for 20 years and writing code of some kind or other, and I can tell you, developers are already scared enough, man. It is a fearful and anxiety ridden experience to know that you're not completely in command of what happens to that application once it leaves your IDE, right?You know at some point you're going to get that PDF dumped on you; you're going to have a build block, you're going to have a bug report come in from a very important customer at three o'clock in the morning and you're going to have to do something about it. I think every software engineer in the world carries that fear around with them. They don't have to be told you have the capacity to do bad stuff here and you should be better at it. What they need is somebody to tell them here's how to do things better, right? Here's not necessarily even why a cross-site scripting attack is dangerous—although we can certainly educate you on that as well—but here's what you need to do to remediate it. Here's how other developers have fixed that in applications that look like yours.And if you get that intelligence at the right point, then it becomes truly—to go back to your original question—it becomes about solutions rather than about problems, right? The last thing we ever want to do is adopt that traditional approach of saying, “You did a bad thing. It's your fault. You have to go figure out what to do. And then by the way, you have to do all the refactoring on top of that because we didn't tell you you did the bad thing until three weeks later when that traditional SaaS tool finally finished running.”Corey: Exactly. It's a question of how much can you reduce that feedback loop? If I get pinged 60 seconds after I commit code that there's a problem with it, great. I still have that in my head. Mostly. I hope. But if it's six months later it's, “Who even wrote this?” And I pull up git blame and, “Ah, crap, it was me. What was I possibly thinking back then?” It's about being able to move rapidly and fix things, I guess, as early in the process as possible, the whole shift-left movement. That's important. That's valuable.Clinton: Yeah, the context switching is so expensive, right, because the minute you switch away from that file, you're reading some documentation. You're out of that world. Most of the developer's time is spent getting into and out of different contexts. Once you're in there, I mean, you could rattle off 40 lines of code in a sitting and actually clear a ticket and you feel really good about yourself, right? The next day, when that comes back from QA saying you did something wrong here, that's the painful part of having to get back in.And by the time you've already done that, you've doubled the amount of time you've spent on that feature. So, it's all about integrating the right intelligence in the right context at the right time, and doing so in such a way that we're not throwing around blame, that we're not saying, “You should have known better.” We're saying, “We want to help you do this better because, you know, ultimately, you're going to write another SQL query. That's okay. We hope that maybe this will inspire you to sanitize those strings properly, and we're going to give you some suggestions on how to do that.”Corey: Yeah. Developer time is way more expensive than the infrastructure. That is, I think, a little understood facet of how this works from an engineering perspective because an awful lot of us came up in this industry considering our time to be free. Because we were doing this as a hobby in some cases, it was. When I was in my dorm room back many years ago, as I was basically in the process of being expelled from boarding school, it was very clearly my time was not worth a whole hell of a lot to anyone at that point.Speaking of expensive things, I want to talk for a minute about your pricing. And what I like about this is, let me be clear here. I am a big fan of taking shortcuts wherever I can, and one of the shortcuts I love doing—and I don't know if I've talked about it on this show before—is when I'm talking to a company and I need to figure out do they know what they're doing or are they clowns, I cheat and I go to the pricing page. And there are two big things that I look for, and you have them both.The first is that over on the far left side of the spectrum, it's do you have a free option? And yes, you do. And, “Click here to get started immediately.” Great because it's three in the morning, I need to get something done, I'm under a deadline, I do not have time for a conversation with sales, and as an engineer, I absolutely don't want to deal with that type of sales process because it feels weird to go and ask my boss to go ahead and sign off on something because I feel like my spending authority is capped at $20. Now that I have a little more context, I understand exactly why [laugh] my spending authority was capped at $20 back when I was an engineer.Clinton: Yeah, exactly right. And so it's not only that commitment to ensuring every software engineer in the world can have access to Snyk immediately by making one click because, you know, ultimately, we're committed to that community, right? There's 3 million developers using Snyk currently. That's about 10% of all engineers in the world. We're very proud of that number.We expect that to continue to grow and I think it shows that there is need out there, right? And if we can enable every engineer who's up at 3 a.m. faced with some security prospect to say, you know, it is as simple as getting a free account and getting a vulnerability report, getting the remediation advice, being able to sleep easier. I think we're successful as a company, regardless of what the bottom line is. But when you look at how to scale that into the enterprise, the way security solutions are priced, I mean, it's like throwing a bunch of wet noodles at the wall and seeing what sticks, right?Corey: Yes. And that's the other piece of your pricing that I like is a lot of people are going to be listening to that, what I'm saying right now about, “Oh, well, we have a free tier. Why do you think we're clowns?” It's, “Ah. Because the other end is just as important if not more so, which is there has to be an enterprise tier, and the price for that has got to be, ‘Click here to have a conversation.'” And the reason behind that is if you work in procurement, which is very often who's going to be reaching out on something like this, you are going to need custom contracts; you are going to want a long-term enterprise deal, and if the top tier is X dollars per thing that's already there, it reeks of unsophisticated vendor to a buyer in that position, and it makes the people a big blue chip companies think, “Oh, they don't know how to deal with someone at our scale.” Pricing his messaging, and I think people lose sight of that. You absolutely say the right things on both ends. I look at this, and there's nothing I would change or improve about your pricing page, which to be honest, is really rare.Clinton: I'm not sure all of our sales leaders would agree with you there, but I will pass that feedback along. Well, and the other thing I would add to that is, what everyone who's in a pricing conversation wants is predictability about what is this going to be in the future, right? And so we base our pricing on how many developers are in your organization, right? That's probably a number you know; that's probably a number that you can predict over time. We're not going to say, “How many CPUs are we using, right? What's the footprint of the cloud resources we're deploying to scan your stuff?” These are all things that you have very little control over and there is alchemy there that introduces a financial risk into that situation. And we're all about risk mitigation at scale, right?Corey: You don't pop up halfway through a cycle of, “Oh, you've gone on a hiring spree. Time to go ahead and pay us a bunch more money you didn't plan for or budget for.” I've had vendors pop up a quarter after I signed a deal—repeatedly—and it drives me up a wall because back in my engineering days, it was, great, now I have to spend time on this that I hadn't planned for; I have to go to my boss and ask for more money, never a great conversation, and as a cherry on top, I get to look like I don't know how to manage vendors for crap. It's just everyone is angry about those conversations. And even the salespeople reaching out had the decency to act a little sheepish about having to have that conversation with me.Clinton: The best ones do, at least. Well, and on top of that, you know, maybe that tool has been capped so that now your bills are breaking because you went one over your cap, right? So, I—Corey: Yeah. I love it. When I fail in production. That's my favorite thing. It's like, “All right, we're going to wind up not scanning for security stuff anymore. And if you go five beyond your cap, we're going to start introducing vulnerabilities.” It's, “That's awesome. Just, great plan.” But I'm kidding. I'm kidding. I want to be very clear, I have never heard a whisper of an actual vendor doing that, on purpose anyway.Clinton: Exactly. Right. And you know, look. We want to make it as easy as possible, and that's why, for example, we're on AWS Marketplace. You can use your existing EDP program to, you know, buy Snyk, just as—Corey: At 50% of your spend on Snyk then winds up counting toward your spend commit, which is always an interesting approach that some people are like, “Ooh. So, we can wind up transferring the money that we're spending on a vendor to count toward our commit?” But in many cases, it's how much are you spending on other third-party vendors in this space because you're getting excited about a few tens of thousands in most cases, and you have a $50 million annual [laugh] commit. What are you doing there, buddy? That's like trying to become a millionaire via credit card points. It doesn't usually pan out that way.Clinton: Fair enough. Yeah. And then look, we're very proud of that partnership with Amazon. And look if hey, if they can lock some of our customers into $15 million a year spend contracts, we'll take a few pennies on that, right?Corey: Oh, yeah, as a vendor, you'd be silly not too. It makes sense. But you're doing significantly more than that. As of this week being re:Invent week, you are—well, tell me about it.Clinton: Yeah, Corey, we are thrilled to announce this week that AWS is now integrating with Snyk's vulnerability database within Amazon Inspector. And this is going to bring the best-of-breed security intelligence with a curated vulnerability database, including all of our proprietary research around things like exploit maturity, reachability, vulnerable conditions, social trends on vulnerabilities, all available within Amazon Inspector to any developer utilizing it. We also have an AWS code pipeline integration that makes it easy for anyone utilizing AWS for your CI/CD to get immediate feedback on vulnerabilities in your applications as they move through that pipeline. And remember, we're never just going to say, “We've identified a vulnerability. Now, you need to figure out what to do with it.” We're always going to integrate the remediation advice because our audience at the end of the day is the developer whose job it is to make the fix and who has such a wide variety of responsibility these days, the best we can do is say to them, not just, “We found something wrong,” but, “Here's the solution that we think you should implement to get that secure code back out into production.”Corey: This episode is sponsored by our friends at CloudAcademy. That's right, they have a different lab challenge up for you called, “Code Red: Repair an AWS Environment with a Linux Bastion Host.” What does it do? Well, its going to assess your ability to troubleshoot AWS networking and security issues in a production like environment. Well, kind of, its not quite like production because some exec is not standing over your shoulder, wetting themselves while screaming. But..ya know, you can pretend in fact I'm reasonably certain you can retain someone specifically for that purpose should you so choose. If you are the first prize winner who completes all four challenges with the fastest time, you'll win a thousand bucks. If you haven't started yet you can still complete all four challenges between now and December 3rd to be eligible for the grand prize. There's only a few days left until the whole thing ends, so I would get on it now. Visit cloudacademy.com/corey. That's cloudacademy.com/C-O-R-E-Y, for god's sake don't drop the “E” that drives me nuts, and thank you again to Cloud Academy for not only promoting my ridiculous non sense but for continuing to help teach people how to work in this ridiculous environment.Corey: First, congratulations. It's neat to have a first-party integration like that with an AWS service, as opposed to, you know, their somewhat storied approach of, “Hey, it's an open-source project. We're just going to implement something that's API compatible ourselves, and irritate people.” Now, to be clear, my problem is not that you should expect to build anything and not face competition. My concern is a little bit more along the lines of, “Huh. Why is that same company always the first in line to compete with something.” Which is neither here nor there.Security is also one of those areas where I think competition is important. You want it continual background level of investment in the space because this stuff is super important. What I like about Snyk and a number of companies in this space is I know exactly where you stand. Let's contrast that for a second with AWS. You're integrating with Inspector, which is a great service, but you're not, I don't believe, integrating with their other security services such as [big breath in] Amazon Detective, the Audit Manager—if you want to consider that one of them—Amazon Macie, AWS Firewall Manager, AWS Shield, the Network Firewall, IoT Device Defender, CloudTrail, Config.Amazon Inspector is in one you're there, but not really Security Hub, or GuardDuty, or IAM itself. And I look at all of these services—I mean, IAM is free, of course, but the rest are very much not—and I do some basic arithmetic and I'm starting to realize that if I can figure all the various AWS security services together and what that's going to cost me, it turns out the answer is more than the data breach. So, on some level, it's one of those—at what point is it so confusing and it starts to look like a cross-sell deal between all of the different services, and turn them all on because you could ever have too much security, we still have to ship things eventually. And their security messaging has been extraordinarily confused for a long time. At some level, the fact that you are now integrating with them on the Inspector side means that for the first time, I think I understand what Inspector does now, which is more than a little messed up. But here we are.Clinton: Indeed. Well, the first thing I would say on that is, you know, stay tuned. As we move into the new year. I think you're going to see a lot more announcements both, you know, on the AWS side, but also kind of industry-wide and terms of integration with Snyk. That Vulnerability Database feed also, as you mentioned earlier, in use in Docker Hub, so anyone with Containers and Docker Hub can get advantage by scanning with our Snyk container tool.We have other integrations with Red Hat, for example. And there are actually many other companies utilizing that DB feed to, again, get access to that best in breed vulnerability data. When you talk about that model of, you know, being outcompeted on the security front, I think that's more difficult to do when you're actually talking about data, right? Like tooling, on some level—and I might get in trouble for saying this—but tooling is commodity, right? Somebody tomorrow is going to come out with a better tool to do a thing a little bit faster in a little bit more intuitive way. What can't be easily replicated is the data and intelligence behind that, right? And so that's why—Corey: Yeah, the secret sauce that makes you folks work is not the fact of, “Ah, we can fire off or catch a web hook, and then run the following command against the codebase.” That is—sure it's handy and it's useful and you're good at that, but that is not the reason that people become your customer.Clinton: Exactly right. Look, there's a lot of tools that can resolve the dependency tree within your open-source application, right? We can do that as well. We leverage a lot of open-source to do that, you know, we're very open with that. As I mentioned earlier, a lot of Snyk tooling is available on GitHub, you can see how it works, that code is public.Really the value we're providing is in that curated security research that our dedicated team is working on day in and day out and verifying public security data that's out in CVEs. Is this actually accurate? Do we agree with the severity rating? Might there be other factors that could modify that severity rating? What happens when you are scanning an application that might have some vulnerable conditions versus others? Don't you want to prioritize those vulnerabilities differently? What happens at runtime, right? If you're deploying an application to an EC2 instance with an OpenSSH ingress into your security group, that's going to make certain vulnerabilities a lot bigger risk than if you've got your IAC configured correctly, right? So, the really the overall mission of Snyk as we move into this broader, kind of, ASPM application, you know, security posture management space, is to say, how many different signals across the SDLC can we combine in intuitive ways for the developer to understand that risk at the right time with the right context and armed with the remediation advice to make a better decision as they're writing their code, you know, rather than after the fact? If I could sum it all up, kind of, that's the vision of where we are both today and ultimately where we're going.Corey: There also needs to be an understanding of who the customer is. If I go through the launch wizard and spin up in a brand new account, my first EC2 instance, and I spin up an instance by going through the wizard, the first thing it does is yell at me. Because, “Ah, that SSH port is open to the world.” Which you need to get into it, once it's there. So, it sets that up for me and yells at me all in the same breath. And it's, this is not a promising start; I kind of need that to get into it.Conversely, if you're not someone learning this stuff for the first time, and you're, oh I don't know, a production engineer at a bank, you care quite a bit differently in that use case about things like OpenSSH groups, it's security posture, et cetera, et cetera. An awful lot of the tooling is, “Ah, you're failing this benchmark, and this benchmark, and this benchmark,” from CIS and the rest of all these rules of, oh, you're not encrypting your data at rest. Well, it's in an AWS data center environment. Yeah, if someone could break in and steal the drives from multiple facilities and somehow recombine them together and get out alive, yeah, that's really not my threat model.But it's easy to turn it on and check a box and make an auditor go away. But that's not where I would spend the bulk of my energies if I'm trying to improve my security posture. And it turns into rote checklists super easily. The thing I've always appreciated about the stuff that you're tooling in the open-source world has highlighted is it's not nonsense. And I really can't understate just how valuable that is.Clinton: Absolutely. And that comes from a combination of signals across that SDLC, from the open-source, from the container, from the proprietary code, from the IAC, but then also what's happening at runtime, right? Like, how are those containers actually deployed onto EKS? What ports are open? What running binaries are on the container that might influence, you know, what packages you choose to upgrade, versus not?All of that matters, and what—you know, the issue I think now is getting that visibility to the developer at the right time so that they can make it actionable. And the thing about infrastructure as code, that I think that's really interesting and not super well understood is a lot of those defaults are really insecure. And developers have no idea, right? Like, they might not be aware that if you don't define that encryption for your S3 bucket, it'll happily deploy unencrypted, right? Yes, that's a compliance problem, but that's also potentially exacerbator have other vulnerabilities that might be in that application.But you only see those when you can combine and have a single pane of glass that gives you the runtime signaling plus everything that's happening in the application, armed with the correct information to actually remediate that at the time, and say, “Don't you think you wanted to add, you know, AES encryption to this bucket? Don't you think you wanted to close down port 22?” And also, combine that with your internal business logic, right? Like maybe for an internal only application that never transits beyond your VPC perimeter, sure, it's fine to have port 22 open, right? There's just going to be people within your zero-trust environment authenticating to it. But for your production web application, that might be a different story.Corey: There are other concerns, too. For example, I'm sitting here complaining about the idea of encrypting at rest in an AWS environment, but if you've signed customer contracts that state that you're doing it, you'd better freaking do it, as opposed to, “Well, I know what the actual security risk is and it's no big deal.” Yeah, don't make that decision. If you are contractually obligated to do a thing. Don't YOLO it; do what you say you're going to do. That's that whole integrity thing.Clinton: Oh, sure. And look in a battle between security and compliance. Compliance always wins, right? But from a developer perspective, I don't know that we on the front lines writing code actually differentiate, right? That certainly is a matter for the people defining the policies and, you know, creating their gating mechanisms in CI to figure out.What I want to know as a developer is, is my build going to succeed, right? Or am I going to get shut down and get the nastygram that says, you know, “We couldn't launch this for x, y, and z reason.” Now, everybody on my team hates me, my lead dev is on me, now there's a bunch of merge conflicts because my branch is behind. I want to get that out into production, but in order to do that, I need information on how are all these signals going to be compiled together in a way that, you know, creates that red light or green light on the risk dashboard later on. But up until I think, you know, relatively recently, I don't have visibility into that except to launch the commit, you know, start the build and see what happens, and then I have that context-switching problem, right, because it's hours or days later, that I finally get that signal back.So yes, I think we have a compliance story to tell from the Snyk perspective as well. A lot of those same issues, you know, we're detecting, especially with regard to infrastructure as code, but it ultimately is up to various parts of the organization to work together and say, “What balance do we want to strike between security and velocity,” right? Understanding that those are not mutually opposed. What we need is tooling and more importantly a culture that takes both into account and allows us to develop securely and fast at the same time.Corey: I want to thank you so much for taking the time to speak with me about all this. If people want to learn more, where can they find you? And for God's sake, please don't say in your booth at re:Invent.Clinton: [laugh]. I will not be at re:Invent this year. I've had a little bit too much of the Vegas Strip here recently.Corey: No, I hear you. Right now, the people going are those whose employers find them expendable, which is why I'm there.Clinton: I wouldn't say that Corey. I think you'll do great, and you know, just make sure to bank all your vacation for a couple weeks after. Look, come to snyk.io start a conversation, but more importantly, just start using it, right?I don't want to give you the sales pitch; I want you to see the value in the tooling, and the easiest way to do that as an engineer is just to start using it. And if there is value there, you want to bring it to your enterprise. I would love to have that conversation and move forward. But engineer to engineer, like, figure out if this is going to work for you: does it make your life easier? Does it reduce the pain and anxiety you feel before making that commit into the production branch? And if so, then yeah, we'd love to talk.Corey: I will, of course, put links to that in the [show notes 00:33:22]. Thank you so much for speaking to me today. I really appreciate it.Clinton: Thank you, Corey. Glad to do it.Corey: Clinton Herget, principal solutions engineer at Snyk. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment yelling at Snyk about how they're a terrible company because they continually refuse to patronize your side business down at the Vowel Emporium.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:04:00] AWS Proton Adds Terraform for infrastructure provisioninghttps://aws.amazon.com/about-aws/whats-new/2021/11/aws-proton-terraform-infrastructure/[00:05:55] AWS Proton introduces Git management of infrastructure as code templateshttps://aws.amazon.com/about-aws/whats-new/2021/11/aws-proton-git-infrastructure-code-templates/[00:10:43] Amazon Linux 2022https://aws.amazon.com/linux/amazon-linux-2022/?amazon-linux-whats-new.sort-by=item.additionalFields.postDateTime&amazon-linux-whats-new.sort-order=desc[00:12:11] Announcing Pull Through Cache Repositories for ECR and terraform provider support cominghttps://aws.amazon.com/blogs/aws/announcing-pull-through-cache-repositories-for-amazon-elastic-container-registry/https://github.com/hashicorp/terraform-provider-aws/issues/21951[00:17:10] AWS EMR Serverless in previewhttps://aws.amazon.com/about-aws/whats-new/2021/11/amazon-emr-serverless-preview/[00:19:06] AWS Control Tower introduces Terraform account provisioning and customization (with weird modules)https://aws.amazon.com/about-aws/whats-new/2021/11/aws-control-tower-terraform/https://github.com/aws-ia/terraform-aws-control_tower_account_factory[00:23:58] AWS Karpenter v0.5 Now Generally Availablehttps://aws.amazon.com/about-aws/whats-new/2021/11/aws-karpenter-v0-5/[00:28:45] AWS WAF adds support for Captcha (e.g. like Cloudflare)https://aws.amazon.com/about-aws/whats-new/2021/11/aws-waf-captcha-support/[00:33:45] Has anyone migrated an existing organisation into control tower? How did it go? @Alex Jurkiewicz [00:34:45] I wanna open a discussion regarding tagging/labeling conventions that are used company wide. And what tags do you guys use ? @Sherif Abdel-Naby[00:48:06] I have some nested providers that I'm moving to the root module. My approach is to replace the nested providers in the state file, with the root-level providers, which seems to be working. Any advice, suggestions? @Eric Berg[00:52:17] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
In this episode, we cover:00:00:00 - Introduction 00:05:00 - Itiel's Background in Engineering00:08:25 - Improving Kubernetes Troubleshooting00:11:45 - Improving Team Collaboration 00:14:00 - OutroLinks: Komodor: https://komodor.com/ Twitter: https://twitter.com/Komodor_com TranscriptJason: Welcome back to another episode of Build Things On Purpose, a part of the Break Things On Purpose podcast where we talk with people who have built really cool software or systems. Today with us, we have Itiel Shwartz who is the CTO of a company called Komodor. Welcome to the show.Itiel: Thanks, happy to be here.Jason: If I go to Komodor's website it really talks about debugging Kubernetes, and as many of our listeners know Kubernetes and complex systems are a difficult thing. Talk to me a little bit more—tell me what Komodor is. What does it do for us?Itiel: Sure. So, I don't think I need to tell our listeners—your listeners that Kubernetes looks cool, it's very easy to get started, but once you're into it and you have a big company with complex, like, micros—it doesn't have to be big, even, like, medium-size complex system company where you're starting to hit a couple of walls or, like, issues when trying to troubleshoot Kubernetes.And that usually is due to the nature of Kubernetes which makes making complex systems very easy. Meaning you can deploy in multiple microservices, multiple dependencies, and everything looks like a very simple YAML file. But in the end of the day, when you have an issue, when one of the pods is starting to restart and you try to figure out, like, why the hell is my application is not running as it should have, you need to use a lot of different tools, methodologies, knowledge that most people don't really have in order to solve the issue. So, Komodor focus on making the troubleshooting in Kubernetes an easy and maybe—may I dare say even fun experience by harnessing our knowledge in Kubernetes and align our users to get that digest view of the world.And so usually when you speak about troubleshooting, the first thing that come to mind is issues are caused due to changes. And the change might be deploying Kubernetes, it can be a [configurment 00:02:50] that changed, a secret that changed, or even some feature flag, or, like, LaunchDarkly feature that was just turned on and off. So, what Komodor does is we track and we collect all of the changes that happen across your entire system, and we put, like, for each one of your services a [unintelligible 00:03:06] that includes how did the service change over time and how did it behave? I mean, was it healthy? Was it unhealthy? Why wasn't it healthy?So, by collecting the data from all across your system, plus we are sit on top of Kubernetes so we know the state of each one of the pods running in your application, we give our users the ability to understand how did the system behave, and once they have an issue we allow them to understand what changes might have caused this. So, instead of bringing down dozens of different tools, trying to build your own mental picture of how the world looks like, you just go into Komodor and see everything in one place.I would say that even more than that, once you have an issue, we try to give our best efforts on helping to understand why did it happen. We know Kubernetes, we saw a lot of issues in Kubernetes. We don't try complex AI solution or something like that, but using our very deep knowledge of Kubernetes, we give our users, FYI, your pods that are unhealthy, but the node that they are running on just got restarted or is having this pressure.So, maybe they could look at the node. Like, don't drill down into the pods logs, but instead, go look at the nodes. You just upgraded your Kubernetes version or things like that. So, basically we give you everything you need in order to troubleshoot an issue in Kubernetes, and we give it to you in a very nice and informative way. So, our user just spend less time troubleshooting and more time developing features.Jason: That sounds really extremely useful, at least from my experience, in operating things on Kubernetes. I'm guessing that this all stemmed from your own experience. You're not typically a business guy, you're an engineer. And so it sounds like you were maybe scratching your own itch. Tell us a little bit more about your history and experience with this?Itiel: I started computer science, I started working for eBay and I was there in the infrastructure team. From there I joined two Israeli startup and—I learned that the thing that I really liked or do quite well is to troubleshoot issues. I was in a very, very, like, production-downtime-sensitive systems. A system when the system is down, it just cost the business a lot of money.So, in these kinds of systems, you try to respond really fast through the incidents, and you spend a lot of time monitoring the system so once an issue occur you can fix it as soon as possible. So, I developed a lot of internal tools. For the companies I worked for that did something very similar, allow you once you have an issue to understand the root cause, or at least to get a better understanding of how the world looks like in those companies.And we started Komodor because I also try to give advice to people. I really like Kubernetes. I liked it, like, a couple of years ago before it was that cool, and people just consult with me. And I saw the lack of knowledge and the lack of skills that most people that are running Kubernetes have, and I saw, like—I'd have to say it's like giving, like, a baby a gun.So, giving an operation person that doesn't really understand Kubernetes tell him, “Yeah, you can deploy everything and everything is a very simple YAML. You want a load balancer, it's easy. You want, like, a persistent storage, it's easy. Just install like—Helm install Postgres or something like that.” I installed quite a lot of, like, Helm-like recipes, GA, highly available. But things are not really highly available most of the time.So, it's definitely scratching my own itch. And my partner, Ben, is also a technical guy. He was in Google where they have a lot of Kubernetes experience. So, together both of us felt the pain. We saw that as more and more companies moved to Kubernetes, the pain became just stronger. And as the shift-left movement is also like taking off and we see more and more dev people that are not necessarily that technical that are expected to solve issues, then again we saw an issue.So, what we see is companies moving to Kubernetes and they don't have the skills or knowledge to troubleshoot Kubernetes. And then they tell their developers, “You are now responsible for the production. You are deploying? You should troubleshoot,” and the developers really don't know what to do. And we came to those companies and basically it makes everything a lot easier.You have any issue in Kubernetes? No issue, like, no issue. And no problem go to Komodor and understand what is the probable root cause. See what's the status? Like, when did it change? When was it last restarted? When was it unhealthy before today? Maybe, like, an hour ago, maybe a month ago. So, Komodor just gives you all of this information in a very informative way.Jason: I like the idea of pulling everything into one place, but I think that obviously begs the question: if we're pulling in this information we need to have good information to begin with. I'm interested in your thoughts of if someone were to use Komodor or just want to improve their visibility into troubleshooting Kubernetes, what are some tips or advice that you'd have for them in maybe how to set up their monitoring, or how to tag their changes, things like that? What does that look like?Itiel: I will say the first thing is using more metadata and tagging capabilities across the board. It can be on top of the monitors, the system, the services, like, you name it, you should do it. Once an alert is triggered, you don't necessarily have to go to the perfect playbook because it doesn't really exist. You should understand what's the relevant impact, what system it impacted, and who is the owner, and who should you wake up, like, now or who should look at it?So, spending the time tagging some of the alerts and resources in Kubernetes is super valuable. It's not that hard, but by doing so you just reduced the mental capacity needed in order to troubleshoot an issue. More than that, here in Komodor we read of this metadata label stacks, and we harness it for our own benefits. So, it is best practice to do so and Komodor also utilize this data.And for example, for an alert, say like, the relevant team name that is responsible, and for each service in Kubernetes write the team that owns this service. And this way you can basically understand what teams are responsible for what services or issues. So, this is the number one tip or trick. And the second one is just spend time on exposing these data. You can use Komodor I think, like, it's the best solution, but even if not, try to have those notification every time something change.Write those, like, web hooks to which one of your resources and let the team know that things change. If not, like, what we see in companies is something break, no one really know what changed, and in the end of the day they are forced to go into Slack and doing, like, here—someone changed something that might cause production break. And if so, please fix it. It's not a good place to be. If you see yourself asking questions over Slack, you have an issue with the system monitoring and observability.Jason: That's a great point because I feel like a lot of times we do that. And so you look back into your CI/CD logs, like, what pushes are made, what deploys are made. You're trying to parse out, like, which one was it? Especially in a high-velocity organization of multiple changes and which one actually did that breaking.Itiel: We see it across the board. There are so many changes, so many dependencies. Because microservice A talks with microservice B that speak with microservice C using SQS or something like that. And then things break and no one know what is really happening. Especially the developers, they have no idea what is happening. But most of the time also the DevOps themselves.Jason: I think that's a great point of, sort of, that shared confusion. As we've talked about DevOps and that breaking down of the walls between developers and operations, there was always this, “Well, you should work together,” and there is this notion now of we're working together but nobody knows what's going on.As we talk about this world of sharing, what are some of your advice as somebody who's helped both developers and operations? Aside from getting that shared visibility for troubleshooting, do you have any tips for collaborating better to understand as a team how things are functioning?Itiel: I have a couple of thoughts on this area. The first thing is you must have the alignment. Both the DevOps, or operation and the developers need to understand they are in this together. And this, like, base point in other organization you see they struggle. Like, the developers are like, yeah, I don't really need—like, it's the ops problem if production is down, and the ops are, like, angry at the devs and say they don't understand anything so they shouldn't be responsible for issues in production.So, first of all, let's create the alignment. The organization needs to understand that both the dev and the ops team need to take shared responsibility over the system and over the troubleshooting process. Once this very key pillar is out of the way, I will say that adding more and more tools and making sure that those tools can be shared between the ops and the dev team.Because a lot of the times we see tools that are designed for the DevOps, and a developer don't really understand what is happening here, what are those numbers, and basically how to use them. So, I think making sure the tools fit both personas is a very crucial thing. And the last thing is learning from past incidents. You are going to have other incidents, other issues. The question is, do you understand how we improve the next time this incident or a similar incident will happen? What processes and what tools are missing in the link between the DevOps and the system to optimize it. Because it's not after you snap your finger and everything works as expected.It is an iterative process and you must have, like, the state of mind of, okay, things are going to get better, or they are going to get better, and so on. So, I think this is the third, like, three most important things. One make sure you have that alignment, two, create tools that can be shared across different teams, and three, learn from past incidents and understand this is like a marathon. It's not a sprint.Jason: Those are excellent tips. So, for our listeners, if you would like a tool that can be shared between devs and DevOps or ops teams, and you're interested in Komodor—Itiel, tell us where folks can find more info about Komodor and learn more about how to troubleshoot Kubernetes.Itiel: So, you can find us on Twitter, but basically on komodor.com. Yeah, you can sign up for a free trial. The installation is, like, 10 seconds or something like that. It's basically Helm install, and it really works. We just finished, like, a very big round, so we are growing really fast and we have more and more customers. So, we'll be happy to hear your use case and to see how we can accommodate your needs.Jason: Awesome. Well, thanks for being on the show. It's been a pleasure to have you.Itiel: Thank you. Thank you. It was super fun being here.Jason: For links to all the information mentioned, visit our website at gremlin.com/podcast. If you liked this episode, subscribe to the Break Things on Purpose podcast on Spotify, Apple Podcasts, or your favorite podcast platform. Our theme song is called “Battle of Pogs” by Komiku, and it's available on loyaltyfreakmusic.com.
Unedited live recording on YouTube (Ep 146) Datree Kubeconform pre-commit https://github.com/yannh/kubeconform https://pre-commit.com/ Eyar's article about K8s schema validation Open an issue for questions on k8s schema kubectl --dry-run=client bug Datree's CLI tool to ensure K8s manifests and Helm charts follow best practices Check CRDs and schema with Datree ★ Support this podcast on Patreon ★
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:01:37] Netlify Drophttps://app.netlify.com/drop[00:05:58] How should I run containers on AWS (flowchart)?https://www.vladionescu.me/posts/flowchart-how-should-i-run-containers-on-aws-2021/[00:15:22] Kubevious: The time-saving Kubernetes GUIhttps://github.com/kubevious/kubevious[00:25:35] Does anyone have a clean way to generate outputs/variable files?[00:31:18] Does anyone have a nice way to handle schema creation with Terraform on RDS MySQL?[00:34:46] How do you bootstrap IAM/service/machine roles for CICD and allow the repository to self manage? [00:44:29] Any alternatives to Docker for Desktop? [01:12:23] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
There's no better feeling than when you launch something that just clicks with people. I guess at the end of the day, folks that build businesses or create content are simply seeking acceptance. We want to see our idea flourish, to be adopted by the masses, and to leave an impact. When Maciej Palmowski launched WP Owls with his wife Agnieszka, it was (and still is) a publication that served the Polish community. But it clicked. People clicked, literally on to the website and their stories, so the co-founding duo decided it was time to go global. Combined they've published over 200 articles about WordPress and the community on the blog, with no signs of stopping. Oh, and if you're wondering how to get a job like WordPress ambassador at Buddy, you'll learn a thing or two about CI/CD today!
Unedited live recording on YouTube Ep 106Corrections=========== "Windows RT" isn't a thing anymore, and the Windows 10 on Arm (WoA) is getting better, with x64 emulation (MS version of Rosetta 2) in pre-beta AArch64 or ARM64 is the 64-bit extension of the ARM architecture Microsoft is indeed designing its own Arm chips for future Surface and servers Windows 10 Arm works in Parallels M1 beta on macOS AWS Arm instances are now on Gen 2, as of May 2020, with up to 40% better performance-per-dollar than old Gen 1 Arm, AMD, and Intel Topics=========== MX1 and future of Apple Silicon (Arm) on YouTube QEMU: emulate one CPU architecture on another Multi-arch support in Docker Desktop in docker docs Build multi-arch images with buildx in docker docs Manifest commands for multi-arch images in docker docs Hardcode platform in Compose files in docker docs Alex Ellis blog, lots of Arm + Docker info My shell setup Setup QEMU on Linux servers on stereolabs.com on GitHub ★ Support this podcast on Patreon ★
About Micheal BenedictMicheal Benedict leads Engineering Productivity at Pinterest. He and his team focus on developer experience, building tools and platforms for over a thousand engineers to effectively code, build, deploy and operate workloads on the cloud. Mr. Benedict has also built Infrastructure and Cloud Governance programs at Pinterest and previously, at Twitter -- focussed on managing cloud vendor relationships, infrastructure budget management, cloud migration, capacity forecasting and planning and cloud cost attribution (chargeback). Links: Pinterest: https://www.pinterest.com Twitter: https://twitter.com/micheal LinkedIn: https://www.linkedin.com/in/michealb/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: You know how git works right?Announcer: Sorta, kinda, not really Please ask someone else!Corey: Thats all of us. Git is how we build things, and Netlify is one of the best way I've found to build those things quickly for the web. Netlify's git based workflows mean you don't have to play slap and tickle with integrating arcane non-sense and web hooks, which are themselves about as well understood as git. Give them a try and see what folks ranging from my fake Twitter for pets startup, to global fortune 2000 companies are raving about. If you end up talking to them, because you don't have to, they get why self service is important—but if you do, be sure to tell them that I sent you and watch all of the blood drain from their faces instantly. You can find them in the AWS marketplace or at www.netlify.com. N-E-T-L-I-F-Y.comCorey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim its better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less that sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive a $100 in credit. Thats v-u-l-t-r.com slash screaming.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Sometimes when I have conversations with guests here, we run long. Really long. And then we wind up deciding it was such a good conversation, and there's still so much more to say that we schedule a follow-up, and that's what happened today. Please welcome back Micheal Benedict, who is, as of the last time we spoke and presumably still now, the head of engineering productivity at Pinterest. Micheal, how are you?Micheal: I'm doing great, and thanks for that introduction, Corey. Thankfully, yes, I am still the head of engineering productivity; I'm really glad to speak more about it today.Corey: The last time that we spoke, we went up one side and down the other of large-scale environments running on AWS and billing aspects thereof, et cetera, et cetera. I want to stay away from that this time and instead focus on the rest of engineering productivity, which is always an interesting and possibly loaded term. So, what is productivity engineering? It sounds almost like it's an internal dev tools team, or is it something more?Micheal: Well, thanks for asking because I get this question asked a lot of times. So, for one, our primary job is to enable every developer, at least at our company, to do their best work. And we want to do this by providing them a fast, safe, and a reliable path to take any idea into production without ever worrying about the infrastructure. As you clearly know, learning anything about how AWS works—or any public cloud provider works—is a ton of investment, and we do want our product engineers, our mobile engineers, and all the other folks to be focused on delivering amazing experiences to our Pinners. So, we could be doing some of the hard work in providing those abstractions for them in such way, and taking away the pain of managing infrastructure.Corey: The challenge, of course, that I've seen is that a lot of companies take the approach of, “Ah. We're going to make AWS available to all of our engineers in it's raw, unfiltered form.” And that lasts until the first bill shows up. And then it's, “Okay. We're going to start building some guardrails around that.” Which makes a lot of sense. There then tends to be a move towards internal platforms that effectively wrap cloud services.And for a while now, I've been generally down on the concept and publicly so in the general sense. That said, what I say that applies as a best practice or something that most people should consider does tend to fall apart when we talk about specific use cases. You folks are an extremely large environment; how do you view it? First off, do you do internal platforms like that? And secondly, would you recommend that other companies do the same thing?Micheal: I think that's such a great question because every company evolves with its own pace of development. And I wouldn't say Pinterest by itself had a developer productivity or an engineering productivity organization from the get-go. I think this happens when you start realizing that your core engineers who are working on product are now spending a certain fraction of time—which starts ballooning pretty fast—in managing the underlying systems and the infrastructure. And at that point in time, it's probably a good question to ask, how can I reduce the friction in those people's lives such that they could be focused more on the product. And, kind of, centralize or provide some sort of common abstractions through a central team which can take away all that pain.So, that is generally a good guiding principle to think about when your engineers are spending at least 30% of their time on operating the systems rather than building capabilities, that's probably a good time to revisit and see whether a central team would make sense to take away some of that. And just simple examples, right? This includes upgrading OS on your EC2 machines, or just trying to make sure you're patching all the right versions on your next big Kubernetes cluster you're running for serving x number of users. The moment you start seeing that, you want to start thinking about, if there is a central team who could take away that pain, what are the things they could be investing on to help up-level every other engineer within your organization. And I think that's one of the best ways to be thinking about it.And it was also a guiding principle for us within Pinterest to view what investments we could make in these central teams which can up-level each and every different type of engineer in the company as well. And just an example on that could be your mobile engineer would have very different expectations from your backend engineer who was working on certain aspects of code in your product. And it is truly important to understand where you want to centralize capabilities, which both these types of engineers could use, or you want to divest and have unique capabilities where it's going to make them productive. There's no one-size-fits-all solution for this, but I'm happy to talk about what we have at Pinterest, which has been reasonably working well. But I do think there's a lot more improvements we could be doing.Corey: Yeah, but let's also be clear that, as you've mentioned, you are heavily biased towards EC2 instances for a lot of what you do. If we look at the AWS console and we see hundreds of different services now, and it's easy to sit here and say, “Oh, internal platforms are terrible because all of those services are going to be enhanced in various ways and you're never going to be able to keep up with feature parity.” Yeah, but if you can wrap something like EC2 in an internal platform wrapper, that begins to be a different story because sure, someone's going to go and try something new with a different AWS service, they're going to need direct access. But the EC2 product across the board generally does not evolve in leaps and bounds with transformative changes overnight. Let's also not forget that at a company with the scale that Pinterest operates at, “Hey, AWS just dusted off a new feature and docs are still rolling out, and it's not in CloudFormation yet, but we're going to roll it out to production,” probably seems like the wrong direction to go in, I would assume.Micheal: And yes, I think that brings one of the key guardrails, I think, which these groups provide. So, when we start thinking about what teams, centralized teams like engineering productivity, developer tools, developer platforms actually do is they help with a couple of things. The top three are: they can help pave a path for the most common use cases. Like to your point, provisioning EC2 does take a set of steps, all the time. If you're going to have a thousand people doing that every time they're building a new service or trying to expand capacity playing with their launch templates, those are things you can start streamlining and making it simple by some wrapper because you want to address those 80% use cases which are usually common, and you can have a wrapper or could just automate that. And that's one of the key things: can you provide a paved path for those use cases?The second thing is, can you do that by having the right guardrails in place? How often have you heard the story that, “I just clicked a button and that now spun up, like, a thousand-plus instances.” And now you have to juggle between trying to stop them or do something about it.Corey: Back in 2013, you folks were still focusing on this fair bit. I remember because Jeremy Carroll, who I believe was your first SRE there once upon a time, wound up doing a whole series of talks around how Pinterest approached doing an AMI Factory. And back in those days, the challenges were, “Okay. We have the baseline AMI, and that's great, but we also want to do deployments of things and we don't really want to do a new deploy of an entire fleet of EC2 instances for a single line of config change, so how do we wind up weighing off of when you bake a new AMI versus when you just change something that has—in what is deployed to them?” And it was really a complicated problem back then.I'm not convinced it's not still a complicated problem, but the answers are a lot more cohesive. And making sure that every team—when you're talking about a company as large as Pinterest with that many teams—is doing things in the same way, seems like it's critically important otherwise you wind up with a whole bunch of unique-looking instances that each have to be managed by hand as opposed to something that can be reasoned around collectively.Micheal: Yep. And that last part you mentioned is extremely crucial as well because like I said, our audience or our customers are just not the engineers; we do work with our product managers and business partners as well because at times, we have to tie or change our architecture based on certain cost optimizations which would make sense, like you just articulated. We don't want to have all the instance types. It does not add much value to a developer unless they're explicitly seeking a high-memory instance or a [GP-based instance in a 00:10:25] certain way. So, we can then work with our business partners to make sure that we're committing to only a certain type of instances, and how we can abstract our tools to only give you that. For example, our deployment system, Teletraan which is an open-source system, actually condenses down all these instance types to a couple of categories like high-compute, high-memory—and you've probably seen that in many of the new cloud providers as well—so people don't have to learn or know the underlying instance type.When we moved from c3 to c5, it was just called as a high-compute system, so the next time someone provisioned a new service or deployed it using our system, they would just select high-compute as the de facto instance type and we would just automatically provision a C5 for them. So, that just reduces the extra complexity or the cognitive overhead individuals would have to go through in learning each instance type, what is the base AMI that comes on it, what are the different configurations that need to go in terms of setting up your AZ-scaling properties. We give them a good reasonable set of defaults to get started with, and then they can then work on optimizing or making changes to it.Corey: Ignoring entirely your mispronunciation of AMI, which is, of course, three syllables—and that is a petty hill upon which I will die—it occurs to me the more I work with AWS in various ways, the easier it gets. And I used to think in some respects, it was because the platform was so—it was improving so dramatically around me. But no, in many cases, it's because the first time you write some CloudFormation by hand, it's a nightmare and you keep smacking into weird issues. But the second or third time, it's super easy because you just copy the thing you've already built and change the relevant bits around. And that was the learning curve that I went through playing around with a lot of these things.When you start looking at this from a large-scale environment where it's not just about upskilling the people that you have to understand how these things integrate in AWS land, but also the consistent onboarding of engineers at a fairly progressive clip is, great, you effectively have to start doing trainings on all these things, and there's a lot of knobs and dials that can blow up and hurt people. At some point, building the guardrails or building the environment in which you are getting all the stuff abstracted away from where the application engineers have to think about this at all, it eventually reaches a tipping point where it starts to feel like it's no longer optional if you want to continue growing as a company because you don't have the luxury of spending six months of onboarding before you let someone touch the thing they were hired to build.Micheal: And you will see that many companies very often have very similar programming practices like you just described. Even I learned that the same way: you have a base template, you just copy-paste it and start from there on. And no one goes through the bootstrapping process manually anymore; you want to—I think we call it cargo-culting, but in general, just get something to bootstrap and start from there. But one of the things we learned in sort of the hard way is that can also lead to, kind of, you pushing, you know, not great practices because people don't know what is a blessed version of a good template or what actually would make sense. So, some of those things, we have been working on.And this is where centralized teams like engineering productivity are really helpful is we provide you with the blessed or the canonical way to do certain things. Case in point example is a CI/CD pipeline or delivery of software services. We have invested enough in experimenting on what works with some of the more nuanced use cases at Pinterest, in helping generate, sort of, a canonical version which would cover 80% of the use cases. Someone could just go and try to build a service and they could just use the same canonical pipeline without learning much or making changes to it. This also reduces that cargo-culting nature which I called, rather than copying it from unknown sources and trying to like—again, it may cause havoc to our systems, so we can avoid a lot of that because of these practices.Corey: So, let's step a little bit beyond AWS—I know I hate doing it, too—but I'm going to assume that your remit is broader than, oh, AWS whisperer-slash-Wrangler. So, tell me a little bit more about what it is that your day-to-day looks like if there is anything that could be said not to focus purely around AWS whispering.Micheal: So, one of the challenges—and I want to talk about this a bit more—is our environments have become extremely complex over time. And it's the nature of, like, rising entropy. Like, we've just noticed that there's two things: we have a diverse set of customer base, and these include everyone trying to do different workloads or work service types. What that essentially translates into is that we realized that our solution may not fit all of them. For example, what works for a machine-learning engineer in terms of iterating on building a model and delivering a model is not the same as someone working on a long-running service and trying to deploy that. The same would apply for someone trying to operate a Kafka system.And that has made, I think, definitely our job a bit challenging in trying to assess where do you actually draw the line on the abstraction? What is the right layer of abstraction across your local development experience, across when you move over to staging your code in a PR model and getting feedback and subsequently actually releasing it to production? Because this changes dramatically based on what is the workload type you're working on. And we feel like that has been one of the biggest challenges where I know I spent my day-to-day and my team does too, in trying to help provide some of the right solutions for these individuals. There's—very often we'll also get asked from individuals trying to do a very nuanced thing.Of late, we have been talking about thinking about how you operate functions, like provide Functions as a Service within the company? It just put us in a difficult spot at times because we have to ask the hard question, “Is this required?” I know the industry is doing it; it's definitely there. I personally believe, yes, it could be a future, but is that absolutely important? Is that going to benefit Pinterest in any formal way if we invest on some core abstractions?And those are difficult conversations to have because we have exciting engineers coming in trying to do amazing things; it puts us in a hard spot, as well, as to sometimes saying graciously, no. I know many companies deal with it when they have these centralized teams, but I think it's part of that job. Like when you say it's day-to-day, I would say I'm probably saying no a couple of times in that day.Corey: Let's pretend for the sake of argument that I am, tomorrow morning, starting another company—Twitter for Pets—and over the next ten years, it grows to be larger than Pinterest in terms of infrastructure, probably not revenue because it turns out pets are not the lucrative source of ad revenue that I was hoping it would be but, you know, directionally the same thing. It seems to me that building out this sort of function with this sort of approach to things is dramatically early as far as optimizations go when it's just me puttering around on something. I'm always cognizant of the wrong people taking the wrong message when we're talking about things that happen like this at scale. When does having an engineering productivity group begin to make sense?Micheal: I mentioned this earlier; like, yeah, there is definitely not a right answer, but we can start small. For example, this group actually started more as a delivery team. You know, when we started, we realized that we had different ways of deploying services or software at Pinterest, so we first gathered together to figure out, okay, what are the different ways and can we start simplifying that part? And that's where it started expanding. Okay, we are doing button-based deployments right now we have thousand-plus microservices, and we are seeing more incidents than we wanted to because anything where there's a human involved means there's a potential gap for error. I myself was involved in a SEV 0 incident, and I will be honest; we ended up deploying a Hello World application in one of our production fleet. Not the thing I wanted to be associated with my name, but, you know—Corey: And you were suddenly saying hello to the world, in fact—Micheal: [laugh].Corey: —and oops-a-doozy.Micheal: Yeah. So—and that really prompted us to rethink how we need to enable guardrails to do safe production rollouts. And that's how those conversations start ballooning out.Corey: And the healthy correct way. We've all broken production in various ways, and it's—you correctly are identifying, I believe, the direction you're heading in where this is a process problem and a tooling problem; it is not that you are secretly crap and should never have been allowed near anything in production. I mean, that's my excuse for me, but in your case, this is a common thing where it's, if someone can unintentionally cause issues like that, there needs to be better processes and procedures as the organization matures.Micheal: Yep. And that's kind of like always the route or the starting point for these discussions. And it starts growing from there on because, okay, you've helped improve the deploy process but now we're seeing insane amount of slowness, say on the build processes, or even post-deploy, there's, like, issues on how we monitor and look into data.And that I think forces these conversations, okay, where do we have these bespoke tools available? What are people doing today? And you have to ask those hard questions, like what can we actually remove from here? The goal is not to introduce yet another new system. Many a times, to be honest bash just gets the job done. [laugh].Personally, I'm okay with that as long as it's consistent and people, you know, are able to contribute to it and you have good practices in validating it, if it works, we should go for it rather than introducing yet another YAML [laugh] and some of that other aspects of doing that work. And that's what we encourage as well. That's how I think a lot of this starts connecting together in terms of, okay, now this is becoming a productivity group; they're focused on certain challenges where investing probably one person here may up-level a few other engineers who don't have to do that on a day-to-day basis. And I think that's one of the key items for, especially, folks who are running mid-sized companies to realize and start investing in these type of teams to really up-level, sort of, the rest of the engineering.Corey: You've been doing this for a fair while. If you were to go back and start over again on day one—which is always a terrifying question, on some level—what would you have done differently about building out this function as Pinterest continued to scale out?Micheal: Well, first, I must acknowledge that this was just not me, and there's, like, ton of people involved in helping make this happen.Corey: No, that's fair. We'll blame them for the missteps; that is—Micheal: [laugh].Corey: —just fine with me. I kid. I kid.Micheal: I think, definitely the nuances. If I look back, all the decisions that were made then at that point in time, there was a decision made to move to Phabricator, which was back then a great open-source code management system where with the current information at that point in time. And I'm not—I think it's very hard to always look back and say, “Oh, we could have chosen x at one point in time.” And I think in reality, that's how engineering organizations always evolve, that you have to make do with the information you have right now to make a decision that works for you over a couple of years.And I'll give you a small example of this. There was a time when Pinterest was actually on GitHub Enterprise—this was like circa 2013, I would say—and it really served as well for, like, five-plus years. Only then at certain point, we realized that it's hard to hire PHP engineers to support a tool like that, and we had to rethink what is the ROI and the investments we've made here? Can we ever map up or match back to one of the offerings in the industry today? And that's when you make decisions that, okay, at this point in time, it's clear that business continuity talks, you know, and it's hard to operate a system, which is, at this moment not supported, and then you make a call about making a shift or moving.And I think that's the key item. I don't think there's anything dramatically I would have changed since the start. Perhaps definitely investing a bit more individuals into the group and going from there. But that said, I'm really, sort of, at least proud of the fact that usually these teams are extremely lean and small, and they always have an outsized impact, especially when they're working with other engineers, other [opinionated 00:22:13] engineers for what it's worth.This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking databases, observability, management, and security.And - let me be clear here - it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself all while gaining the networking load, balancing and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build.With Always Free you can do things like run small scale applications, or do proof of concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free. No asterisk. Start now. Visit https://snark.cloud/oci-free that's https://snark.cloud/oci-free.Corey: Most folks show up intending to do good today, and you make the best decision at the time with the context and constraints that you have, but my question I think is less around, “Well, what were the biggest mistakes you made?” But more to do with the idea of, based upon what you've learned and as you have shown—as you've shined light on these dark areas, as you have been exploring it, has anything jumped out at you that is, “Oh, yeah. Now, that I know—if I had known then what I know now, I would definitely have made this other decision.” Ideally, something that applies a little more globally than specific within Pinterest, just because the whole idea, aspirationally, is that people might learn something from our conversation. At least I will, if nothing else.Micheal: No, I think that's a great question. And I think the three things that jump to me, top of mind. I think technology is means to an end unless it gives you a competitive edge. And it's really hard to figure out at what point in time what technology and why we adopted it, it's going to make the biggest difference. Humans always tend to have a bias towards aligning towards where we want to go. So, that's the first one in my mind.The second one is, and we spoke about this last time, embrace your cloud provider as much as possible. You'd want to avoid taking on operational burden which is not going to add value to the business. If there is something you see your operating which can be offloaded—because your provider can, trust me, do a way better job than you or your team of few can ever do—embrace that as soon as possible. It's better that way because then it frees up your time to focus on the most important thing, which I've realized over time is—I really think teams like ours are actually—we're probably the most value as a glue to all the different experiences a software engineer would go through as part of their SDLC lifecycle.If we can simplify someone's life by giving them a clear view as to where their commit or the work is in this grand scheme of rolling out and giving them the right amount of data to take action when something goes wrong, trust me, they will love you for what you're doing because you're saving them ton of time. Many times, we don't realize that when we publish 11 different ways for you to go and check to just get your basic validation of work done. We tend to so much focus on the technological aspect of what the tool does, rather than the experience of it, and I've realized, if you can bridge the experience, especially for teams like ours, people really don't even need to know whether you're running Kubernetes or any of those solutions behind the scenes. And I think that's one of the biggest takeaways I have.Corey: I want to double down on something you said about the fact that you are not going to be able to run these services as effectively as your provider can. And relatively recently—in fact, since the first time we spoke—AWS has released a investment report in Virginia. And from 2011 through 2020, they have invested in building AWS data centers there, $35 billion. I promise almost no company that employs people listening to this that are not themselves a cloud provider is going to make that kind of investment in running these things themselves.Now, do cloud providers have sharp edges? Yes, absolutely. That is what my entire career is about, unfortunately. But you're not going to do a better job of running things more sustainably, more reliably, et cetera, et cetera. But there are other problems with this—and that's what I want to start exploring here—where in the olden days, when I ran things in data centers and they went down a lot more as a result, sometimes when there were outages, I would have the CEO of the company just standing there nervous worrying over my shoulder as I frantically typed to fix things.Spoiler: my typing accuracy did not improve by having someone looming over me. Now, when there's an outage that your cloud provider takes, in many cases the thing that you are doing to fix it is reloading the status page and waiting for an update because it is completely out of your hands. Is that something that you've had to encounter? Because you can push buttons and turn dials when things are broken and you control it, but in an AWS—or other cloud provider—outage, all you can really do is wait unless you have a DR plan that is large-scale and effective enough that you won't feel foolish or have wasted a huge amount of time and energy migrating off and then—because then it gets repaired in ten minutes. How do you approach that, from your perspective? I guess, the expectation management piece?Micheal: It's definitely I know something which keeps a lot of folks within infrastructure up at night because, like you just said, at times we can feel extremely powerless when we obviously don't have direct control—or visibility at times, as well—on what's happening. One of the things we have realized over time as part of running on our cloud provider for over a decade now, it forces us to rethink a bit on our priority workflows, what we want our Pinners to always have access to, what they need to see, what is not important or critical. Because it puts into perspective, even for the infrastructure teams, is to what is the most important thing we should always have it available and running, what is okay to be in a degraded state, until what time, right? So, it actually forces us to define SLOs and availability criteria within the team where we can broadcast that to the larger audience including the executives. So, none of this comes as a surprise at that point.I mean, it's not the answer, probably, you're looking for because is there's nothing we can do except set expectations clearly on what we can do and how when you think about the business when these things do happen. So, I know people may have I have a different view on this; I'm definitely curious to hear as well, but I know at Pinterest at least we have converged on our priority workflows. When something goes out, how do we jump in to provide a degraded experience? We have very clear run books to do that, and especially when it's a SEV 0, we do have clear processes in place on how often we need to update our entire company on where things are. And especially this is where your partnership with the cloud provider is going to be a big, big boon because you really want to know or have visibility, at the minimum some predictability on when things can get resolved, and how you want to work with them on some creative solutions. This is outside the DR strategy, obviously; you should still be focused on a DR strategy, but these are just simple things we've learned over time on how to just make it predictable for individuals within the company, so not everyone is freaking out.Corey: Yeah, from my perspective, I think the big things that I found that have worked, in my experience—mostly by getting them wrong the first time—is explain that someone else running the infrastructure when they take an outage; there's not much we can do. And no, it's not the sort of thing where picking up the phone and screaming at someone is going to help us, is the sort of thing that is best to communicate to executive stakeholders when things are running well, not in the middle of that incident.Then when things break, it's one of those, “Great, you're an exec. You know what your job is? Literally anything other than standing in the middle of the engineering floor, making everyone freak out even more. We'll have a discussion later about what the contributing factors were when you demand that we fire someone because of an outage. Then we're going to have a long and hard talk about what kind of culture you're trying to build here again?” But there are no perfect answers here.It's easy to sit here in the silver light of day with things working correctly and say, “Oh, yeah. This is how outages should be handled.” But then when it goes down, we're all basically an inch away at best from running around with our hair on fire, screaming, “Fix it, fix it, fix it, fix it, now.” And I am empathetic to that. There's a reason but I fix AWS bills for a living, and one of those big reasons is that it's a strictly business-hours problem and I don't have to run production infrastructure that faces anything that people care about, which is kind of amazing and freeing for someone who spent too many years on call.Micheal: Absolutely. And one of the things is that this is not only with the cloud provider, I think in today's nature of how our businesses are set up, there's probably tons of other APIs you are using or you're working with you may not be aware of. And we ended up finding that the hard way as well. There were a certain set of APIs or services we were using in the critical path which we were not aware of. When these outages happen, that's when you find that out.So, you're not only beholden to your provider at that point in time; you have to have those SLO expectations set with your other SaaS providers as well, other folks you're working with. Because I don't think that's going to change; it's probably only going to get complicated with all the different types of tools you're using. And then that's a trade-off you need to really think about. An example here is just like—you know, like I said, we moved in the past from GitHub to Phabricator—I didn't close the loop on that because we're moving back to GitHub right now [laugh] and that's one of the key projects I'm working with. Yeah, it's circle of life.But the thing is, we did a very strong evaluation here because we felt like, “Okay, there's a probability that GitHub can go down and that means people will be not productive for that couple of hours. What do we do then?” And we had to put a plan together to how we can mitigate that part and really build that confidence with the engineering teams, internally. And it's not the best solution out there; the other solution was just run our own, but how is that going to make any other difference because we do have libraries being pulled out of GitHub and so many other aspects of our systems which are unknowingly dependent on it anyways. So, you have to still mitigate those issues at some point in your entire SDLC process.So, that was just one example I shared, but it's not always on the cloud provider; I think there are just many aspects of—at least today how businesses are run, you're dependent; you have critical dependencies, probably, on some SaaS provider you haven't really vetted or evaluated. You will find out when they go down.Corey: So, I don't think I've told this story before, but before I started this place, I was doing a fair bit of consulting work for other companies. And I was doing a project at Pinterest years ago. And this was one of the best things I've ever experienced at a company site, let alone a client site, where I was there early in the morning, eight o'clock or so, so you know, engineers love to show up at the crack of 11:30. But so I was working a little early; it was great. And suddenly my SSH session that I was using to remote into something or other hung.And it's tap up, tap enter a couple of times, tap it a couple more. It was hung hard. “What's the—” and then someone gently taps me on the shoulder. So, I take the headphones off. It was someone from corporate IT was coming around saying, “Hey, there's a slight problem with our corporate firewall that we're fixing. Here's a MiFi device just for you that you can tether to get back online and get worked on until the firewall gets back.”And it was incredible, just the level of just being on top of things, and the focus on keeping the people who were building things and doing expensive engineering work that was awesome—and also me—productive during that time frame was just something I hadn't really seen before. It really made me think about the value of where do you remove bottlenecks from people getting their jobs done? It was—it remains one of the most impressive things I've seen.Micheal: That is great. And as you were telling me that I did look up our [laugh] internal system to see whether a user called Corey Quinn existed, and I should confirm this with you. I do see entries over here, a couple of commits, but this was 2015. Was that the time you were around, or is this before that even?Corey: That would have been around then, yes. I didn't start this place until late 2016.Micheal: I do see your commits, like, from 2015, and I—Corey: And they're probably terrible, I have no doubt. There's a reason I don't read code for a living anymore.Micheal: Okay, I do see a lot of GIFs—and I hope it's pronounced as GIF—okay, this is cool. We should definitely have a chat about this separately, Corey?Corey: Oh, yeah. “Would you explain this code?” “Absolutely not. I wrote it. Of course, I have no idea what it does. That's the rule. That's the way code always works.”Micheal: Oh, you are an honorary Pinterest engineer at this point, and you have—yes—contributed to our API service and a couple of Puppet profiles I see over here.Corey: Oh, yes—Micheal: [Amazing 00:36:11]. [laugh].Corey: You don't wind up thinking that's a risk factor that should be disclosed. I kid. I kid. It's, I made a joke about this when VMware acquired SaltStack and I did some analytics and found that 60 some odd lines of code I had written, way back when that were still in the current version of what was being shipped. And they thought, “Wait, is this actually a risk?”And no, I am making a joke. The joke is, is my code is bad. Fortunately, there are smart people around me who review these things. This is why code review is so important. But there was a lot to admire when I was there doing various things at Pinterest. It was a fun environment to work in, the level of professionalism was phenomenal, and I was just a big fan of a lot of the automation stuff.Phabricator was great. I love working with it, and, “Great, I'm going to use this to the next place I go.” And I did and then it was—I looked at what it took to get it up and running, and oh, yeah, I can see why GitHub is so popular these days. But it was neat. It was interesting seeing that type of environment up close.Micheal: That is great to hear. You know, this is what I enjoy, like, hearing some of these war stories. I am surprised; you seem to have committed way more than I've ever done in my [laugh] duration here at Pinterest. I do managing for a living, but then again—Corey, the good news is your code is still running on production. And we—Corey: Oh dear.Micheal: —haven't—[laugh]. We haven't removed or made any changes to it, so that's pretty amazing. And thank you for all your contributions.Corey: Oh, please, you don't have to thank me. I was paid, it was fine. That's the value of—Micheal: [laugh].Corey: —[work 00:37:38] for hire. It's kind of amazing. And the best part about consultants is, is when we're done with a project, we get the hell out everyone's happy about it.More happy when it's me that's leaving because of obvious personality-related reasons. But it was just an interesting company from start to finish. I remember one other time, I wound up opening a ticket about having a slight challenge with a flickering on my then Apple-branded display that everyone was using before they discontinued those. And I expected there to be, “Oh, okay. You're a consultant. Great. How did we not put you in the closet with a printer next to that thing, breathing the toner?” Like most consulting clients tend to do, and sure enough, three minutes later, I'm getting that tap on the shoulder again; they have a whole replacement monitor. “Can you go grab a cup of coffee? We'll run the cable for it. It'll just be about five minutes.” I started to feel actively bad about requesting things because I did a lot of consulting work for a lot of different companies, and not to be unkind, but treating consultants and contractors super well is not something that a lot of companies optimize for. I can't necessarily blame them for that. It just really stood out.Micheal: Yep, I do hope we are keeping up with that right now because I know our team definitely has a lot of consultants working with us as well. And it's always amazing to see; we do want to treat them as FTs. It doesn't even matter at that point because we're all individuals and we're trying to work towards common goals. Like you just said, I think I personally have learned a few items as well from some of these folks. Which is again, I think speaks to how we want to work and create a culture of, like, we're all engineers; we want to be solving problems together, and as you were doing it, we want to do it in such a way that it's still fun, and we're not having the restrictions of titles or roles and other pieces. But I think I digressed. It was really fun to see your commits though, I do want to track this at some point before we move completely over to GitHub, at least keep this as a record, for what it's worth.Corey: Yeah basically look at this graffiti in the codebase of, “A shit-poster was here,” and here I am. And that tends to be, on some level, the mark we live on the universe. What's always terrifying is looking at things I did 15 years ago in my first Linux admin job. Can I still ping the thing that I built there? Yes, I can. And how is that even possible? That should not have outlived me; honestly, it should never have seen the light of day in production, but here we are. And you never know how long that temporary kluge you put together is going to last.Micheal: You know, one of the things I was recalling, I was talking to someone in my team about this topic as well. We always talk about 10x engineers. I don't know what your thoughts are on that, but the fact that you just mentioned you built something; it still pings. And there's a bunch of things, in my mind, when you are writing code or you're working on some projects, the fact that it can outlast you and live on, I think that's a big, big contribution. And secondly, if your code can actually help up-level, like, ten other people, I think you've really made the mark of 10x engineer at that point.Corey: Yeah, the idea of the superhuman engineer is always been a strange and dangerous one. If for nothing else, from where I sit, excellence is inherently situational. Like we just talked about someone at Pinterest: is potentially going to be able to have that kind of impact specifically because—to my worldview—that there's enough process and things around there that empower them to succeed. Then if you were to take that engineer and drop them into a five-person startup where none of those things exist, they might very well flounder. It's why I'm always a little suspicious of this is a startup founded by engineers from Google or Facebook, or wherever it is.It's, yeah, and what aspects of that culture do you think are one-to-one matches with the small scrappy startup in the garage? Right, I predicting some challenges here. Excellence is always situational. An amazing employee at one company can get fired at a second one for lack of performance, and that does not mean that there's anything wrong with them and it does not mean that they are a fraud. It means that what they needed to be successful was present in one of those shops, but not the other.Micheal: This is so true. And I really appreciate you bringing this up because whenever we discuss any form of performance management, that is a—in my view personally—I think that's an incorrect term to be using. It is really at that point in time, either you have outlived the environment you are in, or the environment is going in a different direction where I think your current skill set probably could be best used in the environment where it's going to work. And I know it's very fuzzy at that point, but like you said, yes, excellence really means you don't want to tie it to the number of commits you have pushed out, or any specific aspect of your deliverables or how you work.Corey: There are no easy answers to any of these things, and it's always situational. It's why I think people are sometimes surprised when I will make comments about the general case of how things should be, then I talk to a specific environment where they do the exact opposite, and I don't yell at them for it. It's there—in a general sense, I have some guidance, but they are usually reasons things are the way they are, and I'm interested in hearing them out. Everything's situational, the worst consultant in the world is the one that shows up, has no idea what's going on, and then asked, “What moron set this up?” Invariably, two said, quote-unquote, “Moron.” And the engagement doesn't go super well from there. It's, “Okay, why is this the way that it is? What constraints shaped it? What was the context behind the problem you were trying to solve?” And, “Well, why didn't you use this AWS service?” “Because it didn't exist for another three years when we were building that thing,” is a—Micheal: Yes.Corey: —common answer.Micheal: Yes, you should definitely appreciate that of all the decisions that have been made in past. People tend to always forget why they were made. You're absolutely right; what worked back then will probably not work now, or vice versa, and it's always situational. So, I think I can go on about this for hours, but I think you hit that to the point, Corey.Corey: Yeah, I do my best. I want to thank you for taking another block of time out of your day to wind up talking with me about various aspects of what it takes to effectively achieve better levels of engineering productivity at large companies, with many teams, working on shared codebases. If people want to learn more about what you're up to, where can they find you?Micheal: I'm definitely on Twitter. So, please note that I'm spelled M-I-C-H-E-A-L on Twitter. So, you can definitely read on to my tweets there. But otherwise, you can always reach out to me on LinkedIn, too.Corey: Fantastic and we will, of course, include a link to that in the [show notes 00:44:02]. Thanks once again for your time. I appreciate it.Micheal: Thanks a lot, Corey.Corey: Micheal Benedict, head of engineering productivity at Pinterest. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with a comment telling me that you work at Pinterest, have looked at the codebase, and would very much like a refund and an apology.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
On The Cloud Pod this week, the pod squad is down to the OG three while Ryan is away. Also AWS announces serverless pipelines, GCP releases Spot Pods, and Azure introduces Chaos Studio. A big thanks to this week's sponsors: Foghorn Consulting, which provides full-stack cloud solutions with a focus on strategy, planning and execution for enterprises seeking to take advantage of the transformative capabilities of AWS, Google Cloud and Azure. JumpCloud, which offers a complete platform for identity, access, and device management — no matter where your users and devices are located. This week's highlights
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:01:09] American spy hacked booking.com, company stayed silenthttps://www.nrc.nl/nieuws/2021/11/10/american-spy-hacked-bookingcom-company-stayed-silent-a4065086[00:02:51] Fake emails sent from infrastructure owned by the FBI/DHS (the LEEP portal)https://twitter.com/spamhaus/status/1459450061696417792?s=21[00:06:35] Argo CD v2.2 release candidatehttps://blog.argoproj.io/argo-cd-v2-2-release-candidate-4e16e985b486[00:10:00] Resource Factories: A descriptive approach to Terraformhttps://medium.com/google-cloud/resource-factories-a-descriptive-approach-to-terraform-581b3ebb59c[00:36:15] Terraform Module Versions Cli https://github.com/keilerkonzept/terraform-module-versions[00:39:48] Are there any SQL database (e.g., CockroachDB, Percona) solutions which run in AWS (EC2 or EKS), and outperform AWS Aurora or any proxy recommendations to put in front of Aurora that provide query priority, better replication etc? [00:47:56] Moving from Terragrunt into native Terraform, what are good resources to learn how to split Terraform workspaces for infrastructure? [00:52:13] How many dev teams are using conventional commits? [01:00:25] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
Unedited live recording on YouTube (Ep 141)Topics Linting Q&A Super-Linter 101 Example GitHub Action Repo Dependabot for GHA Linter config files Customize Super-Linter Reusable Workflows Links Super-Linter GitHub Example Workflow in Bret's GitHub Editorconfig Reusable Workflows Workflow Templates Hacktoberfest from DigitalOcean ★ Support this podcast on Patreon ★
In this episode, Gerhard is joined by Cyrille Le Clerc, Product Manager Lead on Observability at Elastic, and Oleg Nenashev, Principal Engineer at CloudBees. It all started with Oleg's tweet back in July, in which he was promoting Akihiro Kiuchi's work on Jenkins monitoring with OpenTelemetry. This was done in the context of Google's Summer of Code - a link to Akihiro's demo is in the show notes. As you may remember from episode 20, instrumenting our changelog.com pipeline is on Gerhard's mind, and this conversation helped him clarify a few things. If you are thinking of instrumenting your CI/CD pipeline with OpenTelemetry, this episode is for you.
About RichardHe's also an instructor at Pluralsight, a frequent public speaker, and the author of multiple books on software design and development. Richard maintains a regularly updated blog (seroter.com) on topics of architecture and solution design and can be found on Twitter as @rseroter. Links: Twitter: https://twitter.com/rseroter LinkedIn: https://www.linkedin.com/in/seroter Seroter.com: https://seroter.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim its better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less that sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive a $100 in credit. Thats v-u-l-t-r.com slash screaming.Corey: You know how git works right?Announcer: Sorta, kinda, not really Please ask someone else!Corey: Thats all of us. Git is how we build things, and Netlify is one of the best way I've found to build those things quickly for the web. Netlify's git based workflows mean you don't have to play slap and tickle with integrating arcane non-sense and web hooks, which are themselves about as well understood as git. Give them a try and see what folks ranging from my fake Twitter for pets startup, to global fortune 2000 companies are raving about. If you end up talking to them, because you don't have to, they get why self service is important—but if you do, be sure to tell them that I sent you and watch all of the blood drain from their faces instantly. You can find them in the AWS marketplace or at www.netlify.com. N-E-T-L-I-F-Y.comCorey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Once upon a time back in the days of VH1, which was like MTV except it played music videos, would have a show that was, “Where are they now?” Looking at former celebrities. I will not use the term washed up because that's going to be insulting to my guest.Richard Seroter is a returning guest here on Screaming in the Cloud. We spoke to him a year ago when he was brand new in his role at Google as director of outbound product management. At that point, he basically had stars in his eyes and was aspirational around everything he wanted to achieve. And now it's a year later and he has clearly failed because it's Google. So, outbound products are clearly the things that they are going to be deprecating, and in the past year, I am unaware of a single Google Cloud product that has been outright deprecated. Richard, thank you for joining me, and what do you have to say for yourself?Richard: Yeah, “Where are they now?” I feel like I'm the Leif Garrett of cloud here, joining you. So yes, I'm still here, I'm still alive. A little grayer after twelve months in, but happy to be here chatting cloud, chatting whatever else with you.Corey: I joke a little bit about, “Oh, Google winds up killing things.” And let's be clear, your consumer division which, you know, Google is prone to that. And understanding a company's org chart is a challenge. A year or two ago, I was of the opinion that I didn't need to know anything about Google Cloud because it would probably be deprecated before I really had to know about it. My opinion has evolved considerably based upon a number of things I'm seeing from Google.Let's be clear here, I'm not saying this to shine you on or anything like that; it's instead that I've seen some interesting things coming out of Google that I consider to be the right moves. One example of that is publicly signing multiple ten-year deals with very large, serious institutions like Deutsche Bank, and others. Okay, you don't generally sign contracts with companies of that scale and intend not to live up to them. You're hiring Forrest Brazeal as your head of content for Google Cloud, which is not something you should do lightly, and not something that is a short-term play in any respect. And the customer experience has continued to improve; Google Cloud products have not gotten worse, and I'm seeing in my own customer conversations that discussions about Google Cloud have become significantly less dismissive than they were over the past year. Please go ahead and claim credit for all of that.Richard: Yeah. I mean, the changes a year ago when I joined. So, Thomas Kurian has made a huge impact on some of that. You saw us launch the enterprise APIs thing a while back, which was, “Hey, here's, for the most part, every one of our products that has a fixed API. We're not going to deprecate it without a year's notice, whatever it is. We're not going to make certain types of changes.” Maybe that feels like, “Well, you should have had that before.” All right, all we can do is improve things moving forward. So, I think that was a good change.Corey: Oh, I agree. I think that was a great thing to do. You had something like 80-some-odd percent coverage of Google Cloud services, and great, that's going to only increase with time, I can imagine. But I got a little pushback from a few Googlers for not being more congratulatory towards them for doing this, and look, it's a great thing. Don't get me wrong, but you don't exactly get a whole lot of bonus points and kudos and positive press coverage—not that I'm press—for doing the thing you should have been doing [laugh] all along.It's, “This is great. This is necessary.” And it demonstrates a clear awareness that there was—rightly or wrongly—a perception issue around the platform's longevity and that you've gone significantly out of your way to wind up addressing that in ways that go far beyond just yelling at people on Twitter they don't understand the true philosophy of Google Cloud, which is the right thing to do.Richard: Yeah, I mean, as you mentioned, look, the consumer side is very experimental in a lot of cases. I still mourn Google Reader. Like, those things don't matter—Corey: As do we all.Richard: Of course. So, I get that. Google Cloud—and of course we have the same cultural thing, but at the same time, there's a lifecycle management that's different in Google Cloud. We do not deprecate products that much. You know, enterprises make decade-long bets. I can't be swap—changing databases or just turning off messaging things. Instead, we're building a core set of things and making them better.So, I like the fact that we have a pretty stable portfolio that keeps getting a little bit bigger. Not crazy bigger; I like that we're not just throwing everything out there saying, “Rock on.” We have some opinions. But I think that's been a positive trend, customers seem to like that we're making these long-term bets. We're not going anywhere for a long time and our earnings quarter after quarter shows it—boy, this will actually be a profitable business pretty soon.Corey: Oh, yeah. People love to make hay, and by people, I stretch the term slightly and talk about, “Investment analysts say that Google Cloud is terrible because at your last annual report you're losing something like $5 billion a year on Google Cloud.” And everyone looked at me strangely, when I said, “No, this is terrific. What that means is that they're investing in the platform.” Because let's be clear, folks at Google tend to be intelligent, by and large, or at least intelligent enough that they're not going to start selling cloud services for less than it costs to run them.So yeah, it is clearly an investment in the platform and growth of it. The only way it should be turning a profit at this point is if there's no more room to invest that money back into growing the platform, given your market position. I think that's a terrific thing, and I'm not worried at all about it losing money. I don't think anyone should be.Richard: Yeah, I mean, strategically, look, this doesn't have to be the same type of moneymaker that even some other clouds have to be to their portfolio. Look, this is an important part, but you look at those ten-year deals that we've been signing: when you look at Univision, that's a YouTube partnership; you look at Ford that had to do with Android Auto; you look at these others, this is where us being also a consumer and enterprise SaaS company is interesting because this isn't just who's cranking out the best IaaS. I mean, that can be boring stuff over time. It's like, who's actually doing the stuff that maybe makes a traditional company more interesting because they partner on some of those SaaS services. So, those are the sorts of deals and those sorts of arrangements where cloud needs to be awesome, and successful, and make money, doesn't need to be the biggest revenue generator for Google.Corey: So, when we first started talking, you were newly minted as a director of outbound product management. And now, you are not the only one, there are apparently 60 of you there, and I'm no closer to understanding what the role encompasses. What is your remit? Where do you start? Where do you stop?Richard: Yeah, that's a good question. So, there's outbound product management teams, mostly associated with the portfolio area. So network, storage, AI, analytics, database, compute, application modernization-y sort of stuff—which is what I cover—containers, dev tools, serverless. Basically, I am helping make sure the market understands the product and the product understands the market. And not to be totally glib, but a lot of that is, we are amplification.I'm amplifying product out to market, analysts, field people, partners: “Do you understand this thing? Can I help you put this in context?” But then really importantly, I'm trying to help make sure we're also amplifying the market back to our product teams. You're getting real customer feedback: “Do you know what that analyst thinks? Have you heard what happened in the competitive space?”And so sometimes companies seem to miss that, and PMs poke their head up when I'm about to plan a product or I'm about to launch a product because I need some feedback. But keeping that constant pulse on the market, on customers, on what's going on, I think that can be a secret weapon. I'm not sure everybody does that.Corey: Spending as much time as I do on bills, admittedly AWS bills, but this is a pattern that tends to unfold across every provider I've seen. The keynotes are chock-full of awesome managed service announcements, things that are effectively turnkey at further up the stack levels, but the bills invariably look a lot more like, yeah, we spend a bit of money on that and then we run 10,000 virtual instances in a particular environment and we just treat it like it's an extension of our data center. And that's not exciting; that's not fun, quote-unquote, but it's absolutely what customers are doing and I'm not going to sit here and tell them that they're wrong for doing it. That is the hallmark of a terrible consultant of, “I don't understand why you're doing what you're doing, so it must be foolish.” How about you stop and gain some context into why customers do the things that they do?Richard: No, I send around a goofy newsletter every week to a thousand or two people, just on things I'm learning from the field, from customers, trying to make sure we're just thinking bigger. A couple of weeks ago, I wrote an idea about modernization is awesome, and I love when people upgrade their software. By the way, most people migration is a heck of a lot easier than if I can just get this into your cloud, yeah love that; that's not the most interesting thing, to move VMs around, but most people in their budget, don't have time to rewrite every Java app to go. Everybody's not changing .NET framework to .NET core.Like, who do I think everybody is? No, I just need to try to get some incremental value first. Yes, then hopefully I'll swap out my self-managed SQL database for a Spanner or a managed service. Of course, I want all of that, but this idea that I can turn my line of business loan processing app into a thousand functions overnight is goofy. So, how are we instead thinking more pragmatically about migration, and then modernizing some of it? But even that sort of mindset, look, Google thinks about innovation modernization first. So, also just trying to help us take a step back and go, “Gosh, what is the normal path? Well, it's a lot of migration first, some modernization, and then there's some steady-state work there.”Corey: One of the things that surprised me the most about Google Cloud in the market, across the board, has been the enthusiastic uptake for enterprise workloads. And by enterprise workloads, I'm talking about things like SAP HANA is doing a whole bunch of deployments there; we're talking Big Iron-style enterprise-y things that, let's be honest, countervene most of the philosophy that Google has always held and espoused publicly, at least on conference stages, about how software should be built. And I thought that would cut against them and make it very difficult for you folks to gain headway in that market and I could not have been more wrong. I'm talking to large enterprises who are enthusiastically talking about Google Cloud. I've got a level with you, compared to a year or two ago, I don't recognize the place.Richard: Mmm. I mean, some of that, honestly, in the conversations I have, and whatever I do a handful of customer calls every week, I think folks still want something familiar, but you're looking for maybe a further step on some of it. And that means, like, yes, is everybody going to offer VMs? Yeah, of course. Is everyone going to have MySQL? Obviously.But if I'm an enterprise and I'm doing these generational bets, can I cheat a little bit, and maybe if I partner with a more of an innovation partner versus maybe just the easy next step, am I buying some more relevance for the long-term? So, am I getting into environment that has some really cool native zero-trust stuff? Am I getting into environment with global backend services and I'm not just stitching together a bunch of regional stuff? How can I cheat by using a more innovation vendor versus just lifting and shifting to what feels like hosted software in another cloud? I'm seeing more of that because these migrations are tough; nobody should be just randomly switching clouds. That's insane.So, can I make, maybe, one of these big bets with somebody who feels like they might actually even improve my business as a whole because I can work with Google Pay and improve how I do mobile payments, or I could do something here with Android? Or, heck, all my developers are using Angular and Flutter; aren't I going to get some benefit from working with Google? So, we're seeing that, kind of, add-on effect of, “Maybe this is a place not just to host my VMs, but to take a generational leap.”Corey: And I think that you're positioning yourselves in a way to do it. Again, talk about things that you wouldn't have expected to come out of Google of all places, but your console experience has been first-rate and has been for a while. The developer experience is awesome; I don't need to learn the intricacies of 12 different services for what I'm trying to do just in order to get something basic up and running. I can stop all the random little billing things in my experimental project with a single click, which that admittedly has a confirm, which you kind of want. But it lets you reason about these things.It lets you get started building something, and there's a consistency and cohesiveness to the console that, again, I am not a graphic designer, by any stretch of the imagination. My most commonly used user interface is a green-screen shell prompt, and then I'm using Vim to wind up writing something horrifying, ideally in Python, but more often in YAML. And that has been my experience, but just clicking around the console, it's clear that there was significant thought put into the design, the user experience, and the way of approaching folks who are starting to look very different, from a user persona perspective.Richard: I can—I mean, I love our user research team; they're actually fun to hang out with and watch what they do, but you have to remember, Google as a company, I don't know, cloud is the first thing we had to sell. Did have to sell Gmail. I remember 15 years ago, people were waiting for invites. And who buys Maps or who buys YouTube? For the most part, we've had to build things that were naturally interesting and easy-to-use because otherwise, you would just switch to anything else because everything was free.So, some of that does infuse Google Cloud, “Let's just make this really easy to use. And let's just make sure that, maybe, you don't hate yourself when you're done jumping into a shell from the middle of the console.” It's like, that should be really easy to do—or upgrade a database, or make changes to things. So, I think some of the things we've learned from the consumer good side, have made their way to how we think of UX and design because maybe this stuff shouldn't be terrible.Corey: There's a trope going around, where I wound up talking about the next million cloud customers. And I'm going to have to write a sequel to it because it turns out that I've made a fundamental error, in that I've accepted the narrative that all of the large cloud vendors are pushing, to the point where I heard from so many folks I just accepted it unthinkingly and uncritically, and that's not what I should be doing. And we'll get to what I was wrong about in a minute, but the thinking goes that the next big growth area is large enterprises, specifically around corporate IT. And those are folks who are used to managing things in a GUI environment—which is fine—and clicking around in web apps. Now, it's easy to sit here on our high horse and say, “Oh, you should learn to write code,” or YAML, which is basically code. Cool.As an individual, I agree, someone should because as soon as they do that, they are now able to go out and take that skill to a more lucrative role. The company then has to backfill someone into the role that they just got promoted out of, and the company still has that dependency. And you cannot succeed in that market with a philosophy of, “Oh, you built something in the console. Now, throw it away and do it right.” Because that is maddening to that user persona. Rightfully so.I'm not that user persona and I find it maddening when I have to keep tripping over that particular thing. How did that come to be, from your perspective? First, do you think that is where the next million cloud customers come from? And have I adequately captured that user persona, or am I completely often the weeds somewhere?Richard: I mean, I shared your post internally when that one came out because that resonated with me of how we were thinking about it. Again, it's easy to think about the cloud-native operators, it's Spotify doing something amazing, or this team at Twitter doing something, or whatever. And it's not even to be disparaging. Like, look, I spent five years in enterprise IT and I was surrounded by operators who had to run dozen different systems; they weren't dedicated to just this thing or that. So, what are the tools that make my life easy?A lot of software just comes with UIs for quick install and upgrades, and how does that logic translate to this cloud world? I think that stuff does matter. How are you meeting these people a little better where they are? I think the hard part that we will always have in every cloud provider is—I think you've said this in different forums, but how do I not sometimes rub the data center on my cloud or vice versa? I also don't want to change the experience so much where I degrade it over the long term, I've actually somehow done something worse.So, can I meet those people where they are? Can we pull some of those experiences in, but not accidentally do something that kind of messes up the cloud experience? I mean, that's a fine line to walk. Does that make sense to you? Do you see where there's a… I don't know, you could accidentally cater to a certain audience too much, and change the experience for the worse?Corey: Yes, and no. My philosophy on it is that you have to meet customers where they are, but only to a point. At some point, what they're asking for becomes actively harmful or disadvantageous to wind up providing for them. “I want you to run my data center for me,” is on some level what some cloud environments look like, and I'm not going to sit here and tell people they're inherently wrong for that. Their big reason for moving to the cloud was because they keep screwing up replacing failed hard drives in their data center, so we're going to put it in the cloud.Is it more expensive that way? Well, sure in terms of actual cash outlay, it almost certainly is, but they're also not going down every month when a drive fails, so once the value of that? It's a capability story. That becomes interesting to me, and I think that trying to sit here in isolation, and say that, “Oh, this application is not how we would build it at Google.” And it's, “Yeah, you're Google. They are insert an entire universe of different industries that look nothing whatsoever like Google.” The constraints are different, the resources are different, and—Richard: Sure.Corey: —their approach to problem-solving are different. When you built out Google, and even when you're building out Google Cloud, look at some of the oldest craftiest stuff you have in your entire all of Google environment, and then remember that there are companies out there that are hundreds of years old. It's a different order of magnitude as far as era, as far as understanding of what's in the environment, and that's okay. It's a very broad and very diverse world.Richard: Yeah. I mean, that's, again, why I've been thinking more about migration than even some of the modernization piece. Should you bring your network architecture from on-prem to the cloud? I mean, I think most cases, no. But I understand sometimes that edge firewall, internal trust model you had on-prem, okay, trying to replicate that.So, yeah, like you say, I want to meet people where they are. Can we at least find some strategic leverage points to upgrade aspects of things as you get to a cloud, to save you from yourself in some places because all of a sudden, you have ten regions and you only had one data center before. So, many more rooms for mistakes. Where are the right guardrails? We're probably more opinionated than others at Google Cloud.I don't really apologize for that completely, but I understand. I mean, I think we've loosened up a lot more than maybe people [laugh] would have thought a few years ago, from being hyper-opinionated on how you run software.Corey: I will actually push back a bit on the idea that you should not replicate your on-premises data center in your cloud environment. Sure, are there more optimal ways to do it that are arguably more secure? Absolutely. But a common failure mode in moving from data center to cloud is, “All right, we're going to start embracing this entirely new cloud networking paradigm.” And it is confusing, and your team that knows how the data center network works really well are suddenly in way over their heads, and they're inadvertently exposing things they don't intend to or causing issues.The hard part is always people, not technology. So, when I glance at an environment and see things like that, perfect example, are there more optimal ways to do it? Oh, from a technology perspective, absolutely. How many engineers are working on that? What's their skill set? What's their position on all this? What else are they working on? Because you're never going to find a team of folks who are world-class experts in every cloud? It doesn't work that way.Richard: No doubt. No doubt, you're right. There's areas where we have to at least have something that's going to look similar, let you replicate aspects of it. I think it's—it'll just be interesting to watch, and I have enough conversations with customers who do ask, “Hey, where are the places we should make certain changes as we evolve?” And maybe they are tactical, and they're not going to be the big strategic redesign their entire thing. But it is good to see people not just trying to shovel everything from one place to the next.Corey: This episode is sponsored in part by something new. Cloud Academy is a training platform built on two primary goals. Having the highest quality content in tech and cloud skills, and building a good community the is rich and full of IT and engineering professionals. You wouldn't think those things go together, but sometimes they do. Its both useful for individuals and large enterprises, but here's what makes it new. I don't use that term lightly. Cloud Academy invites you to showcase just how good your AWS skills are. For the next four weeks you'll have a chance to prove yourself. Compete in four unique lab challenges, where they'll be awarding more than $2000 in cash and prizes. I'm not kidding, first place is a thousand bucks. Pre-register for the first challenge now, one that I picked out myself on Amazon SNS image resizing, by visiting cloudacademy.com/corey. C-O-R-E-Y. That's cloudacademy.com/corey. We're gonna have some fun with this one!Corey: Now, to follow up on what I was saying earlier, what I think I've gotten wrong by accepting the industry talking points on is that the next million cloud customers are big enterprises moving from data centers into the cloud. There's money there, don't get me wrong, but there is a larger opportunity in empowering the creation of companies in your environment. And this is what certain large competitors of yours get very wrong, where it's we're going to launch a whole bunch of different services that you get to build yourself from popsicle sticks. Great. That is not useful.But companies that are trying to do interesting things, or people who want to found companies to do interesting things, want something that looks a lot more turnkey. If you are going to be building cloud offerings, that for example, are terrific building blocks for SaaS companies, then it behooves you to do actual investments, rather than just a generic credit offer, into spurring the creation of those types of companies. If you want to build a company that does payroll systems, in a SaaS, cloud way, “Partner with us. Do it here. We will give you a bunch of credits. We will introduce you to your first ten prospective customers.”And effectively actually invest in a company success, as opposed to pitch-deck invest, which is, “Yeah, we'll give you some discounting and some credits, and that's our quote-unquote, ‘investment.'” actually be there with them as a partner. And that's going to take years for folks to wrap their heads around, but I feel like that is the opportunity that is significantly larger, even than the embedded existing IT space because rather than fighting each other for slices of the pie, I'm much more interested in expanding that pie overall. One of my favorite questions to get asked because I think it is so profoundly missing the point is, “Do you think it's possible for Google to go from number three to number two,” or whatever the number happens to be at some point, and my honest, considered answer is, “Who gives a shit?” Because number three, or number five, or number twelve—it doesn't matter to me—is still how many hundreds of billions of dollars in the fullness of time. Let's be real for a minute here; the total addressable market is expanding faster than any cloud or clouds are going to be able to capture all of.Richard: Yeah. Hey, look, whoever who'll be more profitable solving user problems, I really don't care about the final revenue number. I can be the number one cloud tomorrow by making Google Cloud free. What's the point? That's not a sustainable business. So, if you're just going for who can deploy the most VCPUs or who can deploy the most whatever, there's ways to game that. I want to make sure we are just uniquely solving problems better than anybody else.Corey: Sorry, forgive me. I just sort of zoned out for a second there because I'm just so taken aback and shocked by the idea of someone working at a large cloud provider who expresses a philosophy that isn't lying awake at night fretting over the possibility of someone who isn't them as making money somewhere.Richard: [laugh]. I mean, your idea there, it'll be interesting to watch, kind of, the maker's approach of are you enabling that next round of startups, the next round of people who want to take—I mean, honestly, I like the things we're doing building block-wise, even with our AI: we're not just handing you a vision API, we're giving you a loan processing AI that can process certain types of docs, that more packaged version of AI. Same with healthcare, same with whatever. I can imagine certain startups or a company idea going, “Hey, maybe I could disrupt or serve a new market.”I always love what Square did. They've disrupted emerging markets, small merchants here in North America, wherever, where I didn't need a big expensive point of sale system. You just gave me the nice, right building blocks to disrupt and run my business. Maybe Google Cloud can continue to provide better building blocks, but I do like your idea of actually investment zones, getting part of this. Maybe the next million users are founders and it's not just getting into some of these companies with, frankly, 10, 20, 30,000 people in IT.I think there's still plenty of room in these big enterprises to unlock many more of those companies, much more of their business. But to your point, there's a giant market here that we're not all grabbing yet. For crying out loud, there's tons of opportunity out here. This is not zero-sum.Corey: Take it a step further beyond that, and today, if you have someone who's enterprising, early on in their career, maybe they just got out of school, maybe they have just left their job and are ready to snap, or they have some severance money that they want to throw into something. Great. What do they want to do if they have an idea for a company? Well today, that answer looks a lot like, well, time to go to a boot camp and learn to code for six months so you can build a badly done MVP well enough to get off the ground and get some outside investment, and then go from there. Well, what if we cut that part out entirely?What if there were building blocks of I don't need to know or care that there's a database behind it, or what a database looks like. Picture Visual Basic in a web browser for building apps, and just take this bit of information I give you and store it and give it back to me later. Sure, you're going to have some significant challenges in the architecture or something like that as it goes from this thing that I'm talking about as an MVP to something planet-scale—like a Spotify for example—but that's not most businesses, and that's okay. Get out of the way and let people innovate and iterate on what it is they're doing more rapidly, and make it more accessible to teach people. That becomes huge; that gets the infrastructure bits that cloud providers excel at out of the way, and all it really takes is packaging those things into a golden path of what a given company of a particular profile should be doing, if—unless they have reason to deviate from it—and instead of having this giant paradox of choice issue, it's, “Oh, okay, I'll drag-drop, build things accordingly.”And under the hood, it's doing all the configuration of services and that's great. But suddenly, you've made being a founder of a software company—fundamentally—accessible to people who are not themselves software engineers. And I know that's anathema to some people, and I don't even slightly care because I am done with gatekeeping.Richard: Yeah. No, it's exciting if that can pull off. I mean, it's not the years ago where, how much capital was required to find the rack and do all sorts of things with tech, and hire some developers. And it's an amazing time to be software creators, now. The more we can enable that—yeah, I'm along for that journey, sign me up.Corey: I'm looking forward to seeing how it winds up shaking out. So, I want to talk a little bit about the paradox of choice problem that I just mentioned. If you take a look at the various compute services that every cloud provider offers, there are an awful lot of different choices as far as what you can run. There's the VM model, there's containers—if you're in AWS, you have 17 ways to run those—and you wind up—any of the serverless function story, and other things here and there, and managed services, I mean and honestly, Google has a lot of them, nowhere near as many as you do failed messaging products, but still, an awful lot of compute options. How do customers decide?What is the decision criteria that you see? Because the worst answer you can give someone who doesn't really know what they're doing is, “It depends,” because people don't know how to make that decision. It's, “What factors should I consider then, while making that decision?” And the answer has to be something somewhat authoritative because otherwise, they're going to go on the internet and get yelled at by everyone because no one is ever going to agree on this, except that everyone else is wrong.Richard: Mm-hm. Yeah, I mean, on one hand, look, I like that we intentionally have fewer choices than others because I don't think you need 17 ways to run a container. I think that's excessive. I think more than five is probably excessive because as a customer, what is the trade-off? Now, I would argue first off, I don't care if you have a lot of options as a vendor, but boy, the backends of those better be consistent.Meaning if I have a CI/CD tool in my portfolio and it only writes to two of them, shame on me. Then I should make sure that at least CI/CD, identity management, log management, monitoring, arguably your compute runtime should be a late-binding choice. And maybe that's blasphemous because somebody says, “I want to start up front knowing it's a function,” or, “I want to start it's a VM.” How about, as a developer, I couldn't care less. How about I just build cool software and maybe even at deploy time, I say, “This better fits in running in Kubernetes.” “This is better in a virtual machine.”And my cost of changing that later is meaningless because, hey, if it is in the container, I can switch it between three or four different runtimes, the identity management the same, it logs the exact same way, I can deploy CI/CD the same way. So, first off, if those things aren't the same, then the vendor is messing up. So, the customer shouldn't have to pay the cost of that. And then there gets to be other actual criteria. Look, I think you are looking at the workload itself, the team who makes it, and the strategy to figure out the runtime.It's easy for us. Google Compute Engine for VMs, containers go in GKE, managed services that need some containers, there are some apps around them, are Cloud Functions and Cloud Run. Like, it's fairly straightforward and it's going to be an OR situation—or an AND situation not an OR, which is great. But we're at least saying the premium way to run containers in Google Cloud for systems is GKE. There you go. If you do have a bunch of managed services in your architecture and you're stitching them together, then you want more serverless things like Cloud Run and Cloud Functions. And if you want to just really move some existing workload, GCE is your best choice. I like that that's fairly straightforward. There's still going to be some it depends, but it feels better than nine ways to run Kubernetes engines.Corey: I'm sure we'll see them in the fullness of time.Richard: [laugh].Corey: So, talk about Anthos a bit. That was a thing that was announced a while back and it was extraordinarily unclear what it was. And then I looked at the pricing and it was $10,000 a month with a one-year minimum commitment, and is like, “Oh, it's not for me. That's why I don't get it.” And I haven't really looked back at it since. But it is something else now. It almost feels like a wrapper brand, in some respects. How's it going? [unintelligible 00:29:26]?Richard: Yeah. Consumption, we'll talk more upcoming months on some of the adoption, but we're finally getting the hockey stick, which always comes delayed with platforms because nobody adopts platforms quickly. They buy the platform and a year later they start to actually build new development, migrate the things they have. So, we're starting to see the sort of growth. But back to your first point. And I even think I poorly tried to explain it a year ago with you. Basically, look, Anthos is the ability to manage fleets of GKE clusters, wherever they are. I don't care if they're on-prem, I don't care if they're in Google Cloud, I don't care if they're Amazon. We have one customer who only uses Anthos on AWS. Awesome, rock on.So, how do I put GKE clusters everywhere, but then do fleet management because look, some people are doing an app per cluster. They don't want to jam 50 apps in the cluster from different teams because they don't like the idea that this app requires root access; now you can screw around with mine. Or, you didn't update; that broke the cluster. I don't want any of that. So, you're going to see companies more, doing even app per cluster, app per developer per cluster.So, now I have a fleet problem. How do I keep it in sync? How do I make sure policy is consistent? Those sorts of things. So, Anthos is kind of solving the fleet management challenge and replacing people's first-gen app platform.Seeing a lot of those use cases, “Hey, we're retiring our first version of Docker Enterprise, Mesos, Cloud Foundry, even OpenShift,” saying, “All right, now's the time for our next version of our app platform. How about GKE, plus Cloud Run on top of it, plus other stuff?” Sounds good. So, going well is a, sort of—as you mentioned, there's a brand story here, mainly because we've also done two things that probably matter to you. A, we changed the price a lot.No minimum commit, remarkably at 20% of the cost it was when we launched, on purpose because we've gotten better at this. So, much cheaper, no minimum commit, pay as you go. Be on-premises, on bare metal with GKE. Pay by the hour, I don't care; sounds great. So, you can do that sort of stuff.But then more importantly, if you're a GKE customer and you just want config management, service mesh, things like that, now you can buy all of those independently as well. And Anthos is really the brand for fleet management of GKE. And if you're on Google Cloud only, it adds value. If you're off Google Cloud, if you're multi-cloud, I don't care. But I want to manage fleets of compute clusters and create them. We're going to keep doubling down on that.Corey: The big problem historically for understanding a lot of the adoption paradigm of Kubernetes has been that it was, to some extent, a reimagining of how Google ran and built software internally. And I thought at the time, the idea was—from a cynical perspective—that, “All right, well, your crappy apps don't run well on Google-style infrastructure so we're going to teach the entire world how to write software the way that we do.” And then you end up with people running their blog on top of Kubernetes, where it's one of those, like, the first blog post is, like, “How I spent the last 18 months building Kubernetes.” And, okay, that is certainly a philosophy and an approach, but it's almost approaching Windows 95 launch level of hype, where people who didn't own computers were buying copies of it, on some level. And I see the term come up in conversations in places where it absolutely has no place being brought up. “How do I run a Kubernetes cluster inside of my laptop?” And, “It's what you got going on in there, buddy?”Richard: [laugh].Corey: “What do you think you're trying to do here because you just said something that means something that I think is radically different to me than it is to you.” And again, I'm not here to judge other people's workflows; they're all terrible, except for mine, which is an opinion held by everyone about their own workflow. But understanding where people are, figuring out how to get there, how to meet customers where they are and empower them. And despite how heavily Google has been into the Kubernetes universe since its inception, you're very welcoming to companies—and loud-mouth individuals on Twitter—who have no use for Kubernetes. And working through various products you offer, I don't ever feel like a second-class citizen. There's really something impressive about that, of not letting the hype dictate the product and marketing decisions of it.Richard: Yeah, look, I think I tweeted it recently, I think the future of software is managed services with containers in the gap, for the most part. Whereas—if you can use managed services, please do. Use them wherever you can. And if you have to sling some code, maybe put it in a really portable thing that's really easy to run in lots of places. So, I think that's smart.But for us, look, I think we have the best container workflow from dev tools, and build tools, and artifact registries, and runtimes, but plenty of people are running containers, and you shouldn't be running Kubernetes all over the place. That makes sense for the workload, I think it's better than a VM at the retail edge. Can I run a small cluster, instead of a weird point-of-sale Windows app? Maybe. Maybe it makes sense to have a lightweight Kubernetes cluster there for consistency purposes.So, for me, I think it's a great medium for a subset of software. Google Cloud is going to take whatever you got, which is great. I think containers are great, but at the same time, I'm happily going to let you deploy a function that responds to you adding a storage item to a bucket, where at the same time give you a SaaS service that replaces the need for any code. All of those are terrific. So yeah, we love Kubernetes. We think it's great. We're going to be the best version to run it. But that's not going to be your whole universe.Corey: No, and I would argue it absolutely shouldn't be.Richard: [laugh]. Right. Agreed. Now again, for some companies, it's a great replacement for this giant fleet of VMs that all runs at eight percent utilization. Can I stick this into a bunch of high-density clusters? Absolutely you should. You're going to save an absolute fortune doing that and probably pick up some resilience and functionality benefits.But to your point, “Do I want to run a WordPress site in there?” I don't know, probably not. “Do I need to run my own MySQL?” I'd prefer you not do that. So, in a lot of cases, don't use it unless you have to. That should go for all compute nowadays. Use managed services.Corey: I'm a big believer in going down that approach just because it is so much easier than trying to build it yourself from popsicle sticks because you theoretically might have to move it someday in the future, even though you're not.Richard: [laugh]. Right.Corey: And it lets me feel better about a thing that isn't going to be used by anything that I'm doing in the near future. I just don't pretend to get it.Richard: No, I don't install a general purpose electric charger in my garage for any electric car I may get in the future; I charge for the one I have now. I just want it to work for my car; I don't want to plan for some mythical future. So yeah, premature optimization over architecture, or death in IT, especially nowadays where speed matters, don't waste your time building something that can run in nine clouds.Corey: Richard, I want to thank you for coming on again a year later to suffer my slings, arrows, and other various implements of misfortune. If people want to learn more about what you're doing, how you're doing it, possibly to pull a Forrest Brazeal and go work with you, where can they find you?Richard: Yeah, we're a fun place to work. So, you can find me on Twitter at @rseroter—R-S-E-R-O-T-E-R—hang out on LinkedIn, annoy me on my blog seroter.com as I try to at least explore our tech from time to time and mess around with it. But this is a fun place to work. There's a lot of good stuff going on here, and if you work somewhere else, too, we can still be friends.Corey: Thank you so much for your time today. Richard Seroter, director of outbound product management at Google. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment into which you have somehow managed to shove a running container.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, CICD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have. You can register here: https://cloudposse.com/office-hoursJoin the conversation: https://slack.cloudposse.com/Find out how we can help your company:https://cloudposse.com/quizhttps://cloudposse.com/accelerate/Learn more about Cloud Posse:https://cloudposse.comhttps://github.com/cloudpossehttps://sweetops.com/https://newsletter.cloudposse.comhttps://podcast.cloudposse.com/[00:00:00] Intro[00:01:27] HashiCorp registers to go public (HN thread)https://www.sec.gov/Archives/edgar/data/1720671/000119312521319849/d205906ds1.htmhttps://news.ycombinator.com/item?id=29110444[00:03:30] Goodbye Microsoft SQL Server, Hello Babelfishhttps://aws.amazon.com/blogs/aws/goodbye-microsoft-sql-server-hello-babelfish/[00:06:45] “Don't trust instructions from random people on the internet”https://ghuntley.com/sudo-rm-rf/[00:11:35] Grafana OnCall announcedhttps://grafana.com/blog/2021/11/09/announcing-grafana-oncall/[00:14:22] Terratest supports validating Terraform code with OPAhttps://github.com/gruntwork-io/terratest/releases/tag/v0.38.1[00:17:43] Lambda now supports pulling images from cross-account ECRhttps://aws.amazon.com/about-aws/whats-new/2021/11/aws-lambda-support-cross-account-image-amazon-elastic-container-registry/[00:24:20] You can now share AMIs with Orgs or OUs rather than individual accountshttps://aws.amazon.com/about-aws/whats-new/2021/10/amazon-ec2-amazon-machine-images-organizations/[00:27:03] Terraform Config Driven Refactoring (via Matt Gowie)https://discuss.hashicorp.com/t/request-for-feedback-config-driven-refactoring/30730[00:43:01] Uhoh. Terraform 1.1.0-beta1 drops. https://github.com/hashicorp/terraform/releases/tag/v1.1.0-beta1https://github.com/hashicorp/terraform-provider-aws/releases/tag/v3.64.0[00:48:45] PR to Add Bottlerocket Support for cloudposse/terraform-aws-eks-node-grouphttps://github.com/cloudposse/terraform-aws-eks-node-group/pull/93[00:51:30] How do you suggest doing DB Snapshot Dump and Restore from Production to Dev/Staging/QA envs[01:03:58] Outro #officehours,#cloudposse,#sweetops,#devops,#sre,#terraform,#kubernetes,#awsSupport the show (https://cloudposse.com/office-hours/)
Talk to, or about, CircleCI CFO Chitra Balasubramanian and you'll hear the word “team” early and often. The phrase describes her leadership approach and to how her finance group equips colleagues with next-level business and customer insights. Current and former colleagues will tell you how much value (and fun) she brings to any “solve team.” And Balasubramanian refers to her “insights team” when discussing her group's FP&A activities at CircleCI, a continuous integration and continuous delivery (CI/CD) platform that helps developers do their work faster while ensuring high-quality code. The key to delivering actionable business nights, Balasubramanian notes, includes creating clear problem statements and outcomes. When that clarity is wanting, Balasubramanian's insight team clicks into exploratory mode. Once the financial analysts have gleaned what the business needs to make better decisions, it's time to distill. “We don't want to simply relay a party bag full of information,” Balasubramanian asserts. “Our role is to deliver insights that are consumable and that immediately stimulate a productive dialogue among our stakeholders.” That those financial analyses could influence shareholder value marks a touchstone career moment for Balasubramanian, who previously worked at RetailNext. “Finance teams have a lot of opportunities to lean in and drive the business forward,” she adds. “By looking at customer data from a finance perspective, we can glean new insights that help other business stakeholders actually improve the end-customer experience.” A focus on delivering beyond-finance insights is fitting given that Balasubramanian is the only CFO currently on the board of trustees team at the Computer History Museum in Mountain View, Calif.
Want to learn about the latest and greatest in the 64-bit Visual Studio 2022? Join Scott Hanselman and Visual Studio product team as they take Visual Studio 2022 for a spin. [00:00] Intro[00:39] Why you should care about Visual Studio 2022?[02:20] Performance improvements in Visual Studio 2022[04:39] Why 64-bit now?[08:00] IntelliCode, type less code more[11:35] Hot reload for C++[13:47] New for WPF and WinForms (Hot Reload, Design time data, XAML live preview)[17:20] Hot Reload in ASP.NET[20:27] Profiling .NET apps in Visual Studio 2022[23:19] Cross platform apps with WSL and CMake in Visual Studio 2022[26:07] Testing your .NET app on Linux[28:00] Easily create CI/CD pipelines using GitHub actions with Visual Studio 2022[30:40] Balloon drop! https://aka.ms/VS2022LaunchLearnTV https://aka.ms/vsperftip https://aka.ms/intellicode https://aka.ms/cppinvs https://aka.ms/vswpf https://aka.ms/webinvs https://aka.ms/vsprofiler https://aka.ms/vscmake https://aka.ms/vstesttools https://aka.ms/azureinvs
[קישור לקובץ mp3] שלום וברוכים הבאים לפודקאסט מספר 425 [?To Early] של רברס עם פלטפורמה! היום ה-1 בנובמבר 2021, השעה היא פחות או יותר 2100 בערב ואנחנו נמצאים באולפן הביתי שלנו אשר בכרכור - אהלן אורי! הקדמה ארוכה . . .היום אנחנו מתכבדים לארח את שמעון - אהלן שמעון! - (שמעון) אהלן, כיף להיות פה, תודה רבה - (רן) איזה יופי שבאת וברוך הבא . . . - (שמעון) עשיתי את המסע מתל אביב, אני בטח לא היחיד פה שעולה לרגל . . . - (רן) בסוף עוד תקנה פה בית . . . (אורי) . . . והצטרפת ל-425 הרגלים . . .(רן) אז שמעון - ברוך הבא! שמעון מחברת Datree, והיום אנחנו הולכים לדבר בעיקר על Kubernetes ועל איך עושים רגולציה למפתחים, אבל תיכף נדבר על זה בצורה קצת יותר . . .(שמעון) זו מילה נוראית . . . (רן) בקטע טוב! . . . איך בעצם שומרים על Policy שפוי, ככה שה-Production שלנו לא יפול ויתרסק - כשאנחנו מדברים ספציפית על Kubernetes, אבל אולי גם נכליל את זה.ולפני שנצלול לנושא - נושא עמוק וטכני ומעניין - ספר לנו קצת עליך, שמעון:(שמעון) נעים מאוד, קוראים לי שמעון, אני בן 33 מתל אביבמה שנקרא “התקנתי את הלינוקס הראשון שלי” בגיל 12, ומאז התאהבתי ב-Open Source - אני זוכר, זה היה אז Red Hat 6 . . . לאחר מכן, פתחתי את החברה הראשונה שלי בגיל 15 - בתחום של Web Hosting ושרתי משחק [Game Servers]זה בתקופה שלפני ה-Cloud . . . .ו-Fast Forward: שירתתי בצבא, הייתי חוקר עבירות מחשבעבדתי אצל שי אגסי ב-Better Placeהייתי ב-Intel Security - מה שהיה Mcafee - אז יש לי קצת רקע ב-Securityולפני שפתחנו את Datree, הייתי ה-General Manager של חטיבת פיתוח התשתיות של Iron Source - היה מאוד מעניין לעשות Scale לחברה מ-30 עובדים ל-1,000 עובדים, עם תשתיות פיתוח ל-400 מתכנתים.ושם הרגשתי הרבה מאוד מה-Challeng-ים שאנחנו פותרים היום ב-Datree.(רן) אוקיי, אז אתה אומר שגדלתם מ-30 ל-400, פחות או יותר?(שמעון) מ-30 ל-1,000 - זה בכללי, עובדים; [מבחינת] מתכנתים הגענו ל-400, אני חושב.(רן) בסדר, אז 400 מפתחים עובדים כנראה על Cluster די גדול - ומישהו צריך לנהל את ה-Cluster הזה ולדאוג לזה שהוא יהיה “בריא” - וכנראה שיש כמה Cluster-ים כאלה. אולי לצערך או לשמחתך - זה היה התפקיד שלך, בין השאר . . . או שאתה ניהלת את הכוח שעשה את זה.(שמעון) כן . . . .שמע, אתה יודע - בחברה, לצורך העניין, כמו Iron Source, יש לך הכל: ECS ו-Fargate ו-Kubernetes ו-EC2 רגיל ו-Bare Metal באיזה Co-location . . . . יש הכל אז באמת מהבחינה הזאת, יצא לי לעבוד עם הרבה מאוד סוגים של תשתיות.אגב - גם AWS וגם GCP וגם Azure, כי אתה בסוף . . . היה לנו גם GitHub, גם GitLab וגם Bitbucket - לא כי רצינו, אלא כי אתה רוכש חברות ודברים נכנסים . . .(אורי) Iron Source גדלה הרבה מרכישות, ו . . .(שמעון) נכון - ואגב, זה היה אחד מה-Challenge-ים, שאין אחידות - הכל . . . אני לא אגיד “ג'ונגל”, אבל “רב-גוני” . . . .(אורי) אהה . . . (רן) ג'ונגל . . . ג'ונגל.(אורי) Iron Source, יאמר לזכותם, ואולי זה גם חלק מהיופי של לעשות רכישות ולתת לכולם להמשיך לרוץ קדימה . . . (שמעון) חד משמעית, זה היה מקום מדהים - מאוד מאוד שמחתי לעבוד שם והיו Challenge-ים מאוד מאוד מעניינים.ובאמת ככה יום אחד התעוררתי - ובלילה, אחד המתכנתים עשה טעות והכניס Misconfiguration שהגיעה ל-Production . . . זה קרה פה לכל המאזינים, כנראה - מ-Secrets שדלפו לאיזשהו GitHub Repo או Misconfiguration, בין אם זה Kubernetes או -EC2 או לא משנה במה אתם משתמשים - וזה הגיע ל-Production.וזה בסדר - אני כל הזמן עושה טעויות, אנשים טועים, אבל אחד ה-Challenge-ים הכי גדולים היה “אוקיי, סבבה - Fool me Once זה בסדר, Full me twice זה בעיה שלי . . .”איך אני אגרום עכשיו ל-400 מתכנתים לא לעשות את אותה הטעות שוב.(רן) אה, אין שום בעיה - שולחים אימייל! : “חבר'ה - בבקשה לא לעשות Misconfiguration!”(שמעון) אתה צוחק, אבל אני ראיתי הרבה ארגונים ששולחים מייל . . . “חברים, מעכשיו משתמשים בגירסא הזאת, מעכשיו משתמשים ב-Container-ים האלה”.מן הסתם, זה די מגוחך וזה לא עובד.ניסיתי כל מיני דברים . . . ניסיתי את ה-Email . . . אני מאוד פעיל בקהילותאני אחד מה-Co-Organizers של CNCF Tel Avivאני AWS Community Hero, אז אני מוביל את ה-Meetup הכי גדול בעולם של Amazon . . .יש לנו 8,000 אנשיםאמרתי בואו נעשה Meetup! עשינו Meetup, הסברתי על Secure Development ומה לעשות ואיך ולמה - אבל ברור שזה לא עובדזה חייב להיות in-flow ב-Development Process של המפתחים . . .(אורי) דרך אגב - יש מצב שאתה Community Hero בגלל שמישהו עשה טעות בקונפיגורציה, ודרך זה AWS נותנת לך את התארים . . .(שמעון) יכול להיות . . .(רן) הייתה פעם סדרה, אני לא יודע אם אתם זוכרים, בשם “גיבורים בעל כורחם” . . . אז הנה - דוגמא.(אורי) זה בהמשך ל-Darwin Awards . . . (רן) לא, “גיבורים בעל כורחם” הייתה סדרה אמיתית, בדרך כלל על חיילים שהגיעו לכל מיני מצבים מאוד קשים ואז נאלצו להראות את הגבורה שלהם - אז כמו שהם לא בחרו להיכנס לשם, גם אתה לא בחרת להיכנס לשם אבל יצאת בגבורה . . .(שמעון) כן, זה מה שקרה . . .אז באמת, אחרי 4 וחצי שנים נפלאות ומדהימות ב-Iron Source, יצאתי לדרך עם השותף של - אייר זילברמן - ופתחנו את Datree.מה שאנחנו עושים היום זה עוזרים למנוע מ-Misconfigurations להגיע ל-Production - בסביבת Kubernetesנכון, זו בעיה הרבה יותר רחבה מ-Kubernetes, וצריך אותה בכל מקום - אבל אין מה לעשות, אתה סטארטאפ ואתה צריך להתמקדאולי זה כבר לפרק אחר - עשינו שינוי מ-Top-Down ל-Bottom-Up, ל-Product-led Growth [אייר כבר עשה את הפרק הזה . . . ] - והיה לנו חשוב לעשות Flow מאוד ברור ומאוד פשוט - אתה מגיע לאתר והדבר הראשון ששאתה רואה מול הפנים שלך זה Copy-Paste של curl לטרמינל ואתה מתקין את ה-CLI . . .בשבוע שעבר חצינו את רף ה-5,000 Star-ים . . .(רן) מזל טוב! . . .(שמעון) תודה רבה, אנחנו באמת מאוד גאיםוככה יצאנו לדרך - כדי לעזור למתכנתים לעשות את הדבר הנכון.(רן) אוקיי, אז Kubernetes - למי שלא היה באיזור הזה בשנים האחרונות - Kubernetes זה Cluster Management, או Orchestrator יותר נכון. בעצם, אם יש לכן תוכנה שרצה בתוך Container - נניח Docker, לא ניהיה ספציפיים - ויש לכם אלפים כאלה, אז צריך משהו שינהל את אותם - שירים אותם, שיעשה להם Healing, שיפרוש אותם, שישים את הגרסאות הנכונות וכו' - ו-Kubernetes הוא זה שעושה את זה.עכשיו, למי שעדיין מכיר Kubernetes - כדאי שתכירו YAML או JSON - בעצם, Kubernetes פועל ע”י קבצים, קבצי-ענק לפעמים, של JSON-ים או YAML-ים שבאמצעותם אתם מקנפגים (Configure) - אתם אומרים “את ה-Product הזה תשים פה”, “את ה-Service הזה תשים שם” וכו'. ואז נשאלת השאלה- ?What could go wrong(שמעון) אז אני חושב שאולי באמת נדבר על המהפיכה שקרתה פה, כי אני חושב שזה משהו שהוא בקנה מידה של המצאת ה-VM, לטעמי.כי אנחנו חיינו גם - אז באנו ופתאום הגיעו ה-Cloud-ים ואז היה את ה-API של AWS ואת ה-API של GCP ואת ה-API של איזשהו Bare-Metal שאתה משתמש - וניהיה פה מגדל בבל.ואז קמו והקימו את ה-Cloud Native Foundation, את ה-CNCF, ואמרו: “חברים, בואו נעשה שכבה אחידה, שנוכל לדבר בינינו” - ותריץ את זה על Bare-Metal או AWS או GCP או Azure או איפה שתרצהאבל שתיהיה לנו שפה אחידה - כי זה לא הגיוני שאנחנו, בשביל לפתח מוצר, צריכים לתמוך בכל כך הרבה דברים.עכשיו, כשעשו את זה - זה כבר היה בתקופה שהיא יחסית מאוחרת, ויכלו לבוא ולהכיל את ה-Modern Practices בתוך זהאז למשל, בתוך Kubernetes אין UI שאתה יכול לעשות לו Launch Instance, פשוט אין דבר כזהאין שום כפתור שאפשר ללחוץ - יש רק קבצים, לצורך העניין YAML-ים, שלוקחים ועושים להם Apply - ובעצם שולחים אותם לשרת ואומרים “היי! הנה ה-Instructions set של מה שאני רוצה שיקרה!”.ואני חושב שזו מהפכה - כי זה מביא אותך לעולמות של GitOps, זה מביא אותך לעולמות שהכל מתועד, על כל שינוי שקורה אתה יכול לדעת מי שינה אותו ולמה שינה אותו ואיך להחליף אותווזה באמת שם לנו Even Playing Field מסויים - שלטעמי זה ממש הדור הבא . . . . לא רק לטעמי - ניתן לראות מה קורה בעולם.(רן) זה למעשה כלי שנותן לך לתאר את סביבת ה-Production שלך בצורה . . . כקונפיגורציה (Configuration), בצורה שהיא דיסקרפטיבית (Descriptive) - אתה לא צריך להריץ Script-ים כדי שיעבירו אותך למצב הזה, אלא אתה פשוט אומר “הנה, זה המצב - תעשה Apply, תכיל את זה” - וזה נשמע כמו פתרון מצויין, אבל עדיין יש לי תחושה שיש פה כמה צרות שמתחבאות פה בפנים . . .(אורי) יש תמיד את המתח הזה, בין האם אני עושה UI ומאחורי ה-UI אני יכול לשמור על . . . אני יכול להפעיל חוקים או למנוע אפשרויות מסויימות ב-UI - ואז זה שומר עלי; או שאני נותן לעשות את ה-Description באמת בקובץ ועל הקובץ הזה יש לי בעצם Audit של מי עשה ומה עשה ואני יכול לתחקר אחורה - אבל זה הכל שאלה של . . .(שמעון) Flexibility? . . .(אורי) . . . האם זה מניעה או האפשרות אחר כך לתעד . . . או Full Flexibility והיכולת לתחקר אחורה.(שמעון) Simplicity vs. Flexibility, אני קורא לזה . . .ובאמת, חד-משמעית, הלכו All-in על Flexibility - מה שמוריד את ה-Simplicity פשוט לאפס . . . ובדיוק מה שדיברנו פה - זה מביא להמון המון בעיות.אבל לפני שניכנס לבעיות ספציפיות ב-Kubernetes, אני חושב שגם מאוד חשוב לדבר על עוד Transition אחד שקרה - וזה שבהרבה ארגונים התחילו לשבור את החומה . . . אם לפני כמה שנים שברנו את החומה של של ה-QA ושל ה-Dev, אז היום שוברים את החומה של ה-Ops ושל ה-Devיש לנו את העולם של לפתח בצורה אג'יילית (Agile) ו-DevOps-ית - ובהרבה ארגונים הולכים או עשו “You built it - You run it - and you operate it” . . .“אני, ה-DevOps, לא אקום בארבע בבוקר כי אתם כתבם באגים - אתם תקומו בארבע בבוקר ואתם תסדרו את זה”עכשיו, מן הסתם יש לזה יתרונות, כי זה נותן לנו Speed of Delivery, אתה לא צריך לחכות לאף איש DevOps, אתה לא צריך להיות תלוי באף אחד, יש לך Full Ownershipמצד שני . . . (רן) זה נותן Accountability, שזה דבר מאוד חשוב . . . אתה אחראי: אתה עשית פאשלה, אז אתה גם אחראי לתקן אותה.(שמעון) חד-משמעית - אבל מצד שני . . . .(אורי) וואי, אתם מדברים מה-זה-2015 . . . (רן) כן, אנחנו רק משחזרים את ההיסטוריה . . .(שמעון) בדיוק, אנחנו קצת במסע בזמן (רן) . . . תיכף נגיע ל-2021 . . . (שמעון) . . . ומנגד, מן הסתם, וואלה - אולי אני ה- Java Payment Engineer הכי טוב שיש, אבל מה לי ול-Docker-ים? מה לי ול-Kubernetes? אני לא מומחה בדבר הזה, ופתאום אני נאלץ לבוא ולגעת בדברים האלה, שאני לא כל כך מבין בהם הרבה.וגם לא הגיוני לצפות עכשיו מכל מפתח Java - או לא יודע, כל מפתח בכלל - שיהיה Expert תשתיות של Kubernetes, זה פשוט לא . . .(אורי) או שלפחות יהיה מודע לתשתית . . . (שמעון) נכון.(רן) אוקיי, אז פה אנחנו מתחילים לראות את הבעיה, זה קצה הקרחון . . . בואו נצלול פנימה, נראה כמה קירות בקרחון הזה.(שמעון) אז בוא ניכנס יותר עמוק לעולם הבעיה ב-Kubernetes ואז נדבר על עולמות הפתרונות השונים.אז באמת קראנו מאות פוסט-מורטמים (Post-Mortem) של Outages ב-Kubernetes - ואני חייב להגיד שרוב הטעויות . . .ואגב גם ניתן לקרוא היום - למשל State of Security Report של Red Hat 2021, מראה שבאמת בכל הארגונים ה-Number 1 Concern זה Misconfigurations . . . .זה לא Security incidents ולא שיפרצו לי - זה שאנחנו נדפוק את עצמנו . . . בגדול, זה ה-Number 1 Challenge עכשיו . . .ומה שקורה זה קצת מה שדיברנו - בגלל ש-Kubernetes בנוי ל-Flexibility, אז אני יכול לעשות הכל - אבל אז אני אומר “טוב, אני אביא Container ואני אקח איזה Copy & Paste של איזה Template פשוט - ואני אריץ שם, לא יודע - RabbitMQ, וביחד עם ה-Service שלי ועוד איזשהו Redis ויריץ אותו . . .אמממה? שכחתי ש-RabbitMQ, ב-Default שלו, עושה Consume לכל ה-Memory שהוא יכול, כי הוא עושה לזה Queueing . . .ולא שמתי Memory-Limit בתוך ה-Kubernetes YAML שלי ל-Workload הזה.ועכשיו אני בבעיה, כי יש לי “Pod סורר”, שלקח עכשיו את כל ה-Memory של כל ה-Node - ועכשיו אני יכול לחוות Outage . . .עכשיו, אתה - כ”מפתח קלאסי”, נקרא לזה - אולי אתה לא חשוף לזהאולי אתה גם רגיל לעבוד בעולמות הוירטואליזציה (Virtualization) או ה-EC2, כשאתה יודע שכל דבר רץ על Instance ולכל דבר יש את ה-Boundaries שלואבל מה לעשות - פה לא . . . פה אתה יכול להריץ Kubernetes על ה-Bare-Metal אפילו, ופה ה-Docker יכול לרוץ Native . . .וזו רק דוגמא אחת - יכול להיות שלא שמתי Liveness Probe או Readiness Probe . . . והדוגמאות הן עוד מאוד ארוכות . . . (רן) עכשיו, בעיקרון הייתי יכול גם לפני זה לקחת RabbitMQ ולהתקין אותו בצורה לא נכונה . . . אבל זה היה קשה יותר. עכשיו זה ממש קל - זה להעתיק YAML ולשים אותו בפנים - וזה עובד. לפני זה הייתי צריך לעבוד קשה, ורוב הסיכויים שאם הייתי עובד קשה, אז גם הייתי מבין איך נכון לעשות את זה, ועולה מראש על הטעויות שלי.פה זה כל כך קל, שזה פשוט להעתיק YAML מ-Stack Overflow והנה - לכאורה זה עובד . . . עד אשר מגיעה השעה 3 בלילה - ואז זה מפסיק לעבוד . . . (שמעון) נכון . . . שמע, יש דברים כמו . . . דברים שקרו נגיד ל-Targetשמו CronJob וה-Restart Policy שלו, אופס . . . עשו טעות וראו שזה כל הזמן עושה Restartאממה - ה-Pod הזה, היה בו איזה Error - אז הופ! עשו Spin ל-4,500 Pod-ים עם ה-Cron הזה, שדפק להם את כל ה-Cluster.עכשיו, זו טעות קטנה, של “האם אני עושה Restart Always או Never או Kill” - אלו דברים מאוד מאוד פשוטים, לכאורה . . . .ופתאום אין לנו את ה-Guard Rails, שאולי היו לנו פעם עם ה-Ops - פתאום כל Developer שולח את זה ל-Cluster ו-טאק! אללה-באב-אללה, לך תדע מה יהיה . . .(רן) אז למעשה Kubernetes מצד אחד נתן . . . ייצר הזדמנות. עכשיו מאוד קל לפרוש דברים, מאוד קל לשבור את החומות בין Ops לבין Dev - אורי, עכשיו עברנו את 2015 אז אני מתנצל, אבל בנוסף . . . (אורי) התעוררתי . . .(רן) . . . אבל בנוסף, הוא גם נותן לנו, אולי, הזדמנות עכשיו גם לייצר רגולציה - מילה שאתה לא אוהב - או לייצר Safety, אוקיי? אבל בעצם, אם לפני זה כל אחד היה צריך לייצר את ה-Safety הזה בעצמו, ע”י Whatever-כל-מיני-כלבי-שמירה מסוגים שונים או כל מיני Script-ים כאלה שהיית כותב לעצמך, אז היום יש לך אפשרות לייצר את זה בצורה שמתאימה לכולם - זה קצת אולי מדבר עם ה-CNCF שעליו דיברת מקודם.אז מה עשה העולם, כשהוא ראה את הדברים האלה?(שמעון) שאלה מעולה . . . אז בעצם מה קרה? ניתן לראות שני סוגים של Approaches שנלקחו ע”י חברות - ה-Approach הראשון היה “טוב, זה קורה חברים, אין מה לעשות - מעכשיו כל שינוי ב-Kubernetes ב-YAML, ב-Helm Charts - מעכשיו ה-DevOps צריך לחתום על זה”, לעבור על זה . . .ואז במקום שה-DevOps יתעסקו ב-Customization, ב-Performance, בדברים שהם רוצים - הם נהיים מעיין “Human Debugger” ל-YAML, והם צריכים עכשיו - מסכנים, 20 DevOps-ים על 500 מפתחים - צריכים לעבור בראש, לעשות ביד Debugging ל-YAML-ים של מתכנתים . . .(רן) “ועדת DevOps” - הכי אוקסימורון שיש . . . “לך תעבור את הועדה” . . . (אורי) השאלה היא גם אם שליחה כזאת של קונפיגורצית Kubernetes קוראית כל הזמן? זה לא כמו כשמפתח שולח קוד . . . האם זה באמת קורה כל הזמן או שזה קורה לעיתים רחוקות ואז אולי זו לא בעיה . . . .(רן) למה שזה לא יקרה כל הזמן? אתה רוצה לשנות כמות זיכרון, אתה רוצה לשנות מספר Pod-ים, אתה כל הזמן . . .(שמעון) . . . אתה רוצה להריץ Service חדש, אולי מספר Service-ים . . .(אורי) בסדר, זה . . . כמה זה קורה עם המפתח? רוב העבודה שלו זה לפתח . . . אם פעם בכמה זמן הוא צריך לשנות את הקונפיגורציה (Configuration) של ה-Pod שלו אז בסדר . . . (שמעון) פה אני אקח אותך בחזרה ל-2015 . . . (אורי) אני רק אגיד - זה ברור שזה פחות . . . יש לו פחות עצמאות, אוקיי? אבל השאלה היא האם זה באמת קורה כל כך הרבה . . . (שמעון) שאלה טובה . . . האם זה קורה פחות מקוד רגיל? חד משמעית, אני בטוח.אבל אני אקח אותך שוב למסע בזמן ל-2015, נושא לעוס מכל כיוון - microServices vs. Monolith וכו' - אז כנראה שהאמת היא איפשהו באמצע.אבל בסוף אנחנו רואים, בארגונים שמשתמשים במערכות שלנו שאני יכול לדבר עליהם - עשרות ומאות של Service-ים . . .וגם אוהבים לעשות את ה-Separation of Concern - אומרים “אוקיי, יש לי משהו, אז במקום לדחוף אותו עכשיו לתוך אותו ה-Service ולהעמיס עליו עוד יותר - בוא נעשה Service נפרד!”אממה - בכל פעם שאתה עושה Service נפרד, יש לו קונפיגורציות משלו, דאטה משלו, אולי Databases משלו, אולי Cache-ים משלוואז ניהית לך פה “ערימה של Infrastructure” [על הדשא?], שאם פעם ב-Monolith היה לנו Monolith ענק עם שכבה דקה של Infrastructure - עכשיו יש לנו אפליקציה עם שכבה דקה של אפליקציה ומלא מלא Infrastructure מסביבה - כפול 500 כאלה . . .(רן) אני אעשה . . . דרך אגב, אורי - אני יכול לענות לך על השאלה של “עד כמה זה קורה?”: תקרא את ה-Post-Mortem-ים ותראה כמה זה קורה . . . קורה הרבה.(אורי) זה בסדר, אבל אתה יודע - תקרא את ה-Post-Mortem-ים של טעויות קוד . . . (רן) כן, אבל אתה יודע - אנחנו מנסים למצוא את היחס הנכון . . .אבל אנחנו רואים שזה קורה, זאת אומרת - זה קורה.(אורי) אני לא מתווכח עם זה שזה קורה - זה קורה, ויש תמיד את הצורך הזה בעצמאות, אבל חשוב לשים את ה . . .(שמעון) אני חושב שהשאלה, אורי, היא גם מה ה-Cost? - אם אני עשיתי עכשיו טעות והשתמשתי ב-Type הלא נכון או אני לא יודע מה יש בתוך הקוד שלי, לעומת אם אני עכשיו לא שמתי את ה-Memory Limit או את ה-Liveness Probe לא נכון, וזה יכול להשפיע לי על כל ה-Workload שלי ב-Clusterואולי זו הנקודה - להגיד מה ה-Blast Radius של טעות בקונפיגורציה של Kubernetes.(רן) אז פתרון אחד היה “וועדת ה-DevOps” הזו, שכולנו פסלנו . . . אוקיי, אילו עוד פתרונות?(שמעון) פתרון אחר זה הצד השני לחלוטין - זה “טוב, יאללה, אין מה לעשות, זה מה שיקרה” . . .(רן) “אכלתם אותה, חבר'ה . . . יש לכם Kubernetes, יש לכם YAML - תסתדרו”.(שמעון) נכון . . .(אורי) השאלה היא רק מי משלם את השיק . . .(שמעון) נכון - ואגב, אפרופו אתה אומר שיק, זה אחד ה-Use Case-ים באופן מעניין, כשאנחנו עוד מעט נגיע לדבר על הנושא של פוליסות, אבל לשים מעיין Cost Center - יש אנשים שמרימים מיליון דברים . . . וואלה, של מי זה? מה זה? מה זה עושה פה? מי ה-Owner? איזה צוות זה? מה קורה פה כאילו? [מי נתן את ההוראה?!](אורי) אני יכול להגיד שב-Outbrain, ברגע שנכנסנו ל-Kubernetes, הבנו שיש כמובן את העניין הזה של ה-Cost ופשוט פיתחנו Visibility לכל Cost - כל צוות יודע מה ה-Cost של ה-Service-ים שלו, כמה עולה לו כל Service, והם יכולים . . . פעם (Once) שאתה מודד את זה, את יכול לקחת Action.(שמעון) אבל השאלה היא איך אתה יודע שהמתכנת שהעלה את ה-Service הבא לא שכח לשים Label? . . .ואז המערכת שאוספת פשוט לא תצליח “לאסוף את ה-Owner” . . . .(אורי) אז במקום הזה, אנחנו בעצם בודדנו את המפתח מהתשתית ויש לו Pipeline שלם, והוא מבחינתו רואה “Service” . . .(רן) אז יש פה בעצם את פתרון מספר שלוש . . . דיברנו על שני פתרונות: אחד זה היה “וועדת DevOps” , שתיים זה “קחו אולר שוויצרי ותתחרעו עליו” - ושלוש זה לייצר אבסטרקציה: לא לחשוף להם את Kubernetes As-is, אלא ליצור שכבה שמפשטת את זה. [368 Kubernetes and Dyploma at outbrain](שמעון) בדיוק - ואגב, אני מאמין שבמובן מסויים, Datree איפשהו משחקת שם על ה-Abstraction Layer הזומילה שאני דווקא אוהב זה דווקא סוג של Guard Rails מסויימים - שוב פעם, במשחק שבין Flexibility ל-Simplicity.יש ארגונים שפשוט המפתח, בסופו של דבר, לא רואה שמאחורה זה Kubernetes YAML - הם עושים את ה”שמעון-YAML”, ויש להם שם ערכים מסויימים שחשופים אליהםזה Memory ו-CPU ו-Whatever-מה-שאני-רוצהאגב, יש כאלה שגם אם לא נתת את ה-YAML הפנימי שלנו - ה-Outbrain.yaml או ה-AppsFlyer.yaml - אז אתה לא יכול לעשות את ה-Deployment.ואז זה בעצם לקחו ופשוט . . . כי הרי יש אינסוף של פרמוטציות בקונפיגורציות (Permutations, Configurations) ב-Kubernetes, אז הלכו ועשו איזושהי Abstraction Layer על זה.(רן) אז במשרעת הזאת, שבין Flexibility ל-Simplicity, אז לקחו את זה לכיוון של Simplicity.(שמעון) נכון.(רן) וזה יכול לעבוד לחלק מהחברות, אבל כמו שאמרת יש לזה את החסרון של “אוקיי, לפעמים אתה רוצה לעשות משהו קצת אחר”.(שמעון) נכון(אורי) עכשיו, יש בעניין הזה שתי אפשרויות - אתה יכול לשים ממש אפליקציית ניהול מעל זה ולשים בפנים את כל ה”בדיקות נכונות” על ה-Input-ים שמגיעים מהאפליקציה, ויש פתרון של לבנות Compiler, אוקיי? . . . אתה עדיין יכול לשים Kubernetes YAML, אבל אני אריץ לך עליו . . . Debugger או . . .(שמעון) Linter אולי . . . (אורי) . . .שמפעיל את ה-Rules ש . . .(שמעון) יפה, בדיוק(רן) איזשהו מנוע . . . תיכף נגיע לזה, אני רואה שזה על קצה הלשון שלך - איזשהו מנוע-חוקים שאומר “זה בסדר - אבל זה לא בסדר”, אוקיי . . . “מותר לך להרים 50 Pod-ים, אסור לך להרים 51”“מותר לך להקצות 250Mb זכרון - אסור לך להקצות מילימטר יותר”אז איך עושים את זה?(שמעון) אז זה לוקח אותנו קדימה - דיברנו קצת על ה-Cloud Native Foundation, ויש שם הרבה פרויקטים, אני מאוד ממליץ.לאחד הפרויקטים קוראים Open Policy Agent (OPA) - הופה-היי . . . זה בעצם פרויקט שהוא, בבסיסו, ב-Core, זה Policy Agent, שאגב משתמשים בו להרבה Use-Case-יםיש חברות שמשתמשות בו בכלל ל-Authorization, כש-microService מדבר עם איזשהו microService אחרוכיש שם איזשהו User שרוצה לבצע פעולה, אז הוא פונה ל-Service ושואל אותו “האם שמעון יכול לעשות פעולת Delete - כן או לא?”ובעצם מפרידים את ה-Logic Layer של ה-Decision Making מהבחינה הזאת, לבין האפליקציה עצמה.(רן) וה-Scenario הזה לא קשור ל-Kubernetes?(שמעון) לא . . . אני נותן דוגמא אחת כדי שנבין את הזה . . .בגדול, ל-Open Policy Agent יש שתי צורות של הרצה שלו - או כ-HTTP Server שאתה פונה אליואו שאתה יכול להביא אותו כ-Library, שאתה פשוט מדבר איתו.עכשיו, ה-Interface של הדבר הזה הוא שפה דקלרטיבית (Declarative) שקוראים לה Rego, שהיא Inspired ע”י שפה בשם Datalogוה-Emphasis בשפה הזו הוא What you see is what you getזאת אומרת שאין שם איזה-שהם אובייקטים מטורפים או לולאות מטורפות או דברים דינאמיים . . .הפוך - מנסים כמה שיותר שיהיה דיסקריפטיבי (Descriptive) ו-Fair, אני אגיד שזו שפה לא-הכי-קלה שיש . . . היא מצד אחד . . . נכון, היא דיסקריפטיבית, אבל מצד שני זה קצת “מקשה על העין”.(רן) כן - ועכשיו צריך מישהו שישמור עליך מפני השפה הזאת, כי גם שם אתה יכול לעשות שגיאות . . .(שמעון) נכון . . . . אז בעצם זה לוקח אותנו למקום שאתה יכול לתת פוליסות - ולפוליסות האלה אפשר לפנות ולשאול האם אני יכול לבצע פעולה או לא לבצעהוא פשוט שואל אותי - “מה אתה אומר?”נותן לך Input של Data, נותן לך Input של Rule-ים - תבדוק לי ותגיד לי האם זה בסדר או לא בסדר.ואז, כדי להיכנס טיפה יותר עמוק, יש את Conftest ו-Gatekeeperכש-Conftest - כשמו כן הוא: Configuration Test שמבוסס על Open Policy Agentוהוא כבר נבנה ב-Use Case של לעשות בדיקות על Configuration Filesכשאחד ה-Configuration Files, מן הסתם, זה Kubernetes YAMLsואתה ממש יכול לרשום שם “תבדוק לי האם קיימים Label-ים, והאם יש Label מסוג Cost או Team? ואם לא - אז תיכשל!”ואז בעצם מה שעושים זה שבתהליך ה-CI/CD או כ-pre-commit Hook או מה שזה לא יהיה, אנחנו מריצים את הטסטים האלה.אז זה צד אחד, זה Conftestהצד השני זה Gatekeeper - כש-Gatekeeper זה אותו דבר, מריץ את ה-Rego Policies - אבל זה עובד כ-Admission Webhook Controller בתוך Kubernetes.קצת דומה למערכות הפעלה - System Callsבעצם, כשמתרחשת פעולה . . . .(רן) רגע, שנייה, בוא - אני אעצור אותך . . . יש פה הרבה מושגים ואני רוצה לפרוט אותם.אז קודם כל, ה-Conftest - זה משהו שאם אני מבין נכון ירוץ ב-CI או באיזשהו Pipeline שבו אתה . . .אוקיי - שינית קוד של Kubernetes, נגיד ששינית YAML או . . . אפילו אם יש לך איזשהו כלי, נגיד כמו מה שיש ב-Outbrain - אבל בסופו של דבר הוא שינה את הקונפיגורציה של Kubernetesקודם כל תריץ על זה Conftest - ואם הוא יגיד שזה בסדר אז תמשיך הלאהלצורך העניין - “תמיד חייב להיות Label של Cost”, שתדע לאיזו קבוצה זה שייך.ו-Admissions זה כבר סיפור אחר - זו חיה שכבר חיה בתוך Cluster של Kubernetes . . . [עמק החיות המוזרות?](שמעון) . . . כי אתה יכול לשאול אותי “שמעון - אבל מה אם יש לי בנדיטים? מה אם הם עושים kubectl - Apply בום-טראח לתוך ה-Cluster, ולא עובדים עם GitOps ולא עוברים בתוך ה-Pipeline?”אז לשם זה יש באמת את Gatekeeper, שאני לא יודע עד כמה אנחנו רוצים לחפור עמוק אבל זה שוב פעם כמו במערכות הפעלה, כשיש לך System Calls - אז זה לצורך העניין ככה ה-Antivirus עובד - יש איזשהו Executable שרוצה לרוץ, המערכת הפעלה קוראת ל-Antivirus ואומרת “אתה שומע, חביבי - זה בסדר? זה לא בסדר?”והוא הולך ומריץ בדיקה ואומר לו “בסדר” או “לא, תעשה לזה Block”.בדיוק אותו הדבר - רק על Kubernetes(רן) אוקיי, אז אפשר לחשוב על זה כמו על ה-Bouncer בכניסה למועדון לילה, שבא ומסתכל הטיפוסים - “אתה בסדר, אתה נכנס” או “אתה לא בסדר, אתה לא נכנס” - וגם זה על בסיס איזושהי Policy [יותר Profiling . . .]. גם זה, דרך אגב, OPA?(שמעון) כן, גם זה מבוסס OPA וגם זה מבוסס על פוליסות ב-Rego(רן) אוקיי, אז ההבדל המשמעותי זה ש-Conftest רץ בזמן, נקרא לזה “על יבש” - הוא רץ על הקונפיגורציה, אבל Gatekeeper רץ בזמן ה-Execution, ככה שגם אם ניסית לעקוף איכשהו את ה-Conftest אז הוא יעצור אותך בכניסה.(שמעון) מדויק(רן) אוקיי, ובאופן טיפוסי אתה רואה חברות משתמשות בשניהם, באיזשהו אופן? עם קונפיגורציה משותפת? אולי איזשהו שילוב של מה שאורי הזכיר מקודם, של אבסטרקציה מעל? מה קורה בשטח?(שמעון) אז מה שקורה בשטח זה דה-פאקטו היום ניהייה סטנדרט Across the board ה-Open Policy Agent ואני רואה הרבה חברות שמשתמשות ב-Conftest ו-Gatekeeperאבל איפה ה-Challenge?אז אתה אומר לי “שמעון, שכנעת אותי! שמעתי ברברסים, נשמע מדהים!”יאללה - הולך, נכנס, מוריד Conftest - אבל אז אתה מוצא את עצמך בוהה, מול מסך ריק - ואתה אומר : רגע, אבל אילו פוליסות אני אשים?” . . . .מה - אני אחכה ל-Outage הבא כדי לדעת אילו Policies לשים? זה אחד . . . ושתיים - אתה אומר “יש לי Git Repositories 500, אז מה - עכשיו אני אעשה 500 Commit-ים ל-Conftest הזה?”“ועכשיו אני רוצה לשנות Policy - אז מה, אני אעשה 500 Pull-Request-ים?”אז פתאום אתה אומר “רגע - אני צריך איזושהי דרך Central-יסטית לנהל את זה”ואז מה שקורה זה שהיום, בעיקר ארגונים, הולכים ובונים את ה-Layer הזה, של(א) להבין איזה חוקים לשיםו-(ב) של צורה לשלוט בזה . . . בצורה מבוזרת - ובונים את זה בעצמם.שזה גם מביא איזשהו Dashboard שמראה אילו Locations יש, מה בעצם קרה, מה רץ איפה ולמה . . .שאגב - זה בדיוק Datree . . . בנינו בדיוק את מה שהרבה חברות בונות - פשוט אנחנו נותנים את כ-Service.(רן) אז Datree . . . דרך אגב - מאיפה השם? עוד לא דיברנו על זה . . . אבל Datree זה ”כלי שבא ועוזר לך לנהל את ה-OPA שלך” [אחד ה-One-liners אם לא ה-] - זאת אומרת, יש לך פוליסה, אתה מבין בגדול מה אתה רוצה לעשות - אבל עכשיו לך תכיל את זה על 500 Repositories וכו' - וזה מה ש-Datree עושה.אז מאיפה השם?(שמעון) אז זה כמו Data-Tree . . . למפות, אתה יודע . . .בגדול, אנחנו Abstraction Layer מעל הדבר הזה - אתה עושה brew install datree ויש לך Datreeאתה לא צריך להתעסק עם שום “OPA-ות שמופות” . . .אין לך כלום מאחורה.ודבר ראשון מגיע לך, Built-in, שלושים חוקים, שהם, נקרא לזה “נכתבו בדם”מתוכם 21 מופעלים ב-Default ו-9 כבויים.ואתה יכול להריץ את זה על Helm Chart-ים שלך או על Kubernetes YAML-יםלאחר מכן, אתה יכול להיכנס ל-Dashboard שלנו ולהפעיל או לכבות כל Policy - ולהתחיל ליצור פוליסות שונות.כי אתה יכול לשאול אותי “שמעון! אז הדלקתי פה, שמתי Memory Limit 4Gb - ועכשיו באו אלי הצוות של ה-AI ואמרו לי ‘מה זה 4Gb? אני לא מתעורר ב-4Gb! תן לי 50Gb! . . . “רגע - צוות כזה צריך ככה וצוות כזה צריך ככה . . . אתה פתאום צריך הרבה חוקים והרבה פרמוטציות של הרבה Policies . . .זה פתאום מתחיל להסתעף . . .ואתה אומר “מה עם ה-Global Policies, שאני מכיל אותן לכל הארגון? אולי אני חייב Liveness ו-Readiness פה בכל Workload?”אבל עכשיו, הדברים שהם קונפיגורביליים (Configurable) ספציפית פר-Service - הם צריכים לדעת את ה-Context של ה-Service הספציפי.ובעצם, כאן אנחנו נכנסים - אנחנו נותנים לך, במקום אחד, לבוא ולהגדיר את הדברים האלהכמובן שאנחנו גם תומכים בלכתוב Custom Rulesאגב, אנחנו תומכים בלכתוב Custom Rules גם לא ב-OPA - כי דיברנו עם הרבה מהלקוחות שלנו והם אמרו “שמע, זה פשוט מסובך . . .כאילו, תן לי, אתה יודע, תן לי לעשות איזה YAML פשוט . . .”אז לקחנו את JSON Schema - זה RFC Specification, שיש לו מימושים ב-JSON-ים וב-YAML-ים שונים, ובעצם אתה יכול להגדיר חוקים בצורה פשוטה וקלה, ואנחנו מורידים ממך את כל העול הזה.(רן) איפה, לתפיסתך צריך לעבור הגבול - אם צריך לעבור גבול, לצורך העניין - בין ה-Operations לבין הפיתוח? ואני מצטער שאני חוזר ל-2015, אבל בואו נעשה את השאלה הזו רלוונטית . . .נגיד, לדוגמא - האם ה-Operations הם אלו שצריכים להיות אחראים על ה-Policies והפיתוח הם אלה שלוקחים את זה משם? האם ה-Operations הם אלו שצריכים להיות אחראים לא רק על ה-Policies אלה גם על ה-Cluster עצמו ועל כל מיני . . . ועל הגדרות גנריות, ולתת למפתחים רק כמה Call-Back-ים מסויימים, לצורך העניין “את מספר הפורט אתה תקבע, אני אקבע את כל השאר” או “את כמות הזכרון אתה תקבע, את כל השאר אני אקבע” . . . זאת אומרת - איפה לדעתך צריך באמת להיות הגבול בין תחום האחריות של צוות ה-Operations לבין צוות הפיתוח?(שמעון) שאלה מעולה - אני חושב שאני אתחיל בתשובה של “כמובן שזה תלוי . . .”, אני לא כזה מבוגר אבל אני קצת פחות ילד ויודע שהתשובה היא תמיד, מה שנקרא “זה תלוי” . . .אני יכול להגיד מה אני רואה בארגונים - אז דבר ראשון, אני רואה בבירור מובהק שכל ארגון, יש בו את ה-DevOps, יש בו סט אחד של פוליסות שהוא כאילו Company-wide Policiesשאומר “תשמע, בלי Health Checks ב-Container, בלי Liveness Probe, בלי Readiness Probe - חברים, אי אפשר”.“בלי Cost Center Label אני לא יכול” . . .“אני לא יכול” - וזה ברמות של, כאילו . . . ב-Iron Source היה לנו את “ה-Iron Hunter” שהיה הולך ומוצא כל Resource שלא היה מתוייג ופשוט הורג אותו . . . .אמרנו “חברים, הולך להיות פה Hunter - אנחנו מודיעים על זה מראש, יש לכם שלושה חודשים . . .”(רן) “אל תתפשו בלי . . . “(שמעון) לא, אמיתי - זה היה הורג Resource-ים . . . תקשיב, זה משוגע לחלוטין, אבל כאילו הגענו למצב שזה פשוט לא היה . . . לא הייתה ברירה.אז זה דבר ראשון.דבר שני - עכשיו אני מתחיל להיכנס פנימה, ואני חושב שזה גם דיבור שהוא מאוד כזה Cultural - האם יש לך צוות Cental-יסטי אחד וכל השאר מתכנתים? האם יש לך DevOps Ambassadors בתוך כל צוות, שיכולים לקחת את ה-Ownership הזה בשביל הצוות ולעבוד עם ה-DevOps?זה מתחיל להיות כאילו . . . זה מאוד נוגע ב-Culture של הארגון, וזה לא גרידא-טכנולוגי.(רן) מעולה, אוקיי . . .(אורי) אני רואה את זה . . . בכל דבר אתה יכול להגיד “בוא נעביר למפתח גם אחריות על KPI כזה ו-KPI אחר”, אתה יודע . . . מפתחים מאוד אוהבים את ה-Craftsmanship שיש להם, אוהבים את היעילות. אתה יכול להגיד . . . ויש כאלה שמניעים את ה-Business KPIs והם אוהבים את זה וסבבה.זו הרבה שאלה של איפה את שם Cost, אוקיי? ו-Cost אל מול Revenue - לפעמים שווה לי שמשהו יהיה יקר אבל הוא מזיז קדימה Business KPIs בצורה הרבה יותר גדולה.לפעמים צריך איזו “יד כללית” כזאת, שתסתכל על הכל ברמה קצת יותר . . . “אוקיי, יש לנו סוג של מומחיות בתשתית, יש לנו מומחיות של Cost Optimization בתשתית - תנו לנו לעזור לכם”.(שמעון) חד-משמעית . . . אני יודע גם שאתה נותן כאן הרבה את הדוגמאות של Cost, אני עבדתי גם בחברה שהיא Ad-Tech, אני מבין מאוד את העולם של ה-Cost.אני חושב שה-Ownership על ה-Cluster צריך להיות על ה-Bare-Metal של ה-Cluster, או נקרא לזה על הקונפיגורציות שלו - כן צריך להיות על ידי צוות מרכזי.והנושא הכמה-שיותר-אפליקטיבי צריך להיות בצד של המפתחים - ואז יש איזשהו “את האמת באמצע”.להגדיר בכמה אחוז CPU Memory זה עושה Scale-up לעוד Pod-ים? וואלה, זו גם שאלה אפליקטיבית . . .זו שאלה מאוד מאוד אפליקטבית, ולא משנה כמה תשתיות יגידו “לא, אתם מבזבזים פה Resource-ים!”יכול להיות שאם אני מרים אותו ב-90% CPU אז הוא “נחנק” ולא מצליח לעשות Serving ולא - אני צריך להרים עוד ב-70%.אז צריך להבין פה את ה-Balance משני הצדדיםאני כן מסכים, במובן מסויים, עם “היד הכללית המכוונת” - זה כמו שאולי לא צריך ללכת overboard על “You Own it, You Run it- ולכו תסתדרו לבד”, ואולי מהצד השני לא צריך לנעול הכל, ואתה עכשיו פה “Code Monkey”, מחליף מכפתור אדום לירוק.אבל אם . . . אני כן מאמין וכן חושב שזה היה שינוי ממש טוב שיש Ownership לצוותים אפליקטיביים - שהם קמים בלילה, שהם חושבים על “איך אני עושה לזה Scale, איך אני עושה לזה Load Balancing”, והם לא תקועים רק בחלל האפליקטיבי הזה שלהם.מה שאני שומע ממתכנתים זה שהם אומרים “תשמע, ה-Guard Rails האלה ממש עוזרים לי”כי אני יודע, תחשוב על זה משני הצדדים - ה-DevOps משקשקים כי הם אומרים “מכניסים פה שינויים לתשתית שהם לא יודעים מה הם עושים!” וה-Developers משקשקים כי הם אומרים “אני עושה פה שינויים וגם אני לא יודע בדיוק מה אני עושה” - ושני הצדדים כאילו רועדים בשתי הפינותודווקא לבוא ולתת כלים שאומרים “רגע, חברים - בואו, נעשה את הבדיקות, נעזור לכם, ניתן לכם איזשהם Guard Rails שיחזיקו לכם את היד משני הצדדים” יכולים להעלות את ה-Trust משני הכיוונים.(אורי) ואתה יודע - בדרך כלל, ה-Guard Rails השאלה מכסים 90-95% מהצרכים של ה-Developers, שנמצאים בתוך ה-Guard Rails.ויקרה, כן - פעם בכמה זמן, כשמישהו צריך זה . . . אז הוא ידבר עם איש DevOps שיפתח לו את מה שצריך - ולא קרה כלום.(שמעון) נכון . . .(רן) מה - הם מדברים?!(אורי) לפעמים . . . ב-Guard Rails.(רן) טוב, שמעון - היה מרתק. יש עוד נושאים שלא כיסינו שהיית רוצה לכסות לפני שאנחנו מסיימים?(שמעון) לא . . . אני חושב שעשינו כיסוי טוב.(רן) מעולה! מגייסים?(שמעון) כמובן! אנחנו מגייסים, גם למשרות טכניות בכל התפקידים וגם למשרות Go-to-Market.(רן) איך נראה ה-Stack הטכנולוגי שלכם? זאת אומרת, חוץ מ-Kubernetes שהזכרת, מה עוד קורה שם?(שמעון) אז ה-CLI ב-Go, ה-Backend שלנו זה TypeScript ו-Next.js, הכל על AWS בעיקר, יש גם קצת Azureיש Container-ים, הכל CI/CD ו-Deployment ל-Production אוטומטי.מאוד כיף! Building by Developers, for Developers - אבל For Real . . .כאילו - אנשים אשכרה באים להתראיין אצלנו וזה “אחי - לך ל-GitHub, לך תתקין את זה” . . . מבחינתי זה אחד הדברים שאותי, כאילו, מדליקים.(רן) ואתם בתל אביב . . . אי שם . . .(שמעון) כן . . .(אורי) לא נורא . . .(רן) שיהיה בהצלחה, תודה שבאת.(שמעון) איזה כיף - תודה רבה!(רן) להתראות האזנה נעימה ותודה רבה לעופר פורר על התמלול!
How has the cloud changed data analytics? Carl and Richard chat with Vishwas Lele about his latest work taking a developer's view of data analytics - without upsetting the DBAs too much! Vishwas talks about how the cloud has changed bringing disparate data sources together for analytics. With the cloud's compute-on-demand, you don't need to do many transformations of data as it's loaded - but you can test it! This leads to a conversation of how CI/CD techniques can be applied to data to make for accurate data analytics - make your ingestion pipeline smart!
This week in the AppSec News: Malware in the UAParser.js npm package, security vuln in Squirrel scripting language, a blueprint for securing software development, L0phtCrack now open source, appsec videos on Android exploitation, macOS security, & more! Visit https://www.securityweekly.com/asw for all the latest episodes! Show Notes: https://securityweekly.com/asw171
About JasonJason is now the Managing Director at Redpoint Ventures.Links: GitHub: https://github.com/ @jasoncwarner: https://twitter.com/jasoncwarner GitHub: https://github.com/jasoncwarner Jasoncwarner/ama: https://github.com/jasoncwarner/ama TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by Honeycomb. When production is running slow, it's hard to know where problems originate: is it your application code, users, or the underlying systems? I've got five bucks on DNS, personally. Why scroll through endless dashboards, while dealing with alert floods, going from tool to tool to tool that you employ, guessing at which puzzle pieces matter? Context switching and tool sprawl are slowly killing both your team and your business. You should care more about one of those than the other, which one is up to you. Drop the separate pillars and enter a world of getting one unified understanding of the one thing driving your business: production. With Honeycomb, you guess less and know more. Try it for free at Honeycomb.io/screaminginthecloud. Observability, it's more than just hipster monitoring.Corey: This episode is sponsored in part by Liquibase. If you're anything like me, you've screwed up the database part of a deployment so severely that you've been banned from touching every anything that remotely sounds like SQL, at at least three different companies. We've mostly got code deployments solved for, but when it comes to databases we basically rely on desperate hope, with a roll back plan of keeping our resumes up to date. It doesn't have to be that way. Meet Liquibase. It is both an open source project and a commercial offering. Liquibase lets you track, modify, and automate database schema changes across almost any database, with guardrails to ensure you'll still have a company left after you deploy the change. No matter where your database lives, Liquibase can help you solve your database deployment issues. Check them out today at liquibase.com. Offer does not apply to Route 53.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by Jason Warner, the Chief Technology Officer at GifHub, although he pronounces it differently. Jason, welcome to the show.Jason: Thanks, Corey. Good to be here.Corey: So, GitHub—as you insist on pronouncing it—is one of those companies that's been around for a long time. In fact, I went to a training conducted by one of your early folks, Scott Chacon, who taught how Git works over the course of a couple of days, and honestly, I left more confused than I did when I entered. It's like, “Oh, this is super awful. Good thing I'll never need to know this because I'm not really a developer.” And I'm still not really a developer and I still don't really know how Git works, but here we are.And it's now over a decade later; you folks have been acquired by Microsoft, and you are sort of the one-stop-shop, from the de facto perspective of, “I'm going to go share some code with people on the internet. I'll use GitHub to do it.” Because, you know, copying and pasting and emailing Microsoft Word documents around isn't ideal.Jason: That is right. And I think that a bunch of things that you mentioned there, played into, you know, GitHub's early and sustained success. But my God, do you remember the old days when people had to email tar files around or drop them in weird spots?Corey: What the hell do you mean, by, “Old days?” It still blows my mind that the Linux kernel is managed by—they use Git, obviously. Linus Torvalds did write Git once upon a time—and it has the user interface you would expect for that. And the way that they collaborate is not through GitHub or anything like that. No, they use Git to generate patches, which they then email to the mailing list. Which sounds like I'm making it up, like, “Oh, well, yeah, tell another one, but maybe involve a fax machine this time.” But no, that is actually what they do.Jason: It blew my mind when I saw that, too, by the way. And you realize, too, that workflows are workflows, and people will build interesting workflows to solve their use case. Now, obviously, anyone that you would be talking to in 2021, if you walked in and said, “Yeah, install Git. Let's set up an email server and start mailing patches to each other and we're going to do it this way.” They would just kind of politely—or maybe impolitely—show you out of the room, and rightfully [laugh] so. But it works for one of the most important software projects in history: Linux.Corey: Yeah, and it works almost in spite of itself to some extent. You've come a long way as a company because initially, it was, “Oh, there's this amazing, decentralized version control system. How do we make it better? I know, we're going to take off the decentralized part of it and give it a central point that everything can go through.” And collaboratively, it works well, but I think that viewing GitHub as a system that is used to sell free Git repositories to people is rather dramatically missing the point. It feels like it's grown significantly beyond just code repository hosting. Tell me more about that.Jason: Absolutely. I remember talking to a bunch of folks right around when I was joining GitHub, and you know, there was still talk about GitHub as, you know, GitHub for lawyers, or GitHub for doctors, or what could you do in a different way? And you know, social coding as an aspect, and maybe turning into a social network with a resume. And all those things are true to a percentage standpoint. But what GitHub should be in the world is the world's most important software development platform, end-to-end software development platform.We obviously have grown a bunch since me joining in that way which we launched dependency management packages, Actions with built-in CI, we've got some deployment mechanisms, we got advanced security underneath it, we've Codespaces in beta and alpha on top of it now. But if you think about GitHub as, join, share, and see other people's code, that's evolution one. If you see it as world's largest, maybe most developed software development platform, that's evolution two, and in my mind, its natural place where it should be, given what it has done already in the world, is become the world's most important software company. I don't mean the most profitable. I just mean the most important.Corey: I would agree. I had a blog post that went up somewhat recently about the future of cloud being Microsoft's to lose. And it's not because Azure is the best cloud platform out there, with respect, and I don't need you to argue the point. It is very clearly not. It is not like other clouds, but I can see a path to where it could become far better than it is.But if I'm out there and I'm just learning how to write code—because I make terrible life choices—and I go to a boot camp or I follow a tutorial online or I take a course somewhere, I'm going to be writing code probably using VS Code, the open-source editor that you folks launched after the acquisition. And it was pretty clear that Atom wasn't quite where the world was going. Great. Then I'm going to host it on GitHub, which is a natural evolution. Then you take a look at things like GitHub Actions that build in CI/CD pipelines natively.All that's missing is a ‘Deploy to Azure' button that is the next logical step, and you're mostly there for an awful lot of use cases. But you can't add that button until Azure itself gets better. Done right, this has the potential to leave, effectively, every other cloud provider in the dust because no one can touch this.Jason: One hundred percent. I mean, the obvious thing that any other cloud should be looking at with us—or should have been before the acquisition, looking at us was, “Oh, no, they could jump over us. They could stop our funnel.” And I used internal metrics when I was talking to them about partnership that led to the sale, which was I showed them more about their running business than they knew about themselves. I can tell them where they were stacked-ranked against each other, based on the ingress and egress of all the data on GitHub, you know, and various reactions to that in those meetings was pretty astounding.And just with that data alone, it should tell you what GitHub would be capable of and what Azure would be capable of in the combination of those two things. I mean, you did mention the ‘Deploy to Azure' button; this has been a topic, obviously, pre and post-acquisition, which is, “When is that coming?” And it was the one hard rule I set during the acquisition was, there will be no ‘Deploy to Azure' button. Azure has to earn the right to get things deployed to, in my opinion. And I think that goes to what you're saying is, if we put a ‘Deploy to Azure' button on top of this and Azure is not ready for that, or is going to fail, ultimately, that looks bad for all of us. But if it earned the right and it gets better, and it becomes one of those, then, you know, people will choose it, and that is, to me, what we're after.Corey: You have to choose the moment because if you do it too soon, you'll set the entire initiative back five years. Do it too late, and you get leapfrogged. There's a golden window somewhere and finding it is going to be hard. And I think it's pretty clear that the other hyperscalers in this space are learning, or have learned, that the next 10 years of cloud or 15 years of cloud or whatever they want to call it, and the new customers that are going to come are not the same as the customers that have built the first half of the business. And they're trying to wrap their heads around that because a lot of where the growth is going to come from is established blue chips that are used to thinking in very enterprise terms.And people think I'm making fun of them when I say this, but Microsoft has 40 years' experience apologizing to enterprises for computer failures. And that is fundamentally what cloud is. It's about talking computers to business executives because as much as we talk about builders, that is not the person at an established company with an existing IT estate, who gets to determine where $50 million a year in cloud-spend is going to go.Jason: It's [laugh] very, [laugh] very true. I mean, we've entered a different spot with cloud computing in the bell curve of adoption, and if you think that they will choose the best technology every time, well, history of computing is littered with better technologies that have failed because the distribution was better on one side. As you mentioned, Microsoft has 40 years, and I wager that Microsoft has the best sales organizations and the best enterprise accounts and, you know, all that sort of stuff, blah, blah, blah, on that side of the world than anyone in the industry. They can sell to enterprises better than almost anyone in the industry. And the other hyperscalers—there's a reason why [TK 00:08:34] is running Google Cloud right now. And Amazon, classically, has been very, very bad assigned to the enterprises. They just happened to be the first mover.Corey: In the early days, it was easy. You'd have an Amazon salesperson roll up to a company, and the exec would say, “Great, why should we consider running things on AWS?” And the answer was, “Oh, I'm sorry, wrong conversation. Right now you have 80 different accounts scattered throughout your org. I'm just here to help you unify them, get some visibility into it, and possibly give you a discount along the way.” And it was a different conversation. Shadow IT was the sole driver of cloud adoption for a long time. That is no longer true. It has to go in the front door, and that is a fundamental shift in how you go to market.Jason: One hundred percent true, and it's why I think that Microsoft has been so successful with Azure, in the last, let's call it five years in that, is that the early adopters in the second wave are doing that; they're all enterprise IT, enterprise dev shops who are buying from the top down. Now, there is still the bottoms-up adoption that going to be happening, and obviously, bottom-up adoption will happen still going forward, but we've entered the phase where that's not the primary or sole mechanism I should say. The sole mechanism of buying in. We have tops-down selling still—or now.Corey: When Microsoft announced it was acquiring GitHub, there was a universal reaction of, “Oh, shit.” Because it's Microsoft; of course they're going to ruin GitHub. Is there a second option? No, unless they find a way to ruin it twice. And none of it came to pass.It is uniformly excellent, and there's a strong argument that could be made by folks who are unaware of what happened—I'm one of them, so maybe I'm right, maybe I'm wrong—that GitHub had a positive effect on Microsoft more than Microsoft had an effect on GitHub. I don't know if that's true or not, but I could believe it based upon what I've seen.Jason: Obviously, the skepticism was well deserved at the time of acquisition, let's just be honest with it, particularly given what Microsoft's history had been for about 15—well, 20 years before, previous to Satya joining. And I was one of those people in the late '90s who would write ‘M$' in various forums. I was 18 or 19 years old, and just got into—Corey: Oh, hating Microsoft was my entire personality.Jason: [laugh]. And it was, honestly, well-deserved, right? Like, they had anti-competitive practices and they did some nefarious things. And you know, I talked about Bill Gates as an example. Bill Gates is, I mean, I don't actually know how old he is, but I'm going to guess he's late '50s, early '60s, but he's basically in the redemption phase of his life for his early years.And Microsoft is making up for Ballmer years, and later Gates years, and things of that nature. So, it was well-deserved skepticism, and particularly for a mid-career to older-career crowd who have really grown to hate Microsoft over that time. But what I would say is, obviously, it's different under Satya, and Scott, and Amy Hood, and people like that. And all we really telling people is give us a chance on this one. And I mean, all of us. The people who were running GitHub at the time, including myself and, you know, let Scott and Satya prove that they are who they say they are.Corey: It's one of those things where there's nothing you could have said that would have changed the opinion of the world. It was, just wait and see. And I think we have. It's now, I daresay, gotten to a point where Microsoft announces that they're acquiring some other beloved company, then people, I think, would extend a lot more credit than they did back then.Jason: I have to give Microsoft a ton of credit, too, on this one for the way in which they handled acquisitions, like us and others. And the reason why I think it's been so successful is also the reason why I think so many others die post-acquisition, which is that Microsoft has basically—I'll say this, and I know I won't get fired because it feels like it's true. Microsoft is essentially a PE holding company at this point. It is acquired a whole bunch of companies and lets them run independent. You know, we got LinkedIn, you got Minecraft, Xbox is its own division, but it's effectively its own company inside of it.Azure is run that way. GitHub's got a CEO still. I call it the archipelago model. Microsoft's the landmass underneath the water that binds them all, and finance, and HR, and a couple of other things, but for the most part, we manifest our own product roadmap still. We're not told what to go do. And I think that's why it's successful. If we're going to functionally integrate GitHub into Microsoft, it would have died very quickly.Corey: You clearly don't mix the streams. I mean, your gaming division writes a lot of interesting games and a lot of interesting gaming platforms. And, like, one of the most popularly played puzzle games in the world is a Microsoft property, and that is, of course, logging into a Microsoft account correctly. And I keep waiting for that to bleed into GitHub, but it doesn't. GitHub is a terrific SAML provider, it is stupidly easy to log in, it's great.And at some level, I wish that would bleed into other aspects, but you can't have everything. Tell me what it's like to go through an acquisition from a C-level position. Because having been through an acquisition before, the process looks a lot like a surprise all-hands meeting one day after the markets close and, “Listen up, idiots.” And [laugh] there we go. I have to imagine with someone in your position, it's a slightly different experience.Jason: It's definitely very different for all C-levels. And then myself in particular, as the primary driver of the acquisition, obviously, I had very privy inside knowledge. And so, from my position, I knew what was happening the entire time as the primary driver from the inside. But even so, it's still disconcerting to a degree because, in many ways, you don't think you're going to be able to pull it off. Like, you know, I remember the months, and the nights, and the weekends, and the weekend nights, and all the weeks I spent on the road trying to get all the puzzle pieces lined up for the Googles, or the Microsofts, or the eventually AWSs, the VMwares, the IBMs of the world to take seriously, just from a product perspective, which I knew would lead to, obviously, acquisition conversations.And then, once you get the call from the board that says, “It's done. We signed the letter of intent,” you basically are like, “Oh. Oh, crap. Okay, hang on a second. I actually didn't—I don't actually believe in my heart of hearts that I was going to actually be able to pull that off.” And so now, you probably didn't plan out—or at least I didn't. I was like, “Shit if we actually pulled this off what comes next?” And I didn't have that what comes next, which is odd for me. I usually have some sort of a loose plan in place. I just didn't. I wasn't really ready for that.Corey: It's got to be a weird discussion, too, when you start looking at shopping a company around to be sold, especially one at the scale of GitHub because you're at such a high level of visibility in the entire environment, where—it's the idea of would anyone even want to buy us? And then, duh, of course they would. And you look the hyperscalers, for example. You have, well, you could sell it to Amazon and they could pull another Cloud9, where they shove it behind the IAM login process, fail to update the thing meaningfully over a period of years, to a point where even now, a significant portion of the audience listening to this is going to wonder if it's a service I just made up; it sounds like something they might have done, but Cloud9 sounds way too inspired for an AWS service name, so maybe not. And—which it is real. You could go sell to Google, which is going to be awesome until some executive changes roles, and then it's going to be deprecated in short order.Or then there's Microsoft, which is the wild card. It's, well, it's Microsoft. I mean, people aren't really excited about it, but okay. And I don't think that's true anymore at all. And maybe I'm not being fair to all the hyperscalers there. I mean, I'm basically insulting everyone, which is kind of my shtick, but it really does seem that Microsoft was far and away the best acquirer possible because it has been transformative. My question—if you can answer it—is, how the hell did you see that beforehand? It's only obvious—even knowing what I know now—in hindsight.Jason: So, Microsoft was a target for me going into it, and the reason why was I thought that they were in the best overall position. There was enough humility on one side, enough hubris on another, enough market awareness, probably, organizational awareness to, kind of, pull it off. There's too much hubris on one side of the fence with some of the other acquirers, and they would try to hug us too deeply, or integrate us too quickly, or things of that nature. And I think it just takes a deep understanding of who the players are and who the egos involved are. And I think egos has actually played more into acquisitions than people will ever admit.What I saw was, based on the initial partnership conversations, we were developing something that we never launched before GitHub Actions called GitHub Launch. The primary reason we were building that was GitHub launches a five, six-year journey, and it's got many, many different phases, which will keep launching over the next couple of years. The first one we never brought to market was a partnership between all of the clouds. And it served a specific purpose. One, it allowed me to get into the room with the highest level executive at every one of those companies.Two allow me to have a deep economic conversation with them at a partnership level. And three, it allowed me to show those executives that we knew what GitHub's value was in the world, and really flip the tables around and say, “We know what we're worth. We know what our value is in the world. We know where we sit from a product influence perspective. If you want to be part of this, we'll allow it.” Not, “Please come work with us.” It was more of a, “We'll allow you to be part of this conversation.”And I wanted to see how people reacted to that. You know how Amazon reacted that told me a lot about how they view the world, and how Google reacted to that showed me exactly where they viewed it. And I remember walking out of the Google conversation, feeling a very specific way based upon the reaction. And you know, when I talked to Microsoft, got a very different feel and it, kind of, confirmed a couple of things. And then when I had my very first conversation with Nat, who have known for a while before that, I realized, like, yep, okay, this is the one. Drive hard at this.Corey: If you could do it all again, would you change anything meaningful about how you approached it?Jason: You know, I think I got very lucky doing a couple of things. I was very intentional aspects of—you know, I tried to serendipitously show up, where Diane Greene was at one point, or a serendipitously show up where Satya or Scott Guthrie was, and obviously, that was all intentional. But I never sold a company like this before. The partnership and the product that we were building was obviously very intentional. I think if I were to go through the sale, again, I would probably have tried to orchestrate at least one more year independent.And it's not—for no other reason alone than what we were building was very special. And the world sees it now, but I wish that the people who built it inside GitHub got full credit for it. And I think that part of that credit gets diffused to saying, “Microsoft fixed GitHub,” and I want the people inside GitHub to have gotten a lot more of that credit. Microsoft obviously made us much better, but that was not specific to Microsoft because we're run independent; it was bringing Nat in and helping us that got a lot of that stuff done. Nat did a great job at those things. But a lot of that was already in play with some incredible engineers, product people, and in particular our sales team and finance team inside of GitHub already.Corey: When you take a look across the landscape of the fact that GitHub has become for a certain subset of relatively sad types of which I'm definitely one a household name, what do you think the biggest misconception about the company is?Jason: I still think the biggest misconception of us is that we're a code host. Every time I talk to the RedMonk folks, they get what we're building and what we're trying to be in the world, but people still think of us as SourceForge-plus-plus in many ways. And obviously, that may have been our past, but that's definitely not where we are now and, for certain, obviously, not our future. So, I think that's one. I do think that people still, to this day, think of GitLab as one of our main competitors, and I never have ever saw GitLab as a competitor.I think it just has an unfortunate naming convention, as well as, you know, PRs, and MRs, and Git and all that sort of stuff. But we take very different views of the world in how we're approaching things. And then maybe the last thing would be that what we're doing at the scale that we're doing it as is kind of easy. When I think that—you know, when you're serving almost every developer in the world at this point at the scale at which we're doing it, we've got some scale issues that people just probably will never thankfully encounter for themselves.Corey: Well, everyone on Hacker News believes that they will, as soon as they put up their hello world blog, so Kubernetes is the only way to do anything now. So, I'm told.Jason: It's quite interesting because I think that everything breaks at scale, as we all know about from the [hyperclouds 00:20:54]. As we've learned, things are breaking every day. And I think that when you get advice, either operational, technical, or managerial advice from people who are running 10 person, 50 person companies, or X-size sophisticated systems, it doesn't apply. But for whatever reason, I don't know why, but people feel inclined to give that feedback to engineers at GitHub directly, saying, “If you just…” and in many [laugh] ways, you're just like, “Well, I think that we'll have that conversation at some point, you know, but we got a 100-plus-million repos and 65 million developers using us on a daily basis.” It's a very different world.Corey: This episode is sponsored by our friends at Oracle HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the worlds most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don't ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.Corey: One of the things that I really appreciate personally because, you know, when you see something that company does, it's nice to just thank people from time to time, so I'm inviting the entire company on the podcast one by one, at some point, to wind up thanking them all individually for it, but Codespaces is one of those things that I think is transformative for me. Back in the before times, and ideally the after times, whenever I travel the only computer I brought with me for a few years now has been an iPad or an iPad Pro. And trying to get an editor on that thing that works reasonably well has been like pulling teeth, my default answer has just been to remote into an EC2 instance and use vim like I have for the last 20 years. But Code is really winning me over. Having to play with code-server and other things like that for a while was obnoxious, fraught, and difficult.And finally, we got to a point where Codespaces was launched, and oh, it works on an iPad. This is actually really slick. I like this. And it was the thing that I was looking for but was trying to have to monkey patch together myself from components. And that's transformative.It feels like we're going back in many ways—at least in my model—to the days of thin clients where all the heavy lifting was done centrally on big computers, and the things that sat on people's desks were mostly just, effectively, relatively simple keyboard, mouse, screen. Things go back and forth and I'm sure we'll have super powerful things in our pockets again soon, but I like the interaction model; it solves for an awful lot of problems and that's one of the things that, at least from my perspective, that the world may not have fully wrapped it head around yet.Jason: Great observation. Before the acquisition, we were experimenting with a couple of different editors, that we wanted to do online editors. And same thing; we were experimenting with some Action CI stuff, and it just didn't make sense for us to build it; it would have been too hard, there have been too many moving parts, and then post-acquisition, we really love what the VS Code team was building over there, and you could see it; it was just going to work. And we had this one person, well, not one person. There was a bunch of people inside of GitHub that do this, but this one person at the highest level who's just obsessed with make this work on my iPad.He's the head of product design, his name's Max, he's an ex-Heroku person as well, and he was just obsessed with it. And he said, “If it works on my iPad, it's got a chance to succeed. If it doesn't work on my iPad, I'm never going to use this thing.” And the first time we booted up Codespaces—or he booted it up on the weekend, working on it. Came back and just, “Yep. This is going to be the one. Now, we got to work on those, the sanding the stones and those fine edges and stuff.”But it really does unlock a lot for us because, you know, again, we want to become the software developer platform for everyone in the world, you got to go end-to-end, and you got to have an opinion on certain things, and you got to enable certain functionality. You mentioned Cloud9 before with Amazon. It was one of the most confounding acquisitions I've ever seen. When they bought it I was at Heroku and I thought, I thought at that moment that Amazon was going to own the next 50 years of development because I thought they saw the same thing a lot of us at Heroku saw, and with the Cloud9 acquisition, what they were going to do was just going to stomp on all of us in the space. And then when it didn't happen, we just thought maybe, you know, okay, maybe something else changed. Maybe we were wrong about that assumption, too. But I think that we're on to it still. I think that it just has to do with the way you approach it and, you know, how you design it.Corey: Sorry, you just said something that took me aback for a second. Wait, you mean software can be designed? It's not this emergent property of people building thing on top of thing? There's actually a grand plan behind all these things? I've only half kidding, on some level, where if you take a look at any modern software product that is deployed into the world, it seems impossible for even small aspects of it to have been part of the initial founding design. But as a counterargument, it would almost have to be for a lot of these things. How do you square that circle?Jason: I think you have to, just like anything on spectrums and timelines, you have to flex at various times for various things. So, if you think about it from a very, very simple construct of time, you just have to think of time horizons. So, I have an opinion about what GitHub should look like in 10 years—vaguely—in five years much more firmly, and then very, very concretely, for the next year, as an example. So, a lot of the features you might see might be more emergent, but a lot of long-term work togetherness has to be loosely tied together with some string. Now, that string will be tightened over time, but it loosely has to see its way through.And the way I describe this to folks is that you don't wake up one day and say, “I'm going on vacation,” and literally just throw a finger on the map. You have to have some sort of vague idea, like, “Hey, I want to have a beach vacation,” or, “I want to have an adventure vacation.” And then you can kind of pick a destination and say, “I'm going to Hawaii,” or, “I'm going to San Diego.” And if you're standing on the East Coast knowing you're going to San Diego, you basically know that you have to just start marching west, or driving west, or whatever. And now, you don't have to have the route mapped out just yet, but you know that hey, if I'm going due southeast, I'm off course, so how do I reorient to make sure I'm still going in the right direction?That's basically what I think about as high-level, as scale design. And it's not unfair to say that a lot of the stuff is not designed today. Amazon is very famous for not designing anything; they design a singular service. But there's no cohesiveness to what Amazon—or AWS specifically, I should say, in this case—has put out there. And maybe that's not what their strategy is. I don't know the internal workings of them, but it's very clear.Corey: Well, oh, yeah. When I first started working in the AWS space and looking through the console, it like, “What is this? It feels like every service's interface was designed by a different team, but that would—oh…” and then the light bulb went on. Yeah. You ship your culture.Jason: It's exactly it. It works for them, but I think if you're going to try to do something very, very, very different, you know, it's going to look a certain way. So, intentional design, I think, is part of what makes GitHub and other products like it special. And if you think about it, you have to have an end-to-end view, and then you can build verticals up and down inside of that. But it has to work on the horizontal, still.And then if you hire really smart people to build the verticals, you get those done. So, a good example of this is that I have a very strong opinion about the horizontal workflow nature of GitHub should look like in five years. I have a very loose opinion about what the matrix build system of Actions looks like. Because we have very, very smart people who are working on that specific problem, so long as that maps back and snaps into the horizontal workflows. And that's how it can work together.Corey: So, when you look at someone who is, I don't know, the CTO of a wildly renowned company that is basically still catering primarily to developers slash engineers, but let's be honest, geeks, it's natural to think that, oh, they must be the alpha geek. That doesn't really apply to you from everything I've been able to uncover. Am I just not digging deeply enough, or are you in fact, a terrible nerd?Jason: [laugh]. I am. I'm a terrible nerd. I am a very terrible nerd. I feel very lucky, obviously, to be in the position I'm in right now, in many ways, and people call me up and exactly that.It's like, “Hey, you must be king of the geeks.” And I'm like, “[laugh], ah, funny story here.” But um, you know, I joke that I'm not actually supposed to be in tech in first place, the way I grew up, and where I did, and how, I wasn't supposed to be here. And so, it's serendipitous that I am in tech. And then turns out I had an aptitude for distributed systems, and complex, you know, human systems as well. But when people dig in and they start talking about topics, I'm confounded. I never liked Star Wars, I never like Star Trek. Never got an anime, board games, I don't play video games—Corey: You are going to get letters.Jason: [laugh]. When I was at Canonical, oh, my goodness, the stuff I tried to hide about myself, and, like, learn, like, so who's this Boba Fett dude. And, you know, at some point, obviously, you don't have to pretend anymore, but you know, people still assume a bunch stuff because, quote, “Nerd” quote, “Geek” culture type of stuff. But you know, some interesting facts that people end up being surprised by with me is that, you know, I was very short in high school and I grew in college, so I decided that I wanted to take advantage of my newfound height and athleticism as you grow into your body. So, I started playing basketball, but I obsessed over it.I love getting good at something. So, I'd wake up at four o'clock in the morning, and go shoot baskets, and do drills for hours. Well, I got really good at it one point, and I end up playing in a Pro-Am basketball game with ex-NBA Harlem Globetrotter legends. And that's just not something you hear about in most engineering circles. You might expect that out of a salesperson or a marketing person who played pro ball—or amateur ball somewhere, or college ball or something like that. But not someone who ends up running the most important software company—from a technical perspective—in the world.Corey: It's weird. People counterintuitively think that, on some level, that code is the answer to all things. And that, oh, all this human interaction stuff, all the discussions, all the systems thinking, you have to fit a certain profile to do that, and anyone outside of that is, eh, they're not as valuable. They can get ignored. And we see that manifesting itself in different ways.And even if we take a look at people whose profess otherwise, we take a look at folks who are fresh out of a boot camp and don't understand much about the business world yet; they have transformed their lives—maybe they're fresh out of college, maybe didn't even go to college—and 18 weeks later, they are signing up for six-figure jobs. Meanwhile, you take a look at virtually any other business function, in order to have a relatively comparable degree of earning potential, it takes years of experience and being very focused on a whole bunch of other things. There's a massive distortion around technical roles, and that's a strange and difficult thing to wrap my head around. But as you're talking about it, it goes both ways, too. It's the idea of, “Oh, I'll become technical than branch into other things.” It sounded like you started off instead with a non-technical direction and then sort of adopted that from other sides. Is that right, or am I misremembering exactly how the story unfolds?Jason: No, that's about right. People say, “Hey, when did I start programming?” And it's very in vogue, I think, for a lot of people to say, “I started programming at three years old,” or five years old, or whatever, and got my first computer. I literally didn't get my first computer until I was 18-years-old. And I started programming when I got to a high school co-op with IBM at 17.It was Lotus Notes programming at the time. Had no exposure to it before. What I did, though, in college was IBM told me at the time, they said, “If you get a computer science degree will guarantee you a job.” Which for a kid who grew up the way I grew up, that is manna from heaven type of deal. Like, “You'll guarantee me a job inside where don't have to dig ditches all day or lay asphalt? Oh, my goodness. What's computer science? I'll go figure it out.”And when I got to school, what I realized was I was really far behind. Everyone was that ubergeek type of thing. So, what I did is I tried to hack the system, and what I said was, “What is a topic that nobody else has an advantage on from me?” And so I basically picked the internet because the internet was so new in the mid-'90s that most people were still not fully up to speed on it. And then the underpinnings in the internet, which basically become distributed systems, that's where I started to focus.And because no one had a real advantage, I just, you know, could catch up pretty quickly. But once I got into computers, it turned out that I was probably a very average developer, maybe even below average, but it was the system's thinking that I stood out on. And you know, large-scale distributed systems or architectures were very good for me. And then, you know, that applies not, like, directly, but it applies decently well to human systems. It's just, you know, different types of inputs and outputs. But if you think about organizations at scale, they're barely just really, really, really complex and kind of irrational distributed systems.Corey: Jason, thank you so much for taking the time to speak with me today. If people want to learn more about who you are, what you're up to, how you think about the world, where can they find you?Jason: Twitter's probably the best place at this point. Just @jasoncwarner on Twitter. I'm very unimaginative. My name is my GitHub handle. It's my Twitter username. And that's the best place that I, kind of, interact with folks these days. I do an AMA on GitHub. So, if you ever want to ask me anything, just kind of go to jasoncwarner/ama on GitHub and drop a question in one of the issues and I'll get to answering that. Yeah, those are the best spots.Corey: And we will, of course, include links to those things in the [show notes 00:33:52]. Thank you so much for taking the time to speak with me today. I really appreciate it.Jason: Thanks, Corey. It's been fun.Corey: Jason Warner, Chief Technology Officer at GitHub. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review in your podcast platform of choice anyway, along with a comment that includes a patch.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
Autonomy is key in building a sustainable and motivated team, and this core principle also applies to DevOps. Building self-serve Apache Kafka® and Confluent Platform deployments require a streamlined process with unrestricted tools—a centralized processing tool that allows teams in large or mid-sized organizations to automate infrastructure changes while ensuring shared standards are met. With more than 15 years of engineering and technology consulting experience, Pere Urbón-Bayes (Senior Solution Architect, Professional Services, Confluent) built an open source solution—JulieOps—to enable a self-serve Kafka platform as a service with data governance. JulieOps is one of the first solutions available to realize self-service for Kafka and Confluent with automation. Development, operations, security teams often face hurdles when deploying Kafka. How can a user request the topics that they need for their applications? How can the operations team ensure compliance and role-based access controls? How can schemas be standardized and structured across environments? Manual processes can be cumbersome with long cycle times. Automation reduces unnecessary interactions and shortens processing time, enabling teams to be more agile and autonomous in solving problems from a localized team level. Similar to Terraform, JulieOps is declarative. It's a centralized agent that uses the GitOps philosophy, focusing on a developer-centric experience with tools that developers are already familiar with, to provide abstractions to each product personas. All changes are documented and approved within the change management process to streamline deployments with timely and effective audits, as well as ensure security and compliance across environments. The implementation of a central software agent, such as JulieOps, helps you automate the management of topics, configuration, access controls, Confluent Schema Registry, and more within Kafka. It's multi tenant out of the box and supports on-premises clusters and the cloud with CI/CD practices. Tim and Pere also discuss the steps necessary to build a self-service Kafka with an automatic Jenkins process that will empower development teams to be autonomous.EPISODE LINKSJulieOps on GitHubJulieOps documentationBuilding a Self-Service Kafka Platform as a Service with GitOps with Pere Urbón-BayesOpen Service Broker APIDrive | Daniel H. Pink Watch the video version of this podcastJoin the Confluent CommunityLearn more with Kafka tutorials, resources, and guides at Confluent DeveloperLive demo: Intro to Event-Driven Microservices with ConfluentUse PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
Huxlee Barbee (@huxley_barbee, Head Security Prod Mktg @DataDogHQ) talks about the challenging odds of preventing security attacks, managing configuration mistakes, scaling security through monitoring, and security feedback loops in production. SHOW: 555CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotwCHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"SHOW SPONSORS:CloudZero - Cloud Cost Intelligence for Engineering TeamsDatadog Monitoring: Modern Monitoring and AnalyticsStart monitoring your infrastructure, applications, logs and security in one place with a free 14 day Datadog trial. Listeners of The Cloudcast will also receive a free Datadog T-shirt.SHOW NOTES:Cloud misconfiguration, a major risk for cloud securityResilience, DevSecOps, and other key takeaways from RSAC 2021Secure your infrastructure in real time with Datadog Cloud Workload SecurityIntroducing Datadog Cloud Security Posture ManagementTopic 1 - Welcome to the show. Let's start by talking a little bit about your background, and where you focus your attention these days.Topic 2 -According to many reports, configuration mistakes tend to lead to the most security breaches. Who is typically making the mistakes? Topic 3 - Can you dig deeper on the dynamics between security on the one hand and developers and SRE engineers on the other hand?Topic 4 - So what are some of the strategies and tactics for achieving optimum balance between these opposing interests?Topic 5 - Should we think about platform (infra) security apart from workload (application) security?Topic 6 - Can you talk to us about the differences between applying security to things that happen pre-production (e.g. CI/CD, software-supply chains) and things that happen in production?FEEDBACK?Email: show at the cloudcast dot netTwitter: @thecloudcastnet
The way we write, compile, and run software has continued to evolve since computer programming began. The cloud, serverless, no-code, and CI/CD are all contemporary ideas introduced to help software engineers spend more time on their application and less time on the chores of running it. Darklang is a new way of building serverless backends. The post Darklang Deployless Applications with Paul Biggar appeared first on Software Engineering Daily.