Podcasts about sre

  • 402 podcasts
  • 1,135 episodes
  • 36m average duration
  • 1 new episode daily
  • Latest: Jan 12, 2022




Latest podcast episodes about sre

Screaming in the Cloud
Slinging CDK Knowledge with Matt Coulter


Jan 12, 2022 · 37:37


About Matt

Matt is an AWS DevTools Hero, Serverless Architect, author, and conference speaker. He is focused on creating the right environment for empowered teams to rapidly deliver business value in a well-architected, sustainable, and serverless-first way. You can usually find him sharing reusable, well-architected, serverless patterns over at cdkpatterns.com or behind the scenes bringing CDK Day to life.

Links:
  • AWS CDK Patterns: https://cdkpatterns.com
  • The CDK Book: https://thecdkbook.com
  • CDK Day: https://www.cdkday.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers (and I assure you there is no third option there), Kubernetes clusters, databases, and internal applications like AWS Management Console, Jenkins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And no, that is not me telling you to go away, it is: goteleport.com.

Corey: This episode is sponsored in part by our friends at Rising Cloud, which I hadn't heard of before, but they're doing something vaguely interesting here.
They are using AI, which is usually where my eyes glaze over and I lose attention, but they're using it to help developers be more efficient by reducing repetitive tasks. So, the idea being that you can run stateless things without having to worry about scaling, placement, et cetera, and the rest. They claim significant cost savings, and they're able to wind up taking what you're running as it is in AWS with no changes, and run it inside of their data centers that span multiple regions. I'm somewhat skeptical, but their customers seem to really like them, so that's one of those areas where I really have a hard time being too snarky about it because when you solve a customer's problem and they get out there in public and say, “We're solving a problem,” it's very hard to snark about that. Multus Medical, Construx.ai and Stax have seen significant results by using them. And it's worth exploring. So, if you're looking for a smarter, faster, cheaper alternative to EC2, Lambda, or Batch, consider checking them out. Visit risingcloud.com/benefits. That's risingcloud.com/benefits, and be sure to tell them that I sent you because watching people wince when you mention my name is one of the guilty pleasures of listening to this podcast.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined today by Matt Coulter, who is a Technical Architect at Liberty Mutual. You may have had the privilege of seeing him on the keynote stage at re:Invent last year, in Las Vegas or remotely (that last year, of course, being 2021). But if you make better choices than the two of us did, and found yourself not there, take the chance to go and watch that keynote. It's really worth seeing.

Matt, first, thank you for joining me. I'm sorry, I don't have 20,000 people here in the audience to clap this time.
They're here, but they're all remote as opposed to sitting in the room behind me because you know, social distancing.

Matt: In this left earphone, I just have some applause going, just permanently, just to keep me going. [laugh].

Corey: That's sort of my own internal laugh track going on. It's basically whatever I say is hilarious, to that. So yeah, doesn't really matter what I say, how I say it, my jokes are all for me. It's fine. So, what was it like being on stage in front of that many people? It's always been a wild experience to watch and for folks who haven't spent time on the speaking circuit, I don't think that there's any real conception of what that's like. Is this like giving a talk at work, where I just walk on stage randomly, whatever I happened to be wearing? And, oh, here's a microphone, I'm going to say words. What is the process there?

Matt: It's completely different. For context for everyone, before the pandemic, I would have pretty regularly talked in front of, I don't know, maybe one, two hundred people in Liberty, in Belfast. So, I used to be able to just, sort of, walk in front of them, and lean against the pillar, and use my clicker, and click through, but the process for actually presenting something as big as a keynote at re:Invent is so different. For starters, you think that when you walk onto the stage, you'll actually be able to see the audience, but the way the lights are set up, you can pretty much see about one row of people, and they're not the front row, so anybody I knew, I couldn't actually see.

And yeah, you can only see, sort of, like, the front of the void, and then you have your screens, so you've six sets of screens that tell you your notes as well as what slides you're on, you know, so you can pivot. But other than that, I mean, it feels like you're just talking to yourself outside of whenever people, thankfully, applaud.
It's such a long process to get there.

Corey: I've always said that there are a few different transition stages as the audience size increases, but for me, the final stage is more or less anything above 750 people. Because as you say, you aren't able to see that many beyond that point, and it doesn't really change anything meaningfully. The most common example that you see in the wild is jokes that work super well with a small group of people fall completely flat to large audiences. It's why so much corporate humor is cheesy because yeah, everyone in the rehearsals is sitting there laughing and the joke kills, but now you've got 5000 people sitting in a room and that joke just sounds strained and forced because there's no longer a conversation, and no one has the shared context; the humor has to change. So, in some cases when you're telling a story about what you're going to say on stage, during a rehearsal, they're going to say, “Well, that joke sounds really corny and lame.” It's, “Yeah, wait until you see it in front of an audience. It will land very differently.” And I'm usually right on that.

I would also advise, you know, doing what you do and having something important and useful to say, as opposed to just going up there to tell jokes the whole time. I wanted to talk about that because you talked about how you're using various CDK and other serverless style patterns in your work at Liberty Mutual.

Matt: Yeah. So, we've been using CDK pretty extensively since it was, sort of, Q3 2019. At that point, it was new. Like, it had just gone GA at the time, just came out of dev preview.
And we've been using CDK from the perspective of we want to be building serverless-first, well-architected apps, and ideally we want to be building them on AWS.

Now, the thing is, we have 5000 people in our IT organization, so there's sort of a couple of ways you can take to try and get those people onto the cloud: You can either go the route of being, like, there is one true path to architecture, this is our architecture and everything you want to build can fit into that square box; or you can go the other approach and try and have the golden path where you say this is the paved road that is really easy to do, but if you want to differentiate from that route, that's okay. But what you need to do is feed back into the golden path if that works. Then everybody can improve. And that's where we've started using CDK. So, what you heard me talk about was the software accelerator, and it's sort of a different approach.

It's where anybody can build a pattern and then share it so that everybody else can rapidly, you know, just reuse it. And what that means is effectively you can, instead of having to have hundreds of people on a central team, you can actually just crowdsource, and sort of decentralize the function. And if things are good, then a small team can actually come in and audit them, so to speak, and check that it's well-architected, and doesn't have flaws, and drive things that way.

Corey: I have to confess that I view the CDK as sort of a third stage automation approach, and it's one that I haven't done much work with myself. The first stage is clicking around in the console; the second is using CloudFormation or Terraform; the third stage, which is what we're talking about here, is CDK or Pulumi, or something like that. And then you ascend to the final fourth stage, which is what I use, which is clicking around in the AWS console, but then you lie to people about it. ClickOps is poised to take over the world. But that's okay. You haven't gotten that far yet.
Instead, you're on the CDK side. What advantages does CDK offer that effectively CloudFormation or something like it doesn't?

Matt: So, first off, for ClickOps in Liberty, we actually have the AWS console as read-only in all of our accounts, except for sandbox. So, you can ClickOps in sandbox to learn, but if you want to do something real, unfortunately, it's going to fail you. So...

Corey: I love that pattern. I think I might steal that.

Matt: [laugh]. So, originally, we went heavy on CloudFormation, which is why CDK worked well for us. And because we've actually... it's been a long journey. I mean, we've been deploying... 2014, I think it was, we first started deploying to AWS, and we've used everything from Terraform, to you name it. We've built our own tools, believe it or not, that are basically CDK.

And the thing about CloudFormation is, it's brilliant, but it's also incredibly verbose and long because you need to specify absolutely everything that you want to deploy, and every piece of configuration. And that's fine if you're just deploying a side project, but if you're in an enterprise that has responsibilities to protect user data, and you can't just deploy anything, they end up thousands and thousands and thousands of lines long. And then we have amazing guardrails, so if you tried to deploy a CloudFormation template with a flaw in it, we can either just fix it, or reject the deploy. But CloudFormation is not known to be the fastest to deploy, so you end up in this developer cycle, where you build this template by hand, and then it goes through that CloudFormation deploy, and then you get the failure message that it didn't deploy because of some compliance thing, and developers just got frustrated, and were like, “Sod this. [laugh]. I'm not deploying to AWS. Back to on-prem.”
And that's where CDK was a bit different because it allowed us to actually build abstractions with all of our guardrails baked in, so that it just looked like a standard class, for developers, like, developers already know Java, Python, TypeScript, the languages of CDK, and so we were able to just make it easy by saying, “You want API Gateway? There's an API Gateway class. You want, I don't know, an EC2 instance? There you go.” And that way, developers could focus on the thing they wanted, instead of all of the compliance stuff that they needed to care about every time they wanted to deploy.

Corey: Personally, I keep lobbying AWS to add my preferred language, which is crappy shell scripting, but for some reason they haven't really been quick to add that one in. The thing that I think surprises me, on some level (though, perhaps it shouldn't) is not just the adoption of serverless that you're driving at Liberty Mutual, but the way that you're interacting with that feels very futuristic, for lack of a better term. And please don't think that I'm in any way describing this in a way that's designed to be insulting, but I do a bunch of serverless nonsense on Twitter for Pets. That's not an exaggeration. twitterforpets.com has a bunch of serverless stuff behind it because you know, I have personality defects.

But no one cares about that static site that's been a slide dump a couple of times for me, and a running joke. You're at Liberty Mutual; you're an insurance company. When people wind up talking about big enterprise institutions, you're sort of a shorthand example of exactly what they're talking about.
It's easy to contextualize or think of that as being very risk averse (for obvious reasons; you are an insurance company) as well as wanting to move relatively slowly with respect to technological advancement because mistakes are going to have drastic consequences to all of your customers, people's lives, et cetera, as opposed to tweets (or barks) not showing up appropriately at the right time. How did you get to the, I guess, advanced architectural philosophy that you clearly have been embracing as a company, while having to be respectful of the risk inherent that comes with change, especially in large, complex environments?

Matt: Yeah, it's funny because so for everyone, we were talking before this recording started about, I've been with Liberty since 2011. So, I've seen a lot of change in the length of time I've been here. And I've built everything from IBM applications right the way through to the modern serverless apps. But the interesting thing is, the journey to where we are today definitely started eight or nine years ago, at a minimum because there was something identified in the leadership that they said, “Listen, we're all about our customers. And that means we don't want to be wasting millions of dollars, and thousands of hours, and big trains of people to build software that does stuff. We want to focus on why are we building a piece of software, and how quickly can we get there? If you focus on those two things you're doing all right.”

And that's why starting from the early days, we focused on things like, okay, everything needs to go through CI/CD pipelines. You need to have your infrastructure as code. And even if you're deploying on-prem, you're still going to be using the same standards that we use to deploy to AWS today. So, we had years and years and years of just baking good development practices into the company.
And then whenever we started to move to AWS, the question became, do we want to just deploy the same thing or do we want to take full advantage of what the cloud has to offer? And I think because we were primed and because the leadership had the right direction, you know, we were just sitting there ready to say, “Okay, serverless seems like a way we can rapidly help our customers.” And that's what we've done.

Corey: A lot of the arguments against serverless... and let's be clear, they rhyme with the previous arguments against cloud that lots of people used to make, including me, let's be clear here. I'm usually wrong when I try to predict the future. “Well, you're putting your availability in someone else's hands,” was the argument about cloud. Yeah, it turns out the clouds are better at keeping things up than we are as individual companies.

Then with serverless, it's the, “Well, if they're handling all that stuff for you on their side, when they're down, you're down. That's an unacceptable business risk, so we're going to be cloud-agnostic and multi-cloud, and that means everything we build serverlessly needs to work in multiple environments, including in our on-prem environment.” And from the way that we're talking about servers and things that you're building, I don't believe that is technically possible, unless some of the stuff you're building is ridiculous. How did you come to accept that risk organizationally?

Matt: These are the conversations that we're all having. Sort of, I'd say once a week, we all have a multi-cloud discussion (and I really liked the article you wrote, it was maybe last year, maybe the year before), but multi-cloud to me is about taking the best capabilities that are out there and bringing them together.
So, you know, like, Azure [ID 00:12:47] or whatever, things from the other clouds that they're good at, and using those rather than thinking, “Can I build a workload that I can simultaneously pay all of the price to run across all of the clouds, all of the time, so that if one's down, theoretically, I might have an outage?” So, the way we've looked at it is we embraced really early the well-architected framework from AWS. And it talks about things like you need to have multi-region availability, you need to have your backups in place, you need to have things like circuit breakers in place for if a third party goes down, and we've just tried to build really resilient architectures as best as we can on AWS. And do you know what I think? If [laugh] AWS is not... I know at re:Invent there, it went down extraordinarily often compared to normal, but in general...

Corey: We were all tired of re:Invent; their us-east-1 was feeling the exact same way.

Matt: Yeah, so that's... it deserved a break. But, like, if somebody can't buy insurance for an hour, once a year, [laugh] I think we're okay with it versus spending millions to protect that one hour.

Corey: And people make assumptions based on this where, okay, we had this problem with us-east-1 that froze things like the global Route 53 control planes; you couldn't change DNS for seven hours. And I highlighted that as, yeah, this is a problem, and it's something to severely consider, but I will bet you anything you'd care to name that there is an incredibly motivated team at AWS, actively fixing that as we speak. And by... I don't know how long it takes to untangle all of those dependencies, but I promise they're going to be untangled in relatively short order versus running data centers myself, when I discover a key underlying dependency I didn't realize was there, well, we need to break that.
That's never going to happen because we're trying to do things as a company, and it's just not the most important thing for us as a going concern. With AWS, their durability and reliability is the most important thing, arguably compared to security.

Would you rather be down or insecure? I feel like they pick down (I would hope in most cases they would pick down), but they don't want to do either one. That is something they are drastically incentivized to fix. And I'm never going to be able to fix things like that and I don't imagine that you folks would be able to either.

Matt: Yeah, so, two things. The first thing is the important stuff, like, for us, that's claims. We want to make sure at any point in time, if you need to make a claim you can because that is why we're here. And we can do that with people whether or not the machines are up or down. So, that's why, like, you always have a process, a manual process, that the business can operate, irrespective of whether the cloud is still working.

And that's why we're able to say if you can't buy insurance in that hour, it's okay. But the other thing is, we did used to have a lot of data centers, and I have to say, the people who ran those were amazing (I think half the staff now work for AWS), but there was this story that I heard where there was an app that used to go down at the same time every day, and nobody could work out why. And it was because someone was coming in to clean the room at that time, and they unplugged the server to plug in a vacuum, and then were cleaning the room, and then plugging it back in again. And that's the kind of thing that just happens when you manage people, and you manage a building, and manage a premises. Whereas if you heard that happened at AWS, I mean, that would be front page news.

Corey: Oh, it absolutely would.
There's also, as you say, if it's the sales function, if people aren't able to buy insurance for an hour, when us-east-1 went down, the headlines were all screaming about AWS taking an outage, and some of the more notable customers were listed as examples of this, but the story was that, “AWS has massive outage,” not, “Your particular company is bad at technology.” There's sort of a reputational risk mitigation by going with one of these centralized things. And again, as you're alluding to, what you're doing is not life-critical as far as the sales process and getting people to sign up. If an outage meant that suddenly a bunch of customers were no longer insured, that's a very different problem. But that's not your failure mode.

Matt: Exactly. And that's where, like, you got to look at what your business is, and what you're specifically doing, but for 99.99999% of businesses out there, I'm pretty sure you can be down for the tiny window that AWS is down per year, and it will be okay, as long as you plan for it.

Corey: So, one thing that really surprised me about the entirety of what you've done at Liberty Mutual is that you're a big enterprise company, and you can take a look at any enterprise company, and say that they have dueling mottos, which is, “I am not going to comment on that,” or, “That's not funny.” Like, the safe mode for any large concern is to say nothing at all. But a lot of folks, not just you, at Liberty have been extremely vocal about the work that you're doing, how you view these things, and I almost want to call it advocacy or evangelism for the CDK. I'm slightly embarrassed to admit that for a little while there, I thought you were an AWS employee in their DevRel program because you were such an advocate in such strong ways for the CDK itself.

And that is not something I expected.
Usually you see the most vocal folks working in environments that, let's be honest, tend to play a little bit fast and loose with things like formal corporate communications. Liberty doesn't and yet, there you folks are telling these great stories. Was that hard to win over as a culture, or am I just misunderstanding how corporate life is these days?

Matt: No, I mean, so it was different, right? There was a point in time where, I think, we all just sort of decided that, I mean, we're really good at what we do from an engineering perspective, and we wanted to make sure that, given the messaging we were given, those 5000 tech employees in Liberty Mutual, if you consider the difference in broadcasting to 5000 versus going external, it may sound like there's millions, billions of people in the world, but in reality, the difference in messaging is not that much. So, to me what I thought, like, whenever I started anyway (it's not, like, we had a meeting and all decided at the same time), but whenever I started, it was a case of, instead of me just posting on all the internal channels (because I've been doing this for years), it's just at that moment, I thought, I could just start saying these things externally and still bring them internally because all you've done is widened the audience; you haven't actually made it shallower. And that meant that whenever I was having the internal conversations, nothing actually changed except for it meant external people, like all their Heroes, like Jeremy Daly, could comment on these things, and then I could bring that in internally. So, it almost helped the reverse takeover of the enterprise to change the culture because I didn't change that much except for change the audience of who I was talking to.

Corey: This episode is sponsored by our friends at Oracle. HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service.
Although I insist on calling it “my squirrel.” While MySQL has long been the world's most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP (don't ask me to ever say those acronyms again) workloads directly from your MySQL database and eliminate the time-consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.

Corey: One thing that you've done that I want to say is admirable, and I stumbled across it when I was doing some work myself over the break, and only right before this recording did I discover that it was you, is the cdkpatterns.com website. Specifically what I love about it is that it publishes a bunch of different patterns of ways to do things. This deviates from a lot of tutorials on, “Here's how to build this one very specific thing,” and instead talks about, “Here's the architecture design; here's what the baseline pattern for that looks like.” It's more than a template, but less than a, “Oh, this is a messaging app for dogs and I'm trying to build a messaging app for cats.” It's very generalized, but very direct, and I really, really like that model of demo.

Matt: Thank you. So, watching some of your Twitter threads where you experiment with new...

Corey: Uh oh. People read those. That's a problem.

Matt: I know. So, whenever you experiment with a piece of AWS that's new to you, I've always wondered what it would be like to be your enabling architect. Because technically, my job in Liberty is, I'm meant to try and stay ahead of everybody and try and ease the on-ramp to these things.
So, if I was your enabling architect, I would be looking at it going, “I should really have a pattern for this.” So that whenever you want to pick up that new service... the patterns in cdkpatterns.com, there's 24, 25 of them right there, but internally, there's way more than dozens now.

The goal is, the pattern is the least amount of code for you to learn a concept. And then that way, you can not only see how something works, but you can maybe pick up one of the pieces of the well-architected framework while you're there: All of it's unit tested, all of it is proper, you know, like, commented code. The idea is to not be crap, but not be gold-plated either. I'm currently in the process of upgrading that all to V2 as well. So, that [unintelligible 00:21:32].

Corey: You mentioned a phrase just now: “Enabling architect.” I have to say, this is one that has not crossed my desk before. Is that an internal term you use? Is that an enterprise concept I've somehow managed to avoid? Is that an AWS job role? What is that?

Matt: I've just started saying [laugh] it's my job over the past couple of years. That... I don't know, patent pending? But the idea to me is...

Corey: No, it's evocative. I love the term, I'd love to learn more.

Matt: Yeah, because you can sort of take two approaches to your architecture: You can take the traditional approach, which is the ‘house of no' almost, where it's like, “This is the architecture. How dare you want to deviate. This is what we have decided. If you want to change it, here's the Architecture Council and go through enterprise architecture as people imagine it.” But as people might work out quite quickly, whenever they meet me, the whole, like, long conversational meetings are not for me.
What I want to do is teach engineers how to help themselves, so that's why I see myself as enabling.

And what I've been doing is using techniques like Wardley Mapping, which is where you can go out and you can actually take all the components of people's architecture and you can draw them on a map: it's a map of how close they are to the customer, as well as how cutting edge the tech is, or how aligned to our strategic direction it is. So, you can actually map out all of the teams (there's 160, 170 engineers in Belfast and Dublin), and I can actually go in and say, “Oh, that piece of your architecture would be better if it was evolved to this. Well, I have a pattern for that,” or, “I don't have a pattern for that, but you know what? I'll build one and let's talk about it next week.” And that's always trying to be ahead, instead of people coming to me and I have to say no.

Corey: AWS Proton was designed to do something vaguely similar, where you could set out architectural patterns of... like, the two examples that they gave (I don't know if it's in general availability yet or still in public preview), but the ones that they gave were to build a REST API with Lambda, and building something-or-other with Fargate. And the idea was that you could basically fork those, or publish them inside of your own environment of, “Oh, you want a REST API; go ahead and do this.” It feels like their vision is a lot more prescriptive than what yours is.

Matt: Yeah. I talked to them quite a lot about Proton, actually because, as always, there's different methodologies and different ways of doing things. And as I showed externally, we have our software accelerator, which is kind of our take on Proton, and it's very open. Anybody can contribute; anybody can consume.
And then that way, it means that you don't necessarily have one central team, you can have... think of it more like an SRE function for all of the patterns, rather than… the Proton way is you've separate teams that are your DevOps teams that set up your patterns and then a separate team that's the consumer, and they have different permissions, different rights to do different things. If you use a Proton pattern, anytime an update is made to that pattern, it auto-deploys your infrastructure.

Corey: I can see that breaking an awful lot.

Matt: [laugh]. Yeah. So, the idea is sort of if you're a consumer, I assume you [unintelligible 00:24:35] be going to change that infrastructure. You can, they've built in an escape hatch, but the whole concept of it is there's a central team that looks to what the best configuration for that is. So, I think Proton has so much potential, I just think they need to loosen some of the boundaries for it to work for us, and that's the feedback I've given them directly as well.

Corey: One thing that I want to take a step beyond this is, you care about this more than most do. I mean, people will work with computers, yes. We get paid for that. Then they'll go and give talks about things. You're doing that as well. They'll launch a website occasionally, like, cdkpatterns.com, which you have. And then you just sort of decide to go for the absolute hardest thing in the world, and you're one of four authors of a book on this. Tell me more.

Matt: Yeah. So, this is something that a few of us have been talking about since one of the first CDK Days, where we're friends, so there's AWS Heroes. There's Thorsten Höger, Matt Bonig, Sathyajith Bhat, and myself, came together (it was sometime in the summer last year) and said, “Okay. We want to write a book, but how do we do this?” Because, you know, we weren't authors before this point; we'd never done it before.
We weren't even sure if we should go to a publisher, or if we should self-publish.

Corey: I argue that no one wants to write a book. They want to have written a book, and every first-time author I've ever spoken to at the end has said, “Why on earth would anyone want to do this a second time?” But people do it.

Matt: Yeah. And that's... we talked to Alex DeBrie, actually, about his book, the amazing DynamoDB Book. And it was his advice that told us to self-publish. And he gave us his starter template that he used for his book, which took so much of the pain out because all we had to do was then work out how we were going to work together. And I will say, I write quite a lot of stuff in general for people, but writing a book is completely different because once it's out there, it's out there. And if it's wrong, it's wrong. You got to release a new version and be like, “Listen, I got that wrong.” So, it did take quite a lot of effort from the group to pull it together. But now that we have it, I want to... I don't have a printed copy because it's only PDF at the minute, but I want a copy just put here [laugh] in, like, the frame. Because it's… it's what we all want.

Corey: Yeah, I want you to do that through almost a traditional publisher, selfishly, because O'Reilly just released the AWS Cookbook, and I had a great review quote on the back talking about the value added. I would love to argue that they use one of mine for The CDK Book (and then of course they would reject it immediately) of, “I don't know why you do all this. Using the console and lying about it is way easier.” But yeah, obviously not the direction you're trying to take the book in. But again, the industry is not quite ready for the lying version of ClickOps.

It's really neat to just see how willing you are to (how to frame this?) give of yourself and your time and what you've done so freely. I sometimes make a joke (that arguably isn't that funny) that, “Oh, AWS Hero.
That means that you basically volunteer for a $1.6 trillion company.” But that's not actually what you're doing. What you're doing is having figured out all the sharp edges and hacked your way through the jungle to get to something that is functional, you're a trailblazer. You're trying to save other people who are working with that same thing from difficult experiences on their own, having to all thrash and find our own way. And not everyone is diligent and as willing to continue to persist on these things. Is that a somewhat fair assessment of how you see the Hero role? Matt: Yeah. I mean, no two Heroes are the same, from what I've judged, I haven't met every Hero yet because pandemic, so Vegas was the first time [I met most 00:28:12], but from my perspective, I mean, in the past, whatever number of years I've been coding, I've always been doing the same thing. Somebody always has to go out and be the first person to try the thing and work out what the value is, and where it'll work for us or not work for us. The only difference with the external and public piece is that last 5%, which it's a very different thing to do, but I personally, I like even having conversations like this where I get to meet people that I've never met before. Corey: You sort of discovered the entire secret of why I have an interview podcast. Matt: [laugh]. Yeah because this is what I get out of it, just getting to meet other people and have new experiences. But I will say there's Heroes out there doing very different things. You've got, like, Hiro—as in Hiro, H-I-R-O—actually started AWS Newbies and she's taught—ah, it's hundreds of thousands of people how to actually just start with AWS, through a course designed for people who weren't coders before. That kind of thing is next-level compared to anything I've ever done because you know, they have actually built a product and just given it away. 
I think that's amazing.Corey: At some level, building a product and giving it away sounds like, “You know, I want to never be lonely again.” Well, that'll work because you're always going to get support tickets. There's an interesting narrative around how to wind up effectively managing the community, and users, and demands, based on open-source maintainers, that we're all wrestling with as an industry, particularly in the wake of that whole log4j nonsense that we've been tilting at that windmill, and that's going to be with us for a while. One last thing I want to talk about before we wind up calling this an episode is, you are one of the organizers of CDK Day. What is that?Matt: Yeah, so CDK Day, it's a complete community-organized conference. The past two have been worldwide, fully virtual just because of the situation we're in. And I mean, they've been pretty popular. I think we had about 5000 people attended the last one, and the idea is, it's a full day of the community just telling their stories of how they liked or disliked using the CDK. So, it's not a marketing event; it's not a sales event; we actually run the whole event on a budget of exactly $0. But yeah, it's just a day of fun to bring the community together and learn a few things. And, you know, if you leave it thinking CDK is not for you, I'm okay with that as much as if you just make a few friends while you're there.Corey: This is the first time I'd realized that it wasn't a formal AWS event. I almost feel like that's the tagline that you should have under it. It's—because it sounds like the CDK Day, again, like, it's this evangelism pure, “This is why it's great and why you should use it.” But I love conferences that embrace critical views. I built one of the first talks I ever built out that did anything beyond small user groups was “Heresy in the Church of Docker.”Then they asked me to give that at ContainerCon, which was incredibly flattering. 
And I don't think they made that mistake a second time, but it was great to just be willing to see some group of folks that are deeply invested in the technology, but also very open to hearing criticism. I think that's the difference between someone who is writing a nuanced critique versus someone who's just [pure-on 00:31:18] zealotry. “But the CDK is the answer to every technical problem you've got.” Well, I start to question the wisdom of how applicable it really is, and how objective you are. I've never gotten that vibe from you.Matt: No, and that's the thing. So, I mean, as we've worked out in this conversation, I don't work for AWS, so it's not my product. I mean, if it succeeds or if it fails, it doesn't impact my livelihood. I mean, there are people on the team who would be sad for, but the point is, my end goal is always the same. I want people to be enabled to rapidly deliver their software to help their customers.If that's CDK, perfect, but CDK is not for everyone. I mean, there are other options available in the market. And if, even, ClickOps is the way to go for you, I am happy for you. But if it's a case of we can have a conversation, and I can help you get closer to where you need to be with some other tool, that's where I want to be. I just want to help people.Corey: And if I can do anything to help along that axis, please don't hesitate to let me know. I really want to thank you for taking the time to speak with me and being so generous, not just with your time for this podcast, but all the time you spend helping the rest of us figure out which end is up, as we continue to find that the way we manage environments evolves.Matt: Yeah. And, listen, just thank you for having me on today because I've been reading your tweets for two years, so I'm just starstruck at this moment to even be talking to you. So, thank you.Corey: No, no. I understand that, but don't worry, I put my pants on two legs at a time, just like everyone else. 
That's right, the thought leader on Twitter, you have to jump into your pants. That's the rule. Thanks again so much. I look forward to having a further conversation with you about this stuff as I continue to explore, well honestly, what feels like a brand new paradigm for how we manage code.Matt: Yeah. Reach out if you need any help.Corey: I certainly will. You'll regret asking. Matt [Coulter 00:33:06], Technical Architect at Liberty Mutual. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, write an angry comment, then click the submit button, but lie and say you hit the submit button via an API call.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
An Enterprise Level View of Cloud Architecture with Levi McCormick

Jan 6, 2022 33:52


About LeviLevi's passion lies in helping others learn to cloud better.Links: Jamf: https://www.jamf.com Twitter: https://twitter.com/levi_mccormick TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open-source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers, and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Yankins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And not, that is not me telling you to go away, it is: goteleport.com.Corey: This episode is sponsored in part by our friends at Rising Cloud, which I hadn't heard of before, but they're doing something vaguely interesting here. They are using AI, which is usually where my eyes glaze over and I lose attention, but they're using it to help developers be more efficient by reducing repetitive tasks. So, the idea being that you can run stateless things without having to worry about scaling, placement, et cetera, and the rest. They claim significant cost savings, and they're able to wind up taking what you're running as it is in AWS with no changes, and run it inside of their data centers that span multiple regions. 
I'm somewhat skeptical, but their customers seem to really like them, so that's one of those areas where I really have a hard time being too snarky about it because when you solve a customer's problem and they get out there in public and say, “We're solving a problem,” it's very hard to snark about that. Multus Medical, Construx.ai and Stax have seen significant results by using them. And it's worth exploring. So, if you're looking for a smarter, faster, cheaper alternative to EC2, Lambda, or batch, consider checking them out. Visit risingcloud.com/benefits. That's risingcloud.com/benefits, and be sure to tell them that I sent you because watching people wince when you mention my name is one of the guilty pleasures of listening to this podcast. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I am known-slash-renowned-slash-reviled for my creative pronunciations of various technologies, company names, et cetera. Kubernetes, for example, and other things that get people angry on the internet. The nice thing about today's guest is that he works at a company where there is no possible way for me to make it more ridiculous than it sounds because Levi McCormick is a cloud architect at Jamf. I know Jamf sounds like I'm trying to pronounce letters that are designed to be silent, but no, no, it's four letters: J-A-M-F. Jamf. Levi, thanks for joining me. Levi: Thanks for having me. I'm super excited. Corey: Exactly. Also professional advice for anyone listening: Making fun of company names is hilarious; making fun of people's names makes you a jerk. Try and remember that. People sometimes blur that distinction. So, very high level, you're a cloud architect. Now, I remember the days of enterprise architects where their IDEs were basically whiteboards, and it was a whole bunch of people sitting in a room. They call it an ivory tower, but I've been in those rooms; I assure you there is nothing elevated about this. It's usually a dank sub-basement somewhere. 
What do you do, exactly?Levi: Well, I am part of the enterprise architecture team at Jamf. My roles include looking at our use of cloud; making sure that we're using our resources to the greatest efficacy possible; coordinating between many teams, many products, many architectures; trying to make sure that we're using best practices; bringing them from the teams that develop them and learn them, socializing them to other teams; and just trying to keep a handle on this wild ride that we're on.Corey: So, what I find fun is that Jamf has been around for a long time. I believe it is not your first name. I want to say Casper was originally?Levi: I believe so, yeah.Corey: We're Jamf customers. You're not sponsoring this episode or anything, to the best of my knowledge. So, this is not something I'm trying to shill the company, but we're a customer; we use you to basically ensure that all of our company MacBooks, and laptops, et cetera, et cetera, are basically ensured that there's disk encryption turned on, that people have a password, and that screensaver is turned on, basically to mean that if someone gets their laptop stolen, it's a, “Oh, I have to spend more money with Apple,” and not, “Time to sound the data breach alarm,” for reasons that should be blindingly obvious. And it's great not just at the box check, but also fixing the real problem of I [laugh] don't want to lose data that is sensitive for obvious reasons. I always thought of this is sort of a thing that worked on the laptops. Why do you have a cloud team?Levi: Many reasons. First of all, we started in the business of providing the software that customers would run in their own data centers, in their own locations. Sometime in about 2015, we decided that we are properly equipped to run this better than other people, and we started to provide that as a service. 
People would move in, migrate their services into the cloud, or we would bring people into the cloud to start with. Device management isn't the only thing that we do. We provide some SSO-type services, we recently acquired a company called Wandera, which does endpoint security and a VPN-like experience for traffic. So, there's a lot of cloud powering all of those things. Corey: Are you able to disclose whether you're focusing mostly on AWS, on Azure, on Google Cloud, or are you pretending a cloud with something like IBM? Levi: All of the above, I believe. Corey: Excellent. That tells you it's a real enterprise, in seriousness. It's the—we talk about the idea of going all in on one provider being a general best practice or good place to start. I believe that. And then there are exceptions, and as companies grow and accumulate technical debt, that also is load-bearing and generates money, you wind up with this weird architectural series of anti-patterns, and when you draw it on a whiteboard of, “Here's our architecture,” the junior consultant comes in and says, “What moron built this?” Usually to said quote-unquote, “Moron,” and then they've just pooched the entire engagement. Yeah, most people don't show up in the morning hoping to do a terrible job today, unless they work at Facebook. So, there are reasons things are the way they are; there are constraints that shape these things. Yeah, if people were going to be able to shut down the company for two years and rebuild everything from scratch from the ground up, it would look wildly different. But you can't do that most of the time. Levi: Yeah. Those things are load bearing, right? You can't just stop traffic one day, and re-architect it with the golden image of what it should have been. We've gone through a series of acquisitions, and those architectures are disparate across the different acquired products. 
So, you have to be able to leverage lessons from all of them, bring them together and try and just slowly, incrementally march towards a better future state. Corey: As we take a look at the challenges we see at The Duckbill Group over on my side of the world, where we talk to customers, I think it is surprising to folks to learn that cloud economics as I see it is—well, first, cost and architecture are the same thing, which inherently makes sense, but there's a lot more psychology that goes into it than math. People often assume I spend most of my time staring into spreadsheets. I assure you that would not go super well. But it has to do with the psychological elements of what it is that people are wrestling with, of their understanding of the environment has not kept pace with reality, and APIs tend to, you know, tell truths. It's always interesting to me to see the lies that customers tell, not intentionally, but the reality of it of, “Okay, what about those big instances you're running in Australia?” “Oh, we don't have any instances in Australia.” “Look, I understand that you are saying that in good faith, however…” and now we're in a security incident mode and it becomes a whole different story. People's understanding always trails. What do you spend the bulk of your time doing? Is it building things? Is it talking to people? Is it trying to more or less herd cats in certain directions? What's the day-to-day? Levi: I would say it varies week-to-week. Depends on if we have a new product rolling out. I spend a lot of my time looking at architectural diagrams, reference architectures from AWS. The majority of the work I do is in AWS and that's where my expertise lies. 
I haven't found it financially incentivized to really branch out into any of the other clouds in terms of expertise, but I spend a lot of my time developing solutions, socializing them, getting them in front of teams, and then educating.We have a wide range of skills internally in terms of what people know or what they've been exposed to. I'd say a lot of engineers want to learn the cloud and they want to get opportunities to work on it, and their day-to-day work may not bring them those opportunities as often as they'd like. So, a good portion of my time is spent educating, guiding, joining people's sprints, joining in their stand-ups, and just kind of talking through, like, how they should approach a problem.Corey: Whenever you work at a big company, you invariably wind up with—well, microservices becomes the right answer, not because of the technical reasons; because of the people reason, the way that you get a whole bunch of people moving in roughly the same direction. You are a large scale company; who owns services in your idealized view of the world? Is it, “Well, I wrote something and it's five o'clock. Off to production with it. Talk to you in two days, if everything—if we still have a company left because I didn't double-check what I just wrote.”Do you think that the people who are building services necessarily should be the ones supporting it? Like, in other words, Amazon's approach of having the software engineers being responsible for the ones running it in production from an ops perspective. Is that the direction you trend towards, or do you tend to be from my side of the world—which is grumpy sysadmin—where people—developers hurl applications into your yard for you to worry about?Levi: I would say, I'm an extremist in the view of supporting the Amazon perspective. I really like you build it, you run it, you own it, you architect it, all of it. I think the other teams in the organization should exist to support and enable those paths. 
So, if you have platform teams are a really common thing you see hired right now, I think those platforms should be built to enable the company's perspective on operating infrastructure or services, and then those service teams on top of that should be enabled to—and empowered to make the decisions on how they want to build a service, how they want to provide it. Ultimately, the buck should stop with them.You can get into other operational teams, you could have a systems operation team, but I think there should be an explicit contract between a service team, what they build, and what they hand off, you know, you could hand off, like, a tier one level response, you know, you can do playbooks, you could do, you know, minimal alert, response, routing, that kind of stuff with a team, but I think that even that team should have a really strong contract with, like, here's what our team provides, here's how you engage with our team, here's how you will transition services to our team.Corey: The challenge with doing that, in some shops, has been that if you decide to roll out a, you build it, you own it, approach that has not been there since the beginning, you wind up with a lot of pushback from engineers who until now really enjoyed their 5:30 p.m. quitting time, or whenever it was they wound up knocking off work. And they started pushing back, like, “Working out of hours? That's inhumane.” And the DevOps team would be sitting there going, “We're right here. How dare you? Like, what do you think our job is?” And it's a, “Yes, but you're not people.” And then it leads to this whole back and forth acrimonious—we'll charitably call it a debate. How do you drive that philosophy?Levi: It's a challenge. I've seen many teams fracture, fall apart, disperse, if you will, under the transition of going through, like, an extreme service ownership. I think you balance it out with the carrot of you also get to determine your own future, right? 
You get to determine the programming language you use, you get to determine the underlying technologies that you use. Again, there's a contract: You have to meet this list of security concerns, you need to meet these operational concerns, and how you do that is up to you.Corey: When you take a look across various teams—let's bound this to the industry because I don't necessarily want you to wind up answering tough questions at work the day this episode airs—what do you see the biggest blockers to achieving, I guess, a functional cultural service ownership?Levi: It comes down to people's identity. They've established their own identity, “As I am X,” right? I'm a operations engineer. I'm a developer, I'm an engineer. And getting people to kind of branch out of that really fixed mindset is hard, and that, to me, is the major blocker to people assuming ownership.I've seen people make the transition from, “I'm just an engineer. I just want to write code.” I hate those lines. That frustrates me so much: “I just want to write code.” Transitioning into that, like, ownership of, “I had an idea. I built the platform or the service. It's a huge hit.” Or you know, “Lots of people are using it.” Like, seeing people go through that transformation become empowered, become fulfilled, I think is great.Corey: I didn't really expect to get called out quite like this, but you're absolutely right. I was against the idea, back when I was a sysadmin type because I didn't know how to code. And if you have developers supporting all of the stuff that they've built, then what does that mean for me? It feels like my job is evaporating. I don't know how to write code.Well, then I started learning how to write code incredibly badly. And then wow, it turns out, everyone does this. And here we are. But it's—I don't build applications, for obvious reasons. 
I'm bad at it, but I found another way to proceed in the wide world that we live in of high technology. But yeah, it was hard because this idea of my sense of identity being tied to the thing that I did, it really was an evolve-or-die dinosaur kind of moment because I started seeing this philosophy across the board. You take a look, even now at modern SREs, or modern DevOps folks, or modern sysadmins, what they're doing looks a lot less like logging into Linux systems and tinkering on the command line and a lot more like running and building distributed applications. Sure, this application that you're rolling out is the one that orchestrates everything there, but you're still running this in the same way the software engineers do, which is interesting. Levi: And that doesn't mean a team has to be only software engineers. Your service team can be multiple disciplines. It should be multiple disciplines. I've seen a traditional ops team broken apart, and those individuals distributed into the services that they were chiefly skilled in supporting in the past, as the ops team, as we transitioned those roles from one of the worst on-call rotations I've ever seen—you know, 13 to 14 alerts a night—transitioning those out to those service teams, training them up on the operations, building the playbooks. That was their role. Their role wasn't necessarily to write software, day one. Corey: I quit a job after six weeks because of that style of, I guess, mismanagement. Their approach was that, oh, we're going to have our monitoring system live in AWS because one of our VPs really likes AWS—let's be clear, this was 2008, 2009 era—latency was a little challenging there. And [unintelligible 00:17:04] he really liked Big Brother, which was—not to—now before that became a TV show and at rest, it was a monitoring system—but network latency was always a weird thing in AWS in those days, so instead, he insisted we set up three of them. And whenever—if we just got one page, it was fine. 
But if we got three, then we had to jump in. And two was always undefined. And they turned this off from I think, 10 p.m. to 6 a.m. every night, just so the person on call could sleep. And I'm looking at this, like, this might be the worst thing I've ever seen in my life. This was before they released the Managed NAT Gateway, so possibly it was. Levi: And then the flood, right, when you would get— Corey: Oh, God this was the days, too— Levi: Yeah. Corey: —when you were—if you weren't careful, you'd set this up to page you on the phone with a text message and great, now it takes time for my cell provider to wind up funneling out the sudden onslaught of 4000 text messages. No thanks. Levi: If your monitoring system doesn't have the ability to say, you know, the alert flood, funnel them into one alert, or just pause all alerts, while—because we know there's an incident; you know, us-east-1 is down, right? We know this; we don't need to get 500 text messages to each engineer that's on call. Corey: Well, my philosophy at that point was no, I'm going to instead take a step beyond. If I'm not empowered to fix this thing that is waking me up—and sometimes that's the monitoring system, and sometimes it's the underlying application—I'm not on call. Levi: Yes, exactly. And that's why I like the model of extre—you know, the service ownership: Because those alerts should go to the people—the pain should be felt by the people who are empowered to fix it. It should not land anywhere else. Otherwise, that creates misaligned incentives and nothing gets better. Corey: Yeah. But in large distributed systems, very often the person who is on call more or less turns into a traffic router. Levi: Right. That's unfair to them. Corey: That's never fun—yeah, that's unfair, and it's not fun, either, and there's no great answer when you've all these different contributory factors. Levi: And how hard is it to keep the team staffed up? Corey: Oh, yeah. 
It's a, “Hey, you want a really miserable job one week out of every however many there are in the cycle?” Eh, people don't like that.Levi: Exactly.Corey: This episode is sponsored by our friends at Oracle HeatWave, a new high-performance accelerator for the Oracle MySQL Database Service, although I insist on calling it, “My squirrel.” While MySQL has long been the world's most popular open source database, shifting from transacting to analytics required way too much overhead and, you know, work. With HeatWave you can run your OLAP and OLTP—don't ask me to ever say those acronyms again—workloads directly from your MySQL database and eliminate the time consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.Corey: So, I've been tracking what you're up to for little while now—you're always a blast to talk with—what is this whole Cloud Builder thing that you were talking about for a bit, and then I haven't seen much about it.Levi: Ah, so at the beginning of the pandemic, our mutual friend, Forrest Brazeal, released the Cloud Resume Challenge. I looked at that, and I thought, this is a fantastic idea. I've seen lots of people going through it. I recommend the people I mentor go through it. Great way to pick up a couple cloud skills here and there, tell an interesting story in an interview, right? It's a great prep.I intended the Cloud Builder Challenge to be a natural kind of progression from that Resume Challenge to the Builder Challenge where you get operational experience. 
Again, back to that, kind of, extreme service ownership mentality, here's a project where you can build, really modeled on the Amazon GameDays from re:Invent, you build a service, we'll send you traffic, you process those payloads, do some matching, some sorting, some really light processing on these payloads, and then send it back to us, score some points, we'll build a public dashboard, people can high five each other, they can razz each other, whatever kind of competition they want to do. Really low, low pressure, but just a fun way to get more operational experience in an area where there is really no downside. You know, playing like that at work, bad idea, right? Corey: Generally, yes. [crosstalk 00:21:28] production, we used to have one of those environments; oops-a-doozy. Levi: Yeah. I don't see enough opportunities for people to gain that experience in a way that reflects a real workload. You can go out and you can find all kinds of Hello Worlds, you can find all kinds of—like, for front end development, there are tons of activities and things you can do to learn the skills, but for the middleware, the back end engineers, there's just not enough playgrounds out there. Now, standing up a Hello World app, you know, you've got your infrastructure-as-code template, you've got your pre-written code, you deploy it, congratulations. But now what, right? And I intended this challenge to be kind of a series of increasingly more difficult waves, if you will, or levels. I really had a whole gamification aspect to it. So, it would get harder, it would get bigger, more traffic, you know, all of those things, to really put people through what it would be like to receive your, “Post got slash-dotted today,” or those kinds of things where people don't get an opportunity to deal with large amounts of traffic, or variable payloads, that kind of stuff. Corey: I love the idea. Where is it? Levi: It is sitting in a bunch of repos, and I am afraid to deploy it. 
[laugh].Corey: What is it that scares you about it specifically?Levi: The thing that specifically scares me is encouraging early career developers to go out there, deploy this thing, start playing with it, and then incur a huge cloud bill.Corey: Because they failed to secure something or other reasons behind that?Levi: There are many ways that this could happen, yeah. You could accidentally push your access key, secret key up into a public repo. Now, you've got, you know, Bitcoin miners or Monero miners running in your environment. You forget to shut things off, right? That's a really common thing.I went through a SageMaker demo from AWS a couple years ago. Half the room of intelligent, skilled engineers forgot to shut off the SageMaker instances. And everybody ran out of the $25 of credit they had from the demo—Corey: In about ten minutes. Yeah.Levi: In about ten minutes, yeah. And we had to issue all kinds of requests for credits and back and forth. But granted, AWS was accommodating to all of those people, but it was still a lot of stress.Corey: But it was also slow. They're very slow on that, which is fair. Like, if someone's production environment is down, I can see why you care more about that than you do about someone with, “Ah, I did something wrong and lost money.” The counterpoint to that is that for early career folks, that money is everything. We remember earlier this year, that tragic story from the Robinhood customer who committed suicide after getting a notification that he was $730,000 in debt. Turns out it wasn't even accurate; he didn't owe anything when all was said and done.I can see a scenario in which that happens in the AWS world because of their lack of firm price controls on a free tier account. I don't know what the answer on this is. I'm even okay with a, “Cool you will—this is a special kind of account that we will turn you off at above certain levels.” Fine. 
Even if you hard cap at the 20 or 50 bucks, yeah, it's going to annoy some people, but no one is going to do something truly tragic over that. And I can't believe that Oracle Cloud of all companies is the best shining example of this because you have to affirmatively upgrade your account before they'll charge you a dime. It's the right answer. Levi: It is. And I don't know if you've ever looked at—well, I'm sure you have. You've probably looked at the solutions provided by AWS for monitoring costs in your accounts, preventing additional spend. Like, the automation to shut things down, right, it's oftentimes more engineering work to make it so that your systems will shut down automatically when you reach a certain billing threshold than the actual applications that are in place there. Corey: And I don't for the life of me understand why things are the way that they are. But here we go. It's a—[sigh] it just becomes this perpetual strange world. I wish things were better than they are, but they're not. Levi: It makes me terribly sad. I mean, I think AWS is an incredible product, I think the ecosystem is great, and the community is phenomenal; everyone is super supportive, and it makes me really sad to be hesitant to recommend people dive into it on their own dime. Corey: Yeah. And that is a—[sigh] I don't know how you fix that or square that circle. Because I don't want to wind up, I really do not want to wind up, I guess, having to give people all these caveats, and then someone posts about a big bill problem on the internet, and all the comments are, “Oh, you should have set up budgets on that.” Yeah, that thing's still a day behind. So okay, great, instead of having an enormous bill at the end of the month, you just have a really big one two days later. I don't think that's the right answer. I really don't. And I don't know how to fix this, but, you know, I'm not the one here who's a $1.7 trillion company, either, that can probably find a way to fix this. 
I assure you, the bulk of that money is not coming from a bunch of small accounts that forgot to turn something off or got exploited.Levi: I haven't done my 2021 taxes yet, but I'm pretty sure I'm not there either.Corey: The world in which we live.Levi: [laugh]. I would love this challenge. I would love to put it out there. If I could, on behalf of, you know, early career people who want to learn—if I could issue credits, if I could spin up sandboxes and say, like, “Here's an account, I know you're going to be safe. I have put in a $50 limit.” Right?Corey: Yeah.Levi: “You can't spend more than $50,” like, if I had that control or that power, I would do this in a heartbeat. I'm passionate about getting people these opportunities to play, you know, especially if it's fun, right? If we can make this thing enjoyable, if we can gamify it, we can play around, I think that'd be great. The experience, though, would be a significant amount of engineering on my side, and then a huge amount of outreach, and that to me makes me really sad.Corey: I would love to be able to do something like that myself with a, “Look, if you get a bill, they will waive it, or I will cover it.” But then you wind up with the whole problem of people not operating in good faith as well. Like, “All right, I'm going to mine a bunch of Bitcoin and claim someone else did it.” Or whatnot. And it's just… like, there are problems with doing this, and the whole structure doesn't lend itself to that working super well.Levi: Exactly. I often say, you know, I face a lot of people who want to talk about mining cryptocurrency in the cloud because I'm a cloud architect, right? That's a really common conversation I have with people. And I remind them, like, it's not economical unless you're not paying for it.Corey: Yeah, it's perfectly economical on someone else's account.Levi: Exactly.Corey: I don't know why people do things the way that they do, but here we are. So, re:Invent. 
What did you find that was interesting, promising there, promising but not there yet, et cetera? What was your takeaway from it? Since you had the good sense not to be there in person?Levi: [laugh]. To me, the biggest letdown was Amplify Studio.Corey: I thought it was just me. Thank you. I just assumed it was something I wasn't getting from the explanation that they gave. Because what I heard was, “You can drag and drop, basically, a front end web app together and then tie it together with APIs on the back end.” Which is exactly what I want, like Retool does; that's what I want only I want it to be native. I don't think it's that.Levi: Right. I want the experience I already have of operating the cloud, knowing the security posture, knowing the way that my users access it, knowing that it's backed by Amazon, and all of their progressively improving services, right? You say it all the time. Your service running on Amazon is better today than it was two years ago. It was better than it was five years ago. I want that experience. But I don't think Amplify Studio delivered.Corey: I wish it had. And maybe it will, in the fullness of time. Again, AWS services do not get worse as they age; they get better.Levi: Some get stale, though.Corey: Yeah. The worst case scenario is they sit there and don't ever improve.Levi: Right. I thought the releases from S3 in terms of, like, the intelligent tiering, were phenomenal. I would love to see everybody turn on intelligent tiering with instant access. Those things to me were showing me that they're thinking about the problem the right way. I think we're missing a story of, like, how do we go from where we're at today—you know, if I've got trillions of objects in storage, how do I transition into that new world where I get the tiering automatically? 
I'm sure we'll see blog posts about people telling us; that's what the community is great for.Corey: Yeah, they explain these things in a way that the official docs for some reason fail to.Levi: Right. And why don't—Corey: Then again, it's also—I think—I think it's because the people that are building these things are too close to the thing themselves. They don't know what it's like to look at it through fresh eyes.Levi: Exactly. They're often starting from a blank slate, or from a greenfield perspective. There's not enough thought—or maybe there's a lot of thought to it, but there's not enough communication coming out of Amazon, like, here's how you transition. We saw that with Control Tower, we saw that with some of the releases around API Gateway. There's no story for transitioning from existing services to these new offerings. And I would love to see—and maybe Amazon needs a re:Invent Echo, where it's like, okay, here's all the new releases from re:Invent and here's how you apply them to existing infrastructure, existing environments.Corey: So, what's next for you? What are you looking at that's exciting and fun, and something that you want to spend your time chasing?Levi: I spend a lot of my time following AWS releases, looking at the new things coming out. I spend a lot of energy thinking about how do we bring new engineers into the space. I've worked with a lot of operations teams—those people who run playbooks, they hop on machines, they do the old sysadmin work, right—I want to bring those people into the modern world of cloud. I want them to have the skills, the empowerment to know what's available in terms of services and in terms of capabilities, and then start to ask, “Why are we not doing it that way?” Or start looking at making plans for how do we get there.Corey: Levi, I really want to thank you for taking the time to speak with me. If people want to learn more. Where can they find you?Levi: I'm on Twitter. My Twitter handle is @levi_mccormick. 
Reach out, I'm always willing to help people. I mentor people, I guide people, so if you reach out, I will respond. That's a passion of mine, and I truly love it.Corey: And we'll, of course, include a link to that in the show notes. Thank you so much for being so generous with your time. I appreciate it.Levi: Thanks, Corey. It's been awesome.Corey: Levi McCormick, cloud architect at Jamf. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with a comment telling me that service ownership is overrated because you are the storage person, and by God, you will die as that storage person, potentially in poverty.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
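
Editor's sketch: the hard spending cap discussed in this episode can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not anything AWS provides: `SandboxAccount`, `decide_action`, and the 80% warning ratio are hypothetical, and the $50 cap is borrowed from the conversation. Because billing data can lag real usage (Corey notes that budgets run about a day behind), a cap enforced this way could still overshoot.

```python
from dataclasses import dataclass


@dataclass
class SandboxAccount:
    """Hypothetical learner account with a hard spending cap."""
    name: str
    spend_cap_usd: float       # assumed cap, e.g. the $50 limit from the conversation
    reported_spend_usd: float  # last figure from billing data, which may lag real usage


def decide_action(account: SandboxAccount, warn_ratio: float = 0.8) -> str:
    """Return 'ok', 'warn', or 'shutdown' based on reported spend versus the cap."""
    if account.reported_spend_usd >= account.spend_cap_usd:
        return "shutdown"
    if account.reported_spend_usd >= warn_ratio * account.spend_cap_usd:
        return "warn"
    return "ok"


# Illustrative accounts only; the names and amounts are made up.
fresh = SandboxAccount("learner-1", spend_cap_usd=50.0, reported_spend_usd=10.0)
close = SandboxAccount("learner-2", spend_cap_usd=50.0, reported_spend_usd=45.0)
over = SandboxAccount("learner-3", spend_cap_usd=50.0, reported_spend_usd=55.0)

for acct in (fresh, close, over):
    print(acct.name, decide_action(acct))
```

The decision is kept separate from whatever would actually stop resources, since that part is where Levi says most of the engineering effort goes.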

Primera Plana: Noticias
Mexico is the country with the second-most Omicron cases in Latin America

Jan 4, 2022 4:53


According to the GISAID platform, Mexico became the country with the most identified cases of the Omicron variant in Latin America, with 368 confirmed infections; in addition, the SRE (Secretaría de Relaciones Exteriores) announced that 68% of the weapons that enter our country illegally are manufactured by the Massachusetts companies that the Mexican government sued in 2021.

Screaming in the Cloud
Security Can Be More than Hues of Blue with Ell Marquez

Jan 4, 2022 40:08


About Ell: Ell, former SysAdmin, cloud builder, podcaster, and container advocate, has always been a security enthusiast. This enthusiasm and driven curiosity have helped her become an active member of the InfoSec community, leading her to explore the exciting world of Genetic Software Mapping at Intezer.Links: Intezer: https://www.intezer.com Twitter: https://twitter.com/Ell_o_Punk TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Jenkins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And no, that is not me telling you to go away, it is: goteleport.com.Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. 
There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free that's snark.cloud/oci-free.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. If there's one thing we love doing in the world of cloud, it's forgetting security until the very end, going back and bolting it on as if we intended to do it that way all along. That's why AWS says security is job zero because they didn't want to remember all of their slides once they realized they forgot security. Here to talk with me about that today is Ell Marquez, security research advocate at Intezer. Ell, thank you for joining me.Ell: Of course.Corey: So, what does a security research advocate do, for lack of a better question, I suppose? Because honestly, you look at that, it's like, security research advocate, it seems, would advocate for doing security research. That seems like a good thing to do. I agree, but there's probably a bit more nuance to it than I can pick up just by the [unintelligible 00:01:17] reading of the title.Ell: You know, we have all of these white papers that you end up getting, the pen test reports that are dropped on your desk that nobody ever gets to, they become low priority, my job is to actually advocate that you do something with the information that you get. 
And part of that just involves translating that into plain English, so anyone can go with it.Corey: I've got to say, if you want to give the secrets of the universe and make sure that no one ever reads them, make sure that it has a whole bunch of academic-style citations at the beginning, and ideally put it behind some academic paywall, and it feels like people will claim to have read it but never actually read the thing.Ell: Don't forget charts.Corey: Oh yes, with the charts. In varying shades of blue. Apparently that's the only color you're allowed to do some of these charts in; despite having a full universe of color palettes out there, we're just going to put it in varying shades of corporate blue and hope that people read it.Ell: Yep, that sounds about security there. [laugh].Corey: So, how much of, I guess, modern security research these days is coming out of academia versus coming out of industry?Ell: In my experience in, you know, research I've done in researching researchers, it all really revolves around actual practitioners these days, people who are on the front lines, you know, monitoring their honey pots, and actually reporting back on what they're seeing, not just theoretical.Corey: Which I guess brings us to the question of, I wind up watching all of the keynotes that all the big cloud providers put on and they simultaneously pat me on the head and tell me that their side of security is just fine with their shared responsibility model and the rest, whereas all of the breaches I'm ever going to deal with and the only way anyone can ever see my data is if I make a mistake in configuring something. And honestly, does that really sound like something I would do? Probably not, but let's face it, they claim that they are more or less infallible. How accurate is that?Ell: I wish that I could find the original person that said this, but I've heard it so many times. And it's actually the ‘cloud irresponsibility model.' 
We have this blind faith that if we're paying somebody for it, it's going to be done correctly. I think you may have seen this with billing. How many people are paying for redundant security services with a cloud provider?Corey: I've once—well, more than once have noticed that if you were to configure every AWS security service that they have and enable it in your account, that the resulting bill would be larger than the cost of the data breach it was preventing. So, on some level, there is a point at which it just becomes ridiculous and it's not necessarily worth pursuing further. I honestly used to think that the shared responsibility model story was a sales pitch, and then I grew ever more cynical. And now my position on it is that it's because if you get breached, it's your fault is what they're trying to say. But if you say it outright to someone who just got breached, they're probably not going to give you money anymore. So, you need to wrap that in this whole involved 45-minute presentation with slides, and charts, and images and the rest because people can't refute one of those quite the way that they can a—it's in a tweet sentence of, “It's your fault.”Ell: I kind of have to agree with them in the end that it is your fault. Like, the buck stops with you, regardless. You are the one that chose to trust that cloud provider was going to do everything because your security team might make a mistake, but the cloud provider is made up of humans as well who can make just as many mistakes. 
At the end of the day, I don't care what cloud provider you used; I care that my data was compromised.Corey: One of the things that irks me the most is when I read about a data breach from a vendor that I had either trusted knowingly with my data or worse, never trusted but they somehow scraped it somewhere and then lost it, and they said, “Oh, a third-party contractor that we hired.” It's, “Yeah, look, I'm doing business with you, ideally, not the people that you choose to do business with in turn. I didn't select that contractor. You did, you can pass out the work and delegate that. You cannot delegate the responsibility.” So no, Verizon, when you talk about having a third-party contractor have a data breach of customer data, you lost the data by not vetting your contractors appropriately.Ell: Let's go back in time to hopefully something everybody remembers: Target. Target being compromised because of their HVAC provider. Yet how many people—you know this is being recorded in the holiday season—are still shopping at Target right now? I don't know if people forget or they just don't care.Corey: A year later, their stock price was higher than it was before the breach. Sure they had a complete turnover of their C-suite at that point; their CSO and CEO were forced out as a result, but life went on. And they continue to remain a going concern despite quite literally having a bull's eye painted on the building. You'd think that would be a metaphor for security issues. But no, no, that is something they actually do.Ell: You know, when you talk about, you know, the CEO being let go or, you know, being run out, but what part did he honestly have to do with it? They're talking about, oh, well, they made the decisions and they were responsible. 
What, because they got that, you know, list of just 8000 papers with the charts on it?Corey: As I take a look at a lot of the previous issues that we've seen with—I've been doing my whole S3 Bucket Negligence Awards for a while, but once I actually had a bucket engraved and sent to a company years ago, the Pokémon Company, based upon a story that I read in the Wall Street Journal, how they declined to do business with a prospective vendor because going through their onboarding process, they noticed among other things, insufficient security controls around a whole bunch of things including S3 buckets, and it's holy crap, a company actually making a meaningful decision based upon security. And say what you will about the Pokémon Company, their audience is—at least theoretically—children and occasionally adults who believe they're children—great, not here to shame—but they understand that this is not something you can afford to be lax in and they kiboshed the entire deal. They didn't name the vendor, obviously, but that really took me aback. It was such a rarity to see that, and it's why I unfortunately haven't had to make a bucket like that since. I wish I did. I wish more companies did things like this. But no, it's just a matter of, well, we claim to do the right thing, and we checked all the boxes and called it good, and oops, these things happen.Ell: Yes, but even when it goes that way, who actually remembers what happened, and did you ever follow up if there were any consequences to not going, “Okay, third-party. You screwed up, we're out. We're not using you.” I can't name a single time that happened.Corey: Over at The Duckbill Group, we have large enterprise customers. We have to be respectful and careful with their data, let's be very clear here. We have all of their AWS billing data going back for some fixed period of time. And it worries me what happens if that data gets breached. 
Now, sure, I've done the standard PR crisis comms thing, I have statements and actions prepared to go in the event that it happens, but I'm also taking great pains to make sure it doesn't. It's the idea of okay, let's make sure that we wind up keeping these things not just distinct from the outside world, but distinct from individual clients so we're not mixing and matching any of this stuff. It's one of those areas where if we wind up having a breach, it's not because we didn't follow the baseline building blocks of doing this right. It's something that goes far beyond what we would typically expect to see in an environment like this. This, of course, sets aside the fact that while a breach like that would be embarrassing, it isn't actually material to anyone's business. This is not to say that I'm not taking it seriously because we have contractual provisions that we will not disclose a lot of this stuff, but it does not mean the end of someone's business if this stuff were to go public in the same way that, for example, back when I worked at Grindr many years ago, in the event that someone's data had been leaked there, people could theoretically have been killed. There's a spectrum of consequences here, but it still seems like you just do the basic blocking-and-tackling to make sure that this stuff isn't publicly exposed, then you start worrying about the more advanced stuff. But with all these breaches, it seems like people don't even do that.Ell: You have Tesla, right, who's working on going to Mars, sending people there who had their S3 buckets compromised. At that point, if we've got this technology, just giant there, I think we're safe to do that whole, “Hey, assume breach, assume compromise.” But when I say that, it drives me up the wall how many people just go, “Okay, well, there's nothing we can do. We should just assume that there's going to be an issue,” and just have this mentality where they give up. 
No, that gives you a starting point to work from, but that's not the way it's being seen.Corey: One of the things that I've started doing as I built up my new laptop recently has been all right, how do I work with this in such a way that I don't have credentials that are going to grant access to things in any long-lived way ever residing on disk? And so that meant with AWS, I started using SSO to log into a bunch of things. It goes through a website, and then it gives a token and the rest that lasts for 12 hours. Great. Okay, SSH keys, how do I handle that? Historically, I would have them encrypted with a passphrase, but then I found for macOS an app called Secretive that stores it in the Secure Enclave. I have to either type in a password or prove it with a biometric Touch ID nonsense every time something tries to access the key. It's slightly annoying when I'm checking out five or six Git repos at once, but it also means that nothing that I happen to have compromised in a browser or whatnot is going to be able to just grab the keys, send it off somewhere, and then I'll never realize that I've been compromised throughout. It's the idea of at least theoretically defense in depth because it's me, it's my personal electronics, in all likelihood, that are going to be compromised, more so than it is configured, locked-down S3 buckets, managed properly. And if not me, someone else in my company who has access to these things.Ell: I'm going to give you the best advice you're ever going to get, and people are going to go, “Duh,” but it's happening right now: Don't get complacent, don't get lazy. How many of us are, “Okay, we're just going to put the key over here for a second.” Or, “We're just going to do this for a minute,” and then we forget. I recently, you know, did some research into Emotet and—you know, the new virus and the group behind it—you know how they got caught? When they were raided, everything was in plain text. 
They forgot to use their VPN for a while, all the files that they'd gotten, no encryption. These were the people that that's what they were looking for, but you get lazy.Corey: I've started treating at least the security credential side of doing weird things, even one-off Bash scripts, as if they were in production. I stuff the credentials into something like AWS's Parameter Store, and then just have a one-line snippet of code that retrieves them at runtime. Would it be easier to just slap it in there in the code? Absolutely, of course it would. But I also look at my newsletter production pipeline, and I count the number of DynamoDB tables that are in active use that are labeled Test or Dev, and I realized, huh, I'm actually kind of bad at taking something that was in Dev and getting it ready for production. Very often, I just throw a load at it and call it good. So, if I never get complacent around things like that, it's a lot harder for me to get yelled at for checking secrets into Git, for example.Ell: Probably not the first time that you've heard this but, Corey, I'm going to have to go with you're abnormal because that is not what we're seeing in a day-to-day production environment.Corey: Oh, of course not. And the reason I do this is because I was a grumpy old sysadmin for so long, and have gotten burned in so many weird ways of messing things up. And once it's in Git, it's eternal—we all know that—and I don't ever want to be in a scenario where I open-source something and surprise, surprise, come to find out in the first two days of doing something, I had something on disk. It's just better not to go down that path if at all possible.Ell: Being a former sysad as well, I must say, what you're able to do within your environment, your computer is almost impossible within a corporate environment. Because as a sysad, I'm looking at, “What did the devs do again? 
Oh, man, what's the security team going to do?” And you're stuck in the middle trying to figure out how to solve a problem and then manage it through that entire environment.Corey: I never really understood intrinsically the value of things like single-sign-on, until I wound up starting this company. Because first, it was just me for a few years. And yeah, I can manage my developer environments and my AWS environments in such a way that if they get compromised, it's not going to be through basic, “Oops, I forgot that's how computers work,” type of moment. It's going to be at least something a little bit more difficult, I would imagine. Because if you—all right, if you managed to wind up getting my keys and the passphrase, and in some cases, the MFA device, great, good, congratulations, you've done something novel and probably deserve the data. Whereas as soon as I started bringing other people in who themselves were engineers, I sort of still felt the same way. Okay, we're all responsible adults here, and by and large, since I wasn't working with junior people, that held true. And then I started bringing in people who did not come from a deeply computer-y technical background, doing things like finance, and doing things like sales, and doing things like marketing, all of which are themselves deeply technical in their own way, but data privacy and data security are not really something that aligns with that. So, it got into the weeds of, “How do I make sure that people are doing responsible things on their work computers like turning on disk encryption, and forcing a screensaver, and a password and the rest.” And forcing them to at least do some responsible things like having 1Password for everyone was great until I realized a couple people weren't even using it for something, and oh dear. 
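
Editor's sketch: earlier in this conversation Corey describes storing credentials in AWS's parameter store and fetching them at runtime with a one-line snippet instead of pasting them into code. A minimal Python sketch of that habit follows, assuming a boto3-style SSM client is passed in. `load_secret`, `FakeSSMClient`, and the parameter name are hypothetical, and the fake client exists only so the example runs without AWS access.

```python
class FakeSSMClient:
    """Stand-in for a boto3 SSM client so this sketch runs without AWS access."""

    def __init__(self, store):
        self._store = store

    def get_parameter(self, Name, WithDecryption=False):
        # Mirrors the shape of the real response: {"Parameter": {"Value": ...}}
        return {"Parameter": {"Name": Name, "Value": self._store[Name]}}


def load_secret(ssm_client, name):
    """Fetch one decrypted parameter value at runtime instead of hardcoding it."""
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response["Parameter"]["Value"]


# Hypothetical parameter name and value, for illustration only.
client = FakeSSMClient({"/newsletter/api_token": "not-a-real-token"})
token = load_secret(client, "/newsletter/api_token")
print(len(token))
```

With real AWS access, the assumption is that `boto3.client("ssm")` would be passed in place of the fake client; the secret then never has to live in the script or in Git.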
It becomes a much more difficult problem at scale when you have to deal with people who, you know, have actual work to do rather than sitting around trying to defend the technology against any threat they can imagine.Ell: In what you just said though, there is one flaw is we tend to focus on, like you said, marketing and finance and all these organizations who—don't get phished, don't click on this link. But we kind of just give the openness that your security team, your sysads, your developers, they're going to know best practices. And then we focus on Windows because that's what the researchers are doing. And then we focus on Windows because that's what marketing is using, that's what finance is using. So, what, there's no way to compromise a Mac or Linux box? That's a huge, huge open area that you're allowing for attackers.Corey: Let's be very clear here. We don't have any Windows boxes—of which I'm aware—in the company. And yeah, the technical folk we have brought in, most of them I'd worked—or at least the early folks—I'd worked with previously. And we had a shared understanding of security. At least we all said the right things. But yeah, as you—right, as you grow, as you scale, this becomes a big deal. And it's, I also think there's something intrinsically flawed about a model where the entire instruction set is, it all falls on you to not click the link or you're going to doom us all. Maybe if someone can click a link and doom us all, the problem is not with them; it's the fact that we suck at building secure systems that respect defense in depth.Ell: Something that we do wrong, though, is we split it up. We have endpoint protection when we're talking about, you know, our Windows boxes, our Linux boxes, our Mac boxes. And then we have server-side and cloud security. Those connect. Think about, there's a piece of malware called EvilGNOME. You go in on a Linux box, you have access to my camera, keylogging, and watching exactly what I'm doing. 
I'm your sysad. I then cat out your SSH keys, I go into your box, they now have the password, but we don't look for that. We just assume that those two aren't really that connected, and if we monitor our network and we monitor these devices, we'll be fine. But we don't connect the two pieces.Corey: One thing that I did at a consulting client back in 2012 or so that really raised eyebrows whenever I told people about it was that we wound up going to some considerable trouble building an allow list within Squid—a proxy server that those of us in Linux-land are all too familiar with in some cases—so everything in production could only talk to the outside world via that proxy; it was not allowed to establish any outbound connections other than through that proxy. So, it was at that point only allowed to talk to specified update servers, specified third-party APIs and the rest, so at least in theory, I haven't checked back on them since, I don't imagine that the log4yay nonsense that we've seen recently would necessarily work there. I mean, sure, you have the arbitrary execution of code—that's bad—but reaching out to random endpoints on the internet would not have worked from within that environment. And I liked that model, but oh my God, was it a pain in the butt to set up properly because it turns out, even in 2012, just to update a Linux system reasonably, there's a fair number of things it needs to connect to, from time-to-time, once you have all the things like New Relic instrumentation in, and the app repository you're talking to, and whatever container source you're using, and, and, and. Then you wind up looking at challenges like, oh, I don't know, if you're looking at an AWS-style environment, like most modern things are, okay, we're only going to allow it to talk to AWS endpoints. Well, that's kind of the entire internet now. 
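
Editor's sketch: the core of the proxy allow list Corey describes is a check of each outbound destination against a short list of approved domains. The Python below illustrates that check only. It is not Squid configuration, and `is_egress_allowed` and the example domains are hypothetical.

```python
def is_egress_allowed(host, allowed_domains):
    """Return True if host is an approved domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    for domain in allowed_domains:
        domain = domain.lower().lstrip(".")
        # Require an exact match or a dot boundary, so "notupdates.example.com"
        # does not slip through on a plain suffix match.
        if host == domain or host.endswith("." + domain):
            return True
    return False


# Hypothetical allow list: one update server and one third-party API.
ALLOWED = {"updates.example.com", "api.example.net"}

for candidate in ("updates.example.com", "mirror.updates.example.com", "evil.example.org"):
    print(candidate, is_egress_allowed(candidate, ALLOWED))
```

The default is deny: anything not explicitly listed is refused, which is also why Corey found the list painful to maintain as the set of legitimate destinations grew.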
The goalposts move, the rules change, the game marches on.Ell: On an even simpler point, with that you're assuming only outbound traffic through those devices. Are they not connected to anything within the internal network? Is there no way for an attacker to pivot between systems? I pivot over to that, I get the information, and I make an outbound connection on something that's not configured that way.Corey: We had—you're allowed to talk outbound to the management subnet, which was on its own VLAN, and that could make established connections into other things, but nothing else was allowed to connect into that. There was some defense in depth and some thought put into this. I didn't come up with most of this to be clear, it was—this was smart people sitting around. And yeah, if I sit here and think about this for a while, of course there's going to be ways to do it. This was also back in the days of doing it in physical data centers, so you could have a pretty good idea of what was connected to the outside world just by looking at where the cables went. But there was also always the question of how does this–does this do what I think it's doing or what have I overlooked? Security's job is never done.Ell: Or what was misconfigured in the last update. It's an assumption that everything goes correctly.Corey: Oh, there is that. I want to talk though, about the things I had to worry about back then, it seems like in many cases get kicked upstairs to the cloud providers that we're using these days. But then we see things like Azurescape where security researchers were able to gain access to the Azure control plane where customers using Cosmos DB—Azure's managed database service, one of them—could suddenly have their data accessed by another customer. And Azure is doing its clam up thing and not talking about this publicly other than a brief disclosure, but how is this even possible from a security architecture point of view? 
It makes me wonder if it hadn't been disclosed publicly by the researcher, would they have ever said something? Most assuredly not.

Ell: I've worked with several researchers, in Intezer and outside of Intezer, and the amount of frustration that I see within responsible disclosure, it just blows my mind. You have somebody threatening to sue the researcher if they bring it out. You have a company going, “Okay, well, we've only had six weeks. Give us three more weeks.” And next thing we know, it's six months.

There is just this pushback about what we can actually bring out to the public on why they're vulnerable in organizations. So, we're put in this catch-22 as researchers. At what point is my responsibility to the public, and at what point is my responsibility to protect myself, to keep myself from getting sued personally, to keep my company from going down? How can we win when we have small research groups and these massive cloud providers?

Corey: This episode is sponsored in part by something new. Cloud Academy is a training platform built on two primary goals. Having the highest quality content in tech and cloud skills, and building a good community that is rich and full of IT and engineering professionals. You wouldn't think those things go together, but sometimes they do. It's both useful for individuals and large enterprises, but here's what makes it new. I don't use that term lightly. Cloud Academy invites you to showcase just how good your AWS skills are. For the next four weeks you'll have a chance to prove yourself. Compete in four unique lab challenges, where they'll be awarding more than $2000 in cash and prizes. I'm not kidding, first place is a thousand bucks. Pre-register for the first challenge now, one that I picked out myself on Amazon SNS image resizing, by visiting cloudacademy.com/corey. C-O-R-E-Y. That's cloudacademy.com/corey.
We're gonna have some fun with this one!

Corey: For a while, I was relatively confident that we had things like Google's Project Zero, but then they started softening their disclosure timelines and the rest, and it was, we had the full disclosure security distribution list that has been shuttered to my understanding. Increasingly, it's become risky to—yourself—to wind up publishing something that has not been patched and blessed by the providers and the rest. For better or worse, I don't have those problems, just because I'm posting about funny implications of the bill. Yeah, worst case, AWS is temporarily embarrassed, and they can wind up giving credits to people who were affected and be mad at me for a while, but there's no lasting harm in the way that there is with well, people were just able to look at your data for six months, and that's our bad oops-a-doozy. Especially given the assertions that all of these providers have made to governments, to banks, to tax authorities, to all kinds of environments where security really, really matters.

Ell: The last statistic that I heard, and it was earlier this year, was that it takes over 200 days for a compromise even to be detected. How long is it going to take for them to backtrack, figure out how it got in? Have they already patched those systems and that vulnerability is gone, but they managed to establish persistence somehow? The layers that go into actually doing your digital forensics only delay the amount of time that any of that is going to come out, where they have some information to present to you. We keep going, “Oh, we found this vulnerability. We're working on patches. We have it fixed.” But does every single vendor already have it patched? Do they know how it actually interacted within one customer's environment that allowed that breach to happen?
It's just ridiculous to think that's actually occurring, and every company is now protected because that patch came out.

Corey: As I take a look at how companies respond to these things, you're right, the number one concern most of them have is image control, if I'm being honest with you. It's the reputational management of we are still good at security, even though we've had a lapse here. Like, every breach notification starts out with, “Your security is important to us.” Well, clearly not that important because look at the email you had to send. And it's almost taken on aspects of a comedy piece where it [grips 00:23:10] with corporate insincerity. On some level, when you tell a company that they have a massive security vulnerability, their first questions are not about the data privacy; it's about how do we spin this to make ourselves come out of this with the least damage possible. And I understand it, but it's still crappy.

Ell: Us tech folk talk to each other. When we have security and developers speaking to each other, we're a lot more honest than when we're talking to the public, right? We don't try to hold that PR umbrella over ourselves. I was recently on a panel speaking with developers, head SRE folk—what was there? I think there was a CISO on there—and one of the developers just honestly came out and said, “At the end, my job is to say, ‘How much is that breach going to cost, versus how much money will the company lose if I don't make that deployment?’” The first thing that you notice there is that whole how much money you'll lose? The second part is why is the developer the one looking at the breach?

Corey: Yeah. The work flows downward. One of the most depressing aspects to me of the CISO role is that it seems like the job is to delegate everything, sign binding contracts in your name, and eventually get fired when there's a breach and your replacement comes in to sign different papers.
All the work gets delegated, none of the responsibility does, ideally—unless you're SolarWinds and try and blame it on an intern; I mean, I wish I had an ablative intern or two around here to wind up casting blame they don't deserve on them. But that's a separate argument—there is no responsibility-taking as I look at this. And that's really a depressing commentary on the state of the world.

Ell: You say there's no responsibility taken, but there is a lot of blame assigned. I love the concept of post-mortems on why that breach happened, but the only people in the room are the security team because they had that much control over anything. Companies as a whole need a scapegoat, and more and more, security teams are being blamed for every single compromise as more and more responsibility, more and more privileges, and visibility into what's going on is being taken away from them. Those two just don't balance. And I think it's causing a lot of just complacency and almost giving up from our security teams.

Corey: To be clear, when we talk about blameless post-mortems for things like this, I agree with it wholeheartedly within the walls of a company. However, externally as someone whose data has been taken in some of these breaches, oh, I absolutely blame the company. As I should, especially when it's something like well, we have inadvertently leaked your browsing history. Why were you collecting that in the first place? Is sort of the next logical question.

I don't believe that my ISP needs that to serve me better. But now you have Verizon sending out emails recently—as of this recording—saying that unless anyone opts out, all the lines in our cell account are going to wind up being data mined effectively, so they can better target advertisements and understand us better. It's, no, I absolutely do not want you to be doing that on my phone. Are you out of your mind? There are few things in this world that we consider more private than our browsing histories.
We ask the internet things we wouldn't ask our doctors in many cases, and that is no small thing as far as the level of trust that we place in our ISPs that they are now apparently playing fast and loose with.

Ell: I'm going to take a step back because you do a lot of work with cloud providers. Do you think that we actually know what information is being collected about our companies and what we have configured internally and externally by the cloud provider?

Corey: That's a good question. I've seen this before, where people will give me the PDF exploded view of last month's AWS bill, and they'll laugh because what information can I possibly get out of that. It just shows spend on services. But I could use that to start sketching out a pretty good idea of what their architecture looks like from that alone. There's an awful lot of value in the metadata.

Now, I want to be clear, I do not believe on any provider—except possibly Azure because who knows at this point—that if you encrypt the data, using their encryption facilities—with AWS, I know it's KMS, for example—I do not believe that they can arbitrarily decrypt it and then scan for whatever it is they're looking for. I do not believe that they are doing that because as soon as something like that comes out, it puts the lie to a whole bunch of different audit attestations that they've made and brings the entire empire crumbling down. I don't think they're going to get any useful data from that. However, if I'm trying to build something like Amazon Prime Video, and I can just look at the bill from the Netflix account. Well, that tells me an awful lot about things that they might be doing internally; it's highly suggestive. Could that be used to give them an unfair advantage?
Absolutely.

I had a tweet a while back that I don't believe that Google's Gmail division is scanning inboxes for things that look like AWS invoices to target their sales teams, but I sure would feel better if they would assure me that was the case. No one was able to ever assure me of that. I don't mean to be sitting here slinging mud, but at the same time, given that when you don't explicitly say you're not doing something as a company, there's a great chance you might be doing it, that's the sort of stuff that worries me. It's a bunch of unfair, dirty-trick-style stuff.

Ell: Maybe I'm just cynical, or maybe I just focus on these topics too much, but after giving a presentation on cloud security, I had two groups, both, you know, from three-letter government agencies, come up to me and say, “How do I have these conversations with the cloud provider?” In the conversation, they say, “We've contacted them several times; we want to look at this data; we want to see what they've collected, and we get ghosted, or we end up talking to attorneys. And despite over a year of communication, we've yet to be able to sit down with them.”

Corey: Now, that's an interesting story. I would love to have someone come to me with that problem. I don't know how I would solve that yet. But I have a couple ideas.

Ell: Hey, maybe they're listening, and they'll reach out to you. But—

Corey: You know, if you're having that problem of trying to understand what your cloud provider is doing, please talk to me. I would love to go a little more in depth on that conversation, under an NDA or six.

Ell: I was at a loss because the presentation that I was giving was literally about the compromise of managed service providers, whether that be an outsourced security group, whether that be your cloud provider, we're seeing attack groups going after these tar—think about how juicy they are.
Why do I need to compromise your account or your company if I can compromise that managed service provider and have access to 15 companies?

Corey: Oh, yeah. It's why would someone spend time trying to break into my NetApp when they could break into S3 and get access to everyone's data, theoretically? It's a centralization of security model risk.

Ell: Yeah, it seems to so many people as just this crazy idea. It's so far out there. We don't need to worry about it. I mean, we've talked about how Azure Functions has been compromised. We talked about all of these cloud services that people are specifically going after and being able to make traction in these attacks.

It's not just this crazy idea. It's something that's happening now, and with the progress that attackers are making, criminal groups are making, this is going to happen pretty soon.

Corey: Sometimes when I'm out for a meal with someone who works with AWS in the security org, there'll be an appetizer where, “Oh, there's two of you. I'm going to bring three of them,” because I guess waitstaff love to watch people fight like that. And whenever I want the third one, all I have to do is say, “Can you imagine a day in which, just imagine hypothetically, IAM failed open and allowed every request to go through regardless of everything else?” Suddenly, they look sick, lose their appetite, and I get the third one. But it's at least reassuring to know that even the idea of that is that disgusting to them, and it's not the, “Oh, that happened three weeks ago, but don't tell anyone.” Like, there's none of that going on.

I do believe that the people working on these systems at the cloud providers are doing amazingly good work. I believe they are doing far better than I would be able to do in trying to manage all those things myself, by a landslide. But nothing is ever perfect.
And it makes me wonder, if and when there are vulnerabilities, as we've already seen—clearly—with Azure, how forthcoming and transparent would they really be? And that's the thing that keeps me up at night.

Ell: I keep going back during this talk, but just the interaction with the people there and the crowd was just so eye-opening. And I don't want to be that person, but I keep getting to these moments of, “I told you so.” And I'm not going to go into SolarWinds. Lord, that has been covered, but shortly after that, we saw the same group going through and trying to—I'm not sure if they successfully did it, but they were targeting networks for cloud computing providers. How many companies focused outside of that compromise at that moment to see what it was going to build out to?

Corey: That's the terrifying thing is if you can compromise a cloud service provider at this point, it's well, you could sell that exploit on the dark web to someone. Yeah, that is a—if you can get a remote code execution and be able to look into any random cloud account, there's almost no amount of money that is enough for something like that. You could think of the insider trading potential of just compromising Slack. A single company, but everyone talks about everything there, and Slack retains data in perpetuity. Think of the sheer M&A discussions you could come up with. Think of what you could figure out with a sort of a God's eye view of something like that, and then realize that they run on AWS, as do an awful lot of other companies. The damage would be incalculable.

Ell: I am not an attacker, nor do I play one on TV, but let's just, kind of, build this out. If I was to compromise a cloud provider, the first thing I would do is lay low. I don't want them to know that I'm there. The next thing I would do is start getting into company environments and scanning them.
That way I can see where the vulnerabilities are, I can compromise them that way, and not give out the fact that I came in through that cloud provider. Look, I'm just me sitting here. I'm not a nation state. I'm not somebody who is paid to do this from nine to five. I can only imagine what they would come up with.

Corey: It really feels like this is no longer a concern just for those folks who have managed to get on the bad side of some country's secret service. It seems like APTs, Advanced Persistent Threats, are now theoretically something almost anyone has to worry about.

Ell: Let me just set the record straight right now on what I think we need to move away from: The whole APTs are nation states. Not anymore. An APT is anyone who has advanced tactics, anyone who's going to be persistent—because you know what, it's not that they're targeting you, it's that they know that they eventually can get in. And of course, they're a threat to you. When I was researching my work into Advanced Persistent Threats, we had a group named TNT that said, “Okay, you know what? We're done.”

So, I contacted them and I said, “Here's what I'm presenting on you. Would you mind reviewing it and tell me if I'm right?” They came back and said, “You know what? We're not an APT because we target open Docker API ports. That's how easy it is.” So, these big attack groups are not even having to rely on advanced methods anymore. The line on what that is is just completely blurring.

Corey: That's the scariest part to me is we take a look at this across the board. And the things I have to worry about are no longer things that are solely within my arena of control. They used to be, back when it was in my data center, but now increasingly, I have to extend trust to a whole bunch of different places. Because we're not building anything ourselves.
We have all kinds of third-party dependencies, and we have to trust that they're doing the right things as they go, too, and making sure that they're bound so that the monitoring agent that I'm using can't compromise my entire environment. It's really a good time to be professionally paranoid.

Ell: And who is actually responsible for all this? Did you know that 70% of the vulnerabilities on our systems right now are on the application level? Yet security teams have to protect it? That doesn't make sense to me at all. And yet, developers can pull in any third-party repository that they need in order to make that application work because hey, we're on a deadline. That function needs to come out.

Corey: Ell, I want to thank you for taking the time to speak with me. If people want to learn more about how you see the world and what kind of security research you're advocating for, where can they find you?

Ell: I live on Twitter to the point where I'm almost embarrassed to say, but you can find me at @Ell_o_Punk.

Corey: Excellent. And we will wind up putting a link to that in the [show notes 00:35:37], as we always do. Thanks so much again for your time. I appreciate it.

Ell: Always. I'd be happy to come again. [laugh].

Corey: Ell Marquez, security research advocate at Intezer. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that ends in a link that begs me to click it and that somehow looks simultaneously suspicious and frightening.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS.
We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Luis Cárdenas
Mexico gains allies against arms trafficking

Luis Cárdenas

Jan 3, 2022 11:25


Alejandro Celorio, legal adviser at the SRE, spoke with Enrique Rodríguez, filling in for Luis Cárdenas, about Mexico gaining allies against arms trafficking; this month it will respond to the manufacturers' reply.

Irish Stew Podcast
S3E9: Jennifer Petoff - A Traveling American Techie in Dublin

Irish Stew Podcast

Jan 3, 2022 56:44


Born in the Irish America hotbed of Buffalo, New York, Jennifer Petoff lives in Dublin now, following a career that has taken unexpected twists. A holder of a Ph.D. in Chemistry from Stanford University, she talks with us about women in STEM and how she made her way to Dublin, where she works in the SRE (Site Reliability Engineering) field. Jennifer is also one of four editors of Site Reliability Engineering: How Google Runs Production Systems, a highly successful publication in the world of SRE.

While Jennifer is busy in her day job, she also created Sidewalk Safari, an expansive travel blog featuring her travels in Ireland and locales further afield over the past ten years. Her posts display a keen photographic sensibility, notable for their focus on colorful or unusual doorways, likely cultivated by Dublin's famous Georgian doors.

Jennifer's Business Links
Business Unit: Google SRE
LinkedIn: Profile
Book: Site Reliability Engineering: How Google Runs Production Systems

Jennifer's Travel Links
Travel Blog: Sidewalk Safari
Twitter: Sidewalk Safari
Instagram: Sidewalk Safari
Facebook: Sidewalk Safari
Pinterest: Sidewalk Safari

Tajno društvo OFC
Christmas and New Year's special

Tajno društvo OFC

Dec 31, 2021 128:23


The guest of the Christmas and New Year's special was a legend of the supporters' stand: Cvetko!

With Cvetko we talked about adventures at football and basketball away games, how supporters' choreographies come together, his relationship with Ivan Zidar, how he experienced "the match" in Velenje, and much more.

Happy New Year and all the best in 2022! Enjoy listening!

Dobra, Mali in Stari.
Happy 2022!

Dobra, Mali in Stari.

Dec 31, 2021 165:12


Happy 2022! by Radio City

MVS Noticias / 102.5 segundos de información
INE resumes recall referendum process - Dec 30, 2021

MVS Noticias / 102.5 segundos de información

Dec 30, 2021 2:02


The INE General Council resolved to continue organizing the recall referendum (Revocación de Mandato) with the resources currently available, in compliance with the rulings of the Supreme Court of Justice and the Electoral Tribunal.

102.5 segundos de información
INE resumes recall referendum process - Dec 30, 2021

102.5 segundos de información

Dec 30, 2021 2:02


Radio Karantin
Happy New Year, and thank you for listening to Radio Karantin

Radio Karantin

Dec 28, 2021 1:36


Happy New Year, and thank you for listening to Radio Karantin by Marija Belić Bibin, Jelena Visser, Aleksandar Kocić

Screaming in the Cloud
President Biden's Advice in Action with Dan Woods

Screaming in the Cloud

Dec 28, 2021 39:28


About Dan

Dan is CISO and VP of Cybersecurity for Shipt, a Target subsidiary. He worked previously as a Distinguished Engineer on Target's cloud infrastructure. He served as CTO for Joe Biden's 2020 Presidential campaign. Prior to that Dan worked with the Hillary for America tech team through the Groundwork, and contributed as a founding developer on Spinnaker while at Netflix. Dan is an O'Reilly published author and avid public speaker.

Links:
Shipt: https://www.shipt.com/
Twitter: https://twitter.com/danveloper
LinkedIn: https://www.linkedin.com/in/danveloper

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Jenkins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And no, that is not me telling you to go away, it is: goteleport.com.

Corey: Writing ad copy to fit into a 30 second slot is hard, but if anyone can do it the folks at Quali can.
Just like their Torque infrastructure automation platform can deliver complex application environments anytime, anywhere, in just seconds instead of hours, days or weeks. Visit Qtorque.io today and learn how you can spin up application environments in about the same amount of time it took you to listen to this ad.

Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. Sometimes I talk to people who are involved in working on the nonprofit slash political side of the world. Other times I talk to folks who are deep in the throes of commercial businesses, and I obviously personally spend more of my time on one of those sides of the world than I do the other. But today's guest is a little bit different. Dan Woods is the CISO and VP of Cybersecurity at Shipt, a division of Target where he's worked for a fair number of years, but took some time off for his side project, the side hustle as the kids call it, as the CTO for the Biden campaign. Dan, thank you for joining me.

Dan: Yeah. Thank you, Corey. Happy to be here.

Corey: So, you have an interesting track record as far as your career goes, you've been at Target for a long time. You were a distinguished engineer—not to be confused with ‘extinguished engineer,’ which is just someone who is finally—the fire has gone out. And from there you went from being a distinguished engineer to a VP slash CISO, which generally looks a lot less engineer-like, and a lot more, at least in my experience, of sitting in a whole lot of executive-level meetings, managing teams, et cetera. Was that, in fact, an individual contributor—or IC—move into a management track, or am I just misunderstanding this because these are commonly overloaded terms in our industry?

Dan: Yeah, yeah, no, that's exactly right. So, IC to leadership, two distinct tracks, distinct career paths. It was something that I've spent a number of years thinking about and more or less working toward and making sure that it was the right path for me to go.
The interesting thing about the break that I took in the middle of Target when I was CTO for the campaign is that that was a leadership role, right. I led the team. I managed the team.

I did performance reviews and all of that kind of managerial stuff, but I also sat down and did a lot of tech. So, it was kind of like a mix of being a senior executive, but also still continuing to be a distinguished engineer. So, then the natural path out of that for me was to make a decision about do I continue to be an individual contributor or do I go into a leadership track? And I felt like for a number of reasons that my interests more aligned with being on the leadership side of the world, and so that's how I've ended up where I am.

Corey: And correct me if I'm wrong because generally speaking political campaigns are not usually my target customers given the fact that they're turning the entire AWS environment off in a few months—win or lose—and yeah, that, in fact, remains the best way to save money on your AWS bill; it's hard for me to beat that. But at that point most of the people you're working with are in large part volunteers I would imagine.

So, managing in a traditional sense of, “Well, we're going to have your next quarterly review.” Well, your candidate might not be in the race then. And what, we're going to put you on a PIP? And what exactly, you're going to stop letting me volunteer here? You're going to dock their pay—you're not paying me for this. It becomes an interesting management challenge I would imagine just because the people you're working with are passionate and volunteering, and a lot of traditional management and career advice doesn't necessarily map one-to-one I would have to assume.

Dan: That is the best way that I've heard it described yet. I try to explain this to folks sometimes and it's kind of difficult to get that message across that like there is sort of a base level organization that exists, right.
There were full-time employees who were a part of the tech team, really great group of folks especially from very early on willing to join the campaign and be a part of what it was that we were doing.

And then there was this whole ecosystem of folks who just wanted to volunteer, folks who wanted to be a part of it but didn't want to leave their 9:00 to 5:00 who wanted to come in. One of the most difficult things about—we rely on volunteers very heavily in the political space, and very grateful for all the folks who step up and volunteer with organizations that they feel passionate about. In fact, one of the best little tidbits of wisdom the President imparted to me at one point, we were having dinner at his house very early on in the campaign, and he said, “The greatest gift that you can give somebody is your time.” And I think that's so incredibly true. So, the folks who volunteer, it's really important, really grateful that they're all there.

In particular, how it becomes difficult, is that you need somebody to manage the volunteers, right, who are there. You need somebody to come up with work and check in that work is getting done because while it's great that folks want to volunteer five, ten hours a week, or whatever it is that they can put in, we also have very real things that need to get done, and they need to get done in a timely manner.

So, we had a lot of difficulty especially early on in the campaign utilizing the volunteers to the extent that we could because we were such a small and scrappy team and because everybody who was working on the campaign at the time had a lot of responsibilities that they needed to see through on their own.
And so getting into this, it's quite literally a full-time job having to sit down and follow up with volunteers and make sure that they have the appropriate amount of work and make sure that we've set up our environment appropriately so that volunteers can come and go and all of that kind of stuff, so yeah.

Corey: It's always an interesting joy looking at the swath of architectural decisions and how they came to be. I talked on a previous episode with Jackie Singh, who was, I believe, after your tenure as CISO, she was involved on the InfoSec side of things, and she was curious as to your thought process or rationale with a lot of the initial architectural decisions that she talked about on her episode which I'm sure she didn't intend it this way, but I am going to blatantly miscategorize as, “Justify yourself. What were you thinking?” Usually it takes years for that kind of, “I don't understand what's going on here so I'm playing data center archeologist or cloud spelunker.” This was a very short window. How did decisions get made architecturally as far as what you're going to run things on? It's been disclosed that you were on AWS, for example. Was that a hard decision?

Dan: No, not at all. Not at all. We started out the campaign—I in particular was one of the first employees hired onto the campaign and the idea all along was that we're not going to be clever, right? We're basically just going to develop what needs to be developed. And the idea with that was that a lot of the code that we were going to sit down and write or a lot of the infrastructure that we were going to build was going to be glue, not AWS Glue, right, ideally, but just glue that would bind data streams together, right?

So, data movement, vendor A produces a CSV file for you and it needs to end up in a bucket somewhere. So, somebody needs to write the code to make that happen, or you need to find a sufficient vendor who can make that happen.
There's a lot more vendors today believe it or not than there were two years ago that are doing much better in that kind of space, but two years ago we had the constraints of time and money.

Our idea was that the code that we were going to write was going to be for those purposes. What it actually turned into is that in other areas of the business—and I will call it a business because we had formalized roadmaps and different departments working on different things—but in other areas of the business where we didn't have enough money to purchase a solution, we had the ability to go and write software.

The interesting thing about this group of technologists who came together especially early on in the campaign to build out the tech team is that most of them came from an enterprise software development background, right? So, we had the know-how of how to build things at scale and how to do continuous delivery and continuous deployment, and how to operate a cloud-native environment, and how to build applications for that world.

So, we ended up doing things like writing an API for managing our donor vetting pipeline, right? And that turned into a complex system of Lambda functions and continuous delivery for a variety of different services that facilitated that pipeline. We also built an architecture for our mobile app which there were plenty of companies that wanted to sell us a mobile app and we just couldn't afford it so we ended up writing the mobile app ourselves.

So, after some point in time, what we said was we actually have a fairly robust and complex software infrastructure. We have a number of microservices that are doing various things to facilitate the operation of the business, and something that we need to do is we need to spend a little bit of time and make sure that we're building this in a cohesive way, right?
And what part of that means was that, for example, we had to take a step back and say, “Okay, we need to have a unified identity service.” We can't have a different identity—or we can't have every single individual service creating its own identity. We need to have—Corey: I really wish you could pass that lesson out on some of the AWS service teams.Dan: [laugh]. Yes, I know. I know. Yeah. So, we went through—Corey: So, there were some questionable choices you made in there, like you started that with the beginning of, “Well, we had no time which is fine and no budget. So, we chose AWS.” It's like, “Oh, that looks like the exact opposite direction of a great decision, given, you know, my view on it.” Stepping past that entirely, you are also dealing with challenges that I don't think map very well to things that exist in the corporate world. For example, you said you had to build a donor vetting pipeline.It's in the corporate world I didn't have it. It's one of those, “Why in the world would I get in the way of people trying to give me money?” And the obvious answer in your case is, federal law, and it turns out that the best outcome generally does not involve serving prison time. So, you have to address these things in ways that don't necessarily have a one-to-one analog in other spaces.Dan: That's true. That's true. Yes, correct to the federal law thing. Our more pressing reason to do this kind of thing was that we made a commitment very early on in the campaign that we wouldn't take money from executives of the gas and oil industry, for example. There were another bunch of other commitments that were made, but it was inconceivable for us to have enough people that could possibly go manually through those filings. 
So, for us to be able to build an automated system for doing that meant that we were literally saving thousands of human hours and still getting a beneficial result out of it.Corey: And everything you do is subject to intense scrutiny by folks who are willing to make hay out of anything. If it had leaked at the time, I would have absolutely done some ridiculous nonsense thing about, “Ah, clearly looking at this AWS bill. Joe Biden's supports managed NAT gateway data processing pricing.” And it's absolutely not, but that doesn't stop people from making hay about this because headlines are going to be headlines.And do you have to also deal with the interesting aspect—industrial espionage is always kind of a thing, but by and large most companies don't have to worry that effectively half of the population is diametrically opposed to the thing it is that they're trying to do to the point where they might very well try to get insiders there to start leaking things out. Everything you do has to be built with optics in mind, working under tight constraints, and it seems like an almost insurmountable challenge except for the fact where you actually pulled it off.Dan: Yeah. Yeah. Yeah. We kept saying that the tech was not the story, right, and we wanted to do everything within our power to keep the conversation on the candidate and not on emails or AWS bills or any of that kind of stuff. And so we were very intentional about a lot of the decisions that we ended up making with the idea that if the optics are bad, we pull away from the primary mission of what it is that we're trying to do.Corey: So, what was it that qualified you to be the CTO of a—at the time very fledgling and uncertain campaign, given that you were coming from a role where you were a distinguished engineer, which is not nothing, let's be clear, but it's an executive-level of role rather than a hands-on level of role as CTO. 
And then if we go back in time, you were one of the founding developers of Spinnaker over at Netflix.And I have a lot of thoughts about Netflix technology and a lot of thoughts about Spinnaker as well, and none of those thoughts are, “This seems like a reasonable architecture I should roll out for a presidential campaign.” So, please, don't take this as the insult that probably sounds like, but why were you the CTO that got tapped?Dan: Great question. And I think in some ways, right place, right time. But in other ways probably needs to speak a little bit to the journey of how I've gotten anywhere in my career. So, going back to Netflix, yeah, so I worked in Netflix. I had the opportunity to work with a lot of incredibly bright and talented folks there. One of the people in particular who I met there and became friends with was Corey Bertram who worked on the core SRE team.Corey left Netflix to go off and at the time he was just like, “I'm going to go do a political startup.” The interesting thing about Netflix at the time—this was 2013, so, this was just after the Obama for America '12 campaign. And a bunch of folks from OFA world came and worked at Netflix and a variety of other organizations in the Bay Area. Corey was not one of those people but we were very well-connected with folks in that world, and Corey said he was going off to do a political startup, and so after my non-mutual departure from Netflix, I was talking to Corey and he said, “Hey, why don't you come over and help us figure out how to do continuous delivery over on the political startup.” That political startup turned into the groundwork which turned into essentially the tech platform for the Hillary for America campaign.So, I had the opportunity working for the groundwork to work very closely with the folks in the technology organization at HFA. And that got me more exposure to what that world is and more connections into that space. 
And the groundwork was run by Corey, but was the CEO or head—I don't even know what he called himself, was Michael Slaby, who was President Obama's CTO in 2008 and had a bigger technical role in the 2012 campaign.And so, for his involvement in HFA '16 meant that he was a person who was very well connected for the 2020 campaign. And when we were out at a political conference in late 2018 and he said, “Hey, I think that Vice President Biden is going to run. Do you have any interest in talking with his team?” And I said, “Yes, absolutely. Please introduce me.”And I had a couple of conversations with Greg Schultz who was the campaign manager and we just hit it off. And it was a really great fit. Greg was an excellent leader. He was a real visionary, exactly the person that President Biden needed. And he brought me in to set up the tech operation and get everything to where we ultimately won the primary and won the election after that.Corey: And then, as all things do, it ended and the question then becomes, “Great, what's next?” And the answer for you was apparently, “Okay, I'm going to go back to Target-ish.” Although now you're the CISO of a Target subsidiary, Shipt and Target's relationship is—again, I imagine I have that correct as far as you are in fact a subsidiary of Target, so it wasn't exactly a new company, but rather a transition into the previous organization you were in a different role.Dan: Yeah, correct. Yeah, it's a different department inside of Target, but my paycheck still come from Target. [laugh].Corey: So, what was it that inspired you to go into the CISO role? Because obviously security is everyone's job, which is what everyone says, which is why we get away with treating it like it's nobody's job because shared responsibilities tend to work out that way.Dan: Yeah.Corey: And you've done an awful lot of stuff that was not historically deeply security-centric although there's always an element passing through it. 
Now, going into a CISO role as someone without a deep InfoSec background that I'm aware of, what drove that? How did that work?Dan: You know, I think the most correct answer is that security has always been in my blood. I think like most people who started out—Corey: There are medications for that now.Dan: Yeah, [laugh] good. I might need them. [laugh]. I think like most folks who are kind of my era who started seriously getting into software development and computer system administration in the late ‘90s, early thousands, cybersecurity it wasn't called cybersecurity at the time. It wasn't even called InfoSec, right, it was just called, I don't know, dabbling or something. But that was a gateway for getting into Linux system administration, network engineering, so forth and so on.And for a short period of time I became—when I was getting my RHCE certification way back in the day, I became pretty entrenched in network security and that was a really big focus area that I spent a lot of time on and I got whatever the supplemental network security certification from Red Hat was at the time. And then I realized pretty quickly that the world isn't going to need box operators for very long, and this was just before the DevOps revolution had really come around and more and more things were automated.So, we were still doing hand deployments. I was still dropping WAR files onto a file system and restarting Apache. That was our deployment process. 
And I saw the writing on the wall and I said, “If I don't dedicate myself to becoming first and foremost a software engineer, then I'm not going to have a very good time in technology here.” So, I jumped out of that and I got into software development, and so that's where my software engineering career evolved out of.So, when I was CTO for the campaign, I like to tell people that I was a hundred percent of CTO, I was a hundred percent a CIO, and I was a hundred percent of CISO for the first 514 days of the campaign or whatever it was. So, I was 300 percent doing all of the top-level technology jobs for the campaign, but cybersecurity was without a doubt the one that we would drop everything for every single time.And that was by necessity; we were constantly under attack on the campaign. And a lot of my headspace during that period of time was dedicated to how do we make sure that we're doing things in the most secure way? So, when I left—when I came back into Target and I came back in as a distinguished engineer there were some areas that they were hoping that I could contribute positively and help move a couple of things along.The idea always the whole time was going to be for me to jump into a leadership position. And I got a call one day from Rich Agostino who's the CISO for Target and he said, “Hey, Shipt needs a cybersecurity operation built out and you're looking for a leadership role. Would you be interested in doing this?” And believe it or not, I had missed the world of cybersecurity so much that when the opportunity came up I said, “Yes, absolutely. I'll dive in head first.” And so that was the path for getting there.Corey: This episode is sponsored by our friends at Oracle HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. 
Although I insist on calling it “my squirrel.” While MySQL has long been the world's most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP (don't ask me to ever say those acronyms again) workloads directly from your MySQL database and eliminate the time-consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.

Corey: My take on the cybersecurity space is, I think, a little different than most people's journeys through it. The reason I started a Thursday edition of the Last Week in AWS newsletter, covering the security happenings in the AWS ecosystem for folks who don't have the word security in their job titles, is that I used to dabble in that space a fair bit. The problem I found is that as you move up the ladder to executives that are directors, VPs, and CISOs, the language changes significantly.

And it almost becomes a dialect of corporate-speak that I find borderline impenetrable, versus the real-world terminology we're talking about when, “Okay, let's make sure that we rotate credentials on a reasonable expected basis where it makes sense,” et cetera, et cetera. It almost becomes much more of a box-checking compliance exercise slash layering on as much as you possibly can for plausible deniability for the inevitable breach that one day hits, instead of actually driving towards better outcomes.

And I understand that's a cynical, strange perspective, but I started talking to people about this, and I'm very far from alone in that, which is why people are subscribing to that newsletter and that's the corner of the market I wanted to start speaking to.
So, given that you've been an engineer practitioner trying to build things and now a security executive as well, is my assessment of the further higher up you go the entire messaging and purpose change, or is that just someone who's been in the trenches for too long and hasn't been on that side of the world, and I have a certain lack of perspective that would make this all very clear. Which I freely accept, if that's the case.Dan: No, I think that you're right for a lot of organizations. I think that that's a hundred percent true, and it is exactly as you described: a box-checking exercise for a lot of organizations. Something that's important to remember about Target is—Target was the subject of a data breach in 2012, and that was before there were data breaches every single day, right.Now, we look at a data breach and we say that's just going to happen, right, that's the cost of doing business. But back in 2012 it was really a very big story and it was a very big deal, and there was quite a bit of activity in the Target technology world after that breach. So, it reshaped the culture quite literally, new executives were brought in, but there's this whole world of folks inside of Target who have never forgotten that, right, and work day-in and day-out to make sure that we don't have another breach.So, security at Target is a main centrally thought about kind of thing. So, it's very much something that is a part of the way that people operate inside of Target. So, coming over to Shipt, obviously, Shipt is—it is a subsidiary. It is a part of Target, but it doesn't have that long history and hasn't had that same kind of experience. The biggest thing that we really needed at Shipt is first and foremost to get the program established, right. So, I'm three or four months onto the job now and we've tripled the team size. 
I've been—Corey: And you've stayed out of the headlines, which is basically the biggest and most accurate breach indicator I've found so far.Dan: So far so good. Well, but the thing that we want to do though is to be able to bring that same kind of focus of importance that Target has on cybersecurity into the world of engineering at Shipt. And it's not just a compliance game, and it's not just a thing where we're just trying to say that we have it. We're actually trying to make sure that as we go forward we've got all these best practices from an organization that's been through the bad stuff that we can adopt into our day-to-day and kind of get it done.When we talk about it at an executive level, obviously we're not talking about the penetration tests done by the red team the earlier day, right. We're not calling any of that stuff out in particular. But we do try to summarize it in a way that makes it clear that the thing that we're trying to do is build a security-minded culture and not just check some boxes and make sure that we have the appropriate titles in the appropriate places so that our insurance rates go down, right. We're actually trying to keep people safe.Corey: There's a lot to be said for that. With the Target breach back in—I want to say 2012, was it?Dan: 2012. Yep.Corey: Again, it was a wake-up call and the argument that I've always seen is that everyone is vulnerable—just depends on how much work it's going to take to get there. And for, credit where due, there was a complete rotation in the executive levels which whether that's fair or not, I—people have different opinions on it; my belief has always been you own the responsibility, regardless of who's doing the work.And there's no one as fanatical as a convert, on some level, and you've clearly been doing a lot of things in the right direction. The thing that always surprises me is that when I wind up seeing these surveys in the industry that—what is it? 
65% of companies say that they would be vulnerable to a breach, and everybody said, “Oh, we should definitely look at those companies.” My argument is, “Hang on a sec. I want to talk to the 35% who say, ‘Oh, we're impenetrable,'” because, spoiler, you are not.

No one is. It's just a question of how heavy the lift is and how much work it's going to take to get there. I do know that mouthing off in public about how perfect the security of anything is, is the best way to more or less climb to the top of a mountain during a thunderstorm, hold up a giant metal rod, and curse the name of God. It doesn't lead to positive outcomes, basically ever. In turn, this also leads to companies not talking about security openly.

I find that in many cases it is easier for me to get people to talk about their AWS bills than their InfoSec posture. And I do believe, incidentally, those two things are not entirely unrelated, but how do you view it? It was surprisingly easy to get Shipt's CISO to have a conversation with me here on this podcast. It is significantly more challenging in most other companies.

Dan: Well, in fairness, you've been asking me for about two-and-a-half years pretty regularly [laugh] to come.

Corey: And I always say I will stop bothering you if you want. You said, “No, no. Ask me again in a few months. Ask me again after the election. Ask me again after—I don't know, like, the one-day delivery thing gets sorted out.” Whatever it happens to be. And that's fine. I follow up religiously, and eventually I can wear people down by being polite yet persistent.

Dan: So, your persistence is actually to credit here. No, I think to your question though, I think that there's a good balance. There's a good balance in being open about what it is that you're trying to do versus over-sharing areas that maybe you're less proficient in, right. So, it wouldn't make a lot of sense for me to come on here and tell you the areas that we need to develop in security.
But on the other side of things, I am very happy to come in and talk to you about how our incident response plan is evolving, right, and what our plan looks like for doing all of that kind of stuff.Some of the best security practitioners who I've worked with in the world will tell you that you're not going to prevent a breach from a motivated attacker, and your job as CISO is to make sure that your response is appropriate, right, more so than anything. So, our incident response areas where today we're dedicating quite a bit of effort to build up our proficiency, and that's a very important aspect of the cybersecurity program that we're trying to build here.Corey: And unlike the early days of a campaign, you still have to be ultra-conscious about security, but now you have the luxury of actually being able to hire security staff because it turns out that, “Please come volunteer here,” is not presumably Shipt's hiring pitch.Dan: That's correct. Yeah, exactly. We have a lot of buy-in from the rest of leadership to build out this program. Shipt's history with cybersecurity is one where there were a couple of folks who did a remarkably good job for just being two or three of them for a really long period of time who ran the cybersecurity operation very much was not a part of the engineering culture at Shipt, but there still was coverage.Those folks left earlier in the year, all of them, simultaneously, unfortunately. And that's sort of how the position became open to me in the first place. But it also meant that I was quite literally starting with next to nothing, right. 
And from that standpoint it made it feel a lot like the early days of the campaign because I was having to build a team from scratch and having to get people motivated to come and work on this thing that had kind of an unknown future roadmap associated with it and all of that kind of stuff.But we've been very privileged to—because we have that leadership support we're able to pay market rates and actually hire qualified and capable and competent engineers and engineering leaders to help build out the aspects of this program that we need. And like I said, we've managed to—we weren't exactly at zero when I walked in the door. So, when I say we were able to quadruple the team, it doesn't mean that we just added four zeros there, [laugh] but we've got a little bit over a dozen people focusing on all areas of security for the business that we can think of. And that's just going to continue to grow. So, it's exciting; it's a challenge. But having the support of the entire organization behind something like this really, really helps a lot.Corey: I know we're running out of time for a lot of the interview, but one more question I want to ask you about is, when you're the CISO for a nationally known politician who is running for the highest office, the risk inherent to getting it wrong is massive. This is one of those mistakes will show indelibly for the rest of, well, one would argue US history, you could arguably say that there will be consequences that go that far out.On the other side of it, once you're done on the campaign you're now the CISO at Shipt. And I am not in any way insinuating that the security of your customers, and your partners, and your data across the board is important. 
But it does not seem to me from the outside that it has the same, “If we get this wrong there are repercussions that will extend into my grandchildren's time.” How do you find that your ability to care as deeply about this has changed, if it has?

Dan: My stress levels are a lot lower, I'll say that, but—

Corey: You can always spot the veterans on an SRE team because—when I say veterans I mean veterans from the armed forces because, “No one's shooting at me. We can't serve ads right now. I'm really not going to run around and scream like, ‘My hair's on fire,' because this is nothing compared to what stress can look like.” And yeah, there's always a worse stressor, but, on some level, it feels like it would be an asset. And again, this is not to suggest you don't take security seriously. I want to be very clear on that point.

Dan: Yeah, yeah, no. The important challenge of the role is building this out in a way that we have coverage over all the areas that we really need, right, and that is actually the kind of stuff that I enjoy quite a bit. I enjoy starting a program. I enjoy seeing a program come to fruition. I enjoy helping other people build their careers out, and so I have a number of folks who are at earlier points in their career who I'm very happy that we have on our team because I can see them grow and I can see them understand and set up what the next thing for them to do is.

And so when I look at the day-to-day here, I was motivated on the campaign by that reality of like there is some quite literal life or death stuff that is going to happen here. And that's a really strong pressure to make sure that you're doing all the right stuff at the right time.
In this case, my motivation is different because I actually enjoy building this kind of stuff out and making sure that we're doing all the right stuff and not having the stress of, like, this could be the end of the world if we get this wrong.Means that I can spend time focusing on making sure that the program is coming together as it should, and getting joy from seeing the program come together is where a lot of that motivation is coming from today. So, it's just different, right? It's a different thing, but at the end of the day it's very rewarding and I'm enjoying it and can see this continuing on for quite some time.Corey: And I look forward to ideally getting you back in another two-and-a-half years after I began badgering you in two hours in order to come back on the show. If—Dan: [laugh].Corey: —people want to hear more about what you're up to, how you view about these things, potentially consider working with you, where can they find you?Dan: Best place although I've not been as active because it has been very busy the last couple of months, but find me on Twitter, @danveloper, find me on LinkedIn. Those—you know, I posted a couple of blog posts about the technology choices that we made on the campaign that I think folks find interesting, and periodically I'll share out my thoughts on Twitter about whatever the most current thing is, Kubernetes or AWS about to go down or something along those lines. So, yeah, that's the best way. And I tweet out all the jobs and post all the jobs that we're hiring for on LinkedIn and all of that kind of stuff. So, usual social channels. Just not Facebook.Corey: Amen to that. And I will of course include links to those things in the [show notes 00:37:29]. Thank you so much for taking the time to speak with me. I appreciate it.Dan: Thank you, Corey.Corey: Dan Woods, CISO and VP of Cybersecurity at Shipt, also formerly of the Biden campaign because wherever he goes he clearly paints a target on his back. 
I'm Cloud Economist, Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast please leave a five-star review on your podcast platform of choice along with an incoherent rant that is no doubt tied to either politics or the alternate form of politics: Spinnaker.Dan: [laugh].Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

We apologise for the inconvenience

Hello! This time we work out that we would like to run the Horse and Groom pub, with the narrowest possible selection of drinks, but we also realise that we can't do it on our own. We do conclude that we are friendlier than the people of Krikkit, and also that most people, if you expose them to a media bombardment, are capable of becoming people of Krikkit. We look forward to 2022 with joy, but also with fear, and Aljo explains why it is better to be a pessimist: if something good happens, you are pleasantly surprised. Our heroes are back in the bistromathic restaurant; otherwise there isn't much to the chapter, except that Slartibartfast sums up how the people of Krikkit became, overnight, manic enemies of everything foreign. The New Year's Eve extra also covers a conspiracy theory in which plants are our overlords, we talk about what to do if (when) the Vogons arrive, and we mention a John Lennon quote. We also talk about how we like that Douglas spares no one and makes fun of everyone and everything. We conclude that The Hitchhiker's Guide to the Galaxy is a more timeless book than the Bible. Happy 2022 and thanks for everything.

Noticentro
Mexico City Electoral Institute orders sanction against capital authorities

Play Episode Listen Later Dec 24, 2021 1:26


Mexico City Electoral Institute orders sanction against capital authorities
SRE celebrates the anniversary of the arrival of vaccines in the country
39 people died and 70 were injured in a fire on a boat in southern Bangladesh

Changelog Master Feed

Merry Shipmas! This is our special Christmas episode which sums up two months of very early mornings and a few late nights. After many twists and turns, stuff which didn't work out, as well as pleasant surprises, this is what we ended up with:

Screaming in the Cloud
Into the Great Wide Open Source with Julia Ferraioli

Play Episode Listen Later Dec 23, 2021 40:19


About Julia

Julia Ferraioli calls herself an Open Source Archaeologist, focusing on sustainability, tooling, and research. Her background includes research in machine learning, robotics, HCI, and accessibility. Julia finds energy in developing creative demos, creating beautiful documents, and rainbow sprinkles. She's also a fierce supporter of LaTeX, the Oxford comma, and small pull requests.

Links:
Open Source Stories: https://www.opensourcestories.org

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers (and I assure you there is no third option there), Kubernetes clusters, databases, and internal applications like the AWS Management Console, Jenkins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And no, that is not me telling you to go away, it is: goteleport.com.

Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database that is not the BIND DNS server.
If you're tired of managing open source Redis on your own, or you're using one of the vanilla cloud caching services, these folks have you covered with the go-to managed Redis service for global caching and primary database capabilities: Redis Enterprise. To learn more and deploy not only a cache but a single operational data platform for one Redis experience, visit redis.com/hero. That's r-e-d-i-s.com/hero. And my thanks to my friends at Redis for sponsoring my ridiculous nonsense.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is someone I have been very politely badgering to come on the show for a while, ever since I saw her speak a couple years ago in the Before Times, at Monktoberfest. As I've said before, anytime the RedMonk folks are involved in something, it is something you probably want to be involved in. That is my new guiding star philosophy when it comes to conferences, Twitter threads, opinions, breakfast cereals, you name it. Please welcome Julia Ferraioli, the co-founder of Open Source Stories. Julia, thank you for joining me today.

Julia: Thank you for having me. And I definitely agree on the RedMonk side of things. They are fantastic folk.

Corey: They're a small company, which is sort of interesting to me from a perspective of just how outsized their impact on this entire industry is. But I've had as many of them as they will let me have on the show. They are welcome to come back whenever they want, just because every single one of them, though they're very different from one another, makes everyone around them better with their presence. And that's just a hard thing to see. I didn't mean to turn this into a love letter to RedMonk, but here we are.

Julia: I don't mind it. They have the ability to amplify the goodness that they see, anything from their survey designs to just how they interact online.
It's wonderful to see.Corey: Speaking of amplifications, you are the co-founder of Open Source Stories, the idea of telling the—to my understanding—the stories behind open source. Like this is sort of like—what is it, Behind the Music, only in this case it's Behind the Code? I mean, how do you envision this?Julia: Oh, I like that framing. So, Open Source Stories is a project that myself and Amanda Casari founded not that terribly long ago because when we were doing research about how to model open source and open source ecosystems, we realized that a lot of the research papers that have been published about open source are pulled mostly from GitHub Archive, which is this repository of GitHub data. It could be the actual Git commit history as well as the activity streams from GitHub as well, but that doesn't capture a lot of the nuances behind open source, things like the narratives, how communities interact, where communication is happening, et cetera. All of these things can happen outside of the hosting platform. So, we launched this project to help tell these stories of the people and events and scenarios behind the open source projects that really power our industry.Corey: I'm going to get letters for this one, I'm sure of it, but I've been involved in the open source ecosystem for a while and I've noticed that there's been a recurring theme among various projects, particularly the more passionate folks working on them, where they talk an awful lot but they aren't very good at telling stories at the same time. And nowhere is this more evident than when we look at what passes for a lot of these projects' documentation. One of the transformative talks that I went to was Jordan Sissel's years and years ago, at the Southern California Linux Expo. And it was a talk about LogStash, which doesn't actually matter because the part that he'd said that really resonated with me, that his whole theme of his talk was around, was if a new user has a bad time, it's a bug. 
And the idea that, “Oh, you didn't read the documentation properly.” When I started working with Linux, in some IRC chat rooms, the standard response to someone asking for help was to assume that they're an idiot, begin immediately accosting them with RTFM, for Read the Frickin' Manual, and then look for ways that you could turn this back around on them and make it their fault. And I looked at this and at the time, it's like, “Wow, these are people that are mean to other people,” and I was a small, angry teenager; it's like, “This is my jam. Here I am.” And yeah, many decades later, I'm looking at this and I feel a sense of shame because that's not the energy I want to put into the world. A lot of those communities have evolved and grown and what used to be the area and arena for hobbyists is now powering trillion-dollar companies. Julia: Absolutely. I like the whole, “If the user has a bad experience, that's a bug,” because it absolutely is. And I feel like a lot of these projects haven't invested nearly as much into the user experience as they have into polishing the code. And the attitude that that kind of perpetuates throughout the project about how you treat your users, it's pervasive and it really sets up the types of features that you develop, the contributors that you encourage to commit to the project, and it just creates a—to put it mildly—less than welcoming environment for users, contributors, maintainers alike. 
And we don't really need that sort of hostility, especially when we're talking about projects that underpin the foundations, in some cases, of the internet.Corey: When we look at what open source is, I mean, I shortcut to thinking in terms of the context through which I've always approached it, which was generally code, or in my sad, particular story, back in the olden days on good freenode, when that was where a lot of this discourse happened, I was network staff and helping a bunch of different communities get channels set up through a Byzantine process. Because of course there was a Byzantine process; it was an open source community, and if there's one thing we love in open source, it is pretending to be lawyers when we're not. And we're sort of cargo-culting what we think process and procedure often look like. So yeah, there was a bunch of nonsensical paperwork happening there, but it was mostly about helping folks collaborate and communicate. But I've first and foremost, think in terms of code and in terms of community. What is open source to you?Julia: Well, I entered open source in the Sourceforge days, when all you had to do was go and download some code from the internet and hit the right download button, make sure not to hit one of the extraneous ones. And all you need for that is for the code to be under the right license. And to an extent that's what's true today for open source. At the heart of it, this minimum criteria for what constitutes open source is, “Okay, does it comply with the open source definition that the Open Source Initiative puts forth?” Now, I understand that not everybody necessarily agrees with the Open Source definition, but it's useful as a shortcut for how we think about the basic requirements. But what I find when people are talking about open source online is that they have these very different models. 
You'll hear from people that, “Okay, well, if it doesn't have a standard governance model, it's not really open source.” Corey: The ‘No True Scotsman' argument. Julia: Yeah. So, I find that we've got these different expectations for what open source is, and that leads to us talking past each other or discounting different types of open source when what we really need to do is come up with better language, a better vocabulary, for how to talk about these things. So, for example, I used to work in developer relations, and in developer relations one of the big things that you do is release sample code. Now, oftentimes, I'm not looking for that sample code to be picked up by a bunch of different developers and incorporated as a library into their project—Corey: [laugh]. Well, that's your error in that case because congratulations, that's running in production at a bank somewhere, now. Julia: Oh, I know. And that has definitely happened with my code, and I'm ashamed to say that. [laugh]. But generally speaking, you're not looking to build a huge community around sample code, right? Corey: You say that, but that again, Stack Overflow, it was—Julia: Okay. Corey: —[unintelligible 00:09:22] done rather well. So, there's that. Julia: Well yes, that is true, but when you release code on Stack Overflow, or GitHub, or in a Gist, or just on your blog, the thing that allows the bank to come in and incorporate that into their own application, or to even just learn from it, is the fact that it is open source. Now, it doesn't have a lot of the things that a community like Python or Kubernetes has, but it is still open source; it just has a different purpose than those communities and those ecosystems. Corey: So, I think it is challenging right now to talk about open source as if it were the same type of thing that it was back in the '90s, and the naughts—and even the teens—where it's a bunch of, more or less either hobbyists or people who are perceived to be hobbyists. 
Sure, an awful lot of them are making commits from their redhat.com email address, but okay. And some of these people are increasingly being paid to work places, but then you see almost—I don't necessarily agree with the framing of The New York Times article by Daisuke Wakabayashi—who's a previous guest on the show—of Amazon strip-mining open source, but they definitely are in there—and other companies as well—are sort of appropriating it, or subverting it, or turning it into something that it was not previously, for lack of a better term. What's your take on that?Julia: Oh, that's a hard one. From a fundamentals perspective, that is absolutely within their rights under the definition of open source, and in some cases, the spirit of open source as well.Corey: Oh, and I would argue with someone who said that they should be constrained from doing this as far as a matter of legalities, or rights, or ridiculous Looney Tunes license changes.Julia: Well, there are definitely folks who are trying to make that the case.Corey: Yeah. Oh, yeah. I'm on the position of, they're within their rights to do it, but it's time for a good old fashioned public shunning as a result.Julia: I'm not sure I agree. I think that it is a natural consequence of how open source has gained in popularity and, in some cases, it's a testament to open source's success. Now, does it pose some serious challenges for the open source community and open source ecosystem? Absolutely because this is a new way of using open source that was unanticipated, and in fact, could be characterized as a Black Swan event in [open source-ware 00:12:18].Corey: The fundamental attribution error that I see, back at the very beginning, was that what we wrote the software, therefore, we are the best in the world at running it, therefore, if there's going to be a managed service, clearly ours will be the best. 
Amazon's core strength has apparently been operational excellence as they like to call it; my position on that is a little bit less of tying into the mystery, a little bit more of they're really fast and getting paged and fixing things in a hurry before customers notice. So okay, great, but it's column A, column B, whatever. The bigger concern I have with Amazon as its product strategy is, “Yes.” If it were just a way to run EC2 instances or virtual machines, then sure, that's great.And every open source project should, on some level, see some validation of its market through a lens of, “Oh, we're getting some competition. That's great.” The challenge I see is that in the line of competitors, Amazon is at or near the front all the time on basically everything. And it's if they would pick a lane to stay in, great.Google is a good example of this. There are things that Google very strongly considers in its wheelhouse, but for other things, they partner with the open source-based company in question to create a managed service partner offering and that's great. Amazon pulls a, “Nope. We're just going to build this out as first-party. The end.”And they compete with everyone, including themselves on almost every axis. And that's where it just gets into a, “Leave some oxygen for the rest of us.” I mean, it feels like they lie awake at night worrying that someone who isn't them somehow making money somewhere. That is, I think, on some level, more of the Black Swan event than someone else deciding that they can host a particular open source project more effectively. But that's where I stand. And again, this is just me as an enthusiastic and obnoxious observer. You're operating in this space. What do you think? That's the important part of the story.Julia: Well, I mean, you definitely have a point, Amazon—or AWS, maybe not necessarily Amazon—takes on different technologies far and wide, so they're not limiting themselves to a space. 
But that said, I think it comes down less to what is possible with open source and what is okay under the guise of open source, and what is good for the open source ecosystem. And when you fork a project, you do have to understand that you are bifurcating the open source ecosystem. And that can lead to sustainability problems down the road. So, I think the jury is still out on whether forking a project, running it as a managed service—as Amazon is doing with some of the open source projects—if that's going to come back to bite them just from a developer community standpoint because you're going to have people committing to one or the other, but possibly not both. Corey: I think this is why Amazon—I know, they're very annoyed by their perception in the open source ecosystem, but you take a look at other large tech companies, and almost all of them have a few notable open source projects that started life there. For example, we have—I think Cassandra came out of Facebook, but don't quote me on that; Kubernetes came out of Google, a fact for which they steadfastly refused to apologize, so far; and so on, and so forth. But Amazon's open source initiatives have been, “We've open sourced this thing that is basically only used at Amazon.” Or, my personal favorite, we've put all of our documentation up on GitHub so that you can write corrections to it yourself from the community, which I'm hearing as, “Please, volunteer for a $1.6 trillion company so that they don't have to improve their documentation by hiring expensive people internally.” You can sort of guess my position on that. It seems like they have not launched anything that has a deep heart within Amazon that is broadly adopted outside of their walls. My question for you is, do you believe that having that level of adoption externally is required for a healthy open source project? Julia: Again, I think it goes back to the goals of why you're open-sourcing something. 
I don't believe that it's necessarily required for the open source project to be quality and be usable, but if your goal is adoption or if your goal is to get ideas and best practices out there, then yeah, you do need that engagement by the broader community, you do need the contributors. But there are a lot of cases where open-sourcing technology is more for the validation, rather than the adoption of the tech. So, it really depends.Corey: I'd say the most cynical reason I've seen to open source things comes from Netflix, where they have a recurring pattern of open-sourcing something, there are two or three commits, and then it basically sits there unattended. What I firmly believe is happening is that a senior engineer at Netflix is working on the thing and they're about to change jobs, so they open source the project so that they can change jobs and then pick up where they left off with an internal fork, I view it as a game of, basically, they're passing themselves a football as they run across the street. And people laugh when I say that, but I've also had people over drinks say, “You are closer than you might think, sometimes.” Which on some level is terrifying. Feels like life is imitating art, but here we go.Julia: That definitely happens, and I have seen it [laugh] as well. People want to essentially use open source to exfiltrate IP.Corey: Yeah. Only doing it legitimate way as opposed to the, “Please don't—hope they don't find that USB stick I've hidden in my sock on my last day.”Julia: Yes. And this is why open source offices have a challenging job in helping facilitate the release of open source software. So, it is hard to ascertain when that is happening.Corey: Yeah, no company is ever going to have a big statement that is going to be anything other than, honestly, marketing speak when it comes time to explain why they're doing a certain thing. 
It's, “Oh, yeah, we're open-sourcing this so we don't get sued in three years by this other company that might prove to be a competitive threat.” Or, “We're open-sourcing this as a hiring and recruiting technique.” I mean, I would argue, it wasn't open source, but one of the best approaches that I've seen from that perspective came out of Google, I'm firmly convinced to this day that App Engine was run not by their SRE team, but by their recruiting arm, “Because if you can build a great app on App Engine, well, this is, kind of like, how we think about things inside of Google; come and work here,” either via acqui-hiring or a just outright interview funnel. Maybe that's too cynical, too, but again, that leads to the question of is it really open source when it has these deep ties to specific platforms?Here's an open source tool that presumes you're running on top of AWS. Well, great, sure it's built by the community and anyone can access these things, but without paying per second to a cloud provider, probably the referenced cloud provider they're developing this against, it's not going to get very far. So, it's a nuanced argument, and there are shades of that nuance to every aspect of it. And if there's one thing that Twitter is terrible at is capturing nuance in 280 characters. And even in the, “All right, this is my nuanced take on open source in this thread, I will tweet, one of 5,712.” Great. That's not really the forum for that either. And people lose sight of nuance. It's a sticky, delicate thing, and it feels like a lot of the open source community has been enthusiastically agreeing with each other—sometimes violently so—but they're not sharing a common language in which to do it.Julia: Yeah. And in terms of the purposes of open source projects, it is okay for them to have different ones as long as they're telegraphing those purposes to their users and the people who are looking at the projects for their own use. But whether it's open source? 
I think it's okay for that to be the baseline and then build out the vocabulary of the types of projects that you want from there, based on those expectations. Yes, this particular technology only works with this cloud provider. That's open source that facilitates and accelerates development with that cloud provider.Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself all while gaining the networking load, balancing and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free that's snark.cloud/oci-free.Corey: I always try and stay away from explicit value judgments on a lot of these things because it's nuanced, and no one who doesn't work at Facebook wakes up expecting to do terrible things today. We're all trying to do the best we can with the constraints are operating within. The challenge is that when you're at a company like an AWS, or a Google, or a Microsoft, or one of these giant companies, the same pressures that the rest of the quote-unquote “mere mortals” in ecosystem have to contend with are very different. 
But talking to people who work at these big companies, they have meetings and review processes that here at my twelve-person company, I don't even have to consider.Easy example of that: Never once have I put something out into the world and had a single discussion about is this going to get us in trouble with respect to antitrust? That has never been on my radar as far as things I have to care about. Even at my previous job at a highly regulated financial company, where you could argue that they are approaching monopoly status in some areas of the market organically, with passive investing being what it is, great, their open source discussions were always much more aligned with what licenses are we willing to accept legal risk for using internally? Because there are things that are—like IP is why we have a business in many respects, so anything that touches that theoretically means we'd have to disclose how the entire system, how the rest of it works, is not allowed to be used here. And there are reviews and processes and compliance requirements for that.I get that concern, and at a certain point of scale, you're negligent if you don't have a function that looks at it through that lens. But I look back to the early days of just puttering around with, “I want to do a thing and I found this project somewhere that people are excited about,” in the pre-GitHub days, I can download it off as Sourceforge or whatnot and I can make it work. And but it doesn't do this one thing I want to do, “Hey, the code's available. Can I fix it myself? Absolutely not. I'm crap at writing code. But I can talk to people and piece it together from wisdom that they offer.” And it turns into something awful until finally it gets enough traction that someone who knows what they're doing looks at it and refactors and it makes it good.And that's the open source community I recognize and that I see from my early developmental period. 
I don't recognize what we see in the ecosystem today through that same lens of, “Okay, go online. Be nice to people”—well, that's new—“See how this thing works. And oh, if I'm having a problem, I'm probably not the only person who's having a problem like this.” You have to get really good at using Google more than you do at writing code in some respects. But at that point, it's almost entirely a copy-and-paste, except that's not technical enough for the open source world. So instead, we have to learn the 500 arcane subcommands to Git in order to get it out there. But it works. Ish. Julia: I think that community is still out there. I really do. I think that it is harder to find and it's not necessarily where you might tend to look, but those projects are still there. They're still running. They might be a little less high-profile than a lot of the ones that are getting a lot of attention right now, but they are still there. Corey: On some level, it feels like the blame for this lies—at least partially—at the feet of Slack and its success because it used to be that you had IRC, that was how folks communicated. And I remember the early days of that and things like Jabber or internal servers, grea—or internal IRC servers at companies—great, you'd have engineering all talking on that, and oh, you want to have someone in finance or marketing join that thing? Yeah, the short answer is, that won't be happening. But you can try and delude yourself and set it up with a special client and the rest. Slack removed all of that friction, but it's balkanized to the point where every once in a while, I have to go through and remove a bunch of Slack channels slash workspaces slash whatever we're calling them this week from my desktop client because it's basically eating all the RAM like it's trying to be Google Chrome. And then it's great, but there's no universal federated thing the way that there was with IRC where I just pop in a different channel for a different project. 
And IRC is still there and it comes back to life whenever Slack takes an outage. And then Slack gets fixed, it sort of bleeds off again. But I don't want to be in 500 different Slack workspaces, one for every open source project that I'm using, and there's no coherent sense of identity and community anymore the way there once was. And I feel like I'm old man yelling at the passing of time at this. But you're right, open source to me was always much more about community than it was about code.Julia: Yeah, and I think that we do not talk about the impact of tools for open source that we use. Because you're right; with IRC, it was unified. You could pretty much guarantee that projects of a certain size were present there. And with Slack, you have to sign up for yet another account, not quite yet sure why I can't find the right channels that I need to join in Slack. So, there's a lot of navigation and a lot of prerequisite knowledge that you need to have in order to be productive.And then you've got other tools being used for communication by other communities like, I believe Gitter is a major one as well. Then you have to make sure that you're up-to-date with all of these different interfaces, Discord, everything. And the sociological implication of that shouldn't be underestimated. What are you going to do if you find a project that uses a communication tool that you just really don't want to use or don't want to sign up for yet another account? Maybe you pass on by and you find one that works within your existing set of tools. There aren't a lack of open source projects to join right now. You can be choosy. And we don't yet know what the impact is of that.Corey: It's challenging. There's no good answer that I found that solves all of these things. 
It's become so balkanized, on some level, that every project out there that I see—and there are some small ones that are incredibly foundational to, basically, civilization as we know it, but it's not working right because it's you have to figure out where they are and what the community norms are because they change from project to project, and there are so many different things. And, like, you can go into NPM and install some relatively trivial thing that does command-line string processing, or whatnot, and it installs 40 different dependencies. And there's a problem and you want to figure out exactly how that works, and et cetera, et cetera, et cetera.Julia: Absolutely. With NPM specifically, or Node specifically, it is interesting that the development model kind of encourages this obscurity, an obfuscation of a functionality. So, it is hard to go in, debug an issue, go to the specific community, understand how they work, contribute a patch, just to fix something that is, you know, five levels up. It gets confusing for developers. It can contribute to longer-term bugs that we see propagate throughout the system. It is not an easy problem to solve, and I have a lot of sympathy for newcomers to the open source ecosystem because it is so hard to navigate. And I think that's an as yet unsolved problem that we need to address.Corey: So, what was it that inspired you to create Open Source Stories? I mean, I love the direction you're taking this in; I love the way you're thinking about [audio break 00:29:38]. Where did it come from? What started this?Julia: Well, when Amanda and I were going back and doing research around—you know, aside from the code for an open source project, where are the different entry points? Where are the different interaction points between projects, ecosystems, and the industry? And we did a couple of interviews, just very organic interviews, with some subject matter experts in Node, in Python, in Go. 
And there was a point where we stopped—or at least I stopped taking notes because I was just so fascinated by the narrative that our interviewee was putting forth and was talking about. And what we wanted was for it to not just be this meeting between a few people, we wanted to be able to share that with anyone. And so one of the things that really inspired us was StoryCorps, which allows you to record, much like we're doing today, 40 minutes worth of interactions between one to three people.Corey: Oh, we're going to cut it down to five minutes at most. Like, one question; one answer. Boom, we're done.Julia: [laugh].Corey: I kid, I kid.Julia: But it's really about facilitating the sharing of knowledge and sharing of these oral histories. Because as you're doing research into interactions in specific open source communities, you'll get articles, you'll get changelogs, all of that good stuff, but you won't get the nuance that we've been talking about over the course of this podcast. You lose the story behind the story, right? How are decisions made? How are people thinking about the interactions with their users? What are the turning points for a project? What are those conversations between the maintainers that changed the entire game?Those are the sorts of stories that we're hoping to capture because they're important for history, for knowledge sharing, for learning from our past, and making decisions for the future. And so that's really what we wanted to capture. And we wanted to capture the narratives behind the people that don't necessarily show up in the codebase, too: Talking about the designers, the product managers, the marketers behind open source that make it successful. Because there's so much more than code.Corey: Oh, my God, yes. It's… how do I put this politely without getting letters? Well, I guess I'll take a stab at it and see how it plays out. 
I look at so much of the brilliant code that has been written, and the documentation is abhorrent, and the design of the site, and the icon, and the interface, it looks like a joke that I put on Twitter trying to be funny. It's, the code is important, don't get me wrong, but there's so much more to it than that.And we see this in the industry, too, where companies have gone out of business, trying to get their codebase just right. It's, yeah, you can launch code that is really, really bad, but if you have product-market fit, it is survivable. I've heard stories in the early days of Twitter that we saw the fail whale all the time because it was an abhorrent monstrosity, to the point it became a running joke. But it turns out, when you hit product-market fit, you can afford really good engineers to come in and fix a lot of that stuff. That stuff is more important than the quality of the code, and that is something that I think that we have a collective industry-wide delusion about. And it's a blind spot for us.Julia: Yeah. I think we get wrapped up in the cleverness of the tech, and I've fallen prey to this, too. I get so involved in how I'm solving the problem and forget about the actual problem that I'm trying to solve, right? It's not necessarily about the how, but about the what. And without your fantastic tech writers, designers, usability experts, your open source project is going to be your open source project. It's not going to necessarily get that wide adoption, if that is indeed your goal for the technology that you're releasing.So, it really is about making sure that as we're launching and working on these open source projects and ecosystems, that we are inviting people to the table that have these other unique skills that goes beyond that code and speaks to what makes the project different and unique.Corey: I really want to say how much I appreciate your taking the time to talk to me about this. 
If people want to get involved themselves, how do they do that? Because I have a hard time accepting that you're doing something called Open Source Stories that eschews community involvement.Julia: Yeah. So, we absolutely would love more folks to get involved. I have been primarily the person working on the site, so we can always use contributors to the site itself, but we also want more storytellers and facilitators. And so if you go to opensourcestories.org, we've got a page specifically designed to facilitate contributions. So, check that out, and we look forward to hearing from anyone who wants to participate.Corey: And we will, of course, include links to that in the show notes. Thank you so much for taking the time to speak with me today. I really appreciate it.Julia: Thanks for having me.Corey: Julia Ferraioli, co-founder of Open Source Stories. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment, calling me a fool because I did not bother to RTFM first.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Gymnasium
Slovenian Greats 2022 (Slovenski velikani 2022), created in pencil technique

Gymnasium

Play Episode Listen Later Dec 22, 2021 51:48


Five students continued the work of their predecessors, through which Gimnazija Kranj aims to create a series of portraits of Slovenian women and men who left an important mark on their place and time. The portraits were created by five students, who were also our guests this time: Ula Simonič, Monika Sunesko, Klara Stariha and Lavra Šubic. The students drew the Slovenian greats in pencil technique (brown coloured pencil), and the travelling exhibition began its journey in the library of Gimnazija Kranj. Their mentor Nataša Kne also joined us. This year's portraits are: Alma Karlin (1889–1950), Rihard Jakopič (1869–1943), Kristina Gorišek Novaković (1906–1996), Nejc Zaplotnik (1952–1983), Oton Župančič (1878–1949), Kristina Brenk (1911–2009), Stanko Bloudek (1890–1959), Father Simon Ašič (1906–1992), Ita Rina (1907–1979), Josip Plemelj (1873–1967), Franja Bojc Bidovec (1913–1985), Janez Bleiweis (1808–1881). Altogether, since 2020 they have created 36 portraits of Slovenian greats. So far they have portrayed: 2020: Jože Plečnik, France Prešeren, Lili Novy, Jurij Vega, Ana Mayer-Kansky, Primož Trubar, Ivana Kobilca, Janez Puhar, Jakob Aljaž, Zofka Kveder, Rudolf Maister and Ivan Cankar. 2021: Edvard Rusjan, Veronika Deseniška, Srečko Kosovel, Ignacij Borštnik, Josipina Turnograjska, Simon Jenko, Angela Piskernik, Ivan Vurnik, Miki Muster, Leon Štukelj, Duša Počkaj and Herman Potočnik Noordung. 2022: Alma Karlin, Rihard Jakopič, Kristina Gorišek Novaković, Nejc Zaplotnik, Oton Župančič, Kristina Brenk, Stanko Bloudek, Father Simon Ašič, Ita Rina, Josip Plemelj, Franja Bojc Bidovec and Janez Bleiweis.

The 6 Figure Developer Podcast
Episode 225 – SRE is a Journey with Dave Stanke

The 6 Figure Developer Podcast

Play Episode Listen Later Dec 20, 2021 42:18


Dave Stanke joins us to talk all about Site Reliability Engineering. Dave is a Developer Relations Engineer with Google Cloud Platform specializing in DevOps, Site Reliability Engineering (SRE), and other flavors of technical relationship therapy. He loves chatting with practitioners: listening to stories, telling stories, sharing a healthy cry. Prior to Google, he was the CTO of OvationTix/TheaterMania, a SaaS startup in the performing arts industry, where he specialized in feeding memory to Java servers. He chose on purpose to live in New Jersey, where he enjoys baking, indie rock, and fatherhood.   Links https://stanke.dev/ https://twitter.com/davidstanke https://cloud.google.com/developers/advocates/dave-stanke   Resources https://sre.google/ https://bit.ly/reliability-discuss https://bit.ly/dora-sodr Thinking, Fast and Slow Site Reliability Engineering The Site Reliability Workbook Want to supercharge your DevOps practice? Research says try SRE Eliminating Toil Identifying and tracking toil using SRE principles How maintenance windows affect your error budget—SRE tips "Tempting Time" by Animals As Leaders used with permissions - All Rights Reserved

Changelog Master Feed
Crossing the platform gap (Ship It! #32)

Changelog Master Feed

Play Episode Listen Later Dec 17, 2021 71:32


In 2014 Gerhard joined CloudCredo, a startup co-founded by Colin Humphreys, Paula Kennedy & Chris Hedley. They stuck together through two acquisitions: Pivotal & VMware. This year, Colin, Paula & Chris co-founded Syntasso, the Platform-as-a-Product startup. Today they all get together to talk about what it takes to build a platform team, why Team Topologies is a good conversation starter, and why a curated blend of off-the-shelf, composed, and self-created services is required in any organisation operating at scale. Your hunch is right: all of them used to share the same Pivotal London office with Tammer Saleh, our guest from episode 31. Chris used to win all table tennis matches without even breaking a sweat, and today Gerhard gets his comeback. Touché!

Buluta Doğru
DevOps Engineer and SRE | Professions in the cloud field

Buluta Doğru

Play Episode Listen Later Dec 14, 2021 53:27


Hello everyone from episode 24. Friends, with this episode we are starting a new series. In this series, guests who hold roles in the cloud field will describe those professions and roles to you. In the first episode of the series, our dear Deniz Parlak, who works as an SRE at Emirates, kindly accepted our invitation and shed light on questions such as "What does a DevOps engineer do, how does the role differ from SRE, and where do the job descriptions diverge?" Enjoy listening.

TestGuild Performance Testing and Site Reliability Podcast
Creating a Performance Testing Framework with Stephen Townshend

TestGuild Performance Testing and Site Reliability Podcast

Play Episode Listen Later Dec 14, 2021 31:46


What are the components of a performance testing framework? And why should you build one? Open-source performance testing tools are great, but you need a framework of tools to provide all the features you need for serious performance testing. In this episode, Stephen Townshend, an SRE at IAG, shares why you should build your performance test framework, the components that make up a performance test framework, suggestions of tools and scripting languages along the way, and examples of frameworks implemented in the real world.

Tech Lead Journal
#68 - 2021 Accelerate State of DevOps Report - Nathen Harvey

Tech Lead Journal

Play Episode Listen Later Dec 13, 2021 47:55


“Many organizations think in order to be safe, they have to be slow. But the data shows us that the best performers are getting both. And in fact, as speed increases, so too does stability." Nathen Harvey is the co-author of 2021 Accelerate State of DevOps Report and a Developer Advocate at Google. In this episode, we discussed in-depth the latest release of the State of DevOps Report. Nathen started by describing what the report is all about, how it got started, and explained the five key metrics suggested by the report to measure the software delivery and operational performance. Nathen then explained how the report categorizes different performers based on their performance against the key metrics and how the elite performers outperform the others in terms of speed, stability, and reliability. Next, we dived into several new key findings that came out of the 2021 report that relate to documentation, secure software supply chain, and burnout. Towards the end, Nathen gave great tips on how we can use the findings from the reports to get started and improve our software delivery and operational performance, that ultimately will improve our organizational performance. Listen out for: Career Journey - [00:05:28] State of DevOps Report - [00:09:32] The Five Key Metrics - [00:13:55] Speed, Safety, and Reliability - [00:19:58] Performers Categories - [00:23:26] 2021 New Key Findings - [00:28:01] New Finding: Documentation - [00:30:44] New Finding: Secure Software Supply Chain - [00:34:58] New Finding: Burnout - [00:37:22] How to Start Improving - [00:39:36] 3 Tech Lead Wisdom - [00:43:55] _____ Nathen Harvey's Bio Nathen Harvey, Developer Relations Engineer at Google, has built a career on helping teams realize their potential while aligning technology to business outcomes. Nathen has had the privilege of working with some of the best teams and open source communities, helping them apply the principles and practices of DevOps and SRE. 
He is part of the Google Cloud DORA research team and a co-author of the 2021 Accelerate State of DevOps Report. Nathen was an editor for 97 Things Every Cloud Engineer Should Know, published by O'Reilly in 2020. Follow Nathen: Twitter – @nathenharvey LinkedIn – https://linkedin.com/in/nathen Github – https://github.com/nathenharvey Our Sponsor Are you looking for cool new swag? Tech Lead Journal now offers swag that you can purchase online. Items are printed on demand based on your preference, and will be delivered safely to you anywhere in the world where shipping is available. Check out all the cool swag by visiting https://techleadjournal.dev/shop. Like this episode? Subscribe on your favorite podcast app and submit your feedback. Follow @techleadjournal on LinkedIn, Twitter, and Instagram. Pledge your support by becoming a patron. For more info about the episode (including quotes and transcript), visit techleadjournal.dev/episodes/68.

Changelog Master Feed
Is Kubernetes a platform? (Ship It! #31)

Changelog Master Feed

Play Episode Listen Later Dec 8, 2021 60:55


Tammer Saleh, founder of SuperOrbital and former VP of Engineering at Pivotal, is joining Gerhard to talk about table tennis, remote work, and challenges that teams have with K8s. Some years ago, both Tammer & Gerhard used to work in the same London office on CloudFoundry, and nowadays they are both into Kubernetes. Tammer and the SuperOrbital team are deeply experienced in this topic, and they help teams at companies like Bloomberg, Shopify, and federal U.S. agencies tackle hard Kubernetes and DevOps problems through engineering and training. Why do companies need Kubernetes in the first place? Which are the right reasons for choosing it? Is Kubernetes a platform? Gerhard's favourite: we are doing Kubernetes wrong, but it works better than when we were doing it right, so what's up with that? This last one was a lot of fun, and we left the entire minute of laughter in at your request. Enjoy!

Screaming in the Cloud
Building a User-Friendly Product with Aparna Sinha

Screaming in the Cloud

Play Episode Listen Later Dec 8, 2021 42:53


About Aparna
Aparna Sinha is Director of Product for Kubernetes and Anthos at Google Cloud. Her teams are focused on transforming the way we work through innovation in platforms. Before Anthos and Kubernetes, Aparna worked on the Android platform. She joined Google from NetApp where she was Director of Product for storage automation and private cloud. Prior to NetApp, Aparna was a leader in McKinsey and Company's business transformation office working with CXOs on IT strategy, pricing, and M&A. Aparna holds a PhD in Electrical Engineering from Stanford and has authored several technical publications. She serves on the Governing Board of the Cloud Native Computing Foundation (CNCF).
Links: DevOps Research Report: https://www.devops-research.com/research.html Twitter: https://twitter.com/apbhatnagar
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database that is not the bind DNS server. If you're tired of managing open source Redis on your own, or you're using one of the vanilla cloud caching services, these folks have you covered with the go-to managed Redis service for global caching and primary database capabilities: Redis Enterprise. Set up a meeting with a Redis expert during re:Invent, and you'll not only learn how you can become a Redis hero, but also have a chance to win some fun and exciting prizes. To learn more and deploy not only a cache but a single operational data platform for one Redis experience, visit redis.com/hero. That's r-e-d-i-s.com/hero. 
And my thanks to my friends at Redis for sponsoring my ridiculous nonsense.
Corey: You know how Git works, right?
Announcer: Sorta, kinda, not really. Please ask someone else.
Corey: That's all of us. Git is how we build things, and Netlify is one of the best ways I've found to build those things quickly for the web. Netlify's Git-based workflows mean you don't have to play slap-and-tickle with integrating arcane nonsense and web hooks, which are themselves about as well understood as Git. Give them a try and see what folks ranging from my fake Twitter for Pets startup, to global Fortune 2000 companies are raving about. If you end up talking to them—because you don't have to; they get why self-service is important—but if you do, be sure to tell them that I sent you and watch all of the blood drain from their faces instantly. You can find them in the AWS marketplace or at www.netlify.com. N-E-T-L-I-F-Y dot com.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. We have a bunch of conversations on this show covering a wide gamut of different topics, things that I find personally interesting, usually, and also things I'm noticing in the industry. Fresh on the heels of Google Next, we get to ideally have conversations about both of those things. Today, I'm speaking with the Director of Product Management at Google Cloud, Aparna Sinha. Aparna, thank you so much for joining me today. I appreciate it.
Aparna: Thank you, Corey. It's a pleasure to be here.
Corey: So, Director of Product Management is one of those interesting titles. We've had a repeat guest here, Director of Outbound Product Management Richard Seroter, which is great. I assume—as I told him—outbound products are the ones that are about to be discontinued. He's been there a year and somehow has failed to discontinue a single thing, so okay, I'm sure that's going to show up on his review. What do you do? 
The products aren't outbound; they're just products, and you're managing them, but that doesn't tell me much. Titles are always strange.
Aparna: Yeah, sure. Richard is one of my favorite people, by the way. I work closely with him. I am the Director of Product for Developer Platform. That's Google Cloud's developer platform. It includes many different products—actually, 30-plus products—but the primary pieces are usually when a developer comes to Google Cloud, the pieces that they interact with, like our command-line interface, like our Cloud Shell, and all of the SDK pieces that go behind it, and then also our DevOps tooling. So, as you're writing the application in the IDE and as you're deploying it into production, that's all part of the developer platform. And then I also run our serverless platform, which is one of the most developer-friendly capabilities from a compute perspective. It's also integrated into many different services within GCP. So, behind the title, that's really what I work on.
Corey: Okay, so you're, I guess, in part responsible for well, I guess, a disappointment of mine a few years ago. I have a habit on Twitter—because I'm a terrible person—of periodically spinning up a new account on various cloud providers and kicking the tires and then live-tweeting the experience, and I was really set to dunk on Google Cloud; I turned this into a whole blog post. And I came away impressed, where the developer experience was pretty close to seamless for getting up and running. It was head and shoulders above what I've seen from other cloud providers, and on the one hand, I want to congratulate you and on the other, it doesn't seem like that's that high of a bar, to be perfectly honest with you because it seems that companies get stuck in their own ways and presuppose that everyone using the product is the same as the people building the product. 
Google Cloud has been and remains a shining example of great developer experience across the board. If I were starting something net new and did not have deep experience with an existing cloud provider—which let's face it, the most valuable thing about the cloud is knowing how it's going to break because everything breaks—I would be hard-pressed to not pick GCP, if not as the choice, at least a strong number two. So, how did that come to be? I take a look at a lot of Google's consumer apps and, “This is a great user experience,” isn't really something I find myself saying all that often. Google Cloud is sort of its own universe. What happened?
Aparna: Well, thank you, first of all, for the praise. We are very humble about it, actually. I think that we're grateful if our developers find the experience to be seamless. It is something that we measure all the time. That may be one of the reasons why you found it to be better than other places. We are continuously trying to improve the time to value for developers, how long it takes them to perform certain actions. And so what you measure is what you improve, right? If you don't measure it, you don't improve it. That's one of our SRE principles.
Corey: I wish. I've been measuring certain things for years, and they don't seem to be improving at all. It's like, “Wow, my code is still terrible, but I'm counting the bugs and the number isn't getting smaller.” Turns out there might be additional steps required.
Aparna: Yes, you know, we measure it, we look at it, we take active OKRs to improve these things, especially usability. Usability is extremely important for certainly the developer platform, for my group; that's something that's extremely important. I would say, stepping back, you said it's not that common to find a good user experience in the cloud, I think in general—you know, and I've spent the majority of my career, if not all of my career, working on enterprise software. 
Enterprise software is not always designed in the most user-friendly way; it's not something that people always think about. Some of the enterprise software I've used has been really pretty… pretty bad. Just a list of things.
Corey: Oh, yeah. And it seems like their entire philosophy—I did a bit of a dive into this, and I think it was Stripe's Patrick McKenzie who wound up pointing this out originally, though; but the internet is big and people always share and reshare ideas—the actual customer for enterprise software is very often procurement or a business unit that is very organizationally distant from the person who's using it. And I think in a world of a cloud platform, that is no longer true. Yeah, there's a strategic decision of what Cloud do we use, but let's be serious, that decision often comes into play long after there's already been a shadow IT slash groundswell uprising. The sales process starts to look an awful lot less like, “Pick our cloud,” and a lot more like, “You've already picked our cloud. How about we formalize the relationship?” And developer experience with platforms is incredibly important and I'm glad to see that this is a—well, it's bittersweet to me. I am glad to see that this is something that Google is focusing on, and I'm disappointed to admit that it's a differentiator.
Aparna: It is a differentiator. It is extremely important. At Google, there are a couple of reasons why this is part of our DNA, and it is actually related to the fact that we are also a consumer products company. We have a very strong user experience team, a very strong measurements-oriented—they measure everything, and they design everything, and they run focus groups. So, we have an extraordinary usability team, and it's actually one of the groups that—just like every other group—is fungible; you can move between consumer and cloud. 
There's no difference in terms of your training and skill set. And so, I know you said that you're not super impressed with our consumer products, but I think that the practice behind treating the user as king, treating the user as the most important part of your development, is something that we bring over into cloud. And it's just a part of how we do development, and I think that's part of the reason why our products are usable. Again, I shy away from taking any really high credit on these things because I think I always have a very high bar. I want them to be delightful, super delightful, but we do have good usability scores on some of the pieces. I think our command line, I think, is quite good. I think—there's always improvements, by the way, Corey—but I think that there are certain things that are delightful. And a lot of thought goes into it and a lot of multi-functional—meaning across product—user experience and engineering, and developer relations. We have sort of this four-way communication about—you know, with friction logs and with lots of trials and lots of discussion and measurements, is how we improve the user experience. And I would love to see that in more enterprise software. I think that my experience in the industry is that the user is becoming more important, generally, even in enterprise software, probably because of the migration to cloud. You can't ignore the user anymore. This shouldn't be all about procurement. Anybody can procure a cloud service. It's really about how easily and how quickly can they get to what they want to do as a user, which I think also the definition of what a developer is, is changing, and I think that's one of the most exciting things about our work is that the developer can be anybody; it can be my kids, and it can be anyone across the world. 
And our goal is to reach those people and to make it easy for them.
Corey: If I had to bet on a company not understanding that distinction, on some level, Google's reputation lends itself to that where, oh, great. It's like, I'm a little old to go back to school and join a fraternity and be hazed there, so the second option was, oh, I'll get an interview to be an SRE at Google where, “Oh, great, you've done interesting things, but can you invert a binary tree on a whiteboard?” “No, I cannot. Let's save time and admit that.” So, the concern that I would have had—you just directly contradicted—was the idea that you see at some companies where there's the expectation that all developers are like their developers. Google, for better or worse, has a high technical bar for hiring. A number of companies do not have a similar bar along similar axes, and they're looking for different skill sets to achieve different outcomes, and that's fine. To be clear, I am not saying that, oh, the engineers at Google are all excellent and the engineers all at a bank are all crap. Far from it. That is not true in either direction, but there are differences as far as how they concern themselves with software development, how they frame a lot of these things. And I am surprised that Google is not automatically assuming that developers are the type of developers that you have at Google. Where did that mindset shift come from?
Aparna: Oh, absolutely not. I think we would be in trouble if we did that. I studied electrical engineering in school. This would be like assuming that the top of the class is kind of like the kind of people that we want to reach, and it's just absolutely not. 
Like I said, I want to reach total beginners, I want to reach people who are non-developers with our developer platform. That's our explicit goal, and so we view developers as individuals with a range of superpowers that they've gained throughout their lives, professionally and personally, and people who are always on a path to learn new things, and we want to make it easy for them. We don't treat them as bodies in an employment relationship with some organization, or people with certain minimum bar degrees, or whatever it is. As far as interviewing goes, Corey, in product management, which is the practice that I'm part of, we actually look for, in the interview, that the candidate is not thinking about themselves; they're not imposing themselves on the user base. So, can you think outside of yourself? Can you think of the user base? And are you inquisitive? Are you curious? Do you observe? And how well do you observe differences and diversity, and how well are you able to grasp what might be needed by a particular segment? How well are you able to segment the user base? That's what we look for, certainly in product management, and I'm quite sure also in user experience. You're right, on engineering, of course, we're looking for technical skills, and so on, but that's not how we design our products, that's not how we design the usability of our products.
Corey: “If you people were just a little bit smarter slash more like me, then this would work a lot better,” is a common trope. Which brings us, of course, to the current state of serverless. I tend to view serverless as largely a failed initiative so far. And to be clear, I'm viewing this from an AWS-centric lens; that is the… we'll be charitable and call it pool in which I swim. And they announced Lambda in 2015; that's great. 
“The only code you will ever write in the future is business logic.” Yeah, I might have heard that one before about 15 other technologies dating back to the 60s, but okay. And the expectation was that it was going to take off and set the world on fire. You just needed to learn the constraints of how this worked. And there were a bunch of them, and they were obnoxious, and it didn't have a learning curve so much as a learning cliff. And nowadays, we do see it everywhere, but it's also in small doses. It's mostly used as digital spackle to plaster over the gaps between various AWS services. What I'm not seeing across the board is a radical mindset shift in the way that developers are engaging with cloud platforms that would be heralded by widespread adoption of serverless principles. That said, we are on the heels here of Google Cloud Next, and that you had a bunch of serverless announcements, I'm going to go out on a limb and guess you might not agree with my dismal take on the serverless side of the world?
Aparna: Well, I think this is a great question because despite the fact that I like not to be wishy-washy about anything, I actually both agree and disagree [laugh] with what you said. And that's funny.
Corey: Well, that's why we're talking about this here instead of on Twitter where two contradictory things can't possibly both be true. Wow, imagine that; nuance, it doesn't fit 280 characters. Please, continue.
Aparna: So, what I agree with is that—I agree with you that the former definition of serverless and the constrained way that we are conditioned thinking about serverless is not as expansive as originally hoped, from an adoption perspective. And I think that at Google, serverless is just no longer about only event-driven programming or microservices; it's about running complex workloads at scale while still preserving the delightful developer experience. And this is where the connection to the developer experience comes in. 
Because the developer experience, in my mind, it's about time to value. How quickly can I achieve the outcome that I need for my business? And what are the things that get in the way of that? Well, setting up infrastructure gets in the way of that, having to scale infrastructure gets in the way of that, having to debug pieces that aren't actually related to the outcome that you're trying to get to gets in the way of that. And the beauty of serverless, it's all in how you define serverless: what does this name actually mean? If serverless only means functions and event-driven applications, then yes, actually, it has a better developer experience, but it is not expansive, and then it is limited, and it's trapped in its skin the way that you mentioned it. [laugh].
Corey: And it doesn't lend itself very well to legacy applications—legacy, of course, being condescending engineering-speak for ‘it makes money.' But yeah, that's the stuff that powers the world. We're not going to be redoing all those things as serverless-powered microservices anytime soon, in most cases.
Aparna: At Google Cloud, we are redefining serverless. And so what we are taking from Serverless is the delightful user experience and the fact that you don't have to manage the infrastructure, and what we're putting in the serverless is essentially serverless containers. And this is the big revolution in serverless, is that serverless—at least at Google Cloud with serverless containers and our Cloud Run offering—is able to run much bigger varieties of applications and we are seeing large enterprises running legacy applications, like you say, on Cloud Run, which is serverless from a developer experience perspective. There's no cluster, there is no server, there's no VM, there's nothing for you to set up from a scaling perspective. And it essentially scales infinitely. And it is very developer-focused; it's meant for the developer, not for the operator or the infrastructure admin. 
In reality in enterprise, there is very much a segmentation of roles. And even in smaller companies, there's a segmentation of roles even within the same person. Like, they may have to do some infrastructure work and they may do some development work. And what serverless—at least in the context of Google Cloud—does, is it removes the infrastructure work and maximizes the development work so that you can focus on your application and you can get to that end result, that business value that you're trying to achieve. And with Cloud Run, what we've done is we've preserved that—and I would say, actually, arguably improved that because we've done usability studies that show that we're 22 points above every other serverless offering from a usability perspective. So, it's super important to me that anybody can use this service. Anybody. Maybe even not a developer can use this service. And that's where our focus is. And then what we've done underneath is we've removed many of the restrictions that are traditionally associated with serverless. So, it doesn't have to be event-driven, it is not only a particular set of languages or a particular set of runtimes. It is not only stateless applications, and it's not only request-based billing, it's not only short-running jobs. These are the kinds of things that we have removed and I think we've just redefined serverless.
Corey: [unintelligible 00:17:05], on some level, the idea of short-lived functions with a maximum cap feels like a lazy answer to one of the hard problems in computer science, the halting problem. For those not familiar, my layman's understanding of it is, “Okay, you have a program that's running in a loop. How do you deterministically say that it is done executing?” And the functional answer to that is, “Oh, after 15 minutes, it's done. 
We're killing it.” Which I guess is an answer, but probably not one that's going to get anyone a PhD. It becomes very prescriptive and it leads to really weird patterns trying to work around some of those limitations. And historically, yeah, by working within the constraints of the platform, it works super well. What interests me about Cloud Run is that it doesn't seem to have many of those constraints in quite the same way. It's, “Can you shove whatever monstrosity you've got into a container? You can't? Well, okay, there are ways to get there.” Full disclosure, I was very anti-container; the industry has yet again proven to me that I cannot predict the future. Here we are. “Great, can you shove a container in and hand it to some other place to run it where”—spoiler, people will argue with me on this and they are wrong—“Google engineers are better at running infrastructure to run containers than you are.” Full stop. That is the truism of how this works; economies of scale. I love the idea of being able to take something, throw it over a wall, and not have to think about the rest of it. But everything that I'm thinking about in this context looks certain ways and it's the type of application that I'm working on or that I'm looking at most recently. What are you seeing in Cloud Run as far as interesting customer use cases? What are people doing with it that you didn't expect them to?
Aparna: Yeah, I think this is a great time to ask that question because with the pandemic last year—I guess we're still in the pandemic, but with the pandemic, we had developers all over the world become much more important and much more empowered, just because there wasn't really much of an operations team, there wasn't really as much coordination even possible. And so we saw a lot of customers, a lot of developers moving to cloud, and they were looking for the easiest thing that they could use to build their applications. 
And as a result, serverless and Cloud Run in particular, became extremely popular; I would say hockey stick in terms of usage. And we're seeing everything under the sun. ecobee—this is a home automation company that makes smart thermostats—they're using Cloud Run to launch a new camera product with multi-factor authentication and security built-in, and they had a very tight launch timeline. They were able to very quickly meet that need. Another company—and you talk about, you know, sort of brick and mortar—IKEA, which you and I all like to shop [laugh] at, particularly doing the—
Corey: Oh, I love building something from 500 spare parts, badly. It's like basically bringing my AWS architecture experience into my living room. It's great. Please continue.
Aparna: Yeah, it's like, yeah—
Corey: The Swedish puzzle manufacturer.
Aparna: Yes. They're a great company, and I think it just in the downturn and the lockdown, it was actually a very dicey time, very tricky time, particularly for retailers. Of course, everybody was refurbishing their home or [laugh], you know, improving their home environment and their furniture. And IKEA started using serverless containers along with serverless analytics—so with BigQuery, and Cloud Run, and Cloud Functions—and one of the things they did is that they were able to cut their inventory refresh rate from more than three hours to less than three minutes. This meant that when you were going to drive up and do some curbside pickup, you know the order that you placed was actually in stock, which was fantastic for CSAT and everything. But that's the technical piece that they were able to do. When I spoke with them, the other thing that they were able to do with the Cloud Run and Cloud Functions is that they were able to improve the work-life balance of their engineers, which I thought was maybe the biggest accomplishment. 
Because the platform, they said, was so easy for them to use and so easy for them to accomplish what they needed to accomplish, that they had a better [laugh] better life. And I think that's very meaningful. In other companies, MediaMarktSaturn, we've talked about them before; I don't know if I've spoken to you about them, but we've certainly talked about them publicly. They're a retailer in EMEA, and because of their use of Cloud Run, they were able to combine the speed of serverless with the flexibility of containers, and their development team was able to go eight times faster while handling 145% increase in digital channel traffic. Again, there was a lot more digital channel traffic during COVID. And perhaps my favorite example is the COVID-19 exposure notifications work that we did with Apple.
Corey: An unfortunate example, but a useful one. I—
Aparna: Yes.
Corey: —we all—I think we all wish it wasn't necessary, but here's the world in which we live. Please, tell me more.
Aparna: I have so many friends in engineering and mathematics and these technical fields, and they're always looking at ways that technology can solve these problems. And I think especially something like the pandemic which is so difficult to track, so difficult with the time that it takes for this virus to incubate and so on, so difficult to track these exposures, using the smartphone, using Bluetooth, to have a record of who has it and who they've been in contact with, I think really interesting engineering problem, really interesting human problem. So, we were able to work on that, and of course, when you need a platform that's going to be easy to use, that's going to be something that you can put into production quickly, you're going to use Cloud Run. So, they used Cloud Run, and they also used Cloud Run for Anthos, which is the more hybrid version, for the on-prem piece. 
And so both of those were used in conjunction to back all of the services that were used in the notifications work.

So, those are some of the examples. I think net-net, it's that I think usability, especially in enterprise software, is extremely important, and I think that's the direction in which software development is going.

Corey: Are you building cloud applications with a distributed team? Check out Teleport, an open source identity-aware access proxy for cloud resources. Teleport provides secure access to anything running somewhere behind NAT: SSH servers, Kubernetes clusters, internal web apps and databases. Teleport gives engineers superpowers! Get access to everything via single sign-on with multi-factor. List and see all SSH servers, Kubernetes clusters or databases available to you. Get instant access to them all using tools you already have. Teleport ensures best security practices like role-based access, preventing data exfiltration, providing visibility and ensuring compliance. And best of all, Teleport is open source and a pleasure to use.

Download Teleport at https://goteleport.com. That's goteleport.com.

Corey: It's easy for me to watch folks—like you—in keynotes at events—like Cloud Next—talk about things and say, “This is how the world is building things, and this is what the future looks like.” And I can sit there and pick it to pieces all day, every day. It's basically what I do because of deep-seated personality problems with me. It's very different to say that about a customer who has then taken that thing and built it into something that is transformative and solves a very real problem that they have. 
I may not relate to that problem that they have, but I do not believe that customers are going to have certain problems, find solutions like this and fix them, and be wrong in how they're approaching these things.

No one sees the constraints that shape things; no one shows up in the morning hoping to do a crap job today unless, you know, you're the VP of Integrity at Facebook or something. But there's a very real sense that companies have a bunch of different drivers, and having a tool or a service or a platform that solves it for them, you'd better be very sure before you step up and start saying, “No, you're doing it wrong.” In earlier years, I did not see a whole lot of customer involvement with Cloud Next. It was always a, “Well, a bunch of Googlers are going to tell me how this stuff works, and they'll talk about theoretical things.”

That's not the case anymore. You have a whole bunch of highly respectable reference customers out there doing a whole lot of really interesting things. And more to the point, they're willing to go on record talking about this. And I'm not talking about fun startups that are, “Great, it's Twitter, only for pets.” Great. I'm talking banks, companies where mistakes are going to show and leave a mark. It's really hard to reconcile what I'm seeing with Google Cloud in 2021 with what I was seeing, let's say, five or six years ago. What drove that change?

Aparna: Yes, Corey, I think you're definitely correct about that. There's no doubt about it that we have a number of really tremendous customers, we have really tremendous enterprise references and so on. I run the Google Cloud Developer Platform, and for me, the developers that I work with and the developers that this platform serves are the inspiration for what we do. And in the last six or seven years that I've worked in Google Cloud, that has always been the case. 
So, nothing has changed from my perspective, in that regard.

If anything, what has changed is that we have far more users, we have been growing exponentially, and we have many more large enterprise customers, but in terms of my journey, I started with the Kubernetes open-source project, I was one of the very early people on that, and I was working with a lot of developers, in that case, in the open-source community, a lot of them became GKE customers, and it just grew. And now we have so many [laugh] customers and so many developers, and we have developed this platform with them. We are very much—it's been a matter of co-innovation, especially on Kubernetes. It has been very much, “Okay, you tell us,” and it's a need-based relationship, you know? Something is not working, we are there and we fix it.

Going back to 2017 or whenever it was that Pokemon Go was running on GKE, that was a moment when we realized, “Oh, this platform needs to scale. Okay, let's get at it.” And that's where, Corey, it really helps to have great engineers. For all the pros and cons, I think that's where you want those super-sharp, super-driven, super-intelligent folks because they can make things like that happen, they can make it happen in less than a week, so that—they can make it happen over a Saturday so that Pokemon Go can go live in Japan and everybody can be playing that game. And that's what inspires me.

And that's a game, but we have a lot of customers that are running health applications. We have a customer that's running ambulances on the platform. And so this is life-threatening stuff; we have to take that very seriously, and we have to be listening to them and working with them. But I'm inspired, and I think that our roadmap, and the products, and the features that we build are inspired by what they are building on the platform. And they're combining all kinds of different things. 
They're taking our machine learning capabilities, they're taking our analytics capabilities, they're taking our Maps API, and they're combining it with Cloud Run, they're combining it with GKE. Often they're using both of those.

And they're running new services. We've got a customer in Indonesia that's running a food delivery service; I've got customers that are analyzing the cornfields in the middle of the country to improve crop yield. So, that's the kind of inspiring work, and each of those, Corey, each of those users are coming back to us and saying, “Oh, you know, I need a different type of”—it's very detailed, like, “I need a different type of file system that gives me greater speed or better performance.” We just had a gaming company that was running on GKE that we really won out over a different cloud in terms of performance improvements that we were able to provide on the container startup times. It was just a significant performance improvement. We'll probably publish it in the coming few months.

That's the kind of thing that drives it, and I'm very glad that I have a strong engineering team in Google Cloud, and I'm very glad that we have these amazing customers that are trying to do these amazing things, and that they're directly engaging with us and telling us what they need from us because that's what we're here for.

Corey: To that end, one more area I want to go into before we call this a show, you've had Cloud Build for a little while, and that's great. Now, at—hot off the presses, you wound up effectively taking that one step further with Cloud Deploy. And I am still mostly someone with terrible build and release practices that people would be ashamed of, and I struggle to understand the differentiation between what I would do with Cloud Build and what I would do with Cloud Deploy. I understand they're both serverless. I understand that they are things that large companies care about. What is the story there?

Aparna: Yeah, it's a journey. 
As you start to use containers—and these days, like you said, Corey, containers, a lot of people are using them—then you start to have a lot of microservices, and one of the benefits of container usage is that it's really quick to release new versions. You can have different versions of your application, you can test them out, you can roll them out. And so these DevOps practices, they become much more attainable, much more reachable. And we just put out the, I think, the seventh version of the DevOps Research Report—the DORA report—that shows that customers that follow best practices, they achieve their results two times better in terms of business outcomes, and so on.

And there's many metrics that show that this kind of thing is important. But I think the most important thing I learned during the pandemic, as we were coming out of the pandemic, is a lot of—and you mentioned enterprises—large banks, large companies' CIOs and CEOs who basically were not prepared for the lockdown, not prepared for the fact that people aren't going to be going into branches, they came to Google Cloud and they said that, “I wish that I had implemented DevOps practices. I wish that I had implemented the capability to roll out changes frequently because I need that now. I need to be able to experiment with a new banking application that's mobile-only. I need to be able to experiment with curbside delivery. And I'm much more dependent on the software than I used to be. And I wish that I had put those DevOps practices in place.”

And so the beginning of 2021, all our conversations were with customers, especially those, you know, you said ‘legacy,’ I don't think that's the right word, but the traditional companies that have been around for hundreds of years, all of them, they said, “Software is much more important. 
Yes, if I'm not a software company, at least a large division of my group is now a software group, and I want to put the DevOps practices into play because I know that I need that and that's a better way of working.”

By the way, there's a security aspect to that I'd like to come back to because it's really important—especially in banking, financial services, and public sector—as you move to a more agile DevOps workflow, to have security built into that. So, let me come back to that. But with regard to Cloud Build and Cloud Deploy, Cloud Deploy is something I've been wanting to bring to market for a couple of years. And we've been talking about it, we've been working on it actively for more than a year on my team. And I'm very, very excited about this service because what it does is it allows you to essentially put this practice, this DevOps practice into play where, as your artifacts are built and stored in the artifact repository, they can then automatically be deployed into your runtime, which is GKE, with Cloud Run in the future. You can deploy them, and you can set how you want to deploy them.

Do you want to deploy them to a particular environment that you want to designate the test environment, the environment to which your developers have access in a certain way? Like, it's a test environment, so they can make a lot of changes. And then when do you want to graduate from test to staging, and when do you want to graduate to production and do that gradual rollout? Those are some of the things that Cloud Deploy does.

And I think it's high time because how do you manage microservices at scale? How do you really take advantage of container-based development? It's through this type of tooling. And that's what Cloud Deploy does. It's just the beginning of that, but it's a delightful product. I've been playing around with it; I love it, and we've seen just tremendous reception from our users.

Corey: I'm looking forward to kicking the tires on it myself. 
I want to circle back to talk about the security aspect of it. Increasingly, I'm spending more of my attention looking at cloud security because everyone else has, too, and some of us have jobs that don't include the word security but need to care about it. That's why I have a Thursday edition of my newsletter, now, talking specifically about that. What is the story around security these days from your perspective?

And again, it's a huge overall topic, and let's be clear here, I'm not asking, “What does Google Cloud think about security?” That would fill an encyclopedia. What is your take on it? And where do you want to talk about this in the context of Cloud Deploy?

Aparna: Yeah, so I think about security from the perspective of the Google Cloud Developer Platform, and specifically from the perspective of the developer. And like you said, security is not often in the title of anybody in the developer organization, so how do we make it seamless? How do we make it such that security is something that is not going to catch you as you're doing your development? That's the critical piece. And at the same time, one of the things we saw during 2020 and 2021 is just the number of cyberattacks just went through the roof. I think there was a 400 to 600% increase in the number of software supply chain attacks. These are attacks where some malicious hacker has come in and inserted some malicious code into your software. [laugh]. Your software, Corey. You know, you the unsuspecting developer is—

Corey: Well, it used to be my software; now there's some debate about that.

Aparna: Right. That's true because most software is using open-source dependencies; and these open-source dependencies, they have a pretty intricate web of dependencies that they are themselves using. So, it's a transitive problem where you're using a language like Python, or whatever language you're using. And there's a number of—

Corey: Crappy bash by default. 
But yes.

Aparna: Well, it was actually a bash script vulnerability, I think, in the Codecov breach that happened, I think it was, earlier this year, where a malicious bash script was injected into the build system, in fact, of Codecov. And there are all these new attack vectors that are specifically targeting developers. And whether it's nation-states or whoever it is that's causing some of these attacks, it's a problem that is of national and international magnitude. And so I'm really excited that we have the expertise in Google Cloud and beyond Google Cloud.

Google, it's a very security-conscious company. This company is a very security-conscious company. [laugh]. And we have built a lot of tooling internally to avoid those kinds of attacks, so what we've done with Cloud Build, and what we're going to do with Cloud Deploy, we're building in the capability for code to be signed, for artifacts to be signed with cryptographic keys, and for that signing, that attestation—we call it an attestation—that attestation to be checked at various points along the software supply chain. So, as you're writing code, as you're submitting the code, as you're building the containers, as you're storing the containers, and then finally as you're deploying them into whatever environment you're deploying them, we check these keys, and we make sure that the software that is going through the system is actually what you intended and that there isn't this malicious code injection that's taking place.

And also, we scan the software, we scan the code, we scan the artifacts to check for vulnerabilities, known vulnerabilities as well as unknown vulnerabilities. 
Known vulnerabilities from a Google perspective; so Google's always a little bit ahead, I would say, in terms of knowing what the vulnerabilities are out there because we do work so much on software across operating systems and programming languages, just across the full gamut of software in the industry, we work on it, and we are constantly securing software. So, we check for those vulnerabilities, we alert you, we help to remediate those vulnerabilities.

Those are the type of things that we're doing. And it's all in service of certainly keeping enterprise developers secure, but also just long-tail and average, everybody, helping them to be secure so that they don't get hacked and their companies don't get hacked.

Corey: It's nice to see someone talking about this stuff who is not directly a security vendor. By which I mean, you're not using this as the fear, uncertainty, and doubt angle to sell a given service that, “We have to talk about this exploit because otherwise, no one will ever buy this.” Something like Cloud Deploy is very much aligned with a best practices approach to release engineering. It's not, strictly speaking, a security product, but being able to wrap things that are very security-centric around it is valuable.

Now, sponsors are always going to do interesting things at various expo halls, and oh, yeah, saw the same product warmed over. This is very much not that, and I don't interpret anything you're saying as trying to sell something via the fear, uncertainty, and doubt model. There are a lot of different areas that I will be skeptical hearing about from different companies; I do take security words from Google extremely seriously because, let's be clear, in the past 20 however many years it has been, you have established a clear track record for caring about these things.

Aparna: Yeah. And I have to go back to my initial mission statement, which is to help developers accelerate time to value. 
And one of the things that will certainly get in the way of accelerating time to value is security breaches, by the nature of them. If you are not running a supply chain that is secure, then it is very difficult for you to empower your developers to do those releases frequently and to update the software frequently because what if the update has an issue? What if the update has a security vulnerability?

That's why it's really important to have a toolchain that protects against that, that checks for those things, that logs those things so that there's an audit trail available, and that has the capability for your security team to set policies to avoid those kinds of things. I think that's how you get speed. You get it with security built in, and that's extremely important to developers and especially cloud developers.

Corey: I want to thank you for taking the time to speak to me about all the things that you've been working on and how you view this industry unfolding. If people want to learn more about what you're up to, and how you think about these things, where can they find you?

Aparna: Well, Corey, I'm available on Twitter, and that may be one of the best ways to reach me. I'm also available at various customer events that we are having, most of them are online now. And so I'll provide you more details on that and I can be reached that way.

Corey: Excellent. I will, of course, include links to that in the show notes. Thank you so much for being so generous with your time. I appreciate it.

Aparna: Thank you so much. I greatly enjoyed speaking with you.

Corey: Aparna Sinha, Director of Product Management at Google Cloud. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. And that sentence needed the word ‘cloud' about four more times in it. 
And if you've enjoyed this episode, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with a loud angry comment telling me that I just don't understand serverless well enough.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

MONDO Podcast
ŠESTA LIČNA: Zvezda holds on to first place, tension in Laktaši! | S4 A11

MONDO Podcast

Dec 7, 2021 92:14


The break is over, did you miss us? All right, don't be brutally honest right away... Either way, ABA basketball is back with us after the national-team "break", and the new round brought new surprises! Crvena Zvezda, emboldened by a good showing in Saint Petersburg, welcomed Cibona in good spirits. In a "marathon" game by our standards (the match lasted almost two hours), the red-and-whites kept chipping away and in the end broke the stubborn Zagreb side, who on several occasions made the game more interesting than it had threatened to be. Luckily for Zvezda, Nikola Kalinić and Nikola Ivanović nonetheless "finished off" the hard-fighting Cibona players, so their team stayed at the top of the table. We saw a hellish game in Laktaši, where Partizan reached a new league victory after trench warfare. Igokea had "the better of the play" for a good thirty-eight minutes and, despite a shortened rotation, dealt very effectively with the Belgrade black-and-whites. In the end, however, composure down the stretch and a deep bench decided it, and Kevin Punter also "came alive" when needed, hitting an important three at the decisive moment. Budućnost took the floor at "Morača" on Sunday with only eight players, which still did not prevent them from living up to the role of favorite in the game against Derby. The "Students" trailed for almost the entire game, and had it not been for Kenan Kamenjaš, the margin would have been even bigger. The "Đetići" did come out with only eight, but one of them was an exceptionally in-form Justin Cobbs, so there weren't too many problems. Cedevita Olimpija suffered a new shock, losing on its home court to bottom-placed Zadar. Despite important reinforcements during the break (Alen Omić and Yogi Ferrell arrived in Ljubljana), the "Dragons" suffered a complete fiasco in the very first outing of the overhauled team. The brilliant Justin Carter led Zadar to a vital win, and so drew them "level" with Split in the lower half of the table. FMP continued its excellent performances. This time Krka was beaten in Železnik, in a game that held no suspense until the very end, when the visitors from Novo Mesto pulled within just two possessions. Stipčević, however, failed to hit a three when it was needed, and the hosts certainly knew how to take advantage. The incredible Bryce Jones also continued his free-throw streak, which now stands at fifty free throws without a miss. Borac also reached a new win, signaling that their mini results crisis is behind them. The Čačak side earned an important victory in Split, and Marko Pecarski was again in the spotlight; when he is in form, Borac certainly has a better chance of winning. Mornar from Bar is also slowly emerging from its crisis, earning a new win against Mega; Vlada Jovanović's young players are still searching for consistent form and must not lose sight of that magical sixth place... Of course, the weekly review will also cover the standard topics: analysis of new transfers (and there were several during this break), the team of the round, "Kraš ekspres" with our colleague Miro, and the song? Well... that's a secret. For that you'll have to click the link! See you on Friday for your weekly dose of NBA news!

Noticentro
Alejandro Encinas admitted that in four years there have been only 35 convictions for the crime of disappearance of persons

Noticentro

Dec 6, 2021 1:47


Alejandro Encinas admitted that in four years there have been only 35 convictions for the crime of disappearance of persons
This Monday, Mexico's first feminist newspaper was donated to the Secretaría de Relaciones Exteriores
The European Medicines Agency endorsed the use of RoActemra to treat adults with severe Covid-19

Noticentro
Booster vaccination against Covid-19 for adults over 65 begins next Tuesday

Noticentro

Dec 5, 2021 1:21


Booster vaccination against Covid-19 for adults over 65 begins next Tuesday
SRE reported that the United States government will require a Covid-19 test
The WHO warned that travel bans will not prevent the spread of the omicron variant

Reversim Podcast
427 DevOps Reloaded with Yair Etziony

Reversim Podcast

Dec 4, 2021


[Link to mp3 file] Hello and welcome to episode number 427 of Reversim (Reverse with Platform). Today's date is November 25, 2021, and today we're recording remotely with Berlin [First we take], with Yair Etziony, who is in Berlin. Hi Yair! Thanks for being here, great to have you with us.

Yair is a veteran DevOps person, and our topic will be what's called "DevOps Reloaded", or "let's talk about DevOps again and understand what it means, try to get back to basics a bit and talk about the whole subject" ["DevOps Reloaded" is indeed catchier].

So before we dive in: Yair, who are you? What do you do these days?

(Yair) My name is Yair Etziony, I'm originally from Petah Tikva, and I lived in Tel Aviv for 10 years. I have something like 20 years of experience working in the IT and software sector in Israel. I worked at Amdocs, like many good people; that's where I started. I worked at startups, at Qlusters, at ECI Telecom, at Voltaire, the InfiniBand company... I specialized mainly in Linux systems, quality assurance, and networking, all those things. At some point, when the cloud started to arrive, because of my background I was moved to the cloud a lot. So AWS, startups again... After that I worked at McAfee: I worked at an Israeli security startup that was acquired by McAfee. I worked there for a while too... security, networking, kernel, things like that, mainly as a QA engineer. And then I moved to Berlin, after I had actually "retired" from the field; I said I wasn't going to work in this field anymore...

(Ran) Yes, that reminds me of "I'm done with drugs"... "I'm done with DevOps"...

(Yair) That's the thing, I didn't even know there was such a thing as DevOps. But I was a QA person who does deployments, knows systems, and configures his own environments, and then people started offering me this DevOps thing... I said, "What is DevOps?", because in Berlin it had suddenly become "hot": what is Puppet? What is Chef? What are these things? I started looking into it... and then I did several roles as a DevOps person... 
In all of those roles, however much they called me a "DevOps person", I still felt like a kind of system administrator ["the king's DevOps"]. And the biggest change, I think, came when I met the place I work at now, called Polar Squad. I can expand on them a bit: it's a company from Finland that does only DevOps, and by the company's definition we're essentially consultants; in Hebrew you could say we do "cloud tikshuv (ICT) consulting". We see DevOps differently: we don't only do "cloud tikshuv", we also do something called "organizational consulting", if I'm again going to be...

(Ran) I think "tikshuv" exists only in the IDF... but I'm sure everyone understands... In the rest of the industry it's probably "communications" or...

(Yair) I love the word "tikshuv", it's one of my favorite words in Hebrew... It's also nice to finally speak a bit of Hebrew... my brain now needs to do a lot of recalibration...

(Ran) Day to day, by the way, what do you speak? English? German?

(Yair) English. I've already started thinking in English... A few months ago I was in Petah Tikva [happens to the best of us], and I find myself in the supermarket thinking in English when I need to buy things, and I say, "something's not right"... Lots and lots of English right now, and Berlin is very international, so English is the official language of... Silicon Allee, as it's called, the somewhat ill-defined startup scene here.

And what's interesting, and maybe this is also something that will connect us to the rest of the conversation, is that in Finland they take things in a... In many ways they're different from Israelis and very similar to Israelis, but they take things very rationally, and they don't know how to do half a job... In Finland they did very large studies on the fall of Nokia; it's something that really hurt them in some way, because it was something they loved very much, it was a kind of pride there. And when they did the research, they found that what was actually missing was that the professionals in their fields, the system people, the product people, couldn't get their messages across to the C-levels, and the C-levels were disconnected from what was happening. [A good time to pause and watch Riot On Documentary (2002) again, just remember to hold on tight first]

And what developed there is essentially a whole scene of... 
they call it Flat Hierarchies. In Berlin, a million companies will tell you they have "Flat Hierarchies", but they don't. They're never flat at all, it just says "Flat Hierarchies"... And I work at a company that has no CEO at all... People can define themselves... We even chose Teal as... if you're familiar with it, consulting... Building an organization the Teal way? It basically means building it from the bottom up...

(Ran) Teal, not "til" (Hebrew for "missile")... I'll note, as an aside: you talked about Nokia and Finland. I have family, and also a friend, living in Finland, and he also lives in "the city of Nokia", or at least that's what it was once called. It's called Tampere, where the main plant...

(Yair) I've been to Tampere!

(Ran) Yes, so it's a very, very beautiful city, but Nokia barely exists there. I think it still exists, but certainly not what it once was... [still here...]

(Yair) Yes... By the way, tell him you want him to bring you Mustamakkara. Which is funny: Makkara is "sausage" in Finnish... "Ma kara" [Hebrew for "what happened"]: they heard me talking to my wife and they burst out laughing...

What happens is that we're part of a very large ecosystem of very "idealistic" companies, and the other thing is that our company knows how to do only one thing, and people are pushed to build it themselves. That is, I don't have HR; I manage the Berlin office and I don't have HR, I do the HR and I also do the processes. So there's a lot of room to develop as a person and to learn about your field, and about fields you don't know at all. And this is also very connected to DevOps, we'll get to that shortly: you're not just a test engineer, you can be much more than that, so why should we "shrink you" to that.

We work with a great many customers, and a lot of the consulting is organizational consulting. Many people say something like "But look, I made everything automatic, e-v-e-r-y-t-h-i-n-g automatic. I have pipelines, Infrastructure-as-Code, everything is ticking along, and I still don't see anything improving. Why?"

(Ran) That's really how it is... From here we're really diving into the topic. Surely you, with this experience of talking with quite a few customers and instilling practices, probably one of the first reactions you hear, as you started to say, and I assume many of our listeners have heard it too, is "OK, I tried DevOps, I tried a transformation. Why isn't it working? What's missing? Why does it work for others and not for me?"...
(Yair) OK... I'll give an example, and then build on it. I can give as an example two of our customers, two companies that got into Kubernetes and DevOps. By the way, Kubernetes doesn't necessarily mean DevOps, but in this case you can say it does. One company...

(Ran) Yes, let's just say while we're at it that a project doesn't necessarily mean Big Data... but we'll grant you that counterpoint.

(Yair) The thing is this: one customer was, let's call it, a very innovative or hipster-ish startup. You know, they all made their coffee brewed, and it was a very greenfield company. The frontend, the backend, the SRE, they were all developers by definition, people who come from coding. And they worked together. I saw how they work, I mean, how something like... these are cross-functional teams, with a certain responsibility for each person, but they worked together. They... they lacked a lot of knowledge in the Kubernetes world: in their pipelines, in how to improve from a 5-minute deployment to a 10-second deployment, or 7 seconds or... They didn't really know the technology behind Kubernetes, but they knew how to work really, really nicely together. They, poof! They're a company that flies... They do sprints and they knock out the sprints and they work as a team. They enjoy working together; all the guys there also had the same... I'd say they were well suited to working with each other, if you see what I mean. Maybe not the most brilliant programmers in the world, but high-caliber people.

The second company was a more classic kind of organization. They had sprints, but they didn't necessarily have releases at the end of the sprints. They were very, very disconnected from one another; that is, there was the Ops group, which kept falling apart, people didn't want to be in it, because when you get 50 error messages a night, you're not a happy person... There were frontend and backend and a full-stack group, and nobody talks to anybody... There we also did organizational consulting. You really see it: you sit "inside the customer" and you see three people running around like crazy, sweating, and others watching YouTube... I'm not against watching YouTube at work, but when one person is sweating and another is just watching YouTube... I said to him, to the CTO: "I'm not saying you need to work everyone to the bone, but do you notice that you and two others do everything, and the others watch you?"...
And of course, when we needed to hand over the knowledge about the Helm charts we built for them, about the repos, about Terraform, how the whole thing works, nobody wanted to know... What actually happened to them was that they kept the previous structure: nobody benefited from the new Kubernetes APIs that can upgrade you, and the Ops just got more and more and more work...

(Ran) So what you're saying, if I try to draw the moral from it: there are tools in the world, Kubernetes for example. If the Ops people once used one tool and today use another tool, you haven't achieved anything... What the tools let you do is share the load between the Ops people and the developers, with each one using the part of the tool that's relevant to them. For this purpose, Kubernetes does, let's call it, a democratization of the infrastructure; I don't know if that's a word I just made up or not, but in any case it makes it possible to share the load. If part of the company is idle anyway, no Kubernetes will help, because there's a cultural issue here... You're saying that when someone comes and says "the DevOps transformation we did didn't work for me", then in at least one of the cases, or one of the situations you've seen, the problem is in the organizational culture in most cases, and not necessarily in the technology or the adoption. There may be a problem there too, but that's not what you're describing...

(Yair) Exactly. I'd put it like this: the definition of DevOps, at least for us at Polar Squad, is a dual definition. We say that it... a cultural change must come with DevOps, and the cultural change is company-wide. There's also a pattern I see all the time: you have a DevOps team, but it's a team where everyone knows only something very specific... That's already not DevOps, by definition, because they... I see a lot of people, and you wouldn't believe how many of them... he knows only a very, very small piece of what he does, he doesn't see the [overall] picture, he knows nothing about the picture. And then you have lots of teams in the company, each one sees its own corner, and they don't work together.

(Ran) Does a "DevOps team" even exist, in your opinion? Is it right for a company to have a team called "DevOps"?

(Yair) We're walking through a very... door here. Personally, I believe in it, for the reason... I really did dig into the subject. I studied history and philosophy at Tel Aviv University; my field is the German history of ideas... 
And I was very unhappy with the whole thing . . . I felt there is no such thing as a “DevOps Engineer”. I mean, as far as I'm concerned, that role . . . I accept that there is a Platform Engineer, I accept that there is a Cloud Expert or a Cloud Architect, I even accept the SRE, because I understand the SRE's job, even if I'm not sure an SRE is needed. But fine, “okay”, as my mother says . . . You need someone to handle reliability in the company? I get that. But I don't really get the “DevOps Engineer” . . . I do get “DevOps Consultant”. Going with DevOps Consultant was a conscious choice: I come in, teach you how to practice this methodology, and then I step away, I leave . . . I can even accept a DevOps Advocate or a DevOps Coach, and that's a role we think about a lot, about how to do it in a company. I don't think a DevOps Coach can be an Agile Coach, because Agile Coaches often don't know how software works . . . I don't think you can consult inside an organization, or help an organization transform, if you don't understand how AWS, Linux, and CI/CD pipelines work. Because you can't just talk in the abstract, you need to show . . . For example, there's a project I can briefly tell you about. It was done at a very, very large bank in Finland. They simply built a kind of framework of pipelines, with all the deployments and how to set up the environments. And then for a year they went team by team, taught the people, held their hands. Think about it: it's a bank, these are old-school programmers by definition. They held their hands, looked after them, “do this, this is Docker, do . . .” So that's very important . . .
(Ran) And did it work?
(Yair) Yes, the bank went through an insane amount of automation . . . Look, I have to put everything in context again. In Finland, when I was at the Finnish broadcasting authority, they work in Scrum, and it's not quite what we imagine . . . It's not the Israeli broadcasting authority; it's a crazy site that is entirely Infrastructure-as-Code, and everything there is completely automated. I was there, I saw what they do. It's a bit . . . very hipster-ish. I don't know whether it's applicable to Germany and Israel, but still . . . [Wait, the broadcasting authority's site as a hipster place? Let that sink in . . .] But still, that bank went through . . . A happy bank, they did it. I'm sure they have plenty of problems, I'm sure it's not . . .
You also need to define it: for me, DevOps is a utopia, and it's something we are constantly working on. It's an endless loop of measurements . . .
(Ran) Yes, so if I translate what you're saying, declaring “we have a DevOps person” or “we have a DevOps team” is perhaps equivalent to declaring “we have an innovation person!” or “we have an innovation team!” So what, does that mean everyone else is not innovative? That everyone else doesn't do it? . . . So saying “we have a DevOps person” is saying that everyone else doesn't do it, and that is exactly the antithesis of what DevOps says: DevOps says it belongs to everyone, not just to one person.
(Yair) Exactly. It also belongs to the salesman, and to the . . . Let me put it this way: if DevOps stays inside a very small group of three people, we've achieved nothing . . . If DevOps stays a group of seven people, we've achieved nothing . . . I can't tell you that I know . . . Now they call it “BizOps” and “DesignOps” and “GitOps” and all sorts of . . . “PeopleOps” . . . I think all of these come from people who didn't quite get it . . .
(Ran) Yes, so there's the cultural side, and you know, it really . . . I think everyone knows it exists, but until you actually experience it you don't really understand what it means. I have to say that sometimes I make these mistakes too, and only when I look at it from the outside do I realize I made mistakes there. So it's something that takes a very long time to understand, and in that context people like you, who have seen a great many companies and have that experience, can come in and offer the right perspective. But there's also the technological side, which we touched on a bit. It's important to say that DevOps is a combination of the two; I think that's been said thousands of times, so we're not breaking new ground here. But let's talk about the technological side for a moment, and maybe do a short survey of the interesting things that have happened recently on the technology side that let us take DevOps one step forward.
(Yair) Okay, so I think the most important thing we've seen lately is APIs entering the infrastructure world. Essentially, what we're seeing is development tools entering the infrastructure world. I'll give you an example: when I was a sysadmin, I had a few batch scripts, and I don't think Git existed then. Even if it did, I wouldn't have dreamed of putting them in Git
. . . I had a directory of scripts like that: install, install Apache . . . Now it's a different world. You can't do it that way anymore, because the systems are so complex. You want everyone to share the information, and you want it to be as declarative as possible. So think about it: a tool like Kubernetes, a tool like Terraform, a tool like CDK, they all make use of a capability that the cloud computing giants and Google have given us. In effect, the developer and the operator start to consolidate: both of them make lots of merge requests and pull requests, and Git becomes the source of truth. Hopefully, that is; it doesn't always happen . . . But if you think about it, you're suddenly freed up. The AWS I started working with was a classic datacenter . . . meaning you provision machines. Then they started adding services: S3 and all those building blocks. Now it's a monster of services . . . What I'm trying to say is that there's this thing people say, “no vendor lock-in”. But if you're a young, relatively poor startup without much money, then yes, it will cost you money, I agree. Yet when I think about the past and about the present, you can build yourself really good systems relatively cheaply, if you think it through, and others do the Lift & Shift for you. In my view, if the ethos when I was young was “let's build everything ourselves, let's do everything ourselves”, today whoever does that is committing suicide . . . You will never finish . . .
(Ran) I agree about the complexity. I have to say that every day, when I open the AWS dashboard, I discover new services I don't understand. I don't even know how to pronounce their names, let alone what they do . . . I use a very small portion of them. Now, we talked about the democratization of infrastructure (I'll keep saying it until it catches on). One of the challenges I've personally seen when DevOps practices are introduced is that developers sometimes find it hard to digest. The dilemma is . . . because now they don't only need to know the programming language, not only Java and all its libraries, or Python or whatever. They also need to understand infrastructure, something that someone else used to do for them, so now they have to understand that too . . . And the question arises: on the one hand it's good, but on the other hand, what is the right level of abstraction?
That is, which abstraction should be exposed to developers so that they are productive? So that we can really . . . so that they are on board with us in this whole DevOps story. And that ties in with the whole Developer Platform story, which I know you want to mention . . . So let's talk about that for a moment: in your experience, what level of abstraction works, so that developers are fully on board and productive?
(Yair) Look, it very, very much depends . . . I think it's hard for me to give one answer. I think it also improves over time, and it depends a lot on who the developers are. Some developers are dying to learn these things, and some will never touch them . . .
(Ran) So say you arrive at a company now, or maybe you can recall one of the recent cases where you came into a company, and I imagine that at some point this question came up too: do we want to create a platform for developers, and if so, what do we want to expose to them? Bare-bones Kubernetes? Some interface on top of it? Three interfaces on top of it? In other words, how? What do we expose to developers here? [Reference: 368 Kubernetes and Dyploma at outbrain]
(Yair) Look, I'd say a classic case . . . Often you can recommend that people use . . . or you build the platform for them . . . The best thing for developers is to work with an API. Kubernetes has an API, and it's relatively convenient to build things against it. If, say . . . tools like Terraform and so on, if they like those less; and in any case it's better for your Terraform to live inside the CI/CD pipelines, better that you run Terraform “from the keyboard” as little as possible . . . In general, the less keyboard, the better. I think that if they're into it, you can open things up a little, give them a bit of kubectl, a bit of . . . But the API, that's the thing. And give it to them slowly, because there's a context change here too: the person has been writing Java, or some language, for many years, and is comfortable. They understand that something is changing, and they don't want you to scare them . . . That's the level of abstraction. Or you can use tools like humanitec, for example, which give you another layer, a nice UI on top of Kubernetes, and connect all the dots for you . . .
And then you have something very easy to use, which I think any developer would be happy to work with after a very brief explanation. And again, this comes back to something I believe in very, very strongly: don't build tools yourself, use ready-made things. You save a lot of time and money.
(Ran) Yes . . . By the way, I didn't know humanitec, so thanks for the reference . . . I'm looking at the site now, and it says “Enable developer self-service”. So what is “self-service”? Does it mean letting developers allocate resources for themselves when they need them, without a meeting and without forms, so to speak? Creating an API through which they can provision their workloads?
(Yair) Exactly . . .
(Ran) . . . when on the face of it that's also something Kubernetes provides, but maybe they do it in a more “humane” way, a more convenient way . . .
(Yair) What they do is provide another layer of abstraction, and they help you: you don't have to write the glue, they've done all the glue for you . . . I don't know whether you know or live Kubernetes, but Kubernetes [is something that] you need to know how to operate. If you just throw Kubernetes into the cloud somewhere [an idea for a new Olympic sport?] and think everything will be rosy, it won't; you'll be very miserable. They simply make many, many things easier for you. They've done a lot of work, added a lot of APIs, added a lot of interfaces. They're a very, very strong team, with lots of people who come from very good places . . .
(Ran) . . . By the way, do they make it easier on the side of operating the cluster itself, or on the side of interfacing with it and using it?
(Yair) More on the side of interfacing with it and using it, but they can also sometimes provide the cluster for you, if you want. And then you're on their cluster . . . All sorts of things like that, definitely.
(Ran) So we ended up talking specifically about Kubernetes, but obviously it's just an example; there are other tools in the world too. I was wondering whether you have any insights about what the technology stack will look like X years from now . . . I don't know, pick an X . . . say 5 years? 10 years? Will there be some consolidation toward one particular stack, or will we keep seeing things branch out? I know there are obviously business and economic questions here too; it's not only a technological question, that's completely clear . . .
But, I mean, from what you see today, do you see the first signs of new developments in cloud platforms?
(Yair) I think the cloud platforms' dream is . . . They work like a drug dealer: they want you to come in for free. When you're weak and small it looks cheap to you, you buy as many services as you can, and after a while it's “Oh no! I'm addicted to Lambda!” or “I'm addicted to ALB” . . . You can't get out of it. So they will improve and upgrade their services. If, say, Azure and AWS have gone big on Kubernetes, they'll somehow build “a humanitec of their own”. They'll ride that wave. I think what people want is simply to work faster, and that desire to work fast runs directly against the level of complexity we're dealing with. Microservices are nice, but they're hard to operate: you need a huge amount of context, a huge number of things. And the context changes a lot . . . There's some tool you think is cool, and suddenly it disappears completely, and you don't know what the next tool will be. But I think it will move toward more and more abstractions, more and more abstractions. People, even Ops people: very few have started to “get under the metal”, and more and more people will move up above it . . . I'll give you an example from day-to-day work: the other guy on my team who is relatively “vintage” and I always ask the classic question about TCP and HTTP. You wouldn't believe how many experienced people don't know it, can't explain it to me . . . And the younger guy on the team always tells me, “But you're ancient, you . . .” But how can you troubleshoot? Your ALB is still . . . Sorry, how can you resolve a failure if you don't understand, if you don't know what a three-way handshake is? I can't, I'm sorry, it annoys me . . .
(Ran) I'm sort of tempted to say “go ahead, ask me the interview question on air” and see whether I manage to embarrass myself, but I'll spare myself that . . . You know, I've thought about this a few times: look, I know how TCP and the three-way handshake work, fine, but there are plenty of other things I don't know, okay?
I don't know how the CPU works, I don't know how the GPU works, I don't know how GPU memory works, and there are lots of other things I don't know. At some point, you know, it's a kind of survival need: if you know everything, you won't be able to tell what's relevant to you from what isn't, quite apart from the fact that knowing everything isn't practical. So I'd say that in a sense it annoys you that they don't know TCP and the three-way handshake, but on the other hand they're “freeing up space in their RAM” for other things that may be more relevant . . . So from a survival point of view they may have made the right choice, even if not consciously: the choice of “let's not learn this, because it's a solved problem, and I'll invest the time in learning Helm or whatever”, other things that get a place in memory . . .
(Yair) First of all, I hired someone like that, so . . . I sound strict, but I'm really not. (2) I think, and this is also said in the context of the question, I tell him: “The question isn't . . . I don't want you to tell me . . .” Because there was someone who wasn't much of a networking expert, who started from the ARP level, the MAC, and really got down into the packets, and I said, “Fine, that doesn't interest me either . . .” But what I do want, in the context of an infrastructure engineer, is just give me the . . . I don't expect you to be the world networking champion, but I want you to at least know that there are layers. That's really not much to ask; we're not talking about some pinpointing here . . . we're talking about . . . You know, there are layers, and you can't solve the . . . For me it means, and sorry for disagreeing . . . But again, I hired someone even though he didn't know this, because he knew lots of other things . . . So it's not 100%, right? But . . .
(Ran) Yes, he showed an ability to go deep, you're saying . . . And by the way, we're obviously drifting into the topic of “how to interview for DevOps” . . . but that's an interesting topic too; maybe we should record something about it sometime... You're saying, though, that he went deep into something, okay? He proved he knows how to go deep, specifically . . .
(Yair) I'll tell you the truth, really and truly: I'm looking for the state of mind. Technology can be learned. These questions are more just to find out . . .
Listen, otherwise I'd take people with the right state of mind “off the street” and teach them, and I can't. These questions are a kind of “flare” I throw in the air to see how they react. What mainly matters to me is how the person thinks. Do they come with curiosity? Do they come with the ability to abstract from the things they deal with? Or do they just fire off answers, or are they a robot . . .
(Ran) Let's get back to our topic for a moment. We're already nearing the end, so let's pick one more topic. I wanted to talk a little about Cloud Native. It's obviously a term you hear quite a lot . . . What is it? Who is it good for? When do I need it? You know, everyone talks about it, so maybe I should know what it is too . . .
(Yair) Okay, first of all, Cloud Native is something that every guy or gal working in development right now should know. It's also . . . it's also a kind of non-profit organization that is run by all the giants: the CNCF, the Cloud Native Computing Foundation. I'm sorry, but sometimes I forget words in Hebrew . . . And it also brings an approach to how you're supposed to develop software in the cloud. Now, I know, and I say it too: “cloud computing” is not such an amazing new invention. I think anyone who has even worked with a mainframe knows it was essentially a kind of cloud computing: lots of supercomputers connected by a network. But yes, we are now in a situation where the world is changing. Even giant companies are starting, and this is in Germany, a country that is, let's say, very, very slow to embrace and adopt technologies, to leave the world of on-premise, the world of “I need my servers on my own site because they're secure” . . . and to think of the cloud as a “cluster of services”. And that cluster of services can get you working very, very, very fast. If you add the concepts of Agile and DevOps, you can create insanely elastic environments for yourself. You can use lots of tools. I'll add just one more thing: these are very, very vibrant communities. All the software giants pay lots and lots of money . . . For example, Helm is completely controlled by Microsoft: all the Microsoft people who work on Helm get their salaries from Microsoft . . .
(Ran) Yes, I saw that on GitHub. I think the person who created it works there and that's how it came about, but we could say a few words about that . . . [Interesting: Matt Butcher, and it looks like he moved on just this month . . .]
I just wanted to add a comment, to be a bit more concrete: you said “a cluster of services”, so let's look at a concrete example. Take storage. If storage used to be the ability to mount some physical disk inside your computer, today storage is in many cases something remote. S3 is a classic example. Now, you don't know how many machines are behind it, you don't know where it's stored, you have no idea . . . But you have an API, and you know it's elastic: when you need it, it will be there, and you pay only for what you use. That's an example, by the way: the service, S3 specifically, existed long before the term Cloud Native was coined. As in many cases, as with design patterns, first you look at what's happening and only then do you give it a name . . . So in effect you're saying that Cloud Native is a name given to a great many behaviors found in the field, and what they all have in common is the use of various cloud services . . . And by the way, we say “cloud”, but it doesn't have to be cloud . . . I know implementations of Cloud Native, let's call them that, which aren't in the cloud at all, which are on-premise . . .
(Yair) Right . . .
(Ran) . . . because they use Cloud Native concepts. So maybe the word “Cloud” is a bit confusing . . .
(Yair) . . . There are native tools and all the . . . all those things, definitely. Again, don't forget that underneath all these things are marketing tools, okay? . . . So obviously the cloud companies want you to think they own the cloud, because you pay them money . . . There's a reason Kubernetes was released by Google but Borg was not . . . because Kubernetes is the open-source version of Borg. You also see the disruption Kubernetes is causing, how it caught AWS off guard and how AWS is chasing it, and you understand why. There are interests at play here, enormous sums of money, right? Because AWS is Amazon's engine, and Microsoft is putting all its eggs into a mad dash on Azure. And Google works a bit differently. I can never quite understand the philosophy of what they're trying to do, but they have the best Kubernetes implementation. So you always need to remember, even though I speak in long sentences with lots of commas [+1], that at the end of the day they want to sell you something . . .
You can do all of these things on your own premises, on-prem. You can run whichever implementation you want, it doesn't only come from them, and you can get the same services on your own site. The only difference I'd add is that there, someone does the SRE for you, the Lift & Shift; they take care of . . . Someone makes sure your S3 is always there, and if it isn't, they refund your money. And that's a very, very important point to clarify, because paying someone else takes some of the load off you, and you can choose what you want to deal with. That is, I “don't want to see” the infrastructure at all, I don't want to hear about VMs. I want X places that I work with, as we said, say four or five services, and spare me everything else, I don't want to see it. You can get to that place now, or get very, very, very close to it.
(Ran) So if we try to sum up the takeaway from this Cloud Native section: (1) it's a collection of concepts worth knowing; (2) remember that there's marketing behind it, so not everything there is “carved in stone” [a candidate for understatement of the year?]. But there are quite a few best practices there that are worth knowing, adopting whatever is relevant to you. And the term itself, “Cloud”, can be a bit confusing, because frankly I think almost all the best practices that exist there can also apply outside the cloud. I know there are a great many excellent tools that have nothing to do with the cloud, like Grafana and others, which are part of Cloud Native, and they don't depend in any way on running on a VM in the cloud. In any case, there are quite a few good resources there, and all the giants are effectively leading it, because nobody wants to be left out, since it's a very good marketing platform . . .
(Yair) Totally . . .
(Ran) All right, we're coming to a close. Is there anything else you'd like to add?
(Yair) I think that . . . what I'd like to tell people is that if you're setting out on this journey of DevOps and Cloud Native, and you want to work with these tools, think carefully about why . . . What will these tools give me? Because tools for the sake of tools is idle . . . Always think, and this perhaps brings us back at the end to the beginning, to culture and DevOps, about how these tools will improve what we do together. And what we do is, we want the business to work . . .
How will it make the business better? What added value do I get from it, for each step I take? Do I have the people for it? Do I have the right architecture for it? For example, put a monolith on Kubernetes just because, and you don't gain much from it; you're “buying suffering”, as they say . . .
(Ran) . . . You need the technological readiness, but also the cultural readiness: the others in the company have to want to be part of it, so that you're not just throwing a set of technologies at them that they'll decide to ignore the next day . . .
(Yair) I'd also say: check whether it's a fit . . . Many times I was part of teams, and I have to be honest about this, where we chose tools because they looked cool to us. We chose tools because we knew them. We chose tools because that's what we decided in the moment, because there was a meeting and someone had to shout something . . . [this in the background] A bit . . . That's what's nice about it, and what I see now: how so many people repeat the same patterns of mistakes. All I want to say is “I've been there too!” Now I'm on the outside, I'm not making the mistakes, I only see them. Let's stop for a moment, let's think . . . let's do something better this time.
(Ran) Yes . . . Well, thank you Yair, thank you very much! It was fun and interesting. Good luck and continued success at Polar Squad. Let's keep in touch. Goodbye!
Enjoy listening, and many thanks to Ofer Forer for the transcription!
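The point made in the interview about layers (HTTP rides on top of a TCP connection that must first complete its three-way handshake) can be illustrated with a small sketch. This is only an illustration under stated assumptions: the helper names `serve_once` and `fetch_status_line` are hypothetical, everything runs on the local loopback interface, and the handshake itself is performed by the operating system inside `connect()` and `accept()` rather than by this code.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection and answer it with a tiny HTTP response."""
    # accept() hands back a connection whose TCP handshake is already complete.
    conn, _addr = listener.accept()
    with conn:
        conn.recv(4096)  # read the (small) request
        body = b"ok"
        conn.sendall(
            b"HTTP/1.1 200 OK\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: close\r\n\r\n" + body
        )


def fetch_status_line() -> str:
    """Open a TCP connection first, then speak HTTP over it."""
    # Transport layer: a listening TCP socket on an ephemeral local port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()

    # create_connection() returns only after SYN, SYN/ACK, ACK have been
    # exchanged; no HTTP bytes have been sent at this point.
    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    with client:
        # Application layer: HTTP is plain text written to the TCP stream.
        client.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        response = b""
        while chunk := client.recv(4096):
            response += chunk

    worker.join()
    listener.close()
    return response.split(b"\r\n", 1)[0].decode()


status_line = fetch_status_line()
print(status_line)
```

If the connection step fails (for example, a load balancer cannot reach its target), no HTTP exchange ever happens, which is why knowing which layer broke matters when troubleshooting.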

Changelog Master Feed
Kaizen! Are we holding it wrong? (Ship It! #30)

Changelog Master Feed

Play Episode Listen Later Dec 1, 2021 81:20


This is our third Kaizen episode in which Adam, Jerod & Gerhard talk about GitOps the wrong way, ask questions with Honeycomb and realise that they must be holding the CDN wrong, and the effort that has been going into moving all changelog.com static files from regular volumes to an S3-like object store. If you like a good yak shave, listening to this one is a lot more fun than doing it. Gerhard is most excited about the Ship It Christmas gifts that we have been preparing for you. While GitHub Codespaces is not going to be part of the upcoming Christmas special episode, today's talk covers why investing in a Codespaces integration is worth it. Changelog #459 and Backstage #20 are related to this topic.

Así las cosas
There is no way out of South Africa, the borders are closed: Daniel Almanza

Así las cosas

Play Episode Listen Later Nov 30, 2021 7:48


There are at least ten of us Mexicans, says the ultramarathon runner, who is trying to leave Mozambique amid the Omicron alert and has not yet been in contact with the SRE (Secretaría de Relaciones Exteriores, Mexico's foreign ministry).

Hipsters Ponto Tech
Monitoring and Observability Tools – Hipsters Ponto Tech #281

Hipsters Ponto Tech

Play Episode Listen Later Nov 30, 2021 52:48


Have you heard of observability, SRE, and monitoring? In today's episode of Hipsters Ponto Tech we talk about this alphabet soup, and about the open source tools and products that large companies use to solve infrastructure problems and make deployments easier for the dev team.
Participants:
Paulo Silveira, the host, who really enjoyed getting his questions answered during the episode
Bruno Giannella, Executive Technology Manager at Banco Pan
Felipe Bernardes, Analyst on the SRE Observability team at Banco Pan
Bruno Pereira, CEO of Elvenworks
Links:
Alura+ "Monitorando aplicações: 4 Golden Signals"
Hipsters.Tech episode "DevOps: Observabilidade"
Subscribe to Alura's YouTube channel
Subscribe to the Imersão, Aprendizagem e Tecnologia newsletter
Production and content:
Alura Cursos de Tecnologia - https://www.alura.com.br
Caelum Escola de Tecnologia - https://www.caelum.com.br/
Editing and sound design: Radiofobia Podcast e Multimídia

Practical Operations Podcast Episode Feed
Episode 126 - SRE Doesn't Scale

Practical Operations Podcast Episode Feed

Play Episode Listen Later Nov 29, 2021 29:31


Where we discuss Tyler Treat’s essay about how the paradigm of SRE doesn’t scale. Comments for the episode are welcome - at the bottom of the show notes for the episode there is a Disqus setup, or you can email us at feedback@operations.fm.
Links for Episode 126:
SRE Doesn’t Scale
Google SRE Book: The Evolving SRE Engagement Model
Reddit Pay Scale Post

Luis Cárdenas
An exciting case, we are taking on the most powerful industry in the world: Alejandro Celorio

Luis Cárdenas

Play Episode Listen Later Nov 26, 2021 21:47


Alejandro Celorio, legal adviser to the SRE (Mexico's foreign ministry), spoke with Luis Cárdenas about Mexico's lawsuit against US gun companies.

Changelog Master Feed
Find the infrastructure advantage (Ship It! #29)

Changelog Master Feed

Play Episode Listen Later Nov 24, 2021 66:07


Zac Smith, managing director of Equinix Metal, is sharing how Equinix Metal runs the best hardware and networking in the industry, why pairing magical software with the right hardware is the future, and what Open19 means for sustainability in the data centre. Think modular components that slot in (including CPUs), liquid cooling that converts heat into energy, and a few other solutions that minimise the impact on the environment. But first, Zac tells us about the transition from Packet to Equinix Metal, his reasons for doing what he does, as well as the things that he is really passionate about, such as the most efficient data centres in the world and building for the love of it. This is a great follow-up to episode 18 because it goes deeper into the reasons that make Gerhard excited about the work that Equinix Metal is doing. This conversation with Zac puts it all into perspective. By the way, did you know that Equinix stands for Equality in the Internet Exchange?

Tech Unlocked
EP 48 | How to get unstuck and make successful career transitions with Linda Vivah

Tech Unlocked

Play Episode Listen Later Nov 23, 2021 55:53


Do you currently feel stuck in your career? Do you want to make a career transition but don't know how to start? This is the perfect episode for you! Today on the show, Grace chats with Linda Vivah, a Software Engineer, an AWS Community Builder, a mom of 2, a part-time wedding singer & founder of Coding Crystals, about how to get unstuck in your career and develop the mindset needed to make successful career transitions.
Linda Vivah is a Software Engineer at a major media organization in NYC, an AWS Community Builder, a mom of 2, a part-time wedding singer & the founder of a shop called Coding Crystals: a jewelry & accessories company taking inspiration from STEM. She creates content around tech including coding, cloud computing, and career tips, with a sprinkle of lifestyle. Linda had an untraditional journey into tech and broke into tech post-college via self-studying and attending a coding bootcamp. She currently works as an SRE (Site Reliability Engineer). She was previously working as a Web Application Developer (mainly JavaScript) for 5 years and transitioned from Web Development to an SRE role last year to be more hands-on in the cloud computing space.
Key takeaways:
How to shift your mindset when you feel stuck in your career
Practical tips for getting unstuck in your career
How to learn a new skill while balancing your day job
Why most people in tech experience imposter syndrome
The truth about coding bootcamps
Top 3 things you should look for in your first tech job
Resources:
AWS Certified Cloud Practitioner (CLF-C01)
The Cloud Resume Challenge
AWS Certified Solutions Architect – Associate
Flatiron Bootcamp
Follow Tech Unlocked for career tips: Website, Substack, Twitter, Instagram
Connect with Grace: Instagram, Twitter, LinkedIn
Connect with Linda: Instagram, TikTok, YouTube, Twitter, Medium, Coding Crystals Shop
Enjoyed listening to this episode? Please leave a review on iTunes and Spotify. Questions about sponsorship? Email us at techunlockedpod@gmail.com

The Stack Overflow Podcast
Who owns this outage? Building intelligent, automated escalation chains

The Stack Overflow Podcast

Play Episode Listen Later Nov 22, 2021 22:52


Maxwell, a solution architect at xMatters, took a winding road to get to where he is. After a computer engineering education, he held jobs as field support engineer, product manager, SRE, and finally his current role as a solutions architect, where he serves as something of an SRE for SREs, helping them solve incident management problems with the help of xMatters. When he moved to the SRE role, Maxwell wanted to get back to doing technical work. It was a lateral move within his company, which was migrating an on-prem solution into the cloud. It's a journey that plenty of companies are making now: breaking an application into microservices, running processes in containers, and using Kubernetes to orchestrate the whole thing. Non-production environments would go down and waste SRE time, making it harder to address problems in the production pipeline. At the heart of their issues was the incident response process. They had several bottlenecks that prevented them from delivering value to their customers quickly. Incidents would send emails to the relevant engineers, sometimes 20 on a single email, which made it easy for any one engineer to ignore the problem—someone else has got this. They had a bad silo problem, where escalating to the right person across groups became an issue of its own. And of course, most of this was manual. Their MTTR—mean time to resolve—was lagging. Maxwell moved over to xMatters because they managed to solve these problems through clever automation. Their product automates the scheduling and notification process so that the right person knows about the incident as soon as possible. At the core of this process was a different MTTR—mean time to respond. Once an engineer started working to resolve a problem, it was all down to runbooks and skill. But the lag between the initial incident and that start was the real slowdown. It's not just the response from the first SRE on call. 
It's the other escalations down the line—to data engineers, for example—that can eat away time. They've worked hard to make  escalation configuration easy. It not only handles who's responsible for specific services and metrics, but who's in the escalation chain from there. When the incident hits, the notifications go out through a series of configured channels; maybe it tries a chat program first, then email, then SMS. The on-call process is often a source of dread, but automating the escalation process can take some of the sting out of it. Check out the episode to learn more. 
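The channel-by-channel escalation described above (try chat first, then email, then SMS) can be sketched in a few lines. This is a minimal illustration only, not xMatters' product or API: the `Channel` class, the `escalate` function, and the stub senders are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Channel:
    """One notification channel in the chain (hypothetical, for illustration)."""
    name: str
    notify: Callable[[str], bool]  # returns True when the responder acknowledges


def escalate(incident: str, chain: List[Channel]) -> Optional[str]:
    """Walk the chain in order and stop at the first acknowledged channel."""
    for channel in chain:
        if channel.notify(incident):
            return channel.name
    return None  # nobody acknowledged; a real system would move to the next person


# Stub senders standing in for real integrations: chat and email go unanswered.
chain = [
    Channel("chat", lambda incident: False),
    Channel("email", lambda incident: False),
    Channel("sms", lambda incident: True),
]
acknowledged_via = escalate("checkout latency above threshold", chain)
print(acknowledged_via)
```

Keeping the chain as ordered data rather than hard-coded branches is one way to make the escalation order something a team can configure per service.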

Luis Cárdenas
Full program, Luis Cárdenas, November 18

Luis Cárdenas

Play Episode Listen Later Nov 18, 2021 181:38


It would be hard for the Electricity Reform not to come up at the North American Leaders' Summit: Ezra Shabot. What matters to them is that López Obrador keeps doing the dirty work on migration: Jorge Castañeda.

Luis Cárdenas
What matters to them is that López Obrador keeps doing the dirty work on migration: Jorge Castañeda

Luis Cárdenas

Play Episode Listen Later Nov 18, 2021 14:30


Jorge Castañeda, former Secretary of Foreign Affairs and professor at New York University, spoke with Luis Cárdenas about the North American Leaders' Summit.

Screaming in the Cloud
Breaking Down Productivity Engineering with Micheal Benedict

Screaming in the Cloud

Play Episode Listen Later Nov 18, 2021 45:32


About Micheal Benedict

Micheal Benedict leads Engineering Productivity at Pinterest. He and his team focus on developer experience, building tools and platforms for over a thousand engineers to effectively code, build, deploy and operate workloads on the cloud. Mr. Benedict has also built Infrastructure and Cloud Governance programs at Pinterest and previously, at Twitter, focused on managing cloud vendor relationships, infrastructure budget management, cloud migration, capacity forecasting and planning and cloud cost attribution (chargeback).

Links:
Pinterest: https://www.pinterest.com
Twitter: https://twitter.com/micheal
LinkedIn: https://www.linkedin.com/in/michealb/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: You know how Git works, right?

Announcer: Sorta, kinda, not really. Please ask someone else!

Corey: That's all of us. Git is how we build things, and Netlify is one of the best ways I've found to build those things quickly for the web. Netlify's Git-based workflows mean you don't have to play slap and tickle with integrating arcane nonsense and web hooks, which are themselves about as well understood as Git. Give them a try and see what folks ranging from my fake Twitter for Pets startup, to global Fortune 2000 companies are raving about. If you end up talking to them, because you don't have to, they get why self service is important—but if you do, be sure to tell them that I sent you and watch all of the blood drain from their faces instantly. You can find them in the AWS marketplace or at www.netlify.com. N-E-T-L-I-F-Y.com

Corey: This episode is sponsored in part by our friends at Vultr.
Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim it's better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive $100 in credit. That's v-u-l-t-r.com slash screaming.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Sometimes when I have conversations with guests here, we run long. Really long. And then we wind up deciding it was such a good conversation, and there's still so much more to say that we schedule a follow-up, and that's what happened today. Please welcome back Micheal Benedict, who is, as of the last time we spoke and presumably still now, the head of engineering productivity at Pinterest. Micheal, how are you?

Micheal: I'm doing great, and thanks for that introduction, Corey.
Thankfully, yes, I am still the head of engineering productivity; I'm really glad to speak more about it today.

Corey: The last time that we spoke, we went up one side and down the other of large-scale environments running on AWS and billing aspects thereof, et cetera, et cetera. I want to stay away from that this time and instead focus on the rest of engineering productivity, which is always an interesting and possibly loaded term. So, what is productivity engineering? It sounds almost like it's an internal dev tools team, or is it something more?

Micheal: Well, thanks for asking because I get this question asked a lot of times. So, for one, our primary job is to enable every developer, at least at our company, to do their best work. And we want to do this by providing them a fast, safe, and a reliable path to take any idea into production without ever worrying about the infrastructure. As you clearly know, learning anything about how AWS works—or any public cloud provider works—is a ton of investment, and we do want our product engineers, our mobile engineers, and all the other folks to be focused on delivering amazing experiences to our Pinners. So, we could be doing some of the hard work in providing those abstractions for them in such way, and taking away the pain of managing infrastructure.

Corey: The challenge, of course, that I've seen is that a lot of companies take the approach of, “Ah. We're going to make AWS available to all of our engineers in its raw, unfiltered form.” And that lasts until the first bill shows up. And then it's, “Okay. We're going to start building some guardrails around that.” Which makes a lot of sense. There then tends to be a move towards internal platforms that effectively wrap cloud services.

And for a while now, I've been generally down on the concept and publicly so in the general sense.
That said, what I say that applies as a best practice or something that most people should consider does tend to fall apart when we talk about specific use cases. You folks are an extremely large environment; how do you view it? First off, do you do internal platforms like that? And secondly, would you recommend that other companies do the same thing?

Micheal: I think that's such a great question because every company evolves with its own pace of development. And I wouldn't say Pinterest by itself had a developer productivity or an engineering productivity organization from the get-go. I think this happens when you start realizing that your core engineers who are working on product are now spending a certain fraction of time—which starts ballooning pretty fast—in managing the underlying systems and the infrastructure. And at that point in time, it's probably a good question to ask, how can I reduce the friction in those people's lives such that they could be focused more on the product. And, kind of, centralize or provide some sort of common abstractions through a central team which can take away all that pain.

So, that is generally a good guiding principle to think about when your engineers are spending at least 30% of their time on operating the systems rather than building capabilities, that's probably a good time to revisit and see whether a central team would make sense to take away some of that. And just simple examples, right? This includes upgrading OS on your EC2 machines, or just trying to make sure you're patching all the right versions on your next big Kubernetes cluster you're running for serving x number of users. The moment you start seeing that, you want to start thinking about, if there is a central team who could take away that pain, what are the things they could be investing on to help up-level every other engineer within your organization.
And I think that's one of the best ways to be thinking about it.

And it was also a guiding principle for us within Pinterest to view what investments we could make in these central teams which can up-level each and every different type of engineer in the company as well. And just an example on that could be your mobile engineer would have very different expectations from your backend engineer who was working on certain aspects of code in your product. And it is truly important to understand where you want to centralize capabilities, which both these types of engineers could use, or you want to divest and have unique capabilities where it's going to make them productive. There's no one-size-fits-all solution for this, but I'm happy to talk about what we have at Pinterest, which has been reasonably working well. But I do think there's a lot more improvements we could be doing.

Corey: Yeah, but let's also be clear that, as you've mentioned, you are heavily biased towards EC2 instances for a lot of what you do. If we look at the AWS console and we see hundreds of different services now, and it's easy to sit here and say, “Oh, internal platforms are terrible because all of those services are going to be enhanced in various ways and you're never going to be able to keep up with feature parity.” Yeah, but if you can wrap something like EC2 in an internal platform wrapper, that begins to be a different story because sure, someone's going to go and try something new with a different AWS service, they're going to need direct access. But the EC2 product across the board generally does not evolve in leaps and bounds with transformative changes overnight.
Let's also not forget that at a company with the scale that Pinterest operates at, “Hey, AWS just dusted off a new feature and docs are still rolling out, and it's not in CloudFormation yet, but we're going to roll it out to production,” probably seems like the wrong direction to go in, I would assume.

Micheal: And yes, I think that brings one of the key guardrails, I think, which these groups provide. So, when we start thinking about what teams, centralized teams like engineering productivity, developer tools, developer platforms actually do is they help with a couple of things. The top three are: they can help pave a path for the most common use cases. Like to your point, provisioning EC2 does take a set of steps, all the time. If you're going to have a thousand people doing that every time they're building a new service or trying to expand capacity playing with their launch templates, those are things you can start streamlining and making it simple by some wrapper because you want to address those 80% use cases which are usually common, and you can have a wrapper or could just automate that. And that's one of the key things: can you provide a paved path for those use cases?

The second thing is, can you do that by having the right guardrails in place? How often have you heard the story that, “I just clicked a button and that now spun up, like, a thousand-plus instances.” And now you have to juggle between trying to stop them or do something about it.

Corey: Back in 2013, you folks were still focusing on this fair bit. I remember because Jeremy Carroll, who I believe was your first SRE there once upon a time, wound up doing a whole series of talks around how Pinterest approached doing an AMI Factory. And back in those days, the challenges were, “Okay.
We have the baseline AMI, and that's great, but we also want to do deployments of things and we don't really want to do a new deploy of an entire fleet of EC2 instances for a single line of config change, so how do we wind up weighing off of when you bake a new AMI versus when you just change something that has—in what is deployed to them?” And it was really a complicated problem back then.

I'm not convinced it's not still a complicated problem, but the answers are a lot more cohesive. And making sure that every team—when you're talking about a company as large as Pinterest with that many teams—is doing things in the same way, seems like it's critically important otherwise you wind up with a whole bunch of unique-looking instances that each have to be managed by hand as opposed to something that can be reasoned around collectively.

Micheal: Yep. And that last part you mentioned is extremely crucial as well because like I said, our audience or our customers are just not the engineers; we do work with our product managers and business partners as well because at times, we have to tie or change our architecture based on certain cost optimizations which would make sense, like you just articulated. We don't want to have all the instance types. It does not add much value to a developer unless they're explicitly seeking a high-memory instance or a [GP-based instance in a 00:10:25] certain way. So, we can then work with our business partners to make sure that we're committing to only a certain type of instances, and how we can abstract our tools to only give you that.
For example, our deployment system, Teletraan which is an open-source system, actually condenses down all these instance types to a couple of categories like high-compute, high-memory—and you've probably seen that in many of the new cloud providers as well—so people don't have to learn or know the underlying instance type.

When we moved from c3 to c5, it was just called as a high-compute system, so the next time someone provisioned a new service or deployed it using our system, they would just select high-compute as the de facto instance type and we would just automatically provision a C5 for them. So, that just reduces the extra complexity or the cognitive overhead individuals would have to go through in learning each instance type, what is the base AMI that comes on it, what are the different configurations that need to go in terms of setting up your AZ-scaling properties. We give them a good reasonable set of defaults to get started with, and then they can then work on optimizing or making changes to it.

Corey: Ignoring entirely your mispronunciation of AMI, which is, of course, three syllables—and that is a petty hill upon which I will die—it occurs to me the more I work with AWS in various ways, the easier it gets. And I used to think in some respects, it was because the platform was so—it was improving so dramatically around me. But no, in many cases, it's because the first time you write some CloudFormation by hand, it's a nightmare and you keep smacking into weird issues. But the second or third time, it's super easy because you just copy the thing you've already built and change the relevant bits around.
And that was the learning curve that I went through playing around with a lot of these things.

When you start looking at this from a large-scale environment where it's not just about upskilling the people that you have to understand how these things integrate in AWS land, but also the consistent onboarding of engineers at a fairly progressive clip is, great, you effectively have to start doing trainings on all these things, and there's a lot of knobs and dials that can blow up and hurt people. At some point, building the guardrails or building the environment in which you are getting all the stuff abstracted away from where the application engineers have to think about this at all, it eventually reaches a tipping point where it starts to feel like it's no longer optional if you want to continue growing as a company because you don't have the luxury of spending six months of onboarding before you let someone touch the thing they were hired to build.

Micheal: And you will see that many companies very often have very similar programming practices like you just described. Even I learned that the same way: you have a base template, you just copy-paste it and start from there on. And no one goes through the bootstrapping process manually anymore; you want to—I think we call it cargo-culting, but in general, just get something to bootstrap and start from there. But one of the things we learned in sort of the hard way is that can also lead to, kind of, you pushing, you know, not great practices because people don't know what is a blessed version of a good template or what actually would make sense. So, some of those things, we have been working on.

And this is where centralized teams like engineering productivity are really helpful is we provide you with the blessed or the canonical way to do certain things. Case in point example is a CI/CD pipeline or delivery of software services.
We have invested enough in experimenting on what works with some of the more nuanced use cases at Pinterest, in helping generate, sort of, a canonical version which would cover 80% of the use cases. Someone could just go and try to build a service and they could just use the same canonical pipeline without learning much or making changes to it. This also reduces that cargo-culting nature which I called, rather than copying it from unknown sources and trying to like—again, it may cause havoc to our systems, so we can avoid a lot of that because of these practices.

Corey: So, let's step a little bit beyond AWS—I know I hate doing it, too—but I'm going to assume that your remit is broader than, oh, AWS whisperer-slash-Wrangler. So, tell me a little bit more about what it is that your day-to-day looks like if there is anything that could be said not to focus purely around AWS whispering.

Micheal: So, one of the challenges—and I want to talk about this a bit more—is our environments have become extremely complex over time. And it's the nature of, like, rising entropy. Like, we've just noticed that there's two things: we have a diverse set of customer base, and these include everyone trying to do different workloads or work service types. What that essentially translates into is that we realized that our solution may not fit all of them. For example, what works for a machine-learning engineer in terms of iterating on building a model and delivering a model is not the same as someone working on a long-running service and trying to deploy that. The same would apply for someone trying to operate a Kafka system.

And that has made, I think, definitely our job a bit challenging in trying to assess where do you actually draw the line on the abstraction? What is the right layer of abstraction across your local development experience, across when you move over to staging your code in a PR model and getting feedback and subsequently actually releasing it to production?
Because this changes dramatically based on what is the workload type you're working on. And we feel like that has been one of the biggest challenges where I know I spent my day-to-day and my team does too, in trying to help provide some of the right solutions for these individuals. There's—very often we'll also get asked from individuals trying to do a very nuanced thing.

Of late, we have been talking about thinking about how you operate functions, like provide Functions as a Service within the company? It just put us in a difficult spot at times because we have to ask the hard question, “Is this required?” I know the industry is doing it; it's definitely there. I personally believe, yes, it could be a future, but is that absolutely important? Is that going to benefit Pinterest in any formal way if we invest on some core abstractions?

And those are difficult conversations to have because we have exciting engineers coming in trying to do amazing things; it puts us in a hard spot, as well, as to sometimes saying graciously, no. I know many companies deal with it when they have these centralized teams, but I think it's part of that job. Like when you say it's day-to-day, I would say I'm probably saying no a couple of times in that day.

Corey: Let's pretend for the sake of argument that I am, tomorrow morning, starting another company—Twitter for Pets—and over the next ten years, it grows to be larger than Pinterest in terms of infrastructure, probably not revenue because it turns out pets are not the lucrative source of ad revenue that I was hoping it would be but, you know, directionally the same thing. It seems to me that building out this sort of function with this sort of approach to things is dramatically early as far as optimizations go when it's just me puttering around on something. I'm always cognizant of the wrong people taking the wrong message when we're talking about things that happen like this at scale.
When does having an engineering productivity group begin to make sense?

Micheal: I mentioned this earlier; like, yeah, there is definitely not a right answer, but we can start small. For example, this group actually started more as a delivery team. You know, when we started, we realized that we had different ways of deploying services or software at Pinterest, so we first gathered together to figure out, okay, what are the different ways and can we start simplifying that part? And that's where it started expanding. Okay, we are doing button-based deployments right now we have thousand-plus microservices, and we are seeing more incidents than we wanted to because anything where there's a human involved means there's a potential gap for error. I myself was involved in a SEV 0 incident, and I will be honest; we ended up deploying a Hello World application in one of our production fleet. Not the thing I wanted to be associated with my name, but, you know—

Corey: And you were suddenly saying hello to the world, in fact—

Micheal: [laugh].

Corey: —and oops-a-doozy.

Micheal: Yeah. So—and that really prompted us to rethink how we need to enable guardrails to do safe production rollouts. And that's how those conversations start ballooning out.

Corey: And the healthy correct way. We've all broken production in various ways, and it's—you correctly are identifying, I believe, the direction you're heading in where this is a process problem and a tooling problem; it is not that you are secretly crap and should never have been allowed near anything in production. I mean, that's my excuse for me, but in your case, this is a common thing where it's, if someone can unintentionally cause issues like that, there needs to be better processes and procedures as the organization matures.

Micheal: Yep. And that's kind of like always the route or the starting point for these discussions.
And it starts growing from there on because, okay, you've helped improve the deploy process but now we're seeing insane amount of slowness, say on the build processes, or even post-deploy, there's, like, issues on how we monitor and look into data.

And that I think forces these conversations, okay, where do we have these bespoke tools available? What are people doing today? And you have to ask those hard questions, like what can we actually remove from here? The goal is not to introduce yet another new system. Many a times, to be honest bash just gets the job done. [laugh].

Personally, I'm okay with that as long as it's consistent and people, you know, are able to contribute to it and you have good practices in validating it, if it works, we should go for it rather than introducing yet another YAML [laugh] and some of that other aspects of doing that work. And that's what we encourage as well. That's how I think a lot of this starts connecting together in terms of, okay, now this is becoming a productivity group; they're focused on certain challenges where investing probably one person here may up-level a few other engineers who don't have to do that on a day-to-day basis. And I think that's one of the key items for, especially, folks who are running mid-sized companies to realize and start investing in these type of teams to really up-level, sort of, the rest of the engineering.

Corey: You've been doing this for a fair while. If you were to go back and start over again on day one—which is always a terrifying question, on some level—what would you have done differently about building out this function as Pinterest continued to scale out?

Micheal: Well, first, I must acknowledge that this was just not me, and there's, like, ton of people involved in helping make this happen.

Corey: No, that's fair. We'll blame them for the missteps; that is—

Micheal: [laugh].

Corey: —just fine with me. I kid. I kid.

Micheal: I think, definitely the nuances.
If I look back, all the decisions that were made then at that point in time, there was a decision made to move to Phabricator, which was back then a great open-source code management system where with the current information at that point in time. And I'm not—I think it's very hard to always look back and say, “Oh, we could have chosen x at one point in time.” And I think in reality, that's how engineering organizations always evolve, that you have to make do with the information you have right now to make a decision that works for you over a couple of years.

And I'll give you a small example of this. There was a time when Pinterest was actually on GitHub Enterprise—this was like circa 2013, I would say—and it really served us well for, like, five-plus years. Only then at certain point, we realized that it's hard to hire PHP engineers to support a tool like that, and we had to rethink what is the ROI and the investments we've made here? Can we ever map up or match back to one of the offerings in the industry today? And that's when you make decisions that, okay, at this point in time, it's clear that business continuity talks, you know, and it's hard to operate a system, which is, at this moment not supported, and then you make a call about making a shift or moving.

And I think that's the key item. I don't think there's anything dramatically I would have changed since the start. Perhaps definitely investing a bit more individuals into the group and going from there. But that said, I'm really, sort of, at least proud of the fact that usually these teams are extremely lean and small, and they always have an outsized impact, especially when they're working with other engineers, other [opinionated 00:22:13] engineers for what it's worth.

This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier.
It provides over 20 free services and infrastructure, networking databases, observability, management, and security.

And - let me be clear here - it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build.

With Always Free you can do things like run small scale applications, or do proof of concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free. No asterisk. Start now. Visit https://snark.cloud/oci-free that's https://snark.cloud/oci-free.

Corey: Most folks show up intending to do good today, and you make the best decision at the time with the context and constraints that you have, but my question I think is less around, “Well, what were the biggest mistakes you made?” But more to do with the idea of, based upon what you've learned and as you have shown—as you've shined light on these dark areas, as you have been exploring it, has anything jumped out at you that is, “Oh, yeah. Now, that I know—if I had known then what I know now, I would definitely have made this other decision.” Ideally, something that applies a little more globally than specific within Pinterest, just because the whole idea, aspirationally, is that people might learn something from our conversation. At least I will, if nothing else.

Micheal: No, I think that's a great question. And I think the three things that jump to me, top of mind. I think technology is means to an end unless it gives you a competitive edge. And it's really hard to figure out at what point in time what technology and why we adopted it, it's going to make the biggest difference.
Humans always tend to have a bias towards aligning towards where we want to go. So, that's the first one in my mind.

The second one is, and we spoke about this last time, embrace your cloud provider as much as possible. You'd want to avoid taking on operational burden which is not going to add value to the business. If there is something you see you're operating which can be offloaded—because your provider can, trust me, do a way better job than you or your team of few can ever do—embrace that as soon as possible. It's better that way because then it frees up your time to focus on the most important thing, which I've realized over time is—I really think teams like ours are actually—we're probably the most value as a glue to all the different experiences a software engineer would go through as part of their SDLC lifecycle.

If we can simplify someone's life by giving them a clear view as to where their commit or the work is in this grand scheme of rolling out and giving them the right amount of data to take action when something goes wrong, trust me, they will love you for what you're doing because you're saving them ton of time. Many times, we don't realize that when we publish 11 different ways for you to go and check to just get your basic validation of work done. We tend to so much focus on the technological aspect of what the tool does, rather than the experience of it, and I've realized, if you can bridge the experience, especially for teams like ours, people really don't even need to know whether you're running Kubernetes or any of those solutions behind the scenes. And I think that's one of the biggest takeaways I have.

Corey: I want to double down on something you said about the fact that you are not going to be able to run these services as effectively as your provider can. And relatively recently—in fact, since the first time we spoke—AWS has released an investment report in Virginia.
And from 2011 through 2020, they have invested in building AWS data centers there, $35 billion. I promise almost no company that employs people listening to this that are not themselves a cloud provider is going to make that kind of investment in running these things themselves.

Now, do cloud providers have sharp edges? Yes, absolutely. That is what my entire career is about, unfortunately. But you're not going to do a better job of running things more sustainably, more reliably, et cetera, et cetera. But there are other problems with this—and that's what I want to start exploring here—where in the olden days, when I ran things in data centers and they went down a lot more as a result, sometimes when there were outages, I would have the CEO of the company just standing there nervous worrying over my shoulder as I frantically typed to fix things.

Spoiler: my typing accuracy did not improve by having someone looming over me. Now, when there's an outage that your cloud provider takes, in many cases the thing that you are doing to fix it is reloading the status page and waiting for an update because it is completely out of your hands. Is that something that you've had to encounter? Because you can push buttons and turn dials when things are broken and you control it, but in an AWS—or other cloud provider—outage, all you can really do is wait unless you have a DR plan that is large-scale and effective enough that you won't feel foolish or have wasted a huge amount of time and energy migrating off and then—because then it gets repaired in ten minutes. How do you approach that, from your perspective? I guess, the expectation management piece?

Micheal: It's definitely I know something which keeps a lot of folks within infrastructure up at night because, like you just said, at times we can feel extremely powerless when we obviously don't have direct control—or visibility at times, as well—on what's happening.
One of the things we have realized over time as part of running on our cloud provider for over a decade now, it forces us to rethink a bit on our priority workflows, what we want our Pinners to always have access to, what they need to see, what is not important or critical. Because it puts into perspective, even for the infrastructure teams, as to what is the most important thing we should always have available and running, what is okay to be in a degraded state, until what time, right? So, it actually forces us to define SLOs and availability criteria within the team where we can broadcast that to the larger audience including the executives. So, none of this comes as a surprise at that point.
I mean, it's not the answer, probably, you're looking for because there's nothing we can do except set expectations clearly on what we can do and how you think about the business when these things do happen. So, I know people may have a different view on this; I'm definitely curious to hear as well, but I know at Pinterest at least we have converged on our priority workflows. When something goes out, how do we jump in to provide a degraded experience? We have very clear run books to do that, and especially when it's a SEV 0, we do have clear processes in place on how often we need to update our entire company on where things are. And especially this is where your partnership with the cloud provider is going to be a big, big boon because you really want to know or have visibility, at the minimum some predictability on when things can get resolved, and how you want to work with them on some creative solutions. 
This is outside the DR strategy, obviously; you should still be focused on a DR strategy, but these are just simple things we've learned over time on how to just make it predictable for individuals within the company, so not everyone is freaking out.
Corey: Yeah, from my perspective, I think the big things that I found that have worked, in my experience—mostly by getting them wrong the first time—is explain that someone else is running the infrastructure; when they take an outage, there's not much we can do. And no, it's not the sort of thing where picking up the phone and screaming at someone is going to help us, is the sort of thing that is best to communicate to executive stakeholders when things are running well, not in the middle of that incident.
Then when things break, it's one of those, “Great, you're an exec. You know what your job is? Literally anything other than standing in the middle of the engineering floor, making everyone freak out even more. We'll have a discussion later about what the contributing factors were when you demand that we fire someone because of an outage. Then we're going to have a long and hard talk about what kind of culture you're trying to build here again?” But there are no perfect answers here.
It's easy to sit here in the silver light of day with things working correctly and say, “Oh, yeah. This is how outages should be handled.” But then when it goes down, we're all basically an inch away at best from running around with our hair on fire, screaming, “Fix it, fix it, fix it, fix it, now.” And I am empathetic to that. There's a reason that I fix AWS bills for a living, and one of those big reasons is that it's a strictly business-hours problem and I don't have to run production infrastructure that faces anything that people care about, which is kind of amazing and freeing for someone who spent too many years on call.
Micheal: Absolutely. 
And one of the things is that this is not only with the cloud provider, I think in today's nature of how our businesses are set up, there's probably tons of other APIs you are using or you're working with you may not be aware of. And we ended up finding that the hard way as well. There were a certain set of APIs or services we were using in the critical path which we were not aware of. When these outages happen, that's when you find that out.So, you're not only beholden to your provider at that point in time; you have to have those SLO expectations set with your other SaaS providers as well, other folks you're working with. Because I don't think that's going to change; it's probably only going to get complicated with all the different types of tools you're using. And then that's a trade-off you need to really think about. An example here is just like—you know, like I said, we moved in the past from GitHub to Phabricator—I didn't close the loop on that because we're moving back to GitHub right now [laugh] and that's one of the key projects I'm working with. Yeah, it's circle of life.But the thing is, we did a very strong evaluation here because we felt like, “Okay, there's a probability that GitHub can go down and that means people will be not productive for that couple of hours. What do we do then?” And we had to put a plan together to how we can mitigate that part and really build that confidence with the engineering teams, internally. And it's not the best solution out there; the other solution was just run our own, but how is that going to make any other difference because we do have libraries being pulled out of GitHub and so many other aspects of our systems which are unknowingly dependent on it anyways. 
So, you have to still mitigate those issues at some point in your entire SDLC process.So, that was just one example I shared, but it's not always on the cloud provider; I think there are just many aspects of—at least today how businesses are run, you're dependent; you have critical dependencies, probably, on some SaaS provider you haven't really vetted or evaluated. You will find out when they go down.Corey: So, I don't think I've told this story before, but before I started this place, I was doing a fair bit of consulting work for other companies. And I was doing a project at Pinterest years ago. And this was one of the best things I've ever experienced at a company site, let alone a client site, where I was there early in the morning, eight o'clock or so, so you know, engineers love to show up at the crack of 11:30. But so I was working a little early; it was great. And suddenly my SSH session that I was using to remote into something or other hung.And it's tap up, tap enter a couple of times, tap it a couple more. It was hung hard. “What's the—” and then someone gently taps me on the shoulder. So, I take the headphones off. It was someone from corporate IT was coming around saying, “Hey, there's a slight problem with our corporate firewall that we're fixing. Here's a MiFi device just for you that you can tether to get back online and get worked on until the firewall gets back.”And it was incredible, just the level of just being on top of things, and the focus on keeping the people who were building things and doing expensive engineering work that was awesome—and also me—productive during that time frame was just something I hadn't really seen before. It really made me think about the value of where do you remove bottlenecks from people getting their jobs done? It was—it remains one of the most impressive things I've seen.Micheal: That is great. 
And as you were telling me that I did look up our [laugh] internal system to see whether a user called Corey Quinn existed, and I should confirm this with you. I do see entries over here, a couple of commits, but this was 2015. Was that the time you were around, or is this before that even?Corey: That would have been around then, yes. I didn't start this place until late 2016.Micheal: I do see your commits, like, from 2015, and I—Corey: And they're probably terrible, I have no doubt. There's a reason I don't read code for a living anymore.Micheal: Okay, I do see a lot of GIFs—and I hope it's pronounced as GIF—okay, this is cool. We should definitely have a chat about this separately, Corey?Corey: Oh, yeah. “Would you explain this code?” “Absolutely not. I wrote it. Of course, I have no idea what it does. That's the rule. That's the way code always works.”Micheal: Oh, you are an honorary Pinterest engineer at this point, and you have—yes—contributed to our API service and a couple of Puppet profiles I see over here.Corey: Oh, yes—Micheal: [Amazing 00:36:11]. [laugh].Corey: You don't wind up thinking that's a risk factor that should be disclosed. I kid. I kid. It's, I made a joke about this when VMware acquired SaltStack and I did some analytics and found that 60 some odd lines of code I had written, way back when that were still in the current version of what was being shipped. And they thought, “Wait, is this actually a risk?”And no, I am making a joke. The joke is, is my code is bad. Fortunately, there are smart people around me who review these things. This is why code review is so important. But there was a lot to admire when I was there doing various things at Pinterest. It was a fun environment to work in, the level of professionalism was phenomenal, and I was just a big fan of a lot of the automation stuff.Phabricator was great. 
I love working with it, and, “Great, I'm going to use this at the next place I go.” And I did and then it was—I looked at what it took to get it up and running, and oh, yeah, I can see why GitHub is so popular these days. But it was neat. It was interesting seeing that type of environment up close.
Micheal: That is great to hear. You know, this is what I enjoy, like, hearing some of these war stories. I am surprised; you seem to have committed way more than I've ever done in my [laugh] duration here at Pinterest. I do managing for a living, but then again—Corey, the good news is your code is still running on production. And we—
Corey: Oh dear.
Micheal: —haven't—[laugh]. We haven't removed or made any changes to it, so that's pretty amazing. And thank you for all your contributions.
Corey: Oh, please, you don't have to thank me. I was paid, it was fine. That's the value of—
Micheal: [laugh].
Corey: —[work 00:37:38] for hire. It's kind of amazing. And the best part about consultants is, is when we're done with a project, we get the hell out and everyone's happy about it.
More happy when it's me that's leaving because of obvious personality-related reasons. But it was just an interesting company from start to finish. I remember one other time, I wound up opening a ticket about having a slight challenge with a flickering on my then Apple-branded display that everyone was using before they discontinued those. And I expected there to be, “Oh, okay. You're a consultant. Great. How did we not put you in the closet with a printer next to that thing, breathing the toner?” Like most consulting clients tend to do, and sure enough, three minutes later, I'm getting that tap on the shoulder again; they have a whole replacement monitor. “Can you go grab a cup of coffee? We'll run the cable for it. 
It'll just be about five minutes.” I started to feel actively bad about requesting things because I did a lot of consulting work for a lot of different companies, and not to be unkind, but treating consultants and contractors super well is not something that a lot of companies optimize for. I can't necessarily blame them for that. It just really stood out.
Micheal: Yep, I do hope we are keeping up with that right now because I know our team definitely has a lot of consultants working with us as well. And it's always amazing to see; we do want to treat them as FTEs. It doesn't even matter at that point because we're all individuals and we're trying to work towards common goals. Like you just said, I think I personally have learned a few items as well from some of these folks. Which is again, I think speaks to how we want to work and create a culture of, like, we're all engineers; we want to be solving problems together, and as you were doing it, we want to do it in such a way that it's still fun, and we're not having the restrictions of titles or roles and other pieces. But I think I digressed. It was really fun to see your commits though, I do want to track this at some point before we move completely over to GitHub, at least keep this as a record, for what it's worth.
Corey: Yeah, basically look at this graffiti in the codebase of, “A shit-poster was here,” and here I am. And that tends to be, on some level, the mark we leave on the universe. What's always terrifying is looking at things I did 15 years ago in my first Linux admin job. Can I still ping the thing that I built there? Yes, I can. And how is that even possible? That should not have outlived me; honestly, it should never have seen the light of day in production, but here we are. And you never know how long that temporary kluge you put together is going to last.
Micheal: You know, one of the things I was recalling, I was talking to someone in my team about this topic as well. 
We always talk about 10x engineers. I don't know what your thoughts are on that, but the fact that you just mentioned you built something; it still pings. And there's a bunch of things, in my mind, when you are writing code or you're working on some projects, the fact that it can outlast you and live on, I think that's a big, big contribution. And secondly, if your code can actually help up-level, like, ten other people, I think you've really made the mark of 10x engineer at that point.
Corey: Yeah, the idea of the superhuman engineer has always been a strange and dangerous one. If for nothing else, from where I sit, excellence is inherently situational. Like we just talked about, someone at Pinterest is potentially going to be able to have that kind of impact specifically because—to my worldview—there's enough process and things around there that empower them to succeed. Then if you were to take that engineer and drop them into a five-person startup where none of those things exist, they might very well flounder. It's why I'm always a little suspicious of, “This is a startup founded by engineers from Google or Facebook,” or wherever it is.
It's, yeah, and what aspects of that culture do you think are one-to-one matches with the small scrappy startup in the garage? Right, I'm predicting some challenges here. Excellence is always situational. An amazing employee at one company can get fired at a second one for lack of performance, and that does not mean that there's anything wrong with them and it does not mean that they are a fraud. It means that what they needed to be successful was present in one of those shops, but not the other.
Micheal: This is so true. And I really appreciate you bringing this up because whenever we discuss any form of performance management, that is a—in my view personally—I think that's an incorrect term to be using. 
It is really at that point in time, either you have outlived the environment you are in, or the environment is going in a different direction where I think your current skill set probably could be best used in the environment where it's going to work. And I know it's very fuzzy at that point, but like you said, yes, excellence really means you don't want to tie it to the number of commits you have pushed out, or any specific aspect of your deliverables or how you work.
Corey: There are no easy answers to any of these things, and it's always situational. It's why I think people are sometimes surprised when I will make comments about the general case of how things should be, then I talk to a specific environment where they do the exact opposite, and I don't yell at them for it. It's there—in a general sense, I have some guidance, but there are usually reasons things are the way they are, and I'm interested in hearing them out. Everything's situational, the worst consultant in the world is the one that shows up, has no idea what's going on, and then asks, “What moron set this up?” Invariably, to said, quote-unquote, “Moron.” And the engagement doesn't go super well from there. It's, “Okay, why is this the way that it is? What constraints shaped it? What was the context behind the problem you were trying to solve?” And, “Well, why didn't you use this AWS service?” “Because it didn't exist for another three years when we were building that thing,” is a—
Micheal: Yes.
Corey: —common answer.
Micheal: Yes, you should definitely appreciate that of all the decisions that have been made in the past. People tend to always forget why they were made. You're absolutely right; what worked back then will probably not work now, or vice versa, and it's always situational. So, I think I can go on about this for hours, but I think you hit that to the point, Corey.
Corey: Yeah, I do my best. 
I want to thank you for taking another block of time out of your day to wind up talking with me about various aspects of what it takes to effectively achieve better levels of engineering productivity at large companies, with many teams, working on shared codebases. If people want to learn more about what you're up to, where can they find you?
Micheal: I'm definitely on Twitter. So, please note that I'm spelled M-I-C-H-E-A-L on Twitter. So, you can definitely read on to my tweets there. But otherwise, you can always reach out to me on LinkedIn, too.
Corey: Fantastic and we will, of course, include a link to that in the [show notes 00:44:02]. Thanks once again for your time. I appreciate it.
Micheal: Thanks a lot, Corey.
Corey: Micheal Benedict, head of engineering productivity at Pinterest. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with a comment telling me that you work at Pinterest, have looked at the codebase, and would very much like a refund and an apology.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.
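As a rough illustration of the point in the episode above about defining SLOs and deciding what is allowed to run degraded, the arithmetic behind an availability target can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the workflow names, the 99.9% and 99% targets, and the 30-day window do not come from the episode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """Availability target for one workflow (all values are hypothetical)."""
    workflow: str
    target: float          # e.g. 0.999 means 99.9% of minutes must be healthy
    window_days: int = 30  # rolling window the target is measured over

    def error_budget_minutes(self) -> float:
        """Minutes of downtime the target tolerates over the window."""
        return (1.0 - self.target) * self.window_days * 24 * 60

    def budget_remaining(self, downtime_minutes: float) -> float:
        """Budget left after observed downtime (negative means the SLO is missed)."""
        return self.error_budget_minutes() - downtime_minutes


# Hypothetical priority workflow versus one that may run degraded for longer.
critical = Slo("home feed", target=0.999)
degradable = Slo("recommendations", target=0.99)

for slo in (critical, degradable):
    print(f"{slo.workflow}: {slo.error_budget_minutes():.1f} min of budget per {slo.window_days} days")
```

Writing the budget down per workflow is one way to make the "what is okay to be degraded, and until what time" conversation concrete before an outage rather than during one.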

Changelog Master Feed
What does good DevOps look like? (Ship It! #28)

Changelog Master Feed

Play Episode Listen Later Nov 17, 2021 55:22


This week Gerhard is chatting with Romano Roth, Head of DevOps at Zühlke, a company founded by Gerhard Zühlke in 1968. Nowadays they help companies all over the world build, ship and run anything from factory robots, to AI assistants in complex regulatory environments, and even medical devices that perform autonomous robotic surgery. When Romano is not leading a team of 30 software engineers that specialise in operations, infrastructure and cloud, he is one of the organisers of DevOps Days Zürich, and also the DevOps Meetup group which is how Gerhard and Romano met in 2019. Having started his career as a .Net developer back in 2002, Romano had his fair share of dev and ops challenges, and he always enjoys seeing real business value delivered continuously in an automated way. In recent years, Romano's perspective broadened, and now he sees DevOps challenges across many companies. If you are curious about what good DevOps looks like, and what are the real challenges, then Romano has some good insights for you.

Break Things On Purpose
Tomas Fedor

Break Things On Purpose

Play Episode Listen Later Nov 16, 2021 31:28


In this episode, we cover:
00:00:00 - Introduction
00:02:45 - Adopting the Cloud
00:08:15 - POC Process
00:12:40 - Infrastructure Team Building
00:17:45 - “Disaster Roleplay”/Communicating to the Non-Technical Side
00:20:20 - Leadership
00:22:45 - Tomas' Horror Story/Dashboard Organization
00:29:20 - Outro
Links:
Productboard: https://www.productboard.com
Scaling Teams: https://www.amazon.com/Scaling-Teams-Strategies-Successful-Organizations/dp/149195227X
Seeking SRE: https://www.amazon.com/Seeking-SRE-Conversations-Running-Production/dp/1491978864/
Transcript
Jason: Welcome to Break Things on Purpose, a podcast about failure and reliability. In this episode, we chat with Tomas Fedor, Head of Infrastructure at Productboard. He shares his approach to testing and implementing new technologies, and his experiences in leading and growing technical teams.
Today, we've got with us Tomas Fedor, who's joining us all the way from the Czech Republic. Tomas, why don't you say hello and introduce yourself?
Tomas: Hello, everyone. Nice to meet you all, and my name is Tomas, or call me Tom. And I've been working for a Productboard for past two-and-a-half year as infrastructure leader. And all the time, my experience was in the areas of DevOps, and recently, three and four years is about management within infrastructure teams. What I'm passionate about, my main technologies-wise in cloud, mostly Amazon Web Services, Kubernetes, Infrastructure as Code such as Terraform, and recently, I also jumped towards security compliances, such as SOC 2 Type 2.
Jason: Interesting. So, a lot of passions there, things that we actually love chatting about on the podcast. We've had other guests from HashiCorp, so we've talked plenty about Terraform. And we've talked about Kubernetes with some folks who are involved with the CNCF. I'm curious, with your experience, how did you first dive into these cloud-native technologies and adopting the cloud? 
Is that something you went straight for, or is that something you transitioned into?Tomas: I actually slow transition to cloud technologies because my first career started at university when I was like, say, half developer and half Unix administrator. And I had experience with building very small data center. So, those times were amazing to understand all the hardware aspects of how it's going to be built. And then later on, I got opportunity to join a very famous startup at Czech Republic [unintelligible 00:02:34] called Kiwi.com [unintelligible 00:02:35]. And that time, I first experienced cloud technologies such as Amazon Web Services.Jason: So, as you adopted Amazon, coming from that background of a university and having physical servers that you had to deal with, what was your biggest surprise in adopting the cloud? Maybe something that you didn't expect?Tomas: So, that's great question, and what comes to my mind first, is switching to completely different [unintelligible 00:03:05] because during my university studies and career there, I mostly focused on networking [unintelligible 00:03:13], but later on, you start actually thinking about not how to build a service, but what service you need to use for your use case. And you don't have, like, one service or one use case, but you have plenty of services that can suit your needs and you need to choose wisely. So, that was very interesting, and it needed—and it take me some time to actually adopt towards new thinking, new mindset, et cetera.Jason: That's an excellent point. And I feel like it's only gotten worse with the, “How do you choose?” If I were to ask you to set up a web service and it needs some sort of data store, at this point you've got, what, a half dozen or more options on Amazon? [laugh].Tomas: Exactly.Jason: So, with so many services on providers like Amazon, how do you go about choosing?Tomas: After a while, we came up with a thing like RFCs. 
That's like ‘Request For Comments,' where we tried to sum up all the goals, and all the principles, and all the problems and challenges we try to tackle. And with that, we also tried to validate all the alternatives. And once you went through all these information, you tried to sum up all the possible solutions. You typically had either one or two options, and those options were validated with all your team members or the whole engineering organization, and you made the decision then you try to run POC, and you either are confirmed, yeah this is the technology, or this is service you need and we are going to implement it, or you revised your proposal.Jason: I really like that process of starting with the RFC and defining your requirements and really getting those set so that as you're evaluating, you have these really stable ideas of what you need and so you don't get swayed by all of the hype around a certain technology. I'm curious, who is usually involved in the RFC process? Is it a select group in the engineering org? Is it broader? How do you get the perspectives that you need?Tomas: I feel we have very great established process at Productboard about RFCs. It's transparent to the whole organization, that's what I love the most. The first week, there is one or two reporters that are mainly focused on writing and summing up the whole proposal to write down goals, and also non-goals because that is going to define your focus and also define focus of reader. And then you're going just to describe alternatives, possible options, or maybe to sum up, “Hey, okay, I'm still unsure about this specific decision, but I feel this is the right direction.” Maybe I have someone else in the organization who is already familiar with the technology or with my use case, and that person can help me.So, once—or we call it a draft state, and once you feel confident, you are going to change the status of RFC to open. 
The time is open to feedback to everyone, and they typically geared, like, two weeks or three weeks, so everyone can give a feedback. And you have also option to present it on engineering all-hands. So, many engineers, or everyone else joining the engineering all-hands is aware of this RFC so you can receive a lot of feedback. What else is important to mention there that you can iterate over RFCs.So, you mark it as resolved after through two or three weeks, but then you come up with a new proposal, or you would like to update it slightly with important change. So, you can reopen it and update version there. So, that also gives you a space to update your RFC, improve the proposal, or completely to change the context so it's still up-to-date with what you want to resolve.Jason: I like that idea of presenting at engineering all-hands because, at least in my experience, being at a startup, you're often super busy so you may know that the RFC is available, but you may not have time to actually read through it, spend the time to comment, so having that presentation where it's nicely summarized for you is always nice. Moving from that to the POC, when you've selected a few and you want to try them out, tell me more about that POC process. What does that look like?Tomas: So typically, in my infrastructure team, it's slightly different, I believe, as you have either product teams focus on POCs, or you have more platform teams focusing on those. So, in case of the infrastructure team, we would like to understand what code is actually going to be about because typically the infrastructure team has plenty of services to be responsible for, to be maintained, and we try to first choose, like, one specific use case and small use case that's going to suit the need.For instance, I can share about implementation of HashiCorp Vault, like our adoption. We leveraged firstly only key-value engine for storing secrets. 
And what was important to understand here, whether we want to spend hours of building the whole cluster, or we can leverage their cloud service and try to integrate it with one of our services. And we need to understand what service we are going to adopt with Vault.
So, we picked cloud solution. It was very simple, the experience that were seamless for us, we understood what we needed to validate. So, is developer able to connect to Vault? Is application able to connect to Vault? What roles does it offer? What's the difference for cloud versus on-premise solution?
And at the end, it's often the cost. So, in that case, POC, we spin up just cloud service integrated with our system, choose the easiest possible adaptable service, run POC, validate it with developers, and provide all the feedback, all the data, to the rest of the engineering. So, that was for us, some small POC with large service at the end.
Jason: Along with validating that it does what you want it to do, do you ever include reliability testing in that POC?
Tomas: It is, but it is in, like, let's say, it's in a later stage. For example, I can again mention HashiCorp Vault. Once we made a decision to try to spin up first on-premise cluster, we started just thinking, like, how many master nodes do we need to have? How many availability zones do we need to have? So, you are going to follow quorum?
And we are thinking, “Okay, so what's actually the reliability of Amazon Web Services regions and their availability zones? What's the reliability of multi-cross-region? And what actually the expectations that is going to happen? And how often they happen? Or when in the past, it happened?”
So, all those aspects were considered, and we ran out that decision. Okay, we are still happy with one region because AWS is pretty stable, and I believe it's going to be. And we are now successfully running with three availability zones, but before we jumped to the conclusion of having three availability zones, we run several tests. 
So, we make sure that in case one availability zone being down, we are still fully able to run HashiCorp Vault cluster without any issues.Jason: That's such an important test, especially with something like HashiCorp Vault because not being able to log into things because you don't have credentials or keys is definitely problematic.Tomas: Fully agree.Jason: You've adopted that during the POC process, or the extended POC process; do you continue that on with your regular infrastructure work continuing to test for reliability, or maybe any chaos engineering?Tomas: I actually measure something about what we are working on, like, what we have so far improved in terms of post-mortem process that's interesting. So, we started two-and-a-half year ago, and just two of us as infrastructure engineers. At the time, there was only one incident response on-call team, our first iteration within the infrastructure team was with migration from Heroku, where we ran all our services, to Amazon Web Services. And that time, we needed to also start thinking about, okay, the infrastructure team needs to be on call as well. So, that required to update in the process because until then, it works great; you have one team, people know each other, people know the whole stack. Suddenly, you are going to add new people, you're going to add new people a separate team, and that's going to change the way how on-call should be treated, and how the process should look like.You may ask why. You have understanding within the one team, you understand the expectations, but then you have suddenly different skill set of people, and they are going to be responsible for different part of the technical organization, so you need to align the expectation between two teams. And that was great because guys at Productboard are amazing, and they are always helpful. So, we sat down, we made first proposal of how new team is going to work like, what are going to be responsibilities. 
We took inspirations from the already existing on-call process, and we just updated it slightly.And we started to run with first test scenarios of being on call so we understand the process fully. Later on, it evolved to more complex process, but it's still very simple. What is more complex: we have more teams that's first thing being on call; we have better separation of all the alerts, so you're not going to route every alert to one team, but you are able to route it to every team that's responsible for its service; the team have also prepared a set of runbooks so anyone else can easily follow runbook and fix the incident pretty easily, and then we also added section about post-mortems, so what are our expectations of writing down post-mortem once incident is resolved.Jason: That's a great process of documenting, really—right—documenting the process so that everybody, whether they're on a different team and they're coming over or new hires, particularly, people that know nothing about your established practices can take that runbook and follow along, and achieve the same results that any other engineer would.Tomas: Yeah, I agree. And what was great to see that once my team grew—we are currently five and we started two—we saw excitement of the team members to update the process so everybody else we're going to join the on-call is going to be excited, is going to take it as an opportunity to learn more. So, we added disaster roleplay, and that section talks about you are new person joining on-call team, and we would like to make sure you are going to understand all the processes, all the necessary steps, and you are going to be aligned with all the expectations. But before you will actually going to have your first alerts of on-call, we would like to try to run roleplay. Imagine what a HashiCorp Vault cluster is going down; you should be the one resolving it. 
So, what are the first steps, et cetera?

And at that time you're going to realize that whatever needs to be done, it's not only from a technical perspective, such as check our go to monitoring, check the runbook, et cetera, but also communication-wise because you need to communicate not only with your shadowing buddy, but you also need to communicate internally, or to the customers. And that's going to change the perspective of how an incident should be handled.

Jason: That disaster roleplay sounds really amazing. Can you chat a little bit more about the details of how that works? Particularly you mentioned engaging the non-technical side—right—of communication with various people. Does the disaster roleplay require coordinating with all those people, or is it just a mock, where you would pretend to do it, but you don't actually reach out to those people during this roleplay?

Tomas: So, we would like to combine both aspects. We would like to make sure that the person understands all the communication channels that are set up within our organization, and what they are used for, and then we would like to make sure that the person understands how to involve other engineers within the organization. For instance, what was the biggest difference there is that you have plenty of options for how to configure assigning or creating an alert. And so for those, you may have different notification settings. And what happened is that some of the people had settings only for newly created alerts, but when you changed the assigned person of an already existing alert to someone else, it might happen that that person didn't notice it because the notification setting was wrong. So, we encountered even these kinds of issues and we were able to fix them, thanks to disaster roleplay. 
So, that was amazing to find out.

Jason: That's one of my favorite things to do when we're using chaos engineering, to do a similar thing to the disaster roleplay: to really check those incident response processes, and validating those alerts is huge. There's so many times that I've found that we thought that someone would be alerted for some random thing, and it turns out that nobody knew anything was going on. I love that you included that in your disaster roleplay process.

Tomas: Yeah, it was also a great experience for all the engineers involved. Unfortunately, we run it only within our team, but I hope we are going to have a chance to involve all the other engineering on-call teams, so the onboarding experience for the engineering on-call teams is going to improve and is going to be amazing.

Jason: So, one of the things that I'm really interested in is, you've gone from being a DevOps engineer, an SRE individual contributor role, and now you're leading a small team. I think a lot of folks, as they look at their career, and I think more people are starting to become interested in this, ask: what does that progression look like? This is sort of a change of subject, but I'm interested in hearing your thoughts on what are the skills that you picked up and have used to become an effective technical leader within Productboard? What's some of that advice that our listeners, as individual contributors, can start to use in order to advance where they're going with their own careers?

Tomas: Firstly, it's important to understand what makes you passionate in your career, whether it's working with people, understanding their needs and their future, or you would like to stay on the individual contributor track and you would like to enlarge your scope of responsibilities towards leading more technically complex initiatives that are going to take a long time to be implemented. 
In the case of infrastructure, or in the case of platform leaders, I would say the position of manager or technical leader also requires certain technical knowledge so you can still be in close touch with your team or with your most senior engineers, so you can set the goals and set the strategy clearly. But still, it's important to be, let's say, a people person and be able to listen because in that case, people are going to be more open with you, and you can start helping them, and you can start making their dreams true and achievable.

Jason: Making their dreams true. That's a great take on this idea because I feel like so many times, having done infrastructure work, you start to get a mindset that maybe people are just making demands of you, all the time. And it's sometimes hard to keep that perspective of working together as a team and really trying to excel to give them a platform that they can leverage to really get things done. We were talking about disaster roleplaying, and that naturally leads to a question that we like to ask of all of our guests and that's, do you have any horror stories from your career about an incident, some horror story or outage that you experienced and what you've learned from it?

Tomas: I have one, and it actually happened at the beginning of my career as a DevOps engineer. What is interesting here is that it was one of the toughest incidents I experienced. It happened after midnight. At the time I was still new to the company, and we received an alert informing us about too many 502 and 504 errors returned from the API. 
At the time, the API processed thousands of requests per second, and the incident had a huge impact on the services we were offering.

And as I was shadowing my on-call buddy, I tried to check our main alerting channel, see what's happening, what's going on there, how can I help, and I started with checking the monitoring system, reviewing all the reports from the engineers being on call, and I initiated the investigation on my own. I realized that something is wrong or something is not right, and I realized I was just confused and I wanted to sleep, so it took me a while to get back on track. So, I made a side note, like, how can I get my brain to work as it does during the day? And then I got back to the incident resolution process.

So, it was really hard for me to start because I didn't know what [unintelligible 00:24:27] you knew about the channel, you knew about your engineers working on the resolution, but there were plenty of different communication funnels. Like, some of the engineers were deep-focused on their own investigation, and some of them were on call. And we needed to provide regular updates to the customers and internally as well. I had that inner feeling of let's share something, but I realized I just can't drop a random message because the message with all the information should have a certain format and should have certain information. But I didn't know what kind of information should be there.

So, I tried to ping someone, so, “Hey, can you share something?” And in the meantime, actually, more people sent me direct messages. And I saw there were a lot of different tracks of people who tried to solve the incident, who tried to provide the status, but we were not aligned. So, this all showed me how important it is to have a proper communication funnel set up. And we got lucky to actually end up in one channel, we got lucky to resolve the incident pretty quickly.

And what else I learned, and what I would recommend, is to make sure you know where to look. 
I know it's a pretty obvious sentence, but once your company has plenty of dashboards and you need to find one specific metric, sometimes it looks like mission impossible.

Jason: That's definitely a good lesson learned and feeds back to that disaster roleplay: practicing how you do those communications, understanding where things need to be communicated. You mentioned that it can be difficult to find a metric within a particular dashboard when you have so many. Do you have any advice for people on how to structure their dashboards, or name their dashboards, or organize them in a certain way to make it easier to find the metric or the information that you're looking for?

Tomas: I would have a different approach, and that is to have a basic dashboard that provides you the SLOs of all the services you have in the company. So, we understand firstly what service actually impacts the overall stability or reliability. So, that's my first advice. And then you should be able to either click on the specific service, and that should redirect you to its dashboard, or you're going to have starred your favorite dashboards. So, I believe the most important thing is really to have one main dashboard where you have all the services and their stability, and then you have the option to look deeper.

Jason: Yeah, when you have one main dashboard, you're using that as basically the starting point, and from there, you can branch out and dive deeper, I guess, into each of the services.

Tomas: Exactly, exactly true.

Jason: I like that approach. And I think that a lot of modern dashboarding or monitoring systems now, the nice thing is that they have that ability, right, to go from one particular dashboard or graphic and have links out to the other information, or just click on the graph and it will show you the underlying host dashboard or node dashboard for that metric, which is really, really handy.

Tomas: And I love the connection with other monitoring services, such as application monitoring. 
That gives you so much insight, and when it's even connected with your work management tool, it's amazing, so you can have all the important information in one place.

Jason: Absolutely. So, oftentimes we talk about—what is it—the three pillars of observability, which I know some of our listeners may hate, but the idea of having metrics and performance monitoring/APM and logs, and just how they all connect to each other can really help you solve a lot, or uncover a lot of information when you're in the middle of an incident. So Tomas, thanks for being on the show. I wanted to wrap up with one more question, and that's do you have any shoutouts, any plugs, anything that you want to share that our listeners should go take a look at?

Tomas: Yeah, sure. So, as we are talking about management, I would like to promote one book that helped make my career, and that's Scaling Teams. It's written by Alexander Grosse and David Loftesness.

And another book is from Google; they have, like, a series of three, one of those is Seeking SRE, and I believe the other parts are also useful to read in case you would like to understand whether your organization needs an SRE team and how to implement it within the organization, and also, technically.

Jason: Those are two great resources, and we'll have those linked in the show notes on the website. So, for anybody listening, you can find more information about those two books there. Tomas, thanks for joining us today. It's been a pleasure to have you.

Tomas: Thanks. Bye.

Jason: For links to all the information mentioned, visit our website at gremlin.com/podcast. If you liked this episode, subscribe to the Break Things on Purpose podcast on Spotify, Apple Podcasts, or your favorite podcast platform. Our theme song is called “Battle of Pogs” by Komiku, and it's available on loyaltyfreakmusic.com.

Software Misadventures
Cory Watson - Leading observability teams at Twitter & Stripe, how to succeed in a new org, effective ways to advocate for your team and more - #16


Nov 12, 2021 84:09


Cory is currently a Solutions Engineer at Jeli.io and very well known in the community for his work on Observability. His career in observability began at Twitter where he managed the observability team and then he joined Stripe, where he created and led the observability team, this time around as a Principal Engineer. We talk to him about how he got his start in customer support and the role it played in the later part of his career. We discuss his time at Twitter where there was a power outage in the data center on the day he joined and how once he had to stay up all night dealing with file handle leaks. We also discuss how he created and led the observability team at Stripe as an individual contributor, how one can succeed in a new org, how to navigate information asymmetry in the workplace, what are some effective ways to advocate for your team and how we all are just humans trying to get stuff done.

Software Defined Talk
Episode 329: Eat the complexity


Nov 12, 2021 59:20


This week we discuss the state of serverless, the agility equation and Twitter goes blue. Plus, what exactly happens in an Internet Minute…? Rundown The Unfulfilled Promise of Serverless (https://www.lastweekinaws.com/blog/the-unfulfilled-promise-of-serverless) Twitter will now let you pay to undo tweets and read ad-free news in the US (https://www.theverge.com/2021/11/9/22766286/twitter-blue-subscription-service-scroll-nuzzel-undo-tweets-ad-free-articles-us?scrolla=5eb6d68b7fedc32c19ef33b4) From Amazon to Zoom: What Happens in an Internet Minute In 2021? (https://www.visualcapitalist.com/from-amazon-to-zoom-what-happens-in-an-internet-minute-in-2021/) Clubhouse rolls out Replay to let users record live rooms and share them later (https://techcrunch.com/2021/11/08/clubhouse-record-room-replay/) Relevant to your interests HashiCorp: Benchmarking the S-1 Data (https://cloudedjudgement.substack.com/p/hashicorp-benchmarking-the-s-1-data) Cloudflare Announces Third Quarter 2021 Financial Results (https://finance.yahoo.com/news/cloudflare-announces-third-quarter-2021-201500627.html) Hello, world! 
Meet Kyndryl, the services biz spun-out by IBM (https://www.theregister.com/2021/11/04/kyndryl_ibm_spinoff/) Google is working on a more user-friendly way to find files in Drive (https://www.theverge.com/2021/11/4/22763374/google-drive-search-chips-filters-beta-test) Habitica - Gamify Your Life (https://habitica.com/static/home) Red Hat to hire fewer senior engineers after budget frozen (https://www.theregister.com/2021/11/05/red_hat_jobs/) Datadog Acquires Ozcode - Ozcode (https://oz-code.com/blog/general/datadog-acquires-ozcode) What I Talk About When I Talk About Platforms (https://martinfowler.com/articles/talk-about-platforms.html) To reinvent work, we have to destroy the clock (https://thehustle.co/to-reinvent-work-we-have-to-destroy-the-clock/) We've gone backwards since the Bronze Age with calendars (https://www.theregister.com/2021/11/08/calendar_backwards/) AWS Announces Plans to Open Second Region in Canada (https://www.businesswire.com/news/home/20211108005823/en/AWS-Announces-Plans-to-Open-Second-Region-in-Canada) Security software company McAfee acquired for $14 billion (https://www.theverge.com/2021/11/8/22769910/mcafee-private-investor-group-acquisition-software) Unity is buying Peter Jackson's Weta Digital for over $1.6B (https://techcrunch.com/2021/11/09/unity-is-buying-peter-jacksons-weta-digital-for-over-1-6b/) Amazon's Cloud's New Boss Is Girding to Defend Turf in the Field Company Pioneered (https://www.wsj.com/articles/amazons-cloud-boss-is-girding-to-defend-turf-in-the-field-company-pioneered-11636300800?st=5gwylbkkz0bug3s&reflink=article_email_share) GE to break up into 3 companies focusing on aviation, health care and energy (https://www.cnbc.com/2021/11/09/ge-to-break-up-into-3-companies-focusing-on-aviation-healthcare-and-energy.html) You're scheduling too many 1:1 meetings (https://www.protocol.com/workplace/one-on-one-meetings) Nonsense Elon Musk faces a $15 billion tax bill, which is likely the real reason he's selling stock 
(https://www.cnbc.com/2021/11/07/elon-musk-faces-a-15-billion-tax-bill-which-is-likely-the-real-reason-hes-selling-stock.html) Sponsors strongDM — Manage and audit remote access to infrastructure. Start your free 14-day trial today at strongdm.com/SDT (http://strongdm.com/SDT) CBT Nuggets — Training available for IT Pros anytime, anywhere. Start your 7-day Free Trial today at cbtnuggets.com/sdt (https://cbtnuggets.com/sdt) Conferences Coté speaking at DevOops (https://devoops.ru/en/) (Russia), Nov 11th: “Kubernetes is not for developers…?” (https://devoops.ru/en/talks/kubernetes-is-not-for-developers/) THAT Conference comes to Texas January 17-20, 2022 (https://that.us/events/tx/2022/) — Now with the right link Brandon going to AWS Re:invent join the meetupIRL channel in SDT Slack Promote SDT TikTok, mentioned some stuff gets edited out Listener Feedback Jordan wants you to work as an SRE at Shopify (http://Senior> Site Reliability Engineer - Remote, Americas). Fully remote in Americas, Hawaii, APAC and EMEA SDT news & hype Join us in Slack (http://www.softwaredefinedtalk.com/slack). Send your postal address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com) and we will send you free laptop stickers! Follow us on Twitch (https://www.twitch.tv/sdtpodcast), Twitter (https://twitter.com/softwaredeftalk), Instagram (https://www.instagram.com/softwaredefinedtalk/), LinkedIn (https://www.linkedin.com/company/software-defined-talk/) and YouTube (https://www.youtube.com/channel/UCi3OJPV6h9tp-hbsGBLGsDQ/featured). Brandon built the Quick Concall iPhone App (https://itunes.apple.com/us/app/quick-concall/id1399948033?mt=823) and he wants you to buy it for $0.99. Use the code SDT to get $20 off Coté's book, (https://leanpub.com/digitalwtf/c/sdt) Digital WTF (https://leanpub.com/digitalwtf/c/sdt), so $5 total. Become a sponsor of Software Defined Talk (https://www.softwaredefinedtalk.com/ads)! 
Recommendations Brandon: macOS disk size utility (http://grandperspectiv.sourceforge.net) Matt: Cloud Native AF #5: (https://www.cloudnativeaf.com/5) Charming Pirates and the Funnel to Contributors Photo Credits Banner (https://unsplash.com/photos/-mwNJswDlXE) Show Art (https://unsplash.com/photos/5mZ_M06Fc9g) Internet Minute Graphic (https://www.visualcapitalist.com/from-amazon-to-zoom-what-happens-in-an-internet-minute-in-2021/)

Changelog Master Feed
OpenTelemetry in your CI/CD (Ship It! #27)


Nov 11, 2021 60:54


In this episode, Gerhard is joined by Cyrille Le Clerc, Product Manager Lead on Observability at Elastic, and Oleg Nenashev, Principal Engineer at CloudBees. It all started with Oleg's tweet back in July, in which he was promoting Akihiro Kiuchi's work on Jenkins monitoring with OpenTelemetry. This was done in the context of Google's Summer of Code - a link to Akihiro's demo is in the show notes. As you may remember from episode 20, instrumenting our changelog.com pipeline is on Gerhard's mind, and this conversation helped him clarify a few things. If you are thinking of instrumenting your CI/CD pipeline with OpenTelemetry, this episode is for you.

Así las cosas con Carlos Loret de Mola
Emocionante, la reunión trilateral: Roberto Velasco


Nov 10, 2021 9:16


On “Así las Cosas con Loret”, the head of the SRE's North America Unit said that talks are under way with the US on the energy sector.

Google Cloud Platform Podcast
State of DevOps Report 2021 with Nathen Harvey and Dustin Smith


Nov 10, 2021 45:40


This week, Stephanie Wong and Carter Morgan are talking about the recently released State of DevOps Report. Guests Dustin Smith and Nathen Harvey tell us all about DORA, the research group working to study DevOps, and the findings of their years-long study aimed at improving workplace environments, fostering sustainable increased productivity, and ensuring quality output across industries. During their years of research, the DORA team has developed ways to measure team results and workplace culture. Our guests tell us about the five measures they use, including deployment frequency and reliability. The shared responsibility and collaboration of teams at a company to optimize these five metrics is what makes good DevOps performance. Through a real-life example, we hear how the coordination of goals and incentives across departments can improve results of the DevOps metrics, thus improving the speed and stability of finished products. Once businesses identify problems, they need realistic expectations of the time and energy required to solve these issues. Learning from each change made and growing during the process is an important part of optimization, and our guests talk about the best practices their research has identified for facilitating smoother transitions. High quality documentation is a vital part of optimizing DevOps, and this year’s report examined internal documentation for the first time. Nathen describes what makes good documentation, like clear ownership of the documents and docs that are regularly updated for easy sharing and scaling of up-to-date material across the company. Dustin elaborates, explaining other factors that make quality, reliable documents. Later, we talk SRE and how companies can measure and optimize Site Reliability Engineering. A supportive team culture and ensuring a secure product and supply chain are some important factors in optimal SRE, the DORA study found. 
Our guests offer advice for companies looking to get started with DevOps practices. Nathen Harvey Nathen Harvey is a developer relations engineer at Google who has built a career on helping teams realize their potential while aligning technology to business outcomes. Nathen has had the privilege of working with some of the best teams and open source communities, helping them apply the principles and practices of DevOps and SRE. Dustin Smith Dustin Smith is a UX Research Manager and the DORA research lead. He studies the factors that influence a team’s ability to deliver software quickly and reliably. Cool things of the week Email is 50 years old, and still where it's @ blog Make the most of hybrid work with Google Workspace blog We analyzed 80 million ransomware samples – here's what we learned blog Interview DevOps site DORA site SRE site 2021 Accelerate State of DevOps report addresses burnout, team performance report

Screaming in the Cloud
That Datadog Will Hunt with Dann Berg


Nov 4, 2021 41:24


About Dann

Dann Berg is a Senior CloudOps Analyst at Datadog, and has nearly a decade of experience working in the cloud and optimizing multi-million dollar budgets. He is also an active member of the larger technical community, hosting the monthly New York City FinOps Meetup, and has been published multiple times in places such as MSNBC, Fox News, NPR, and others. When he's not saving companies millions of dollars, he's writing plays, and has had two full-length plays produced in New York City and China.

Links:
Datadog: https://www.datadoghq.com
Personal Website: https://dannb.org
LinkedIn: https://www.linkedin.com/in/dannberg/
Twitter: https://twitter.com/dannberg
Monthly newsletter: https://dannb.org/newsletter/
Previous SITC episode with Dann Berg, Episode 51: https://www.lastweekinaws.com/podcast/screaming-in-the-cloud/episode-51-size-of-cloud-bill-not-about-number-of-customers-but-number-of-engineers-you-ve-hired/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim it's better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. 
Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute, they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive $100 in credit. That's v-u-l-t-r.com slash screaming.

Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security.

And - let me be clear here - it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers, needed to support the application that you want to build.

With Always Free you can do things like run small scale applications, or do proof of concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free. No asterisk. Start now. Visit https://snark.cloud/oci-free that's https://snark.cloud/oci-free.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. If there's one thing that I love, it is certainly not AWS billing, but for better or worse, that's where my career has led me. 
Way back in Episode 51, I had Dann Berg, the CloudOps analyst at Datadog. And now he's back for more. Things have changed. He's now a senior CloudOps analyst, and I'm hoping my jokes have gotten better. Dann, thanks for being bold enough to come out and find out.

Dann: Yeah. I'm excited to see if these jokes have gotten better. That's the main reason for coming back.

Corey: Exactly. Because it turns out that death, taxes, and AWS bills are the things that are inevitable and never seem to change.

Dann: Yeah. They just keep coming. They never stop, and they're always slightly different than you expect. I guess, just like death and taxes.

Corey: So, when we spoke back in, I want to say 2019 is when it aired, so probably that—ish—is when we had the conversation, if not a little bit before that, you were effectively a team of one, and as mentioned, had the CloudOps analyst title. Now, you're a senior CloudOps analyst, which I assume just means you're older. Is the team larger as well? What does that process look like? How has it evolved in the last couple years?

Dann: Yeah, it's been interesting, especially being at a single organization and that organization being Datadog, to be able to grow the team a little bit. So, as you said, it was just me. Now, it's a total of four people, including myself, so three others. And, yeah, it's been interesting just in terms of my own professional development, being able to identify what needs to be done, how much capacity I have, and being able to grow it over time, especially in this fairly new space of being specifically focused on cloud cost billing. So, kind of that bridge between engineering and finance, which itself is kind of a fairly new space, still.

Corey: It is. And my favorite part of having these conversations with folks who have no idea what this space is, is learning—when I was starting out—how to talk about this in a way that didn't lead down weird paths. It's, “Oh, you save money on Amazon bills? 
Can you help me save money on socks?” It's like, “No. Well, yes. Get the Prime card, it gives you 5% off. But no.” And yeah, I talk about camelcamelcamel and other ways of working around the retail side, but that's not really what I do.

It's similar to back when I was doing SRE-style work. I made it a point never to talk about being someone involved in working in tech, or suddenly you're the neighborhood printer repair person. Similarly, you have, I guess, gone in a strange direction because you weren't, to my recollection, someone who had a strong SRE background. That's not where you came from in the traditional sense, is it?

Dann: No, not an SRE background at all. Yeah, I mean, it's really interesting. So, talking about this space, I mean, people are calling it a lot of different things, cloud economics, the term FinOps—financial operations—is being used a lot, now—

Corey: Cloud financial management is another popular one. Oh, swing a dead cat, you'll hit 15 different words, and I give—my advice on that, even though I hate some of the terms is, cool. If people are going to pay you to have a title, even if you think it's ridiculous, you can take the money or you can die on a petty naming hill and here we are.

Dann: Yeah. And it's interesting because the role that I was hired for at Datadog was very much this niche, very specific role that I didn't realize was a niche, very specific role at the time. So previously, I was at a company and I was building out their data centers, so I was working with vendors, buying servers, sometimes going on-site, installing, racking those, dealing with RMAs. And I was getting more involved as their cloud usage was growing and bringing some of those hardware capitalization cost procedures to the cloud. 
And so I found myself in this kind of niche role in my previous company.

And at Datadog, they basically had the exact same role that was dealing with all of the billing stuff around the cloud—kind of from an engineering perspective because it was on the engineering team—but working closely with finance, and I was like, “Oh, these are the skills that I have.” And it kind of fit perfectly. And it wasn't until after I got to Datadog and was doing more research about this specific space that I discovered just how wide open it was. And I mean, meeting you was one of the earliest things that I did in the industry. Discovering the FinOps Foundation and a few other things has kind of like opened my eyes to this as an actual career path.

Corey: It's an expensive problem that isn't going away anytime soon, and it is foundational and core to the entire rest of how companies are building things these days. My argument has been for a while that when it comes to cloud, cost and architecture are the exact same thing. You don't have the deep SRE architect background, but you're also now a member of a four-person team. Does everyone in the team have the same skill set as you, or do you wind up effectively tagging in subject matter experts from different areas? How is the team composed? People love to ask me this question, and I strongly believe there's no one way to do it. But what's your answer?

Dann: Yeah, I mean, the team works very much in terms of everybody kind of taking on tasks that they need to do, but we did hire for specific skill sets when we tried to find people. So, the first person that we hired, we wanted them to have more of a developer engineer type background, writing code, stuff like that. The third hire, we were looking for somebody that was more of a generalist. I've seen myself more as a generalist in the space; anything that's going on, I can pick it up and make some progress on it and build something out. 
And then the fourth person, we were lacking some of the deeper FP&A or FinOps experience, and so we found somebody with more of that kind of background and less of the engineering experience, but they were eager to, kind of, move from finance into more of an engineering role. And I feel like this is the perfect role for that because I feel like there are a lot of non-engineers that want to break into engineering and don't really know how to do it. And if you are in finance, in FP&A, finding one of these more cloud-cost-optimization-specific roles is a great way to bridge that gap, I feel.

Corey: The last time we spoke, I was independent, doing this all myself, and it turns out that taking all of the things that make me and trying to find those in other people is a relatively heavy lift, even if you discount the things like ‘obnoxious on Twitter.' So, how do you start decomposing that? Well, now we're a dozen people and we've found ways to do it. But by and large in our experience, for the way that we interact—and I want to get to that in a second—is that it's easier for us to teach engineers how finance works than it is the opposite direction. And there are exceptions to that, and as we scale, I can easily see a day in the near future where that is no longer the case.

However, we also have two very specific styles of engagement. We do our cost optimization projects, where we go into an environment and, “Oh, fix this. Turn that thing off. Do you really need eight copies of those four petabytes of data? Oh, you didn't realize they were there. Great, maybe delete it.” And we look like wizards from the future and things are great.

The other project that we do is contract negotiation with AWS, especially at large scale. 
It's never as simple as people would have you believe because, “Oh, you're doing co-marketing efforts, and you have a very specific use case, and there are business partnerships on 15 different levels, and that all factors into how this works.” It's nuanced and challenging, and of course, because it's a series of anecdata, I can't really tell too many stories in public about that. But those are the two things that we wind up focusing on. You are focusing on a very different problem.

You're not moving from company to company, basically reimplementing the same global problem, solving it locally for them. You are embedded in an account for the duration, almost four years now by my count. And, “Okay, I guess I could just do a whole bunch of cost optimization projects on a quarterly basis in an environment like that,” doesn't seem like it solves the problem in any meaningful way. What does your team do?

Dann: Yeah. Well, I mean, that's such an interesting question. Just in terms of—yeah, if you're doing consulting, you're starting from square one every time you get a new contract, a new engagement, and being at the same company for, like you said, about four years, going on four years now, you really have a chance to dive in and think about, “Okay, what does it mean to work cloud cost optimization into just the regular business cycle of how it works?” Because I mean, you have the triangle that everybody's familiar with: things can either be cheaper, faster, efficient, and at different stages in the product lifecycle, you want to be focusing on these areas, more or less. And so, on our team, the different things that I'm thinking about is, first is visibility, is you want to provide engineers visibility into their cost. And not just numbers, right? Actionable visibility where if something needs to change, they need to do something, they know what that is.

And a lot of the times, that means not just costs, but also efficiency. 
So, these are the metrics that this particular application should be scaling against. As this application grows, as usage grows, are we remaining as cost-efficient? Then there's also the piece—as you're saying—like discovering things within the infrastructure that, “Hey, if we make this change, or if you turn this off, if we do things this way, we'll save a bunch of money. Let's do those.”

There's things like reservations, committed use discounts for GCP, all of those kinds of things we manage. And then dealing closely with verifying our bill, working with finance—FP&A—on cost modeling forecasting, both short-term—like, within a month; like, what are we going to be at the end of this month and it's the 10th right now?—and also, what does our next quarter look like? What do our next two years look like? And that bleeds into the contract negotiations, those kind of things as well.

So, I mean, it's setting up the cycles of how do you prioritize this work? What is the company focusing on at the time? And what can you do when the company is not focusing explicitly on deciding to save money?

Corey: One of the more interesting aspects of my work that I didn't expect is, whenever I wind up starting an engagement, or even in the prospect stage, I love asking the dumbest possible questions I can think of because it turns out they're not. And the most common one that I always love to start with is, “Oh, okay. Your AWS bill is too high. Why do you care?” And that often takes people aback, but once you dig down underneath the surface just a little bit, it becomes pretty clear that the actual goal is not that it's too much money—because spoiler, payroll always costs more than infrastructure—instead, it's, “How do I think about this? 
How do I rationalize what the additional costs are going to be per thousand monthly active users or whatever metric it is you're choosing to use?”

And how do you wind up forecasting that because the old days of data centers where you—“Well, we're going to spend a boatload of money, and then we'll have capacity for the next, ehh, two years, maybe down to eighteen months, depending on growth,” that's easier for companies to rationalize around, rather than this idea of incremental cost on a per-unit basis, but not exactly because it also turns out that architecture changes, problems of scale, AWS pricing changes from time to time, all tend to impact that. What I think is not well understood in this space is that yeah, if you have a 20% overage this month, people are going to have some serious questions, but they're also going to have those same questions if you're 20% low.

Dann: Yeah. I mean, understanding why people care about the cost is definitely the first step because with a single company, so it's just constantly looking at the numbers rather than understanding exactly what motivations a company has to contact somebody like you, like a consultant, right? Because usually, I imagine that it's going to be a bill, maybe two bills, three bills come in, and they keep going up and up and up, and they need to go down. And they're going to have an explicit reason why it needs to go down; finance is going to say, “Margins are x, y, and z,” or, “Revenue has done this; our costs can't do this.” There's going to be explicit reasons because if there aren't reasons, then they shouldn't necessarily be focusing on costs at that moment in time.

What you want to do is have—I mean, this is way more complicated than just saying it out loud, but have a culture of cloud cost mindfulness, where people aren't just spinning up resources willy nilly. But also, my goal is for people not to have to really think about cost that much other than just in a way that helps them do their work. 
Because I mean, I want engineers to be able to build stuff and build stuff fast—that's what the cloud is all about—but I also want to be able to do it in a way that isn't inappropriately high in cost.

Corey: I have my thoughts on this, and I've shared them before and I'll dive into them again, but how do you approach that? If Datadog makes a grievous error and hires me to write code somewhere as an engineer, what is the, I guess, cost approach training for me as I wind up going through my onboarding as part of an SRE team or an application team?

Dann: I mean, this feels so basic as to not even be the right answer, but honestly, visibility is the easiest and best thing that you can give people, and so we've built out some visibility reports that engineers get on a regular basis. We also meet with our top—what is it—ten or fifteen spending internal engineering teams on a monthly basis to go over those costs so that they understand what they're looking at so that we understand the context behind it, so that we can understand what's on the roadmap going forward so that when things in the cost happen, we're aware. And then we're just staying on top of things. And if we have questions, we have an open dialogue with engineers and things like that.

In an ideal space, it would be great to have cost, I guess, more fit into the product development lifecycle in a more deeply ingrained way, but at the same time, I really don't want to serve as a gatekeeper. Our goal is not to stop any sort of engineering process. And we haven't needed to do anything like that although I guess every company is going to be different in terms of what their needs are. 
But yeah, I'm totally happy being a little bit more reactionary in terms of looking at the numbers and responding, and then proactive just in terms of the regular communication with people.

Corey: I tend to take the perspective that engineers need to know enough about cost to maybe fill an index card at most because you don't want them, I guess, over-fixating on it. Left to my own devices in my personal account, I'll see a $7 a month bill and, “Oh, I'm going to spend two weeks knocking that down to $4.” And of course, I can do it, but is that the best use of my time? Absolutely not.

Very often what is a lot of money to an engineer is absolutely not to the business. And vice versa when you bring in a data science team; it's, “Oh, yeah, we need at least four more exabytes of data because we never learned to do a join properly.” Yeah, maybe don't do that. Understanding the difference between those two approaches is key. But I've always been of the mindset that I would rather bias for letting developers build and experiment and have things that catch outsized things quickly, than trying to wind up putting a culture of fear around cost because I'd much rather see whether the thing they're trying to build is possible to build, then go back and optimize it later, once that's proven out. But again, this is a nuanced thing.

Everyone seems to think I have this back pocket answer that will apply to all companies. And you've been doing this at Datadog for almost four years with a team of people. I am an outsider; I see the global trend, I see what works in different ways in different companies, but the idea that I can sit down and say, “Oh. Well, clearly the thing you're doing is completely wrong because that's not how I think about it,” is the hallmark of a terrible consultant. There are reasons that things are the way that they are and it's generally not that people are expecting to do a terrible job today. 
You know, unless they work in the Facebook ethics department, which is neither here nor there.

Dann: Yeah, I mean, like I said, the product lifecycle, when you're building something new, you want to go as fast as possible. When you're launching it, you want it to be as reliable as possible. Once you're launched, once you're reliable, then you can start focusing on costs is, kind of like, not the universal rule, but kind of the flow that I tend to see. So, as you're at a company that is regularly innovating, creating new products, going through that cycle, you're going to have these kind of periods.

As well as you have the products that have been around. There's a lot of legacy code, there's a lot of stuff going on, that maybe isn't the best, or some efficiency work that has been deprioritized for whatever reason, that maybe it's time to start considering doing this. So, keeping track of all of that. And like I said, if for whatever reason the business wants to focus on cloud cost efficiency, or a team has decided that in a particular quarter or for a particular reason they want to focus on that, being able to assist as much as you can, being able to save all that work so that there's kind of like a queue that you can go to when it is time to focus on cost efficiency stuff.

Corey: So, here's a fun one for you. As of the time of this recording, it's a couple weeks old, but if you're anything like what we do here for some of our more sophisticated clients, we do occasionally build out prediction models, models of economics that wind up defining how some architectural patterns should be addressed, et cetera, et cetera. What's always fun is the large clients who have this significant level of spend on an outlier service. Every once in a while—it was great that we got to do a deep dive into the Washington Post's use of Lambda because normally, Lambda is a rounding error on the bill; they had a specific challenge and they did a whole blog post on this for the AWS blog. 
I believe the Monitoring Tools blog, but don't take that at face value; I never remember which AWS blog is which because AWS doesn't speak with a single voice on anything.

But yeah, most of the time it's block-and-tackle, baseline stuff that is the big driver of spend, but a few weeks ago, they changed the pricing dimensions for S3 intelligent tiering, where there's no longer a monitoring charge for objects that are smaller than 128 kilobytes, and there's no 30-day minimum. So, the fact that those two things went away removed almost every caveat that I can picture for using S3 intelligent tiering, which means that for most use cases, that should now be the default. I imagine you caught that change as well, since that's one of those wake up and take notice, no matter what time of the world [laugh] it is where you are when that gets dropped. How did that change your modeling? Or did that not significantly shift how you view any of this?

Dann: No, I mean, I think part of our role within the organization is to pay attention to stuff like that, and then you just have those conversations with the teams that I know were either exploring intelligent tiering. We do some pricing modeling for different products, S3 storage for different types, so updating those and being like, “Hey, this might be something we want to actually use and explore now.” Similar and I guess, more of something that I actively worked on that I consider in the same category is when Amazon announced savings plans as replacing convertible reservations. Because at first they announced, and being like, “Okay, well, it's going to automatically rebalance between… different instance families across regions, too”—which convertible RIs could never do—“And it's going to be the exact same price for a compute savings plan as a convertible RI.” And we were kind of like, what's the catch? 
And we spent a few weeks doing a deep dive working with our data science team, kind of like being, “Where is the catch here?”

Corey: Yeah, the real catch is that you can't sell it on the secondary market if it—

Dann: Yeah.

Corey: —turns out you bought the wrong thing, which if that's your Plan A, then good luck.

Dann: Yeah. We definitely don't use that secondary market. I don't have as much experience there, although I'm sure some people can use it to their advantage.

Corey: Almost no one does. In fact, the reason that it exists—my pet theory—is that once upon a time, companies would try and classify some of the reserved instance purchases as capital expenditures, which there has since been guidance from regulatory authorities not to do that. But at the time, the fact that you could sell it to a third-party on the secondary market would help shore up that argument. If you're listening to this, and you're classifying some of your RIs as CapEx, please don't do that. Feel free to reach out to me, I can dig out the actual regulation and send it to you. There are two of them. It's a nuanced topic. If you're listening to this and have no idea what I'm talking about, God, do I envy you.

Dann: [laugh]. Yeah, definitely don't do that. [laugh].

Corey: There was a lot that was interesting about savings plans. When I was read in the month or so in advance of them being announced, it was, “Great. I want to see this and this and these other things, too.” And some of those things came to pass. It was extended to work with Lambda.

Now, I don't believe that is financially useful in almost every case, but it doesn't need to be because so much of cloud economics from where I sit is psychological in nature, where, “Oh, we have this workload that lives on EC2 instances and we want to move it to Lambda, but we already bought the reserved instances so we're not going to do it because of sunk cost fallacy.” Which is not much of a fallacy when it's that kind of money, in some cases. Okay, great. 
Now, if it can migrate to Lambda and still wind up getting the discounts you've paid for it, you have removed an architectural barrier. And that's significant.

Now, I want to see that same thing apply to, oh, if you move from EC2 to RDS, or DynamoDB or anything else, that should be helpful, too. But whatever you do, don't do what SageMaker did and launch their own separate savings plan that is not compatible with the compute savings plans, so effectively, it's great; you're locked-in architecturally to one or the other because machine learning is, once again, a marvelously executed scam to sell pickaxes into a digital gold rush.

Dann: I mean, I like savings plans a lot and we've been slowly, as convertible RIs have expired, replacing them with savings plans. And I think that it is pushing the other cloud providers forward—because we're definitely multi-cloud—and so that's really useful and I hope more people will take on the compute savings plan type model, just because it makes our lives so much easier. Or it makes my life so much easier in terms of planning it, selling the commitment internally, just everything about it has made my life easier. So, I mean, how many years later are we? I definitely haven't found any big gotchas, I guess, from the secondary market. But that doesn't really impact me.

Corey: Yeah, I spent a lot of time looking forward, too, doing deep analyses of okay, for which instance classes in which regions is there a price discrepancy? And I finally got someone to go semi on record and say, “Yeah. There should not be any; please ping us if you find one.” “Oh, okay, great. That is enough for me to work with.”

Dann: Exactly, we got that, too. I didn't believe it so we were downloading price sheets and doing comparisons, doing all that stuff.

Corey: Oh, trust but verify. And when we're talking this kind of money, I don't trust very far. They make mistakes on billing issues from time to time. 
And I get it; it's hard, but there are challenges here and there. I am glad you mentioned a minute ago that you are multi-cloud because my position on that has often been misconstrued.

I think that designing something from day one to work on multiple cloud providers is generally foolish. I think that unless you have a compelling reason not to go all-in on one cloud provider, that's what you should do. Pick a cloud—I don't care which—and go all-in. Conversely, you have a product like Datadog where your customers are in multiple clouds, and first, no one wants to pay egress to send all the telemetry from where they are into AWS, and secondly, they're not going to put up, in many cases, with their data going to a cloud provider they have explicitly chosen not to work with, so you have to meet your customers where they are. In your case, it is absolutely the right thing to do. And Twitter often gets upset and calls me a hypocrite on stuff like this because Twitter believes that two things that take opposite visions cannot possibly both be true, but the world is messy.

Dann: Yeah. And I mean, the nice thing about us being in multiple clouds is we are our own biggest user. And that's actually one of the reasons why I love working at Datadog is because I get to use Datadog all the time. And not only that, Datadog is on everything and we have all of our products. I'm very spoiled [laugh] with all of this. But I mean, we are running in these different cloud providers; we are using Datadog in those different cloud providers, and that is just helping everything overall, too. In addition to supporting customers that are in each cloud because that is a huge reason as well.

Corey: This episode is sponsored in part by something new. Cloud Academy is a training platform built on two primary goals. Having the highest quality content in tech and cloud skills, and building a good community that is rich and full of IT and engineering professionals. 
You wouldn't think those things go together, but sometimes they do. It's both useful for individuals and large enterprises, but here's what makes it new. I don't use that term lightly. Cloud Academy invites you to showcase just how good your AWS skills are. For the next four weeks you'll have a chance to prove yourself. Compete in four unique lab challenges, where they'll be awarding more than $2000 in cash and prizes. I'm not kidding, first place is a thousand bucks. Pre-register for the first challenge now, one that I picked out myself on Amazon SNS image resizing, by visiting cloudacademy.com/corey. C-O-R-E-Y. That's cloudacademy.com/corey. We're gonna have some fun with this one!

Corey: One of the problems that I keep running into across the board is that with things like Datadog—and again, not to single you out; every monitoring vendor to some extent has aspects of this problem—it's that when I'm a customer and I'm hooking my accounts up to Datadog, I want you to tell me about things that are going on, but the CloudWatch charges can be so egregious on the customer side, where it is bizarre and, frankly, abhorrent to me when I wind up paying more for the CloudWatch charges than I am for Datadog. And let's be clear here; I am, in fact, a Datadog customer. I pay you folks money. Not a lot of money, but I pay you money because I have certain things that I need to know are working for a variety of excellent reasons.

And the problem that I keep smacking into on this is—it's not your fault; there's not anything you can do. In fact, you are one of the better providers as far as not only not being egregious with the way that you slam the CloudWatch endpoints, but also in giving guidance to customers on how to tune it further. And I really wish that more folks in your space would do things like that. It always bugs me when I wind up using a tool that tries to save money that in turn winds up costing me more than it saves.

Dann: Yeah. Yeah, it's tricky there. 
I have less experience myself setting up Datadog and running it in my own infrastructure as I'm more digging deep into the cost stuff and us using the cloud, so I can't speak to that specifically. But yeah, you're not the first person that I've heard have that experience. [laugh].

Corey: And again, it's not your fault at all. I've been beating up the CloudWatch team for years on this, and I will continue to do so until I'm safely dead, which—depending on Amazon's level of patience—might be in mere minutes.

Dann: In the larger-picture-wise, we have to remember that we're super early in the cloud adoption, even looking at the cloud economics FinOps cloud cost optimization world. I feel like most businesses at this stage in their journey are still in data centers and they're dealing with the problem of how do we move to the cloud and do it cost-efficiently? How do we set everything up? And that's where the world is right now.

And I think that dealing with, “Okay, we are one hundred percent running in the cloud. What are the processes that we have in place? How do we think of finance and the finance organization not through the lens of ‘we once had data centers and now we don't,’ but how do we look through that in the lens of ‘okay, we are cloud-native from day one? What does the finance department look like?’” And dealing with those problems is really interesting because Datadog has never been in a data center. We are cloud-native from the very beginning, and so it was interesting for me to join the company and build up a lot of these processes because it is different than what a lot of other people were dealing with and doing. 
And it presents some really interesting problems and questions that I think are going to be the foundation for the next decade of building companies and operating in the cloud.

Corey: I always love having conversations with folks who are building out teams to handle these things because usually the folks I keep talking to, or who want to have conversations like this are building tools themselves to solve this problem through the miracle of SaaS, where they will bend over backwards to avoid ever talking to a customer. And we're all dealing with the same AWS APIs; there's not that much of a new spin you can put on most of these things. But understanding what customers are actually trying to do instead of falling down the rabbit hole trap of, “Hey, turn off those idle instances that are all labeled ‘drsite’ because you probably don't need them,” is foolish. And after a few foolish recommendations, tooling doesn't get there. I am a big believer that tools can assist the process and narrow down what to look at.

I believe they shouldn't have to exist; I think that the billing dashboard should be a hell of a lot better natively than having to pay a third party to make sense of it for me. But by and large, I do believe this is a problem that is best solved from a consultative approach. When I started this place, I was planning to build out some software, tried doing it—called DuckTools—and wound up mothballing the whole thing because what we were building was not what the industry claimed to want and, frankly, educating people into a position where then they see the value and only then will they buy has never been a game that I wanted to play.

Dann: Yeah, I really liked that article that you guys published about exploring that product and the reason why you decided not to pursue it. But it's super interesting in terms of where the industry is going and building out those tools because I found that there isn't really any new thing that you can do with the tools. 
All the tools that exist for looking at your costs are largely the same. The main differences that I've seen is that the UI is slightly different and they have different sales teams. And if the sales teams are better, they're going to get more of the market share. And if the sales teams are not as good, it's going to be a smaller market share. And it's weird, too, to be in this industry for as long as we have been, and seeing okay, well, Andreessen Horowitz just funded this new company, and this other company got invited into Y Combinator, or all of these things that are happening, and I'm kind of like, okay, but what is this tool really doing differently? And there are a few of them that are; that are doing something innovative and different, but there's also a few that are just like, this is a space where people are in, there's money here, we're doing the same thing, but we got our sales team, and we'll carve out our little corner, and then we'll get acquired, and that'll be that. Although I guess we're just at that stage of innovation in this space, I guess.

Corey: Yeah, I have no earthly idea what the story is around how these companies plan to differentiate because it seems to me that they're directly attempting to compete with Cost Explorer, which—

Dann: Yeah.

Corey: —it's taken some time for that thing to improve to the point where it is now and it'll take further time for it to improve beyond it, but long-term, I don't think you're going to outrun AWS on a straight line like that.

Dann: Yeah, I mean, when you work for one of these third-party cost tooling things, and you're working with one of your customers, and they're like, “How do I view this?” And it's kind of like, that is the easiest thing to find in Cost Explorer as well, it's—I can't imagine being like, “Well, you should pay me thousands, tens of thousands, hundreds of thousands of dollars a month to view it here,” when Cost Explorer is free. 
And I think Cost Explorer, it doesn't do everything, but it's gotten a lot better at what it does, and it could probably solve 90% of people's problems without using a third-party tool.

Corey: You are at significant scale in multiple clouds, so the answer that these companies always give is, “Ah, but we provide a single dashboard so that you can look at costs across multiple providers in one place.” Is that even slightly useful to you?

Dann: Man, if you need dashboards, get a dashboard tool. Don't get this crazy cost analysis tool. I mean, there are some great dashboard solutions that you can get where you can connect your detailed billing, cost and usage report—whatever cloud provider is calling it, but, like, that really detailed gigabytes per hour report—and then visualize it, build reports, do all that kind of stuff because that's not something that the tooling does well right now, in terms of building out cost dashboards and stuff. But that's also right now. It could in the future.

Corey: Yeah. If you're a BI tool, wind up passing out templates that normalize these things. I am so tired of building it all from scratch in Tableau myself. If you're Tableau, sell me a whole bunch of things that I can use to view this stuff through, so I don't have to wind up continually reinventing that particular wheel.

Dann: Yeah.

Corey: Oh, I like your approach. I didn't know the answer when I was asking the question. I was about to learn something if you'd gone the other direction, but nope, but it's good to know that my impressions remain intact.

Dann: Yeah, I mean, I've used different tools in the past. Again, I hesitate to name any of them, but there's a few in this space that I feel like everybody—if they're in this space, they know which tools I'm talking about—

Corey: Yes, we do.

Dann: —and… yeah, I've used them. 
They're okay—a few of them are okay, a few of them are better than others, but I mean, I was trying to evaluate the value-add over me manually setting some things up and having some sort of visualization, and just the value-add in terms of what they were charging, even if it was like a significantly smaller percent of the bill because that alone, like, percent of bill is such a difficult cost model—

Corey: Oh—

Dann: —to do.

Corey: I hate that. Pricing is hard. Let's start there.

Dann: Yeah. Yeah. Yeah, yeah.

Corey: I hate the percent of bill because then it's, “Let me get this straight. I'm paying you a percentage of things like data transfer charges that I know are fixed, that I can't optimize? I'm paying you a percentage of my AWS enterprise support subscription? I'm paying you a percentage of the marketplace?” And so on and so forth. And it doesn't work. At some point of scale as well it's, I could hire a team of 20 people and save money versus what you're charging me. The other side of it though, “Ah, we'll charge you percentage of savings.” Well, then you wind up with people doing a whole bunch of things like before they bring you in, they'll make a bunch of ill-advised reserved instance purchases or savings plan purchases you have to then unwind after the fact. When I was setting this place up, I looked long and hard at different billing models and the only thing I found that worked is fixed fee. The end. Because at that point, suddenly everyone's on board with, “Hey, let's solve the problem and then get out as soon as possible.” We're not trying to build ourselves a forever job nestled in the heart of your company. And it's the only model I found that removes a whole swath of conflicts of interest. And that's the hard part. 
We have no partners with anyone in this space—including AWS themselves—just because as soon as we do, it becomes extremely disingenuous when we suggest doing something for your sake that happens to benefit them, such as, “Maybe back that S3 bucket up somewhere.” Well, okay, if we're partnered with them, does that mean we're trying to influence spend in the other direction? And it just becomes a morass that I never found it worth the time to deal with.

Dann: Yeah, I—

Corey: But that doesn't work for SaaS.

Dann: Yeah, that makes a lot of sense. And I haven't actually thought about pricing model for consulting in this space that closely, but I mean, when you're charging a percent of bill or percent of savings, you have the opportunity to screw the customer, right, through all the things that you were saying. If you charge a fixed fee, you have the possibility of undervaluing yourself, which the only one that's screwed in that case is you, potentially, and if you're okay with that risk and you're okay with those dollars, that's great. Because yeah, if you're able to be like, “Okay, here's the services that I do, here's the fixed costs.” “Done.” “Done.” That just sets everybody's expectations for the relationship in a much better way that you're not constantly worried about, like, upsells and other things that might happen along the way that screws the customer.

Corey: And that's the hardest part, I think, is that people lose sight of the entire customer obsession piece of it. That's one of the things Amazon gets super right. I wish more companies embrace that. Dann, I want to thank you for taking so much time out of your day to suffer my slings, and arrows, and half-formed opinions. If people want to learn more about who you are and what you're up to, where can they find you?

Dann: Yeah, I have a website you guys can go to that links everywhere else. It is dannb.org. And I spell my name with two ns, so D-A-N-N-B dot org. 
And I have LinkedIn, I have Twitter, I have a monthly newsletter that is not really about FinOps or anything, but I really enjoy it; I've been doing it for a year, now, that you should sign up for.
Corey: And links to that will, of course, be in the show notes. Dann, thanks again for your time. I really appreciate it.
Dann: Yeah. Thanks so much for having me again. It's been a blast.
Corey: It really has. Dann Berg, senior CloudOps analyst at Datadog. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with a comment featuring a picture of several corkboards full of post-it notes and string, and a deranged comment telling me that you have in fact finally found the catch in savings plans.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.

44BITS Podcast - Cloud, Development, Gadgets
44bits Podcast 127.log: GeekNews Show, Daangn SRE Meetup, Amazon MemoryDB for Redis

44BITS Podcast - Cloud, Development, Gadgets

Nov 2, 2021 37:26


In the 127th log of the 44bits podcast, we talked about GeekNews Show, the Daangn SRE Meetup, and Amazon MemoryDB for Redis.
Participants: @nacyo_t, @seapy, @raccoonyy
Regular sponsorship - 44bits podcast are creating a podcast for programmers
Recorded August 20, published November 02
Show notes: https://stdout.fm/127/
Jump to a topic:
  • 00:00 Start
  • 02:01 GeekNews Show
  • 10:04 Amazon MemoryDB for Redis
  • 18:26 Daangn SRE Meetup
  • 28:05 Countryside schooling
  • 30:29 Adobe Premiere Pro text extraction
Show notes:
GeekNews Show
  • GeekNews Show launch - GeekNews
  • GeekNews Ask - Questions and answers from geeks - GeekNews
Amazon MemoryDB for Redis
  • Introducing Amazon MemoryDB for Redis - AWS News Blog
Daangn SRE Meetup
  • Daangn SRE Meetup #1 | Festa!
  • Daangn SRE Meetup #1 talk video playlist
  • Daangn SRE Meetup #1 presentation materials
Countryside schooling
  • Jeonnam rural schooling program (전남농산어촌유학)

The Stack Overflow Podcast
A murder mystery: who killed our user experience?

The Stack Overflow Podcast

Oct 27, 2021 28:58


The infrastructure that networked applications live on is getting more and more complicated. There was a time when you could serve an application from a single machine on premises. But now, with cloud computing offering painless scaling to meet your demand, your infrastructure becomes abstracted and not really something you have contact with directly. Compound that problem with architecture spread across dozens, even hundreds of microservices, replicated across multiple data centers in an ever-changing cloud, and tracking down the source of system failures becomes something like a murder mystery. Who shot our uptime in the foot? A good observability system helps with that.
On this sponsored episode of the Stack Overflow Podcast, we talk with Greg Leffler of Splunk about the keys to instrumenting an observable system and how the OpenTelemetry standard makes observability easier, even if you aren't using Splunk's product.
Observability is really an outgrowth of traditional monitoring. You expect that some service or system could break, so you keep an eye on it. But observability applies that monitoring to an entire system and gives you the ability to answer the unexpected questions that come up. It uses three principal ways of viewing system data: logs, traces, and metrics.
Metrics are a number and a timestamp that tell you particular details. Traces follow a request through a system. And logs are the causes and effects recorded from a system in motion. Splunk wants to add a fourth one, events, that would track specific user events and browser failures.
Observing all that data first means you have to be able to track and extract that data by instrumenting your system to produce it. Greg and his colleagues at Splunk are huge fans of OpenTelemetry. It's an open standard that can extract data for any observability platform. You instrument your application once and never have to worry about it again, even if you need to change your observability platform.
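To make the "instrument once, change platforms later" idea concrete, here is a minimal Python sketch. It is not the OpenTelemetry API: `Metric`, `Span`, `LogRecord`, `InMemoryExporter`, and `handle_request` are hypothetical names invented for illustration, and the fields are assumptions based only on the descriptions of metrics, traces, and logs above.

```python
import time
import uuid
from dataclasses import dataclass
from typing import List, Protocol, Union


@dataclass
class Metric:
    """A number plus a timestamp."""
    name: str
    value: float
    timestamp: float


@dataclass
class Span:
    """One step of a request as it moves through the system."""
    trace_id: str
    name: str
    start: float
    end: float


@dataclass
class LogRecord:
    """A timestamped message describing what happened."""
    timestamp: float
    message: str


Signal = Union[Metric, Span, LogRecord]


class Exporter(Protocol):
    """Anything that can receive signals (hypothetical interface)."""
    def export(self, signal: Signal) -> None: ...


class InMemoryExporter:
    """Stand-in for a vendor backend: it just collects what it is sent."""

    def __init__(self) -> None:
        self.received: List[Signal] = []

    def export(self, signal: Signal) -> None:
        self.received.append(signal)


def handle_request(exporter: Exporter, path: str) -> None:
    """Instrumented once; the exporter decides where the data goes."""
    trace_id = uuid.uuid4().hex
    start = time.time()
    exporter.export(LogRecord(start, f"handling {path}"))
    end = time.time()
    exporter.export(Span(trace_id, f"GET {path}", start, end))
    exporter.export(Metric("request.duration_seconds", end - start, end))


vendor_a = InMemoryExporter()
vendor_b = InMemoryExporter()
handle_request(vendor_a, "/checkout")
handle_request(vendor_b, "/checkout")  # same code, different backend
```

The application function never names a vendor; swapping backends only means passing a different exporter object, which is the property the episode attributes to an open standard.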
Why use an approach that makes it easy for a client to switch vendors? Leffler and Splunk argue that it's not only better for customers, but for Splunk and the observability industry as a whole. If you've instrumented your system with a vendor-locked solution, then you may not switch; you may just let your observability program fall by the wayside. That helps exactly no one.
As we've seen, people are moving to the cloud at an ever faster pace. That's no surprise; it offers automatic scaling for arbitrary traffic volumes, high availability, and worry-free infrastructure failure recovery. But moving to the cloud can be expensive, and you have to do some work with your application to be able to see everything that's going on inside it. Plenty of people just throw everything into the cloud and let the provider handle it, which is fine until they see the bill.
Observability based on an open standard makes it easier for everyone to build a more efficient and robust service in the cloud. Give the episode a listen and let us know what you think in the comments.