About Sam Nicholls: Veeam's Director of Public Cloud Product Marketing, with 10+ years of sales, alliance management, and product marketing experience in IT. Sam has evolved from his on-premises storage days and is now laser-focused on spreading the word about cloud-native backup and recovery, packing in thousands of viewers on his webinars, blogs, and webpages.

Links Referenced:
Veeam AWS Backup: https://www.veeam.com/aws-backup.html
Veeam: https://veeam.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at Chronosphere. Tired of observability costs going up every year without getting additional value? Or being locked in to a vendor due to proprietary data collection, querying, and visualization? Modern-day, containerized environments require a new kind of observability technology that accounts for the massive increase in scale and attendant cost of data. With Chronosphere, choose where and how your data is routed and stored, query it easily, and get better context and control. 100% open source compatibility means that no matter what your setup is, they can help. Learn how Chronosphere provides complete and real-time insight into ECS, EKS, and your microservices, wherever they may be, at snark.cloud/chronosphere. That's snark.cloud/chronosphere.

Corey: This episode is brought to us by our friends at Pinecone. They believe that all anyone really wants is to be understood, and that includes your users. AI models combined with the Pinecone vector database let your applications understand and act on what your users want… without making them spell it out.
Make your search application find results by meaning instead of just keywords, your personalization system make picks based on relevance instead of just tags, and your security applications match threats by resemblance instead of just regular expressions. Pinecone provides the cloud infrastructure that makes this easy, fast, and scalable. Thanks to my friends at Pinecone for sponsoring this episode. Visit Pinecone.io to understand more.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by and sponsored by our friends over at Veeam. And as a part of that, they have thrown one of their own to the proverbial lion. My guest today is Sam Nicholls, Director of Public Cloud over at Veeam. Sam, thank you for joining me.

Sam: Hey. Thanks for having me, Corey, and thanks to everyone joining and listening in. I do know that I've been thrown into the lion's den, and I am [laugh] hopefully well-prepared to answer anything and everything that Corey throws my way. Fingers crossed. [laugh].

Corey: I don't think there's too much room for criticizing here, to be direct. I mean, Veeam is a company that is solidly and thoroughly built around a problem that absolutely no one cares about. I mean, what could possibly be wrong with that? You do backups, which no one ever cares about. Restores, on the other hand, people care very much about restores. And that's when they learn, “Oh, I really should have cared about backups at any point prior to 20 minutes ago.”

Sam: Yeah, it's a great point. It's kind of like taxes and insurance.
It's almost like, you know, something that you have to do that you don't necessarily want to do, but when push comes to shove, and something's burning down, a file has been deleted, someone's made their way into your account and, you know, is running a right mess within there, that's when you really, kind of, care about what you mentioned, which is the recovery piece: the speed of recovery, the reliability of recovery.

Corey: It's been over a decade, and I'm still sore about losing my email archives from 2006 to 2009. There's no way to get it back. I ran my own mail server; it was an iPhone setting that said, “Oh, yeah, automatically delete everything in your trash folder—or archive folder—after 30 days.” It was just a weird default setting back in that era. I didn't realize it was doing that. Yeah, painful stuff.

And we learned the hard way in some of these cases. Not that I really have much need for email from that era of my life, but every once in a while it still bugs me. Which speaks to the point that the people who are the most fanatical about backing things up are the people who have been burned by not having a backup. And I'm fortunate in that it wasn't someone else's data with which I had been entrusted that really cemented that lesson for me.

Sam: Yeah, yeah. It's a good point. I can remember a few years ago, my wife migrated a very aging, polycarbonate white Mac to one of the shiny new aluminum ones and thought everything was good—

Corey: As the white polycarbonate Mac becomes yellow, then yeah, all right, you know, it's time to replace it. Yeah. So yeah, so she wiped the drive, and what happened?

Sam: That was her moment where she learned the value and importance of backup, and she backs everything up now. I, fortunately, have never gone through it. But I'm employed by a backup vendor, and that's why I care about it. But it's incredibly important to have, of course.

Corey: Oh, yes.
My spouse has many wonderful qualities, but one that drives me slightly nuts is she's something of a digital packrat, where the hard drives on her laptop will periodically fill up. And I used to take the approach of, oh, you can be more efficient, and the rest. And I realized no, telling other people they're doing it wrong is generally poor practice, whereas just buying bigger drives is way easier. Let's go ahead and do that. It's a small price to pay for domestic tranquility.

And there's a lesson in that. We can map that almost perfectly to the corporate world in which you folks tend to operate. You're not doing home backup, last time I checked; you are doing public cloud backup. Actually, I should ask that. Where do you folks start and where do you stop?

Sam: Yeah, no, it's a great question. You know, we started over 15 years ago, when virtualization, specifically VMware vSphere, was really the up-and-coming thing, and, you know, a lot of folks were trying to utilize agents to protect their vSphere instances, just like they were doing with physical Windows and Linux boxes. And, you know, it kind of got the job done, but was it the best way of doing it? No. And that's kind of why Veeam was pioneered: this agentless, image-based backup for vSphere.

And, of course, you know, in the last 15 years, we've seen lots of transitions. Of course, we're here at Screaming in the Cloud with you, Corey, so AWS, as well as a number of other public cloud vendors, we can help protect, as well as a number of SaaS applications, like Microsoft 365, and metadata and data within Salesforce. So, Veeam's really come a long way from just virtual machines to taking a global look at the entirety of modern environments, and how we can best protect each and every single one of those without trying to take a square peg and fit it in a round hole.

Corey: It's a good question and a common one.
We wind up with an awful lot of folks who are confused by the proliferation of data. And I'm one of them, let's be very clear here. It comes down to a problem where backups are a multifaceted, deep problem, and I don't think that people necessarily think of it that way. But I take a look at all of the different, even AWS services that I use for my various nonsense, and which ones can be used to store data?

Well, all of them. Some of them, you have to hold it in a particularly wrong sort of way, but they all store data. And in various contexts, a lot of that data becomes very important. So, what service am I using, in which account am I using it, and in what region am I using it? And you wind up with data sprawl, where it's a tremendous amount of data that you can generally only track down by looking at your bills at the end of the month. Okay, so what am I being charged, and for what service?

That seems like a good place to start, but where is it getting backed up? How do you think about that? So, some people, I think, tend to ignore the problem, which we're seeing less and less, but other folks tend to go to the opposite extreme: we're just going to back up absolutely everything, and we're going to keep that data for the rest of our natural lives. It feels to me that there's probably an answer that is more appropriate somewhere nestled between those two extremes.

Sam: Yeah, snapshot sprawl is a real thing, and it gets very, very expensive very, very quickly. You know, your snapshots of EC2 instances are taken from those attached EBS volumes. Five cents per gig per month doesn't sound like a lot, but when you're dealing with thousands of snapshots for thousands of machines, it gets out of hand very, very quickly. And you don't know when to delete them.
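To put the arithmetic Sam just sketched in concrete terms, here is a rough model of how snapshot storage costs compound. The fleet size, daily change rate, and even holding the price at a flat five cents per gig are illustrative assumptions; actual EBS snapshot pricing varies by region.

```python
# Rough model of EBS snapshot sprawl costs. All numbers are
# illustrative assumptions, not a quote of current AWS pricing.
SNAPSHOT_PRICE_PER_GB_MONTH = 0.05  # the "five cents per gig" figure

def monthly_snapshot_bill(machines: int,
                          snapshots_per_machine: int,
                          full_size_gb: float,
                          incremental_gb: float) -> float:
    """First snapshot per volume is full-size; the rest are incremental."""
    per_machine_gb = full_size_gb + (snapshots_per_machine - 1) * incremental_gb
    return machines * per_machine_gb * SNAPSHOT_PRICE_PER_GB_MONTH

# 1,000 machines, 90 daily snapshots retained, 100 GB volumes,
# roughly 2 GB of changed blocks per day.
bill = monthly_snapshot_bill(machines=1000,
                             snapshots_per_machine=90,
                             full_size_gb=100,
                             incremental_gb=2)
print(f"${bill:,.0f}/month")  # prints $13,900/month, just for snapshots
```

Even with incremental snapshots doing most of the heavy lifting, retention without a deletion policy is what drives the bill.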
Like you say, folks are just retaining them forever and dealing with this unfortunate bill shock.

So, you know, where to start is automating the lifecycle of a snapshot, right? From its creation—how often do we want to be creating them—to the retention—how long do we want to keep these for, and where do we want to keep them, because there are other storage services outside of just EBS volumes—and then, of course, the ultimate: deletion. And that's important even from a compliance perspective as well, right? You've got to retain data for a specific number of years, I think healthcare is like seven years, but then you've—

Corey: And then not a day more.

Sam: Yeah, and then not a day more, because that puts you out of compliance, too. So, policy-based automation is your friend, and we see a number of folks building these policies out: gold, silver, bronze tiers based on criticality of data compliance, and really just letting the machine do the rest. And you can focus on not babysitting backup.

Corey: What was it that led to the rise of snapshots? Because back in my very early days, there was no such thing. We wound up using a bunch of servers stuffed in a rack somewhere, and virtualization was not really in play, so we had file systems on physical disks. And how do you back that up? Well, you have an agent of some sort that basically looks at all the files and, according to some ruleset that it has, copies them off somewhere else.

It was slow, it was fraught, it had a whole bunch of logic that was pushed out to the very edge, and forget about restoring that data in a timely fashion, or even validating that a lot of those backups worked, other than via checksum. And God help you if you had data that was constantly in a state of flux, where anything changing during the backup run would leave your backups in an inconsistent state. That, on some level, seems to have largely been solved by snapshots. But what's your take on it?
You're a lot closer to this part of the world than I am.

Sam: Yeah, snapshots: I think folks have turned to them for the speed, the lack of impact that they have on production performance, and, again, just the ease of accessibility. We have access to all different kinds of snapshots for EC2, RDS, EFS, throughout the entirety of our AWS environment. So, I think snapshots are kind of the default go-to for folks. They can help deliver those very, very quick RPOs, especially in, for example, databases, like you were saying, that change very, very quickly, where we all of a sudden are stranded with a crash-consistent backup or snapshot versus an application-consistent snapshot. And then they're also very, very quick to recover from.

So, snapshots are very, very appealing, but they absolutely do have their limitations. And I think, you know, it's not one or the other; they've got to go hand-in-hand with something else. And typically, that is an image-based backup that is stored in a separate location from the snapshot, because that snapshot is not independent of the disk that it is protecting.

Corey: One of the challenges with snapshots is most of them are created in a copy-on-write sense. It takes basically an instant frozen point in time. Back once upon a time, when we ran MySQL databases on top of the NetApp Filer—which worked surprisingly well—we would have a script that would automatically quiesce the database so that it would be in a consistent state, snapshot the filer, and then un-quiesce it, which took less than a second, start to finish. And that was awesome, but then you had this snapshot type of thing. It wasn't super portable, it needed to reference a previous snapshot in some cases, and AWS takes the same approach, where the first snapshot captures every block, and subsequent snapshots only take up as much space as has changed since the previous snapshot.
So, large quantities of data that generally don't get accessed a whole lot have remarkably small subsequent snapshot sizes.

But that's not at all obvious from the outside looking at these things. They're not the most portable thing in the world. But it's definitely the direction that the industry has trended in. So, rather than having a cron job fire off an AWS API call to take snapshots of my volumes as sort of the baseline approach that we all started with, what is the value proposition that you folks bring? And please don't say it's, “Well, cron jobs are hard and we have a friendlier interface for that.”

Sam: [laugh]. I think it's really starting to look at the proliferation of those snapshots, understanding what they're good at and what they are good for within your environment—as previously mentioned, low RPOs, low RTOs, how quickly can I take a backup, how frequently can I take a backup, and, more importantly, how quickly can I restore—but then looking at their limitations. So, I mentioned that they are not independent of that disk, so that certainly does introduce a single point of failure, as well as being not so secure. We've touched on the cost component of that as well. So, what Veeam can come in and do is take an image-based backup of those snapshots—so you've got your initial snapshot and then your incremental ones—we'll take the backup from that snapshot, and then we'll start to store that elsewhere.

And that is likely going to be in a different account. We can look at the Well-Architected Framework, with AWS deeming accounts a security boundary, so having that cross-account function is critically important so you don't have that single point of failure. Locking down with IAM roles is also incredibly important, so we haven't just got a big, wide-open door between the two. But that data is then stored in a separate account—potentially in a separate region, maybe in the same region—in Amazon S3 storage.
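The lifecycle decisions being described here, including the gold, silver, and bronze policy tiers Sam mentioned earlier, boil down to a rules table: how often to back up, how long a copy stays on fast snapshot storage, when it moves to cheaper S3 storage, and when it is finally deleted. A minimal sketch, where the tier names, retention windows, and storage targets are illustrative assumptions rather than Veeam's actual policy engine:

```python
from dataclasses import dataclass

# Illustrative policy tiers: backup frequency, how long to keep fast
# EBS snapshots, and when copies move to cheaper storage or get deleted.
@dataclass
class Policy:
    backup_every_hours: int
    keep_snapshots_days: int   # fast, expensive EBS snapshots
    keep_s3_days: int          # cheaper S3 copy in a separate account
    keep_archive_days: int     # coldest tier (e.g., Glacier), then delete

POLICIES = {
    "gold":   Policy(backup_every_hours=1,  keep_snapshots_days=7,
                     keep_s3_days=90,  keep_archive_days=7 * 365),
    "silver": Policy(backup_every_hours=12, keep_snapshots_days=3,
                     keep_s3_days=30,  keep_archive_days=365),
    "bronze": Policy(backup_every_hours=24, keep_snapshots_days=1,
                     keep_s3_days=14,  keep_archive_days=90),
}

def placement(tier: str, age_days: float) -> str:
    """Where a backup of the given age should live, per its policy tier."""
    p = POLICIES[tier]
    if age_days <= p.keep_snapshots_days:
        return "ebs-snapshot"
    if age_days <= p.keep_s3_days:
        return "s3"
    if age_days <= p.keep_archive_days:
        return "archive"
    return "delete"  # past retention: deletion matters for compliance too

print(placement("gold", 2))     # ebs-snapshot
print(placement("silver", 40))  # archive
print(placement("bronze", 120)) # delete
```

The point of encoding it this way is that the machine, not a human, decides when each copy ages out of each tier.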
And S3 has the wonderful benefit of still being relatively performant, so we can have quick recoveries, but it is much, much cheaper. You're dealing with 2.3 cents per gig per month, instead of—

Corey: To start, and it goes down from there with sizeable volumes.

Sam: Absolutely, yeah. You can go down to S3 Glacier, where you're looking at—I forget how many points and zeros and nines it is, but it's fractions of a cent per gig per month—but it's going to take you a couple of days to recover that da—

Corey: Even infrequent access cuts that in half.

Sam: Oh yeah.

Corey: And let's be clear, these are snapshot backups; you probably should not be accessing them on a consistent, sustained basis.

Sam: Well, exactly. And this is where it's almost like having your cake and eating it as well. Compliance or regulatory mandates, or corporate mandates, are saying you must keep this data for this length of time. Keeping, let's just say, three years' worth of snapshots in an EBS volume is going to be incredibly expensive. What's the likelihood of you needing to recover something from two years—actually, even two months—ago? It's very, very small.

So, you don't need to take the performance of S3 as much into consideration. Can you recover? Yes. Is it going to take a little bit longer? Absolutely. But it's going to help you meet those retention requirements while keeping your backup bill low, avoiding that bill shock of spending tens and tens of thousands every single month on snapshots. This is what I mean by having your cake and eating it.

Corey: I somewhat recently had a client where EBS snapshots were one of the driving costs behind their bill. It was one of their largest single line items. And I want to be very clear here, because one of those people listening to this might be thinking, “Well, hang on.
Wait, they're telling stories about us, even though they're not naming us by name?” Yeah, there were three of you in the last quarter.

So, at that point, it becomes clear it is not about something that one individual company has done, and more about an overall driving trend. I am personalizing it a little bit by referring to it as one company when there were three of you. This is a narrative device, not me breaking confidentiality. Disclaimer over. Now, when you talk to people about, “So, tell me why you've got 80 times more snapshots than you do EBS volumes?” the answer is, “Well, we wanted to back things up and we needed to get hourly backups to a point, then daily backups, then monthly, and so on and so forth. And when this was set up, there wasn't a great way to do this natively, and we don't always necessarily know what we need versus what we don't. And the cost of us backing this up, well, you can see it on the bill. The cost of us deleting too much and needing it as soon as we do? Well, that cost is almost incalculable. So, this is the safe way to go.” And they're not wrong in anything that they're saying. But the world has definitely evolved since then.

Sam: Yeah, yeah. It's a really great point. Again, it just folds back into my whole having-your-cake-and-eating-it conversation. Yes, you need to retain data; it gives you that kind of nice, warm, cozy feeling, it's a nice blanket on a winter's day, that, irrespective of what happens, you're going to have something to recover from. But the question is, does that need to be living on an EBS volume as a snapshot?
Why can't it be living on much, much more cost-effective storage that's going to give you the warm and fuzzies, but is going to make your finance team much, much happier? [laugh].

Corey: One of the inherent challenges I think people have is that snapshots by themselves are almost worthless, in that I have an EBS snapshot, it is sitting there now, it's costing me an undetermined amount of money because it's not exactly clear on a per-snapshot basis exactly how large it is, and okay, great. Well, I'm looking for a file that was not modified since X date, as it was at this time. Well, great: you're going to have to take that snapshot, restore it to a volume, and then go exploring by hand. Oh, it was the wrong one. Great. Try it again with a different one.

And after, like, the fifth or sixth in a row, you start taking a binary search approach on this thing. But it's expensive, it's time-consuming, it takes forever, and it's not a fun user experience at all. Part of the problem is it seems that, historically, backup systems have had no contextual awareness whatsoever around what is actually contained within that backup.

Sam: Yeah, yeah. I mean, you kind of highlighted two of the steps. It's more like a ten-step process to do, you know, granular file- or folder-level recovery from a snapshot, right? Like you say, you've got to determine the point in time when you knew the last time that it was around; then you're going to have to determine the volume size, the region, the OS; you're going to have to create an EBS volume of the same size and region from that snapshot; create an EC2 instance with the same OS; connect the two together; boot the EC2 instance; mount the volume; search for the files to restore; download them manually, at which point you have your file back. It's not back in the machine where it was; it's now been downloaded locally to whatever machine you're accessing that from.
And then you've got to tear it all down.

And that is, again, like you say, predicated on the fact that you knew exactly that that was the right point in time. It might not be, and then you have to start from scratch with a different point in time. So, backup tooling from backup vendors that have been doing this for many, many years knew about this problem long, long ago, and really seeks to not only automate the entirety of that process, but make the whole e-discovery—the search, the location of those files—much, much easier. I don't necessarily want to do a vendor pitch, but I will say, with Veeam, we have explorer-like functionality whereby it's just a simple web browser. Once that machine is all spun up again—automatic process—you can just search for your individual file or folder, locate it, you can download it locally, you can inject it back into the instance where it was through Amazon Kinesis or AWS Kinesis—I forget the right terminology for it; some of it's AWS, some of it's Amazon.

But, by the by, the whole recovery process, especially at a file or folder level, is much more pain-free, but also much faster. And that's ultimately what people care about: how reliable is my backup? How quickly can I get stuff back online? Because the time that I'm down is costing me an indescribable amount of time or money.

Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database. If you're tired of managing open source Redis on your own, or if you are looking to go beyond just caching and unlock your data's full potential, these folks have you covered. Redis Enterprise is the go-to managed Redis service that allows you to reimagine how your geo-distributed applications process, deliver, and store data. To learn more from the experts in Redis how to be real-time, right now, from anywhere, visit redis.com/duckbill.
That's R-E-D-I-S dot com slash duckbill.

Corey: Right, the idea of RPO versus RTO: recovery point objective and recovery time objective. With an RPO, it's great: disaster strikes right now; how long is it acceptable to have been since the last time we backed up data to a restorable point? Sometimes it's measured in minutes, sometimes it's measured in fractions of a second. It really depends on what we're talking about. Payments databases: there, the RPO basically asymptotically approaches zero.

The RTO is, okay, how long is acceptable before we have that data restored and are back up and running? And that is almost always a longer time, but not always. And there's a different series of trade-offs that goes into that. But both of those also presuppose that you've already dealt with the existential question of: is it possible for us to recover this data at all? And that's where, I know that you obviously have a position on this that is informed by where you work, but I don't, and I will call this out as what I see in the industry: AWS Backup is compelling to me except for one fatal flaw, and that is that it starts and stops with AWS.

I am not a proponent of multi-cloud. Lord knows I've gotten flack for that position a bunch of times, but the one area where it makes absolute sense to me is backups. Have your data, in a rehydrate-the-business-level state, backed up somewhere that is not your primary cloud provider, because you're otherwise single-point-of-failure-ing through a company, through the payment instrument you have on file with that company, and within the blast radius of someone who can successfully impersonate you to that vendor. There has to be a gap of some sort for the truly business-critical data. Yes, egress to other providers is expensive, but you know what's also expensive? Irrevocably losing the data that powers your business. Is it likely? No, but I would much rather do it than have to justify why I'm not doing it.

Sam: Yeah.
It wasn't likely that I was going to win that $2 billion or $2.1 billion on the Powerball, but [laugh] I still play [laugh]. But I understand your standpoint on multi-cloud, and I read your newsletters and understand where you're coming from, but I think the reality is that we do live in at least a hybrid cloud world, if not multi-cloud. The number of organizations that are sole-sourced on a single cloud and nothing else is relatively small, a single-digit percentage. It's around 80-some percent that are hybrid, and the remainder of them are your favorite: multi-cloud.

But again, having something that is one hundred percent sole-sourced on a single platform or a single vendor does expose you to a certain degree of risk. So, there's value in having the ability to do cross-platform backups, recoveries, and migrations, for whatever reason, because it might not just be a disaster like you'd mentioned; it might also just be… I don't know, the company has been taken over, and all of a sudden the preference is now towards another cloud provider, and I want you to refactor and re-architect everything for this other cloud provider. If all that data is locked into one platform, that's going to make your job very, very difficult. So, we mentioned at the beginning of the call that Veeam is capable of protecting a vast number of heterogeneous workloads on different platforms, in different environments, on-premises, in multiple different clouds, but the other key piece is that we always use the same backup file format. And why that's key is because it enables portability.

If I have backups of EC2 instances that are stored in S3, I could copy those onto on-premises disk, I could copy those into Azure, I could do the same with my Azure VMs and store those on S3 or, again, on-premises disk, and any other endless combination that goes with that. And it's really centered around control and ownership of your data. We are not prescriptive by any means; you do what is best for your organization.
We just want to provide you with the toolset that enables you to do that without steering you in one direction or the other with fee structures, disparate feature sets, whatever it might be.

Corey: One of the big challenges that I keep seeing across the board is just a lack of awareness of what the data that matters is, where you see people backing up endless fleets of web server instances that are auto-scaled into existence and then removed. But you can create those things at will; why do you care about the actual data that's on them? It winds up almost at the library management problem, on some level. And in that scenario, snapshots are almost certainly the wrong answer.

One thing that I saw previously that really changed my way of thinking about this was back many years ago, when I was working at a startup that had just started using GitHub, and they were paying for a third-party service that backed up Git repos. Today, that makes a lot more sense, because you have a bunch of other stuff on GitHub that goes well beyond what's contained within Git, but at the time, it seemed silly. It was, why do that? Every Git clone is a full copy of the entire repository history. Just grab it off some developer's laptop somewhere.

It's like, “Really? You want to bet the company, slash your job, slash everyone else's job, on that being feasible and doable, or do you want to spend the 39 bucks a month or whatever it was to get that out the door now so we don't have to think about it, and they validate that it works?” And that was really a shift in my way of thinking, because, yeah, backing up things can get expensive when you have multiple copies of the data living in different places, but what's really expensive is not having a company anymore.

Sam: Yeah, yeah, absolutely. We can tie it back to my insurance dynamic earlier, where it's something that you know you have to have, but you don't necessarily want to pay for it.
Well, you know, just like with insurance, there are multiple different ways to go about recovering your data, and it's only in crunch time that you really care about what it is that you've been paying for, right, when it comes to backup?

Could you get your backup through a git clone? Absolutely. Could you get your data back—how long is that going to take you? How painful is that going to be? What's going to be the impact to the business while you're trying to figure that out, versus, like you say, the 39 bucks a month, a year, or whatever it might be, to have something purpose-built for that, something that is going to make the recovery process as quick and painless as possible and just get things back up online?

Corey: I am not a big fan of the fear, uncertainty, and doubt approach, but I do practice what I preach here, in that, yeah, there is a real fear of data loss. It's not, “People are coming to get you, so you absolutely have to buy whatever it is I'm selling,” but it is something you absolutely have to think about. My core consulting proposition is that I optimize the AWS bill. And sometimes that means spending more. Okay, that one S3 bucket is extremely important to you, and you say you can't sustain the loss of it ever, so One Zone storage is not an option. Where is it being backed up? Oh, it's not? Yeah, I suggest you spend more money and back that thing up if it's as irreplaceable as you say. It's about doing the right thing.

Sam: Yeah, it's interesting, and it's going to be hard for you to prove the value of doing that when you are driving their bill up while you're trying to bring it down. But again, you have to look at something that's not itemized on that bill, which is the impact of downtime. I'm not going to pretend to recall the exact figures, because it varies depending on your business, your industry, your size, but the impact of downtime is massive financially.
Tens of thousands of dollars per hour for small organizations, millions and millions of dollars per hour for much larger organizations. The backup component of that is relatively small in comparison, so it's worth having something that is purpose-built to protect your data and help mitigate that impact of downtime.

Because that's ultimately what you're trying to protect against. The recovery piece is the most important part of what you're buying. And, like you, I would say: at least be cognizant of it, and evaluate your options and what you can live with and what you can live without.

Corey: That's the big burning question that I think a lot of people do not have a good answer to. And when you don't have an answer, you either back up everything or nothing. And I'm not a big fan of doing either of those things blindly.

Sam: Yeah, absolutely. And I think this is why we see varying backup options as well, you know? You're not going to try to apply the same data protection policies to each and every single workload within your environment, because they've all got different levels of workload criticality. And, like you say, some of them might not even need to be backed up at all, just because they don't have data that needs to be protected. So, you need something that is going to be flexible enough to apply across the entirety of your environment, protect each workload with the right policy—in terms of how frequently you protect it, where you store it, and how often, or when, you are eventually going to delete it—and apply that on a workload-by-workload basis. And this is where the joy of things like tags comes into play as well.

Corey: One last thing I want to bring up is that I'm a big fan of watching for companies saying the quiet part out loud. And one area in which they do this—because they're forced to by brevity—is in the title tag of their website.
I pull up veeam.com and I hover over the tab in my browser, and it says, “Veeam Software: Modern Data Protection.” And I want to call that out because you're not explicitly framing it as backup. So, the last topic I want to get into is the idea of security. Because I think it is not fully appreciated on a lived-experience basis—although people will of course agree to this when they're having ivory tower whiteboard discussions—that every place your data lives is a potential point for a security breach to happen. So, you want to have your data living in a bunch of places, ideally, for backup and resiliency purposes. But you also want it to be completely unworkable or illegible to anyone who is not authorized to have access to it. How do you balance those trade-offs yourself, given that what you're fundamentally saying is, “Trust us with your Holy of Holies when it comes to things that power your entire business?” I mean, I can barely get some companies to agree to show me their AWS bill, let alone the data that contains all of the stuff that could destroy their company. Sam: Yeah. Yeah, it's a great question. Before I explicitly answer that piece, I will just say that modern data protection does absolutely have a security component to it, and I think that backup absolutely needs to be a—I'm going to say this in air quotes—a “first class citizen” of any security strategy. I think when people think about security, their mind goes to the preventative, like how do we keep these bad people out? This is going to be a bit of the FUD that you love, but ultimately, the bad guys on the outside have an infinite number of attempts to get into your environment and only have to be right once to get in and start wreaking havoc. You, on the other hand, as the good guy with your cape and whatnot, have got to be right each and every single one of those times. And we as humans are fallible, right?
None of us are perfect, and it's incredibly difficult to defend against these ever-evolving, more complex attacks. So, if someone does get in, having a clean, verifiable, recoverable backup is really going to be the only thing that is going to save your organization, should that actually happen. And what's key to a secure backup? I would say separation, isolation of backup data from the production data; I would say utilizing things like immutability, so in AWS, we've got Amazon S3 Object Lock, so it's that write once, read many state for whatever retention period that you put on it. So, the data that attackers are seeking to encrypt, whether it's in production or in the backup, they cannot encrypt. And then the other piece that I think is coming more and more into play, and it's almost table stakes, is encryption, right? And we can utilize things like AWS KMS for that encryption. But that's there to help defend against the exfiltration attempts. Because these bad guys are realizing, “Hey, people aren't paying me my ransom because they're just recovering from a clean backup, so now I'm going to take that backup data, I'm going to leak the personally identifiable information, trade secrets, or whatever on the internet, and that's going to put them in breach of compliance and give them a hefty fine that way unless they pay me my ransom.” So, encryption, so they can't read that data. Not only can they not change it, but they can't read it, which is equally important. So, I would say those are the three big things for me on what's needed for backup to make sure it is clean and recoverable. Corey: I think that is one of those areas where people need to put additional levels of thought in. I think that if you have access to the production environment and have full administrative rights throughout it, you should definitionally not—at least with that account, and ideally not you at all personally—have access to alter the backups. Full stop.
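Sam's three ingredients (separation of backup data, immutability via S3 Object Lock, and KMS encryption) can be sketched in a single CloudFormation fragment. This is a minimal illustration under assumed values, not Veeam's actual configuration; the logical name, retention mode, and retention period are placeholders you would tune to your own policy:

```yaml
# Hypothetical sketch of a backup bucket following the practices Sam describes.
# The retention mode/period and resource name are illustrative assumptions.
Resources:
  BackupBucket:
    Type: AWS::S3::Bucket
    Properties:
      ObjectLockEnabled: true              # Object Lock must be enabled at bucket creation
      ObjectLockConfiguration:
        ObjectLockEnabled: Enabled
        Rule:
          DefaultRetention:
            Mode: COMPLIANCE               # WORM: retention cannot be shortened or removed
            Days: 30
      BucketEncryption:                    # KMS encryption guards against exfiltrate-and-leak
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: aws:kms
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
```

Ideally a bucket like this lives in a separate, dedicated backup account, per the isolation point above, so that production credentials cannot touch it.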
I would say, on some level, there should not be the ability to alter backups for some particular workloads, the idea being that if you get hit with a ransomware infection, it's pretty bad, let's be clear, but if you can get all of your data back, it's more of an annoyance than it is, again, the existential business crisis that redefines you as a company, if you still are a company. Sam: Yeah. Yeah, I mean, we can turn to a number of organizations. Code Spaces always springs to mind for me; I love Code Spaces. It was kind of one of those precursors to— Corey: It's amazing. Sam: Yeah, but they were running on AWS and they had everything, production and backups, all stored in one account. The attackers got into the account. “We're going to delete your data if you don't pay us this ransom.” They were like, “Well, we're not paying you the ransom. We got backups.” Well, they deleted those, too. And, you know, unfortunately, Code Spaces isn't around anymore. But it really kind of goes to show just the importance of at least logically separating your data across different accounts and not having that god-like access to absolutely everything. Corey: Yeah, when you talked about Code Spaces, I was in [unintelligible 00:32:29] talking about GitHub Codespaces specifically, where they have their developer workstations in the cloud. They're still very much around, at least last time I saw, unless you know something I don't. Sam: Precursor to that. I can send you the link— Corey: Oh oh— Sam: You can share it with the listeners. Corey: Oh, yes, please do. I'd love to see that. Sam: Yeah. Yeah, absolutely. Corey: And it's been a long and strange time in this industry. Speaking of links for the show notes, I appreciate your spending so much time with me. Where can people go to learn more? Sam: Yeah, absolutely. I think veeam.com is kind of the first place that people gravitate towards.
Me personally, I'm kind of like a hands-on learning kind of guy, so we always make free product available. And then you can find that on the AWS Marketplace. Simply search ‘Veeam' through there. A number of free products; we don't put time limits on them, we don't put feature limitations on them. You can back up ten instances, including your VPCs, which we actually didn't talk about today, but I do think is important. But I won't waste any more time on that. Corey: Oh, configuration of these things is critically important. If you don't know how everything was structured and built out, you're basically trying to re-architect from first principles based upon archaeology. Sam: Yeah [laugh], that's a real pain. So, we can help protect those VPCs and we actually don't put any limitations on the number of VPCs that you can protect; it's always free. So, if you're going to use it for anything, use it for that. But hands-on, marketplace, if you want more documentation, want to learn more, want to speak to someone, veeam.com is the place to go. Corey: And we will, of course, include that in the show notes. Thank you so much for taking so much time to speak with me today. It's appreciated. Sam: Thank you, Corey, and thanks to all the listeners tuning in today. Corey: Sam Nicholls, Director of Public Cloud at Veeam. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that takes you two hours to type out but then you lose it because you forgot to back it up. Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS.
We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started. Announcer: This has been a HumblePod production. Stay humble.
About Clinton: Clinton Herget is Field CTO at Snyk, the leader in Developer Security. He focuses on helping Snyk's strategic customers on their journey to DevSecOps maturity. A seasoned technologist, Clinton spent his 20-year career prior to Snyk as a web software developer, DevOps consultant, cloud solutions architect, and engineering director. Clinton is passionate about empowering software engineers to do their best work in the chaotic cloud-native world, and is a frequent conference speaker, developer advocate, and technical thought leader. Links Referenced: Snyk: https://snyk.io/ duckbillgroup.com: https://duckbillgroup.com Transcript Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: This episode is brought to us in part by our friends at Pinecone. They believe that all anyone really wants is to be understood, and that includes your users. AI models combined with the Pinecone vector database let your applications understand and act on what your users want… without making them spell it out. Make your search application find results by meaning instead of just keywords, your personalization system make picks based on relevance instead of just tags, and your security applications match threats by resemblance instead of just regular expressions. Pinecone provides the cloud infrastructure that makes this easy, fast, and scalable. Thanks to my friends at Pinecone for sponsoring this episode. Visit Pinecone.io to understand more. Corey: This episode is brought to you in part by our friends at Veeam. Do you care about backups? Of course you don't. Nobody cares about backups. Stop lying to yourselves!
You care about restores, usually right after you didn't care enough about backups. If you're tired of the vulnerabilities, costs, and slow recoveries when using snapshots to restore your data, assuming you even have them at all living in AWS-land, there is an alternative for you. Check out Veeam, that's V-E-E-A-M, for secure, zero-fuss AWS backup that won't leave you high and dry when it's time to restore. Stop taking chances with your data. Talk to Veeam. My thanks to them for sponsoring this ridiculous podcast. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. One of the fun things about establishing traditions is that the first time you do it, you don't really know that that's what's happening. Almost exactly a year ago, I sat down for a previous promoted guest episode much like this one, with Clinton Herget at Snyk—or Synic; however you want to pronounce that. He is apparently a scarecrow of some sort because when last we spoke, he was a principal solutions engineer, but like any good scarecrow, he was outstanding in his field, and now, as a result, is a Field CTO. Clinton, thanks for coming back, and let me start by congratulating you on the promotion. Or consoling you, depending upon how good or bad it is. Clinton: You know, Corey, a little bit of column A, a little bit of column B. But very glad to be here again, and frankly, I think it's because you insist on mispronouncing Snyk as Synic, and so you get me again. Corey: Yeah, you could add a couple of new letters to it and just call the company [Synack 00:01:27]. Now, it's a hard pivot to a networking company. So, there's always options. Clinton: I acknowledge what you did there, Corey. Corey: I like that quite a bit. I wasn't sure you'd get it. Clinton: I'm a nerd going way, way back, so we'll have to go pretty deep in the stack for you to stump me on some of this stuff. Corey: As we did with the, “I wasn't sure you'd get it.” See, that one sailed right past you. And I win.
Chalk another one up for me in the networking pun wars. Great, we'll loop back to that later. Clinton: I don't even know where I am right now. Corey: [laugh]. So, let's go back to a question that one would think that I'd already established a year ago, but I have the attention span of basically a goldfish, let's not kid ourselves. So, as I'm visiting the Snyk website, I find that it says different words than it did a year ago, which is generally a positive sign; when nothing's been updated, including the copyright date, things are going really well or really badly. One wonders. But no, now you're talking about Snyk Cloud, you're talking about several other offerings as well, and my understanding of what it is you folks do no longer appears to be completely accurate. So, let me be direct. What the hell do you folks do over there? Clinton: It's a really great question. Glad you asked me on a year later to answer it. I would say at a very high level, what we do hasn't changed. However, I think the industry has certainly come a long way in the past couple of years and our job is to adapt to that. Snyk—again, pronounced like a pair of sneakers sneaking around—is a developer security platform. So, we focus on enabling the people who build applications—which, as of today, means modern applications built in the cloud—to have better visibility, and ultimately a better chance of mitigating the risk that goes into those applications when it matters most, which is actually in their workflow. Now, you're exactly right. Things have certainly expanded in that remit because the job of a software engineer is very different this year, I think, than it even was last year, and that's continually evolving over time. As a developer now, I'm doing a lot more than I was doing a few years ago. And one of the things I'm doing is building infrastructure in the cloud: I'm writing YAML files, I'm writing CloudFormation templates to deploy things out to AWS.
And what happens in the cloud has a lot to do with the risk to my organization associated with those applications that I'm building. So, I'd love to talk a little bit more about why we decided to make that move, but I don't think that represents a watering down of what we're trying to do at Snyk. I think it recognizes that the developer security vision fundamentally can't exist without some understanding of what's happening in the cloud. Corey: One of the things that always scares me—and sets the spidey sense tingling—is when I see a company who has a product, and I'm familiar—ish—with what they do. And then they take their product name and slap the word cloud at the end, which is almost always code for, “Okay, so we took the thing that we sold in boxes in data centers, and now we're making a shitty hosted version available because it turns out you rubes will absolutely pay a subscription for it.” Yeah, I don't get the sense that that's at all what you're doing. In fact, I don't believe that you're offering a hosted managed service at the moment, are you? Clinton: No, the cloud part fundamentally refers to a new product, an offering that looks at the security risks being introduced into cloud infrastructure by the engineers who are now writing infrastructure as code. We previously had an infrastructure-as-code security product, and that served alongside our static analysis tool, which is Snyk Code, our open-source tool, and our container scanner, recognizing that the kinds of vulnerabilities you can potentially introduce in writing cloud infrastructure are not only bad for the organization on their own—I mean, nobody wants to create an S3 bucket that's wide open to the world—but also, those misconfigurations can increase the blast radius of other kinds of vulnerabilities in the stack. So, I think what it does is it recognizes that, as you and I think your listeners well know, Corey, there's no such thing as the cloud, right?
The cloud is just a bunch of fancy software designed to abstract away from the fact that you're running stuff on somebody else's computer, right? Corey: Unfortunately, in this case, the fact that you're calling it Snyk Cloud does not mean that you're doing what so many other companies in that same space do. That would have led to a really short interview, because I have no faith that it's the right path forward, especially for you folks, where it's, “Oh, you want to be secure? You've got to host your stuff on our stuff instead. That's why we called it cloud.” That's the direction that I've seen a lot of folks try and pivot in, and I always find it disastrous. It's, “Yeah, well, at Snyk if we run your code or your shitty applications here in our environment, it's going to be safer than if you run it yourself on something untested like AWS.” And yeah, those stories hold absolutely no water. And may I just say, I'm gratified that's not what you're doing? Clinton: Absolutely not. No, I would say we have no interest in running anyone's applications. We do want to scan them though, right? We do want to give the developers insight into the potential misconfigurations, the risks, the vulnerabilities that you're introducing. What sets Snyk apart, I think, from others in that application security testing space is we focus on the experience of the developer, rather than just being another tool that runs and generates a bunch of PDFs and then throws them back to say, “Here's everything you did wrong.” We want to say to developers, “Here's what you could do better. Here's how that default in a CloudFormation template that leads to your bucket being, you know, wide open on the internet could be changed.
Here's the remediation that you could introduce.” And if we do that at the right moment, which is inside that developer workflow, inside the IDE, on their local machine, before that gets deployed, there's a much greater chance that remediation is going to be implemented, and it's going to happen much more cheaply, right? Because you no longer have to do the round trip all the way out to the cloud and back. So, the cloud part of it fundamentally means completing that story, recognizing that once things do get deployed, there's a lot of valuable context that's happening out there that a developer can really take advantage of. They can say, “Wait a minute. Not only do I have a Log4Shell vulnerability, right, in one of my open-source dependencies, but that artifact, that application, is actually getting deployed to a VPC that has ingress from the internet,” right? So, not only do I have remote code execution in my application, but it's being put in an enclave that actually allows it to be exploited. You can only know that if you're actually looking at what's really happening in the cloud, right? So, not only does Snyk Cloud allow us to provide an additional layer of security by looking at what's misconfigured in that cloud environment and help your developers make remediations by saying, “Here's the actual IaC file that caused that infrastructure to come into existence,” but we can also say, here's how that affects the risk of other kinds of vulnerabilities at different layers in the stack, right? Because it's all software; it's all connected. Very rarely does a vulnerability translate one-to-one into risk, right? They're compound because modern software is compound. And I think what developers lack is the tooling that fits into their workflow, that understands what it means to be a software engineer, and actually helps them make better choices rather than punishing them after the fact for guessing and making bad ones. Corey: That sounds awesome at a very high level.
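The compound-risk logic Clinton describes (severity only matters when the vulnerable method is reachable and the deployed asset is exposed) can be sketched in a few lines. This is an illustrative toy, not Snyk's actual scoring model; the field names and the 2x exposure weighting are assumptions:

```python
# Illustrative sketch of compound risk prioritization; not Snyk's real engine.
from dataclasses import dataclass

@dataclass
class Finding:
    package: str
    cvss: float              # base severity score of the vulnerability
    method_called: bool      # static analysis: is the vulnerable method reached?
    deployed: bool           # is the artifact actually running anywhere?
    internet_ingress: bool   # cloud context: can traffic reach it from outside?

def priority(f: Finding) -> float:
    """Scale raw severity by exploitability context; 0 means 'noise, not signal'."""
    if not (f.method_called and f.deployed):
        return 0.0           # present but unreachable: don't page anyone
    return f.cvss * (2.0 if f.internet_ingress else 1.0)

findings = [
    Finding("log4j-core", 10.0, method_called=True, deployed=True, internet_ingress=True),
    Finding("markdown-parser", 7.5, method_called=False, deployed=True, internet_ingress=False),
]
for f in sorted(findings, key=priority, reverse=True):
    print(f.package, priority(f))
```

In this sketch, the Log4Shell-style finding floats to the top while the unreachable dependency scores zero instead of adding noise.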
It is very aligned with how executives and decision-makers think about a lot of these things. Let's get down to brass tacks for a second. Assume that I am the type of developer that I am in real life, by which I mean shitty. What am I going to wind up attempting to do that Snyk will flag and, in other words, protect me from myself and warn me that I'm about to commit a dumb? Clinton: First of all, I would say, look, there's no such thing as a non-shitty developer, right? And I built software for 20 years and I decided that's really hard. What's a lot easier is talking about building software for a living. So, that's what I do now. But fundamentally, the reason I'm at Snyk is I want to help people who are in the kinds of jobs that I had for a very long time, which is to say, you have a tremendous amount of anxiety because you recognize that the success of the organization rests on your shoulders, and you're making hundreds, if not thousands, of decisions every day without the right context to understand fully how the results of those decisions are going to affect the organization that you work for. So, I think every developer in the world has to deal with this constant cognitive dissonance of saying, “I don't know that this is right, but I have to do it anyway because I need to clear that ticket because that release needs to get into production.” And it becomes really easy to short-sightedly do things like pull an open-source dependency without checking whether it has any CVEs associated with it, because that's the version that's easiest to implement with your code that already exists. So, that's one piece. Snyk Open Source is designed to traverse that entire tree of dependencies in open-source all the way down, all the hundreds and thousands of packages that you're pulling in, to say not only, here's a vulnerability that you should really know is going to end up in your application when it's built, but also here's what you can do about it, right?
Here's the upgrade you can make, here's the minimum viable change that actually gets you out of this problem, and to do so in the right context, which is, you know, as you're making that decision for the first time, right, inside your developer environment. That also applies to things like container vulnerabilities, right? I have even less visibility into what's happening inside a container than I do inside my application. Because I know, say, I'm using an Ubuntu or a Red Hat base image, but I have no idea what all the Linux packages on it are, let alone what vulnerabilities are associated with them, right? So, being able to detect, I've got a version of OpenSSL 3.0 that has a potentially serious vulnerability associated with it, before I've actually deployed that container out into the cloud, very much helps me as a developer. Because I'm limiting the rework or the refactoring I would have to do by otherwise assuming I'm making a safe choice or guessing at it, and then only finding out after I've written a bunch more code that relies on that decision that I have to go back and change it, and then rewrite all of the things that I wrote on top of it, right? So, it's identifying the layer in the stack where that risk could be introduced, and then also seeing how it's affected by all of those other layers, because modern software is inherently complex. And that complexity is what drives both the risk associated with it, and also things like efficiency, which I know your audience is, for good reason, very concerned about. Corey: I'm going to challenge you on one aspect of this because on the tin, the way you describe it, it sounds like, “Oh, I already have something that does that.
It's the GitHub Dependabot story where it winds up sending me a litany of complaints every week.” And we are talking, if I did nothing other than read this email that day, that would be a tremendously efficient processing of that entire thing, because so much of it is stuff that is ancient and archived, and specific aspects of the vulnerabilities are just not relevant. And you talk about the OpenSSL 3.0 issues that just recently came out. I have no doubt that somewhere in the most recent email I've gotten from that thing, it's buried two-thirds of the way down, among all the complaints: the dishwasher isn't loaded, you forgot to take the trash out, that baby needs a change, the kitchen is on fire, and the vacuuming, and the r—wait, wait. What was that thing about the kitchen? Seems like one of those things is not like the others. And it just gets lost in the noise. Now, I will admit to putting my thumb a little bit on the scale here because I've used Snyk before myself and I know that you don't do that. How do you avoid that trap? Clinton: Great question. And I think really, the key to the story here is, developers need to be able to prioritize, and in order to prioritize effectively, you need to understand the context of what happens to that application after it gets deployed. And so, this is a key part of why getting the data out of the cloud and bringing it back into the code is so important. So, for example, take an OpenSSL vulnerability. Do you have it on a container image you're using, right? So, that's question number one. Question two is, is there actually a way that code can be accessed from the outside? Is it included or is it called? Is the method activated by some other package that you have running on that container? Is that container image actually used in a production deployment? Or does it just go sit in a registry and no one ever touches it? What are the conditions required to make that vulnerability exploitable?
You look at something like Spring Shell, for example: yes, you need a certain version of spring-beans in a JAR file somewhere, but you also need to be running a certain version of Tomcat, and you need to be packaging those JARs inside a WAR in a certain way. Corey: Exactly. I have a whole bunch of Lambda functions that provide the pipeline system that I use to build my newsletter every week, and I get screaming concerns about issues in, for example, a version of the markdown parser that I've subverted. Yeah, sure. I get that, on some level, if I were just giving it random untrusted input from the internet and random ad hoc users, but I'm not. It's just me when I write things for that particular Lambda function. And I'm not going to be actively attempting to subvert the thing that I built myself and no one else should have access to. And looking through the details of some of these things, it doesn't even apply to the way that I'm calling the libraries, so it's just noise, for lack of a better term. It is not something that basically ever needs to be adjusted or fixed. Clinton: Exactly. And I think cutting through that noise is so key to creating developer trust in any kind of tool that is scanning an asset and providing you with what, in theory, is a list of actionable steps, right? I need to be able to understand what the thing is, first of all. There's a lot of tools that do that, right, and we tend to mock them by saying things like, “Oh, it's just another PDF generator. It's just another thousand pages that you're never going to read.” So, getting the information in the right place is a big part of it, but so is filtering out all of the noise by saying, we looked at not just one layer of the stack, but multiple layers, right? We know that you're using this open-source dependency and we also know that the method that contains the vulnerability is actively called by your application in your first-party code, because we ran our static analysis tool against that.
Furthermore, we know, because we looked at your cloud context and connected to your AWS API—we're big partners with AWS and very proud of that relationship—that there's inbound internet access available to that service, right? So, you start to build a compound case that maybe this is something that should be prioritized, right? Because there's a way into the asset from the outside world, there's a way into the vulnerable functions through the labyrinthine, you know, spaghetti of my code to get there, and the conditions required to exploit it actually exist in the wild. But you can't just run a single tool; you can't just run Dependabot to get that prioritization. You actually have to look at the entire holistic application context, which includes not just your dependencies, but what's happening in the container, what's happening in your first-party, your proprietary code, what's happening in your IaC, and I think most importantly for modern applications, what's actually happening in the cloud once it gets deployed, right? And that's sort of the holy grail of completing that loop to bring the right context back from the cloud into code to understand what change needs to be made, and where, and most importantly why. Because it's a priority that actually translates into organizational risk that gets a developer to pay attention, right? I mean, that is the key to, I think, any security concern: how do you get engineering mindshare and trust that this is actually what you should be paying attention to and not a bunch of rework that doesn't actually make your software more secure? Corey: One of the challenges that I see across the board is that—well, let's back up a bit here. I have in previous episodes talked in some depth about my position that when it comes to the security of various cloud providers, Google is number one, and AWS is number two. Azure is a distant third because it's busy figuring out which crayons taste the best; I don't know.
But the reason is not any inherent attribute of their security models, but rather that Google massively simplifies an awful lot of what happens. It automatically assumes that resources in the same project should be able to talk to one another, so I don't have to painstakingly configure that. In AWS-land, all of this must be done explicitly; no one has time for that, so we over-scope permissions massively and never go back and rein them in. It's a configuration vulnerability more than an underlying inherent weakness of the platform. Because complexity is the enemy of security in many respects. If you can't fit it all in your head to reason about it, how can you understand the security ramifications of it? AWS offers a tremendous number of security services. Many of them, when taken in some totality of their pricing, cost more than any breach they could be expected to prevent. Adding more stuff that adds more complexity in the form of Snyk sounds like it's the exact opposite of what I would want to do. Change my mind. Clinton: I would love to. I would say, fundamentally, I think you and I—and by ‘I,' I mean Snyk and, you know, Corey Quinn Enterprises Limited—fundamentally have the same enemy here, right, which is the cyclomatic complexity of software, right, which is how many different pathways the bits have to travel down to reach the same endpoint, right, the same goal. The more pathways there are, the more risk is introduced into your software, and the more inefficiency is introduced, right? And then, I know you'd love to talk about how many different ways there are to run a container on AWS, right? It's either 30 or 400 or eleventy-million. I think you're exactly right that that complexity is great for, first of all, selling cloud resources, but also, I think, for innovating, right, for building new kinds of technology on top of that platform. The cost that comes along with that is a lack of visibility.
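Corey's over-scoping point, in concrete terms: the first statement below is the path of least resistance under deadline, and the second is what reining it in looks like. This is a hypothetical example (the bucket name and actions are illustrative); in practice only the narrow statement would ship:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "OverScopedUnderDeadline",
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": "*"
    },
    {
      "Sid": "WhatTheWorkloadActuallyNeeds",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::example-app-data/*"
    }
  ]
}
```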
And I think we are just now, as we approach the end of 2022 here, coming to recognize that fundamentally, the complexity of modern software is beyond the ability of a single engineer to understand. And that is really important from a security perspective, from a cost control perspective, especially because software now creates its own infrastructure, right? You can't just now secure the artifact and secure the perimeter that it gets deployed into and say, “I've done my job. Nobody can breach the perimeter and there's no vulnerabilities in the thing because we scanned it and that thing is immutable forever because it's pets, not cattle.”

Where I think the complexity story comes in is to recognize, like, “Hey, I'm deploying this based on a quickstart or CloudFormation template that is making certain assumptions that make my job easier,” right, in a very similar way that choosing an open-source dependency makes my job easier as a developer because I don't have to write all of that code myself. But what it does mean is I lack the visibility into, well, hold on. How many different pathways are there for getting things done inside this dependency? How many other dependencies are brought on board? In the same way, when I create an EKS cluster, for example, from a CloudFormation template, what is it creating in the background? How many VPCs are involved? What are the subnets, right? How are they connected to each other? Where are the potential ingress points?

So, I think fundamentally, getting visibility into that complexity is step number one, but understanding those pathways and how they could potentially translate into risk is critically important. But that prioritization has to involve looking at the software holistically and not just individual layers, right?
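The "what does this template actually create?" question above has a cheap first answer: inventory the Resources section before deploying. The template below is a toy stand-in for an EKS quickstart, with invented logical names; real templates are far larger, which is exactly the visibility problem.

```python
def inventory(template: dict) -> dict[str, list[str]]:
    """Group logical resource names by type so you can see, at a glance,
    what a quickstart template actually creates."""
    out: dict[str, list[str]] = {}
    for name, res in template.get("Resources", {}).items():
        out.setdefault(res["Type"], []).append(name)
    return out


template = {
    "Resources": {
        "ClusterVPC": {"Type": "AWS::EC2::VPC"},
        "PublicSubnetA": {"Type": "AWS::EC2::Subnet"},
        "PublicSubnetB": {"Type": "AWS::EC2::Subnet"},
        "Cluster": {"Type": "AWS::EKS::Cluster"},
    }
}

for rtype, names in sorted(inventory(template).items()):
    print(rtype, names)
```

This only answers "what exists"; answering "how is it connected" and "where are the ingress points" means also walking the properties and references, which is where dedicated tooling earns its keep.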
I think we lose when we say, “We ran a static analysis tool and an open-source dependency scanner and a container scanner and a cloud config checker, and they all came up green, therefore the software doesn't have any risks,” right? That ignores the fundamental complexity in that all of these layers are connected together. And from an adversary's perspective, if my job is to go in and exploit software that's hosted in the cloud, I absolutely do not see the application model that way.

I see it as inherently complex, and that's a good thing for me because it means I can rely on the fact that those engineers had tremendous anxiety, were making a lot of guesses, and were crossing their fingers hoping something would work and not be exploitable by me, right? So, the only way I think we get around that is to recognize that our engineers are critical stakeholders in that security process, and you fundamentally lack that visibility if you don't do your scanning until after the fact, if you take that traditional audit-based approach that assumes a very waterfall, legacy way of building software. We have to recognize that, hey, we're all on this infinite-loop race track now. We're deploying every three-and-a-half seconds, everything's automated, it's all built at scale, but the ability to do that inherently implies all of this additional complexity that ultimately will, you know, end up haunting me, right, if I don't do anything about it, if I don't make my engineers stakeholders in, you know, what actually gets deployed and what risks it brings on board.

Corey: This episode is sponsored in part by our friends at Uptycs. Attackers don't think in silos, so why would you have siloed solutions protecting cloud, containers, and laptops distinctly? Meet Uptycs - the first unified solution that prioritizes risk across your modern attack surface—all from a single platform, UI, and data model. Stop by booth 3352 at AWS re:Invent in Las Vegas to see for yourself and visit uptycs.com.
That's U-P-T-Y-C-S.com. My thanks to them for sponsoring my ridiculous nonsense.

Corey: When I wind up hearing you talk about this—I'm going to divert us a little bit because you're dancing around something that it took me a long time to learn. When I first started fixing AWS bills for a living, I thought that it would be mostly math, by which I mean arithmetic. That's the great secret of cloud economics. It's addition, subtraction, and occasionally multiplication and division. No, turns out it's much more psychology than it is math. You're talking in many aspects about, I guess, what I'd call the psychology of a modern cloud engineer and how they think about these things. It's not a technology problem. It's a people problem, isn't it?

Clinton: Oh, absolutely. I think it's the people who create the technology. And I think the longer you persist in what we would call the legacy viewpoint, right, not recognizing what the cloud is—which is fundamentally just software all the way down, right? It is abstraction layers that allow you to ignore the fact that you're running stuff on somebody else's computer—once you recognize that, you realize, oh, if it's all software, then the problems that it introduces are software problems that need software solutions, which means that it must involve activity by the people who write software, right? So, now that you're in that developer world, it unlocks, I think, a lot of potential to say, well, why don't developers tend to trust the security tools they've been provided with, right?

I think a lot of it comes down to the question you asked earlier in terms of the noise, the lack of understanding of how those pieces are connected together, or the lack of context, or not even, frankly, caring about looking beyond the single-point solution to the problem that tool was designed to solve.
But more importantly than that, not recognizing what it's like to build modern software, right, all of the decisions that have to be made on a daily basis with very limited information, right? I might not even understand where that container image I'm building is going in the universe, let alone what's being built on top of it and how much critical customer data is being touched by the database that that container now has the credentials to access, right? So, I think in order to change anything, we have to back way up and say problems in the cloud are software problems, and we have to treat them that way.

Because if we don't, if we continue to represent the cloud as some evolution of the old environment where you just have this perimeter of pre-existing infrastructure that you're deploying things onto, and there's a guy with a neckbeard in the basement who is unplugging cables from a switch and plugging them back in and that's how networking problems are solved, I think you miss the idea that all of these abstraction layers introduce the very complexity that needs to be solved back in the build space. But that requires visibility into what actually happens when it gets deployed. The way I tend to think of it is, there's this firewall in place. Everybody wants to say, you know, we're doing DevOps or we're doing DevSecOps, right? And that's a lie a hundred percent of the time, right? No one is actually, I think, adhering completely to those principles.

Corey: That's why one of the core tenets of ClickOps is lying about doing anything in the console.

Clinton: Absolutely, right? And that's why shadow IT becomes more and more prevalent the deeper you get into modern development, not less and less prevalent, because it's fundamentally hard to recognize the entirety of the potential implications, right, of a decision that you're making. So, it's a lot easier to just go in the console and say, “Okay, I'm going to deploy one EC2 to do this.
I'm going to get it right at some point.” And that's why every application that's ever been produced by human hands has a comment in it that says something like, “I don't know why this works but it does. Please don't change it.”

And then three years later, because that developer has moved on to another job, someone else comes along and looks at that comment and says, “That should really work. I'm going to change it.” And they do and everything fails, and they have to go back and fix it the original way and then add another comment saying, “Hey, this person above me, they were right. Please don't change this line.” I think every engineer listening right now knows exactly where that weak spot is in the applications that they've written, and they're terrified of that.

And I think any tool that's designed to help developers fundamentally has to get into the mindset, get into the psychology of what that is, like, of not fundamentally being able to understand what those applications are doing all of the time, but having to write code against them anyway, right? And that's what leads to, I think, the fear that you're going to get woken up because your pager is going to go off at 3 a.m. because the building is literally on fire and it's because of code that you wrote. We have to solve that problem, and it has to be those people whose psychology we get into, to understand: how are you working, and how can we make your life better, right? And I really do think it comes with the noise reduction, the understanding of complexity, and really just being humble and saying, like, “We get that this job is really hard and that the only way it gets better is to begin admitting that to each other.”

Corey: I really wish that there were a better way to articulate a lot of these things. This is the reason that I started doing a security newsletter; it's because cost and security are deeply aligned in a few ways.
One of them is that you care about them a lot right after you've failed to care about them sufficiently, but the other is that you've got to build guardrails in such a way that doing the right thing is easier than doing it the wrong way, or you're never going to gain any traction.

Clinton: I think that's absolutely right. And you use the key term there, which is guardrails. And I think that's where, in their heart of hearts, that's where every security professional wants to be, right? They want to be defining policy, they want to be understanding the risk posture of the organization and nudging it in a better direction, right? They want to be talking up to the board, to the executive team, and creating confidence in that risk posture, rather than talking down or off to the side—depending on how that org chart looks—to the engineers and saying, “Fix this, fix that, and then fix this other thing.” A, B, and C, right?

I think the problem is that everyone in a security role at an organization of any size at this point is doing 90% of the latter and only about 10% of the former, right? They're acting as gatekeepers, not as guardrails. They're not defining policy; they're spending all of their time creating Jira tickets and all of their time tracking down who owns the piece of code that got deployed to this pod on EKS that's throwing all these errors on my console, and how can I get the person to make a decision to actually take an action that stops these notifications from happening, right? So, all they're doing is throwing footballs down the field without knowing if there's a receiver there, right, and I think that takes away from the job that our security analysts really should be doing, which is creating those guardrails, which is having confidence that the policy they set is readily understood by the developers making decisions, and that's happening in an automated way without them having to create friction by bothering people all the time.
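The guardrail-versus-gatekeeper distinction above is, in practice, policy-as-code: the security team declares the policy once and a check in CI enforces it automatically, instead of filing tickets after the fact. A minimal sketch, with a made-up two-rule policy and a toy resource model:

```python
# Policy declared once by the security team; evaluated on every build.
POLICY = {
    "require_encryption": True,
    "forbid_public_ingress_ports": {22, 3389},
}


def evaluate(resource: dict) -> list[str]:
    """Return human-readable violations for one declared resource."""
    violations = []
    if POLICY["require_encryption"] and not resource.get("encrypted", False):
        violations.append(f"{resource['name']}: encryption at rest is required")
    for port in resource.get("public_ports", []):
        if port in POLICY["forbid_public_ingress_ports"]:
            violations.append(f"{resource['name']}: port {port} must not be public")
    return violations


# An unencrypted database with SSH open to the world trips two rules.
print(evaluate({"name": "db-1", "encrypted": False, "public_ports": [22, 443]}))
```

The developer sees the violation (and ideally the fix) at build time, and the security team never had to open a ticket; that is the guardrail posture in miniature.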
I don't think security people want to be [laugh] hated by the development teams that they work with, but they are. And the reason they are is, I think, fundamentally, that we lack the tooling, we lack—

Corey: They are the barrier method.

Clinton: Exactly. And we lack the processes to get the right intelligence in a way that's consumable by the engineers when they're doing their job, and not after the fact, which is typically when the security people have done their jobs.

Corey: It's sad but true. I wish that there were a better way to address these things, and yet here we are.

Clinton: If only there were a better way to address these things.

Corey: [laugh].

Clinton: Look, I wouldn't be here at Snyk if I didn't think there were a better way, and I wouldn't be coming on shows like yours to talk to the engineering communities, right, people who have walked the walk, right, who have built those Terraform files that contain these misconfigurations, not because they're bad people or because they're lazy, or because they don't do their jobs well, but because they lacked the visibility, they didn't have the understanding that that default is actually insecure. Because how would I know that otherwise, right? I'm building software; I don't see myself as an expert on infrastructure, right, or on Linux packages or on cyclomatic complexity or on any of these other things. I'm just trying to stay in my lane and do my job. It's not my fault that the software has become too complex for me to understand, right?

But my management doesn't understand that, and so I constantly have white knuckles worrying that, you know, the next breach is going to be my fault. So, I think the way forward really has to be: how do we make our developers stakeholders in the risk that the software they write introduces to the organization?
And that means everything we've been talking about: it means prioritization; it means understanding how the different layers of the stack affect each other, especially the cloud pieces; it means an extensible platform that lets me write code against it to inject my own reasoning, right? The piece that we haven't talked about here is that risk calculation doesn't just involve technical aspects; there's also business intelligence involved, right? What are my critical applications, right, what actually causes me to lose significant amounts of money if those services go offline?

We at Snyk can't tell that. We can't run a scanner to say these are your crown jewel services that can't ever go down, but you can know that as an organization. So, where we're going with the platform is opening up the extensible process, creating APIs for you to be able to affect that risk triage, right, so that, as the creators of guardrails, as the security team, you are saying, “Here's how we want our developers to prioritize.
Here are all of the factors that go into that decision-making.” And then you can be confident that in their environment, back over in developer-land, when I'm looking at IntelliJ or, you know, at my local command line, I am seeing the guardrails that my security team has set for me, and I am confident that I'm fixing the right thing, and frankly, I'm grateful because I'm fixing it at the right time and I'm doing it in such a way and with a toolset that actually is helping me fix it rather than just telling me I've done something wrong, right, because everything we do at Snyk focuses on identifying the solution, not necessarily identifying the problem.

It's great to know that I've got an unencrypted S3 bucket, but it's a whole lot better if you give me the line of code and tell me exactly where I have to copy and paste it so I can go on to the next thing, rather than spending an hour trying to figure out, you know, where I put that line and what I actually have to change it to, right? I often say that the most valuable currency for a developer, for a software engineer, it's not money, it's not time, it's not compute power or anything like that, it's the right context, right? I actually have to understand what the implications are of the decision that I'm making, and I need that to be in my own environment, not after the fact, because that's what creates friction within an organization: when I could have known earlier and I could have known better, but instead I had to guess, I had to write a bunch of code that relies on the thing that was wrong, and now I have to redo it all for no good reason other than that the tooling just hadn't adapted to the way modern software is built.

Corey: So, one last question before we wind up calling it a day here.
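For the unencrypted-S3-bucket example above, the "give me the exact fix" answer is a single API call. The sketch below builds the encryption configuration and shows the boto3 call commented out, since running it requires AWS credentials and a real bucket; the bucket name is a placeholder.

```python
def default_encryption_config(algorithm: str = "AES256") -> dict:
    """Build the ServerSideEncryptionConfiguration payload that turns on
    default encryption for a bucket (SSE-S3 here; 'aws:kms' also works)."""
    return {
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": algorithm}}
        ]
    }


config = default_encryption_config()

# The actual remediation, left commented so the sketch runs anywhere:
# import boto3
# boto3.client("s3").put_bucket_encryption(
#     Bucket="my-bucket",  # placeholder name
#     ServerSideEncryptionConfiguration=config,
# )
print(config)
```

That is the shape of "identify the solution, not just the problem": the finding arrives with the payload already written.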
We are now heavily into what I will term pre:Invent, where we're starting to see a whole bunch of announcements come out of the AWS universe in preparation for what I'm calling Crappy Cloud Hanukkah this year because I'm spending eight nights in Las Vegas. What are you doing these days with AWS specifically? I know I keep seeing your name in conjunction with their announcements, so there's something going on over there.

Clinton: Absolutely. No, we're extremely excited about the partnership between Snyk and AWS. Our vulnerability intelligence is utilized as one of the data sources for AWS Inspector, particularly around open-source packages. We're doing a lot of work around things like the code suite, building Snyk into CodePipeline, for example, to give developers using that code suite earlier visibility into those vulnerabilities. And really, I think the story kind of expands from there, right?

So, we're moving forward with Amazon, recognizing that it is, you know, sort of the de facto. When we say cloud, very often we mean AWS. So, we're going to have a tremendous presence at re:Invent this year; I'm going to be there as well. I think we're actually going to have a bunch of handouts with your face on them, is my understanding. So, please stop by the booth; we would love to talk to folks, especially because we've now released the Snyk Cloud product and really completed that story. So, anything we can do to talk about how that additional context of the cloud helps engineers, because it's all software all the way down, those are absolutely conversations we want to be having.

Corey: Excellent. And we will, of course, put links to all of these things in the [show notes 00:35:00] so people can simply click, and there they are. Thank you so much for taking all this time to speak with me. I appreciate it.

Clinton: All right. Thank you so much, Corey. Hope to do it again next year.

Corey: Clinton Herget, Field CTO at Snyk.
I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment telling me that I'm being completely unfair to Azure, along with your favorite tasting color of crayon.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
In several American states, they are still counting away after Tuesday's midterm elections. And it is still unclear who can call themselves the election's winner. But we already have a loser, says Politiken's international commentator Michael Jarlner: Donald Trump. The ex-president, who on Tuesday is expected to announce that he is running in the 2024 presidential election. What might it mean for Donald Trump's chances that several of the candidates he backed in the midterms did poorly? And do the Republicans still believe he is their best bet for winning back the White House?
Mateus Prado stays with us in this episode to finish, together with João Brito, the simulated interview dynamic, with genuinely interesting questions for getting to know SRE candidates beyond the technical point of view!

To close out his part, Mateus asked João which metrics he would create to monitor a shipping-cost calculation system on an e-commerce platform. And João ended the dynamic by asking Mateus how he handles English while working at a foreign company abroad. Before you hit PLAY, know that both were hired for their fictional SRE roles!

The LINKS for the topics discussed in the show are below:
Kubicast series on Observability with all the market players: https://gtup.me/kubicast-84

The participants' RECOMMENDATIONS are:
The Terminal List - series on Amazon Prime Video
As Cientistas: 50 Mulheres que Mudaram o Mundo (Women in Science) - book by Rachel Ignotofsky
Black Box - book by Gianfranco Beting
An Elegant Puzzle - book by Will Larson
World War II History for Kids: 500 Facts - book by Kelly Milner Halls

ABOUT KUBICAST
Kubicast is a production of Getup, the only Brazilian company 100% focused on and specialized in Kubernetes. All episodes of the podcast are on the Getup website and on the main digital audio platforms. Some of them are also available on YouTube.
What is needed for people, communities and societies to transform? Where can we meet other people and build trust and understanding together? How can we create inner transformation? How can we create a mindset where we can change the world? World-renowned author, speaker and activist Tomas Björkman leads the Ekskäret Foundation. Tomas Björkman explains the need for a collective shift for us all as a global society: “From a historical and material perspective, the majority of humanity is no doubt doing better than ever; but at what cost? We live in an unfolding and ever more urgent climate crisis. And our polarised societies seem guided by ineffectual leadership. My take on the world is that it is no longer enough to simply “solve problems”. The complexity and inter-relationship between problems mean that all our challenges have elements relating to human thought and emotion. We have to re-orientate by better learning how to evolve and adapt. I use the phrase “we have to”, because we are at what physicists term a bifurcation point. Our civilisation will either go forward at a higher level of more complex organisation, or break down. In short: we need to shift our minds and our hearts on a grand collective scale. We need new, deeper ways of relating to ourselves, to each other and to the world. We also need “new maps” of our very existence as the current ones are both antiquated and increasingly revealed as inaccurate. The good news is that humanity has undergone major collective shifts before! My work is dedicated to contributing to increasing our chances of doing that again, while there is (still!) time.” -- The Ekskäret Foundation is a Swedish non-profit organisation which aims to facilitate the co-creation of a more conscious and sustainable society. The Foundation was founded in 2009 with a focus on the connection between life-long personal inner development and growth and societal development.
It has developed spaces, arenas and contexts for inquisitive exploratory meetings, dialogue and lifelong transformative learning. Starting with youth camps, the Foundation offers workshops, programs, conferences and work labs in personal development and social transformation, all with the purpose of increasing our individual and collective capacities to be active co-creators of a more sustainable and conscious society. Read more about Tomas Björkman and the Ekskäret Foundation here: http://ekskaret.se/
From an on-premises server all the way to serverless, by way of EC2 and Kubernetes: the story of Spreaker, one of the largest podcasting platforms in the world, and of how the cloud and a constant focus on team maturity were fundamental to meeting the needs of a business in constant evolution.
Mateus Prado returns to Kubicast to run a mock interview with João Brito for an SRE opening. Unlike what happens in real life, the two raised questions that can actually test candidates' technical knowledge and other skills for a DevOps position.

Before hitting play, try answering the questions they would ask in an interview yourself:
Define SRE without mentioning tools!
How does DNS lookup work?
How would you expose a database to someone outside your network?
What does each part of this HTTP address mean?
What does the message "status 200" mean?
Tell me about a big challenge you had to face in production.
What do you know about our company?

After listening to the podcast, send the show to your HR team to enrich their repertoire of DevOps interview questions! Oh, and if you don't know Mateus yet, check out episode 96 of Kubicast!

IMPORTANT: part two of this show comes out next week!

ABOUT KUBICAST
Kubicast is a production of Getup, a Kubernetes specialist. All episodes of the podcast are on the Getup website and on the main digital audio platforms. Some of them are also available on YouTube.
In this episode, Kubicast had the honor of talking with Edilson Azevedo, the Operations teammate with the most acid humor of all, the guy who hides in the bathroom on party days and who helps us take Kubernetes to the world.

To start on a properly controversial note, we threw him the question: does open source actually make money, or just stickers and t-shirts? Then we went on to discuss the reasons that lead companies to adopt Enterprise vs. Open Source, the differences between running OpenShift vs. OKD, and the good and bad sides of following Enterprise rules. We also talked about experiences at hyped startups that spend their investors' money on parties flowing with gourmet whisky!

The LINKS discussed in the show:
WeCrashed - series about WeWork
No Rules Rules - book
The WeBank case
The 10 main differences between OpenShift and OKD
Ego Is the Enemy (book)

The episode's RECOMMENDATIONS:
Aspirin and caffeine to relax
Troco em Dobro (a film on Netflix)

ABOUT KUBICAST
Kubicast is a production of Getup, a Kubernetes specialist. All episodes of the podcast are on the Getup website and on the main digital audio platforms. Some of them are also available on YouTube.
Water polo as a sporting discipline originated at the end of the 19th century in Great Britain. The Scots consider themselves its forefathers. The first training sessions in the game are said to have taken place at Glasgow swimming pools after swimming lessons. The first rules were drawn up by the Scot William Wilson in 1876, and they were announced by the London Swimming Association. The game "migrated" to Polish lands from Austria-Hungary, and even before the outbreak of World War I it was played in Galicia (mainly in Kraków). After independence was regained, the first water polo clubs began to form, with two centers dominating: Kraków and Lwów. Regular tournaments were organized from 1923, and senior national championships have been held regularly since 1925. In the pioneering era, the tone was set by Kraków's Jutrzenka and Makkabi and by Katowice's EKS and TP Giszowiec. After World War II, the league waited until 1954 to be reactivated. Legia Warszawa dominated at first; then the baton was taken over in turn by Arkonia Szczecin, Stilon Gorzów Wielkopolski, Anilana Łódź, KSZO Ostrowiec Świętokrzyski, ŁSTW Łódź, and WTS "Polonia" Bytom. Water polo nevertheless remains a very little-known sport.
About Shinji
Shinji Kim is the Founder & CEO of Select Star, an automated data discovery platform that helps you to understand & manage your data. Previously, she was the Founder & CEO of Concord Systems, a NYC-based data infrastructure startup acquired by Akamai Technologies in 2016. She led the strategy and execution of Akamai IoT Edge Connect, an IoT data platform for real-time communication and data processing of connected devices. Shinji studied Software Engineering at the University of Waterloo and General Management at Stanford GSB.

Links Referenced:
Select Star: https://www.selectstar.com/
LinkedIn: https://www.linkedin.com/company/selectstarhq/
Twitter: https://twitter.com/selectstarhq

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.

Corey: I come bearing ill tidings.
Developers are responsible for more than ever these days. Not just the code that they write, but also the containers and the cloud infrastructure that their apps run on. Because serverless means it's still somebody's problem. And a big part of that responsibility is app security from code to cloud. And that's where our friend Snyk comes in. Snyk is a frictionless security platform that meets developers where they are - Finding and fixing vulnerabilities right from the CLI, IDEs, Repos, and Pipelines. Snyk integrates seamlessly with AWS offerings like CodePipeline, EKS, ECR, and more! As well as things you're actually likely to be using. Deploy on AWS, secure with Snyk. Learn more at Snyk.co/scream That's S-N-Y-K.co/scream

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while, I encounter a company that resonates with something that I've been doing on some level. In this particular case, that is what's happened here, but the story is slightly different. My guest today is Shinji Kim, who's the CEO and founder at Select Star. And the joke that I was making a few months ago was that Select Stars should have been the name of the Oracle ACE program instead. Shinji, thank you for joining me and suffering my ridiculous, basically amateurish and sophomore database-level jokes because I am bad at databases. Thanks for taking the time to chat with me.

Shinji: Thanks for having me here, Corey. Good to meet you.

Corey: So, Select Star, despite being the only query pattern that I've ever effectively been able to execute from memory, what you do as a company is described as an automated data discovery platform. So, I'm going to start at the beginning with that baseline definition. I think most folks can wrap their heads around what the idea of automated means, but the rest of the words feel like they might mean different things to different people. What is data discovery from your point of view?

Shinji: Sure.
The way that we define data discovery is finding and understanding data. In other words, think about how discoverable your data is in your company today. How easy is it for you to find the datasets, fields, and KPIs of your organization's data? And when you are looking at a table, column, dashboard, or report, how easy is it for you to understand the data underneath? Encompassing all of that is how we define data discovery.

Corey: When you talk about data lurking around the company in various places, that can mean a lot of different things to different folks. For the more structured data folks—which I tend to think of as the organized folks who are nothing like me—that tends to mean things that live inside of, for example, traditional relational databases or things that closely resemble that. I come from a grumpy old sysadmin perspective, so I'm thinking, oh, yeah, we have a Jira server in the closet and that thing's logging to its own disk, so that's going to be some information somewhere. Confluence is another source of data in an organization; it's usually where insight and knowledge of what's going on go to die. It's one of those write once, read never type of things.

And when I start thinking about what data means, it feels like even that is something of a squishy term. From the perspective of where Select Star starts and stops, is it bounded to data that lives within relational databases? Does it go beyond that? Where does it start? Where does it stop?

Shinji: So, we started the company with the intention of increasing the discoverability of data and hence providing automated data discovery capability to organizations. And the part where we see this as the most effective is where the data is currently being consumed today. So, this is, like, where the data consumption happens.
So, this can be a data warehouse or data lake, but this is where your data analysts, data scientists are querying data, they are building dashboards, reports on top of, and this is where your main data mart lives.So, for us, that is primarily a cloud data warehouse today, which usually has a relational data structure. On top of that, we also do a lot of deep integrations with BI tools. So, that includes tools like Tableau, Power BI, Looker, Mode. Wherever these queries from the business stakeholders, BI engineers, data analysts, data scientists run, this is the point of reference we use to auto-generate documentation, data models, lineage, and usage information, to give it back to the data team and everyone else so that they can learn more about the dataset they're about to use.Corey: So, given that I am seeing an increased number of companies out there talking about data discovery, what is it that Select Star does that differentiates you folks from other folks using similar verbiage in how they describe what they do?Shinji: Yeah, great question. There are many players that are popping up, and also, traditional data catalogs are definitely starting to offer more features in this area. The main differentiator that we have in the market today, we call it fast time-to-value. Any customer that is starting with Select Star, they get to set up their instance within 24 hours, and they'll be able to get all the analytics and data models, including column-level lineage, popularity, ER diagrams, and how other people are—top users and how other people are utilizing that data, like, literally in a few hours, max to, like, 24 hours. And I would say that is the main differentiator.And most of the customers have pointed out that setup and getting started has been super easy, which is primarily backed by a lot of automation that we've created underneath the platform. On top of that, just making it super easy and simple to use.
It becomes very clear to the users that it's not just for the technical data engineers and DBAs to use; this is also designed for business stakeholders, product managers, and ops folks to start using as they are learning more about how to use data.Corey: Mapping this a little bit toward the use cases that I'm the most familiar with, the big source of data that I tend to stumble over is customer AWS bills. And that's not exactly a big data problem, given that it can fit in memory if you have a sufficiently exciting computer, but we use Tableau to wind up slicing and dicing that because at some point, Excel falls down. From my perspective, the problem with Excel is that it doesn't tend to work on huge datasets very well, and from the position of Salesforce, the problem with Excel is that it doesn't cost a giant pile of money every month. So, those two things combined, Tableau is the answer for what we do. But that's sort of the end-all for us of, that's where it stops.At that point, we have dashboards that we build and queries that we run that spit out the thing we're looking at, and then that goes back to inform our analysis. We don't inherently feed that back into anything else that would then inform the rest of what we do. Now, for our use case, that probably makes an awful lot of sense because we're here to help our customers with their billing challenges, not take advantage of their data to wind up informing some giant model and mispurposing that data for other things. But if we were generating that data ourselves as a part of our operation, I can absolutely see the value of tying that back into something else. You wind up almost forming a reinforcing cycle that improves the quality of data over time and lets you understand what's going on there.
What are some of the outcomes that you find that customers get to by going down this particular path?Shinji: Yeah, so just to double-click on what you just talked about, the way that we see this is how we analyze the metadata and the activity logs—system logs, user logs—of how that data has been used. So, as part of our auto-generated documentation for each table, each column, each dashboard, you're going to be able to see the full data lineage: where it came from, how it was transformed in the past, and where it's going to. You will also see what we call popularity score: how many unique users are utilizing this data inside the organization today, and how often. And utilizing these two core models and the analysis that we create, you can start by first mapping out the data flow, and then determining whether or not this dataset is something that you would want to continue keeping or running the data pipelines for. Because once you start mapping these usage models of tables versus dashboards, you may find that there are recurring jobs that create all these materialized views and tables that are feeding dashboards that are not being looked at anymore.So, with this mechanism—looking initially at data lineage as a concept—a lot of companies use data lineage in order to find dependencies: what is going to break if I make this change in the column or table, as well as just debugging any issues that are currently happening in their pipeline. So, especially when you have to debug a SQL query or pipeline that you didn't build yourself but you need to find out how to fix it, this is a really easy way to instantly find out, like, where the data is coming from.
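The dependency question Shinji describes—what breaks downstream if a column or table changes—boils down to a walk over a lineage graph. Here is a minimal sketch in Python; the table and dashboard names, and the graph itself, are hypothetical illustrations, not Select Star's actual data model:

```python
from collections import deque

# Hypothetical lineage: each table maps to the downstream
# tables and dashboards that are built from it.
downstream = {
    "raw_events": ["stg_events"],
    "stg_events": ["daily_active_users", "revenue_summary"],
    "daily_active_users": ["exec_dashboard"],
    "revenue_summary": ["exec_dashboard", "finance_dashboard"],
}

def impacted_by(table: str) -> set[str]:
    """Return everything downstream of `table` (BFS over the lineage graph)."""
    seen, queue = set(), deque([table])
    while queue:
        for child in downstream.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

On this toy graph, `impacted_by("stg_events")` surfaces both summary tables and both dashboards—the blast radius you would want to see before touching the staging table.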
But on top of that, if you start adding this usage information, you can trace through where the main compute is happening, which large raw table is still being queried instead of the more summarized tables that should be used, versus which tables and datasets are continuing to get created, feeding the dashboards, and whether those dashboards are actually being used on the business side. So, with that, we have customers that have saved thousands of dollars every month just by being able to deprecate dashboards and pipelines that they were afraid of deprecating in the past because they weren't sure if anyone was actually using them or not. But adopting Select Star was a great way to kind of do a full spring clean of their data warehouse as well as their BI tool. And this is an additional benefit of just being able to declutter so many old, duplicated, and outdated dashboards and datasets in their data warehouse.Corey: That is, I guess, a recurring problem that I see in many different pockets of the industry as a whole. You see it in the user visibility space, you see it in the cost control space—I even made a joke about Confluence that alludes to it—this idea that you build a whole bunch of dashboards and use it to inform all kinds of charts and other systems, but then people are busy. It feels like there's no ‘and then.' Like, one of the most depressing things in the universe that you can see after having spent a fair bit of effort to build up those dashboards is the analytics for who internally has looked at any of those dashboards since the demo you gave showing it off to everyone else. It feels like in many cases, we put all these projects and amounts of effort into building these things out that then don't get used.People don't want to be informed by data; they want to shoot from their gut. Now, sometimes that's helpful when we're talking about observability tools that you use to trace down outages, and, “Well, our site's really stable.
We don't have to look at that.” Very awesome, great, awesome use case. The business insight level of dashboard just feels like that's something you should really be checking a lot more than you are. How do you see that?Shinji: Yeah, for sure. I mean, this is why we also update these usage metrics and lineage every 24 hours for all of our customers automatically, so it's just up-to-date. And the part that more customers are asking for, where we are heading to—earlier, I mentioned that our main focus has been on analyzing data consumption and understanding the consumption behavior to drive better usage of your data, or making data usage much easier. The part that we are starting to now see is more customers wanting to extend those capabilities to their stack, to where the data is being generated. So, connecting a similar level of analysis and metadata collection for production databases, Kafka queues, and wherever the data is first being generated is one of our longer-term goals. And then you'll really have more of that, up to the source level, of whether the data should even be collected or whether it should even enter the data warehouse phase or not.Corey: One of the challenges I see across the board in the data space is that so many products tend to have a very specific point of the customer lifecycle, where bringing them in makes sense. Too early and it's, “Data? What do you mean data? All I have are these logs, and their purpose is basically to inflate my AWS bill because I'm bad at removing them.” And on the other side, it's, “Great. We pioneered some of these things and have built our own internal enormous system that does exactly what we need to do.” It's like, “Yes, Google, you're very smart. Good job.” And most people are somewhere between those two extremes. Where are customers on that lifecycle or timeline when using Select Star makes sense for them?Shinji: Yeah, I think that's a great question.
The best time and place for customers to start using Select Star is after they have their cloud data warehouse set up. Either they have finished their migration, or they're starting to utilize it with their BI tools, and they're starting to notice that it's not just, like, you know, ten to fifty tables that they're starting with; most of them have many hundreds of tables. And they're feeling that this is starting to go out of control because we have all this data, but we are not a hundred percent sure what exactly is in our database. And this usually just happens more in larger companies, companies at a thousand-plus employees, and they usually find a lot of value out of Select Star right away because, like, we will start pointing out many different things.But we also see a lot of, like, forward-thinking, fast-growing startups that are at the size of a few hundred employees, you know, they now have a five-to-ten-person data team, and they are really creating the right single source of truth of their data knowledge through Select Star. So, I think you can start anywhere from when your data team size is, like, beyond five and you're continuing to grow, because every time you're trying to onboard a data analyst, data scientist, you will have to go through, like, basically the same type of training of your data model, and it might actually look different because the data models and the new features, new apps that you're integrating change so quickly. So, I would say it's important to have that base early on and then continue to grow. But we do also see a lot of companies coming to us after having thousands of datasets or tens of thousands of datasets, when it's really, like, very hard to operate and onboard anyone.
And this is a place where we really shine to help their needs, as well.Corey: Sort of the, “I need a database,” to the, “Help, I have too many databases,” pipeline, where [laugh] at some point people start to—wanting to bring organization to the chaos. One thing I like about your model is that you don't seem to be making the play that every other vendor in the data space tends to, which is, “Oh, we want you to move your data onto our systems. The end.” You operate on data that is in place, which makes an awful lot of sense for the kinds of things that we're talking about. Customers are flat out not going to move their data warehouse over to your environment, just because the data gravity is ludicrous. Just the sheer amount of money it would take to egress that data from a cloud provider, for example, is monstrous.Shinji: Exactly. [laugh]. And security concerns. We don't want to be liable for any of the data—and this is, like, a very specific decision we've made very early on in the company—to not access data, to not egress any of the real data, and to provide as much value as possible just utilizing the metadata and logs. And depending on the types of data warehouses, it also can be really efficient because the query history or the metadata system tables are indexed separately. Usually, it's a much lighter load on the compute side. And that definitely has, like, worked to our advantage, especially being a SaaS tool.Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig secures your cloud from source to run. They believe, as do I, that DevOps and security are inextricably linked. If you wanna learn more about how they view this, check out their blog, it's definitely worth the read. To learn more about how they are absolutely getting it right from where I sit, visit Sysdig.com and tell them that I sent you. That's S Y S D I G.com.
And my thanks to them for their continued support of this ridiculous nonsense.Corey: What I like is just how straightforward the integrations are. It's clear you're extraordinarily agnostic as far as where the data itself lives. You integrate with Google's BigQuery, with Amazon Redshift, with Snowflake, and then on the other side of the world with Looker, and Tableau, and other things as well. And one of the example use cases you give is find the upstream table in BigQuery that a Looker dashboard depends on. That's one of those areas where I see something like that, and, oh, I can absolutely see the value of that.I have two or three DynamoDB tables that drive my newsletter publication system that I built—because I have deep-seated emotional problems and I take it out on everyone else via code—but as a small, contained system that I can still fit in my head. Mostly. And I still forget which table is which in some cases. Down the road, especially at scale, “Okay, where is the actual data source that's informing this because it doesn't necessarily match what I'm expecting,” is one of those incredibly valuable bits of insight. It seems like that is something that often gets lost; the provenance of data doesn't seem to survive.And ideally, you know, you're staffing a company with reasonably intelligent people who are going to look at the results of something and say, “That does not align with my expectations. I'm going to dig.” As opposed to the, “Oh, yeah, that seems plausible. I'll just go with whatever the computer says.” There's an ocean of nuance between those two, but it's nice to be able to establish the validity of the path that you've gone down in order to set some of these things up.Shinji: Yeah, and this is also super helpful if you're tasked to debug a dashboard or pipeline that you did not build yourself. Maybe the person has left the company, or maybe they're out-of-office, but this dashboard has been broken and you're quote-unquote, “On call,” for data.
What are you going to do? Without a tool that can show you a full lineage, you will have to start digging through somebody else's SQL code and try to map out, like, where the data is coming from and whether it's calculating correctly. It usually takes, you know, a few hours to just get to the bottom of the issue. And this is one of the main use cases that our customers bring up every single time, as more of, like, this is now the go-to place every time there is any data question or data issue.Corey: The first and golden rule of cloud economics is step one, turn that shit off.Shinji: [laugh].Corey: When people are using something, you can optimize the hell out of it however you want, but nothing's going to beat turning it off. One challenge is when we're looking at various accounts and we see a Redshift cluster, and it's, “Okay. That thing's costing a few million bucks a year and no one seems to know anything about it.” They keep pointing to other teams, and it turns into this giant, like, finger-pointing exercise where no one seems to have responsibility for it. And very often, our clients will choose not to turn that thing off because on the one hand, if you don't turn it off, you're going to spend a few million bucks a year that you otherwise would not have had to.On the other, if you delete the data warehouse, and it turns out, oh, yeah, that was actually kind of important, now we don't have a company anymore. It's a question of which is the side you want to be wrong on. And on some level, leaving something as it is and doing something else is always a more defensible answer, just because the first time your cost-saving exercises take out production, you're generally not allowed to save money anymore. This feels like it helps get to that source of truth a heck of a lot more effectively than tracing individual calls and turning into basically data center archaeologists.Shinji: [laugh]. Yeah, for sure.
I mean, this is why, from the get-go, we try to give you all your tables, all of your database, just ordered by popularity. So, you can also see overall, like, from all the tables, whether that's thousands or tens of thousands, you're seeing the most used, with the most dependencies, at the top, and you can also filter it by all the database tables that haven't been touched in the last 90 days. And just having this, like, high-level view gives a lot of ideas to the data platform team about how they can optimize usage of their data warehouse.Corey: From where I tend to sit, an awful lot of customers are still relatively early in their data journey. An awful lot of the marketing that I receive from various AWS mailing lists that I found myself on because I've had the temerity to open accounts has been along the lines of oh, data discovery is super important, but first, they presuppose that I've already bought into this idea that oh, every company must be a completely data-driven company. The end. Full stop.And yeah, we're a small bespoke services consultancy. I don't necessarily know that that's the right answer here. But then it takes it one step further and starts to define the idea of data discovery as, ah, you will use it to find PII or otherwise sensitive or restricted data inside of your datasets so you know exactly where it lives. And sure, okay, that's valuable, but it also feels like a very narrow definition compared to how you view these things.Shinji: Yeah. Basically, the way that we see data discovery is it's starting to become more of an essential capability in order for you to monitor and understand how your data is actually being used internally. It basically gives you the insights around, sure, like, what are the duplicated datasets, which datasets have descriptions or not, which ones may contain sensitive data, so on and so forth, but that's still around the characteristics of the physical datasets.
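The popularity-ordered, 90-day-stale view Shinji describes at the start of that answer can be approximated from nothing but a query log, consistent with the metadata-only approach mentioned earlier. A hypothetical sketch (the log records and cutoff are invented, not Select Star's implementation):

```python
from datetime import datetime, timedelta

# Hypothetical query-log records: (table, user, time of query).
query_log = [
    ("orders", "ana", datetime(2024, 6, 1)),
    ("orders", "bo", datetime(2024, 6, 10)),
    ("legacy_export", "bo", datetime(2023, 11, 2)),
]

def table_stats(log, now):
    """Rank tables by unique-user count; flag ones untouched for 90+ days."""
    stats = {}
    for table, user, ts in log:
        users, last = stats.setdefault(table, (set(), ts))
        users.add(user)
        stats[table] = (users, max(last, ts))
    cutoff = now - timedelta(days=90)
    return sorted(
        ((t, len(u), last < cutoff) for t, (u, last) in stats.items()),
        key=lambda row: -row[1],  # most popular first
    )
```

Because this works purely off metadata—who queried what, and when—it never touches row-level data, which is the same no-egress design decision discussed a moment ago.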
Whereas I think the part that's really important around data discovery that is not being talked about as much is how the data can actually be used better. So, having it as more of a forward-thinking mechanism, in order for you to actually encourage more people to utilize data or use the data correctly, instead of trying to contain this within just one team, is really where I feel like data discovery can help.And in regards to this, the other big part around data discovery is really opening up and having that transparency just within the data team. So, just within the data team, they always feel like they do have access to the SQL queries and you can just go to GitHub and just look at the database itself, but it's so easy to get lost in the sea of metadata that is laid out as just a list; there isn't much context around the data itself. And that context, along with the analytics of the metadata, is what we're really trying to provide automatically. So eventually, like, this can also be seen as almost like a way to, like, monitor the datasets—like how you're currently monitoring your applications through Datadog or your website with your Google Analytics—this is something that can also be used as more of a go-to source of truth around what the state of your data is, how that's defined, and how that's being mapped to different business processes, so that there isn't much confusion around data. Everything can be called the same, but underneath it actually can mean very different things. Does that make sense?Corey: No, it absolutely does. I think that this is part of the challenge in trying to articulate value that is, I guess, specific to this niche across an entire industry. The context that drives data is going to be incredibly important, and it feels like so much of the marketing in the space is aimed at one or two pre-imagined customer profiles.
And that has the side effect of making customers for whom that model doesn't align feel like they're either doing something wrong, or like the vendor who's pitching this is somewhat out of touch. I know that I work in a relatively bounded problem space, but I still learn new things about AWS billing on virtually every engagement that I go on, just because you always get to learn more about how customers view things and how they view not just their industry, but also the specificities of their own business and their own niche.I think that is one of the challenges historically, with the idea of letting software do everything. Do you find the problems that you're solving tend to be global in nature or are you discovering strange depths of nuance on a customer-by-customer basis at this point?Shinji: Overall, a lot of the problems that we solve and the customers that we work with are very industry-agnostic. As long as you have many different datasets that you need to manage, there are common problems that arise, regardless of the industry that you're in. We do observe some industry-specific issues—because your data is unstructured, or your data is primarily events, or, you know, depending on what the data looks like—but primarily, because most of the BI solutions and data warehouses operate as relational databases, this is a part where we really try to build a lot of best practices, and the common analytics that we can apply to every customer that's using Select Star.Corey: I really want to thank you for taking so much time to go through the ins and outs of what it is you're doing these days. If people want to learn more, where's the best place to find you?Shinji: Yeah, I mean, it's been fun [laugh] talking here. So, we are at selectstar.com. That's our website. You can sign up for a free trial.
It's completely self-service, so you don't need to get on a demo, but, like, we'll also help you onboard and we're happy to give a free demo to whoever is interested.We are also on LinkedIn and Twitter under selectstarhq. Yeah, I mean, we're happy to help any companies that have these issues around wanting to increase their discoverability of data, and want to help their data team and the rest of the company to be able to utilize data better.Corey: And we will, of course, put links to all of that in the [show notes 00:28:58]. Thank you so much for your time today. I really appreciate it.Shinji: Great. Thanks for having me, Corey.Corey: Shinji Kim, CEO and founder at Select Star. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that I won't be able to discover because there are far too many podcast platforms out there, and I have no means of discovering where you've said that thing unless you send it to me.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
Kubicast has reached 100 episodes, and to record this milestone we invited the whole internet plus a few special guests!The idea was to do the show in ASK ME ANYTHING format, but we ended up reminiscing about the most memorable episodes, which gave us plenty of material to get into topics such as running containers on Windows, YAML files, Java jokes (as always), and so on!We thank everyone who took part in this episode and all our other listeners, old and new! It is very gratifying to keep running this podcast, which started in 2018 without much skill for the job, and has only evolved for the better since then!If you have just arrived, welcome! Kubicast is a production of Getup, a company specializing in Kubernetes. All episodes of the podcast are on the Getup website and on the main digital audio platforms. Some of them are recorded on YouTube. The EPISODES REVISITED in this Kubicast:#6 - What NOT to expect from Kubernetes#19 - KubeCon Day 1 - Lightning Talks#35 - The day we recorded with Kelsey Hightower#51 - KubeCon 2020 marathon#60 - Windows Containers#69 - Nomad vs Kubernetes#92 - Kubernetes 1.24 is out!#93 - Inside Tsuru#95 - FOMGO - Fear of missing Gomex#97 - Follow the thread with Leandro DamascenaThe participants' RECOMMENDATIONS:Manifesto (Netflix series)Succession (HBO series)Dentro da Mente de um Gato (Netflix documentary)Ruptura (Apple TV+ series)The Sandman (Netflix series)
We recap a thoroughly solid triumph against Brentford, and ask ourselves whether Xhaka is a full-back, central midfielder, playmaker, and striker all at once. The Swiss player's remontada astonishes us, and we try to sum up an improbable journey. We also take a closer look at the rock-solid collective effort that smothered an otherwise solid home side. What did Partey's comeback mean, and why did we manage so well without Martin and Zinny? Fabio Vieira's dream strike also gets plenty of attention, and we examine the Portuguese's performance and role. On top of that, we run a short segment on the youngest (!) PL debutant of all time, Ethan Nwaneri (15). Finally, it's time for the segment "The Ex-Player", where we fondly reminisce about perhaps the most iconic player of the Emirates era.
My guest this time is Riivo Tuvike, CEO of Tallinn Airport. Riivo has previously worked as the head of Luminor Leasing, as well as in various roles at Nordea Finance. We sat down and batted around thoughts on how his journey to becoming a top executive has unfolded and what, based on his experience, he considers most important in leadership. "At heart, I am a salesman. I started my career as a salesman and I think those character traits are still strongly in me. And I have always been selling something. When talking with different people it may not even seem like I'm selling them something, but in fact I am still selling some ideas or thoughts, or sometimes myself for some reason. That salesman mentality is still in me. I have my own thoughts, ideas, and vision, which I then try to shape and expand by listening to people, but in the end a certain approach of my own still emerges, which I try to sell to others. I suppose I am that salesman type of leader. And that might describe me quite well." – Riivo Tuvike Happy listening ...
https://go.dok.community/slack https://dok.community/ With: Sebastian Glab - Cloud Architect, CloudCasa by Catalogic Martin Phan - Field CTO – Americas, CloudCasa by Catalogic Bart Farrell - Head of Community, Data on Kubernetes Community ABSTRACT OF THE TALK If you are running or planning a multi-cloud or even a multi-cluster environment, there are several considerations in implementing a data protection solution – especially if you plan on an organic home-grown, do-it-yourself option. This talk will highlight challenges and best practices around centralized management of configuration, credentials, and compliance across multiple accounts, regions, providers, etc. We will also highlight the deviations in CSI driver implementations of various storage vendors and cloud providers. Finally, we will cover the various recovery options available in the market today. Kubernetes cloud services are popular since they mitigate, but do not eliminate, the difficulties of operating a Kubernetes environment. This is especially true for protecting the stateful configuration and data of your Kubernetes applications, where the inherent high availability and infrastructure as code are not a substitute for having cloud-native backup and disaster recovery capabilities. Further, many companies now have multi-cloud strategies for their cloud-native applications. These challenges can be addressed with backup applications that are both Kubernetes-managed-service and multi-cloud aware, in order to snapshot, copy, restore, and migrate Kubernetes workloads (resources and data) running on AKS, EKS and GKE. Capturing information from cloud accounts and how the cluster and storage resources are configured allows 1) centralized visibility into all cloud accounts and the clusters and resources in the accounts, including for compliance; 2) cross-account, cross-cluster, and cross-region data restores; 3) automation of the cluster and data restores, including for Dev, Test, and Production recovery use cases.
BIO Sebastian Glab is a Cloud Architect for CloudCasa and he resides in Poland. He is responsible for integrating the different cloud providers with the CloudCasa service, and making sure that all clusters in the cloud service get discovered and protected. In his free time, he plays volleyball and develops his own projects. Martin Phan is the Field CTO in North America for CloudCasa by Catalogic Software. With over 20 years of experience in the software industry, he takes pride in supporting, developing, implementing, and selling enterprise software and data protection solutions to help customers solve their backup and recovery challenges. KEY TAKE-AWAYS FROM THE TALK 1) Challenges and best practices around centralized management of configuration, credentials, and compliance across multiple accounts, regions, providers, etc. 2) Advantages of cloud awareness and Kubernetes managed service awareness for application and data recovery and security 3) Examples of overcoming Container Storage Interface (CSI) deviations 4) Various recovery options available in the market today.
About AllenAllen is a cloud architect at Tyler Technologies. He helps modernize government software by creating secure, highly scalable, and fault-tolerant serverless applications.Allen publishes content regularly about serverless concepts and design on his blog - Ready, Set Cloud!Links Referenced: Ready, Set, Cloud blog: https://readysetcloud.io Tyler Technologies: https://www.tylertech.com/ Twitter: https://twitter.com/allenheltondev LinkedIn: https://www.linkedin.com/in/allenheltondev/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: I come bearing ill tidings. Developers are responsible for more than ever these days. Not just the code that they write, but also the containers and the cloud infrastructure that their apps run on.
Because serverless means it's still somebody's problem. And a big part of that responsibility is app security from code to cloud. And that's where our friend Snyk comes in. Snyk is a frictionless security platform that meets developers where they are - Finding and fixing vulnerabilities right from the CLI, IDEs, Repos, and Pipelines. Snyk integrates seamlessly with AWS offerings like CodePipeline, EKS, ECR, and more! As well as things you're actually likely to be using. Deploy on AWS, secure with Snyk. Learn more at Snyk.co/scream That's S-N-Y-K.co/screamCorey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while I wind up stumbling into corners of the internet that I previously had not traveled. Somewhat recently, I wound up having that delightful experience again by discovering readysetcloud.io, which has a whole series of, I guess some people might call it thought leadership, I'm going to call it instead how I view it, which is just amazing opinion pieces on the context of serverless, mixed with APIs, mixed with some prognostications about the future.Allen Helton by day is a cloud architect at Tyler Technologies, but that's not how I encountered you. First off, Allen, thank you for joining me.Allen: Thank you, Corey. Happy to be here.Corey: I was originally pointed towards your work by folks in the AWS Community Builder program, in which we both participate from time to time, and it's one of those, “Oh, wow, this is amazing. I really wish I'd discovered some of this sooner.” And every time I look through your back catalog, and I click on a new post, I see things that are either ‘I really agree with this' or ‘I can't stand this opinion, I want to fight about it,' but more often than not, it's one of those recurring moments that I love: “Damn, I wish I had written something like this.” So first, you're absolutely killing it on the content front.Allen: Thank you, Corey, I appreciate that. 
The content that I make is really about the stuff that I'm doing at work. It's stuff that I'm passionate about, stuff that I'd spend a decent amount of time on, and really the most important thing about it for me, is it's stuff that I'm learning and forming opinions on and want to share with others.Corey: I have to say, when I saw that you were—oh, you're Tyler Technologies, which sounds for all the world like, oh, it's a relatively small consultancy run by some guy presumably named Tyler, and you know, it's a petite team of maybe 20, 30 people on the outside. Yeah, then I realized, wait a minute, that's not entirely true. For example, for starters, you're publicly traded. And okay, that does change things a little bit. First off, who are you people? Secondly, what do you do? And third, why have I never heard of you folks, until now?Allen: Tyler is the largest company that focuses completely on the public sector. We have divisions and products for pretty much everything that you can imagine that's in the public sector. We have software for schools, software for tax and appraisal, we have software for police officers, for courts, everything you can think of that runs the government can be, and a lot of times is, run on Tyler software. We've been around for decades building our expertise in the domain, and the reason you probably haven't heard about us is because you might not have ever been in trouble with the law before. If you [laugh] if you have been—Corey: No, no, I learned very early on in the course of my life—which will come as a surprise to absolutely no one who spent more than 30 seconds with me—that I have remarkably little filter and if ten kids were the ones doing something wrong, I'm the one that gets caught. So, I spent a lot of time in the principal's office, so this taught me to keep my nose clean. I'm one of those squeaky-clean types, just because I was always terrified of getting punished because I knew I would get caught. 
I'm not saying this is the right way to go through life necessarily, but it did have the side benefit of, no, I don't really engage with law enforcement going throughout the course of my life.Allen: That's good. That's good. But one exposure that a lot of people get to Tyler is if you look at the bottom of your next traffic ticket, it'll probably say Tyler Technologies on the bottom there.Corey: Oh, so you're really popular in certain circles, I'd imagine?Allen: Super popular. Yes, yes. And of course, you get all the benefits of writing that code that says ‘if defendant equals Allen Helton then return.'Corey: I like that. You get to have the exception cases built in that no one's ever going to wind up looking into.Allen: That's right. Yes.Corey: The idea of what you're doing makes an awful lot of sense. There's a tremendous need for a wide variety of technical assistance in the public sector. What surprises me, although I guess it probably shouldn't, is how much of your content is aimed at serverless technologies and API design, which to my way of thinking, isn't really something that public sector has done a lot with. Clearly I'm wrong.Allen: Historically, you're not wrong. There's an old saying that government tends to run about ten years behind on technology. Not just technology, but all over the board and runs about ten years behind. And until recently, that's really been true. There was a case last year, a situation last year where one of the state governments—I don't remember which one it was—but they were having a crisis because they couldn't find any COBOL developers to come in and maintain their software that runs the state.And it's COBOL; you're not going to find a whole lot of people that have that skill. A lot of those people are retiring out. And what's happening is that we're getting new people sitting in positions of power and government that want innovation. 
They know about the cloud and they want to be able to integrate with systems quickly and easily, have little to no onboarding time. You know, there are people in power that have grown up with technology and understand that, well, with everything else, I can be up and running in five or ten minutes. I cannot do this with the software I'm consuming now.Corey: My opinion on it is admittedly conflicted because on the one hand, yeah, I don't think that governments should be running on COBOL software that runs on mainframes that haven't been supported in 25 years. Conversely, I also don't necessarily want them being run like a seed series startup, where, “Well, I wrote this code last night, and it's awesome, so off I go to production with it.” Because I can decide not to do business anymore with Twitter for Pets, and I could go on to something else, like PetFlicks, or whatever it is I choose to use. I can't easily opt out of my government. The decisions that they make stick and that is going to have a meaningful impact on my life and everyone else's life who is subject to their jurisdiction. So, I guess I don't really know where I believe the proper, I guess, pace of technological adoption should be for governments. Curious to get your thoughts on this.Allen: Well, you certainly don't want anything that's bleeding edge. That's one of the things that we kind of draw fine lines around. Because when we're dealing with government software, we're dealing with, usually, critically sensitive information. It's not medical records, but it's your criminal record, and it's things like your social security number, it's things that you can't have leaking out under any circumstances. So, the things that we're building on are things that have proven out to be secure and have best practices around security, uptime, reliability, and, in a lot of cases, maintainability as well. 
You know, if there are issues, then let's try to get those turned around as quickly as we can because we don't want to have any sort of downtime from the software side versus the software vendor side.Corey: I want to pivot a little bit to some of the content you've put out because an awful lot of it seems to be, I think I'll call it variations on a theme. For example, I just read some recent titles, and to illustrate my point, “Going API First: Your First 30 Days,” “Solutions Architect Tips: How to Design Applications for Growth,” “3 Things to Know Before Building A Multi-Tenant Serverless App.” And the common thread that I see running through all of these things is that these are things that you tend to have extraordinarily strong and vocal opinions about only after dismissing all of them the first time and slapping something together, and then sort of being forced to live with the consequences of the choices that you've made, in some cases you didn't realize you were making at the time. Are you one of those folks that has the wisdom to see what's coming down the road, or did you do what the rest of us do and basically learn all this stuff by getting it hilariously wrong and having to careen into rebound situations as a result?Allen: [laugh]. I love that question. I would like to say now, I feel like I have the vision to see something like that coming. Historically, no, not at all. Let me talk a little bit about how I got to where I am because that will shed a lot of context on that question.A few years ago, I was put into a position at Tyler that said, “Hey, go figure out this cloud thing.” Let's figure out what we need to do to move into the cloud safely, securely, quickly, all that rigmarole. And so, I did. I got to hand-select a team of engineers from people that I worked with at Tyler over the past few years, and we were basically given free rein to learn. 
We were an R&D team, a hundred percent R&D, for about a year's worth of time, where we were learning about cloud concepts and theory and building little proof of concepts: CI/CD, serverless, APIs, multi-tenancy, a whole bunch of different stuff. NoSQL was another one of the things that we had to learn. And after that year of R&D, we were told, “Okay, now go do something with that. Go build this application.” And we did, building on our cursory theory knowledge. And we got pretty close to go-live, and then the business says, “What do you do in this scenario? What do you do in that scenario? What do you do here?”Corey: “I update my resume and go work somewhere else. Where's the hard part here?”Allen: [laugh].Corey: Turns out, that's not a convincing answer.Allen: Right. So, we moved quickly. And then I wouldn't say we backpedaled, but we hardened for a long time before the—prior to the go-live, with the lessons that we've learned with the eyes of Tyler, the mature enterprise company, saying, “These are the things that you have to make sure that you take into consideration in an actual production application.” One of the things that I always pushed—I was a manager for a few years of all these cloud teams—I always push do it; do it right; do it better. Right? It's kind of like crawl, walk, run. And if you follow my writing from the beginning, just looking at the titles and reading them, kind of like what you were doing, Corey, you'll see that very much. You'll see how I talk about CI/CD, you'll see how I talk about authorization, you'll see how I talk about multi-tenancy. And I kind of go in waves where maybe a year passes and you see my content revisit some of the topics that I've done in the past. And they're like, “No, no, no, don't do what I said before. 
It's not right.”Corey: The problem when I'm writing all of these things that I do, for example, my entire newsletter publication pipeline is built on a giant morass of Lambda functions and API Gateways. It's microservices-driven—kind of—and each microservice is built, almost always, with a different framework. Lately, all the new stuff is CDK. I started off with the serverless framework. There are a few other things here and there. And it's like architecting back in time as I have to make updates to these things from time to time. And the problem with having done all that myself is that I already know the answer to, “What fool designed this?” It's, well, you're basically watching me learn what I was doing, bit by bit. I'm starting to believe that the right answer, on some level, is to build an inherent shelf-life into some of these things. Great, in five years, you're going to come back and re-architect it now that you know how this stuff actually works rather than patching together 15 blog posts by different authors, not all of whom are talking about the same thing, and hoping for the best.Allen: Yep. That's one of the things that I really like about serverless—I view it as a giant pro of doing serverless—that when we revisit with the lessons learned, we don't have to refactor everything at once like if it was just a big, you know, MVC controller out there in the sky. We can refactor one Lambda function at a time if now we're using a new version of the AWS SDK, or we've learned about a new best practice that needs to go in place. 
It's a, “While you're in there, tidy up, please,” kind of deal.Corey: I know that the DynamoDB fanatics will absolutely murder me over this one, but one of the reasons that I have multiple Dynamo tables that contain, effectively, variations on the exact same data, is because I want to have the dependency between the two different microservices be the API, not, “Oh, and under the hood, it's expecting this exact same data structure all the time.” But it just felt like that was the wrong direction to go in. That is the justification I use for myself why I run multiple DynamoDB tables that [laugh] have the same content. Where do you fall on the idea of data store separation?Allen: I'm a big single-table design person myself; I really like the idea of being able to store everything in the same table and being able to create queries that can return me multiple different types of entity with one lookup. Now, that being said, one of the issues that we ran into, or one of the ambiguous areas when we were getting started with serverless was, what does single-table design mean when you're talking about microservices? We were wondering: does single-table mean one DynamoDB table for an entire application that's composed of 15 microservices? Or is it one table per microservice? And that was ultimately what we ended up going with: a table per microservice. Even if multiple microservices are pushed into the same AWS account, we're still building that logical construct of a microservice and one table that houses similar entities in the same domain.Corey: So, something I wish that every service team at AWS would do as a part of their design is draw the architecture of an application that you're planning to build. Great, now assume that every single resource on that architecture diagram lives in its own distinct AWS account because somewhere in some customer, there's going to be an account boundary at every interconnection point along the way. 
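The single-table-per-microservice pattern Allen describes can be sketched in miniature. This is a hedged, in-memory illustration of the key-design idea only (the entity names and key formats here are hypothetical, not anything Tyler actually runs, and plain Python stands in for real DynamoDB calls): multiple entity types share one table, distinguished by composite partition and sort keys, so one lookup returns several entity types at once.

```python
# Minimal in-memory sketch of DynamoDB-style single-table design.
# One "table" per microservice; multiple entity types share it,
# distinguished by a composite partition key (pk) and sort key (sk).

def put_item(table, pk, sk, **attrs):
    table.append({"pk": pk, "sk": sk, **attrs})

def query(table, pk, sk_prefix=""):
    """Emulates a Query with pk = :pk AND begins_with(sk, :prefix)."""
    return [i for i in table if i["pk"] == pk and i["sk"].startswith(sk_prefix)]

orders_table = []  # the single table for a hypothetical "orders" microservice

# An order and its line items live under the same partition key...
put_item(orders_table, "ORDER#1001", "METADATA", status="open")
put_item(orders_table, "ORDER#1001", "ITEM#1", sku="ABC", qty=2)
put_item(orders_table, "ORDER#1001", "ITEM#2", sku="XYZ", qty=1)
# ...while a different order lives in a different partition.
put_item(orders_table, "ORDER#1002", "METADATA", status="shipped")

# One lookup returns the order *and* its items: two entity types, one query.
everything = query(orders_table, "ORDER#1001")
just_items = query(orders_table, "ORDER#1001", sk_prefix="ITEM#")
print(len(everything), len(just_items))  # 3 2
```

The boundary between microservices stays the API, as Corey wants; the shared key schema is an internal detail of one service's table rather than a contract between services.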
And so, many services don't do that where it's, “Oh, that thing and the other thing has to be in the same account.” So, people have to write their own integration shims, and it makes doing the right thing of putting different services into distinct bounded AWS accounts for security or compliance reasons way harder than I feel like it needs to be.Allen: [laugh]. Totally agree with you on that one. That's one of the things that I feel like I'm still learning about is the account-level isolation. I'm still kind of early on, personally, with my opinions in how we're structuring things right now, but I'm very much of the opinion that deploying multiple things into the same account is going to make it too easy to do something that you shouldn't. And I just try not to inherently trust people, in the sense that, “Oh, this is easy. I'm just going to cross that boundary real quick.”Corey: For me, it's also come down to security risk exposure. Like my lasttweetinaws.com Twitter shitposting thread client lives in a distinct AWS account that is separate from the AWS account that has all of our client billing data that lives within it. The idea being that if you find a way to compromise my public-facing Twitter client, great, the blast radius should be constrained to, “Yay, now you can, I don't know, spin up some cryptocurrency mining in my AWS account and I get to look like a fool when I beg AWS for forgiveness.” But that should be the end of it. It shouldn't be a security incident because I should not have the credit card numbers living right next to the funny internet web thing. That sort of flies in the face of the original guidance that AWS gave at launch. And right around the 2008 era, best practices were one customer, one AWS account. And then by 2012, they had changed their perspective, but once you've made a decision to build multiple services in a single account, unwinding and unpacking that becomes an incredibly burdensome thing. 
It's about the equivalent of doing a cloud migration, in some ways.Allen: We went through that. We started off building one application with the intent that it was going to be a siloed application, a one-off, essentially. And about a year into it, it's one of those moments of, “Oh, no. What we're building is not actually a one-off. It's a piece to a much larger puzzle.” And we had a whole bunch of—unfortunately—tightly coupled things that were in there that were assuming that resources were going to be in the same AWS account. So, we ended up—how long—I think we took probably two months, which in the grand scheme of things isn't that long, but two months, kind of unwinding the pieces and decoupling what was possible at the time into multiple AWS accounts, kind of, segmented by domain, essentially. But that's hard. As AWS puts it, you know, it's those one-way door decisions. I think this one was a two-way door, but it locked and you could kind of jimmy the lock on the way back out.Corey: And you could buzz someone from the lobby to let you back in. Yeah, the biggest problem is not necessarily the one-way door decisions. It's the one-way door decisions that you don't realize you're passing through at the time that you do them. Which, of course, brings us to a topic near and dear to your heart—and I only recently started having opinions on this myself—and that is the proper design of APIs, which I'm sure will incense absolutely no one who's listening to this. Like, my opinions on APIs start with, well, probably REST is the right answer in this day and age. I had people, like, “Well, I don't know, GraphQL is pretty awesome.” Like, “Oh, I'm thinking SOAP,” and people look at me like I'm a monster from the Black Lagoon of centuries past in XML-land. 
So, my particular brand of strangeness aside, what do you see people doing in the world of API design that are the, I guess, most common or easiest mistakes to make that you really wish they would stop doing?Allen: If I could boil it down to one word, fundamentalism. Let me unpack that for you.Corey: Oh, please, absolutely want to get a definition on that one.Allen: [laugh]. I approach API design from a developer experience point of view: how easy is it for both internal and external integrators to consume and satisfy the business processes that they want to accomplish? And a lot of times, REST guidelines, you know, it's all about entity basis, you know, drill into the appropriate entities and name your endpoints with nouns, not verbs. I'm actually very much onto that one. But something that you could easily do, let's say you have a business process that, given a fundamentally correct RESTful API design, takes ten API calls to satisfy. You could, in theory, boil that down to maybe three well-designed endpoints that aren't, quote-unquote, “RESTful,” that make that developer experience significantly easier. And if you were a fundamentalist, that option is not even on the table, but thinking about it pragmatically from a developer experience point of view, that might be the better call. So, that's one of the things that, I know, feels like a hot take. Every time I say it, I get a little bit of flack for it, but don't be a fundamentalist when it comes to your API designs. 
Do something that makes it easier while staying in the guidelines to do what you want.Corey: For me, the problem that I've kept smacking into with API design, and it honestly—let me be very clear on this—my first real exposure to API design, rather than as an API consumer—which, of course, I complain about constantly, especially in the context of AWS's inconsistent APIs between services—was when I'm building something out, and I'm reading the documentation for API Gateway, and oh, this is how you wind up having this stage linked to this thing, and here's the endpoint. And okay, great, so I would just populate—build out a structure or a schema that has the positional parameters I want to use as variables in my function. And that's awesome. And then I realized, “Oh, I might want to call this a different way. Aw, crap.” And sometimes it's easy; you just add a different endpoint. Other times, I have to significantly rethink things. And I can't shake the feeling that this is an entire discipline that exists that I just haven't had a whole lot of exposure to previously.Allen: Yeah, I believe that. One of the things that you could tie a metaphor to for what I'm saying and kind of what you're saying, is AWS SAM, the Serverless Application Model; all it does is basically macro-expand CloudFormation resources. It's just a transform from a template into CloudFormation. CDK does the same thing. But what the developers of SAM have done is they've recognized these business processes that people do regularly, and they've made these incredibly easy ways to satisfy those business processes and tie them all together, right? If I want to have a Lambda function that is backed behind an endpoint, an API endpoint, I just have to add four or five lines of YAML or JSON that says, “This is the event trigger, here's the route, here's the API.” And then it goes and does four, five, six different things. Now, there's some engineers that don't like that because sometimes that feels like magic. 
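That "magic" is roughly this much template. This is a hedged sketch of the SAM shorthand being described (the function name, handler path, and route are hypothetical); the few lines under `Events` are what expand into the API Gateway API, the route, the invoke permissions, and the wiring to the function when the template is transformed into CloudFormation.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  GetWidgetFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: src/get-widget.handler
      Runtime: nodejs18.x
      Events:
        GetWidgetApi:        # this one event block is the "four or five lines"
          Type: Api
          Properties:
            Path: /widgets/{widgetId}
            Method: GET
```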
Sometimes a little bit magic is okay.Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig secures your cloud from source to run. They believe, as do I, that DevOps and security are inextricably linked. If you wanna learn more about how they view this, check out their blog, it's definitely worth the read. To learn more about how they are absolutely getting it right from where I sit, visit Sysdig.com and tell them that I sent you. That's S Y S D I G.com. And my thanks to them for their continued support of this ridiculous nonsense.Corey: I feel like one of the benefits I've had with the vast majority of APIs that I've built is that because this is all relatively small-scale stuff for what amounts to basically shitposting for the sake of entertainment, I'm really the only consumer of an awful lot of these things. So, I get frustrated when I have to backtrack and make changes and teach other microservices to talk to this thing that has now changed. And it's frustrating, but I have the capacity to do that. It's just work for a period of time. I feel like that equation completely shifts when you have published this and it is now out in the world, and it's not just users, but in many cases paying customers where you can't really make those changes without significant notice, and every time you do you're creating work for those customers, so you have to be a lot more judicious about it.Allen: Oh, yeah. There is a whole lot of governance and practice that goes into production-level APIs that people integrate with. You know, they say once you push something out the door into production that you're going to support it forever. I don't disagree with that. That seems like something that a lot of people don't understand.And that's one of the reasons why I push API-first development so hard in all the content that I write is because you need to be intentional about what you're letting out the door. 
You need to go in and work, not just with the developers, but your product people and your analysts to say, what does this absolutely need to do, and what does it need to do in the future? And you take those things, and you work with analysts who want specifics, you work with the engineers to actually build it out. And you're very intentional about what goes out the door that first time because once it goes out with a mistake, you're either going to version it immediately or you're going to make some people very unhappy when you make a breaking change to something that they immediately started consuming.Corey: It absolutely feels like that's one of those things that AWS gets astonishingly right. I mean, I had the privilege of interviewing, at the time, Jeff Barr and then Ariel Kelman, who was their head of marketing, to basically debunk a bunch of old myths. And one thing that they started talking about extensively was the idea that an API is fundamentally a promise to your customers. And when you make a promise, you'd better damn well intend on keeping it. It's why API deprecations from AWS are effectively unique whenever something happens.It's the, this is a singular moment in time when they turn off a service or degrade old functionality in favor of new. They can add to it, they can launch a V2 of something and then start to wean people off by calling the old one classic or whatnot, but if I built something on AWS in 2008 and I wound up sleeping until today, and go and try and do the exact same thing and deploy it now, it will almost certainly work exactly as it did back then. Sure, reliability is going to be a lot better and there's a crap ton of features and whatnot that I'm not taking advantage of, but that fundamental ability to do that is awesome. Conversely, it feels like Google Cloud likes to change around a lot of their API stories almost constantly. 
And it's unplanned work that frustrates the heck out of me when I'm trying to build something stable and lasting on top of it.Allen: I think it goes to show the maturity of these companies as API companies versus just vendors. It's one of the things that I think AWS does [laugh]—Corey: You see a similar dichotomy with Microsoft and Apple. Microsoft's new versions of Windows generally still have functionalities in them to support stuff that was written in the '90s for a few use cases, whereas Apple's like, “Oh, your computer's more than 18 months old? Have you tried throwing it away and buying a new one? And oh, it's a new version of Mac OS, so yeah, maybe the last one would get security updates for a year and then get with the times.” And I can't shake the feeling that the correct answer is in some way, both of those, depending upon who your customer is and what it is you're trying to achieve. If Microsoft adopted the Apple approach, their customers would mutiny, and rightfully so; the expectation has been set for decades that isn't what happens. Conversely, if Apple decided now we're going to support this version of Mac OS in perpetuity, I don't think a lot of their application developers would quite know what to make of that.Allen: Yeah. I think it also comes from a standpoint of you better make it worth their while if you're going to move their cheese. I'm not a Mac user myself, but from what I hear from Mac users—and this could be rose-colored glasses—their stuff works phenomenally well. You know, when a new thing comes out—Corey: Until it doesn't, absolutely. It's—whenever I say things like that on this show, I get letters. And it's, “Oh, yeah, really? They'll come up with something that is a colossal pain in the ass on Mac.” Like, yeah, “Try building a system-wide mute key.” It's, yeah, that's just a hotkey away on Windows and here in Mac land. It's, “But it makes such beautiful sounds. 
Why would you want them to be quiet?” And it's, yeah, it becomes this back-and-forth dichotomy there. And you can even extend it to iPhones as well and the Android ecosystem where it's, oh, you're going to support the last couple of versions of iOS. Well, as a developer, I don't want to do that. And Apple's position is, “Okay, great.” Almost half of the mobile users on the planet will be upgrading because they're in the ecosystem. Do you want to be able to sell things to those people or not? And they're at a point of scale where they get to dictate those terms. On some level, there are benefits to it; on others, it is intensely frustrating. I don't know what the right answer is on the level of permanence on that level of platform. I only have slightly better ideas around the position of APIs. I will say that when AWS deprecates something, they reach out individually to affected customers, on some level, and invariably, when they say, “This is going to be deprecated as of August 31,” or whenever it is, yeah, it is going to slip at least twice in almost every case, just because they're not going to turn off a service that is revenue-bearing or critical-load-bearing for customers without massive amounts of notice and outreach, and in some cases according to rumor, having engineers reach out to help restructure things so it's not as big of a burden on customers. That's a level of customer focus that I don't think most other companies are capable of matching.Allen: I think that comes with the size and the history of Amazon. And one of the things that they're doing right now, we've used Amazon Cloud Cams for years, in my house. We use them as baby monitors. And they—Corey: Yeah, I saw this. I did something very similar with Nest. They didn't have the Cloud Cam at the right time that I was looking at it. And they just announced that they're going to be deprecating them. They're withdrawing them from sale. They're not going to support them anymore. 
Which, oh, at Amazon—we're not offering this anymore. But you tell the story; what are they offering existing customers?Allen: Yeah, so I'm slightly upset about it because I like my Cloud Cams and I don't want to have to take them off the wall or wherever they are to replace them with something else. But what they're doing is, you know, they gave me—or they gave all the customers about eight months' head start. I think they're going to be taking them offline around Thanksgiving this year, just mid-November. And what they said is as compensation for you, we're going to send you a Blink Cam—a Blink Mini—for every Cloud Cam that you have in use, and then we are going to gift you a year's subscription to the Pro for Blink.Corey: That's very reasonable for things that were bought years ago. Meanwhile, I feel like not to be unkind or uncharitable here, but I use Nest Cams. And that's a Google product. I half expect if they ever get deprecated, I'll find out because Google just turns it off in the middle of the night—Allen: [laugh].Corey: —and I wake up and have to read a blog post somewhere that they put an update on Nest Cams, the same way they killed Google Reader once upon a time. That's slightly unfair, but the fact that that joke even lands does say a lot about Google's reputation in this space.Allen: For sure.Corey: One last topic I want to talk with you about before we call it a show is that at the time of this recording, you recently had a blog post titled, “What does the Future Hold for Serverless?” Summarize that for me. Where do you see this serverless movement—if you'll forgive the term—going?Allen: So, I'm going to start at the end. I'm going to work back a little bit on what needs to happen for us to get there. I have a feeling that in the future—I'm going to be vague about how far in the future this is—that we'll finally have a satisfied promise of all you're going to write in the future is business logic. And what does that mean? 
I think what can end up happening, given the right focus, the right companies, the right feedback, at the right time, is we can write code as developers and have that get pushed up into the cloud. And a phrase that I know Jeremy Daly likes to say is ‘infrastructure from code,' where it provisions resources in the cloud for you based on your use case. I've developed an application and it gets pushed up into the cloud at the time of deployment, with optimized resource allocation. Over time, what will happen—with my future vision—is when you get production traffic going through, maybe it's spiky, maybe it's consistently at a scale that outperforms the resources that it originally provisioned. We can have monitoring tools that analyze that and pick that out, find the anomalies, find the standard patterns, and automatically adjust the infrastructure it deployed for you, optimizing it based on your actual production traffic. Which is something that you can't do on an initial deployment right now. You can put what looks best on paper, but once you actually get traffic through your application, you realize that, you know, what was on paper might not be correct.Corey: You ever notice that whiteboard diagrams never show the reality, and they're always aspirational, and they miss certain parts? And I used to think that this was the symptom I had from working at small, scrappy companies because you know what, those big tech companies, everything they build is amazing and awesome. I know it because I've seen their conference talks. But I've been a consultant long enough now, and for a number of those companies, to realize that nope, everyone's infrastructure is basically a trash fire at any given point in time. And it works almost in spite of itself, rather than because of it. There is no golden path where everything is shiny, new and beautiful. And that, honestly, I got to say, it was really [laugh] depressing when I first discovered it. 
Like, oh, God, even these really smart people—so intelligent they have to have extra brain packs bolted to their chests—don't have the magic answer to all of this. The rest of us are just screwed, then. But we find ways to make it work.

Allen: Yep. There's a quote—I wish I remembered who said it—but it was a military quote: “No battle plan survives first contact with the enemy.” It's kind of that way with infrastructure diagrams. We can draw it out however we want, and then you turn it on in production. It's like, “Oh, no. That's not right.”

Corey: I want to mix the metaphors there and say, yeah, no architecture survives your first fight with a customer. Like, “Great, I don't think that's quite what they're trying to say.” It's like, “What, you don't attack your customers? Pfft, what does your customer service line look like?” Yeah, it's… I think you're onto something. I think that inherently everything beyond the V1 design of almost anything is an emergent property: this is what we learned about it by running it and putting traffic through it and finding these problems, and here's how it wound up evolving to account for that.

Allen: I agree. I don't have anything to add on that.

Corey: [laugh]. Fair enough. I really want to thank you for taking so much time out of your day to talk about how you view these things. If people want to learn more, where is the best place to find you?

Allen: Twitter is probably the best place to find me: @AllenHeltonDev. I have that username on all the major social platforms, so if you want to find me on LinkedIn, same thing: AllenHeltonDev. My blog is always open as well, if you have any feedback you'd like to give there: readysetcloud.io.

Corey: And we will, of course, put links to that in the show notes. Thanks again for spending so much time talking to me. I really appreciate it.

Allen: Yeah, this was fun. This was a lot of fun. I love talking shop.

Corey: It shows.
And it's nice to talk about things I don't spend enough time thinking about. Allen Helton, cloud architect at Tyler Technologies. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that I will reject because it was not written in valid XML.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
With some of his Getup colleagues, João Brito discusses the most significant changes in Kubernetes 1.25. Among them are the final removal of PSP (PodSecurityPolicy), the deprecation of GlusterFS support, and the demise of Autoscaling v2beta1. There's also the arrival, still in alpha, of the Linux user namespaces feature (not the Kubernetes kind of namespace, mind you!) and the graduation to stable of the Pod Security Admission and Local Ephemeral Storage Capacity Isolation features.

Another piece of news is that PDB (Pod Disruption Budget) moves to its default version, which is why we recommend keeping production deployments with at least two replicas so you don't get a headache during an upgrade, for example.

While going over the new release, the crew also talked about the pros and cons of working with a managed cluster vs. an on-premises cluster, and whether there's any feature gate they miss in a production cluster.

LINKS to what was discussed on the show:
Article by Karol Valencia of Aqua Security: https://blog.aquasec.com/kubernetes-version-1.25
KubiLab - Adonai Costa's video tutorial on KEDA: https://gtup.me/KubilabKeda

The participants' RECOMMENDATIONS:
One Punch Man (manga)
Attack on Titan (manga series)
The Sandman (film)
Narradores de Javé (film)
Five Days at Memorial (series on Apple TV+)

INVITATION! We're approaching Kubicast #100 and we're going to celebrate this milestone in a very special way! In an “ASK ME ANYTHING” format, the audience will be able to ask all their questions about Kubernetes and related topics! Sign up to participate: https://getup.io/participe-do-kubicast-100. The event takes place on September 15 at 7 p.m. on Zoom.

ABOUT KUBICAST
Kubicast is a production of Getup, a Kubernetes specialist. All podcast episodes are on the Getup website and on the main digital audio platforms. Some of them are recorded on YouTube. #DevOps #Kubernetes #Containers #Kubicast
This week we discuss VMware Explore, Snap's move to multi-cloud, and the Galaxy Brain take on thought leadership. Plus, Matt Ray's latest Raspberry Pi project is for the birds…?

Runner-up Titles
Where's my admin?
All my children qualify as adults
Start by eating their food
Put two letters in front of it
Where's the grocery store
I got that everything bagel spice
Is it OK to hang up on your kids?
In the heat of the moment, you can't set policy.
The runbook's already written.
Spaghetti Bowl
Tanzu the Shih Tzu
A FinOps Type of Motion
The opposite of the Sales Kickoff, the Savings Kickoff
Growth is best done in the shadows.
Wrapping bullshit with bullshit
Nopehouse, home of the fast follower
The fast followers are just in front of the also-rans
Thought-leadership suicide mission

Rundown
VMware Explore (https://www.vmware.com/explore/us.html)
How Snap rebuilt the infrastructure that now supports 347M users (https://www.protocol.com/enterprise/snap-microservices-aws-google-cloud)
Screaming in the Cloud with Martin Casado (https://www.lastweekinaws.com/podcast/screaming-in-the-cloud/the-new-cloud-war-with-martin-casado/)
Give finops a say over cloud architecture decisions (https://www.infoworld.com/article/3671148/give-finops-a-say-over-cloud-architecture-decisions.html)
Business Dudes Need to Stop Talking Like This (https://newsletters.theatlantic.com/galaxy-brain/630ec150bcbd490021b17eab/business-dudes-need-to-stop-talking-like-this/)

Relevant to your Interests
Amazon tries a new way to excite you about cybersecurity (it's called laughter) (https://www.zdnet.com/article/amazon-tries-a-new-way-to-excite-you-about-cybersecurity-its-called-laughter/)
The golden noose around Apple's neck (https://spectatorworld.com/topic/the-golden-noose-around-apples-neck/)
Campaign pushes Cloudflare to drop trans hate site (https://www.axios.com/newsletters/axios-login-85e45e2f-8629-43d3-be69-45072a3631f5.html?chunk=0&utm_term=emshare#story0)
Mudge at Twitter
(https://twitter.com/igb/status/1562427951882199044)
Bloomberg takes cut and paste seriously (https://twitter.com/MidwestHedgie/status/1562450905907478531)
Notice of Recent Security Incident - The LastPass Blog (https://blog.lastpass.com/2022/08/notice-of-recent-security-incident/)
World's Most Popular Password Manager Says It Was Hacked (https://www.bloomberg.com/news/articles/2022-08-25/the-world-s-most-popular-password-manager-says-it-was-hacked)
LastPass Says No Passwords Stolen in Data Breach (https://www.cnet.com/tech/services-and-software/lastpass-says-no-passwords-stolen-in-data-breach/)
AWS and Kubecost collaborate to deliver cost monitoring for EKS customers | Amazon Web Services (https://aws.amazon.com/blogs/containers/aws-and-kubecost-collaborate-to-deliver-cost-monitoring-for-eks-customers/)
Pandas Pivot Table Explained (https://pbpython.com/pandas-pivot-table-explained.html)
Charted: Big Tech's bigness (https://www.axios.com/newsletters/axios-login-3db6f78d-4da1-494b-a5d4-04c8984ce0e5.html?chunk=1&utm_term=emshare#story1)
UK's Micro Focus shares nearly double after Canada's OpenText agrees $6 bln takeover (https://www.reuters.com/markets/deals/canadas-opentext-buy-software-firm-micro-focus-6-bln-deal-2022-08-25/)
Teradata takes on Snowflake and Databricks with cloud-native platform (https://venturebeat.com/data-infrastructure/teradata-makes-database-analytics-cloud-native/)
The State of the Mainframe Market - Summer 2022 (https://futurumresearch.com/market-insight-reports/the-state-of-the-mainframe-market-summer-2022/)
City2Surf face recognition raises concerns (https://ia.acs.org.au/content/ia/article/2022/city2surf-face-recognition-raises-concerns.html)
IBM Watson Health layoffs disguised as staff 'redeployment' (https://www.theregister.com/2022/08/29/ibm_allegedly_hid_watson_health/)
David Young on LinkedIn: The metaverse economy is set to boom...
gambling will be a significant (https://www.linkedin.com/posts/david-young-b5276523_metaverse-5g-localisation-activity-6966387069338218496-4x5F?utm_source=share&utm_medium=member_desktop)
OCI History (https://twitter.com/solomonstre/status/1564499775415676928)
VMware CEO bats away Broadcom concerns as 'next transition' (https://www.theregister.com/2022/08/30/vmware_broadcom_/)
Heroku to delete inactive accounts, shut down free tier (https://www.theregister.com/2022/08/25/heroku_delete_inactive_free_tier/)
Cloudflare Is One of the Companies That Quietly Powers the Internet. Researchers Say It's a Haven for Misinformation (https://time.com/6208828/cloudflare-misinformation-internet-research/)

Nonsense
Sounds right (https://twitter.com/6thgrade4ever/status/1433519577892327424?s=20&t=o8cx7C7pcCkVR4cTcQbv4g)
When the development team meet their first Scrum Master (https://twitter.com/onejasonknight/status/1564287640366628866?s=20&t=y3AIxGPb8kge28aICQ6dFQ)
Chart of the year nominee (https://twitter.com/jpwarren/status/1564109454009716736/photo/1)

Conferences
DevOps Talks Sydney (https://devops.talksplus.com/sydney/devops.html), Sydney, September 6-7, 2022
Sydney Cloud FinOps Meetup (https://events.finops.org/events/details/finops-sydney-cloud-finops-presents-sydney-cloud-finops-meetup/), online, Oct 13, 2022, Matt's presenting
KubeCon North America (https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/), Detroit, Oct 24 – 28, 2022
SpringOne Platform
(https://springone.io/?utm_source=cote&utm_medium=podcast&utm_content=sdt), SF, December 6–8, 2022
THAT Conference Texas Call For Counselors (https://that.us/call-for-counselors/tx/2023/), Jan 16-19, 2023

Listener Feedback
Enlightning (https://tanzu.vmware.com/developer/tv/enlightning/) from Whitney

SDT news & hype
Join us in Slack (http://www.softwaredefinedtalk.com/slack).
Get an SDT Sticker! Send your postal address to firstname.lastname@example.org (mailto:email@example.com) and we will send you free laptop stickers!
Follow us on Twitch (https://www.twitch.tv/sdtpodcast), Twitter (https://twitter.com/softwaredeftalk), Instagram (https://www.instagram.com/softwaredefinedtalk/), LinkedIn (https://www.linkedin.com/company/software-defined-talk/) and YouTube (https://www.youtube.com/channel/UCi3OJPV6h9tp-hbsGBLGsDQ/featured).
Use the code SDT to get $20 off Coté's book, Digital WTF (https://leanpub.com/digitalwtf/c/sdt), so $5 total.
Become a sponsor of Software Defined Talk (https://www.softwaredefinedtalk.com/ads)!

Recommendations
Brandon: Black Bird (https://www.rottentomatoes.com/tv/black_bird/s01)
Matt: BirdNetPi (https://birdnetpi.com/), Festival of Feet Half-Marathon (https://www.westiesjoggers.com/the-georges-river-festival-of-the-feet/)
Coté: Spigen ArcDock 120W [GaN III] 4-Port USB C Charging Station USB-C PD/USB-A Hub with Spigen USB 4 Cable for Thunderbolt 4 Cable 100W Charging 40Gbps Data Transfer for MacBook Pro Air iPad USB-C Laptop (https://amzn.to/3RqRl7M). C7/C8 coupler cables

Photo Credits
CoverArt (https://unsplash.com/photos/Ts3yX7wDthw)
Banner (https://unsplash.com/photos/hXttDVCwyRA)
In this episode, we bring in the illustrious Renato Groffe, the guy behind the live streams on Azure, .NET, and software development in general. A Senior Software Engineer and Microsoft MVP, Renato has been with .NET since the very beginning! To explore everything he knows on the subject, we talked about .NET on Kubernetes, Linux containers, the benefits of running Kubernetes on AKS, tracing to find bugs in microservices, and Azure DevOps.

LINKS to what we discussed in the episode:
Kubicast #60 - http://gtup.me/kubicast-60
Renato's LinkedIn - https://www.linkedin.com/in/renatogroffe/
Renato's Medium - https://renatogroffe.medium.com/
Coding Night - https://www.youtube.com/codingnight
Canal .NET - https://www.youtube.com/canaldotnet
GitHub - https://github.com/renatogroffe
Azure na prática - https://azurenapratica.com/ - https://www.youtube.com/azurenapratica

The show's RECOMMENDATIONS:
Following the NBA games
Al-Andalus: The Legacy - documentary on the History channel
Três anúncios para um crime (Three Billboards Outside Ebbing, Missouri) - film streaming on Star+

REMINDER! We're approaching Kubicast #100 and we're going to celebrate this milestone in a special way! Stay tuned!

ABOUT KUBICAST
Kubicast is a production of Getup, a Kubernetes specialist. All podcast episodes are on the Getup website and on the main digital audio platforms. Some of them are recorded on YouTube. #DevOps #Kubernetes #Containers #Kubicast #AzureDevOps #.NET
This episode's guest is the illustrious Leandro Damascena, the biggest gossipmonger in the DevOps world on Twitter! To kick off the conversation, we revisited a thread he wrote about an EKS upgrade saga - from 1.17 to 1.21 - on a production cluster! From there, we talked about capacity planning in Kubernetes, sizing resources for healthy autoscaling, tools that help in the day-to-day life of anyone working with Kubernetes, and Zero-trust policies. This sensational chat is worth checking out; you need to listen and share it with your nerd friends :D

LINKS to what we discussed in the episode:
Leandro's articles on Medium - https://leandrodamascena.medium.com/
Video of people changing a tire on a moving car in Dubai - https://youtu.be/B_1bAnLqlMo

The show's RECOMMENDATIONS:
There will always be someone who needs to know something you already know, so lose the fear and share your knowledge! Start by writing on Medium.
@precisamosassistir - Instagram account showing the behind-the-scenes of movie scenes
Paternidade (Fatherhood) (film on Netflix)
Tricô de Pais (podcast on Spotify)
Podpah with Rodrigo Santoro: https://www.youtube.com/watch?v=kNItlFMdG_c
Black Swan with Natalie Portman (film on Star+)

ATTENTION: We're approaching Kubicast #100 and we're going to celebrate this milestone in a special way! Get ready!

ABOUT KUBICAST
Kubicast is a production of Getup, a Kubernetes specialist. The podcast's episodes are on the Getup website and on the main digital audio platforms. Some of them are recorded on YouTube. #DevOps #Kubernetes #Containers #Kubicast
In this Kubicast episode we had the honor of hosting Mateus Prado - who started his career as an IT instructor and today works as a Senior Technical Account Manager at AWS - to talk about the importance of having basic infrastructure knowledge to work in DevOps, instead of living off improvised hacks.

Sometimes the problem isn't in Kubernetes but in DNS, and you don't know how to interpret the message it's giving you. Likewise, if you don't really know the HTTP protocol, another networking fundamental, you end up lost between status codes you don't understand in depth and API responses, left just praying that they make it from one side to the other. So, as a DevOps practitioner, especially a beginner, your life will be easier if you understand the essential networking concepts, because when things go wrong you need to know how they work in order to fix them, even if nowadays you work with platforms that abstract all of this away!

The episode's LINKS:
LINUXtips DevOps Week with Mateus - https://www.youtube.com/watch?v=aI8FeEhDqoc
Kubicast with Pery - https://gtup.me/kubicast-94
Kubicast with Gomex - https://gtup.me/kubicast-95
LINUXtips training with Mateus on AWS Expert - https://school.linuxtips.io/p/aws-expert
Mateus Prado - CloudOps: Operating AWS at scale - https://www.youtube.com/watch?v=Pv_2zgR9qNw

The show's RECOMMENDATIONS:
Modern Operating Systems - book by Andrew S. Tanenbaum
Unix System Administrator Handbook - the purple-cover book by Evi Nemeth
Debugging Teams: Better Productivity through Collaboration - book by Brian Fitzpatrick and Ben Collins-Sussman
Usability Engineering - book by Jakob Nielsen
Show Me the Numbers - book by Stephen Few
Severance - series on Apple TV+

ATTENTION: We're approaching Kubicast #100 and we're going to celebrate this milestone in a special way! Get ready!

ABOUT KUBICAST
Kubicast is a production of Getup, a Kubernetes specialist.
The podcast's episodes are on the Getup website and on the main digital audio platforms. Some of them are recorded on YouTube. #DevOps #Kubernetes #Containers #Kubicast
We've run Kubernetes inside Redgate for some research projects (like Spawn), and we are building up skills in running this orchestrator. At the same time, we've had no shortage of challenges in keeping the clusters up at times: patching, fixing issues, upgrading to new configurations, etc. Like any software, there is work involved in managing the orchestrator. I've watched Andrew Pruski and Anthony Nocentino write about containers and Kubernetes, and overall they've made me view the clusters like email. It's useful and I want to use it, but I don't want to manage or administer it. I'd want some service like AKS or EKS instead. Let someone else build the expertise. Read the rest of How Hard is Kubernetes?
About Alex

Alex holds a Ph.D. in Computer Science and Engineering from UC San Diego, and has spent over a decade building high-performance, robust data management and processing systems. As an early member of a couple of fast-growing startups, he's had the opportunity to wear a lot of different hats, serving at various times as an individual contributor, tech lead, manager, and executive. He also had a brief stint as a Cloud Economist with the Duckbill Group, helping AWS customers save money on their AWS bills. He's currently a freelance data engineering consultant, helping his clients build, manage, and maintain their data infrastructure. He lives in Los Angeles, CA.

Links Referenced:
Company website: https://bitsondisk.com
Twitter: https://twitter.com/alexras
LinkedIn: https://www.linkedin.com/in/alexras/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: I come bearing ill tidings. Developers are responsible for more than ever these days. Not just the code that they write, but also the containers and the cloud infrastructure that their apps run on. Because serverless means it's still somebody's problem. And a big part of that responsibility is app security from code to cloud. And that's where our friend Snyk comes in. Snyk is a frictionless security platform that meets developers where they are, finding and fixing vulnerabilities right from the CLI, IDEs, repos, and pipelines. Snyk integrates seamlessly with AWS offerings like CodePipeline, EKS, ECR, and more, as well as things you're actually likely to be using. Deploy on AWS, secure with Snyk. Learn more at snyk.co/scream. That's S-N-Y-K.co/scream.

Corey: DoorDash had a problem.
As their cloud-native environment scaled and developers delivered new features, their monitoring system kept breaking down. In an organization where data is used to make better decisions about technology and about the business, losing observability means the entire company loses their competitive edge. With Chronosphere, DoorDash is no longer losing visibility into their applications suite. The key? Chronosphere is an open-source compatible, scalable, and reliable observability solution that gives the observability lead at DoorDash business confidence and peace of mind. Read the full success story at snark.cloud/chronosphere. That's snark.cloud slash C-H-R-O-N-O-S-P-H-E-R-E.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined this week by a returning guest, who… well, it's a little bit complicated and more than a little bittersweet. Alex Rasmussen was a principal cloud economist here at The Duckbill Group until he committed an unforgivable sin. That's right. He gave his notice. Alex, thank you for joining me here, and what have you been up to, traitor?

Alex: [laugh]. Thank you for having me back, Corey.

Corey: Of course.

Alex: At time of recording, I am restarting my freelance data engineering business, which was dormant for the sadly brief time that I worked with you all at The Duckbill Group. And yeah, that's really what I've been up to for the last few days. [laugh].

Corey: I want to be very clear that I am being completely facetious when I say this. When someone is considering, “Well, am I doing what I really want to be doing?” and the answer is no too many days in a row, yeah, you should find something that aligns more with what you want to do. And anyone who's like, “Oh, you're leaving? Traitor, how could you do that?” Yeah, those people are trash.
You don't want to work with trash. I feel I should clarify that this is entirely in jest, and I could not be happier that you are finding things that are more aligned with aspects of what you want to be doing. I am serious when I say that, as a company, we are poorer for your loss. You have been transformative here across a number of different axes that we will be going into over the course of this episode.

Alex: Well, thank you very much, I really appreciate that. And I came to a point where I realized, you know the old saying, “You don't know what you've got till it's gone”? I realized, after about six months of working with Duckbill Group, that I missed building stuff, I missed building data systems, I missed being a full-time data person. And I'm really excited to get back to that work, even though I'll definitely miss working with everybody on the team. So yeah.

Corey: There are a couple of things that I found really notable about your time working with us. One of them was that even when you applied to work here, you were radically different than—well, let's be direct here—than me. We are almost polar opposites in a whole bunch of ways. I have an eighth-grade education; you have a PhD in computer science and engineering from UCSD. And you are super-deep into the world of data, start to finish, whereas I have spent my entire career on things that are stateless, because I am accident-prone, and when you accidentally have a problem with the database, you might not have a company anymore, but we can all laugh as we reprovision the web server fleet.

We just went in very different directions as far as what we found interesting throughout our careers, more or less. And we were not quite sure how it was going to manifest in the context of cloud economics. And I can say, now that we have concluded the experiment, that from my perspective, it went phenomenally well. Because the exact areas that I am weak at are where you excel.
And, on some level, I would say that you're not necessarily as weak in your weak areas as I am in mine, but we wound up reinforcing and complementing each other rather than, “Well, we now have a roomful of four people who are all going to yell at you about the exact same thing.” We all went in different directions, which I thought was really neat.

Alex: I did too. And honestly, I learned a tremendous, tremendous amount in my time at Duckbill Group. I think the window into just how complex and just how vast the ecosystem of services within AWS is, and how they all ping off of each other in these very complicated ways, was really fascinating stuff. But also just an insight into what it takes to get stuff done when you're talking with—you know, most of my clientele to date have been small to medium-sized businesses: as small as two people, as big as a few hundred people. But I wasn't working with Fortune 1000 companies like Duckbill Group regularly does. And an insight into, number one, what it takes to get things done inside of those organizations, but also what it takes to get things done with AWS when you're talking about, for instance, contracts that are tens or hundreds of millions of dollars in total contract value—just what that involves was completely eye-opening for me.

Corey: From my perspective, what I found—I guess, in hindsight, it should have been more predictable than it was—but you talk about having a background and an abiding passion for the world of data, and I'm sitting here thinking, that's great. We have all this data in the form of the Cost and Usage Reports and the bills, and I forgot the old saw that yeah, if it fits in RAM, it's not a big data problem. And yeah, in most cases, what we have tends to fit in RAM. I guess you don't tend to find things interesting until Microsoft Excel gives up and calls uncle.

Alex: I don't necessarily know that that's true.
I think that there are plenty of problems to be had in the it-fits-in-RAM space, precisely because so much of it fits in RAM. And I think that, you know, particularly now—it's a very different world that we live in from the world that we lived in ten years ago, where ten years ago—

Corey: And right now I'm talking to you on a computer with 128 gigs of RAM, and it—

Alex: Well, yeah.

Corey: —that starts to look kind of big data-y.

Alex: Well, not only that, but think about the big data side, right? When you had to provision your own Hadoop cluster, and after six months of weeping tears of blood you managed to get it going, right, at the end of that process you went, “Okay, I've got this big, expensive thing and I need this group of specialists to maintain it all. Now, what the hell do I do?” Right? In the intervening decade, largely due to the just crushing dominance of the public clouds, that problem—I wouldn't call that problem solved, but for all practical purposes, at all reasonable scales, there's a solution that you can just plug in a credit card and buy.

And so now, I think, the challenges become much more high-level than they used to be. We used to talk about how to make a MapReduce job as efficient as it possibly can be made. Nobody really cares about that anymore. You've got a query planner; it executes a query; it'll probably do better than you can. Now, I think the big challenges are starting to be more in the area of, again, “How do I know what I have? How do I know who's touched it recently? How do I fix it when it breaks? How do I even organize an organization that can work effectively with data at petabyte scale and say anything meaningful about it?”

And so, you know, I think that the landscape is shifting. One of the reasons why I love this field so much is that the landscape is shifting very rapidly, and as soon as we think, “Ah yes.
We have solved all of the problems.” Then immediately, there are a hundred new problems to solve.

Corey: For me, I guess one of the most eye-opening things about having you here was your actual computer science background. Historically, we have biased for folks who have come up from the ops side of the world. And that lends itself to a certain understanding. And, yes, I've worked with developers before; believe it or not, I do understand how folks tend to think in that space. I am not a complete naive fool when it comes to these things.

But what I wasn't prepared for was the nature of our internal, relatively casual conversations about a bunch of different things, where we'll be on a Zoom chat or something, and you will just very casually start sharing your screen, fire up a Jupyter notebook, and start writing code as you're talking to explain what it is you're talking about, watching it render in real time. And I'm sitting here going, “Huh, I can't figure out whether we should, like, wind up giving him a raise or try to burn him as a witch.” I could really see it going either way. Because it was magic and transformative from my perspective.

Alex: Well, thank you. I mean, I think part of what I am very grateful for is that I've had an opportunity to spend a considerable period of time in both the academic and industrial spaces. I got a PhD—basically kept going to school until somebody told me that I had to stop—and then spent a lot of time at startups and had to do a lot of different kinds of work just to keep the wheels attached to the bus. And so, you know, when I arrived at Duckbill Group, I kind of looked around and said, “Okay, cool. There's all the stuff that's already here. That's awesome. What can I do to make that better?” And taking my lens, so to speak, and applying it to those problems, and trying to figure out, “Okay, as a cloud economist, what do I need to do right now that sucks?
And how do I make it not suck?”

Corey: It probably involves a Managed NAT Gateway.

Alex: Whoa, God. And honestly, I spent a lot of time developing a bunch of different tools that were really just there in the service of that: take my job, make it easier. And I'm really glad that you liked what you saw there.

Corey: It was interesting watching how we wound up working together on things. Like, there's a blog post that I believe is out by the time this winds up getting published—but if not, congratulations on listening to this, you get a sneak preview—where I was looking at the intelligent tiering changes in pricing, where any object below 128 kilobytes does not have a monitoring charge attached to it, and above it, it does. And it occurred to me on a baseline gut level that, well, wait a minute, it feels like there are some object sizes where, regardless of how long an object lives in storage and transitions to something cheaper, it will never quite offset that fee. So, instead of having intelligent tiering for everything, there's some cut-off point below which you should not enable intelligent tiering, because it will always cost you more than it can possibly save you.

And I mentioned that to you, and I had to do a lot of articulating with my hands because it's all gut-feeling stuff and this stuff is complicated at the best of times. And your response was, “Huh.” Then it felt like ten minutes later you came back with a multi-page blog post written—again—in a Python notebook, with a dynamic interactive graph that shows the breakeven and cut-off points and a deep dive into the math showing exactly where in certain scenarios it is. And I believe the final takeaway was that somewhere between 148 to 161 kilobytes, somewhere in that range, is where you want to draw the cut-off. And I'm just looking at this and marveling, on some level.

Alex: Oh, thanks. To be fair, it took a little bit more than ten minutes.
I think it was something where it went through a couple of stages. At first I was like, “Well, I bet I could model that.” And then I'm like, “Well, wait a minute. If you put the compute side of this all the way to the side and just remove all API calls, it's a closed-form thing. Like, this is math. I can just describe this with math.”

And cue the, like, Beautiful Mind montage where I'm going up to the whiteboard and writing a bunch of stuff down, trying to remember the point intercept form of a line from my high school algebra days. And at the end, we had that blog post. And the reason why I dove into that headfirst was just this fascination I have for understanding how all this stuff fits together, right? I think so often, what you see is a bunch of little point things, and somebody says, “You should use this at this point, for this reason.” And there's not a lot in the way of synthesis, relatively speaking.

Like, nobody's telling you what the underlying thing is that makes it so that this thing is better in these circumstances than this other thing. And without that, it's a bunch of anecdotes and a bunch of finger-in-the-air guesses. And there's a part of that that just makes me sad, fundamentally, I guess, that humans built all of this stuff; we should know how all of it fits together. And—

Corey: You would think, wouldn't you?

Alex: Well, but the thing is, it's so enormously complicated and it's been developed over such an enormously long period of time—or at least, you know, relatively speaking—that it's really, really hard to get at that and extract it out. But I think when you do, it's very satisfying when you can actually say, like, “Oh no, no, we've actually done—we've done the analysis here.
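The closed-form idea being described here can be sketched in a few lines. To be clear, this is not the notebook from the actual blog post: it is a simplified illustration that assumes example per-GB monthly prices for the frequent and infrequent access tiers and the per-object monitoring fee (check current S3 pricing), and it ignores tier-transition timing and the archive tiers, which is why its breakeven lands near a different number than the 148 to 161 kilobyte range mentioned above.

```python
# Sketch of the Intelligent-Tiering breakeven: below some object size, the
# per-object monitoring fee exceeds the most that tiering down could ever save.
# All prices are assumed example rates, not authoritative AWS pricing.

MONITORING_FEE = 0.0025 / 1000   # $/object-month, charged on objects >= 128 KB
STANDARD = 0.023 / 2**30         # $/byte-month, frequent-access tier (assumed)
INFREQUENT = 0.0125 / 2**30      # $/byte-month, infrequent-access tier (assumed)

def monthly_savings(size_bytes: float) -> float:
    """Best-case monthly saving for one object spending the whole month in
    the infrequent-access tier, net of the monitoring fee."""
    return size_bytes * (STANDARD - INFREQUENT) - MONITORING_FEE

def breakeven_bytes() -> float:
    """Closed-form breakeven size where savings exactly offset the fee."""
    return MONITORING_FEE / (STANDARD - INFREQUENT)

print(f"breakeven under these assumptions: {breakeven_bytes() / 1024:.0f} KiB")
```

Under these simplified assumptions the breakeven is a single division; the more careful treatment in the blog post, which models how long an object takes to transition and which tier it lands in, is what narrows the answer to the range Corey quotes.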
Like, this is exactly what you ought to be doing.” And being able to give that clear answer and backing it up with something substantial is, I think, really valuable from the customer's point of view, right, because they don't have to rely on us kind of just doing the finger-in-the-air guess. But also, like, it's valuable overall. It extends the kind of domain where you don't have to think about whether or not you've got the right answer there. Or at least you don't have to think about it as much.Corey: My philosophy has always been that when I have those hunches, they're useful, and it's an indication that there's something to look into here. Where I think it goes completely off the rails is when people go, “Well, I have a hunch and I have this belief, and I'm not going to evaluate whether or not that belief is still one that is reasonable to hold, or whether there has been perhaps some new information that it would behoove me to figure out. Nope, I've just decided that I know—I have a hunch now and that's enough and I've done learning.” That is where people get into trouble.And I see aspects of it all the time when talking to clients, for example. People who believe things about their bill that at one point were absolutely true, but now no longer are. And that's one of those things that, to be clear, I see myself doing this. This is not something—Alex: Oh, everybody does, yeah.Corey: —I'm blaming other people for, at all. Every once in a while I have to go on a deep dive into our own AWS bill just to reacquaint myself with an understanding of what's going on over there.Alex: Right.Corey: And I will say that one thing that I was firmly convinced was going to happen during your tenure here was that you're a data person; hiring someone like you is the absolute most expensive thing you can ever do with respect to your AWS bill because, hey, you're into the data space. During your tenure here, you cut the bill in half. And that surprises me significantly.
I want to further be clear that it did not get replaced by, “Oh, yeah. How do you cut your AWS bill by so much?” “We moved everything to Snowflake.” No, we did not wind up—Alex: [laugh].Corey: Just moving the data somewhere else. It's like, at some level, “Great. How do I cut the AWS bill by a hundred percent? We migrate it to GCP.” Technically correct; not what the customer is asking for.Alex: Right? Exactly, exactly. I think part of that, too—and this is something that happens in the data part of the space more than anywhere else—it's easy to succumb to shiny object syndrome, right? “Oh, we need a cloud data warehouse because cloud data warehouse, you know? Snowflake, the most expensive IPO in the history of time. We got to get on that train.”And, you know, I think one of the things that I know you and I talked about was, you know, where should all this data that we're amassing go? And what should we be optimizing for? And I think one of the kind of conclusions that we came to there was, well, we're doing some stuff here that's kind of designed to accelerate queries that don't really need to be accelerated all that much, right? The difference between a query taking 500 milliseconds and 15 seconds, from our point of view, doesn't really matter all that much, right? And that realization alone kind of collapsed a lot of technical complexity. And that is something, I will say, we at Duckbill Group still espouse, right: cloud cost is an architectural problem, not a right-sizing-your-instances problem. And once we kind of got past that architectural problem, then the cost just sort of cratered. And honestly, that was a great feeling, to see the estimate in the billing console go down 47% from last month, and it's like, “Ah, still got it.” [laugh].Corey: It's neat to watch that happen, first off—Alex: For sure.Corey: But it also happened with increasing amounts of utility.
There was a new AWS billing page that came out, and I'm sure it meets someone's needs somewhere, somehow, but the thing that I always want to do when I have someone pull up their last month's bill is great, hit the print button—on the old page—and it spits out an exploded PDF of every type of usage across their entire AWS estate. And I can skim through that thing and figure out what the hell's going on at a high level. And this new thing did not let me do that. And that's a concern, not just for the consulting story because with our clients, we have better access than printing a PDF and reading it by hand, but even talking to randos on the internet who were freaking out about an AWS bill, they shouldn't have to trust me enough to give me access into their account. They should be able to get a PDF and send it to me.Well, I was talking with you about this, and again, in what felt like ten minutes, you wound up with a command-line tool: run it on an exported CSV of a monthly bill and it spits out an HTML page that automatically collapses and allocates things based upon different groups and service type and usage. And congratulations, you spent ten minutes to create a better billing experience than AWS did. Which feels like it was probably, in fairness to AWS, about seven-and-a-half minutes more time than they spent on it.
And you and Mike and the rest of the folks at Duckbill could stare at the PDF, like, reading the matrix because you've seen so many of them before and go, “Ah, yes. Bill spikes here, here, here.” I'm looking at this and it's just a giant grid of numbers.And what I wanted was to be able to say, like, don't show me the services in alphabetical order; show me the services organized in descending order by spend. And within that, don't show me the operations in alphabetical order; show me the operations in decreasing order by spend. And while you're at it, group them into a usage type group so that I know what usage type group is the biggest hitter, right? The second reason, frankly, was that I had just learned that DuckDB was a thing that existed, and—Corey: Based on the name alone, I was interested.Alex: Oh, it was an incredible stroke of luck that it was named that. And I went, “This thing lets me run SQL queries against CSV files. I bet I can write something really fast that does this without having to bash my head against the syntactic wall that is Pandas.” And at the end of the day, we had something that I was pretty pleased with. But it's one of those examples of, like, again, just orienting the problem toward, “Well, this is awful.”Because I remember when we first heard about the new billing experience, you kind of had pinged me and went, “We might need something to fix this because this is a problem.” And I went, “Oh, yeah, I can build that.” Which is kind of how a lot of what I've done over the last 15 years has been. It's like, “Oh. Yeah, I bet I could build that.” So, that's kind of how that went.Corey: This episode is sponsored in part by our friends at EnterpriseDB. EnterpriseDB has been powering enterprise applications with PostgreSQL for 15 years. And now EnterpriseDB has you covered wherever you deploy PostgreSQL: on-premises, private cloud, and they just announced a fully-managed service on AWS and Azure called BigAnimal, all one word.
Don't leave managing your database to your cloud vendor because they're too busy launching another half-dozen managed databases to focus on any one of them that they didn't build themselves. Instead, work with the experts over at EnterpriseDB. They can save you time and money, they can even help you migrate legacy applications—including Oracle—to the cloud. To learn more, try BigAnimal for free. Go to biganimal.com/snark, and tell them Corey sent you.Corey: The problem that I keep seeing with all this stuff is I think of it in terms of having to work with the tools I'm given. And yeah, I can spin up infrastructure super easily, but the idea of, I'm going to build something that manipulates data and recombines it in a bunch of different ways, that's not something that I have a lot of experience with, so it's not my instinctive, “Oh, I bet there's an easier way to spit this thing out.” And you think in that mode. You effectively wind up automatically just doing those things, almost casually. Which does make a fair bit of sense, when you understand the context behind it, but for those of us who don't live in that space, it's magic.Alex: I've worked in infrastructure in one form or another my entire career, data infrastructure mostly. And one of the things—I heard this from someone and I can't remember who it was, but they said, “When infrastructure works, it's invisible.” When you walk in the room and flip the light switch, the lights come on. And the fact that the lights come on is a minor miracle. I mean, the electrical grid is one of the most sophisticated, globally-distributed engineering systems ever devised, but we don't think about it that way, right?And the flip side of that, unfortunately, is that people really pay attention to infrastructure most when it breaks. But they are two edges of the same proverbial sword. 
It's like, I know when I've done a good job if the thing got built and it stayed built and it silently runs in the background and people forget it exists. That's how I know that I've done a good job. And that's what I aim to do really, everywhere, including with Duckbill Group, and I'm hoping that the stuff that I built hasn't caught on fire quite yet.Corey: The smoke is just rising off the piles of money it wound up spinning up.Alex: [laugh].Corey: It's like, “Oh yeah, turns out that maybe we shouldn't have built a database out of pure Managed NAT Gateways. Yeah, who knew?”Alex: Right, right. Maybe I shouldn't have filled my S3 bucket with pure unobtainium. That was a bad idea.Corey: One other thing that we do here, that I admit I don't talk about very often because people get the wrong idea, is analyst projects for vendors from time to time. And the reason I don't say that is, when people hear about analysts, they think about something radically different, and I do not self-identify as an analyst. It's, “Oh, I'm not an analyst.” “Really? Because we have analyst budget.” “Oh, you said analyst. I thought you said something completely different. Yes, insert coin to continue.”And that was fine, but unlike the vast majority of analysts out there, we don't form our opinions based upon talking to clients and doing deeper-dive explorations as our primary focus. We're a team of engineers. All right, you have a product. Let's instrument something with it, or use your product for something and we'll see how it goes along the way. And that is something that's hard for folks to contextualize.What was really fun was bringing you into a few of those engagements just because it was interesting at the start of those calls. It was all, “Great, Corey is here and—oh, someone else is here.
Is this a security problem?” “No, no, Alex is with me.” And you start off those calls doing what everyone should do on those calls: asking, “How can we help?” And then we shut up and listen. Step one, be a good consultant.And then you ask some probing questions and it goes a little bit deeper and a little bit deeper, and by the end of that call, it's like, “Wow, Alex is amazing. I don't know what that Corey clown is doing here, but yeah, having Alex was amazing.” And every single time, it was phenomenal to watch as you, more or less, got right to the heart of their generally data-oriented problems. It was really fun to be able to think about what customers are trying to achieve through the lens that you see the world through.Alex: Well, that's very flattering, first of all. Thank you. I had a lot of fun on those engagements, honestly, because it's really interesting to talk to folks who are building these systems that are targeting mass audiences of very deep-pocketed organizations, right? Because a lot of those organizations, the companies doing the building, are themselves massive. And they can talk to their customers, but it's not quite the same as it would be if you or I were talking to the customers because, you know, you don't want to tell someone that their baby is ugly.And now, to be fair, we under no circumstances were telling people that their baby was ugly, but I think that the thing that is really fun for me is to kind of be able to wear the academic database nerd hat and the practitioner hat simultaneously, and say, like, “I see why you think this thing is really impressive because of this whiz-bang technical thing that it does, but I don't know that your customers actually care about that. But what they do care about is this other thing that you've done as an ancillary side effect that actually turns out to be a much more compelling thing for someone who has to deal with this stuff every day.
So like, you should probably be focusing attention on that.” And the thing that I think was really gratifying was when you know that you're meeting someone on their level and you're giving them honest feedback and you're not just telling them, you know, “The Gartner Magic Quadrant says that in order to move up and to the right, you must add the following five features.” But instead saying, like, “I've built these things before, I've deployed them before, I've managed them before. Here's what sucks that you're solving.” And seeing the kind of gears turn in their head is a very gratifying thing for me.Corey: My favorite part of consulting—and I consider analyst-style engagements to be a form of consulting as well—is watching someone get it, watching that light go on, and they suddenly see the answer to a problem that's been vexing them. I love that.Alex: Absolutely. I mean, especially when you can tell that this is a thing that has been keeping them up at night and you can say, “Okay. I see your problem. I think I understand it. I think I might know how to help you solve it. Let's go solve it together. I think I have a way out.”And you know, that relief, the sense of like, “Oh, thank God somebody knows what they're doing and can help me with this, and I don't have to think about this anymore.” That's the most gratifying part of the job, in my opinion.Corey: For me, it has always been twofold. One, you've got people figuring out how to solve their problem and you've made their situation better for it. But selfishly, the thing I like the most personally has been the thrill you get from solving a puzzle that you've been toying with and finally it clicks. That is the endorphin hit that keeps me going.Alex: Absolutely.Corey: And what I didn't expect when I started this place is that every client engagement is different enough that it isn't boring. It's not the same thing 15 times. Which it would be if it were, “Hi, thanks for having us. You haven't bought some RIs.
You should buy some RIs. And I'm off.” It… yeah, software can do that. That's not interesting.Alex: Right. Right. But I think that's the other thing about both cloud economics and data engineering, they kind of both fit into that same mold. You know, what is it? “All happy families are alike, but each unhappy family is unhappy in its own way.” I'm butchering Chekhov, I'm sure. But like—if it's even Chekhov.But the general kind of shape of it is this: everybody's infrastructure is different. Everybody's organization is different. Everybody's optimizing for a different point in the space. And being able to come in and say, “I know that you could just buy a thing that tells you to buy some RIs, but it's not going to know who you are; it's not going to know what your business is; it's not going to know what your challenges are; it's not going to know what your roadmap is. Tell me all those things and then I'll tell you what you shouldn't pay attention to and what you should.”And that's incredibly, incredibly valuable. It's why, you know, it's why they pay us. And that's something that you can never really automate away. I mean, you hear this in data all the time, right? “Oh, well, once all the infrastructure is managed, then we won't need data infrastructure people anymore.”Well, it turns out all the infrastructure is managed now, and we need them more than we ever did. And it's not because this managed stuff is harder to run; it's that the capabilities have increased to the point that they're getting used more. And the more that they're getting used, the more complicated that use becomes, and the more you need somebody who can think at the level of what does the business need, but also, what the heck is this thing doing when I hit the run key? You know? 
And that, I think, is something, particularly in AWS where, I mean, my God, the amount and variety and complexity of stuff that can be deployed in service of an organization's use case is—it can't be contained in a single brain.And being able to make sense of that, being able to untangle that and figure out, as you say, the kind of the aha moment, the, “Oh, we can take all of this and just reduce it down to nothing,” is hugely, hugely gratifying and valuable to the customer, I'd like to think.Corey: I think you're right. And again, having been doing this in varying capacities for over five years—almost six now; my God—the one thing that has been constant throughout all of that is that our number one source for new business has always been word of mouth. And there have been things that obviously contribute to that, and there are other vectors we have as well, but by and large, when someone winds up asking a colleague or a friend or an acquaintance about the problem of their AWS bill, and the response, almost universally, is, “Yeah, you should go talk to The Duckbill Group,” that says something that validates that we aren't going too far wrong with what we're approaching. Now that you're back on the freelance data side, I'm looking forward to continuing to work with you, if through no other means than being your customer, just because you solve very interesting and occasionally very specific problems that we periodically see. There's no reason that we can't bring specialists in—and we do from time to time—to look at very specific aspects of a customer problem or a customer constraint, or, in your case for example, a customer data set, which, “Hmm, I have some thoughts on here, but just optimizing what storage class those three petabytes of data live within seems like it's maybe step two, after figuring out what the heck is in it.” Baseline stuff.
You know, the place that you live in that I hand-wave over because I'm scared of the complexity.Alex: I am very much looking forward to continuing to work with you on this. There are a whole bunch of really, really exciting opportunities there. And in terms of word of mouth, right, same here. Most of my inbound clientele came to me through word of mouth, especially in the first couple of years. And I feel like that's how you know that you're doing it right.If someone hires you, that's one thing, and if someone refers you to their friends, that's validation that they feel comfortable enough with you and with the work that you can do that they're not going to—you know, they're not going to pass their friends off to someone who's a chump, right? And that makes me feel good. Every time I go, “Oh, I heard from such and such that you're good at this. You want to help me with this?” Like, “Yes, absolutely.”Corey: I've really appreciated the opportunity to work with you and I'm super glad I got the chance to get to know you, including as a person, not just as the person who knows the data, but there's a human being there, too, believe it or not.Alex: Weird. [laugh].Corey: And that's the important part. If people want to learn more about what you're up to, how you think about these things, potentially have you look at a gnarly data problem they've got, where's the best place to find you now?Alex: So, my business is called Bits on Disk. The website is bitsondisk.com. I do write occasionally there. I'm also on Twitter at @alexras. That's Alex-R-A-S, and I'm on LinkedIn as well. So, if your lovely listeners would like to reach me through any of those means, please don't hesitate to reach out. I would love to talk to them more about the challenges that they're facing in data and how I might be able to help them solve them.
Thank you again for taking the time to speak with me, spending as much time working here as you did, and honestly, for a lot of the things that you've taught me along the way.Alex: My absolute pleasure. Thank you very much for having me.Corey: Alex Rasmussen, data engineering consultant at Bits on Disk. I'm Cloud Economist Corey Quinn. This is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that is so large it no longer fits in RAM.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.