Podcasts about chief cloud economist

  • 17 podcasts
  • 337 episodes
  • 38m average duration
  • Infrequent episodes
  • Latest episode: May 14, 2024

Popularity trend: 2017–2024



Latest podcast episodes about chief cloud economist

Detection at Scale
The Duckbill Group's Corey Quinn on What Billing Data Can Tell Us About AWS Security


May 14, 2024 · 28:07


In an episode of the Detection at Scale podcast recorded at the RSA Conference, Jack chats with Corey Quinn, Chief Cloud Economist at The Duckbill Group, an AWS cost-management consultancy. They discuss the intersection of security and billing in AWS environments, and how observability through billing data can strengthen security. Corey also covers which AWS services are frequent security offenders and the challenges companies face in deciding how much to invest in security services, offering takeaways on navigating the evolving landscape of AWS security practices and optimizing billing strategies for better cloud security.

Topics discussed:

  • How observability via billing data bolsters AWS security and guides investment in security services.
  • Identifying the key security offenders among AWS services to improve cloud security practices and mitigate potential breaches.
  • The challenge of determining the right level of security investment within AWS environments.
  • Detecting potential breaches through AWS billing insights, and why understanding billing intricacies matters for security.
  • How billing data helps surface security vulnerabilities and shape strategy across the AWS security landscape.
  • The role of services like Route 53 in security, and considerations for AWS spending on security services.

Resources mentioned:

  • Corey Quinn on LinkedIn
  • The Duckbill Group website
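To make the "billing data as a security signal" idea concrete, here is a minimal sketch of the kind of check a team might run: compare day-over-day spend per service and flag sudden spikes in services that normally sit quiet, since an unexpected jump (for example in data transfer) can be an early hint of abuse or exfiltration. This sketch is not from the episode; it assumes the standard AWS Cost Explorer API via boto3, and the watchlist and spike threshold are arbitrary illustrative values.

```python
# Illustrative sketch only: flag suspicious day-over-day cost spikes per AWS service.
# Assumes boto3 is installed and credentials with ce:GetCostAndUsage permission.
# The WATCHLIST and SPIKE_RATIO are arbitrary example values, not recommendations.
import datetime

import boto3

WATCHLIST = {"Amazon Route 53", "AWS Data Transfer", "Amazon Elastic Compute Cloud - Compute"}
SPIKE_RATIO = 3.0  # flag when a day's cost is at least 3x the previous day's

ce = boto3.client("ce")
end = datetime.date.today()               # Cost Explorer's End date is exclusive
start = end - datetime.timedelta(days=2)  # covers the last two complete days

resp = ce.get_cost_and_usage(
    TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

def daily_costs(day):
    """Map service name -> unblended cost for one ResultsByTime entry."""
    return {
        g["Keys"][0]: float(g["Metrics"]["UnblendedCost"]["Amount"])
        for g in day["Groups"]
    }

days = resp["ResultsByTime"]  # one entry per day, oldest first
if len(days) >= 2:
    previous, latest = daily_costs(days[-2]), daily_costs(days[-1])
    for service, cost in latest.items():
        prior = previous.get(service, 0.0)
        if service in WATCHLIST and prior > 0 and cost / prior >= SPIKE_RATIO:
            print(f"Possible anomaly: {service} went from ${prior:.2f} to ${cost:.2f} in a day")
```

In practice a query like this would more likely feed an alerting pipeline or SIEM than print to stdout, but the underlying point from the episode stands: a cost anomaly in a service you rarely touch is a cheap, always-available detection signal.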

Builders Gonna Build
6. Creating a niche of your own | Corey Quinn, The Duckbill Group


Apr 10, 2024 · 74:58


Corey Quinn is the Chief Cloud Economist at The Duckbill Group and a celebrity in AWS circles, well known for his sense of humor and his unrelenting focus on poking good-natured fun at the cloud providers. In our interview, we learn about Corey's background, how The Duckbill Group got started, and how he runs the media side of his business. As usual, we talk about bootstrapping and running consulting services while building a product. This episode was originally published on the Metacast podcast, episode 32.

Do you love audio podcasts? Try the Metacast podcast app at metacast.app and join our Reddit community at r/metacastapp.

Where to find Corey:

  • Screaming in the Cloud (podcast)
  • Last Week in AWS (newsletter)
  • AWS Morning Brief (podcast)
  • X: @QuinnyPig
  • Duckbill Group

Books:

  • Practical Monitoring: Effective Strategies for the Real World by Mike Julian
  • Never Eat Alone by Keith Ferrazzi with Tahl Raz

Other references:

  • Route 53, Amazon's Premier Database
  • Writing New Editions and Ticking All the Boxes with Andreas Wittig
  • Metacast Ep. 8 - SquadCast Founders on Remote Collaboration in Podcasts
  • Yes, I Test in Production (And So Do You) by Charity Majors

Get in touch!

  • Send us an email at hello@buildersgonnabuild.com
  • Subscribe to the newsletter at buildersgonnabuild.com
  • Metacast: Behind the Scenes - a "build in public" style podcast and newsletter at metacastpodcast.com

The Cloud Gambit
Making Budgets Behave and Dollars Dance with Corey Quinn


Jan 30, 2024 · 43:33


Corey Quinn is the Chief Cloud Economist at The Duckbill Group, an expert at AWS cost optimization, and a true virtuoso of snark. In this conversation, we discuss the intricacies of cloud cost and how bills get out of control, and Corey imparts some wisdom on how to make sense of it all so you can make those dollars go further.

Where to find Corey:

  • Podcast: https://www.lastweekinaws.com/podcast/screaming-in-the-cloud/
  • YouTube: https://www.youtube.com/@LastWeekinAWS
  • LinkedIn: https://www.linkedin.com/in/coquinn/
  • Twitter: https://twitter.com/QuinnyPig
  • TikTok: https://www.tiktok.com/@quinnypig

Follow, like, and subscribe!

  • Podcast: https://www.thecloudgambit.com/
  • YouTube: https://www.youtube.com/@TheCloudGambit
  • LinkedIn: https://www.linkedin.com/company/thecloudgambit
  • Twitter: https://twitter.com/TheCloudGambit
  • TikTok: https://www.tiktok.com/@thecloudgambit

Screaming in the Cloud
How Snyk Gets Buy-In to Improve Security with Chen Gour Arie


Jan 23, 2024 · 28:15


Chen Gour Arie, Director of Engineering at Snyk, joins Corey on Screaming in the Cloud to discuss how his company, Enso Security, got acquired by Snyk and what drew him to Snyk's mission as a partner. Chen expands on the challenges currently facing the security space, and shares what he feels are likely outcomes for challenges like improving compliance across value-add on security tools and the increasing scope of cybersecurity at such a relatively early phase of the industry's development. Corey and Chen also discuss what makes Snyk so appealing to developers and why that was an important part of their growth strategy, as well as Chen's take on recent security incidents that have hit the news. About ChenChen is the Co-founder of Enso Security (part of Snyk) - the world's 1st ASPM platform. With decades of hands-on experience in cybersecurity and software development, Chen has focused his career on building effective application security tools and practices.Links Referenced:Snyk: https://snyk.ioSnyk AppRisk: https://snyk.io/product/snyk-apprisk/TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by our friends at Snyk, and as a part of that they have given me someone rather distinct as far as career paths and trajectories go. Chen Gour Arie is currently a director of engineering over at Snyk, but in a previous life—read as about six months or so ago—he was a co-founder of Enso Security, which got acquired. Chen, thank you for joining me.Chen: Thank you for having me, Corey.Corey: So, I guess an interesting place to begin is, what has the past couple of years been like? And let's dive in with, what is or was Enso Security?Chen: Yeah. So, Enso started for me first as friendship because I joined the team that I was working with as a contractor for a while. There was such an excellent and interesting team with a very interesting environment. And then after a while, they asked me to join that team, and then I became part of the security team of a company called Wix.com.It's quite a large company, web do-it-yourself kind of platform, that you can build your own website with a presentation style kind of interface, and our job was to secure that. And we formed a very, very nice friendship throughout it, but we also gained a lot of experience because you work with such a large company, and you experience many challenges, including real-time attempts to penetrate, and the complexity of social engineering at large scale. You go through a lot of things. So, this was the start. And after a couple of years, we decided that we have some interesting ideas that can do good to the community in the cybersecurity industry, and we embarked on a new journey together to start Enso.Corey: I can see why you aligned with Snyk. It sounds like a lot of what you were aimed at is very much in step with how they tend to approach things. 
I have a number of sponsors that I can say this about, but Snyk is a particularly fun one, in that, obviously, you folks pay me to run advertisements and featured guest episodes like this, which is appreciated, but we also pay you as a customer of Snyk because it does a lot of things that we find both incredibly useful and incredibly valuable. The thread that I've seen running through everything coming out of Snyk has been this concept of, I think, what some folks would say shifting left, but it comes down to the idea of flagging issues as early in the process as possible rather than trying to get someone to remember what they did three months ago, and oh, yeah, go back and address that. That alone has made it one of the best approaches to things that are truly important—and yes, I consider security to be one of those things—that I've seen in a while on the dev tool space.Chen: Yeah, and this has been the mission of Snyk for a very long time. And when we started Enso, our mission was to help in some additional elements of the same problem space in introducing additional tools to help drive this shift left, this democratization of the security effort around and in the organization, and resolving some of the friction that is created with the, kind of, confusing ownership of security and software development. So, this was kind of the mission of Enso. The category introduced by it and the ASPM category to bring the notion of postural security, postural management to applications. And it really is a huge fit with the journey of Snyk, and we were very excited to be approached by them to join their journey and help them do further shift left and extend on problem space on the complexity of this collaboration between security and developers.Corey: A question I have around this is that it seems to me that viewing security posture management from an application perspective, and then viewing other parts of it from a cloud provider perspective and other parts of it from a variety of different things—you know, go to RSA and walk up and down the endless rows of booths, and you know, look at the 12 different things that they're all selling because it's all the same stuff around 12 categories or so, with different companies and logos and the rest—it feels like, on some level, that can lead very quickly to a fractured security posture where, well this is the app side of the security, and then we have the infrastructure security folks, but those groups don't really collaborate because they're separate and distinct. How do you square that circle?Chen: Yeah, it's not an easy problem, and I think that the North Star of many vendors exists this notion of sometimes I think we call it CNAP or something that will unify all of it. Cloud as a solution, and the offering that exists with cloud computing enables a lot of it, enables a lot of this unification, but we have to remember that the industry is young. The software security industry in general is young. If we will look at any other industry with that size, all of them have much more history and time to mature. And inside this industry, the security itself is even younger.It has become a real problem much later than then when software started. It has become a huge problem when cloud emerged and became, like, the huge deal that it is now. 
And when more and more businesses are based on digital services, and more people are writing software, a lot of it is young, and it needs time to mature, and it's time to get to—to accomplish some big parts like this unification that you are pointing out missing.Corey: I have to confess my own bias here. A lot of the stuff that I build is very small-scale, leverages serverless technologies heavily, and even when I'm dealing with things like the CDK, where I start to have my application and the infrastructure that powers it coalesce into the same sort of thing, it becomes increasingly difficult, if not outright impossible for some of these co...

Screaming in the Cloud
Continuing to Market After the Product Has Sold with Kim Harrison


Jan 18, 2024 · 32:33


Kim Harrison, a freelance content marketing strategist and author, joins Corey on Screaming in the Cloud to talk about asking the right questions to find your target demographic, why she has such a deep love for story telling, and how marketing extends after the product has been sold. Kim shares her unique experiences with solving urgently painful problems that customers are experiencing and subsequently building a relationship with those customers that allows her to solve more pain points down the line. About KimKim is a professional storyteller focused on strategic communications. She translates complex ideas into compelling narratives, helping teams share their perspectives. She enjoys building impactful stories, and using a range of mediums and channels to reach specific audiences.For 10+ years Kim has worked closely with teams focused on big data and developer tooling. They have brought new methodologies forward, impacted the language used to describe technologies, and even established new industry categories.Links Referenced:Personal/Company website: https://www.kimber.kim/LinkedIn: https://www.linkedin.com/in/kimberh/Twitter: https://twitter.com/kittyriotTranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. One of the unpleasant-to-some-folk realizations that people sometimes have is, “Wait a minute. Corey, you've been doing marketing all this time.” To which the only response I can come up with is a slightly more professional version of, “Well, duh.” And I think that's because people misunderstand what marketing is and what it means. Here to talk about that, and presumably other things as well, is Kim Harrison, a freelance content marketing strategist. Kim, thank you for agreeing to listen to me.Kim: [laugh] Thank you for having me, Corey. It's great connecting with you today.Corey: You've worked at a number of different places over the course of your career, the joys of freelancing. You have periodically been involved in getting folks from the companies at which you've been working onto this show, but it's sort of the ‘always a bridesmaid, never a bride' type of philosophy. You were somewhat surprised when I reached out and said, “Hey, why don't you come on the show yourself?” Which is always the sign it's going to be a fascinating episode because some of the most valuable conversations that I find I have here are with people who don't think at first that they have much to say. And then I love proving them wrong. But you're in marketing. Presumably, you have many things to say.Kim: [laugh] It's funny, you say that I feel like in marketing, we're always behind the scenes, we are the ones building and crafting the image, and bringing that story forward of, who is this? What is this company? What is this product? What do they do? Why should I care about it? And, “Wow, those are amazing stickers. I want five of them, please.” So, I'm kind of used to being behind the curtain rather than in the foreground talking about what I do.Corey: People tend to hate marketing, especially developers, when you talk to them, but when you really drill down into it, it's not marketing that they hate. 
It is, on some level, a marketing straw man—or straw person, whatever the current term of art is—because they think of the experience through the lens of the worst examples of it. And everyone who has been in the industry for five minutes knows what I'm talking about. Billboards that make no sense where a company spent $20 million on an ad buy and seven bucks over the lunch counter trying to figure out what to say once you have all of that attention, or bad email blasts that are completely irrelevant, untargeted, misspell your name, and are clearly written by a robot. That's not what marketing is, at least in my mind. What is it to you?Kim: For me, marketing is how you communicate who you are, what have you built, what is the value that it provides, and how can somebody use it. There's many ways in which you can share that, that can be all of those activities that you just talked about. And I think it's easy to sometimes lose the story in all of that and talk about things that may not be as important. I think a lot of times people get excited about what they've built, and love to talk about what they've built but not why it provides value, and what value it provides. And so, staying focused and really sharing that clear story is—it's a lot harder than I think people give it credit for.Corey: A very senior, well-known engineering leader whose name I will not mention because I—I can tell stories, or I can name names, but I don't believe in doing both—once said, out of what was otherwise like this—like, this person just dispenses wisdom like a vending machine. It's amazing, but one of the dumbest things I ever heard this person say was, “I never want to get marketing outreach, or show me ads or the rest. If you've built something awesome. I will find it on my own.” Which is a terrific recipe to follow if you'd like to starve to death.Kim: Yeah, I agree with that. And I think there is this… I don't know, maybe it feels great to imagine that what you've built is just so interesting that people would automagically find their way to you and pop up in your DMs and beg to throw money at you for what your product is. But I mean, truly if nobody knows that the thing exists, or even what it does, how could they? I've seen this happen quite often in technology where there's actually an amazing product that maybe they are sharing who they are, they are promoting themselves, but the messaging just doesn't quite land, and so there's a lot of confusion and misunderstanding about an amazing product. And so, not sharing, but also not sharing a very accurate, complete picture of who you are can also hurt you.Corey: When I first started going out independently in the fall of 2016, I did not know whether it was going to work, whether I was going to succeed or have to go do something else, but what I knew very obviously, was that, one way or another, 18 months from now, I was going to want to have an audience to tell about whatever I was doing. Like, the best time to build an audience is five years ago; the second-best time is today, just like planting a tree. So, I started building out the email newsletter. It was something I wish existed, no one else had built it, I figured I'd give it a shot, and it resonated, and that's where the Last Week in AWS newsletter came from. But it means that I can reach out and talk to 32,000 people in their inbox, more or less whenever I want to, tell them whatever is on my mind, and I do that in the form of my newsletters. 
And that more than anything else has really led to anything that could be equated to be… me as a brand, so to speak. It took work to get there, but I view it as something that, in hindsight or to someone who had spent 20 minutes thinking about marketing, was obvious, but it took me a while to get there from first principles.Kim: Yeah, for sure. And, you know, as a person who receives your newsletter, as somebody who has collaborated with you in the past, something I know you do really well is you are very clear about who you are, what you stand for, and you're consistent. And so, I think… in my opinion, I think you've done a great job of earning your audience's trust, and that's a huge part of this, right? As a marketer, it's very easy to say, you know, “My thing is bigger, better, faster,” but if it's pure conjecture, if it's not—if there's no there there, people will find out, you will lose that trust, and it can become difficult. And so, it does take time. And I think—I imagine, and I would ask you—I imagine you were very intentional about what you did. It took time, and you understood that, and it's like, okay, put your head down and be patient because this will reap rewards in the end.Corey: That's the curse, on some level, of having succeeded at something. You look back in hindsight, and everything looks like one thing clearly led to another, and where you are now is sort of inevitable when viewed through that lens. It does not feel like that on the day-to-day. I promise.Kim: [laugh] What—okay, so as you built your audience, what was the hardest part for you?Corey: Figuring out who the audience was, to be perfectly honest. It didn't take long before Datadog came sniffing around, six issues in, asking if they could sponsor. And it was, “You want to give me money to talk about you? Of course, you can give me money. How much money?” And I inadvertently found myself with a sponsor-driven media business.But that led to a bit of a crisis of faith for me of, who is my audience? Is it the sponsors because that—like, I like money, and I wish to incentivize the behavior of giving it to me, but if I do that, then suddenly, I'm more or less just a mouthpiece or a shill for whoever pays me enough, and that means the audience loses interest. It has to be the community is my target because that's what I consider myself a part of. I write content that I want to read, that I want to exist, and if sponsors like that, great. If they don't, then well, okay, it's not for everyone.But the audience is around because they either agree with what I say, or they appreciate the authenticity of it. And it goes down to the old saw of would you rather have a pile of money, or would you rather have a relationship with someone? It's like, “Well, I can turn a relationship into money way more easily than I can the opposite.” So yeah, I would much rather build a working rapport with the people who support me.Kim: Interesting. Yeah, I agree with you. And I would ask another question about your audience. Who was in that audience? Is this one kind of person? Is this many kinds of people? How do you think about who you're speaking to? 
Is it a unified group, or are you considering that there are three or four different kinds of people within this body, and you try to address all of them at different points in a week or month?Corey: If you try to write for everyone, you wind up writing for no one—Kim: Yeah.Corey: —and every time I think I have a grasp on who my audience is—like, if you're listening to this show, for example, I have some baseline assumptions about you in the aggregate, but if you were to reach out—which again, everyone is welcome to do—I would be probably astounded to learn some of the things that you folks are working on, how you view these things, what you like, what you don't like about the show. On some level, I operate in a vacuum here, just because feedback to a podcast is a rare thing. I suspect it's because it's like listening to an AM radio show, and who calls into an AM radio show? Lunatics, obviously. And most people—except on Twitter—don't self-identify as lunatics, so that's not something that they want to do.I encourage you to buck that trend. Reach out. I promise, I drag multi-trillion-dollar companies, not individuals who dare to reach out. Some of my best friendships started off with someone reaching out like, “Hey, I like what you're doing, and I'd like to learn more about it.” One thing leads to another, and there are no strangers; just friends we haven't met yet.Kim: Yeah, yeah. In the world of developer marketing, sometimes that audience can be a range of people. It can be the user versus your buyer. So, when I think about content marketing and I think about telling the story of a platform or a brand to, you know, this range of people, maybe I want to tell that same story, but I've got to do it in slightly different ways. Because to your point, if you try to be, you know, one thing for everybody or nothing to everyone, it just, it doesn't work. And so, how do you talk to that buyer who can actually sign the check versus the individual contributor, the person who's using the product day-to-day? What part of that story do they want to hear? What makes sense to them? What is engaging to them?Corey: Part of the challenge I've had is that I always assume that the audience was largely comprised of people who vaguely resemble me, namely relatively senior engineering folks who have seen way too many cycles where today's shiny new shit becomes tomorrow's legacy garbage that they needed to maintain. But that is not true. In practice, about 60% of the audience is individual contributing engineers, and the remaining 40 is almost entirely some form of management, ranging from team leads to C-level executives of Fortune 50s and everything in between. And every piece that I write is written for someone. And by that I mean, a specific person or my idea of that person as I go.Now, I don't mention them by name, but that means that different pieces are targeted at different audiences and presuppose different baseline levels of knowledge. And sometimes that works, sometimes that doesn't, but it means that everything that I write should ideally resonate with some constituency.Kim: Yeah. Yeah. And, again, as a person who has collaborated with you, you have a range of channels that you share content across. And so, I think when I first met you and first started working with you, I very quickly started to understand where that made sense to me, not just as a collaborator, but as somebody who enjoys the people that you bring in to interview, the stories that you tell, the conversations that you start. 
But I've noticed there's areas that I tend towards, and would listen to or read more. I don't know if that was intentional, if there are certain areas that you focus on for different segments of your audience.Corey: Partially. And this is a weird thing for me to say, particularly in this medium. I don't listen to podcasts myself. I read extremely quickly, I do not have the patience to sit through a conversation. It makes sense when I'm driving somewhere, but I barely do that. My drive home from dropping off my toddler at preschool is all of seven minutes, which is not long enough for basically anything, so it's not for me.I don't watch videos. I don't listen to podcasts. I read. That's part of the reason that every episode of this show has a transcript. It's also part of the reason, though, that I have the podcast entirely, as that I am not the common case in a bunch of things. An awful lot of people do listen to the podcast. I've talked to listeners who are surprised to learn I have an email newsletter, but I view it as the newsletter came first and then the podcast.Occasionally, I find people who only know me through my YouTube videos—which are sporadic because it's a lot of effort to get one of those up—and no one sees all of it. This did lead to a bit of a weird crisis for me early on of, okay, so I have a Twitter account, I have a LinkedIn page, I have the Screaming in the Cloud podcast, I have the AWS Morning Brief podcast, I have the Last Week in AWS newsletter, and I have the Last Week in AWS blog, and of course, I have my day job at The Duckbill Group where we fix AWS bills. That is seven or eight different URLs. Where do I tell people to go?Kim: Yeah.Corey: It's a very hard problem.Kim: Do you do that? How do you do that? Or do you allow people to find their own way?Corey: Whether you allow people to or not, they're going to do it on their own. My default of where do I send people is lastweekinaws.com. That talks a little bit about who I am, it has a prominently featured ‘newsletter signup' widget there, give me your email address and you will get an opt-in confirmation.Click that, and you will start receiving my newsletters, which talk in the bottom about other things that I do, and let people find their way to different places, like slack.lastweekinaws.com, for the community Slack channel, which is sort of the writer's room for some of these conversations. There's a bunch of different ways, but not everyone wants to engage in the same way, and that's okay.Kim: Yeah. That is something that's come up a lot for me, managing content programs. You said it yourself: not everybody learns the same way, and so thinking about different ways to share a story, I would say right now a lot of people are really burnt out on webinars. I think the past couple of years of being at home and staring at screens has done a number on us all. But still, there are ways in which some people do prefer video.Maybe shorter format is better, or audio, or reading. And it's great that you put the transcript in because I know I'm a person who really values that. Sometimes I can't listen to an episode, and it's great that I can, you know, kind of skim through and read through parts of the interview that I knew that were going to come up. And so, being attuned to the fact that there's many different ways to tell a story, and having fun with that—dare I say [laugh]—is, I think, a huge part of it.Corey: You have to have fun, otherwise, you aren't going to be able to stay the course, at least that's my philosophy. 
I am very fortunate in that what I do is technically marketing for the consultancy because an overwhelming percentage of our leads come from, people have heard of me and that leads them here. It's never clear to me where was the original point of contact, how did you get into the orbit, who recommended you, but that is functionally what it is. I'm fortunate in that the media side of our business with sponsorships turns this into a business unit that generates a profit. But it is functionally still a marketing department. That is not mandatory.Kim: Yeah. So, an interesting thing that I've seen happen within developer marketing is when thinking about this audience and how you market your consultancy, you spoke about how many people are individual contributors in your audience. I—did you say it was like 60%?Corey: 60% engineers, although it's also how people view what their role is changes rather drastically. And I've never found that any of these things that are categorizations of roles or company styles or what have ever fit me well. I don't fit anywhere I go. And that's okay. I assume that there's a lot of slop and wiggle room in there, but it gives me a direction to go in. I would have guessed before that, that 95% of the audience was engineering hands-on coding-type practitioners.Kim: Right.Corey: Clearly I'm wrong.Kim: Well, in understanding that, I mean, what you've got is an understanding of who can take what action. I mean, yeah, at some point, you do want sponsors, right? If you are marketing for your consultancy, you probably do want to reach those executives that would be the person that would actually bring you in—your team in—to evaluate and give them advice and feedback, and that's not always the individual contributor. However, having a presence within the community is equally beneficial to your brand. And so, for me, as a person who has worked in-house at teams, often the demand gen team is telling me, “Oh, we just want to do things that will get leads in the door,” you know, leads that will actually turn into customers, but addressing your community and having a presence there, and showing up there, and participating is just as important. You know, that's brand awareness.And so, there will sometimes be activities that you do that really are just about participating, and showcasing yourself and your team as the experts that you are. And sometimes it will be a direct, “We have this feature. We have this product. Here's how you can do a trial and sign up to become a customer.”Corey: That is, I think, something that gets missed a lot. With so much marketing in this industry slash sector slash whatever it is that you want to call it is, in larger companies in particular, you wind up with people who are writing some of the messaging around this that are too far removed from the actual customer journey. You see it very early startup phase, too, where… I see it on the show, sometimes, with very early stage technical co-founders. They want to talk about the internals of this very hard thing that they built and how it works. Great. That's not your customer. That is not something that anything other than your competitor or your prospective hires are really going to be that interested in.Kim: Yeah.Corey: Talk about the painful problem that you solve.Kim: Absolutely. Show—oh, my gosh, I just had a conversation with a colleague about this very thing. Show the return on investment, show the value you provide, and do it explicitly, do it very clearly. Do not assume that people understand. 
Give numbers if you can, metrics. Just really put it out there because I think in this moment right now, in this economy… budgets are tight. And so, if you can't clearly show what value you provide and why you should be there, you know, why somebody should bring your product into their stack, you're just not going to make it through, or you're not going to last long.Corey: Yeah. It's hard. None of this stuff is easy, and marketing is way, way, way harder than it looks. Done well, it looks like you barely did anything at all. Do it badly, and suddenly the entire internet lines up to dunk on you.Kim: Oh, that is so true. Gosh, and that's really difficult for marketers because, as you said, we've done well, it just feels natural. Like, of course, this would happen. But there's so much that goes on behind the scenes to execute and make it look seamless and flawless. That is something that I like to advise onto my fellow marketers and content marketers is, don't forget to remind your team what you've been up to and what it took to get there so that they appreciate the value of what you're providing, and will continue to do those things that help keep that momentum moving forward. As you said, how many years did you work on getting that audience together where it is today? This was not six months. This was a real time and effort for you to build this following, and to earn this trust, and to have the brand that you have now.Corey: The funny part is, I didn't do most of it. My entire time doing this, I have been unable to materially alter the trajectory of growth. It is all word of mouth, people in the audience telling other people about whatever it is that I do. I have run a number of experiments across almost every medium that was within my reach, and none of them seem to materially tip anything other than being authentic and being there for the audience, and then just letting the rest sort of handle itself.Kim: Mm-hm. I like that you said that, that you're running experiments. You're in conversation with your audience. You're really thinking about how your message lands, and what they like or don't like, or what resonates.Corey: It's a hard problem. How do you view marketing? You've been working in this space a lot. You have specifically in your title of Freelance Content Marketing Strategist a derivation of the word strategy, which has always been something that I'm not great at. It's longer-term, big picture thinking. I'm much better tactically in the weeds. What do you see as the broad sweep of how it's being done in this industry?Kim: I can speak to myself. I studied sociology. I really love thinking about what influences people, I love stories and storytelling, and so my focus is strategic communications. And that's a fancy way of just saying, you know, taking these complex ideas, these products that people built, and turning them into compelling narratives so we can showcase the value they provide. And I think it's especially interesting and challenging doing that in technology when a lot of times you're bringing forth a completely new products that never existed before, so how do you speak to that? How do you help people understand that a thing they've never been able to do before they can now do, and it could be a part of their life, and it could be part of their workflow, and change how they think about their own practices?And so, for me, it really is storytelling. I'm a sucker for, you know, a good podcast and a good book on the side. 
That's how I think about it, but I also do appreciate that at the end of the day, this is marketing, we are, you know, a business, and so I also enjoy being a part of a team. So, I can help build the beautiful story and think about how to share that effectively, get that in front of the right people at the right time so that they can have an understanding of who you are, what you are, what you offer, be a part of the larger conversation that is in place that you can become a trusted brand, and doing that within you know, a larger marketing team, those people that make sure that, you know, ultimately we're getting those people into the marketing and sales funnel, and the appropriate activities that happen next. So I'm, I tend to hang out in my storytelling realm of marketing, but fully well appreciate and know that this is—to your point, this is—marketing is a large effort, and there are a lot of people that contribute to the different moving parts. And it's like a dance making it all come together.Corey: Something I found as well is a complete lack of awareness outside of marketing itself, in the differences between all of the marketing sub-functions. It's the engineering equivalent of lumping mobile developers, and front-end developers, and SREs, and back-end developers, and DBAs, and so on, and so on, and so on, all into the same bucket. Like, “You're just an engineer. Can you fix my printer?” Style stuff.Kim: Yeah.Corey: Marketing is a vast landscape, and you start subdividing it further and further, and there's a reason that it's an entire organization within companies and not a person.Kim: Yeah, for sure. And gosh, some of the people that I've worked with at earlier-stage companies that are capable of covering more than one area, really creative, flexible, nimble fingers, you know, they are quick on their feet and can see that, you know, larger vision and help contribute to that. So, you know, building out messaging is one thing. Thinking about how to get that in front of your audience is another. How to guide your customers through that journey, like, what does the learning process look like, and how do you make sure that you continue to drive those conversations so that somebody can go through that learning process? How are you showing up in the real world at an event? How is your team talking to [media 00:25:23] to analysts?I mean, the list can go on, as you begin to think about the more and more people in the world that you want to touch and interact with, who should know who you are? They should understand who you are, what is your brand, what product have you built, and why it's important to the conversation right now. And so yeah, you start to bring in more team members who specialize in that, who can help you make sure that you're doing that particular function really well. And it's fascinating being inside of a small startup and then watching that operation scale into something larger, and really watching that effort take off. It's pretty cool to see.Corey: Something I'm curious about that you have been rather vocal about is that marketing extends after the product is sold. What do you mean by that?Kim: The way that I think about that is, in my opinion, customers should be a part of the customer journey. 
So, the customer journey is from point zero where this person or team or organization was not aware of who you are to, “Oh, apparently, there's a solution that fits my need,” to, “Oh, and I want this particular brand, I want this tool in my stack, I want to work with these people,” to, they've signed on to become a customer. Even after that point, in my opinion, marketing efforts should continue, in that perhaps that customer came in to solve one or two use cases, but your platform or product can help with many others. And so, making sure that customer is onboarded appropriately so that they're getting the full value out of the product that they should, and they're keeping them educated so that they're aware of other parts of the product that maybe they didn't learn about in their discovery journey, as well as, you know, as your product evolves, new features that are offered.So, as I think about marketing, the existing customer base is also a group of people that I'm always thoughtful about. So, let's say that, you know, if I were to plan out a product release announcement, that is a segment that I would absolutely want to make sure that we include in our strategy. And where are the touchpoints for that? How can we make sure that segment is also understanding and aware of this new announcement, and how it can affect them? And what resources would I provide to them so that they know about it, they will use it well, perhaps become a power user, and you know, very selfishly… sorry to say this out loud, but maybe they'll become a power user and want to come on a webinar with me, or be featured in an article about how much they enjoy using it. But again, just because you've got a customer in-house doesn't mean that journey is finished. There's, as your product continues to grow and evolve, your relationship with that customer should also continue.Corey: There are two schools of thought on taking money from customers. One of them is you get them as much money as you possibly can upfront, once. And there's also the idea of, all right, I want to have an ongoing relationship in which they broaden their relationship in the fullness of time and grow as a customer. Some of our best sources of business have come from folks who either—not just—don't tell their peers at other companies about us, but come back to us when their situation changes, or wind up doing business with us as they land somewhere else in the ecosystem. Like there is, “Yeah, we like working with you,” is all well and good, “And I want to do it again; here's money,” is a different level of endorsement.Kim: Absolutely. And some of the companies that I've worked with, often customers will come in because they have some extreme point of pain, and they want to solve that one thing. They do not have time to think about the dozen different interesting use cases. “I have this thing that I need to solve, and I need to get it done now.” And so, work with them on that, and later on, that opportunity to expand their understanding of what else is possible.And even coach and provide guidance on, especially with some newer products where people are learning new development techniques. “Did you know that this is also possible? Have you considered this?” And so, thinking about that, like, not everybody is just twiddling their thumbs, “Oh, I have free time. 
I'd love to learn a thing.” They're usually coming to you because they have a very painful thing that they need solved, hence why it's great to talk about the value you provide: “I can help you solve that, I can help this pain go away, and help your business do what it needs to get done.” And so, when they're our customer, that next moment is that great, great opportunity to talk about other use cases, other parts of the platform.Corey: I really want to thank you for taking the time to speak with me. If people want to learn more, where's the best place for them to find you?Kim: Right now, I'm mostly active on LinkedIn, and I believe—would you be able to provide a link to that in the show notes?Corey: Oh, we absolutely will put that in the show notes, whether you want us to or not. That's the beautiful part of having show notes for folks.Kim: Awesome. Yeah, I think that's the best place to find me today. Unfortunately, I don't use Twitter as much as I used to. So, I do exist there, but I'm not—Corey: That's such a smart decision.Kim: I know, I feel terrible about it. And I got to say, I miss the community that it was.Corey: Yeah, that's the reason I focus on the newsletter as the primary means of audience building. Because email is older than I am. It will exist after I'm gone—and that's fine—but it means that it's not going to be purchased by some billionaire man-child who's going to ruin the thing. I don't need to worry about algorithmic nonsense in the same way. I can reach out and talk to people with something to say. I'm in that very rarefied space where when a company blocks an email that I send out, they get yelled at by their internal constituencies of, “Hey, where'd that email go? I was looking for it.”Kim: That's awesome.Corey: Thank you so much for taking the time to speak with me. I appreciate it.Kim: Thank you, Corey. It's a pleasure talking with you.Corey: It really is because I—like you—am delightful. Kim Harrison, freelance content marketing strategist, has been my guest today. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry and insulting comment. Don't worry about telling me about it. If your comment was any good, I'm sure I'll find it on my own.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
The Future of Entertaining Developer Content with Jason Lengstorf


Jan 16, 2024 · 33:41


Jason Lengstorf, a developer media producer and host of the show Learn with Jason, joins Corey on this week's episode of Screaming in the Cloud to layout his ideas for creative developer content. Jason explains how devTV can have way more reach than webinars, the lack of inspiration he experiences at conferences these days, and why companies should be focused on hiring specialists before putting DevRels on the payroll. Plus, Corey and Jason discuss walking the line between claiming you're good at everything and not painting yourself into a corner as a DevRel and marketer.About JasonJason Lengstorf helps tech companies connect with developer communities through better media. He advocates for continued learning through collaboration and play and regularly live streams coding with experts on his show, Learn With Jason. He lives in Portland, Oregon.Links Referenced:Learn with Jason: https://www.learnwithjason.dev/Personal Website Links: https://jason.energy/linksTranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Before I went to re:Invent, I snuck out of the house for a couple of days to GitHub Universe. While I was there, I discovered all kinds of fascinating things. A conference that wasn't predicated on being as cheap as humanly possible was one of them, and a company that understood how developer experience might play out was another.And I also got to meet people I don't normally get to cross paths with. My guest today is just one such person. Jason Lengstorf is a developer media producer at Learn with Jason, which I have to assume is named after yourself.Jason: [laugh] It is yes.Corey: Or it's a dramatic mispronunciation on my part, like, no, no, it's ‘Learn with JSON' and it's basically this insane way of doing weird interchange formats, and you just try to sneak it through because you know I happen to be an XML purist.Jason: [laugh] Right, I'm just going to throw you a bunch of YAML today. That's all I want to talk about.Corey: Exactly. It keeps things entertaining, we're going to play with it. So, let's back up a sec. What do you do? Where do you start and where do you stop?Jason: I'm still learning how to answer this question, but I help companies do a better job of speaking to developer audiences. I was an engineer for a really long time, I went from engineering into developer advocacy and developer experience, and as of the last year, I'm doing that independently, with a big focus on the media that companies produce because I think that what used to work isn't working, and that there's a big opportunity ahead of us that I am really excited to help companies move into.Corey: It feels like this has been an ongoing area of focus for an awful lot of folks. How do you successfully engage with developer audiences? And if I'm being direct and more than a little bit cynical, a big part of it is that historically, the ways that a company marketed to folks was obnoxious. 
And for better or worse, when you're talking about highly technical topics and you're being loudly incorrect, a technical audience is not beholden to some of the more common business norms, and will absolutely call you out in the middle of you basically lying to them. “Oh, crap, what do we do now,” seemed to be a large approach. And the answer that a lot of folks seem to have come up with was DevRel, which… I've talked about it before in a bunch of different ways, and my one-liner is generally, “If you work in DevRel, that means you work in marketing, but they're scared to tell you that.”Jason: [laugh] I don't think you're wrong. And you know, the joke that I've made for a long time is that they always say that developers hate marketing. But I don't think developers hate marketing; they just hate the way that your company does it. And—Corey: Oh, wholeheartedly agree. Marketing done right is engaging and fun. A lot of what I do in public is marketing. Like, “Well, that's not true. You're just talking about whatever dumb thing AWS did this week.” “Well, yes, but then you stick around to see what else I say, and I just become sort of synonymous with ‘Oh, yeah, that's the guy that fixes AWS bills.'” That is where our business comes from, believe it or not.Jason: Ri—and I think this was sort of the heart of DevRel is that people understood this. They understood that the best way to get an audience engaged is to have somebody who's part of that audience engage with them because you want to talk to them on the level that they work. You're not—you know, a marketing message from somebody who doesn't understand what you do is almost never going to land. It just doesn't feel relatable. But if you talk to somebody who's done the thing that you do for work, and they can tell you a story that's engaging about the thing that you do for work, you want to hear more. You—you know, you're looking for a community, and I think that DevRel, the aim was to sort of create that community and give people a space to hang out with the added bonus of putting the company that employs that DevRel as an adjacent player to get some of that extra shine from wherever this community is doing well.Corey: It felt like 2019 was peak DevRel, and that's where I started to really see that you had, effectively, a lot of community conferences were taken over by DevRel, and you wound up with DevRel pitching to DevRel. And it became so many talks that were aligned with almost imagined problems. I think one of the challenges of working in DevRel is, if you're not careful, you stop being a practitioner for long enough that you can no longer relate to what the audience is actually dealing with. I can sit here and complain about data center travails that I had back in 2011, but are those still accurate in what's about to be 2024? Probably not.Jason: And I think the other problem that happens too is that when you work in DevRel, you are beholden to the company's goals, if the company employees you. And where I think we got really wrong is companies have to make money. We have to charge customers or the company ceases to exist, so when we go out and tell stories, we're encouraged by the company to focus on the stories that have the highest ROI for the company. And that means that I'm up on stage talking about some, like, far-future, large-scale enterprise thing that very few companies need, but most of the paying customers of my company would need. 
And it becomes less relatable, and I think that leads to some of the collapse that we saw that you mentioned, where dev events feel less like they're for devs and more like they're partner events where DevRel is talking to other DevRel is trying to get opportunities to schmooze partners, and grow our partner pipeline.Corey: That's a big part of it, where it seems, on some level, that so much of what DevRel does, when I see them talking about DevRel, it doesn't get around to DevRel is. Instead, it gets stuck in the weeds of what DevRel is not“. We are not shills for our employer.” Okay, I believe you, but also, I don't ever see you saying anything that directly contravenes what your employer does. Now, let me be clear: neither do I, but I'm also in a position where I can control what my employer does because I have the control to move in directions that align with my beliefs.I'm not saying that it's impossible to be authentic and true to yourself if you work for an employer, but I have seen a couple of egregious examples of people changing companies and then their position on topics they've previously been very vocal on pulled an entire one-eighty, where it's… it really left a bad taste in my mouth.Jason: Yeah. And I think that's sort of the trick of being a career DevRel is you have to sort of walk this line of realizing that a DevRel career is probably short at every company. Because if you're going to go there and be the face of a company, and you're not the owner of that company, they're almost inevitably going to start moving in a direction as business develops, that's not going to line up with your core values. And you can either decide, like, okay that's fine, they pay me well enough, I'm just going to suck it up and do this thing that I don't care about that much, or you have to leave. And so, if you're being honest with yourself, and you know that you're probably going to spend between 12 and 24 months at any given company as a DevRel, which—by the history I'm seeing, that seems to be pretty accurate—you need to be positioning and talking about things in a way that isn't painting you into that corner where you have to completely about-face, if you switch companies. But that also works against your goals as a DevRel at the company. So, it's—I think we've made some big mistakes in the DevRel industry, but I will pause to take a breath here [laugh].Corey: No, no, it's fine. Like, it's weird that I view a lot of what I do is being very similar to DevRel, but I would never call myself that. And part of it is because, for better or worse, it is not a title that tends to engender a level of respect from business owners, decision makers, et cetera because it is such a mixed bag. You have people who have been strategic advisors across the board becoming developer advocates. That's great.You also see people six months out of a boot camp who have decided don't like writing code very much, so they're going to just pivot to talking about writing code, and invariably, they believe, more or less, whatever their employer tells them because they don't have the history and the gravitas to say, “Wait a minute, that sounds like horse pucky to me.” And it's a very broad continuum. I just don't like blending in.Jason: Where I think we got a lot of this wrong is that we never did define what DevRel is. As you say, we mostly define what DevRel is not, and that puts us in a weird position where companies see other companies do DevRel, and they mostly pay attention to the ones who do DevRel really well. 
And they or their investors or other companies say, “You need a great DevRel program. This is the secret to growth.” Because we look at companies that have done it effectively, and we see their growth, and we say, “Clearly this has a strong correlation. We should invest in this.” But they don't—they haven't done it themselves. They don't understand which part of it is that works, so they just say, “We're hiring for DevRel.” The job description is nine different careers in a trench coat. And the people applying—Corey: Oh, absolutely. It's nine different things and people wind up subdividing into it, like, “I'm an events planner. I'm not a content writer.”Jason: Right.Corey: Okay, great, but then why not bill yourself as a con—as an events planner, and not have to wear the DevRel cloak?Jason: Exactly. And this is sort of what I've seen is that when you put up a DevRel job, they list everything, and then when you apply for a DevRel job, you also don't want to paint yourself into a corner and say, “My specialty is content,” or, “My specialty is public speaking,” or whatever it is. And therefore you say, “I do DevRel,” to give yourself more latitude as an employee. Which obviously I want to keep optionality anywhere I go. I would like to be able to evolve without being painted into a small box of, like, this is all I'm allowed to do, but it does put us in this really precarious position.And what I've noticed a lot of companies do is they hire DevRel—undefined, poorly written job description, poor understanding of the field. They get a DevRel who has a completely different understanding of what DevRel is compared to the people with the role open. Both of them think they're doing DevRel, they completely disagree on what those fundamentals are, and it leads to a mismatch, to burnout, to frustration, to, you know, this high turnover rate in this field. And everybody then starts to say, well, “DevRel is the problem.” But really, the problem is that we're not—we're defining a category, not a job, and I think that's the part that we really screwed up as an industry.Corey: Yeah. I wish there were a better way around there, but I don't know what that might be. Because it requires getting a bunch of people to change some cornerstone of what's become their identity.Jason: This is the part where I—this is probably my spiciest take, but I think that DevRel is marketing, but it is a different kind of marketing. And so, in a perfect world—like, where things start to fall apart is you try to slot DevRel into engineering, or you try to slot it into marketing, as a team on these broader organizations, but the challenge then becomes, if you have DevRel, in marketing, it will inevitably push more toward marketing goals, enterprise goals, top-of-funnel, qualified leads, et cetera. If you put them into engineering, then they have more engineering goals. They want to do developer experience reviews. They want to get out there and do demos. You know, it's much more engineering-focused—or if you're doing it right, is much more engineering-focused.But the best DevRel teams are doing both of those with a really good measure, and really clear metrics that don't line up with engineering or marketing. So, in a perfect world, you would just have an enterprise marketing team, and a developer marketing team, and that developer marketing team would be an organization that is DevRel today. 
And you would hire specialists—event planners, great speakers, great demo writers, probably put your docs team in there—and treat it as an actual responsibility that requires a larger team than just three or four ex-developers who are now speaking at conferences.Corey: There were massive layoffs across DevRel when the current macroeconomic correction hit, and I'd been worried about it for years in advance because—Jason: Mm-hm.Corey: So, many of these folks spent so much time talking about how they were not marketing, they were absolutely not involved in that. But marketing is the only department that really knows how to describe the value of these sorts of things without having hard metrics tied to it. DevRel spent a lot of time talking about how every metric used to measure them was somehow wrong, and if you took it to its logical conclusion, you would basically give these people a bunch of money—because they are expensive—and about that much money again in annual budget to travel more or less anywhere they want to go, and every time something good happened, as a result, to the company, they had some hand in it nebulously, but you could never do anything to measure their performance, so just trust that they're doing a good job. This is tremendously untenable.Jason: Mm-hm. Yeah, I think when I was running the developer experience org at Netlify, most of my meetings were justifying the existence of the team because there weren't good metrics. You can't put sales qualified leads on DevRel. It doesn't make any sense because there are too many links in the chain after DevRel opens the door, where somebody has to go from, ‘I'm aware of this company' to ‘I've interacted with the landing page' to ‘I've actually signed up for something' to ‘now I'm a customer,' before you can get them to a lead. And so, to have DevRel take credit is actually removing credit from the marketing team.And similarly, if somebody goes through onboarding, a lot of that onboarding can be guided by DevRel. The APIs that new developers interface with can be—the feedback can come from DevRel, but ultimately, the engineering team did that work the product team did that work. So, DevRel is this very interesting thing. I've described it as a turbocharger, where if you put it on an engine that runs well, you get better performance out of that engine. If you just plop one on the table, not a lot happens.Corey: Yeah, it's a good way of putting it. I see very early stage startups looking to hire a developer advocate or DevRel person in their seed stage or Series A, and it's… there's something else you're looking for here. Hire that instead. You're putting the cart before the horse.Jason: What a lot of people saw is they saw—what they're thinking of as DevRel is what they saw from very public founders. And when you get a company that's got this very public-facing, very engaging, charismatic founder, that's what DevRel feels like. It is, you know, this is the face of the company, we're showing you what we do on the inside, we're exposing our process, we're sharing the behind the scenes, and proving to you that we really are great engineers, and we care a lot. Look at all this cool stuff we're doing. 
And that founder up on stage was, I think, the original DevRel.That's what we used to love about conferences is we would go there and we would see somebody showing this thing they invented, or this new product they had built, and it felt so cool because it was these inspirational moments of watching somebody brilliant do something brilliant. And you got to follow along for that journey. And then we try to—Corey: Yeah I mean, that's natural, but you see booths at conferences, the small company startup booths, a lot of times you'll be able to talk to the founders directly. As the booths get bigger, your likelihood of being able to spend time talking to anyone who's materially involved in the strategic direction of that company gets smaller and smaller. Like, the CEO of GitHub isn't going to be sitting around at the GitHub booth at re:Invent. They're going to be, you know, talking to other folks—if they're there—and going to meetings and whatnot. And then you wind up with this larger and larger company. It's a sign of success, truly, but it also means that you've lost something along the way.Jason: Yeah, I think, you know, it's the perils of scale. And I think that when you start looking at the function of DevRel, it should sort of be looked at as, like, when we can't handle this anymore by ourselves, we should look for a specialty the same way that you do for any other function inside of a company. You know, it wouldn't make sense on day one of a startup to hire a reliability engineer. You're not at the point where that makes sense. It's a very expensive person to hire, and you don't have enough product or community or load to justify that role yet. And hopefully, you will.And I think DevRel is sort of the same way. Like, when you first start out your company, your DevRel should be the founding team. It should be your engineers, sharing the things that they're building so that the community can see the brilliance of your engineering team, sharing with the community, obviously, being invested in that community. And when you get big enough that those folks can no longer manage that and their day-to-day work, great, then look into adding specialists. But I think you're right that it's cart before the horse to, you know, make a DevRel your day-one hire. You just don't have enough yet.Corey: Yeah, I wish that there were an easy way to skin the cat. I'm not sure there is. I think instead we wind up with people doing what they think is going to work. But I don't know what the truth is.Jason: Mmm.Corey: At least. That's where I land on it.Jason: [laugh] Yeah, I mean, every company is unique, and every experience is going to be unique, so I think to say, “Do it exactly like this,” is—that's got a lot of survivorship bias, and do as I say—but at the same time, I do think there's some universal truths. Like, it doesn't really make sense to hire a specialist before you've proven that specialty is the secret sauce of your business. And I think you grow when it's time to grow, not just in case. I think companies that over-hire end up doing some pretty painful layoffs down the road. And, you know, obviously, there's an opposite end of that spectrum where you can grow too slowly and bury your team and burn everybody out, but I think, you know—we, [laugh] leading into the pandemic, I guess, we had a lot of free money, and I think people were thinking, let's go build an empire and we'll grow into that empire. 
And I think that is a lot of why we're seeing this really painful downsizing right now, is companies hired just in case and then realized that actually, that in case didn't come to be.Corey: What is the future of this look like? Easy enough to look back and say, well, that didn't work? Well, sure. What is the future?Jason: The playbook that we saw before—in, like, 2019 and before—was very event-driven, very, like, webinar-driven. And as we went into 2020, and people were at home, we couldn't travel, we got real sick of Zoom calls. We don't want to get on another video call again. And that led to that playbook not working anymore. You know, I don't want to get on a webinar with a company. I don't want to go travel to a company event, you know, or at least not very many of them. I want to go see the friends I haven't seen in three years.So, travel priorities changed, video call fatigue is huge, so we need something that people want to do, that is interesting, and that is, you know, it's worth making in its own right, so that people will engage with it, and then you work in the company goals as an incidental. Not as a minor incidental, but you know, it's got to be part of the story; it can't be the purpose. People won't sign up for a webinar willingly these days, I don't think, unless they have exactly the problem that your webinar purports to solve.Corey: And even if they do, it becomes a different story.Jason: Right.Corey: It's [high buying 00:19:03] signal, but people are constantly besieged by requests for attention. This is complicated by what I've seen over the last year. When marketing budgets get—cut, arguably too much, but okay—you see now that there's this follow-on approach where, okay, what are we going to cut? And people cut things that in many cases work, but are harder to attribute success to. Events, for example, are doing very well because you have someone show up at your booth, you scan their badge. Three weeks later, someone from that company winds up signing up for a trial or whatnot, and ah, I can connect those dots.Whereas you advertise on I don't know, a podcast as a hypothetical example that I'm pulling out of what's right in front of me, and someone listening to this and hearing a message from a sponsor, they might be doing something else. They'll be driving, washing dishes, et cetera, and at best they'll think, “Okay, I should Google that when I get back to a computer.” And they start hearing about it a few times, and, “Oh. Okay, now it's time for me to go and start paying serious attention to this because that sounds like it aligns with a problem I have.” They're not going to remember where they initially heard it.They're going to come in off of a Google search, so it sounds like it's all SEO's benefit that this is working, and it is impossible to attribute. I heard some marketer once say that 50% of your marketing budget is wasted, but you'll go bankrupt trying to figure out which half. It all ties together. But I can definitely see why people bias for things that are more easily attributed to the metric you care about.Jason: Yes. And I think that this is where I see the biggest opportunity because I think that we have to embrace that marketing signal is directional, not directly attributable. And if you have a focus campaign, you can see your deviation from baseline signups, and general awareness, and all of the things that you want to be true, but you have to be measuring that thing, right? 
So, if we launch a campaign where we're going to do some video ads, or we're going to do some other kind of awareness thing, the goal is brand awareness, and you measure that through, like, does your name get mentioned on social media? Do you see a deviation from baseline signups where it is trending upward?And each of those things is signal that the thing you did worked. Can you directly attribute it? No, but I think a functional team can—you know, we did this at Netlify all the time where we would go and look: what were the efforts that were made, what were the ones that got discussion on different social media platforms, and what was the change from baseline? And we saw certain things always drove a non-trivial deviation from baseline in the right direction. And that's one of the reasons that I think the future of this is going to be around how do you go broader with your reach?And my big idea—to nutshell it—is, like, dev TV. I think that developers want to see the things that they're interested in, but they want it to be more interesting than a straight webinar. They want to see other developers using tools and getting a sense of what's possible in an entertaining way. Like, they want stories, they don't want straight demos. So, my thinking here is, let's take this and steer into it.Like, we know that developers love when you put a documentary together. We saw the Vue documentary, and the React documentary, and the GraphQL documentary, and the Kubernetes documentary coming out of the Honeypot team, and they've got hundreds of thousands, and in some cases, millions of views because developers really want to see good stories about us, about our community. So, why not give the dev community a Great British Bake Off, but for web devs? Why not create an Anthony Bourdain Parts Unknown-style travel show that highlights various web communities? Why not get out there and make reality competition shows and little docuseries that help us highlight all the things that we're learning and sharing and building?Every single one of those is going to involve developers talking about the tools they use, talking about the problems they solve, talking about what they were doing before and how they've made it better. That's exactly what a webinar is, that's what a conference talk is, but instead of getting a small audience at a conference, or you know, 15 to 30 people signing up for your webinar, now we've got the potential for hundreds of thousands or even millions of people to watch this thing because it's fun to watch. And then they become aware of the companies involved because it's presented by the company; they see the thing get used or talked about by developers in their community, I think there's a lot of magic and potential in that, and we've seen it work in other verticals.Corey: And part of the problem comes down as well to the idea that, okay, you're going to reach some people in person at events, but the majority of engineers are not going to be at any event or—Jason: Right.Corey: Any event at all, for that matter. They just don't go to events for a variety of excellent reasons. How do you reach out to them? Video can work, but I always find that requires a bit of a different skill than, I don't know, podcasting or writing a newsletter. So, many times, it feels like it's, oh, and now you're just going to basically stare at the camera, maybe with someone else, and it looks like the Zoom call to which the viewer is not invited.Jason: Right.Corey: They get enough of that. 
There has to be something else.Jason: And I think this is where the new skill set, I think, is going to come in. It exists in other places. We see this happen in a lot of other industries, where they have in-house production teams, they're doing collaborations with actors and athletes and bringing people in to make really entertaining stories that drive underlying narratives. I mean, there's the ones that are really obvious, like, the Nikes of the world, but then there are far less obvious examples.Like, there was this show called Making It. It was… Nick Offerman and Amy Poehler were the hosts. It was the same format as the Great British Bake Off but around DIY and crafting. And one of the permanent judges was the Etsy trend expert, right? And so, every single episode, as they're judging this, the Etsy trend expert is telling all of these crafters and contestants, “You know, what you built here is always a top seller on Etsy. This is such a good idea, it's so well executed, and people love this stuff. It flies off the shelves in Etsy stores.”Every single episode, just perfectly natural product placement, where a celebrity that you know—Nick Offerman and Amy Poehler—are up there, lending—like, you want to see them. They're so funny and engaging, and then you've got the credibility of Etsy's trend expert telling the contestants of the show, “If you do DIY and crafting, you can make a great living on Etsy. Here are the things that will make that possible.” It's such subtle, but brilliant product placement throughout the entire thing. We can do that. Like, we have the money, we just spend it in weird places.And I think that as an industry, if we start getting more creative about this and thinking about different ways we can apply these marketing dollars that we're currently dumping into very expensive partner dinners or billboards or getting, you know, custom swag or funding yet another $150,000 conference sponsorship, we could make a series of a TV show for the same cost as throwing one community event, and we would reach a significantly larger group.Corey: Yeah. Now, there is the other side of it, too, where Lord knows I found this one out the fun way, that creating content requires significant effort and—Jason: Yes.Corey: Focus. And, “Oh, it's a five-minute video. Great, that could take a day or three to wind up putting together, done right.” One of the hardest weeks of my year is putting together a bunch of five-minute videos throughout the course of re:Invent. So much that is done in advance that is basically breaking the backs of the editing team, who are phenomenal, but it still turns into more than that, where you still have this other piece of it of the actual content creation part.And you can't spend all your time on that because pretty soon I feel like you become a talking head who doesn't really do the things that you are talking to the world about. And that content gets pretty easy to see when you start looking at, okay, what did someone actually do? Oh, they were a developer for three years, and they spent the next seven complaining about development, and how everyone is—Jason: [laugh].Corey: Doing it wrong on YouTube. Hmm… it starts to get a little, how accurate is this really? So, for me, it was always critical that I still be hands-on with things that I'm talking about because otherwise I become a disaster.Jason: And I agree. 
One of the things that my predecessor at Netlify, Sarah Drasner, put in place was a, what she called an exchange program, where we would rotate the DevRel team onto product, and we rotate product onto the DevRel team. And it was a way of keeping the developer experience engineers actually engineers. They would work on the product, they didn't do any DevRel work, they were exclusively focused on doing actual engineering work inside our product to just help keep their skills sharp, keep them up to date on what's going on, build more empathy for the engineers that we talk to every day, build more empathy for our team instead of us—you know, you never want to hear a DevRel throw the engineering team under the bus for not shipping a feature everybody wants.So, these sorts of things are really important, and they're hard to do because we had to—you know, that's a lot of negotiation to say, “Hey, can we take one of your engineers for a quarter, and we'll give you one of our engineers for a quarter, and you got to trust us that's going to work out in your favor.” [laugh] Right? Like, there's a lot that goes into this to make that sort of stuff possible. But I absolutely agree. I don't think you get to make this type of content if you've fully stepped out of engineering. You have to keep it part of your practice.Corey: There's no way around it. You have to be hands-on. I think that's the right way to do it, otherwise, it just leads to, frankly, disaster. Very often, you'll see people who are, like, “Oh, they're great in the DevRel space. What do they do?” And they go to two or three conferences a year, and they have a blog post or so. It's like, okay, what are they doing the rest of that time?Sometimes the answer is fighting internal political fires. Other times it's building things and learning these things and figuring out where they stand. There are some people, I don't want to name names, although an easy one is Kelsey Hightower, who has since really left the stage, that he's retired, but when he went up on stage and said something—despite the fact that he worked at Google—it was eminently clear that he believed in what he was saying, or he would not say it.Jason: Right.Corey: He was someone who was very clearly aware of the technology about which he was speaking. And that was great. I wish that it were not such a standout moment to see him speak and talk about that. But unfortunately, he kind of is. Not as many people do that as well as we'd like.Jason: Agreed. I think it was always a treat to see Kelsey speak. And there are several others that I can think of in the community who, when they get on stage, you want to be in that audience, and you want to sit down and listen. And then there are a lot of others who when they get on stage, it's like that this book could have been a blog post, or this—you know, this could have been an email, that kind of thing. Like you could have sent me this repo because all you did was walk through this repo line-by-line, or something that—it doesn't feel like it came from them; it feels like it's being communicated by them.And I think that's, again, like, when I criticize conferences, a lot of my criticism comes from the fact that, coming up, I feel like every speaker that I saw on stage—and this is maybe just memory… playing favorites for me, but I feel like I saw a lot of people on stage who were genuinely passionate about what they were creating, and they were genuinely putting something new into the world every time they got on stage. 
And I have noticed that I feel less and less like that. Also, I feel like events have gotten less and less likely to put somebody on stage unless they've got a big name DevRel title. Like, you have to work at a company that somebody's heard of because they're all trying to get that draw because attendance is going down. And—Corey: Right. It's a—like, having run some conferences myself, the trick is, is you definitely want some ringers in there. People you know will do well, but you also need to give space for new voices to arise. And sometimes it's a—it always bugs me when it seems like, oh, they're here because their company is a big sponsor. Of course, they have the keynote. Other times, it's a… like, hate the actual shill talks, which I don't see as much, which I'm thankful for; I'd stop going to those conferences, but jeez.Jason: Yeah, and I think it's definitely one of those, like, this is a thing that we can choose to correct. And I have a suspicion that this is a pendulum not a—not, like, the denouement of—is that the right—how do you say that word? De-NOW-ment? De-NEW-ment? Whatever.Corey: Denouement is my understanding, but that might be the French acc—Jason: Oh, me just—Corey: The French element.Jason: —absolutely butchering that. Yeah [laugh]. I don't think this is the end of conferences, like we're seeing them taper into oblivion. I think this is a lull. I think that we're going to realize that we want to—we really do love being in a place with other developers. I want to do that. I love that.But we need to get back to why we were excited to go to conferences in the first place, which was this sharing of knowledge and inspiration, where you would go see people who were literally moving the world forward in development, and creating new things so that you would walk away with insider info, you had just seen the new thing, up close and personal, had those conversations, and you went back so jazzed to build something new. I feel like these days, I feel more like I went and watched a handful of product demos, and now I'm really just waiting to the hallway track, which is the only, like, actually interesting part at a lot of events these days.Corey: I really want to thank you for taking the time to speak with me. If people want to learn more, where's the best place for them to find you?Jason: Most of what I share is on learnwithjason.dev, or if you want a big list of links, I have jason.energy/links, which has a whole bunch of fun stuff for you to find.Corey: Awesome. And we will, of course, include links to that in the show notes. Thank you so much for taking the time to speak with me. I really appreciate it.Jason: Yeah, thanks so much for having me. This was a blast.Corey: Jason Lengstorf, developer media producer at Learn with Jason. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that will no doubt become the basis for somebody's conference talk.Jason: [laugh].Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Championing CDK While Accepting the Limits of AWS with Matthew Bonig

Screaming in the Cloud

Play Episode Listen Later Jan 11, 2024 43:32


Matthew Bonig, Chief Cloud Architect at Defiance Digital, joins Corey on Screaming in the Cloud to discuss his experiences in CDK, why developers can't be solely reliant on AI or coding tools to fill in the blanks, and his biggest grievances with AWS. Matthew gives an in-depth look at how and why CDK has been so influential for him, as well as the positive work that Defiance Digital is doing as a managed service provider. Corey and Matthew debate the need for AWS to focus on innovating instead of simply surviving off its existing customer base.

About Matthew

Chief Cloud Architect at Defiance Digital. AWS DevTools Hero, co-author of The CDK Book, author of the Advanced CDK Course. All things CDK and Star Trek.

Links Referenced:

CDK Book: https://www.thecdkbook.com/
cdk.dev: https://cdk.dev
Twitter: https://twitter.com/mattbonig
LinkedIn: https://www.linkedin.com/in/matthewbonig/
Personal website: https://matthewbonig.com
duckbillgroup.com: https://duckbillgroup.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. And I'm back with my first recording that was conducted post-re:Invent and all of its attendant glory and nonsense; we might talk a little bit about what happened at the show. But my guest today is the Chief Cloud Architect at Defiance Digital, Matthew Bonig. Matthew, thank you for joining me.Matthew: Thank you, Corey. Thanks for having me today.Corey: So, you are deep into the CDK. You're one of the AWS Dev Tools Heroes, and you're the co-author of the CDK Book; you've done a lot, really. You have a course now for Advanced CDK work. Honestly, at this point, it starts to feel like when I say the CDK is a cult, you're one of the cult leaders, or at least very high up in the cult.Matthew: [laugh] Yes, it was something that I discovered—Corey: Your robe has a fringe on it.Matthew: Yeah, yeah. I discovered this at re:Invent, and it kind of hit me a little surprised that I got called out by a couple people as being the CDK guy. And I didn't realize that I'd hit that status yet, so I got to get myself a hat, and a cloak, and maybe some fun stuff to wear.Corey: For me, what I saw on the—it was in the run-up to re:Invent, but the big CDK-sized announcement was the fact that the new version of Amplify now is much more closely tied to the CDK than it was in previous incarnations, which is great. It sort of solves the problem, how do I build a thing through a variety of different tools? Great, and how do I manage that thing programmatically? It seems, according to what it says on the tin, that it narrows that gap. Of course, here in reality, I haven't had time to pick anything like that up, and I won't for months, just because so much comes out all at the same time. What happened in the CDK world? What did I miss? What's exciting?Matthew: Well, you know, the CDK world has been, I've said, fairly mature for a while now. You know, fundamentally the way the CDK works and the functionality within it hasn't changed drastically. Even when 2.0 came out a couple of years ago, there wasn't a drastic fundamental change in the way that the API worked.
Really, the efforts that we've been seeing for the last year or so, and especially the last few months, are trying to button up some functionality, hit some of those edge cases that have been rough for some users, and ultimately just continue to fill out things like L2 constructs and maybe try to build out some L3s.I think what they're doing with Amplify is a good sign that they are trying to, sort of, reach across the aisle and work with other frameworks and work with other systems within AWS to make the experience better, which shows their commitment to the CDK and to making it really the first-class citizen for doing IaC work in AWS.Corey: I think that that is a—that's a long road, and it's also a lot of work under the hood that's not easily appreciated. You've remarked at one point that my talk at the CDK Community Day was illuminating, if nothing else, if for no other reason than I dressed up as a legitimate actual cultist in a robe to give the talk—Matthew: Yeah. Loved it.Corey: Because I have deep-seated emotional problems. But it was fun. It talked a bit about my journey with it, where originally I viewed it as, more or less, this thing that was not for me. And a large part of that is because I come from a world of sysadmin ops types, where, “I don't really know how to code,” was sort of my approach to this. Because I was reaff—I had that reaffirmed every time I talked to a developer. Like, “You call this a bash script? It's terrible.” And sure, but it worked, and it tied into a different knowledge set.Then, when I encountered the CDK for the first time, I tried to use it in Python, which at the time was not really well-supported and led to unfortunate outcomes—I do not know if that's still the case—what got me into it, in seriousness, was when I tried it a few months later with TypeScript and that started to work a little bit more clearly, with the caveat that I did not know JavaScript, I did not know TypeScript, I had to learn it as I went in service to the CDK. And it works really well insofar as it scratched an itch that I had. There's a whole class of problems that I don't have to deal with, which include getting someone who isn't me involved in some of that codebase, or working in environments where you have either a monorepo or a crap ton of tiny repos scattered everywhere and collaborating with other people. I cannot speak authoritatively to any of that. I will say it's incredibly annoying when I'm trying to update something written in the CDK that I haven't touched in a year-and-a-half, and the first thing I have to do is upgrade a whole bunch of dependencies, clear half a day just to get the warnings to clear before I can go ahead and deploy the things, let alone implement the tiny change I'm logging into the thing to fix.Matthew: Oh, yeah, yes. Yeah, the dependency updates are probably one of the most infuriating things about any Node.js system, and I don't think that I've ever run across any application, project, or framework, anything in which doing dependency upgrades wasn't a nightmare. And I think it's because the Node.js community, more so than I've seen any other, doesn't care about semantic versioning. And unfortunately, the CDK doesn't technically care about semantic versioning, either, which makes it very tricky to do upgrades properly.Corey: There also seems to be the additional problem layered on top, which is all of the various documentation sources that I stumble upon, the official documentation, not terrific at giving real-world use cases.
It feels like it's trying to read the dictionary to learn how English works, which is not really its purpose. So, I find a bunch of blog posts, and all of them tend to approach this ecosystem slightly differently. One talks about using NPM. Another talks about Yarn.If you're doing anything that involves a web app, as seems to be increasingly common, some will say, “Oh, use webpack,” others will recommend using Vite. There's the whole JavaScript framework wars, and the only unifying best practice seems to be, “Oh, there's another way to do it that you should be using instead of the way you currently are on.” And if you listen to that, you wind up in hell.Matthew: Oh, horribly so. Yeah, the split in the ecosystem between NPM and Yarn, I think, has been incredibly detrimental to the overall comfort level in Node.js development. You know, I was an NPM guy for many, many years, and then actually, the CDK got me more using Yarn, simply because Yarn handles cross-library dependency resolution a bit differently from NPM. And I just ran into fewer errors and fewer problems if I used Yarn along the way.But NPM has come a long way since then. Now, there's also PNPM, which is good if you're using monorepos. But then if you're going to be using monorepos, there's another 15 tools out there that you can use for those sorts of things. And ultimately, I think it's going to be what is the thing that causes you the least amount of problems when dealing with them. And every single dependency issue that I've ever run into when upgrading any project, whether it be a web application, a back-end API, or the CDK, it's always unique enough that there isn't a one-size-fits-all answer to solving those problems.Corey: The most recent experience I had with the CDK—since you know, you're basically Mr. CDK at this point, whether you want to be or not, and this is what I do, instead of filing issues anywhere or asking for help, I drag people onto this show, and then basically assault them with my weird use cases—I'm in the process of building something out in the service of shitposting, because that is my nature, and I decided, oh, there's a new thing called the Dynamo table v2—Matthew: Yes.Corey: Which is great. I looked into it. The big difference is that it addresses it from the beginning as a global table, so you have optionality. Cool. Trying to migrate something that is existing from a Dynamo table to a Dynamo v2 table started throwing CloudFormation issues, so my answer was—this was pre-production—just tear down the stack and rebuild it. That feels like that would be a problem if this had been something that was actually full of data at this point.Matthew: There's a couple of ways that you could maybe go about it. Now, this is a very special case that you mentioned because you're talking about fundamentally changing the CloudFormation resource that you are creating, so of course, the CDK being an abstraction layer over top of CloudFormation and the Dynamo table v2 using the global table resource rather than just the table resource. If you had a case where you have to do that migration—and I've actually got a client right now who's very much looking to do that—the process would probably be to orphan the existing table so that you can retain the data and then use an import routine with CloudFormation to bring that in under the new resource.
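A minimal sketch of the orphan-and-import flow Matthew is describing here, in CDK TypeScript. The table name, key schema, and stack name are hypothetical, and it assumes the cdk import command can adopt the AWS::DynamoDB::GlobalTable resource type, which is worth verifying before trying this with real data:

import { RemovalPolicy, Stack, StackProps } from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';
import { Construct } from 'constructs';

export class DataStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Step 1: mark the existing table so it survives being removed from the stack.
    // Deploy this change first, then delete the construct and deploy again to orphan
    // the table (CloudFormation forgets about it, the data stays put).
    new dynamodb.Table(this, 'OrdersTable', {
      tableName: 'orders', // hypothetical name and key schema
      partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
      removalPolicy: RemovalPolicy.RETAIN, // orphan on removal instead of deleting
    });

    // Step 2 (a later commit): declare the replacement as TableV2 with the same
    // name and schema, then run "cdk import" instead of a normal deploy so the
    // orphaned table is adopted by the new resource rather than recreated.
    // new dynamodb.TableV2(this, 'OrdersTableV2', {
    //   tableName: 'orders',
    //   partitionKey: { name: 'pk', type: dynamodb.AttributeType.STRING },
    // });
  }
}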
I haven't tried it yet—Corey: In this case, the table was empty, so it was easy enough to just destroy and then recreate, but it meant that I also had to tear down and recreate everything else in the stack as well, including CloudFront distributions, ACM certificates, so it took 20 minutes.Matthew: Yes. And that is one of the reasons why I often will stick any sort of stateful resource into its own stack so that if I have to go through an operation like this, I know that I'm not going to be modifying things that are very painful to drop and recreate, like, CloudFront distributions, which can take half an hour or more to re-initialize.Corey: Yeah. So, that was fun. The problem got sorted out, but it was still a bit challenging. I feel like at some level, the CDK is hobbled by the fact that under the hood, it really is just CloudFormation once all is said and done, and CloudFormation has never been the speediest thing. I didn't understand that until I started playing with Terraform and I saw how much more quickly it could provision things just by calling the service APIs directly. It sort of raises the question of what the hell the CloudFormation service is doing when it takes five times longer to do effectively the same thing.Matthew: Yeah, and the big thing that I appreciate about Terraform versus CloudFormation—speed being kind of the big win—is the fact that Terraform doesn't obfuscate or hide state from you. If you absolutely need to, you can go in and change that state that relates your Terraform definitions to the back-end resources. You can't do that with CloudFormation. So, CloudFormation did release, a few years ago, that import routine, and that was pretty good—not great, but pretty good; it's getting better all the time—whereas this was a completely unneeded feature with Terraform because if it came down to the point where you already had a resource, and you just want to tie it to your IaC, you just edit a state file. And they've got their import routines and tie-in routines as well, but having that underlying state exposed was a big advantage, in my mind, to Terraform that I missed going to CloudFormation, and it still to this day frustrates me that I can't do that underlying state change.Corey: It becomes painful and challenging, for better or worse.Matthew: Yep.Corey: But yeah, that was what I ran into. Things have improved, though. When I google various topics, I find that the v2 documentation comes up instead of the v1. That was maddening for a little while. I find that there are still things that annoy me, but they become less all the time, partially because I feel like I'm getting better at knowing how to search for them, and also because I think I'm becoming broken in the right ways that the CDK tends to expect.Matthew: Oh, like how?Corey: Oh, easy example here: I was recently trying to get something set up and running, and I don't know why this is the case, I don't know if it holds true in other programming languages, but I'm getting more used to the fact that there are two files in TypeScript-land that run a project. One is generally small and in a side directory that no one cares about, I think it's in a lib or the bin subdirectory. I don't remember which because I don't care. And then there are things you have to do within the other equivalent that basically reference each other. And I've gotten better at understanding that those aren't one file, for example.
Though they sure seem to be a lot alike in all the demos, but it's not how the init process, when you're starting something new, spins up.Matthew: Yeah, this is the hell of TypeScript, the fact that Node.js, as a runtime, cannot process TypeScript files, so you always have to pass them through a compiler. This is actually one of the things that I like about using Projen for all of my projects instead of using CDK init to start them is that those baseline configurations handle the TypeScript nature of the runtime—or I should say, the anti-TypeScript nature of the runtime a little bit better, and you run into fewer problems. You never have to worry about necessarily doing build routines or other things because they actually use the ts-node runtime to handle your CDK files instead of the node runtime. And I think that's a big benefit in terms of the developer experience. It just makes it so I generally never have to care about those JavaScript files that get compiled from TypeScript. In the, you know, two years or so I've been using Projen, I never have to worry about a build routine to turn that into JavaScript. And that makes the developer experience significantly better.Corey: Yeah, I still miss an awful lot of things that I feel like I should be understanding. I've never touched Projen, for example. It's on my backlog of things to look into.Matthew: Highly recommend it.Corey: Yeah, I also am still in that area of… my TypeScript knowledge has not yet gotten to a point where I see the value of it. It feels like I've spent far more time fighting with the arbitrary restrictions that are TypeScript than it has saved me from typing errors in anything that I've built. I believe it has to come back around at some point of familiarity with the language, but I'm not there yet.Matthew: Got you. So, Python developer before this?Corey: Ish. Mostly brute force and enthusiasm, but yeah, Python.Matthew: Python, and I think you said bash scripting and other things that have no inherent typing built into them.Corey: Right.Matthew: Yeah, that is a problem, I think… that I thankfully avoided. I was an application developer for many years. My background and my experience has always been around strongly typed languages, so when it came to adopting the CDK, everything felt very natural to me. But as I've worked with people over the years, both internally at Defiance as well as people in the community that don't have a background in that, I've been exposed to how problematic TypeScript as a language truly can be for someone who has never had this experience of, I've got this thing and it has a well-defined shape to it, and if I don't respect that, then I'm going to bang my head against these weird errors that are hard to comprehend and hard to grok way more than it feels like I'm getting value from it.Corey: There's also a lack of understanding around how to structure projects, in my case, where all right, I have a front-end and I have a back-end. Is this all within the context of the CDK project? And this, of course, also presupposes that everything I'm doing is effectively greenfield, in which case, great, do I use the front-end wizard tutorial thing that I'm following, and how does that integrate when I'm using the CDK to deploy it somewhere, and so on and so forth.
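For reference, the two files Corey is describing are roughly what cdk init app --language typescript generates: a tiny entry point under bin/ that only instantiates the app, and the stack definitions under lib/ that it imports. A sketch, assuming a project named my-app; the actual file and class names vary with the project name:

// bin/my-app.ts — the small file in the side directory; it only wires up the app.
import * as cdk from 'aws-cdk-lib';
import { MyAppStack } from '../lib/my-app-stack';

const app = new cdk.App();
new MyAppStack(app, 'MyAppStack', {
  env: { account: process.env.CDK_DEFAULT_ACCOUNT, region: process.env.CDK_DEFAULT_REGION },
});

// lib/my-app-stack.ts — where the actual resources get defined; bin/ references lib/,
// not the other way around.
import { Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';

export class MyAppStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    // constructs for the front-end, back-end, and everything else go here
  }
}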
It's stuff that makes sense once you have angry and loud enough opinions, but I don't yet.Matthew: Yeah, so the key thing that I tell people about project structure—because it does come up a lot—is that ultimately, the CDK itself doesn't really care how you structure things. So, how you structure, where you put certain files, how you organize them, is your personal preference. Now, there are some exceptions to that. When it comes to things like Lambda functions that you're building or Docker files, there are probably some better practices you can go through, but it's actually more dependent on those systems rather than the CDK directly itself. So I go through, in the Advanced CDK course, you know, my basic starting directory structure for everything, which is stacks, constructs, apps, and stages all go into their own specific directories.But then once those directories start growing—because I've added more stacks, more constructs, and things—once I get to around five to maybe seven files in a directory, then I look at them and go, “Okay, how can I group these together?” I create subdirectories, I move those files around. My development tool of choice, which is WebStorm—JetBrains's long-running tool—handles the moving of those files for me, so all of my imports, all of my references automatically get updated accordingly, which is really nice, and I can refactor things as much as I want to without too much of a problem. So, as a project grows over time, my directory structure can change to make sure that it is readable, well organized, and understandable, and it's never been too much of a problem.Corey: Yeah, it's one of those things that does take some getting used to. It helps, I think, having a mentor of sorts to take you under their wing and explain these things to you, but that's a hard thing to scale as well. So, in the absence of that we wind up defaulting to oh, whatever the most recent blog post we read is.Matthew: Yeah. Yeah, and I think one of the truest, I think, and truthful complaints I've heard about the CDK and why it can be fundamentally very difficult is that it has no guardrails. It uses general-purpose languages, and general-purpose languages don't have guardrails. They don't want to be in the way of you building whatever you need to build.But when it comes to an Infrastructure as Code project, which is inherently very different from an API or a website or other, sort of, more typical programming projects, having guardrail—or not having guardrails is a bad thing, and it can really lead you down some bad paths. I remember working with a client this last year who had leveraged context instead of properties on classes to hand configuration values down through code, down through stacks and constructs and things like that. And it worked. It functionally got them what they needed, up until a point, and then all of a sudden, they were like, “Well, now we want to do X with the CDK, and we simply cannot because we've now painted ourselves into a corner.” And that's the downside of not having these good guardrails.And I think that early, they needed to do this early on. When the CDK was initially released, and it got popular back around the 0.4, 0.5 timeframe—I think I picked it up right around 0.4, too—when it officially hit a 1.0 release, there should have been a better set of guidelines and best practices published.
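A small sketch of the guardrail Matthew is pointing at — handing configuration down through typed props instead of fishing it out of context. The configuration fields here are invented purely for illustration:

import { Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';

// The pattern that painted the client into a corner: untyped lookups buried deep
// inside constructs, e.g. const stage = this.node.tryGetContext('stage');
// It works until different stacks need different values.

// The guardrail: configuration flows down explicitly as typed properties.
export interface ServiceStackProps extends StackProps {
  readonly stage: 'dev' | 'prod'; // hypothetical configuration for illustration
  readonly tableName: string;
}

export class ServiceStack extends Stack {
  constructor(scope: Construct, id: string, props: ServiceStackProps) {
    super(scope, id, props);
    // props.stage and props.tableName are visible and type-checked at the call site,
    // so each stack instance can be configured differently without touching context.
  }
}

The trade-off is a little more wiring at the app level, but every configuration decision stays visible where the stack is instantiated instead of hiding in cdk.json.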
You can go to the documents and see them, and they have been published, but it really didn't go far enough to really explain how and why you had to take the steps to make sure you didn't screw yourself six months later.Corey: It's sort of those one-way doors you don't realize you're passing through when you first start building something. And I find, especially when you follow my development approach, which more or less used to be copying and pasting from various places, now it's copying and pasting from one place, which is Chat-Gippity-4, then—although I've seen increasingly GitHub's Copilot has been great at this and Code Whisperer, in my experience, has not yet been worth the energy it takes to really go diving into it. Your mileage may of course vary on that. But I found it was not making materially better suggestions on CDK stuff than Copilot was.Matthew: Yeah, I haven't tried Code Whisperer outside of the shell. I've been using Copilot for the last year and absolutely adore it. I think it has completely changed the way that I felt about coding. I saw writing code for the last couple of years as being very tedious and very boring in terms of there weren't interesting problems to solve, and Copilot, as I've seen it, is autocomplete on steroids. So, it doesn't keep me from having to solve the interesting problems; it just keeps me from having to type out the boring solutions, and that's the thing that I love about it.Now, hopefully, Code Whisperer continues to get better over time. I'm hoping all of Amazon's GenAI products continue to get better over time and I can maybe ditch a subscription to Copilot, but for now, Copilot is still my thing. And it's producing good enough results for me. Thankfully, because I've been working with it for four years now, I don't rely on it to answer my questions about how to use constructs. I go back to the docs for those. If I need to.Corey: It occurs to me that I can talk about this now because this episode will not air until after this has become generally available, but what's really spanked it from my perspective has been Google's Duet. And the key defining difference is, as I'm in one of these files—in many cases, I'm doing something with React these days due to an escalating series of weird choices—and—Matthew: My apologies, by the way. My condolences, I should say.Corey: Well, yeah. Well, things like Copilot Chat are great when they say, “Oh yeah, assuming that you're handling the state this way in your component, now…” What I love about Duet is it goes, and it actually checks, which is awesome. And it has contextual awareness of the entire project, not just the three lines that I'm talking about, or the file that I'm looking at this moment. It goes ahead and does the intelligent thing of looking at some of these things. It still has some problems where it's confidently wrong about things that really shouldn't be, but okay, early days.Matthew: Sure. Yeah, I'll need to check that out a little bit more because I still, to this day, despise working with React. It is still my framework of choice because the ecosystem is so good around it. And it's so established that I know that whatever problem I have, I'll find 14 blogs, and maybe one of them is the answer that I want, versus any other framework where it still feels so very new and so very immature that I will probably beat my head more than I want to.
Web development now is a hobby, not a job, so I don't want to bang my head against a hobby project.Corey: I tend to view, on some level, that these AI coding assistants are good enough to get me almost anywhere I need to go, to the point where a beginner or enthusiastic amateur will be able to get sorted out. And for a lot of what I'm building, that's all I really need. I don't need this to be something that will withstand the rigors of production at a bank, for example. One challenge I have seen with all these things is there's a delay between something being released and their training data growing to understand those things. Very often it'll wind up giving me recommendations for—I forget the name of it, but there was a state manager in React that the first thing you saw when you installed it was, “This has been deprecated. This is the new replacement.” And if you explicitly ask about the replacement, it does the right thing, but it just cheerfully goes ahead and tells you to use ancient stuff or apply poor security practices or the rest.Matthew: Yeah, that's very scary to me, to be honest because I think these AI development tools—for me, it's revitalized my interest in doing development, but where I get really, really scared is where they become a dependency in writing the right code. And every time I ever use Copilot to fill out stuff, I'm always double-checking, and I'm always making sure that this is right or that is right. And what I worry about is those developers who are maybe still learning some things, or are having to write inline SQL on their back-end and let Copilot, or Code Whisperer, or whatever tool they pick fill this stuff out, and that answer is based on a solution that works for a 10,000 record database, but fails horribly on a 100 million record database. And now all of a sudden, you've got this problem that is just festering in through a dev environment, in through a QA environment, and even maybe into a prod environment, and you don't find out about that failure until six months later, when some database table runs past its magical limit and now all of a sudden, you've got these queries that are failing, they're crashing databases, they're running into problems, and this developer that didn't really know what they built in the first place is now being asked, “Why doesn't your code work,” and they just sort of have to go, “Maybe ChatGPT can tell me why my code doesn't work.” And that's the scariest part to me about these things is that they're a little bit too good at answering difficult questions with a simple answer. There is no, “It depends,” with these answers, and there needs to be for a lot of what we do in complex systems that, for example, in the AWS world, we're expected to build complex systems, and ChatGPT and these other tools are bad at that.Corey: We're required to build complex systems, and, on some level, I would put that onus on Amazon in many respects. I mean, the challenge I keep smacking into is that they're building—they're giving you a bunch of components and expecting you to assemble them all yourself to achieve even relatively simple things. It increasingly feels like this is the direction that they want customers to go in because they're bad at moving up the stack and develop—delivering integrated solutions themselves.Matthew: Well, so I would wonder, what would you consider a relatively simple system, then?Corey: Okay, one of the things I like to do is go out in the evenings, and sometimes with a friend, I'll have a few too many beers.
And then I'll come up with an idea like, I want to redirect this random domain that I want to buy to someone else's website. The end. Now, if you go with Namecheap, or GoDaddy, or one of these various things, you can set that up in their mobile app with a couple of clicks and a payment, and you're done. With AWS, you have a minimum of six different services you need to work with, many of which do not support anything on a mobile basis and don't talk to one another relatively well. I built a state machine out of Step Functions that will do a lot of it for me, but it's an example of having to touch so many different things just for a relatively straightforward solution space that is a common problem. And that's a small example, but you see it across the board.Matthew: Yeah, yeah. I was expecting you to come up with a little bit of a different answer for what a simple system is, for example, a website. Everyone likes to say, “Oh, a static website with just raw HTML. That's a simple”—Corey: No, that's hard as hell because the devil is in the details, and it slices you to ribbons whenever you go down that path.Matthew: Exactly.Corey: No, I'm talking things that a human being would do without needing to be an expert in getting that many different AWS services to talk to one another.Matthew: Yeah, and I agree that AWS traditionally is very bad at moving up that stack and getting those things to work. You had mentioned at the very top of this about Amplify. Amplify is a system that I have tried once or twice, and I generally think that, for the right use case, it's an excellent system and I really like a lot of what it does.Corey: It is. I agree. Having gone down that, building up my scavenger hunt app that I'll be open-sourcing at some point next year.
And they can maybe give you some custom pieces here and there, like the fenders, and the tires, and stuff like that, but that's not their bread and butter.Corey: Well, even starting with the CDK is a perfect example. Like, you can use the CDK init to create a new project from scratch, which is awesome. I love the fact that that exists, but it doesn't go far enough. It doesn't automatically create a repo you store the thing in that in turn hooks up to a CI/CD process that will wind up doing the build and deploy. Instead, it expects to do that all locally, which is a counter pattern. That's an anti-pattern. It'll lead you down the wrong path. And you always have to build these things from scratch yourself as you keep going. At least that's what it feels like.Matthew: Yeah, it is. And I think that here at Defiance Digital, our job as an MSP is to talk to the customer and figure out, what are those very specific things you need? So, we do build new CDK repos all the time for our customers. But some of our customers want a trunk-based system. Some of them want a branching or a development-branch-based system. Some of them have a very complex SDLC process within a PR stage of code changes versus a slightly less complex one after things have been merged into trunk.So, we fundamentally look at it like we're that bridge between the two, and in that case, AWS works great. In fact, all SaaS solutions are really nice because they give us those building blocks and then we provide value by figuring out which one of those we need to incorporate in for our clients. But every single one of our clients is very different. And we've only got, you know, less than a dozen right now. But you know, I've got project managers and directors always coming back to me and saying, “Well, how do we cookie-cutter this process?” And you can't do it. It's just very, very difficult. Not at a small scale. Maybe when you're really big, and you're a company like AWS who has thousands, if not potentially millions of customers, you can find those patterns, but it is a very fundamentally difficult problem to solve, and we've seen multiple companies over the last two decades try to do these things and ultimately fail. So, I don't necessarily blame AWS for not having these things or not doing them well.Corey: Yes and no. I mean, GitHub delivers an excellent experience for the user, start to finish. There's—Vercel does something very similar over in the front-end universe, too, where it is clearly possible, but it seems that designing user interfaces and integrating disparate things together is not in Amazon's DNA, which makes sense when you view the two-pizza teams assembling to build larger things. But man, is that a frustration.Matthew: Yeah. I really wonder if this two-pizza team mentality can ever work well for products that are bigger than just the fundamental concepts. I think Amplify is pretty good, but if you really want something that is this service that works for 80% of customers, you can't do it with five people. You can't do it with six. You need to have teams like what GitHub and Vercel and others have, where teams are potentially dozens of people that really coordinate things and have a good project manager and product owner and understand the problem very well. And it's just very difficult with these very, very small teams to get that going.I don't know what the future of AWS looks like.
Matthew: It feels a lot like Microsoft in the mid-2000s: they're running off of their existing customers, they don't really have a need to innovate significantly because they have a lot of people locked in, and they would be just fine for years on end with the products they have. So, there isn't a huge driver for doing it, unlike, maybe, GCP or Azure, which really need to keep innovating in this space to pick up more customers. AWS doesn't have a problem getting customers.

And if there isn't a significant change in the mentality, like what Microsoft saw with getting rid of Ballmer, bringing in Satya, and really changing the mentality inside the company, I don't see AWS breaking out from this anytime soon. But I think that's actually a good thing. I think AWS should stick to just building the fundamentals, and I think that they should rely on their partners and their third parties to bridge that gap. I think Jeremy Daly at Ampt and what they're building over there is a fantastic product.

Corey: Yeah. The problem is that Amazon seems to be in denial about a lot of this, at least with what they're saying publicly.

Matthew: Yeah, but what they say publicly and how they feel internally could be very, very different. I would say that, you know, we don't know what they're thinking internally. And that's fine. I don't necessarily need to. I think, more specifically, we need to understand what their roadmap looks like, and we need to understand what they're going to change in the future to maybe fill in some of these gaps.

I would say that the problem you mentioned earlier about being able to do a simple website redirect, I don't think it's Amazon's desire to build those things. I think there should be a third party that's built on top of AWS, and maybe even works directly within your AWS account as a Marketplace product for doing that, but I don't think it's necessarily in AWS's interest to build that directly.

Corey: We'll see. I'm very curious to see how this unfolds because a lot of customers want answers that require things that have to be assembled for them. I mean, honestly, a lot of the GenAI stuff is squarely in that category.

Matthew: Agreed, but is this something where AWS needs to build it internally, and then we've got a product like App Composer, or Copilot, or things where they try, and then because they don't get enough traction, it just feels like they stall out and get stagnant? I mean, App Composer was a keynote product announcement during last year's re:Invent, and this year we saw them introduce the ability to edit Step Functions within it and bring the functionality into your IDE, VS Code, directly. Both good things, but a year's worth of development effort to release those two features feels slow to me. The integration with VS Code should have been simple.

Corey: Yeah. They are not the innovative company that would turn around and deliver something incredible three months after something had launched—"And here's a great new series of features around it." It feels like the pace of innovation and the pace of delivery have massively slowed.

Matthew: Yeah. And that's the scariest thing for me.
And, you know, we saw this a little bit with a discussion recently in the cdk.dev server because if you take a look at what's been happening with the CDK application for the last six months and even almost a year now, it feels like the pace of changes within the codebase has slowed.There have been multiple releases over the course of the last year where the release at the end of the week—and they hit a pretty regular cadence of a release every week—that release at the end of the week fixes one bug or adds one small feature change to one construct in some library that maybe 10% of users are going to use. And that's troublesome. One of the main reasons why I ditched the Terraform and went hard on the CDK was that I looked at how many issues were open on the Terraform AWS provider, and how many missing features were, and how slow they were to incorporate those in, and said, “I can't invest another two years into this product if there isn't going to be that innovation.” And I wasn't in a place to do the development work myself—despite the fact that you can because it's open-source and providers are forkable—and the CDK is getting real close to that same spot right now. So, this weekend—and I know this is going to come out, you know, weeks later—but you know, the weekend of December 10th, they announced a change to the way that they were going to take contributions from the CDK community.And the long and short of it right now—and there's still some debate over exactly what they said—is, we're not going to accept brand-new L2 constructs from the community. Those have to be built internally by AWS only. That's a dr—step in the wrong direction. I understand why they're taking that approach. Contributions in the CDK have been very rough for the last four or five months because of the previous policies they put into place, but this is an open-source product. It's supposed to be an open-source product. It's also a very complex set of code because of all of the various AWS services that are being hit by it. This isn't just Amplify, which is hitting a couple of things here and there. This is potentially—Corey: It touches everything.Matthew: It touches everything.Corey: Yeah, I can see their perspective, but they've got to get way better at supporting things rapidly if they want to play that game.Matthew: And they can't do that internally with AWS, not with a two-pizza team.Corey: No. And there's an increasing philosophy I'm hearing from teams of, “Well, my service supports it. Other stuff, that's not my area of responsibility.” The wisdom that I've seen that really encapsulates this is written on Colm MacCárthaigh's old laptop in 2019: “AWS is the product.” That's the truth. It's not about the individual components; it's about the whole, collectively.Matthew: Right. And so, if we're not getting these L2 constructs and these things being built out for all of the services that CloudFormation hits, then the product feels stalled, there isn't a good initiative for users to continue trying to adopt it because over time, users are just going to hit more and more services in AWS, not fewer as they use the products. That's what AWS wants. They want people to be using VPC Lattice and all the GenAI stuff, and Glue, and SageMaker, and all these things, but if you don't have those L2 constructs, then there's no advantage of the CDK over top of just raw CloudFormation. 
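The L1-versus-L2 distinction Matthew keeps returning to is easiest to see side by side. Here is a generic S3 sketch, not tied to anything in the episode: the raw CloudFormation-level resource, the L2 wrapper with sensible defaults, and the escape hatch back down to raw properties.

```typescript
import { App, Stack } from 'aws-cdk-lib';
import * as s3 from 'aws-cdk-lib/aws-s3';

const app = new App();
const stack = new Stack(app, 'ConstructLevels');

// L1: a one-to-one mapping of the CloudFormation resource. Every property is spelled out by hand.
new s3.CfnBucket(stack, 'RawBucket', {
  bucketEncryption: {
    serverSideEncryptionConfiguration: [
      { serverSideEncryptionByDefault: { sseAlgorithm: 'aws:kms' } },
    ],
  },
  versioningConfiguration: { status: 'Enabled' },
});

// L2: an opinionated wrapper with defaults that cover the common 80% of use cases.
const bucket = new s3.Bucket(stack, 'NiceBucket', {
  encryption: s3.BucketEncryption.KMS_MANAGED,
  versioned: true,
});

// Escape hatch: reach down to the underlying L1 resource and override raw CloudFormation properties.
const cfnBucket = bucket.node.defaultChild as s3.CfnBucket;
cfnBucket.addPropertyOverride('AccelerateConfiguration', { AccelerationStatus: 'Enabled' });
```

Without L2s like the second form, the CDK is mostly a more verbose way to write the first—which is the "no advantage over raw CloudFormation" point above.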
Matthew: So, the step in the right direction, in my opinion, would have been to make it easier and better for outside contributions to get into the CDK, and they went the opposite way, and that's scary.

Now, they basically said, go build these on your own, go publish them on the Construct Hub, and if they're good, we'll incorporate them in. But they also didn't define what good was, and what makes a good API. API development is very difficult. How do you build a construct that's going to hit 80% of use cases and still give you an out for the other 20% you missed? That's fundamentally hard.

Corey: It is. And I don't know if there are good answers yet. Maybe they're going in the right direction, maybe they're not.

Matthew: Time will tell. My hope is that I can try to do some videos after the new year to maybe make this a better experience for people. What does good API design look like? What is it like to implement these things well so they can be incorporated in? There has been a lot of pushback already, just after the first couple of days, from some very vocal users within the CDK community saying, "This is bad. This is fundamentally bad stuff."

Even from big fanboys like myself, who have supported the CDK, who co-authored The CDK Book, and they said, "This is not good." So, we'll see what happens. Maybe they change direction after a couple of days. Maybe this turns out to be a great way to do it. Only time will really tell at this point.

Corey: Awesome. And where can people go to find out more as you continue your exploration in this space and find out what you're up to in general?

Matthew: So, I do have a Twitter account, @mattbonig; however, I am probably going to be doing less and less over there. Engagement and the community as a whole over there have been problematic for a while, and I'll probably be doing more on LinkedIn, so you can find me there. Just search for Matthew Bonig. It's a very unique name.

I've also got a website, matthewbonig.com, and from there you can see blog articles and a link to my Advanced CDK course, which I'm going to continue adding sessions to over the course of the next few months. I've got one coming out shortly about the deadly embrace and how you can work through that problem and hopefully not be so scared about multi-stack applications.

Corey: I look forward to that because, Lord knows, I'm running into that one myself increasingly frequently.

Matthew: Well, good. I will hopefully be able to get this video out and solve all of your problems very easily.

Corey: Awesome. Thank you so much for taking the time to speak with me. I appreciate it.

Matthew: Thank you for having me. I really appreciate it.

Corey: Matthew Bonig, Chief Cloud Architect at Defiance Digital, AWS Dev Tools Hero, and oh so much more. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that you will then have to build the constructs that power that comment yourself, because apparently we're not allowed to build them globally anymore.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. 
We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
The Importance of the Platform-As-a-Product Mentality with Evelyn Osman

Screaming in the Cloud

Play Episode Listen Later Jan 9, 2024 35:26


Evelyn Osman, Principal Platform Engineer at AutoScout24, joins Corey on Screaming in the Cloud to discuss the dire need for developers to agree on a standardized tool set in order to scale their projects and innovate quickly. Corey and Evelyn pick apart the new products being launched in cloud computing and discover a large disconnect between what the industry needs and what is actually being created. Evelyn shares her thoughts on why viewing platforms as products themselves forces developers to get into the minds of their users and produces a better end result.About EvelynEvelyn is a recovering improviser currently role playing as a Lead Platform Engineer at Autoscout24 in Munich, Germany. While she says she specializes in AWS architecture and integration after spending 11 years with it, in truth she spends her days convincing engineers that a product mindset will make them hate their product managers less.Links Referenced:LinkedIn: https://www.linkedin.com/in/evelyn-osman/TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is Evelyn Osman, engineering manager at AutoScout24. Evelyn, thank you for joining me.Evelyn: Thank you very much, Corey. It's actually really fun to be on here.Corey: I have to say one of the big reasons that I was enthused to talk to you is that you have been using AWS—to be direct—longer than I have, and that puts you in a somewhat rarefied position where AWS's customer base has absolutely exploded over the past 15 years that it's been around, but at the beginning, it was a very different type of thing. Nowadays, it seems like we've lost some of that magic from the beginning. Where do you land on that whole topic?Evelyn: That's actually a really good point because I always like to say, you know, when I come into a room, you know, I really started doing introductions like, “Oh, you know, hey,” I'm like, you know, “I'm this director, I've done this XYZ,” and I always say, like, “I'm Evelyn, engineering manager, or architect, or however,” and then I say, you know, “I've been working with AWS, you know, 11, 12 years,” or now I can't quite remember.Corey: Time becomes a flat circle. The pandemic didn't help.Evelyn: [laugh] Yeah, I just, like, a look at that the year, and I'm like, “Jesus. It's been that long.” Yeah. And usually, like you know, you get some odd looks like, “Oh, my God, you must be a sage.” And for me, I'm… you see how different services kind of, like, have just been reinventions of another one, or they just take a managed service and make another managed service around it. So, I feel that there's a lot of where it's just, you know, wrapping up a pretty bow, and calling it something different, it feels like.Corey: That's what I've been low-key asking people for a while now over the past year, namely, “What is the most foundational, interesting thing that AWS has done lately, that winds up solving for this problem of whatever it is you do as a company? What is it that has foundationally made things better that AWS has put out in the last service? What was it?” And the answers I get are all depressingly far in the past, I have to say. 
What's yours?Evelyn: Honestly, I think the biggest game-changer I remember experiencing was at an analyst summit in Stockholm when they announced Lambda.Corey: That was announced before I even got into this space, as an example of how far back things were. And you're right. That was transformative. That was awesome.Evelyn: Yeah, precisely. Because before, you know, we were always, like, trying to figure, okay, how do we, like, launch an instance, run some short code, and then clean it up. AWS is going to charge for an hour, so we need to figure out, you know, how to pack everything into one instance, run for one hour. And then they announced Lambda, and suddenly, like, holy shit, this is actually a game changer. We can actually write small functions that do specific things.And, you know, you go from, like, microservices, like, to like, tiny, serverless functions. So, that was huge. And then DynamoDB along with that, really kind of like, transformed the entire space for us in many ways. So, back when I was at TIBCO, there was a few innovations around that, even, like, one startup inside TIBCO that quite literally, their entire product was just Lambda functions. And one of their problems was, they wanted to sell in the Marketplace, and they couldn't figure out how to sell Lambda on the marketplace.Corey: It's kind of wild when we see just how far it's come, but also how much they've announced that doesn't change that much, to be direct. For me, one of the big changes that I remember that really made things better for customers—thought it took a couple of years—was EFS. And even that's a little bit embarrassing because all that is, “All right, we finally found a way to stuff a NetApp into us-east-1,” so now NFS, just like you used to use it in the 90s and the naughts, can be done responsibly in the cloud. And that, on some level, wasn't a feature launch so much as it was a concession to the ways that companies had built things and weren't likely to change.Evelyn: Honestly, I found the EFS launch to be a bit embarrassing because, like, you know, when you look closer at it, you realize, like, the performance isn't actually that great.Corey: Oh, it was horrible when it launched. It would just slam to a halt because you got the IOPS scaled with how much data you stored on it. The documentation explicitly said to use dd to start loading a bunch of data onto it to increase the performance. It's like, “Look, just sandbag the thing so it does what you'd want.” And all that stuff got fixed, but at the time it looked like it was clown shoes.Evelyn: Yeah, and that reminds me of, like, EBS's, like, gp2 when we're, like you know, we're talking, like, okay, provision IOPS with gp2. We just kept saying, like, just give yourself really big volume for performance. And it feel like they just kind of kept that with EFS. And it took years for them to really iterate off of that. Yeah, so, like, EFS was a huge thing, and I see us, we're still using it now today, and like, we're trying to integrate, especially for, like, data center migrations, but yeah, you always see that a lot of these were first more for, like, you know, data centers to the cloud, you know. So, first I had, like, EC2 classic. That's where I started. And I always like to tell a story that in my team, we're talking about using AWS, I was the only person fiercely against it because we did basically large data processing—sorry, I forget the right words—data analytics. There we go [laugh].Corey: I remember that, too. 
When it first came out, it was, “This sounds dangerous and scary, and it's going to be a flash in the pan because who would ever trust their core compute infrastructure to some random third-party company, especially a bookstore?” And yeah, I think I got that one very wrong.Evelyn: Yeah, exactly. I was just like, no way. You know, I see all these articles talking about, like, terrible disk performance, and here I am, where it's like, it's my bread and butter. I'm specialized in it, you know? I write code in my sleep and such.[Yeah, the interesting thing is, I was like, first, it was like, I can 00:06:03] launch services, you know, to kind of replicate when you get in a data center to make it feature comparable, and then it was taking all this complex services and wrapping it up in a pretty bow for—as a managed service. Like, EKS, I think, was the biggest one, if we're looking at managed services. Technically Elasticsearch, but I feel like that was the redheaded stepchild for quite some time.Corey: Yeah, there was—Elasticsearch was a weird one, and still is. It's not a pleasant service to run in any meaningful sense. Like, what people actually want as the next enhancement that would excite everyone is, I want a serverless version of this thing where I can just point it at a bunch of data, I hit an API that I don't have to manage, and get Elasticsearch results back from. They finally launched a serverless offering that's anything but. You have to still provision compute units for it, so apparently, the word serverless just means managed service over at AWS-land now. And it just, it ties into the increasing sense of disappointment I've had with almost all of their recent launches versus what I felt they could have been.Evelyn: Yeah, the interesting thing about Elasticsearch is, a couple of years ago, they came out with OpenSearch, a competing Elasticsearch after [unintelligible 00:07:08] kind of gave us the finger and change the licensing. I mean, OpenSearch actually become a really great offering if you run it yourself, but if you use their managed service, it can kind—you lose all the benefits, in a way.Corey: I'm curious, as well, to get your take on what I've been seeing that I think could only be described as an internal shift, where it's almost as if there's been a decree passed down that every service has to run its own P&L or whatnot, and as a result, everything that gets put out seems to be monetized in weird ways, even when I'd argue it shouldn't be. The classic example I like to use for this is AWS Config, where it charges you per evaluation, and that happens whenever a cloud resource changes. What that means is that by using the cloud dynamically—the way that they supposedly want us to do—we wind up paying a fee for that as a result. And it's not like anyone is using that service in isolation; it is definitionally being used as people are using other cloud resources, so why does it cost money? And the answer is because literally everything they put out costs money.Evelyn: Yep, pretty simple. Oftentimes, there's, like, R&D that goes into it, but the charges seem a bit… odd. Like from an S3 lens, was, I mean, that's, like, you know, if you're talking about services, that was actually a really nice one, very nice holistic overview, you know, like, I could drill into a data lake and, like, look into things. But if you actually want to get anything useful, you have to pay for it.Corey: Yeah. 
Everything seems to, for one reason or another, be stuck in this place where, “Well, if you want to use it, it's going to cost.” And what that means is that it gets harder and harder to do anything that even remotely resembles being able to wind up figuring out where's the spend going, or what's it going to cost me as time goes on? Because it's not just what are the resources I'm spinning up going to cost, what are the second, third, and fourth-order effects of that? And the honest answer is, well, nobody knows. You're going to have to basically run an experiment and find out.Evelyn: Yeah. No, true. So, what I… at AutoScout, we actually ended up doing is—because we're trying to figure out how to tackle these costs—is they—we built an in-house cost allocation solution so we could track all of that. Now, AWS has actually improved Cost Explorer quite a bit, and even, I think, Billing Conductor was one that came out [unintelligible 00:09:21], kind of like, do a custom tiered and account pricing model where you can kind of do the same thing. But even that also, there is a cost with it.I think that was trying to compete with other, you know, vendors doing similar solutions. But it still isn't something where we see that either there's, like, arbitrarily low pricing there, or the costs itself doesn't really quite make sense. Like, AWS [unintelligible 00:09:45], as you mentioned, it's a terrific service. You know, we try to use it for compliance enforcement and other things, catching bad behavior, but then as soon as people see the price tag, we just run away from it. So, a lot of the security services themselves, actually, the costs, kind of like, goes—skyrockets tremendously when you start trying to use it across a large organization. And oftentimes, the organization isn't actually that large.Corey: Yeah, it gets to this point where, especially in small environments, you have to spend more energy and money chasing down what the cost is than you're actually spending on the thing. There were blog posts early on that, “Oh, here's how you analyze your bill with Redshift,” and that was a minimum 750 bucks a month. It's, well, I'm guessing that that's not really for my $50 a month account.Evelyn: Yeah. No, precisely. I remember seeing that, like, entire ETL process is just, you know, analyze your invoice. Cost [unintelligible 00:10:33], you know, is fantastic, but at the end of the day, like, what you're actually looking at [laugh], is infinitesimally small compared to all the data in that report. Like, I think oftentimes, it's simply, you know, like, I just want to look at my resources and allocate them in a multidimensional way. Which actually isn't really that multidimensional, when you think about it [laugh].Corey: Increasingly, Cost Explorer has gotten better. It's not a new service, but every iteration seems to improve it to a point now where I'm talking to folks, and they're having a hard time justifying most of the tools in the cost optimization space, just because, okay, they want a percentage of my spend on AWS to basically be a slightly better version of a thing that's already improving and works for free. That doesn't necessarily make sense. And I feel like that's what you get trapped into when you start going down the VC path in the cost optimization space. You've got to wind up having a revenue model and an offering that scales through software… and I thought, originally, I was going to be doing something like that. 
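The in-house cost allocation Evelyn describes typically starts by pulling spend grouped by an allocation tag out of the Cost Explorer API. Here is a hedged sketch with the AWS SDK for JavaScript v3; the tag key, date range, and metric are placeholders rather than AutoScout's actual setup, and—fittingly for this conversation—each call to this API is itself billed (roughly a cent per request at the time of writing).

```typescript
import { CostExplorerClient, GetCostAndUsageCommand } from '@aws-sdk/client-cost-explorer';

// Cost Explorer is served from us-east-1 regardless of where your workloads run.
const client = new CostExplorerClient({ region: 'us-east-1' });

async function costByTeam(): Promise<void> {
  const response = await client.send(
    new GetCostAndUsageCommand({
      TimePeriod: { Start: '2023-11-01', End: '2023-12-01' }, // placeholder month
      Granularity: 'MONTHLY',
      Metrics: ['UnblendedCost'],
      // Group by a cost allocation tag; 'team' is a placeholder key.
      GroupBy: [{ Type: 'TAG', Key: 'team' }],
    }),
  );

  for (const result of response.ResultsByTime ?? []) {
    for (const group of result.Groups ?? []) {
      const amount = group.Metrics?.UnblendedCost?.Amount ?? '0';
      console.log(`${group.Keys?.join(',')}: $${Number(amount).toFixed(2)}`);
    }
  }
}

costByTeam().catch(console.error);
```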
Corey: At this point, I'm unconvinced that anything like that is really tenable.

Evelyn: Yeah. When you're a small organization trying to optimize, you might not have the expertise and the knowledge to do so, so when one of these small consultancies comes along saying, "Hey, we're going to charge you a really small percentage of your invoice," like, okay, great. That's, you know, a few hundred dollars a month to make sure I'm fully optimized, and I'm saving far more than that. But as soon as your invoice turns into $100,000, or $300,000 or more, that percentage becomes rather significant. And I've had vendors come and talk to me, like, "Hey, for a small percentage we're going to do this machine learning, AI optimization for you. You don't have to do anything. We guarantee buybacks on your RIs." And as soon as you look at the price tag on it, we just have to walk away. Or oftentimes we look at it, and there are truly very simple ways to do it on your own, if you just put some thought into it.

Corey: While we were talking a bit before this show, you taught me something new about GameLift, which I think is a different problem that AWS has been dealing with lately. I've never paid much attention to it because—as I assumed from what it says on the tin—it's a service for just running a whole bunch of games at scale, and I'm not generally doing that. My favorite computer game remains Twitter at this point, but that's okay. What is GameLift, though? Because you were shining a different light on it, which makes me annoyed that Amazon marketing has not pointed this out.

Evelyn: Yeah, so I'll preface this by saying, like, I'm not an expert on GameLift. I haven't even spun it up myself because there's quite a price tag attached. I learned this fall, while chatting with an SA who works in the gaming space, and I went, "Back up a second." If you think about, you know, World of Warcraft, all you have are thousands of game clients all over the world, playing the same game, on the same server, in the same instance, and you need to make sure that when I'm running and you're running, we know that we're going to reach the same point at the same time, or if there's one object in that room, that only one of us can get it. So, all these servers are doing is tracking state across thousands of clients.

And GameLift, when you think about your dedicated game service, it really is just multi-region distributed state management. At its most basic, that's really what it is. Now, there's quite a bit more happening within GameLift, but that's what I was going to explain: it's just state management. And there are far more use cases for it than just video games.

Corey: That's maddening to me because having a global session state store, for lack of a better term, is something that so many customers have built themselves repeatedly. They can build it on top of primitives like DynamoDB global tables, or alternately, you have a dedicated region where that thing has to live and everything far away takes forever to round-trip. If they've solved some of those things, why on earth would they bury it under a gaming-branded service? Like, offer that primitive to the rest of us because that's useful.

Evelyn: No, absolutely.
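Corey's aside about building a global session state store on primitives is roughly what that looks like in practice: a replicated DynamoDB table keyed by session. This is a minimal CDK sketch, offered as an illustration of the global-tables primitive rather than a claim about how GameLift actually works; the regions and attribute names are placeholders.

```typescript
import { App, Stack, RemovalPolicy } from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

const app = new App();
const stack = new Stack(app, 'SessionStateStack', {
  env: { region: 'us-east-1' }, // primary region; the replicas below are placeholders
});

// A global table keyed by session, with a TTL so abandoned sessions expire on their own.
new dynamodb.Table(stack, 'SessionState', {
  partitionKey: { name: 'sessionId', type: dynamodb.AttributeType.STRING },
  sortKey: { name: 'entityId', type: dynamodb.AttributeType.STRING },
  billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,
  timeToLiveAttribute: 'expiresAt',
  replicationRegions: ['eu-west-1', 'ap-southeast-2'], // multi-region, active-active replicas
  removalPolicy: RemovalPolicy.DESTROY,
});
```

Writes in any replica region become visible in the others, which is the "multi-region distributed state management" part; everything a game service layers on top of that—matchmaking, fleet scaling, session placement—is out of scope for this sketch.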
And honestly, I wouldn't be surprised if you peeled back the curtain with GameLift, you'll find a lot of—like, several other you know, AWS services that it's just built on top of. I kind of mentioned earlier is, like, what I see now with innovation, it's like we just see other services packaged together and releases a new product.Corey: Yeah, IoT had the same problem going on for years where there was a lot of really good stuff buried in there, like IOT events. People were talking about using that for things like browser extensions and whatnot, but you need to be explicitly told that that's a thing that exists and is handy, but otherwise you'd never know it was there because, “Well, I'm not building anything that's IoT-related. Why would I bother?” It feels like that was one direction that they tended to go in.And now they take existing services that are, mmm, kind of milquetoast, if I'm being honest, and then saying, “Oh, like, we have Comprehend that does, effectively detection of themes, keywords, and whatnot, from text. We're going to wind up re-releasing that as Comprehend Medical.” Same type of thing, but now focused on a particular vertical. Seems to me that instead of being a specific service for that vertical, just improve the baseline the service and offer HIPAA compliance if it didn't exist already, and you're mostly there. But what do I know? I'm not a product manager trying to get promoted.Evelyn: Yeah, that's true. Well, I was going to mention that maybe it's the HIPAA compliance, but actually, a lot of their services already have HIPAA compliance. And I've stared far too long at that compliance section on AWS's site to know this, but you know, a lot of them actually are HIPAA-compliant, they're PCI-compliant, and ISO-compliant, and you know, and everything. So, I'm actually pretty intrigued to know why they [wouldn't 00:16:04] take that advantage.Corey: I just checked. Amazon Comprehend is itself HIPAA-compliant and is qualified and certified to hold Personal Health Information—PHI—Private Health Information, whatever the acronym stands for. Now, what's the difference, then, between that and Medical? In fact, the HIPAA section says for Comprehend Medical, “For guidance, see the previous section on Amazon Comprehend.” So, there's no difference from a regulatory point of view.Evelyn: That's fascinating. I am intrigued because I do know that, like, within AWS, you know, they have different segments, you know? There's, like, Digital Native Business, there's Enterprise, there's Startup. So, I am curious how things look over the engineering side. I'm going to talk to somebody about this now [laugh].Corey: Yeah, it's the—like, I almost wonder, on some level, it feels like, “Well, we wound to building this thing in the hopes that someone would use it for something. And well, if we just use different words, it checks a box in some analyst's chart somewhere.” I don't know. I mean, I hate to sound that negative about it, but it's… increasingly when I talk to customers who are active in these spaces around the industry vertical targeted stuff aimed at their industry, they're like, “Yeah, we took a look at it. It was adorable, but we're not using it that way. We're going to use either the baseline version or we're going to work with someone who actively gets our industry.” And I've heard that repeated about three or four different releases that they've put out across the board of what they've been doing. 
It feels like it is a misunderstanding between what the world needs and what they're able to or willing to build for us.Evelyn: Not sure. I wouldn't be surprised, if we go far enough, it could probably be that it's just a product manager saying, like, “We have to advertise directly to the industry.” And if you look at it, you know, in the backend, you know, it's an engineer, you know, kicking off a build and just changing the name from Comprehend to Comprehend Medical.Corey: And, on some level, too, they're moving a lot more slowly than they used to. There was a time where they were, in many cases, if not the first mover, the first one to do it well. Take Code Whisperer, their AI powered coding assistant. That would have been a transformative thing if GitHub Copilot hadn't beaten them every punch, come out with new features, and frankly, in head-to-head experiments that I've run, came out way better as a product than what Code Whisperer is. And while I'd like to say that this is great, but it's too little too late. And when I talk to engineers, they're very excited about what Copilot can do, and the only people I see who are even talking about Code Whisperer work at AWS.Evelyn: No, that's true. And so, I think what's happening—and this is my opinion—is that first you had AWS, like, launching a really innovative new services, you know, that kind of like, it's like, “Ah, it's a whole new way of running your workloads in the cloud.” Instead of you know, basically, hiring a whole team, I just click a button, you have your instance, you use it, sell software, blah, blah, blah, blah. And then they went towards serverless, and then IoT, and then it started targeting large data lakes, and then eventually that kind of run backwards towards security, after the umpteenth S3 data leak.Corey: Oh, yeah. And especially now, like, so they had a hit in some corners with SageMaker, so now there are 40 services all starting with the word SageMaker. That's always pleasant.Evelyn: Yeah, precisely. And what I kind of notice is… now they're actually having to run it even further back because they caught all the corporations that could pivot to the cloud, they caught all the startups who started in the cloud, and now they're going for the larger behemoths who have massive data centers, and they don't want to innovate. They just want to reduce this massive sysadmin team. And I always like to use the example of a Bare Metal. When that came out in 2019, everybody—we've all kind of scratched your head. I'm like, really [laugh]?Corey: Yeah, I could see where it makes some sense just for very specific workloads that involve things like specific capabilities of processors that don't work under emulation in some weird way, but it's also such a weird niche that I'm sure it's there for someone. My default assumption, just given the breadth of AWS's customer base, is that whenever I see something that they just announced, well, okay, it's clearly not for me; that doesn't mean it's not meeting the needs of someone who looks nothing like me. But increasingly as I start exploring the industry in these services have time to percolate in the popular imagination and I still don't see anything interesting coming out with it, it really makes you start to wonder.Evelyn: Yeah. But then, like, I think, like, roughly a year or something, right after Bare Metal came out, they announced Outposts. So, then it was like, another way to just stay within your data center and be in the cloud.Corey: Yeah. 
There's a bunch of different ways they have that, okay, here's ways you can run AWS services on-prem, but still pay us by the hour for the privilege of running things that you have living in your facility. And that doesn't seem like it's quite fair.Evelyn: That's exactly it. So, I feel like now it's sort of in diminishing returns and sort of doing more cloud-native work compared to, you know, these huge opportunities, which is everybody who still has a data center for various reasons, or they're cloud-native, and they grow so big, that they actually start running their own data centers.Corey: I want to call out as well before we wind up being accused of being oblivious, that we're recording this before re:Invent. So, it's entirely possible—I hope this happens—that they announce something or several some things that make this look ridiculous, and we're embarrassed to have had this conversation. And yeah, they're totally getting it now, and they have completely surprised us with stuff that's going to be transformative for almost every customer. I've been expecting and hoping for that for the last three or four re:Invents now, and I haven't gotten it.Evelyn: Yeah, that's right. And I think there's even a new service launches that actually are missing fairly obvious things in a way. Like, mine is the Managed Workflow for Amazon—it's Managed Airflow, sorry. So, we were using Data Pipeline for, you know, big ETL processing, so it was an in-house tool we kind of built at Autoscout, we do platform engineering.And it was deprecated, so we looked at a new—what to replace it with. And so, we looked at Airflow, and we decided this is the way to go, we want to use managed because we don't want to maintain our own infrastructure. And the problem we ran into is that it doesn't have support for shared VPCs. And we actually talked to our account team, and they were confused. Because they said, like, “Well, every new service should support it natively.” But it just didn't have it. And that's, kind of, what, I kind of found is, like, there's—it feels—sometimes it's—there's a—it's getting rushed out the door, and it'll actually have a new managed service or new service launched out, but they're also sort of cutting some corners just to actually make sure it's packaged up and ready to go.Corey: When I'm looking at this, and seeing how this stuff gets packaged, and how it's built out, I start to understand a pattern that I've been relatively down on across the board. I'm curious to get your take because you work at a fairly sizable company as an engineering manager, running teams of people who do this sort of thing. Where do you land on the idea of companies building internal platforms to wrap around the offerings that the cloud service providers that they use make available to them?Evelyn: So, my opinion is that you need to build out some form of standardized tool set in order to actually be able to innovate quickly. Now, this sounds counterintuitive because everyone is like, “Oh, you know, if I want to innovate, I should be able to do this experiment, and try out everything, and use what works, and just release it.” And that greatness [unintelligible 00:23:14] mentality, you know, it's like five talented engineers working to build something. But when you have, instead of five engineers, you have five teams of five engineers each, and every single team does something totally different. 
You know, one uses Scala, another one TypeScript, another one, you know, .NET, and then there could have been a [last 00:23:30] one, you know, that comes in saying they're still using Ruby.

And then the next thing you know, you have incredibly diverse platforms for services. And if you want to do any sort of hiring or cross-training, it becomes incredibly difficult. And actually, as the organization grows, you want to hire talent, and so you're going to have to hire, you know, a developer for this team, you're going to have to hire a Ruby developer for this one, a Scala guy here, a Node.js guy over there.

And so, this is where we say, "Okay, let's agree. We're going to be a Scala shop. Great. All right, are we running serverless? Are we running containerized?" And you agree on those things. So, that's already, like, the formation of it. And oftentimes, you start with DevOps. You'll say, like, "I'm a DevOps team," you know, or doing a DevOps culture, if you do it properly, but you always hit this scaling issue where you start growing, and then how do you maintain that common tool set? And that's where we start looking at having a platform… approach, but I'm going to say it's Platform-as-a-Product. That's the key.

Corey: Yeah, that's a good way of framing it because originally, the entire world needed that. That's what RightScale was when EC2 first came out. It was a reimagining of the EC2 console that was actually usable. And in time, AWS improved that to the point where RightScale didn't really have a place anymore in the way that it had previously, and that became a business challenge for them. But you have, what is it now, two or three hundred services that AWS has put out, and okay, great. Most companies are really only actively working with a handful of those. How do you make those available in a reasonable way to your teams, in ways that aren't distracting, dangerous, et cetera? I don't know the answer on that one.

Evelyn: Yeah. No, that's true. So, full disclosure: at AutoScout, we do platform engineering. So, I'm part of the platform engineering group, and we built a platform for our product teams. It's kind of like, you need to decide to [follow 00:25:24] those answers, you know? Like, are we going to be fully containerized? Okay, then, great, we're going to use Fargate. All right, how do we do it so that developers don't actually need to think that they're running Fargate workloads?

And that's where it's really important to have those standardized abstractions that developers actually enjoy using. And I'd even say that before you start saying, "Ah, we're going to do platform," you say, "We should probably think about developer experience." Because you can do developer experience without a platform. You can do that in a DevOps approach, you know? It's basically building tools that make it easy for developers to write code. That's the first step for anything. You have people writing the code; make sure that they can do the things easily, and then look at how to operate it.

Corey: That sure would be nice. There's a lack of focus on usability, especially when it comes to a number of developer tools that we see out there in the wild, in that they're clearly built by people who understand the problem space super well, but they're designing these things to be used by people who just want to make the website work.
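Evelyn's "developers shouldn't need to think they're running Fargate workloads" is usually delivered as a thin construct that bakes in the platform team's defaults so a product team only states what is specific to its service. Here is a hedged sketch—the property names and defaults are invented for illustration and are not AutoScout's platform.

```typescript
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecsPatterns from 'aws-cdk-lib/aws-ecs-patterns';
import { Construct } from 'constructs';

// What a product team has to say about its service: nothing about Fargate, ALBs, or VPCs.
export interface PaasServiceProps {
  readonly image: string;         // container image URI
  readonly containerPort: number;
  readonly desiredCount?: number;
}

// The platform team's opinionated wrapper around the ecs-patterns construct.
export class PaasService extends Construct {
  constructor(scope: Construct, id: string, props: PaasServiceProps) {
    super(scope, id);

    const vpc = new ec2.Vpc(this, 'Vpc', { maxAzs: 2 }); // in reality this would be a shared, looked-up VPC
    const cluster = new ecs.Cluster(this, 'Cluster', { vpc });

    new ecsPatterns.ApplicationLoadBalancedFargateService(this, 'Service', {
      cluster,
      desiredCount: props.desiredCount ?? 2, // org default: two tasks minimum
      cpu: 512,                              // org default sizing
      memoryLimitMiB: 1024,
      publicLoadBalancer: false,             // org default: internal by default
      taskImageOptions: {
        image: ecs.ContainerImage.fromRegistry(props.image),
        containerPort: props.containerPort,
      },
    });
  }
}
```

A product team would then write `new PaasService(stack, 'Checkout', { image: 'registry/placeholder-image:latest', containerPort: 8080 })` and never touch the ECS, load balancer, or VPC details directly—which is the abstraction being argued for.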
They don't have the insight, the knowledge, the approach, any of it, nor should they necessarily be expected to.Evelyn: No, that's true. And what I see is, a lot of the times, it's a couple really talented engineers who are just getting shit done, and they get shit done however they can. So, it's basically like, if they're just trying to run the website, they're just going to write the code to get things out there and call it a day. And then somebody else comes along, has a heart attack when see what's been done, and they're kind of stuck with it because there is no guardrails or paved path or however you want to call it.Corey: I really hope—truly—that this is going to be something that we look back and laugh when this episode airs, that, “Oh, yeah, we just got it so wrong. Look at all the amazing stuff that came out of re:Invent.” Are you going to be there this year?Evelyn: I am going to be there this year.Corey: My condolences. I keep hoping people get to escape.Evelyn: This is actually my first one in, I think, five years. So, I mean, the last time I was there was when everybody's going crazy over pins. And I still have a bag of them [laugh].Corey: Yeah, that did seem like a hot-second collectable moment, didn't it?Evelyn: Yeah. And then at the—I think, what, the very last day, as everybody's heading to re:Play, you could just go into the registration area, and they just had, like, bags of them lying around to take. So, all the competing, you know, to get the requirements for a pin was kind of moot [laugh].Corey: Don't you hate it at some point where it's like, you feel like I'm going to finally get this crowning achievement, it's like or just show up at the buffet at the end and grab one of everything, and wow, that would have saved me a lot of pain and trouble.Evelyn: Yeah.Corey: Ugh, scavenger hunts are hard, as I'm about to learn to my own detriment.Evelyn: Yeah. No, true. Yeah. But I am really hoping that re:Invent proves me wrong. Embarrassingly wrong, and then all my colleagues can proceed to mock me for this ridiculous podcast that I made with you. But I am a fierce skeptic. Optimistic nihilist, but still a nihilist, so we'll see how re:Invent turns out.Corey: So, I am curious, given your experience at more large companies than I tend to be embedded with for any period of time, how have you found that these large organizations tend to pick up new technologies? What does the adoption process look like? And honestly, if you feel like throwing some shade, how do they tend to get it wrong?Evelyn: In most cases, I've seen it go… terrible. Like, it just blows up in their face. And I say that is because a lot of the time, an organization will say, “Hey, we're going to adopt this new way of organizing teams or developing products,” and they look at all the practices. They say, “Okay, great. Product management is going to bring it in, they're going to structure things, how we do the planning, here's some great charts and diagrams,” but they don't really look at the culture aspect.And that's always where I've seen things fall apart. I've been in a room where, you know, our VP was really excited about team topologies and say, “Hey, we're going to adopt it.” And then an engineering manager proceeded to say, “Okay, you're responsible for this team, you're responsible for that team, you're responsible for this team talking to, like, a team of, like, five engineers,” which doesn't really work at all. 
Or, like, I think the best example is DevOps, you know, where you say, "Ah, we're going to adopt DevOps, we're going to have a DevOps team, or have a DevOps engineer."

Corey: Step one: we're going to rebadge everyone with existing job titles to have the new fancy job titles that reflect it. It turns out that's not necessarily sufficient in and of itself.

Evelyn: Not really. The Spotify model. People say, like, "Oh, we're going to do the Spotify model. We're going to do squads, tribes, you know, and everything. It's going to be awesome, it's going to be great, you know, and nice, cross-functional."

The reason I say it fails on us every single time is because somebody wants to be in control of the process, and if the process is meant to encourage collaboration and innovation, that person actually becomes a chokehold for it. And it could be somebody that says, like, "Ah, I need to be involved in every single team, and listen in to know what's happening, just so I'm aware of it." What ends up happening is that everybody defers to them. So, there is no collaboration, there is no innovation. DevOps, you say, like, "Hey, we're going to have a team to do everything, so your developers don't need to worry about it." What ends up happening is you're still an ops team, you still have your silos.

And that's always the challenge: you actually have to say, "Okay, what are the cultural values around this process?" You know, what is SRE? What is DevOps? Is it seen as a process, is it a series of principles, a platform, maybe? We have to say—that's why I say Platform-as-a-Product, because you need to have that product mindset, that culture of product thinking, to really build a platform that works, because it's all about the user journey.

It's not about building a common set of tools. It's the user journey of how a person interacts with their code to get it into a production environment. And so, you need to understand how that person sits down at their desk, starts the laptop up, logs in, opens the IDE, what they're actually trying to get done. And once you understand that, then you know your requirements, and you build something to fill those needs so that they are happy to use it, as opposed to saying, "This is our platform, and you're going to use it." And they're probably going to say, "No." And the next thing you know, they're just doing their own thing on the side.
Because that's—for me, I really kind of strongly agree with his approach because that's really how, like, you know, as he says, like, boost innovation is, you know, where you're actually building a platform that really works.Corey: Yeah, it's a hard problem, but it's also one of those things where you're trying to focus on—at least ideally—an outcome or a better situation than you currently find yourselves in. It's hard to turn down things that might very well get you there sooner, faster, but it's like trying to effectively cargo-cult the leadership principles from your last employer into your new one. It just doesn't work. I mean, you see more startups from Amazonians who try that, and it just goes horribly because without the cultural understanding and the supporting structures, it doesn't work.Evelyn: Exactly. So, I've worked with, like, organizations, like, 4000-plus people, I've worked for, like, small startups, consulted, and this is why I say, almost every single transformation, it fails the first time because somebody needs to be in control and track things and basically be really, really certain that people are doing it right. And as soon as it blows up in their face, that's when they realize they should actually take a step back. And so, even for building out a platform, you know, doing Platform-as-a-Product, I always reiterate that you have to really be willing to just invest upfront, and not get very much back. Because you have to figure out the whole user journey, and what you're actually building, before you actually build it.Corey: I really want to thank you for taking the time to speak with me today. If people want to learn more, where's the best place for them to find you?Evelyn: So, I used to be on Twitter, but I've actually got off there after it kind of turned a bit toxic and crazy.Corey: Feels like that was years ago, but that's beside the point.Evelyn: Yeah, precisely. So, I would even just say because this feels like a corporate show, but find me on LinkedIn of all places because I will be sharing whatever I find on there, you know? So, just look me up on my name, Evelyn Osman, and give me a follow, and I'll probably be screaming into the cloud like you are.Corey: And we will, of course, put links to that in the show notes. Thank you so much for taking the time to speak with me. I appreciate it.Evelyn: Thank you, Corey.Corey: Evelyn Osman, engineering manager at AutoScout24. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, and I will read it once I finish building an internal platform to normalize all of those platforms together into one.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Benchmarking Security Attack Response Times in the Age of Automation with Anna Belak

Screaming in the Cloud

Play Episode Listen Later Jan 4, 2024 31:11


Anna Belak, Director of the Office of Cybersecurity Strategy at Sysdig, joins Corey on Screaming in the Cloud to discuss the newest benchmark for responding to security threats, 5/5/5. Anna describes why it was necessary to set a new benchmark for responding to security threats in a timely manner, and how the Sysdig team did research to determine the best practices for detecting, correlating, and responding to potential attacks. Corey and Anna discuss the importance of focusing on improving your own benchmarks towards a goal, as well as how prevention and threat detection are both essential parts of a solid security program. About AnnaAnna has nearly ten years of experience researching and advising organizations on cloud adoption with a focus on security best practices. As a Gartner Analyst, Anna spent six years helping more than 500 enterprises with vulnerability management, security monitoring, and DevSecOps initiatives. Anna's research and talks have been used to transform organizations' IT strategies and her research agenda helped to shape markets. Anna is the Director of Thought Leadership at Sysdig, using her deep understanding of the security industry to help IT professionals succeed in their cloud-native journey. Anna holds a PhD in Materials Engineering from the University of Michigan, where she developed computational methods to study solar cells and rechargeable batteries.Links Referenced: Sysdig: https://sysdig.com/ Sysdig 5/5/5 Benchmark: https://sysdig.com/555 TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined again—for another time this year—on this promoted guest episode brought to us by our friends at Sysdig, returning is Anna Belak, who is their director of the Office of Cybersecurity Strategy at Sysdig. Anna, welcome back. It's been a hot second.Anna: Thank you, Corey. It's always fun to join you here.Corey: Last time we were here, we were talking about your report that you folks had come out with, the, “Cybersecurity Threat Landscape for 2022.” And when I saw you were doing another one of these to talk about something, I was briefly terrified. “Oh, wow, please tell me we haven't gone another year and the cybersecurity threat landscape is moving that quickly.” And it sort of is, sort of isn't. You're here today to talk about something different, but it also—to my understanding—distills down to just how quickly that landscape is moving. What have you got for us today?Anna: Exactly. For those of you who remember that episode, one of the key findings in the Threat Report for 2023 was that the average length of an attack in the cloud is ten minutes. To be clear, that is from when you are found by an adversary to when they have caused damage to your system. And that is really fast. Like, we talked about how that relates to on-prem attacks or other sort of averages from other organizations reporting how long it takes to attack people.And so, we went from weeks or days to minutes, potentially seconds. 
And so, what we've done is we looked at all that data, and then we went and talked to our amazing customers and our many friends at analyst firms and so on, to kind of get a sense for if this is real, like, if everyone is seeing this or if we're just seeing this. Because I'm always like, “Oh, God. Like, is this real? Is it just me?”And as it turns out, everyone's not only—I mean, not necessarily everyone's seeing it, right? Like, there's not really been proof until this year, I would say because there's a few reports that came out this year, but lots of people sort of anticipated this. And so, when we went to our customers, and we asked for their SLAs, for example, they were like, “Oh, yeah, my SLA for a [PCRE 00:02:27] cloud is like 10, 15 minutes.” And I was like, “Oh, okay.” So, what we set out to do is actually set a benchmark, essentially, to see how well are you doing. Like, are you equipped with your cloud security program to respond to the kind of attack that a cloud security attacker is going to—sorry, an anti-cloud security—I guess—attacker is going to perpetrate against you.And so, the benchmark is—drumroll—5/5/5. You have five seconds to detect a signal that is relevant to potentially some attack in the cloud—hopefully, more than one such signal—you have five minutes to correlate all such relevant signals to each other so that you have a high fidelity detection of this activity, and then you have five more minutes to initiate an incident response process to hopefully shut this down, or at least interrupt the kill chain before your environments experience any substantial damage.Corey: To be clear, that is from a T0, a starting point, the stopwatch begins, the clock starts when the event happens, not when an event shows up in your logs, not once someone declares an incident. From J. Random Hackerman, effectively, we're pressing the button and getting the response from your API.Anna: That's right because the attackers don't really care how long it takes you to ship logs to wherever you're mailing them to. And that's why it is such a short timeframe because we're talking about, they got in, you saw something hopefully—and it may take time, right? Like, some of the—which we'll describe a little later, some of the activities that they perform in the early stages of the attack are not necessarily detectable as malicious right away, which is why your correlation has to occur, kind of, in real time. Like, things happen, and you're immediately adding them, sort of like, to increase the risk of this detection, right, to say, “Hey, this is actually something,” as opposed to, you know, three weeks later, I'm parsing some logs and being like, “Oh, wow. Well, that's not good.” [laugh].Corey: The number five seemed familiar to me in this context, so I did a quick check, and sure enough, allow me to quote from chapter and verse from the CloudTrail documentation over an AWS-land. “CloudTrail typically delivers logs within an average of about five minutes of an API call. This time is not guaranteed.” So effectively, if you're waiting for anything that's CloudTrail-driven to tell you that you have a problem, it is almost certainly too late by the time that pops up, no matter what that notification vector is.Anna: That is, unfortunately or fortunately, true. I mean, it's kind of a fact of life. I guess there is a little bit of a veiled [unintelligible 00:04:43] at our cloud provider friends because, really, they have to do better ultimately. 
But the flip side to that argument is CloudTrail—or your cloud log source of choice—cannot be your only source of data for detecting security events, right? So, if you are operating purely on the basis of, “Hey, I have information in CloudTrail; that is my security information,” you are going to have a bad time, not just because it's not fast enough, but also because there's not enough data in there, right? Which is why part of the first, kind of, benchmark component is that you must have multiple data sources for the signals, and they—ideally—all will be delivered to you within five seconds of an event occurring or a signal being generated.Corey: And give me some more information on that because I have my own alerter, specifically, it's a ClickOps detector. Whenever someone in one of my accounts does something in the console, that has a write aspect to it rather than just a read component—which again, look at what you want in the console, that's fine—if you're changing things that are not being managed by code, I want to know that it's happening. It's not necessarily bad, but I want to at least have visibility into it. And that spits out the principal, the IP address it emits from, and the rest. I haven't had a whole lot where I need to correlate those between different areas. Talk to me more about the triage step.Anna: Yeah, so I believe that the correlation step is the hardest, actually.Corey: Correlation step. My apologies.Anna: Triage is fine. It's [crosstalk 00:06:06]—Corey: Triage, correlations, the words we use matter on these things.Anna: Dude, we argued about the words on this for so long, you couldn't even imagine. Yeah, triage, correlation, detection, you name it, we are looking at multiple pieces of data, we're going to connect them to each other meaningfully, and that is going to provide us with some insight about the fact that a bad thing is happening, and we should respond to it. Perhaps automatically respond to it, but we'll get to that. So, a correlation, okay. The first thing is, like I said, you must have more than one data source because otherwise, I mean, you could correlate information from one data source; you actually should do that, but you are going to get richer information if you can correlate multiple data sources, and if you can access, for example, like through an API, some sort of enrichment for that information.Like, I'll give you an example. For SCARLETEEL, which is an attack we describe in the threat report, and we actually described before, this is—we're, like—on SCARLETEEL, I think, version three now because there's so much—this particular certain actor is very active [laugh].Corey: And they have a better versioning scheme than most companies I've spoken to, but that's neither here nor there.Anna: [laugh]. Right? So, one of the interesting things about SCARLETEEL is you could eventually detect that it had happened if you only had access to CloudTrail, but you wouldn't have the full picture ever. In our case, because we are a company that relies heavily on system calls and machine learning detections, we [are able to 00:07:19] connect the system call events to the CloudTrail events, and between those two data sources, we're able to figure out that there's something more profound going on than just what you see in the logs. And I'll actually tell you which things, for example, are being detected.So, in SCARLETEEL, one thing that happens is there's a crypto miner.
And a crypto miner is one of these events where you're, like, “Oh, this is obviously malicious,” because as we wrote, I think, two years ago, it costs $53 to mine $1 of Bitcoin in AWS, so it is very stupid for you to be mining Bitcoin in AWS, unless somebody else is—Corey: In your own accounts.Anna: —paying the cloud bill. Yeah, yeah [laugh] in someone else's account, absolutely. Yeah. So, if you are a sysadmin or a security engineer, and you find a crypto miner, you're like, “Obviously, just shut that down.” Great. What often happens is people see them, and they think, “Oh, this is a commodity attack,” like, people are just throwing crypto miners whatever, I shut it down, and I'm done.But in the case of this attack, it was actually a red herring. So, they deployed the miner to see if they could. They could, then they determined—presumably; this is me speculating—that, oh, these people don't have very good security because they let random idiots run crypto miners in their account in AWS, so they probed further. And when they probed further, what they did was some reconnaissance. So, they type in commands, listing, you know, like, list accounts or whatever. They try to list all the things they can list that are available in this account, and then they reach out to an EC2 metadata service to kind of like, see what they can do, right?And so, each of these events, like, each of the things that they do, like, reaching out to a EC2 metadata service, assuming a role, doing a recon, even lateral movement is, like, by itself, not necessarily a scary, big red flag malicious thing because there are lots of, sort of, legitimate reasons for someone to perform those actions, right? Like, reconnaissance, for one example, is you're, like, looking around the environment to see what's up, right? So, you're doing things, like, listing things, [unintelligible 00:09:03] things, whatever. But a lot of the graphical interfaces of security tools also perform those actions to show you what's, you know, there, so it looks like reconnaissance when your tool is just, like, listing all the stuff that's available to you to show it to you in the interface, right? So anyway, the point is, when you see them independently, these events are not scary. They're like, “Oh, this is useful information.”When you see them in rapid succession, right, or when you see them alongside a crypto miner, then your tooling and/or your process and/or your human being who's looking at this should be like, “Oh, wait a minute. Like, just the enumeration of things is not a big deal. The enumeration of things after I saw a miner, and you try and talk to the metadata service, suddenly I'm concerned.” And so, the point is, how can you connect those dots as quickly as possible and as automatically as possible, so a human being doesn't have to look at, like, every single event because there's an infinite number of them.Corey: I guess the challenge I've got is that in some cases, you're never going to be able to catch up with this. Because if it's an AWS call to one of the APIs that they manage for you, they explicitly state there's no guarantee of getting information on this until the show's all over, more or less. So, how is there… like, how is there hope?Anna: [laugh]. I mean, there's always a forensic analysis, I guess [laugh] for all the things that you've failed to respond to.Corey: Basically we're doing an after-action thing because humans aren't going to react that fast. 
We're just assuming it happened; we should know about it as soon as possible. On some level, just because something is too late doesn't necessarily mean there's not value added to it. But just trying to turn this into something other than a, “Yeah, they can move faster than you, and you will always lose. The end. Have a nice night.” Like, that tends not to be the best narrative vehicle for these things. You know, if you're trying to inspire people to change.Anna: Yeah, yeah, yeah, I mean, I think one clear point of hope here is that sometimes you can be fast enough, right? And a lot of this—I mean, first of all, you're probably not going to—sorry, cloud providers—you don't go into just the cloud provider defaults for that level of performance, you are going with some sort of third-party tool. On the, I guess, bright side, that tool can be open-source, like, there's a lot of open-source tooling available now that is fast and free. For example, our favorite is, of course, Falco, which is looking at system calls on endpoints, and containers, and can detect things within seconds of them occurring and let you know immediately. There is other eBPF-based instrumentation that you can use out there from various vendors and/or open-source providers, and there's of course, network telemetry.So, if you're into the world of service mesh, there is data you can get off the network, also very fast. So, the bad news or the flip side to that is you have to be able to manage all that information, right? So, that means—again, like I said, you're not expecting a SOC analyst to look at thousands of system calls and thousands of, you know, network packets or flow logs or whatever you're looking at, and just magically know that these things go together. You are expecting to build, or have built for you by a vendor or the open-source community, some sort of detection content that is taking this into account and then is able to deliver that alert at the speed of 5/5/5.Corey: When you see the larger picture stories playing out, as far as what customers are seeing, what the actual impact is, what gave rise to the five-minute number around this? Just because that tends to feel like it's a… it is both too long and also too short on some level. I'm just wondering how you wound up at—what is this based on?Anna: Man, we went through so many numbers. So, we [laugh] started with larger numbers, and then we went to smaller numbers, then we went back to medium numbers. We align ourselves with the timeframes we're seeing for people. Like I said, a lot of folks have an SLA of responding to a P0 within 10 or 15 minutes because their point basically—and there's a little bit of bias here into our customer base because our customer base is, A, fairly advanced in terms of cloud adoption and in terms of security maturity, and also, they're heavily in let's say, financial industries and other industries that tend to be early adopters of new technology. So, if you are kind of a laggard, like, you probably aren't that close to meeting this benchmark as you are if you're, say, in financial, right? So, we asked them how they operate, and they basically pointed out to us that, like, knowing 15 minutes later is too late because I've already lost, like, some number of millions of dollars if my environment is compromised for 15 minutes, right? So, that's kind of where the ten minutes comes from.
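The correlation step Anna describes above — folding individually benign signals from a runtime sensor and from CloudTrail into one high-fidelity finding within a short window — might look roughly like the following sketch. The event shapes, signal names, and the five-minute window are illustrative assumptions, not Sysdig's actual detection content.

```python
# A rough sketch of cross-source correlation: a runtime miner detection plus
# CloudTrail enumeration and metadata-service access, attributed to the same
# identity inside a five-minute window, becomes one high-confidence finding.
# All names and thresholds here are illustrative assumptions.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Signal:
    source: str        # e.g. "falco" or "cloudtrail"
    kind: str          # e.g. "crypto_miner", "enumeration", "imds_access"
    identity: str      # principal or workload the signal is attributed to
    seen_at: datetime

WINDOW = timedelta(minutes=5)
SUSPICIOUS_COMBO = {"crypto_miner", "enumeration", "imds_access"}

def correlate(signals: list[Signal]) -> list[str]:
    """Return identities whose combined signals look like an attack in progress."""
    findings = []
    by_identity: dict[str, list[Signal]] = {}
    for s in signals:
        by_identity.setdefault(s.identity, []).append(s)

    for identity, sigs in by_identity.items():
        sigs.sort(key=lambda s: s.seen_at)
        for i, first in enumerate(sigs):
            # Collect every signal kind seen within the window starting at `first`.
            in_window = {s.kind for s in sigs[i:] if s.seen_at - first.seen_at <= WINDOW}
            if SUSPICIOUS_COMBO <= in_window:
                findings.append(identity)
                break
    return findings
```

In practice the hard part is operational rather than algorithmic: every source has to deliver its signals within seconds of the underlying event so that a function like this can fire inside the benchmark's window.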
Like, we took our real threat research data, and then we went around and talked to folks to see kind of what they're experiencing and what their own expectations are for their incident response in SOC teams, and ten minutes is sort of where we landed.Corey: Got it. When you see this happening, I guess, in various customer environments, assuming someone has missed that five-minute window, is a game over effectively? How should people be thinking about this?Anna: No. So, I mean, it's never really game over, right? Like until your company is ransomed to bits, and you have to close your business, you still have many things that you can do, hopefully, to save yourself. And also, I want to be very clear that 5/5/5 as a benchmark is meant to be something aspirational, right? So, you should be able to meet this benchmark for, let's say, your top use cases if you are a fairly high maturity organization, in threat detection specifically, right?So, if you're just beginning your threat detection journey, like, tomorrow, you're not going to be close. Like, you're going to be not at all close. The point here, though, is that you should aspire to this level of greatness, and you're going to have to create new processes and adopt new tools to get there. Now, before you get there, I would argue that if you can do, like, 10-10-10 or, like, whatever number you start with, you're on a mission to make that number smaller, right? So, if today, you can detect a crypto miner in 30 minutes, that's not great because crypto miners are pretty detectable these days, but give yourself a goal of, like, getting that 30 minutes down to 20, or getting that 30 minutes down to 10, right?Because we are so obsessed with, like, measuring ourselves against our peers and all this other stuff that we sometimes lose track of what actually is improving our security program. So yes, compare it to yourself first. But ultimately, if you can meet the 5/5/5 benchmark, then you are doing great. Like, you are faster than the attackers in theory, so that's the dream.Corey: So, I have to ask, and I suspect I might know the answer to this, but given that it seems very hard to move this quickly, especially at scale, is there an argument to be made that effectively prevention obviates the need for any of this, where if you don't misconfigure things in ways that should be obvious, if you practice defense-in-depth to a point where you can effectively catch things that the first layer meets with successive layers, as opposed to, “Well, we have a firewall. Once we're inside of there, well [laugh], it's game over for us.” Is prevention sufficient in some ways to obviate this?Anna: I think there are a lot of people that would love to believe that that's true.Corey: Oh, I sure would. It's such a comforting story.Anna: And we've done, like, I think one of my opening sentences in the benchmark, kind of, description, actually, is that we've done a pretty good job of advertising prevention in Cloud as an important thing and getting people to actually, like, start configuring things more carefully, or like, checking how those things have been configured, and then changing that configuration should they discover that it is not compliant with some mundane standard that everyone should know, right? So, we've made great progress, I think, in cloud prevention, but as usual, like, prevention fails, right? 
Like I still have smoke detectors in my house, even though I have done everything possible to prevent it from catching fire and I don't plan to set it on fire, right? But like, threat detection is one of these things that you're always going to need because no matter what you do, A, you will make a mistake because you're a human being, and there are too many things, and you'll make a mistake, and B, the bad guys are literally in the business of figuring ways around your prevention and your protective systems.So, I am full on on defense-in-depth. I think it's a beautiful thing. We should only obviously do that. And I do think that prevention is your first step to a holistic security program—otherwise, what even is the point—but threat detection is always going to be necessary. And like I said, even if you can't go 5/5/5, you don't have threat detection at that speed, you need to at least be able to know what happened later so you can update your prevention system.Corey: This might be a dangerous question to get into, but why not, that's what I do here. This [could 00:17:27] potentially an argument against Cloud, by which I mean that if I compromise someone's Cloud account on any of the major cloud providers, once I have access of some level, I know where everything else in the environment is as a general rule. I know that you're using S3 or its equivalent, and what those APIs look like and the rest, whereas as an attacker, if I am breaking into someone's crappy data center-hosted environment, everything is going to be different. Maybe they don't have a SAN at all, for example. Maybe they have one that hasn't been patched in five years. Maybe they're just doing local disk for some reason.There's a lot of discovery that has to happen that is almost always removed from Cloud. I mean, take the open S3 bucket problem that we've seen as a scourge for 5, 6, 7 years now, where it's not that S3 itself is insecure, but once you make a configuration mistake, you are now in line with a whole bunch of other folks who may have much more valuable data living in that environment. Where do you land on that one?Anna: This is the ‘leave cloud to rely on security through obscurity' argument?Corey: Exactly. Which I'm not a fan of, but it's also hard to argue against from time-to-time.Anna: My other way of phrasing it is ‘the attackers are ripping up the stack' argument. Yeah, so—and there is some sort of truth in that, right? Part of the reason that attackers can move that fast—and I think we say this a lot when we talk about the threat report data, too, because we literally see them execute this behavior, right—is they know what the cloud looks like, right? They have access to all the API documentation, they kind of know what all the constructs are that you're all using, and so they literally can practice their attack and create all these scripts ahead of time to perform their reconnaissance because they know exactly what they're looking at, right? On-premise, you're right, like, they're going to get into—even to get through my firewall, whatever, they're getting into my data center, they do not know what disaster I have configured, what kinds of servers I have where, and, like, what the network looks like, they have no idea, right?In Cloud, this is kind of all gifted to them because it's so standard, which is a blessing and a curse. It's a blessing because—well for them, I mean, because they can just programmatically go through this stuff, right?
It's a curse for them because it's a blessing for us in the same way, right? Like, the defenders… A, have a much easier time knowing what they even have available to them, right? Like, the days of there's a server in a closet I've never heard of are kind of gone, right? Like, you know what's in your Cloud account because, frankly, AWS tells you. So, I think there is a trade-off there.The other thing is—about the moving up the stack thing, right—like no matter what you do, they will come after you if you have something worth exploiting you for, right? So, by moving up the stack, I mean, listen, we have abstracted all the physical servers, all of the, like, stuff we used to have to manage the security of because the cloud just does that for us, right? Now, we can argue about whether or not they do a good job, but I'm going to be generous to them and say they do a better job than most companies [laugh] did before. So, in that regard, like, we say, thank you, and we move on to, like, fighting this battle at a higher level in the stack, which is now the workloads and the cloud control plane, and the you name it, whatever is going on after that. So, I don't actually think you can sort of trade apples for oranges here. It's just… bad in a different way.Corey: Do you think that this benchmark is going to be used by various companies who will learn about it? And if so, how do you see that playing out?Anna: I hope so. My hope when we created it was that it would sort of serve as a goalpost or a way to measure—Corey: Yeah, it would just be marketing words on a page and never mentioned anywhere, that's our dream here.Anna: Yeah, right. Yeah, I was bored. So, I wrote some—[laugh].Corey: I had a word minimum to get out the door, so there we are. It's how we work.Anna: Right. As you know, I used to be a Gartner analyst, and my desire is always to, like, create things that are useful for people to figure out how to do better in security. And my, kind of, tenure at the vendor is just a way to fund that [laugh] more effectively [unintelligible 00:21:08].Corey: Yeah, I keep forgetting you're ex-Gartner. Yeah, it's one of those fun areas of, “Oh, yeah, we just want to basically talk about all kinds of things because there's a—we have a chart to fill out here. Let's get after it.”Anna: I did not invent an acronym, at least. Yeah, so my goal was the following. People are always looking for a benchmark or a goal or standard to be like, “Hey, am I doing a good job?” Whether I'm, like a SOC analyst or director, and I'm just looking at my little SOC empire, or I'm a full on CSO, and I'm looking at my entire security program to kind of figure out risk, I need some way to know whether what is happening in my organization is, like, sufficient, or on par, or anything. Is it good or is it bad? Happy face? Sad face? Like, I need some benchmark, right?So normally, the Gartner answer to this, typically, is like, “You can only come up with benchmarks that are—” they're, like, “Only you know what is right for your company,” right? It's like, you know, the standard, ‘it depends' answer. Which is true, right, because I can't say that, like, oh, a huge multinational bank should follow the same benchmark as, like, a donut shop, right? Like, that's unreasonable. 
So, this is also why I say that our benchmark is probably more tailored to the more advanced organizations that are dealing with kind of high maturity phenomena and are more cloud-native, but the donut shops should kind of strive in this direction, right?So, I hope that people will think of it this way: that they will, kind of, look at their process and say, “Hey, like, what are the things that would be really bad if they happened to me, in terms of sort detection?” Like, “What are the threats I'm afraid of where if I saw this in my cloud environment, I would have a really bad day?” And, “Can I detect those threats in 5/5/5?” Because if I can, then I'm actually doing quite well. And if I can't, then I need to set, like, some sort of roadmap for myself on how I get from where I am now to 5/5/5 because that implies you would be doing a good job.So, that's sort of my hope for the benchmark is that people think of it as something to aspire to, and if they're already able to meet it, then that they'll tell us how exactly they're achieving it because I really want to be friends with them.Corey: Yeah, there's a definite lack of reasonable ways to think about these things, at least in ways that can be communicated to folks outside of the bounds of the security team. I think that's one of the big challenges currently facing the security industry is that it is easy to get so locked into the domain-specific acronyms, philosophies, approaches, and the rest, that even coming from, “Well, I'm a cloud engineer who ostensibly needs to know about these things.” Yeah, wander around the RSA floor with that as your background, and you get lost very quickly.Anna: Yeah, I think that's fair. I mean, it is a very, let's say, dynamic and rapidly evolving space. And by the way, like, it was really hard for me to pick these numbers, right, because I… very much am on that whole, ‘it depends' bandwagon of I don't know what the right answer is. Who knows what the right answer is [laugh]? So, I say 5/5/5 today. Like, tomorrow, the attack takes five minutes, and now it's two-and-a-half/two-and-a-half, right? Like it's whatever.You have to pick a number and go for it. So, I think, to some extent, we have to try to, like, make sense of the insanity and choose some best practices to anchor ourselves in or some, kind of like, sound logic to start with, and then go from there. So, that's sort of what I go for.Corey: So, as I think about the actual reaction times needed for 5/5/5 to actually be realistic, people can't reliably get a hold of me on the phone within five minutes, so it seems like this is not something you're going to have humans in the loop for. How does that interface with the idea of automating things versus giving automated systems too much power to take your site down as a potential failure mode?Anna: Yeah. I don't even answer the phone anymore, so that wouldn't work at all. That's a really, really good question, and probably the question that gives me the most… I don't know, I don't want to say lost sleep at night because it's actually, it's very interesting to think about, right? I don't think you can remove humans from the loop in the SOC. Like, certainly there will be things you can auto-respond to some extent, but there'd better be a human being in there because there are too many things at stake, right?Some of these actions could take your entire business down for far more hours or days than whatever the attacker was doing before. 
And that trade-off of, like, is my response to this attack actually hurting the business more than the attack itself is a question that's really hard to answer, especially for most of us technical folks who, like, don't necessarily know the business impact of any given thing. So, first of all, I think we have to embrace other response actions. Back to our favorite crypto miners, right? Like there is no reason to not automatically shut them down. There is no reason, right? Just build in a detection and an auto-response: every time you see a crypto miner, kill that process, kill that container, kill that node. I don't care. Kill it. Like, why is it running? This is crazy, right?I do think it gets nuanced very quickly, right? So again, in SCARLETEEL, there are essentially, like, five or six detections that occur, right? And each of them theoretically has a potential auto-response that you could have executed depending on your, sort of, appetite for that level of intervention, right? Like, when you see somebody assuming a role, that's perfectly normal activity most of the time. In this case, I believe they actually assumed a machine role, which is less normal. Like, that's kind of weird.And then what do you do? Well, you can just, like, remove the role. You can remove that person's ability to do anything, or remove that role's ability to do anything. But that could be very dangerous because we don't necessarily know what the full scope of that role is as this is happening, right? So, you could take, like, a more mitigated auto-response action and add a restrictive policy to that role, for example, to just prevent activity from that IP address that you just saw, right, because we're not sure about this IP address, but we're sure about this role, right?So, you have to get into these, sort of, risk-tiered response actions where you say, “Okay, this is always okay to do automatically. And this is, like, sometimes, okay, and this is never okay.” And as you develop that muscle, it becomes much easier to do something rather than doing nothing and just, kind of like, analyzing it in forensics and being, like, “Oh, what an interesting attack story,” right? So, that's step one, is just start taking these different response actions.And then step two is more long-term, and it's that you have to embrace the cloud-native way of life, right? Like this immutable, ephemeral, distributed religion that we've been selling, it actually works really well if you, like, go all-in on the religion. I sound like a real cult leader [laugh]. Like, “If you just go all in, it's going to be great.” But it's true, right?So, if your workloads are immutable—that means they cannot change as they're running—then when you see them drifting from their original configuration, like, you know, that is bad. So, you can immediately know that it's safe to take an auto-respon—well, it's safe, relatively safe, to take an auto-response action to kill that workload because you are, like, a hundred percent certain it is not doing the right things, right? And then furthermore, if all of your deployments are defined as code, which they should be, then it is approximately—[though not entirely 00:27:31]—trivial to get that workload back, right? Because you just push a button, and it just generates that same Kubernetes cluster with those same nodes doing all those same things, right?
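A minimal sketch of the risk-tiered auto-response Anna outlines above — always-safe actions run unattended, while a suspicious-but-possibly-legitimate role gets restricted rather than revoked — might look something like this. The detection names, tiers, and policy name are illustrative assumptions rather than a description of Sysdig's product; the only real API used is boto3's IAM put_role_policy.

```python
# Hypothetical risk-tiered auto-response; names and tiers are illustrative assumptions.
import json

import boto3

iam = boto3.client("iam")


def restrict_role_from_ip(role_name: str, suspect_ip: str) -> None:
    """Attach an inline deny policy scoped to the suspect IP instead of revoking the role."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {"IpAddress": {"aws:SourceIp": suspect_ip}},
        }],
    }
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="temporary-incident-restriction",
        PolicyDocument=json.dumps(policy),
    )


# "auto" actions are considered safe to run unattended; "page" actions wait for a human.
RESPONSE_TIERS = {
    "crypto_miner": ("auto", "kill_workload"),
    "workload_drift": ("auto", "kill_workload"),  # safe only if workloads are immutable and redeployable from code
    "suspicious_role_use": ("auto", "restrict_role"),
    "unknown_detection": ("page", "notify_oncall"),
}
```

Whether killing a workload is genuinely a safe default depends on the immutability and redeploy-from-code story described just above.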
So, in the on-premise world where shooting a server was potentially the, you know, fireable offense because if that server was running something critical, and you couldn't get it back, you were done.In the cloud, this is much less dangerous because there's, like, an infinite quantity of servers that you could bring back and hopefully Infrastructure-as-Code and, kind of, Configuration-as-Code in some wonderful registry, version-controlled for you to rely on to rehydrate all that stuff, right? So again, to sort of TL;DR, get used to doing auto-response actions, but do this carefully. Like, define a scope for those actions that make sense and not just, like, “Something bad happened; burn it all down,” obviously. And then as you become more cloud-native—which sometimes requires refactoring of entire applications—by the way, this could take years—just embrace the joy of Everything-as-Code.Corey: That's a good way of thinking about it. I just, I wish there were an easier path to get there, for an awful lot of folks who otherwise don't find a clear way to unlock that.Anna: There is not, unfortunately [laugh]. I mean, again, the upside on that is, like, there are a lot of people that have done it successfully, I have to say. I couldn't have said that to you, like, six, seven years ago when we were just getting started on this journey, but especially for those of you who were just at KubeCon—however, long ago… before this airs—you see a pretty robust ecosystem around Kubernetes, around containers, around cloud in general, and so even if you feel like your organization's behind, there are a lot of folks you can reach out to to learn from, to get some help, to just sort of start joining the masses of cloud-native types. So, it's not nearly as hopeless as before. And also, one thing I like to say always is, almost every organization is going to have some technical debt and some legacy workload that they can't convert to the religion of cloud.And so, you're not going to have a 5/5/5 threat detection SLA on those workloads. Probably. I mean, maybe you can, but probably you're not, and you may not be able to take auto-response actions, and you may not have all the same benefits available to you, but like, that's okay. That's okay. Hopefully, whatever that thing is running is, you know, worth keeping alive, but set this new standard for your new workloads. So, when your team is building a new application, or if they're refactoring an application, can't afford the new world, set the standard on them and don't, kind of like, torment the legacy folks because it doesn't necessarily make sense. Like, they're going to have different SLAs for different workloads.Corey: I really want to thank you for taking the time to speak with me yet again about the stuff you folks are coming out with. If people want to learn more, where's the best place for them to go?Anna: Thanks, Corey. It's always a pleasure to be on your show. If you want to learn more about the 5/5/5 benchmark, you should go to sysdig.com/555.Corey: And we will, of course, put links to that in the show notes. Thank you so much for taking the time to speak with me today. As always, it's appreciated. Anna Belak, Director at the Office of Cybersecurity Strategy at Sysdig. I'm Cloud Economist Corey Quinn, and this has been a promoted guest episode brought to us by our friends at Sysdig. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that I will read nowhere even approaching within five minutes.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
The Fundamentals of Building Mission-Driven Technology with Danilo Campos

Screaming in the Cloud

Play Episode Listen Later Jan 2, 2024 33:07


Danilo Campos, Proprietor of Antigravity, joins @quinnypig on Screaming in the Cloud to discuss his philosophy behind building tools that not only enhance developer experience but also improve the future of our world. Danilo shares his thoughts on how economic factors have influenced tech companies and their strategies for product, open source, and more. He also shares what he thinks is another, better way to approach these strategies, without ignoring the economic element. About DaniloDanilo Campos wants a world where technology makes us more powerful and expressive versions of ourselves. He worked with GitHub and the White House to deliver coding platforms to public housing residents, supported Glitch.com in its last days as an independent, and developed products for multiple early-stage startups, including Hipmunk. Today Danilo offers freelance developer experience services for devtools firms through Antigravity DX.Links Referenced: Antigravity DX: https://antigravitydx.com/ Blog: https://redeem-tomorrow.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and periodically on this show, we like to gaze into the future and try to predict how that's going to play out. On this episode, I want to start off by instead looking into the past, more specifically my past. Before I started this place, I wound up working at a company called FutureAdvisor, which was a great startup for all of three months before we were bought by a BlackRock. I soon learned what a BlackRock actually was.While I was there, I encountered an awful lot of oral tradition around a guy named Danilo, and he—as it turned out—was a contractor who had been brought in to do a fair bit of mobile work. Meet my guest today, Danilo Campos, who is at present, the proprietor of a company called Antigravity DX. Thank you so much for joining me, I appreciate it.Danilo: Hey, Corey, it's good to be here.Corey: It's weird talking to you, just because you were someone that I knew by reputation, and if I were to take all the things that were laid at your feet after you no longer had been there, it feels like you were there for 20 years. What did you actually do there, and how long were you embedded for?Danilo: I loved the FutureAdvisor guys. I thought they were such a blast to work with. I loved what they were working on. I learned so much about how finance and investing works from FutureAdvisor, and somehow it was only seven months of my life. I'd been introduced to the founders as a freelance iOS developer at the time—this was 2014—and a guy I had worked with at Hipmunk actually put me in touch with these guys, and we connected. And they needed to get started doing mobile. They'd never done any mobile stuff, they didn't have anyone on staff who did mobile stuff.And by that point, I'd shipped I think, must have been half a dozen iOS native apps, and so I knew this stuff pretty well. I understood the workflows, I understood the path to getting from idea to shipped product, and they just wanted occasional help. How do we wireframe this? How do we plan the product that way? How do we structure this thing?
And so, it started off as this just, kind of, occasional troubleshooting consulting thing.And I think about August 2014. They call me in for a meeting, they said, “Hey, we're stuck. We don't know how to get this thing off the ground. Could you help us get this project moving so that we actually ship it?” And so, I just came and embedded for seven months, and by the end of it, I was just running the entire iOS engineering team. We had a designer working with us. We had, I think it was four folks who were building the product. We had QA. It was a whole team to get this thing out the door. And we got it out the door after seven months of really working at it. And like I said, it was a blast. I love those folks.Corey: I have to be clear, when I say that I encountered a lot of what you had done. It was not negative. This was not one of those startups where there's a glorious tradition of assassinating the character out of everyone who has left the company—or at least Git repos—because they're not there to defend themselves anymore. There were times where decisions that you had made were highlighted as, “We needed to be doing things more like this.” There were times it was, “Oh, we can't do that because of how you wound up building this other thing.”And it was weird because it felt like you were the hand of some ancient deity, just moving things back and forth in your infinite wisdom of the ancients. It was unknowable, and we had to accept it as gospel, whether we liked it or not, at different times. In practice, I now know this was honestly just the outgrowth of a rapidly expanding culture where you've got to go from a team of five people to the team of 50 and keep everyone rowing in the same direction, ideally. But it was a really interesting social dynamic that I got to observe as a result, and I'm just tickled pink to be able to talk to you now. What are you doing these days?Danilo: Thank you for the context, by the way, because you know, I move on, as you do in a contract capacity, and you hope things work out.Corey: Yeah. To be clear, it was never a context of, “There's the bastard. Get him.” Like, that is not the perspective we are coming at this from at all.Danilo: Yeah, yeah, yeah. No, and it's hard because it was a very strange, alien codebase compared to the rest of the company. I get how it ended up in that spot. These days, I am a freelance developer experience consultant, and I spent a year-and-a-half at Glitch.com. And developer experience was always something that I really cared about. I did some work at GitHub that was about getting people—specifically teenagers living in public housing—into computing and the internet, and I'd had to do a bunch of DX work to make that happen because I had an afternoon to get people from zero to writing code.And that is not a straightforward situation, especially in a low-income housing environment, for example, right? So, I cared about this stuff a lot. And then I spent a year-and-a-half at Glitch.com, and it was like getting a graduate degree on everything about the leverage for creating outcomes in developer tools. And I just, I felt like I was carrying some gift from the Gods. 
I just, I felt the need to get this out to the wider world, and so that's what I do with Antigravity.Corey: When I got to catch up with you in person for the first time at the excellent and highly recommended Monktoberfest conference—Danilo: Excellent.Corey: —that the folks over at RedMonk put on every year, it was interesting, in that you and I got to talking very rapidly, not about technology as such, but about culture and the industry and values and the rest. It was a wonderfully refreshing conversation that I don't normally get to have so soon after meeting someone. I think that one of the more interesting aspects of our relatively wide-ranging conversation in a surprisingly brief period of time focused, first off, among the idea of developer tools and what so many of them seem to get wrong. I know that we basically dove into discussing about our violently agreeing opinions around the state of developer experience, for example. What are the hills you're willing to die on in that space?Danilo: I think that computing generally exists to amplify and multiply our power. Computing exists to let us do things that we could not do with the simple, frail flesh that we're born with, right? Computers augment our ambitions because they can do things with infinite iteration. And so, if you can come up with something that you can bottle in the form of an algorithm that repeats infinitely, you can have incredible impact on the world. And so, I think that there's a responsibility to find ways to make that power something that is easy to hand to other people and let them pick up and run with.And so, developer tools, to me, has this almost sacred connotation because what you're doing is handing people the fire of the Gods and saying, “Whatever you can come up with, whatever your imagination allows you to do with these tools, they can repeat infinitely and make whatever change you want—for good or for ill—in the world.” And that's very special to me. I think we've gotten bored of it because it's just, you know, it's a 50-year-old business at this point. But I think there's still a lot of magic to it, and the more we see the magic, the more magical outcomes we can coax out of everyday people who become better developers.Corey: From my perspective, one of the reasons I care so much about developer experience is that the failure mode of getting it wrong means that the person trying to understand the monstrosity you've built feels like they're somehow not smart, or they're just not getting it in some key and fundamental way. And that's not true. It's that you, for whatever reason, what you have built is not easily understandable to them where they are. I go back to what I first heard in 2012, at a talk that Logstash creator, Jordan Sissel wound up saying, where his entire thesis was that if a user has a bad time, it's a bug.Danilo: Yeah.Corey: And I thought that that was just a wonderfully prescient statement that I wanted to sign onto wholeheartedly. [That was 00:09:08] my first exposure to it. I know that's not the entirety of developer experience by a long shot, but it's the one where I think you lose the most mind share when you get it wrong.Danilo: Well, and I'm glad that you bring that up because I think that kind of defines the spectrum of the emotional experience of interacting with developer tools. On one end of the spectrum, you've got, “I feel so stupid. This has made me feel worse about myself. 
This has given me less of a sense of confidence in myself than I had when I started.” And at the other end of the spectrum, the other extreme is, “I cannot believe I am this cool. I cannot believe that my imagination has been made manifest in this way that now exists in the world and can go out and touch other people and make their lives better.”Those are the two, kind of, extremes of the subjective emotional experience that can come from developer tools. And so, I think that there is a business imperative that really pushes us toward the extreme of making people feel awesome. I think about this in the context of Iron Man, you've seen Iron Man, yeah.Corey: Oh, yes.Danilo: All right. So, the Iron Man suit is the perfect metaphor for a developer tool that is working correctly for you, right? Because on its own, the suit is not very interesting, and on his own, Tony Stark is not all that powerful, but you combine the suit and the person, and suddenly extraordinary emergent outcomes come out. The ambition of the human is amplified, and he feels so [BLEEP] cool. And I think that's what we're looking to do with developer tools is that we want to take a person, amplify their range, give them a range of motion that lets them soar into the clouds and do whatever they need to do up there so that when they come back down, they feel transformed. They feel like more than what they started.Corey: I would agree with that. There's a sense of whimsy and wonder as I look through my career trajectory, going from a sysadmin role, where you there was a pretty constant and hard to beat ratio in most shops—and the ratio [unintelligible 00:11:29] varied—but number of admins to the number of servers. And now with the magic of cloud being what it is, it's a, “Well, how many admins does it take to run X number of servers?” Like, “Well, as an [admin done 00:11:39] right, I can manage all of them because that's how programming languages work.” And that is a mystical and powerful thing.But lately, it seems like there's been some weird changes in the world of developer tooling. Cynically, I've said a couple of times that giving a toss about the developer experience was in fact a zero interest rate phenomenon. Like, when you're basically having to fend off casual offers of 400 grand a year from big tech, how do you hire and retain people at a company that has one of those old, tiny profit-generating business models and compete with them? And a lot of times, developer experience was part of how you did that. I don't know that I necessarily believe that that is as tied to that cynical worldview as I might pretend on the internet, but I don't know—I do wonder if it's a factor because it seems like we've seen a definite change in the way that developer tools are approaching their community of users and customers.Danilo: Well, my immediate reflex is to open up the kind of systems theory box and look at what's inside of that. Because I think that what we are experiencing, if we use the interest rate lens, is a period of time where everyone is a little bit worried that the good times are over for good. And I feel the sense of this in a lot of places. I think developer experience is a pretty good avatar to try this on with because I definitely also perceive it in that sphere.During the heyday of 0% interest rates, everything was about how much totalizing growth can you achieve? 
And from a developer tools perspective, all right, well, we need to make it so that the tools, kind of, grow themselves, so let's invest a lot in developer experience so that people very quickly get onboarded, without us having to hold their hand, without us having to conduct a sales call, let's get them to the point where they can quickly understand—because the documentation is so good and the artifacts are so good—exactly how to use these tools to maximum effect. Let's get them to a point where it's very easy for them to share the results of their work so that other people see the party and really want to join in. And so, all kinds of effort and energy and capital was being invested in this kind of growth strategy.And now I think that people are, again, a little bit afraid that the good times are over, and so we see this really sales-driven culture of growth, where it's like, all right, well, for this company to succeed, we have to really make sure that we're going and closing these big sales, and if individual developers can't figure out how the hell this works, well, that's their problem, and we're not going to worry about it. And we've talked about this: this fear of the good times being over drives people, I think, to all kinds of bad behavior. The rug-pulling that we've seen in open-source licensing where somebody's like, “All right, I've taken a bunch from this community, and now I'm going to keep it, and I'm not going to give anything back.” This is the behavior of people who are afraid that the good times are behind them. I don't have the luxury of being that pessimistic about the future, and I don't think our industry can afford it either.[midroll 00:15:03]Corey: The rules changing late in the game is something that has always upset me. It feels inherently unfair, and it's weird because you can have these companies say that, “Look, we've never done anything like that. Why wouldn't you trust us?” Right up until the point where they do. Reddit is a great example, where for years, they had a great API—ish—that could do things that their crap-ass mobile client natively couldn't. And Apollo was how I interacted with Reddit constantly. I was a huge Reddit user. I was simultaneously, at one point, moderator of the legal advice subreddit and the personal finance subreddit. I was passionate about that stuff, and it was great.And then they wound up effectively killing all third-party clients that don't bend the knee, and well, why am I going to spend my time donating content and energy and time to a for-profit company that gets very jealous when other people find ways to leverage their platform in ways that they don't personally find themselves able to do. Screw ‘em. I haven't been back on Reddit since. It's just a, “Fool me once, shame on me story.” Twitter did the exact same thing. I built a threading Twitter client simultaneously deployed to 20 AWS regions, until they decided they didn't want people creating content through their APIs and killed the whole thing with no notice. Great. Now, they're—I got an email asking me to come back. Go to hell. I tried that once. You've eviscerated people's businesses and the rest.And you see it with licensed changes as well. But it all comes down to the same thing, from my perspective, which is an after-the-fact changing of the rules. And by moving the goalposts like that, I wonder what guarantees a startup or a project that doesn't intend to do those things can offer to its community. 
Because, look, HashiCorp made its decision to change the licensing for Terraform. Good for them. They're entitled to do that. I'm not suggesting, in any way shape or form, that they have violated any legal term.And I don't even know they're necessarily doing anything that doesn't make sense from their point of view. And the only people I really see that upset about it are licensing purists—which I no longer am for a variety of reasons—people who work at HashiCorp, obviously, and their direct competitors who are not sympathetic in that particular place. But as a counterpoint, if they wind up building a new open-source project, of course, I'm not going to contribute. I mean, that's a decision I get to make. And I don't know how you square that circle because otherwise, if that continues, no one will be able to have a sense of safety around contributing to anything open-source unless they're pleased to wind up doing volunteer work for a one-day unicorn.Danilo: So, I really appreciate the economical survey of the landscape that you just provided because I think that captures it really well. The Reddit case in particular breaks my heart. I will go to my grave absolutely loving Steve Huffman. Steve Huffman gave me my first break as a paid developer and product designer, and he was an enormous pain in the ass to work with, and I loved every minute of it. Like, he's just an interesting, if volatile, character.And I see that volatility playing out with Reddit in the incredible hostility that they were conveying around being held to account for these changes. And I have a lot of sympathy for that crew because they've built all this value, they kind of missed the euphoria boat in terms of, you know, getting the best price for an IPO, for example, and they've got to figure out, all right, how do we scrape together value from what we've got within the constraints that we have? How do we build a fence around the value that we've got and put a tollbooth in front of it so that the public markets are excited about this and give us our best bang for the buck? That's Steve Huffman's job. That's his crew's job. I understand the pressures and I respect that.And I think that the way they went about it this year was short-sighted because what it does is it undervalues everybody who isn't in the boardroom, making decisions with them. I think what we have to understand is that when we build software, Metcalfe's law applies to developer tools just as much as any other network here. And so, the people who are stakeholders, who are participants, who are constituents of your community, are load-bearing members of the value chain that you are putting together, and so when you just cut them out, you might be nicking an artery that bleeds out very, very, very slowly. And the sentiment that you just expressed here about how your experience of Reddit was soured, I mean you're the enthusiast type, right? Like, who wants to sign up for the drama of flame wars and moderation except if you really just love it?And so, what they were able to do was take people who, for years, absolutely loved it, and just drain away their love and enthusiasm for it. And the thing is, over time, that harms the long-term value that you are trying to actually protect.
When we live in a world where computers can do all of this stuff infinitely, when they will provide us with extraordinary scale, when information can be copied and distributed at near-zero marginal cost, what we're doing is setting up chains of incentives to get people to do stuff, essentially, for free. You were unpaid labor doing that moderation, and the reason that you did it for free was because it was fun, was because it spoke to something inside of you that really mattered, and you wanted to provide for a community of other people who also cared about these topics. And that fun was taken away from you. So, there's a bunch of this stuff that doesn't fit into a spreadsheet, and if we make decisions exclusively on what fits into a spreadsheet, we're going to turn around someday and find that we have cut off some of the most valuable parts of what makes this industry great.Corey: I agree. I feel like companies have a—they launch, and they want the benefits of having an open-source community, but as they grow and get to a point of success and becoming self-sustaining, it's harder to see those benefits because at that point, it just feels like it's all downside: you are basically giving what you built away to your direct competitors, you are seeing significant value scattered throughout the ecosystem that you are capturing a very small portion of, and it becomes frustrating—especially in historical environments—where you have the sense of—back when you built the company years ago, it's well, obviously we'd be the best place to host and run this because no one's going to run this as well as the people who built it. And then cloud companies, with their operational excellence, come in and put the lie to that, in many cases [laugh]. It's like, oh dear. Not like that.And I understand, truly, the frustration and the pain and the fear that drives companies in that position. And I don't have a better answer, which is my big problem because I'm just sitting here saying, “You're doing it wrong. Don't do it like that.” “Okay, well, what should they do instead?” “No, I just want to be angry. I'm not here to offer solutions.” And I feel for them. I do. I have a lot of empathy for everyone involved in this conversation. It just sucks, but we need a better outcome than the current state, or we're going to not see the same open innovation. Even these days, when I build things, by default, I don't build in the open, not because I'm worried about competitive threats, but because I don't want to deal with people complaining to me about things that I've built and don't want to think about this week.Danilo: I think that we're living through the hangover of—I mean, if you looked at the crypto craze as an example of this hangover, right—here we were with the sky the limit. We can sell monkey pictures for extraordinary amounts of money and there's nothing behind it. We went from euphoria to fear in the space of a handful of quarters. And so, that has put all of us, even the most optimistic, in a place where we feel our backs are against the walls. But I think the responsibility we have is, again, computing fundamentally changes the economics of so many categories of labor, and it changes the economics of information generally.And so, we can do a bunch of stuff that doesn't cost that much over the long-term, relative to the value it creates. But it only works if we have a really clear thesis of the value we're creating. 
If we don't value the contributions of a community, if we don't value the emergent outcomes that arise from building something that's very expressive, that then lets outsiders show up and do things that we never predicted, if we're not building strategies that look at this value as something that is precious instead of something to be cut off and captured, then I think that we just continue to spiral down the drain of paranoia, and greed, and fear instead of doing things that actually create long-term sustainable growth for our business.Corey: I really wish that there were easier, direct paths. Like on some level, too, it's—I feel like this is part of the problem, that every company views going public as its ultimate goal.Danilo: Yeah.Corey: At least that's what it feels like. Like The Duckbill Group. If we ever go public, my God, I will have been so far gone from this company long before then, just because at that point, you have given control over to people who are not aligned, in many cases, with the values that you founded the company with. Like, one of the things I love about being a small business is that I don't need to necessarily think the next quarter's earnings. I can think longer-term. “Okay, in two or three years, what do I want to be doing?” Or five or ten. I'm not forced into this narrow, short-sighted treadmill where I have to continually show infinite growth in all areas at all times. That doesn't sound healthy.Danilo: I agree, and I think that this is a place where I can give you a lot of hope because I look at a handful of economic tailwinds that are really going to make it possible to build businesses in a different way than was practical before. If we look at the last cycle, one of the absolute game changers was open-source. So, you showed up and there was already a web server written for you, and there was already a database written for you, and so you would just pull these things off the shelf instead of having to hire a team that would build your web server from scratch, that would build your database from scratch. And so, that changed the economics of how companies could be made, and that created an entire cycle of new technology growth.And if we look for an analogy of that kind of labor savings for the next technology cycle, we're going to see things like cloud-based serverless services, right? So like, now you don't need to even administer a Linux server. You don't need to know how the server works under the hood. You pay one company for an API that gives you a database, and they manage the stuff. So, I'm thinking of companies like Neon, or PlanetScale, right? You give them cash, they give you a database, they worry about it, they do all of the on-call stuff, you don't have to think about. So, this makes it even cheaper to build things of higher complexity because you are outsourcing much of the management of that complexity to other firms. And I think that that pattern is going to change the overall costs of starting and scaling and maintaining any sort of web-based product. And so, that's number one.And then number two, is that when we look at stuff like large language models, the stuff that you can do with ChatGPT in terms of figuring out how to solve a broad array of problems that maybe you don't have a lot of domain expertise in, I think that means that we're going to see smaller teams get even further than we expect. 
And so, the net result of these trends is going to be, you don't need to take vast amounts of venture funding in order to get to a company that serves a large number of people at a meaningful scale, with meaningful returns for the principals involved, and then they don't have to go all the way down the IPO route. They don't have to figure out some sort of mega-scale unicorn exit; they can just build companies that work, that solve customer problems, keep it close, and then you don't have the totalizing endless need for growth. I think we're going to see a lot more of that this cycle.Corey: I sure hope you're right. I think that there's been a clear trend toward panic, or at least if not panic, then at least looking at current conditions and assuming that they'll persist forever. We just saw ten years of an unprecedented bull run, where people tended to assume that interest rates would be forever low, growth was always going to be double-digit at least, and there was no need to think about anything that would ever argue against those things. For the first few years of my consulting company, it was a devil of a problem trying to convince people to care about their AWS bills because frankly, when money is free, there is no reason for someone to. They are being irrational if they do. Now, of course, that's a very different story, but at the time, I felt for a while like I was the one who was nuts.Danilo: So, the interest rate conditions are always going to make people behave a certain way. That's why they exist, right? We have monetary policy designed to influence business behavior. And if we look at that zoomed out, then we say, “All right, look, this stuff is all cyclical. We know there's going to be good times, we know there's going to be lean times, but at the end of the day, we care about building stuff.” Right?I don't spend a lot of time with the sort of venture capitalist set who's really obsessed with building, but I really love building. I just, I can't stop building things. It is what I was put on this planet to do, and I think that there are so many people who feel exactly the same way. And so, regardless of the larger interest-rate phenomenon, we have to find a path where we can just build the stuff that we need to build. Build it for our reasons, for the right reasons, not because we just want to cash out. Although, you know, getting paid is great. I don't begrudge anyone that.Corey: You can't eat aspirations, as it turns out.Danilo: That's right, right? We've got to worry about the economics, and that's reasonable. But at the end of the day, making things happen through technology is its own mission and its own reward, regardless of what some sort of venture fund needs to make returns happen. So, I think that we are going to get past this moment of slump and return to the fundamentals of we need to build technology because building technology makes us feel good and creates impact in the world that we absolutely need. And those are the fundamentals of this business.Corey: I agree with you wholeheartedly. I think that I've been around too many cycles—this is a polite way of saying I'm old—and you learn when that happens that everything that feels so immediate and urgent in the moment, in the broad sweep of things, so rarely is. Not everything can be life or death because you'll die lots of times.Danilo: Yeah.Corey: I really want to thank you for taking the time to speak with me.
If people want to learn more, where's the best place for them to find you?Danilo: If you want to engage me for my thinking and strategy around humanist technology tools growth, you should find me at antigravitydx.com. And if you want to read more about what I think about, I maintain a blog at redeem-tomorrow.com, and you can learn all about my thinking about the last cycle, and the coming one as well.Corey: And I will absolutely include a link to that in the [show notes 00:31:52]. Thank you so much for taking the time to speak with me. I appreciate it.Danilo: It's a pleasure, Corey. Thank you for having me. Really great to chat.Corey: Danilo Campos, proprietor at Antigravity DX. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment taking care within that comment to link to a particular section of the FutureAdvisor code repo.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
How Vercel is Improving the Developer Experience on the Front End with Guillermo Rauch

Screaming in the Cloud

Play Episode Listen Later Dec 21, 2023 33:16


Guillermo Rauch, Founder and CEO of Vercel, joins Corey on Screaming in the Cloud to discuss how he decided to focus on building a front-end tool that is fast, reliable, and focuses on the developer experience. Guillermo explains how he discovered that JavaScript was the language that set online offerings apart, and also reveals the advice he gives to founders on how to build an effective landing page. Corey and Guillermo discuss the effects of generative AI on developer experience, and Guillermo explains why Vercel had a higher standard for accuracy when rolling out their new AI product for developers, v0. About Guillermo: Guillermo Rauch is Founder and CEO of Vercel, where he leads the company's mission to enable developers to create at the moment of inspiration. Prior to founding Vercel, Guillermo co-founded LearnBoost and Cloudup, where he served as CTO through its acquisition by Automattic in 2013. Originally from Argentina, Guillermo has been a developer since the age of ten and is passionate about contributing to the open source community. He has created a number of JavaScript projects including socket.io, Mongoose.js, Now, and Next.js. Links Referenced: Vercel: https://vercel.com/ v0.dev: https://v0.dev Personal website: https://rauchg.com Personal twitter: https://twitter.com/rauchg Transcript: Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I don't talk a lot about front-end on this show, primarily because I am very bad at front-end, and in long-standing tech tradition, if I'm not good at something, apparently I'm legally obligated to be dismissive of it and not give it any attention. Strangely enough, I spent the last week beating on some front-end projects, and now I'm not just dismissive, I'm angry about it. Here to basically suffer the outpouring of frustration and confusion is Guillermo Rauch, founder and CEO of Vercel, but also the creator of Next.js. Guillermo, thank you for joining me.Guillermo: Great to be here. Thanks for setting me up with that awesome intro.Corey: It's true, if I were talking to someone who looked at what I've done, and for some Godforsaken reason, they wanted to follow in my footsteps, well, that path has been closed, so learning a bunch of Perl early on and translating it all to bad bash scripts and the rest, and then maybe picking up Python isn't really the way that I would advise someone getting started today. The de facto lingua franca of the internet is JavaScript, whether we like it or not, and I would strongly suggest that be someone's first language despite the fact that I'm bad at it, I don't understand it, and therefore it makes me angry.Guillermo: Yeah, it's so funny because it sounds like my story. And my personal journey was, when I was a kid, I had a—I knew I wanted to hack around with computers, and reverse engineer them, and improve them, and just create my own things, and I had these options for what programming language I could go with. And I tried it all: PHP, Perl, [Mod PHP 00:02:12], [Mod Perl 00:02:13], Apache, LAMP, cgi-bin folders, the whole nine yards.
And regardless of what back-end technology I used, I encountered this striking fact, which was… the thing that can make your product really stand out in a web browser is typically involving JavaScript in some fashion. So, when Google came out with suggestions as you would type in a search box, my young kid Argentinian brain blew up. I was like, “Holy crap, they can suggest, they can read my mind, and they can render suggestions without a full page refresh? What is that magic?”And then more products like that came out. Google Docs, Gmail Chat, Facebook's real-time newsfeed, all the great things on the internet seemed to have this common point of, there's this layer of interactivity, real-time data, customization, personalization, and it seems uniquely enabled by the front-end. So, I just went all in. I taught myself how to code, I taught myself—I became a front-end engineering expert, I joined some of the early projects that shaped the ecosystem. Like, there was this library called MooTools, and a lot of folks might not have heard that name. It's in the annals of JavaScript history.And later on, you know, what I realized is, what if front-end can actually be the starting point of how you develop the best applications, right, rather than this thing that people, like, reluctantly frown upon, like yourself. I mention that as an opportunity rather than a diss because when you create a great front-end experience, now the data has proven you run a better business, you run a more dynamic business, probably are running an AI-powered business, like, all of the AI products on the planet today are using this technology to stream text in front of your eyes in real time and do all these awesome things. So yeah, I became obsessed with front-end, and I founded this company Vercel, which is a front-end cloud. So, you come here to basically build the best products. Now, you don't have to build the back-end, so you can use back-ends that are off the shelf, you can connect to your existing back-ends, and we piggyback on the world's best infrastructure to make this possible, but we offer developers a very streamlined path to create these awesome products on the internet.Corey: I have to say that I have been impressed when I've used Vercel for a number of projects. And what impresses me is less the infrastructure powering it, less the look at how performant it is and all the stuff that most people talk about, but as mentioned, I am not good at front-end or frankly programming at all. And so, many products in this space fall into the very pernicious trap of, “Oh, well, everyone who's using this is at least this tall on the board of how smart you are to get on the amusement park ride.” So, I feel that when I'm coming at this from a—someone who is not a stranger to computers but is definitely new to this entire ecosystem, everything just made sense in a way that remarkably few products can pull off. I don't know if you would call that user experience, developer experience, or what the terminology you bias for there is, but it is a transformational difference.Guillermo: Thank you. I think it's a combination of things. So, developer experience has definitely always been a focus for me. I was that weird person that obsessed about the CLI parameters of the tool, and the output of the tool, and just like how it feels for the engineer.
I did combine that with—and I think this is where Vercel really stands out—I did combine that with a world-class infrastructure bit because what I realized after creating lots and lots of very popular open-source projects—like, one is called Socket.io, and another one called Mongoose—DX, or developer experience, in the absence of an enticing carrot for the business doesn't work. Maybe it has some short-term adoption, maybe it has raving fans on Twitter or X [laugh], but at the end of the day, you have to deliver something that's tangible to the end-user and to the business.So I think Vercel focusing on the front-end has found a magical combination there of I can make developers' lives easier. Being a developer myself, I just tremendously empathize with that, but it can also make more profit for the business. When they make your website faster and render more dynamic data that serve as better recommendations for a product on e-commerce or in a marketing channel, I can help you roll out more experiments, then they make your business better, and I think that's one of the magical combinations there.The other thing, frankly, is that we'd started doing fewer things. So, when you come to Vercel, you typically come with a framework that you've already chosen. It could be Next.js, it could be Vue, [unintelligible 00:07:18], there's 35-plus frameworks. But we basically told the market, you have to use one of these developer tools in order to guide your development.And what companies were doing before—I mean, this almost seems obvious in retrospect, that we would optimize for certain patterns and certain tools—but what the market was doing before was rolling out your own frameworks. Like, every company was, basically—React, for example, is a very popular way of building interfaces, and our framework actually is built on top of React. But when I would go and talk to all these principal engineers at all these companies, they were saying, oh yeah, “We're creating our own framework. We're creating our own tools.” And I think that to me now feels almost like a zero interest rate phenomenon. Like, what business do you have in creating frameworks, tools, and bespoke infra when you're really in the business of creating delightful experiences for your customers?Corey: What I think is lost on a lot of folks is that if you are trying to learn something new and use a tool, and the developer experience is bad, the takeaway—at least for me and a lot of people that I talk to is not, “Oh, this tool has terrible ergonomics. That's its problem.” Instead, the takeaway is very much, “Oh, I'm dumb because I don't understand this thing.” And I know intellectually that I am not usually the dumbest person in the world when it comes to a particular tool or technology, but I keep forgetting that on a visceral level. It's, “I just wish I was smart enough to understand that.”No, I don't. I wish it was presented in a way that was more understandable and the documentation was better. When you're just starting out and building something in your spare time, the infrastructure cost is basically nothing, but your time is the expensive part in it. So, if you have to spend three hours to track down something just because it wasn't clearly explained, the burden of adopting that tool is challenging. I would argue that one of the reasons that AWS sees some of the success that it does is not necessarily because it's great so much as because everyone knows how it breaks.
That's important.I'm not saying their infrastructure isn't world-class—please, don't come at me in the comments on this one—but I am saying that we know where its sharp edges are, and that means that we're more comfortable building with it. But the idea of learning a brand-new cloud with different sharp edges in other areas. That's terrifying. I'd rather stick with the devil I know.Guillermo: Exactly. I just think that you're not going to be able to make a difference for customers in 2023 by creating another bespoke cloud that is general purpose, it doesn't really optimize around anything, and you have to learn all the sharp edges from scratch. I think we saw that with the rise of cloud-native companies like Stripe and Twilio where they were going after these amazingly huge markets like financial infrastructure or communications infrastructure, but the angle was, “Here's this awesome developer experience.” And that's what we're doing with Vercel for the front-end and for building products, right? There has to be an opinionated developer experience that guides you to success.And I agree with you that there's really, these days in the developer communities, zero tolerance for sharp edges, and we've spent a lot of time in—even documentation, like, it used to be that your startup would make or break it by whether you had great documentation. I think in the age of frameworks, I would even dare say that documentation, of course, is extremely important, but if I can have the tool itself guide you to success, at that point, you're not even reading documentation. We're now seeing this with AI and, like, generative AI for code. At Vercel, we're investing in generative AI for user interfaces. Do you actually need to read documentation at that point? So, I think we're optimizing for the absolute minimal amount of friction required to be successful with these platforms.Corey: I think that there's a truth in that of meeting customers in where they are. Your documentation can be spectacular, but people don't generally read the encyclopedia for fun either. And the idea of that is that—at least ideally—I should not need to go diving into the documentation, and so many tools get this wrong, where, “Oh, I want to set up a new project,” and it bombards you with 50 questions, and each one of these feels pretty… momentous. Like, what one-way door am I passing through that I don't realize the consequences of until I'm 12 hours into this thing, and then have to backtrack significantly. I like, personally, things that bias for having a golden path, but also make it easy to both deviate from it, as well as get back onto it. Because there's more than one way to do it is sort of the old Perl motto. That is true in spades in anything approaching the JavaScript universe.Guillermo: Yeah. I have a lot of thoughts on that. On the first point, I completely agree that the golden path of the product cannot be documentation-mediated. One of the things that I've become obsessive about—and this is an advice that I share with a lot of other startup founders is, when it comes to your landing page, the primary call to action has to be this golden path to success, like, 2, 3, 4 clicks later, I have to have something tangible. That was our inspiration.And when we made it the primary call to action for Vercel is deploy now. Start now. Get it out now. Ship it now. And the way that you test out the platform is by deploying a template. 
What we do is we create a Git repo for you, it sets up the entire CI/CD pipeline, and then at that point, you already have something working, something in the cloud, you spent zero time reading documentation, and you can start iterating.And even though that might not be the final thing you do in Vercel, I always hear the stories of CTOs that are now deploying Vercel at really large scale, and they always tell me, “I started with your hobby tier, I started with the free tier, I deployed a template, I hacked on a product during the weekend.” Now, a lot of our AI examples are very popular in this crowd. And yeah, there's a golden path that requires zero documentation. Now, you also mentioned that, what about complexity? This is an enterprise-grade platform. What about escape hatches? What about flexibility? And that's where our platform also shines because we have the entire power of a Turing-complete language, which is JavaScript and TypeScript, to customize every aspect of the platform.And you have a framework that actually answered a lot of the problems that came with serverless solutions in the past, which is that you couldn't run any of that on your local machine. The beauty of Vercel and Next.js is we kind of pioneered this concept that we called ‘Framework Defined Infrastructure.' You start with the framework, the framework has this awesome property that you can install on your computer, it has a dev command—like, it literally runs on your computer—but then when you push it to the cloud, it now defines the infrastructure. It creates all of the resources that are highly optimal.This creates—basically converts what was a single-node system on your computer to a globally distributed system, which is a very complex and difficult engineering challenge, but Vercel completely automates it away. And now for folks that are looking for, like, more advanced solutions, they can now start poking into the outputs of that compilation process and say, “Okay, I can now have an influence or I can reconfigure some aspects of this pipeline.” And of course, if you don't think about those escape hatches, then the product just ends up being limiting and frustrating, so we had to think really hard about meeting both ends of the spectrum.Corey: In my own experimentations early on with Chat-Gippity—which is how I insist on pronouncing ChatGPT—a lot of what I found was that it was a lot more productive for me to, instead of asking just the question and getting the answer was, write a Python script to—Guillermo: Yes.Corey: Query this API to get me that answer. Because often it would be wrong. Sometimes very convincingly wrong, and I can at least examine it in various ways and make changes to it and iterate forward, whereas when everything is just a black box, that gets very hard to do. The idea of building something that can be iterated on is key.Guillermo: I love that. The way that Vercel actually first introduced itself to the world was this idea of immutable deployments and immutable infrastructure. And immutable sounds like a horrible word because I want to mutate things, but it was inspired by this idea of functional programming where, like, each iteration to the state, each data change, can be tracked. So, every time you deploy in Vercel, you get this unique URL that represents a point-in-time infrastructure deployment.
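As a concrete illustration of the 'Framework Defined Infrastructure' idea Guillermo describes, here is a minimal sketch of a Next.js route handler; the route path and response shape are invented for the example. The same file runs on a laptop under the framework's dev command and, when the project is pushed to Vercel, is provisioned as a serverless function without any separate infrastructure definition.

```typescript
// app/api/hello/route.ts — illustrative path. Runs locally with `next dev`,
// and is compiled into a serverless function when the project is deployed.
export async function GET(request: Request) {
  const name = new URL(request.url).searchParams.get("name") ?? "world";
  // No servers, regions, or function config declared anywhere here:
  // the framework's build output is what defines the infrastructure on deploy.
  return Response.json({ greeting: `Hello, ${name}!` });
}
```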
You can go back in time, you can revert, you can use this as a way of collaborating with other engineers in your team, so you can send these hyperlinks around to your front-end projects.And it gives you a lot of confidence. Now, you can iterate knowing that before things go out, there's a lot of scrutiny, there's a lot of QA, there's a lot of testing processes that you can kick off against this serverless infrastructure that was created for each deployment. The conclusion for us so far has been that our role in the world is to increase iteration velocity. So, iteration speed is the faster horse of the cloud, right? Like, instead of getting a car, you get a faster horse.When you say, “Okay, I made the build pipeline 10% faster,” or, “I brought the TLS termination 10% closer to the visitor, and, like, I have more [pops 00:17:10],” things like that. That, to me, is the speed. You can do those things, and they're awesome, but if you don't have a direction—which is velocity—then you don't know what you're building next. You don't know if your customers are happy. You don't know if you're delivering value. So, we built an entire platform that optimizes around, what should you ship next? What is the friction involved in getting your next iteration out? Is launching an experiment on your homepage, for example, is that a costly endeavor? Does it take you weeks? Does it take you months?One of the initial inspirations for just starting Vercel and making deployments really easy was, how difficult is it for the average company to change, in the footer of their website, this copyright 2022? And you have—it's a new year. You have to bump it to copyright 2023. How long do you think it takes that engineer to, A, run the stack locally, so they can actually see the change; deploy it, but deploy to what we call the preview environment, so they can grab that URL and send [it to 00:18:15], Corey, and say, “Corey, does it look good? I updated [laugh] I updated the year in the footer.”And then you tell me, “Looks good, let's ship it to production.” Or you tell me, “No, no, no, it's risky. Let's divide it into two cohorts: 50% of traffic gets 2022, 50% of traffic gets 2023.” Obviously, this is a joke, but consider the implications of how difficult it is in the average organization to actually do this thing.Corey: Oh, I find things like that all the time, especially on microservices that I built to handle some of my internal workflows here, and I haven't touched in two or three years. And okay, now it's time for me to update them to reflect some minor change. And first, I wind up with screaming Node warnings and I have to update things so that they actually work in a reasonable way. And, on some level, making a one-line change can take half a day. Now, in the real world, when people are working on these apps day-in and day-out, it gets a lot easier to roll those changes in over time, but coming back to something unmaintained, that becomes a project the longer you let it sit.Part of me wishes that there were easier ways around it, but there are trade-offs in almost any decision you make. If you're building something from the beginning of, well, I want to be able to automatically update the copyright year, you can even borderline make that something that automatically happens based upon the global time, whereas when you're trying to retrofit it afterwards, yeah, it becomes a project.Guillermo: Yeah, and now think: that's just a simple example of changing a string.
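As a small illustration of Corey's aside about building the year bump in from the start, here is a sketch of a footer component that derives the year at render time; the component and company name are made up for the example.

```tsx
// SiteFooter.tsx — hypothetical component. The year is computed when the
// page renders, so the annual "bump the copyright" deploy never has to happen.
export function SiteFooter() {
  const year = new Date().getFullYear();
  return <footer>&copy; {year} Example Co. All rights reserved.</footer>;
}
```

One caveat worth noting: if the page is statically generated, the year is baked in at build time, which still means at most one redeploy per year rather than a hand edit.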
That might be difficult for product engineering in any organization. Or it may be slow, or it may be not as streamlined, or maybe it works really well for the first project that that company created. What about every incremental project thereafter?So, now I said—let's stop talking about a string, right? Let's think about an e-commerce website where, what we hear from our customers is, on average, like, 10% of revenue flows through the homepage. Now, I have to change a primary component that renders on the hero of the page, and I have to collaborate with every department in the organization. I have to collaborate with the design team, I have to collaborate with marketing, I have to collaborate with the business owners to track the analytics appropriately. So, what is the cost of every incremental experiment that you want to put in production?The other thing that's particularly interesting about front-end as it relates to cloud infrastructure is, scaling up front-end is a very difficult thing. What ends up happening is most front-ends are actually static websites. They're cached at the edge—or they're literally statically generated—and then they push all of the dynamism to the client side. So, you end up with this spaghetti of script tags on the client, you end up accumulating a lot of tech debt in the [shipping 00:20:56] huge bundles of JavaScript to the client to try to recover some dynamism, to try and run these experiments. So, everyone is in this, kind of, mess of, yes, maybe we can experiment, but we kind of offloaded the rendering work to the client. That in turn makes me—basically, I'm making the website slower for the visitor. I'm making them do the rendering work.And I'm trying to sell them something. I'm trying to speed up some processes. It's my responsibility to make it fast. So, what we ended up finding out is that yes, the cloud moved this forward a lot in terms of having these awesome building blocks, these awesome infrastructure primitives, but both the developer experience of just changing something about your web product and also the end-user experience of that web product rendering really fast, those things really didn't happen with this first chapter of the cloud. And I think we're entering a new generation of higher-level clouds like Vercel that are optimizing for these things.Corey: I think that there's a historical focus on things that have not happened before. And that was painful and terrible, so we're not going to be focusing on what's happening in the future, we're going to build a process or a framework or something that winds up preventing that thing that hurt us from hurting us again. Now, that's great in moderation, but at some point—we see this at large companies from time to time—where you have so much process that is ossified scar tissue, basically, that it becomes almost impossible to get things done. Because oh, I want to make that—for example, that one-line change to a copyright date, well, here's the 5000 ways deploys have screwed us before, so we need to have three humans sign off on the one-line change, and a bunch of other stuff before it ever sees the light of day. Now, I'm exaggerating slightly, just as you are, but that feels like it acts as a serious brake on innovation.On the exact opposite side, where we see massive acceleration has been in the world of generative AI. Yes, it is massively hyped in a bunch of ways.
I don't think it is going to be a defined way that changes the nature of humanity the way that some of these people are going after, but it's also clearly more than a parlor trick.Guillermo: I'm kind of in that camp. So, like you, I've been writing code for many years. I'm pretty astonished by the AI's ability to enhance my output. And of course, now I'm not writing code full time, so there is a sense of, okay because I don't have time, because I'm doing a million things, any minute I have seems like AI has just made it so much more worthwhile and I can squeeze so much more productivity out of it. But one of the areas that I'm really excited about is this idea of generative UI, which is not just autocompleting code in a text editor, but is the idea that you can use natural language to describe an interface and have the AI generate that interface.So, Vercel created this product called v0—you can check it out at v0.dev—where to me, it's really astonishing that you can get these incredibly high-quality user interfaces, and basically all you have to do is input [laugh] a few English words. I have this personal experience of, I've been learning JavaScript and perfecting all my knowledge around it for, like, 20 or so years. I created Next.js.And Next.js itself powers a lot of these AI products. Like the front-end of ChatGPT is built on Next.js. And I used v0 to create… to basically recreate my blog. Like, I created rauchg.com, I deployed it on Vercel, but every pixel of that UI, I handcrafted.And as we were working on v0, I said okay, “I'm going to challenge myself to put myself back in the shoes of, like, I'm going to redesign this and I'm going to start over with just human language.” Not only did I arrive at the right look and feel of what I wanted to get, the code that it produced was better than I would have written by hand. Concretely, it was more accessible. So, there were areas of the UI where, like, some icons were rendered where I had not filled in those gaps. I just didn't know how to do that. The AI did. So, I really believe that AI will transform our lives as [laugh] programmers, at least I think, in many other areas in very profound ways.Corey: This is very similar to a project that I've embarked on for the last few days where I described the app I wanted into Chat-Gippity and followed the instructions, and first, it wound up sending me down a rabbit hole of the wrong framework version that had been deprecated, and whatnot, and then I brought it all into VS Code where Jif-Ub Copilot kept switching back and forth between actively helpful, and ooh, the response matches publicly available code, so I'm not going to tell you the answer, despite the fact that that feature has never been enabled on my account. So yeah, of course, it matches publicly available code. This is quite literally the React tutorial starter project. And it became incredibly frustrating, but it also would keep generating things in bursts, so my code is not at all legible or well organized or consistent for that matter. But it's still better than anything I'd be able to write myself. I'm looking forward to using v0 or something like it to see how that stacks up for some of my ridiculous generation ideas for these things.Guillermo: Yeah, you touched on a very important point, which is: the code has to work. The code has to be shippable. I think a lot of AI products have gotten by by giving you an approximation of the result, right? Like, they hallucinate sometimes, they get something wrong.
It's still very helpful because sometimes it's sending you in the right direction.But for us, the bar is that these things have to produce code that's useful, and that you can ship, and that you can iterate on. So, going back to that idea of iteration velocity, we call it v0 because we wanted it to be the first version. We still very much believe there are humans in the loop and folks will be iterating a lot on the initial draft that this thing is giving you, but it's so much better than starting with an empty code editor, [laugh] right? Like, and this applies, by the way to, like, not just new projects, but I always talk about, like, our customers have a few really important landing pages, key pages, maybe it's the product detail page in e-commerce, maybe it's your homepage and, like, your key product pages for a marketing website. Maybe it's where—and the checkout, for example, extremely important.But then there's a lot of incremental UIs that you have to add every single day. The banner for [laugh] accepting cookies or not, the consent management dialog. There's a lot of things where the worst-case scenario is that you offload them again to some third-party script, to some iframe of sorts because you really don't have the bandwidth, time, or resources to build it yourself. And then you sacrifice speed, you sacrifice brand fidelity. And again because we're the front-end cloud, we're obsessed with your ability to ship UI that's native to your product, that is streamlined, that works really well. So, I think AI is going to have a significant effect there where a lot of things where you were sending someone to some other website because you just didn't have the bandwidth to create that UI, you can now own the experience end to end.Corey: That is no small thing. One last question I have before we wind up calling this episode is, there was a period of time—I don't know if we're still in it or not—where it felt like every time I got up to get a cup of coffee and came back, there would be three JavaScript frameworks that launched during that interim. So, Next.js was one of those when someone got up to get a cup of coffee. But that's shown a staying power that is, frankly, remarkable. Why? I don't know enough about the ecosystem to have an opinion on that, but I noticed when things stand out, and Next does.Guillermo: Yeah, I think it's a number of factors. Number one, we, as an industry I think, we coalesced, and we found the right engine to build our car. And that engine became React. Most folks building UI today are choosing React or a similar engine, but React has really become the gold standard for a lot of engineers. Now, what ended up happening next is that people realized I want a car. I want the full product. I need to drive. I don't want to assemble this freaking car every single time I have a new project.And Next.js filled a very important gap in the world where what you were looking for was not a library; what you were looking for was a framework that has opinions, but those opinions are very in line with how the web is supposed to work. We took a page from, basically, the beginnings of the web. We make a lot of jokes that in many ways, our inspiration was PHP, where server rendering is the default, where it's very expressive, it's very easy to reach for data. It just works for a lot of people.
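To ground the "server rendering is the default" comparison Guillermo draws with PHP, here is a minimal sketch of what that looks like in a Next.js App Router page today; the file path, endpoint URL, and data shape are invented for the example. The component runs on the server, reaches for data directly, and ships rendered HTML to the browser.

```tsx
// app/products/page.tsx — illustrative path. An async server component:
// data fetching happens on the server, and the browser receives rendered HTML.
type Product = { id: string; name: string };

export default async function ProductsPage() {
  // Hypothetical API endpoint; in a real app this could just as easily be a
  // direct database query, since this code never ships to the client.
  const res = await fetch("https://api.example.com/products", { cache: "no-store" });
  const products: Product[] = await res.json();

  return (
    <ul>
      {products.map((product) => (
        <li key={product.id}>{product.name}</li>
      ))}
    </ul>
  );
}
```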
Again, that's the old [stack 00:30:03] in the olden days.And so, it obviously didn't quite work, but the inspiration was, can we make something that is streamlined for creating web interfaces at scale? At scale. And to your point, there's also a sense of, like, maybe it doesn't make sense anymore to build all this infrastructure from scratch every single time I start a project. So, Next filled in that gap. The other thing we did really well, I think, is that we gave people a universal model for how to use not just the server side, but also the client side strategically.So, I'll give you an example. When you go to ChatGPT, a lot of things on the screen are server rendered, but when you start doing interactions as a user, for something like when you'd say, “Hey DALL-E, generate an image,” that stuff requires a lot of optimistic UI. It requires features that are more like what a mobile native application can do. So, we can give folks the best of both worlds: the speed, interactivity, and fluidity of a native app, but we had those, sort of, fundamentals of how a website should work that even Perl and PHP had gotten right, once upon a time. So, I think we found that right blend of utility and flexibility, and folks love it, and I think, yeah, we're excited to continue to help steward this project as a standard for building on the web.Corey: I really want to thank you for taking the time to talk about a lot of the genesis of this stuff and how you view it, which I think gives us a pretty decent idea of how you're going to approach the evolution of what you've built. If people want to learn more, where's the best place for them to find you?Guillermo: So, head to vercel.com to learn about our platform. You can check out v0.dev, which we'll be opening broadly to the public soon, if you want to get started with this idea of generative UI. And myself, I'm always tweeting on X, twitter.com or x.com/rauchg to find me.Corey: One of these days we'll be able to kick that habit, I hope [laugh].Guillermo: [laugh]. Yeah.Corey: Thank you so much for being so generous with your time. I appreciate it.Guillermo: Thank you.Corey: Guillermo Rauch, founder and CEO of Vercel, and creator of Next.js. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that will be almost impossible for you to submit because that podcast platform did not pay attention to user experience.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.
This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I don't talk a lot about front-end on this show, primarily because I am very bad at front-end, and in long-standing tech tradition, if I'm not good at something, apparently I'm legally obligated to be dismissive of it and not give it any attention. Strangely enough, I spent the last week beating on some front-end projects, and now I'm not just dismissive, I'm angry about it. Here to basically suffer the outpouring of frustration and confusion is Guillermo Rauch, founder and CEO of Vercel, but also the creator of Next.js. Guillermo, thank you for joining me.Guillermo: Great to be here. Thanks for setting me up with that awesome intro.Corey: It's true, if I were talking to someone who looked at what I've done, and for some Godforsaken reason, they wanted to follow in my footsteps, well, that path has been closed, so learning a bunch of Perl early on and translating it all to bad bash scripts and the rest, and then maybe picking up Python isn't really the way that I would advise someone getting started today. The de facto lingua franca of the internet is JavaScript, whether we like it or not, and I would strongly suggest that be someone's first language despite the fact that I'm bad at it, I don't understand it, and therefore it makes me angry.Guillermo: Yeah, it's so funny because it sounds like my story. And my personal journey was, when I was a kid, I had a—I knew I wanted to hack around with computers, and reverse engineer them, and improve them, and just create my own things, and I had these options for what programming language I could go with. And I tried it all: PHP, Perl, [Mod PHP 00:02:12], [Mod Perl 00:02:13], Apache, LAMP, cgi-bin folders, all the whole nine yards. And regardless of what back-end technology I used, I encountered this striking fact, which was… the thing that can make your product really stand out in a web browser is typically involving JavaScript in some fashion. So, when Google came out with suggestions as you would type in a search box, my young kid Argentinian brain blew up. I was like, “Holy crap, they can suggest, they can read my mind, and they can render suggestions without a full page refresh? What is that magic?”And then more products like that came out. Google Docs, Gmail Chat, Facebook's real time newsfeed, all the great things on the internet seemed to have this common point of, there's this layer of interactivity, real-time data, customization, personalization, and it seems uniquely enabled by the front-end. So, I just went all in. I taught myself how to code, I taught myself—I became a front-end engineering expert, I joined some of the early projects that shaped the ecosystem. Like, there was this library called MooTools, and a lot of folks might not have heard that name. It's in the annals of JavaScript history.And later on, you know, what I realized is, what if front-end can actually be the starting point of how you develop the best applications, right, rather than this thing that people, like, reluctantly frown upon, like yourself. I mention that as an opportunity rather than a dis because when you create a great front-end experience, now the data has proven you run a better business, you run a more dynamic business, probably are running an AI-powered business, like, all of the AI products on the planet today are using this technology to stream text in front of your eyes in real time and do all this awesome things. 
So yeah, I became obsessed with front-end, and I founded this company Vercel, which is a front-end cloud. So, you come here to basically build the best products. Now, you don't have to build the back-end, so you can use back-ends that are off the shelf, you can connect to your existing back-ends, and we piggyback on the world's best infrastructure to make this possible, but we offer developers a very streamlined path to create these awesome products on the internet.Corey: I have to say that I have been impressed when I've used Vercel for a number of projects. And what impresses me is less the infrastructure powering it, less the look at how performance it is and all the stuff that most people talk about, but as mentioned, I am not good at front-end or frankly programming at all. And so, many products in this space fall into the very pernicious trap of, “Oh, well, everyone who's using this is at least this tall on the board of how smart you are to get on the amusement park ride.” So, I feel that when I'm coming at this from a—someone who is not a stranger to computers but is definitely new to this entire ecosystem, everything just made sense in a way that remarkably few products can pull off. I don't know if you would call that user experience, developer experience, or what the terminology you bias for there is, but it is a transformational difference.Guillermo: Thank you. I think it's a combination of things. So, developer experience has definitely always been a focus for me. I was that weird person that obsessed about the CLI parameters of the tool, and the output of the tool, and just like how it feels for the engineer. I did combine that with—and I think this is where Vercel really stands out—I did combine that with a world-class infrastructure bit because what I realized after creating lots and lots of very popular open-source projects—like, one is called Socket.io, and other one called Mongoose—DX, or developer experience, in the absence of an enticing carrot for the business doesn't work. Maybe it has some short term adoption, maybe it has raving fans on Twitter or X [laugh], but at the end of the day, you have to deliver something that's tangible to the end-user and to the business.[unintelligible 00:06:33] Vercel focusing on the front-end has found a magical combination there of I can make the developer lives easier. Being a developer myself, I just tremendously empathize with that, but it can also make more profit for the business. When they make your website faster and render more dynamic data that serve as better recommendations for a product on e-commerce or in a marketing channel, I can help you roll out more experiments, then they make your business better, and I think that's one of the magical combinations there.The other thing, frankly, is that we'd started doing fewer things. So, when you come to Vercel, you typically come with a framework that you've already chosen. It could be Next.js, it could be View, [unintelligible 00:07:18], there's 35-plus frameworks. But we basically told the market, you have to use one of these developer tools in order to guide your development.And what companies were doing before—I mean, this almost seems obvious in retrospect, that we would optimize for her certain patterns and certain tools—but what the market was doing before was rolling out your own frameworks. Like, every company was, basically—React, for example, is a very popular way of building interfaces, and our framework actually is built on top of React. 
But when I would go to and talk to all these principal engineers that all these companies, they were saying, oh yeah, “We're creating our own framework. We're creating our own tools.” And I think that to me now feels almost like a zero interest rate phenomenon. Like, what business do you have in creating frameworks, tools, and bespoke infra when you're really in the business of creating delightful experiences for your customers?Corey: What I think is lost on a lot of folks is that if you are trying to learn something new and use a tool, and the developer experience is bad, the takeaway—at least for me and a lot of people that I talk to is not, “Oh, this tool has terrible ergonomics. That's it's problem.” Instead, the takeaway is very much, “Oh, I'm dumb because I don't understand this thing.” And I know intellectually that I am not usually the dumbest person in the world when it comes to a particular tool or technology, but I keep forgetting that on a visceral level. It's, “I just wish I was smart enough to understand that.”No, I don't. I wish it was presented in a way that was more understandable and the documentation was better. When you're just starting out and building something in your spare time, the infrastructure cost is basically nothing, but your time is the expensive part in it. So, if you have to spend three hours to track down something just because it wasn't clearly explained, the burden of adopting that tool is challenging. I would argue that one of the reasons that AWS sees some of the success that it does is not necessarily because it's great so much as because everyone knows how it breaks. That's important.I'm not saying their infrastructure isn't world-class—please, don't come at me in the comments on this one—but I am saying that we know where its sharp edges are, and that means that we're more comfortable building with it. But the idea of learning a brand-new cloud with different sharp edges in other areas. That's terrifying. I'd rather stick with the devil I know.Guillermo: Exactly. I just think that you're not going to be able to make a difference for customers in 2023 by creating another bespoke cloud that is general purpose, it doesn't really optimize around anything, and you have to learn all the sharp edges from scratch. I think we saw that with the rise of cloud-native companies like Stripe and Twilio where they were going after these amazingly huge markets like financial infrastructure or communications infrastructure, but the angle was, “Here's this awesome developer experience.” And that's what we're doing with Vercel for the front-end and for building products, right? There has to be an opinionated developer experience that guides you to success.And I agree with you that there's really, these days in the developer communities, zero tolerance for sharp edges, and we've spent a lot of time in—even documentation, like, it used to be that your startup would make or break it by whether you had great documentation. I think in the age of frameworks, I would even dare say that documentation, of course, is extremely important, but if I can have the tool itself guide you to success, at that point, you're not even reading documentation. We're now seeing this with AI and, like, generative AI for code. At Vercel, we're investing in generative AI for user interfaces. Do you actually need to read documentation at that point? 
So, I think we're optimizing for the absolute minimal amount of friction required to be successful with these platforms.Corey: I think that there's a truth in that of meeting customers in where they are. Your documentation can be spectacular, but people don't generally read the encyclopedia for fun either. And the idea of that is that—at least ideally—I should not need to go diving into the documentation, and so many tools get this wrong, where, “Oh, I want to set up a new project,” and it bombards you with 50 questions, and each one of these feels pretty… momentous. Like, what one-way door am I passing through that I don't realize the consequences of until I'm 12 hours into this thing, and then have to backtrack significantly. I like, personally, things that bias for having a golden path, but also make it easy to both deviate from it, as well as get back onto it. Because there's more than one way to do it is sort of the old Perl motto. That is true in spades in anything approaching the JavaScript universe.Guillermo: Yeah. I have a lot of thoughts on that. On the first point, I completely agree that the golden path of the product cannot be documentation-mediated. One of the things that I've become obsessive about—and this is an advice that I share with a lot of other startup founders is, when it comes to your landing page, the primary call to action has to be this golden path to success, like, 2, 3, 4 clicks later, I have to have something tangible. That was our inspiration.And when we made it the primary call to action for Vercel is deploy now. Start now. Get it out now. Ship it now. And the way that you test out the platform is by deploying a template. What do we do is we create a Git repo for you, it sets up the entire CI/CD pipeline, and then at that point, you already have something working, something in the cloud, you spent zero time reading documentation, and you can start iterating.And even though that might not be the final thing you do in Vercel, I always hear the stories of CTOs that are now deploying Vercel at really large scale, and they always tell me, “I started with your hobby tier, I started with free tier, I deployed a template, I hacked on a product during the weekend.” Now, a lot of our AI examples are very popular in this crowd. And yeah, there's a golden path that requires zero documentation. Now, you also mentioned that, what about complexity? This is an enterprise-grade platform. What about escape hatches? What about flexibility? And that's where our platform also shines because we have the entire power of a Turing-complete language, which is JavaScript and TypeScript, to customize every aspect of the platform.And you have a framework that actually answered a lot of the problems that came with serverless solutions in the past, which is that you couldn't run any of that on your local machine. The beauty of Vercel and Next.js is we kind of pioneered this concept that we called ‘Framework Defined Infrastructure.' You start with the framework, the framework has this awesome property that you can install on your computer, it has a dev command—like, it literally runs on your computer—but then when you push it to the cloud, it now defines the infrastructure. It creates all of the resources that are highly optimal.This creates—basically converts what was a single node system on your computer to a globally distributed system, which is a very complex and difficult engineering challenge, but Vercel completely automates it away. 
And now for folks that are looking for, like, more advanced solutions, they can now start poking into the outputs of that compilation process and say, “Okay, I can now have an influence or I can reconfigure some aspects of this pipeline.” And of course, if you don't think about those escape hatches, then the product just ends up being limiting and frustrating, so we had to think really hard about meeting both ends of the spectrum.Corey: In my own experimentations early on with Chat-Gippity—which is how I insist on pronouncing ChatGPT—a lot of what I found was that it was a lot more productive for me to, instead of asking just the question and getting the answer was, write a Python script to—Guillermo: Yes.Corey: Query this API to get me that answer. Because often it would be wrong. Sometimes very convincingly wrong, and I can at least examine it in various ways and make changes to it and iterate forward, whereas when everything is just a black box, that gets very hard to do. The idea of building something that can be iterated on is key.Guillermo: I love that. The way that Vercel actually first introduced itself to the world was this idea of immutable deployments and immutable infrastructure. And immutable sounds like a horrible word because I want to mutate things, but it was inspired by this idea of functional programming where, like, each iteration to the state, each data change, can be tracked. So, every time you deploy in Vercel, you get this unique URL that represents a point-in-time infrastructure deployment. You can go back in time, you can revert, you can use this as a way of collaborating with other engineers in your team, so you can send these hyperlinks around to your front-end projects.And it gives you a lot of confidence. Now, you can iterate knowing that before things go out, there's a lot of scrutiny, there's a lot of QA, there's a lot of testing processes that you can kick off against this serverless infrastructure that was created for each deployment. The conclusion for us so far has been that our role in the world is to increase iteration velocity. So, iteration speed is the faster horse of the cloud, right? Like, instead of getting a car, you get a faster horse.When you say, “Okay, I made the build pipeline 10% faster,” or, “I brought the TLS termination 10% closer to the visitor, and, like, I have more [pops 00:17:10],” things like that. That to me, is the speed. You can do those things, and they're awesome, but if you don't have a direction—which is velocity—then you don't know what you're building next. You don't know if your customers are happy. You don't know if you're delivering value. So, we built an entire platform that optimizes around, what should you ship next? What is the friction involved in getting your next iteration out? Is launching an experiment on your homepage, for example, is that a costly endeavor? Does it take you weeks? Does it take you months?One of the initial inspirations for just starting Vercel and making deployments really easy was, how difficult is it for the average company to change in their footer of their website is this copyright 2022? And you have—it's a new year. You have to bump it to copyright 2023. How long do you think it takes that engineer to, A, run the stack locally, so they can actually see the change; deploy it, but deploy to what we call the preview environment, so they can grab that URL and send [it to 00:18:15], Corey, and say, “Corey, does it look good? 
I updated [laugh] I updated the year in the footer.”

And then you tell me, “Looks good, let's ship it to production.” Or you tell me, “No, no, no, it's risky. Let's divide it into two cohorts: 50% of traffic gets 2022, 50% of traffic gets 2023.” Obviously, this is a joke, but consider the implications of how difficult it is in the average organization to actually do this thing.

[midroll 00:18:41]

Corey: Oh, I find things like that all the time, especially on microservices that I built to handle some of my internal workflows here and haven't touched in two or three years. And okay, now it's time for me to update them to reflect some minor change. And first, I wind up in the screaming Node warnings and I have to update things so that they actually work in a reasonable way. And, on some level, making a one-line change can take half a day. Now, in the real world, when people are working on these apps day-in and day-out, it gets a lot easier to roll those changes in over time, but coming back to something unmaintained, that becomes a project the longer you let it sit.

Part of me wishes that there were easier ways around it, but there are trade-offs in almost any decision you make. If you're building something from the beginning of, well, I want to be able to automatically update the copyright year, you can even borderline make that something that automatically happens based upon the global time, whereas when you're trying to retrofit it afterwards, yeah, it becomes a project.

Guillermo: Yeah, and now think: that's just a simple example of changing a string. That might be difficult for product engineering in any organization. Or it may be slow, or it may be not as streamlined, or maybe it works really well for the first project that that company created. What about every incremental project thereafter?

So, now—let's stop talking about a string, right? Let's think about an e-commerce website where, from what we hear from our customers, on average, like, 10% of revenue flows through the homepage. Now, I have to change a primary component that renders on the hero of the page, and I have to collaborate with every department in the organization. I have to collaborate with the design team, I have to collaborate with marketing, I have to collaborate with the business owners to track the analytics appropriately. So, what is the cost of every incremental experiment that you want to put in production?

The other thing that's particularly interesting about front-end as it relates to cloud infrastructure is, scaling up front-end is a very difficult thing. What ends up happening is most front-ends are actually static websites. They're cached at the edge—or they're literally statically generated—and then they push all of the dynamism to the client side. So, you end up with this spaghetti of script tags on the client, you end up accumulating a lot of tech debt in [shipping 00:20:56] huge bundles of JavaScript to the client to try to recover some dynamism, to try and run these experiments. So, everyone is in this kind of mess of, yes, maybe we can experiment, but we kind of offloaded the rendering work to the client. That in turn means, basically, I'm making the website slower for the visitor. I'm making them do the rendering work.

And I'm trying to sell them something. I'm trying to speed up some processes. It's my responsibility to make it fast.
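As an aside on the copyright-year example above: in a server-rendered React component this is a one-liner, which is a sketch of the kind of change Corey suggests baking in from the start (the component and company name are illustrative, not from the episode).

```typescript
// components/Footer.tsx — illustrative only; not a component discussed in the episode.
// Rendered on the server, the year is computed at render time rather than hardcoded.

export default function Footer() {
  const year = new Date().getFullYear();

  return (
    <footer>
      <p>&copy; {year} Example, Inc. All rights reserved.</p>
    </footer>
  );
}
```

One caveat worth noting: on a fully static build this is evaluated at build time, so some periodic or incremental regeneration is still needed for the year to actually roll over.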
So, what we ended up finding out is that yes, the cloud moved this forward a lot in terms of having these awesome building blocks, these awesome infrastructure primitives, but both in the developer experience, just changing something about your web product and also the end-user experiences, that web product renders really fast, those things really didn't happen with this first chapter of the cloud. And I think we're entering a new generation of higher-level clouds like Vercel that are optimizing for these things.Corey: I think that there's a historical focus on things that have not happened before. And that was painful and terrible, so we're not going to be focusing on what's happening in the future, we're going to build a process or a framework or something that winds up preventing that thing that hurt us from hurting us again. Now, that's great in moderation, but at some point—we see this at large companies from time-to-time—where you have so much process that is ossified scar tissue, basically, that it becomes almost impossible to get things done. Because oh, I want to make that—for example, that one-line change to a copyright date, well, here's the 5000 ways deploys have screwed us before, so we need to have three humans sign off on the one-line change, and a bunch of other stuff before it ever sees the light of day. Now, I'm exaggerating slightly, just as you are, but that feels like it acts as a serious brake on innovation.On the exact opposite side, where we see massive acceleration has been around the world of generative AI. Yes, it is massively hyped in a bunch of ways. I don't think it is going to be a defined way that changes the nature of humanity the way that some of these people are going after, but it's also clearly more than a parlor trick.Guillermo: I'm kind of in that camp. So, like you, I've been writing code for many years. I'm pretty astonished by the AI's ability to enhance my output. And of course, now I'm not writing code full time, so there is a sense of, okay because I don't have time, because I'm doing a million things, any minute I have seems like AI has just made it so much more worthwhile and I can squeeze so much more productivity out of it. But one of the areas that I'm really excited about is this idea of generative UI, which is not just autocompleting code in a text editor, but is the idea that you can use natural language to describe an interface and have the AI generate that interface.So, Vercel created this product called v0—you can check it out at v0.dev—where to me, it's really astonishing that you can get these incredibly high quality user interfaces, and basically all you have to do is input [laugh] a few English words. I have this personal experience of, I've been learning JavaScript and perfecting all my knowledge around it for, like, 20 or so years. I created Next.js.And Next.js itself powers a lot of these AI products. Like the front-end of ChatGPT is built on Next.js. And I used v0 to create… to basically recreate my blog. Like, I created rauchg.com, I deployed it on Vercel, but every pixel of that UI, I handcrafted.And as we were working on v0, I said okay, “I'm going to challenge myself to put myself back in the shoes of, like, I'm going to redesign this and I'm going to start over with just human language.” Not only did I arrive to the right look and feel of what I wanted to get, the code that it produced was better than I would have written by hand. Concretely, it was more accessible. 
So, there were areas of the UI where, like, some icons were rendered where I had not filled in those gaps. I just didn't know how to do that. The AI did. So, I really believe that AI will transform our lives as [laugh] programmers—and, I think, in many other areas—in very profound ways.

Corey: This is very similar to a project that I've embarked on for the last few days, where I described the app I wanted to Chat-Gippity and followed the instructions, and first, it wound up sending me down a rabbit hole of the wrong framework version that had been deprecated, and whatnot, and then I brought it all into VS Code, where Jif-Ub Copilot kept switching back and forth between actively helpful and, “Ooh, the response matches publicly available code, so I'm not going to tell you the answer,” despite the fact that feature has never been enabled on my account. So yeah, of course it matches publicly available code. This is quite literally the React tutorial starter project. And it became incredibly frustrating, but it also would keep generating things in bursts, so my code is not at all legible or well organized or consistent, for that matter. But it's still better than anything I'd be able to write myself. I'm looking forward to using v0 or something like it to see how that stacks up for some of my ridiculous generation ideas for these things.

Guillermo: Yeah, you touched on a very important point: the code has to work. The code has to be shippable. I think a lot of AI products have gotten by by giving you an approximation of the result, right? Like, they hallucinate sometimes, they get something wrong. It's still very helpful because sometimes it's sending you in the right direction.

But for us, the bar is that these things have to produce code that's useful, and that you can ship, and that you can iterate on. So, going back to that idea of iteration velocity, we call it v0 because we wanted it to be the first version. We still very much believe there are humans in the loop, and folks will be iterating a lot on the initial draft that this thing is giving you, but it's so much better than starting with an empty code editor, [laugh] right? And this applies, by the way, not just to new projects. I always talk about how our customers have a few really important landing pages, key pages—maybe it's the product detail page in e-commerce, maybe it's your homepage and your key product pages for a marketing website, maybe it's the checkout, for example—extremely important.

But then there's a lot of incremental UIs that you have to add every single day. The banner for [laugh] accepting cookies or not, the consent management dialog. There's a lot of things where the worst case scenario is that you offload them again to some third-party script, to some iframe of sorts, because you really don't have the bandwidth, time, or resources to build it yourself. And then you sacrifice speed, you sacrifice brand fidelity. And because we're the front-end cloud, we're obsessed with your ability to ship UI that's native to your product, that is streamlined, that works really well. So, I think AI is going to have a significant effect there, where a lot of things where you were sending someone to some other website because you just didn't have the bandwidth to create that UI, you can now own the experience end to end.

Corey: That is no small thing.
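For a sense of what one of those “incremental UIs” looks like when it stays in-house rather than being offloaded to a third-party script, here is a minimal sketch of a consent banner as a React client component. The names and behavior are purely illustrative; this is not code from Vercel or v0, and real consent tooling has legal requirements this ignores.

```typescript
"use client";
// components/ConsentBanner.tsx — an illustrative sketch, not production consent tooling.

import { useEffect, useState } from "react";

export default function ConsentBanner() {
  const [visible, setVisible] = useState(false);

  // Only show the banner if the visitor hasn't answered before.
  useEffect(() => {
    if (!localStorage.getItem("cookie-consent")) {
      setVisible(true);
    }
  }, []);

  const answer = (value: "accepted" | "declined") => {
    localStorage.setItem("cookie-consent", value);
    setVisible(false);
  };

  if (!visible) return null;

  return (
    <div role="dialog" aria-label="Cookie consent">
      <p>We use cookies to improve the experience. Is that okay?</p>
      <button onClick={() => answer("accepted")}>Accept</button>
      <button onClick={() => answer("declined")}>Decline</button>
    </div>
  );
}
```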
One last question I have, before we wind up calling this episode: there was a period of time—I don't know if we're still in it or not—where it felt like every time I got up to get a cup of coffee and came back, there would be three JavaScript frameworks that launched during that interim. So, Next.js was one of those that launched when someone got up to get a cup of coffee. But that's shown a staying power that is, frankly, remarkable. Why? I don't know enough about the ecosystem to have an opinion on that, but I notice when things stand out, and Next does.

Guillermo: Yeah, I think it's a number of factors. Number one, we as an industry, I think, coalesced, and we found the right engine to build our car. And that engine became React. Most folks building UI today are choosing React or a similar engine, but React has really become the gold standard for a lot of engineers. Now, what ended up happening next is that people realized, I want a car. I want the full product. I need to drive. I don't want to assemble this freaking car every single time I have a new project.

And Next.js filled a very important gap in the world, where what you were looking for was not a library; what you were looking for was a framework that has opinions, but those opinions are very in line with how the web is supposed to work. We took a page from, basically, the beginnings of the web. We make a lot of jokes that in many ways, our inspiration was PHP, where server rendering is the default, where it's very expressive, it's very easy to reach for data. It just works for a lot of people. Again, that's the old [stack 00:30:03] in the olden days.

And so, it obviously didn't quite work, but the inspiration was: can we make something that is streamlined for creating web interfaces at scale? At scale. And to your point, there's also a sense of, like, maybe it doesn't make sense anymore to build all this infrastructure from scratch every single time I start a project. So, Next filled in that gap. The other thing we did really well, I think, is that we gave people a universal model for how to use not just the server side, but also the client side strategically.

So, I'll give you an example. When you go to ChatGPT, a lot of things on the screen are server rendered, but when you start doing interactions as a user—something like saying, “Hey DALL-E, generate an image”—that stuff requires a lot of optimistic UI. It requires features that are more like what a mobile native application can do. So, we can give folks the best of both worlds: the speed, interactivity, and fluidity of a native app, but with those, sort of, fundamentals of how a website should work that even Perl and PHP had gotten right, once upon a time. So, I think we found that right blend of utility and flexibility, and folks love it, and I think, yeah, we're excited to continue to help steward this project as a standard for building on the web.

Corey: I really want to thank you for taking the time to

Screaming in the Cloud
How Tailscale Builds for Users of All Tiers with Maya Kaczorowski

Screaming in the Cloud

Play Episode Listen Later Dec 19, 2023 33:45


Maya Kaczorowski, Chief Product Officer at Tailscale, joins Corey on Screaming in the Cloud to discuss what sets the Tailscale product approach apart, for users of their free tier all the way to enterprise. Maya shares insight on how she evaluates feature requests, and how Tailscale's unique architecture sets them apart from competitors. Maya and Corey discuss the importance of transparency when building trust in security, as well as Tailscale's approach to new feature roll-outs and change management.

About Maya
Maya is the Chief Product Officer at Tailscale, providing secure networking for the long tail. She was most recently at GitHub in software supply chain security, and previously at Google working on container security, encryption at rest, and encryption key management. Prior to Google, she was an Engagement Manager at McKinsey & Company, working in IT security for large enterprises.

Maya completed her Master's in mathematics, focusing on cryptography and game theory. She is bilingual in English and French.

Outside of work, Maya is passionate about ice cream, puzzling, running, and reading nonfiction.

Links Referenced:
Tailscale: https://tailscale.com/
Tailscale features:
  VS Code extension: https://marketplace.visualstudio.com/items?itemName=tailscale.vscode-tailscale
  Tailscale SSH: https://tailscale.com/kb/1193/tailscale-ssh
  Tailnet lock: https://tailscale.com/kb/1226/tailnet-lock
  Auto updates: https://tailscale.com/kb/1067/update#auto-updates
  ACL tests: https://tailscale.com/kb/1018/acls#tests
  Kubernetes operator: https://tailscale.com/kb/1236/kubernetes-operator
  Log streaming: https://tailscale.com/kb/1255/log-streaming
Tailscale Security Bulletins: https://tailscale.com/security-bulletins
Blog post “How Our Free Plan Stays Free”: https://tailscale.com/blog/free-plan
Tailscale on AWS Marketplace: https://aws.amazon.com/marketplace/pp/prodview-nd5zazsgvu6e6

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and I am joined today on this promoted guest episode by my friends over at Tailscale. They have long been one of my favorite products, just because it has dramatically changed the way that I interact with computers, which really should be enough to terrify anyone. My guest today is Maya Kaczorowski, Chief Product Officer at Tailscale. Maya, thanks for joining me.

Maya: Thank you so much for having me.

Corey: I have to say, originally I was a little surprised—“Really? You're the CPO? I really thought I would have remembered that from the last time we hung out in person.” So, congratulations on the promotion.

Maya: Thank you so much. Yeah, it's exciting.

Corey: Being a product person is probably a great place to start with this because we've had a number of conversations, here and otherwise, around what Tailscale is and why it's awesome. I don't necessarily know that beating the drum of why it's so awesome is going to be covering new ground, but I'm sure we're going to come back to that during the conversation.
Instead, I'd like to start by talking to you about just what a product person does in the context of building something that is incredibly central not just to critical path, but also has massive security ramifications as well, when positioning something that you're building for the enterprise. It's a very hard confluence of problems, and there are days I am astonished that enterprises can get things done based purely upon so much of the mitigation of what has to happen. Tell me about that. How do you even function given the tremendous vulnerability of the attack surface you're protecting?Maya: Yeah, I don't know if you—I feel like you're talking about the product, but also the sales cycle of talking [laugh] and working with enterprise customers.Corey: The product, the sales cycle, the marketing aspects of it, and—Maya: All of it.Corey: —it all ties together. It's different facets of frankly, the same problem.Maya: Yeah. I think that ultimately, this is about really understanding who the customer that is buying the product is. And I really mean that, like, buying the product, right? Because, like, look at something like Tailscale. We're typically used by engineers, or infrastructure teams in an organization, but the buyer might be the VP of Engineering, but it might be the CISO, or the CTO, or whatever, and they're going to have a set of requirements that's going to be very different from what the end-user has as a set of requirements, so even if you have something like bottom-up adoption, in our case, like, understanding and making sure we're checking all the boxes that somebody needs to actually bring us to work.Enterprises are incredibly demanding, and to your point, have long checklists of what they need as part of an RFP or that kind of thing. I find that some of the strictest requirements tend to be in security. So like, how—to your point—if we're such a critical part of your network, how are you sure that we're always available, or how are you sure that if we're compromised, you're not compromised, and providing a lot of, like, assurances and controls around making sure that that's not the case.Corey: I think that there's a challenge in that what enterprise means to different people can be wildly divergent. I originally came from the school of obnoxious engineering where oh, as an engineer, whenever I say something is enterprise grade, that's not a compliment. That means it's going to be slow and moribund. But that is a natural consequence of a company's growth after achieving success, where okay, now we have actual obligations to customers and risk mitigation that needs to be addressed. And how do you wind up doing that without completely hobbling yourself when it comes to accelerating feature velocity? It's a very delicate balancing act.Maya: Yeah, for sure. And I think you need to balance, to your point, kind of creating demand for the product—like, it's actually solving the problem that the customer has—versus checking boxes. Like, I think about them as features, or you know, feature requests versus feature blockers or deal blockers or adoption blockers. So, somebody wants to, say, connect to an AWS VPC, but then the person who has to make sure that that's actually rolled out properly also wants audit logs and SSH session recording and RBAC-based controls and lots of other things before they're comfortable deploying that in their environment. 
And I'm not even talking about the list of, you know, legal, kind of, TOS requirements that they would have for that kind of situation.I think there's a couple of things that you need to do to even signal that you're in that space. One of the things that I was—I was talking to a friend of mine the other day how it feels like five years ago, like, nobody had SOC 2 reports, or very few startups had SOC 2 reports. And it's probably because of the advent of some of these other companies in this space, but like, now you can kind of throw a dart, and you'll hit five startups that have SOC 2 reports, and the amount that you need to show that you're ready to sell to these companies has changed.Corey: I think that there's a definite broadening of the use case. And I've been trying to avoid it, but let's go diving right into it. I used to view Tailscale as, oh it's a VPN. The end. Then it became something more where it effectively became the mesh overlay where all of the various things that I have that speak Tailscale—which is frankly, a disturbing number of things that I'd previously considered to be appliances—all talk to one another over a dedicated network, and as a result, can do really neat things where I don't have to spend hours on end configuring weird firewall rules.It's more secure, it's a lot simpler, and it seems like every time I get that understanding down, you folks do something that causes me to yet again reevaluate where you stand. Most recently, I was doing something horrifying in front-end work, and in VS Code the Tailscale extension popped up. “Oh, it looks like you're running a local development server. Would you like to use Tailscale Funnel to make it available to the internet?” And my response to that is, “Good lord, no, I'm ashamed of it, but thanks for asking.” Every time I think I get it, I have to reevaluate where it stands in the ecosystem. What is Tailscale now? I feel like I should get the official description of what you are.Maya: Well, I sure hope I'm not the official description. I think the closest is a little bit of what you're saying: a mesh overlay network for your infrastructure, or a programmable network that lets you mesh together your users and services and services and services, no matter where they are, including across different infrastructure providers and, to your point, on a long list of devices you might have running. People are running Tailscale on self-driving cars, on robots, on satellites, on elevators, but they're also running Tailscale on Linux running in AWS or a MacBook they have sitting under their desk or whatever it happens to be. The phrase that I like to use for that is, like, infrastructure agnostic. We're just a building block.Your infrastructure can be whatever infrastructure you want. You can have the cheapest GPUs from this cloud, or you can use the Android phone to train the model that you have sitting on your desk. We just help you connect all that stuff together so you can build your own cloud whatever way you want. To your point, that's not really a VPN [laugh]. The word VPN doesn't quite do it justice. For the remote access to prod use case, so like a user, specifically, like, a developer infra team to a production network, that probably looks the most like a zero-trust solution, but we kind of blur a lot of the lines there for what we can do.Corey: Yeah, just looking at it, at the moment, I have a bunch of Raspberries Pi, perhaps, hanging out on my tailnet. 
I have currently 14 machines on there, I have my NAS downstairs, I have a couple of EC2 instances, a Google Cloud instance, somewhere, I finally shut down my old Oracle Cloud instance, my pfSense box speaks it natively. I have a Thinkst Canary hanging out on there to detect if anything starts going ridiculously weird, my phone, my iPad, and a few other things here and there. And they all just talk seamlessly over the same network. I can identify them via either IP address, if I'm old, or via DNS if I want to introduce problems that will surprise me at one point or another down the road.I mean, I even have an exit node I share with my brother's Tailscale account for reasons that most people would not expect, namely that he is an American who lives abroad. So, many weird services like banks or whatnot, “Oh, you can't log in to check your bank unless you're coming from US IP space.” He clicks a button, boom, now he doesn't get yelled at to check his own accounts. Which is probably not the primary use case you'd slap on your website, but it's one of those solving everyday things in somewhat weird ways.Maya: Oh, yeah. I worked at a bank maybe ten years ago, and they would block—this little bank on the east coast of the US—they would block connections from Hawaii because why would any of your customers ever be in Hawaii? And it was like, people travel and maybe you're—Corey: How can you be in Hawaii? You don't have a passport.Maya: [laugh]. People travel. They still need to do banking. Like, it doesn't change, yeah. The internet, we've built a lot of weird controls that are IP-based, that don't really make any sense, that aren't reflective. And like, that's true for individuals—like you're describing, people who travel and need to bank or whatever they need to do when they travel—and for corporations, right? Like the old concept—this is all back to the zero trust stuff—but like, the old concept that you were trusted just because you had an IP address that was in the corp IP range is just not true anymore, right? Somebody can walk into your office and connect to the Wi-Fi and a legitimate employee can be doing their job from home or from Starbucks, right? Those are acceptable ways to work nowadays.Corey: One other thing that I wanted to talk about is, I know that in previous discussions with you folks—sometimes on the podcast sometimes when I more or less corner someone a Tailscale at your developer conference—one of the things that you folks talk about is Tailscale SSH, which is effectively a drop-in replacement for the SSH binary on systems. Full disclosure, I don't use it, mostly because I'm grumpy and I'm old. I also like having some form of separation of duties where you're the network that ties it all together, but something else winds up acting as that authentication step. That said, if I were that interesting that someone wanted to come after me, there are easier ways to get in, so I'm mostly just doing this because I'm persnickety. Are you seeing significant adoption of Tailscale SSH?Maya: I think there's a couple of features that are missing in Tailscale SSH for it to be as adopted by people like you. The main one that I would say is—so right now if you use Tailscale SSH, it runs a binary on the host, you can use your Tailscale credentials, and your Tailscale private key, effectively, to SSH something else. So, you don't have to manage a separate set of SSH keys or certs or whatever it is you want to do to manage that in your network. 
Your identity provider identity is tied to Tailscale, and then when you connect to that device, we still need to have an identity on the host itself, like in Unix. Right now, that's not tied to Tailscale. You can adopt an identity of something else that's already on the host, but it's not, like, corey@machine.And I think that's the number one request that we're getting for Tailscale SSH, to be able to actually generate or tie to the individual users on the host for an identity that comes from, like, Google, or GitHub, or Okta, or something like that. I'm not hearing a lot of feedback on the security concerns that you're expressing. I think part of that is that we've done a lot of work around security in general so that you feel like if Tailscale were to be compromised, your network wouldn't need to be compromised. So, Tailscale itself is end-to-end encrypted using WireGuard. We only see your public keys; the private keys remain on the device.So, in some sense the, like, quote-unquote, “Worst” that we could do would be to add a node to your network and then start to generate traffic from that or, like, mess with the configuration of your network. These are questions that have come up. In terms of adding nodes to your network, we have a feature called tailnet lock that effectively lets you sign and verify that all the nodes on your network are supposed to be there. One of the other concerns that I've heard come up is, like, what if the binary was compromised. We develop in open-source so you can see that that's the case, but like, you know, there's certainly more stuff we could be doing there to prevent, for example, like a software supply chain security attack. Yeah.Corey: Yeah, but you also have taken significant architectural steps to ensure that you are not placed in a position of undue trust around a lot of these things. Most recently, you raised a Series B, that was $100 million, and the fact that you have not gone bankrupt in the year since that happened tells me that you are very clearly not routing all customer traffic through you folks, at least on one of the major cloud providers. And in fact, a little bit of playing a-slap-and-tickle with Wireshark affirm this, that the nodes talk to each other; they do not route their traffic through you folks, by design. So one, great for the budget, I have respect for that data transfer pattern, but also it means that you are in the position of being a global observer in a way that can be, in many cases, exploited.Maya: I think that's absolutely correct. So, it was 18 months ago or so that we raised our Series B. When you use Tailscale, your traffic connects peer-to-peer directly between nodes on your network. And that has a couple of nice properties, some of what you just described, which is that we don't see your traffic. I mean, one, because it's end-to-end encrypted, but even if we could capture it, and then—we're not in the way of capturing it, let alone decrypting it.Another nice property it has is just, like, latency, right? If your user is in the UK, and they're trying to access something in Scotland, it's not, you know, hair-pinning, bouncing all the way to the West Coast or something like that. It doesn't have to go through one of our servers to get there. Another nice property that comes with that is availability. 
So, if our network goes down, if our control plane goes down, you're temporarily not able to add nodes or change your configuration, but everything in your network can still connect to each other, so you're not dependent on us being online in order for your network to work.And this is actually coming up more and more in customer conversations where that's a differentiator for us versus a competitor. Different competitors, also. There's a customer case study on our website about somebody who was POC'ing us with a different option, and literally during the POC, the competitor had an outage, unfortunately for them, and we didn't, and they sort of looked at our model, our deployment model and went, “Huh, this really matters to us.” And not having an outage on our network with this solution seems like a better option.Corey: Yeah, when the network is down, the computers all turn into basically space heaters.Maya: [laugh]. Yeah, as long as they're not down because, I guess, unplugged or something. But yeah, [laugh] I completely agree. Yeah. But I think there's a couple of these kinds of, like, enterprise things that people are—we're starting to do a better job of explaining and meeting customers where they are, but it's also people are realizing actually does matter when you're deploying something at this scale that's such a key part of your network.So, we talked a bit about availability, we talked a bit about things like latency. On the security side, there's a lot that we've done around, like I said, tailnet lock or that type of thing, but it's like some of the basic security features. Like, when I joined Tailscale, probably the first thing I shipped in some sense as a PM was a change log. Here's the change log of everything that we're shipping as part of these releases so that you can have confidence that we're telling you what's going on in your network, when new features are coming out, and you can trust us to be part of your network, to be part of your infrastructure.Corey: I do want to further call out that you have a—how should I frame this—a typically active security notification page.Maya: [laugh].Corey: And I think it is easy to misconstrue that as look at how terrifyingly insecure this is? Having read through it, I would argue that it is not that you are surprisingly insecure, but rather that you are extraordinarily transparent about things that are relatively minor issues. And yes, they should get fixed, but, “Oh, that could be a problem if six other things happen to fall into place just the right way.” These are not security issues of the type, “Yeah, so it turns out that what we thought was encrypting actually wasn't and we're just expensive telnet.” No, there's none of that going on.It's all been relatively esoteric stuff, but you also address it very quickly. And that is odd, as someone who has watched too many enterprise-facing companies respond to third-party vulnerability reports with rather than fixing the problem, more or less trying to get them not to talk about it, or if they do, to talk about it only using approved language. I don't see any signs of that with what you've done there. Was that a challenging internal struggle for you to pull off?Maya: I think internally, it was recognizing that security was such an important part of our value proposition that we had to be transparent. But once we kind of got past that initial hump, we've been extremely transparent, as you say. 
We think we can build trust through transparency, and that's the most important thing in how we respond to security incidents. But code is going to have bugs. It's going to have security bugs. There's nothing you can do to prevent that from happening.What matters is how you—and like, you should. Like, you should try to catch them early in the development process and, you know, shift left and all that kind of stuff, but some things are always going to happen [laugh] and what matters in that case is how you respond to them. And having another, you know, an app update that just says “Bug fixes” doesn't help you figure out whether or not you should actually update, it doesn't actually help you trust us. And so, being as public and as transparent as possible about what's actually happening, and when we respond to security issues and how we respond to security issues is really, really important to us. We have a policy that talks about when we will publish a bulletin.You can subscribe to our bulletins. We'll proactively email anyone who has a security contact on file, or alternatively, another contact that we have if you haven't provided us a security contact when you're subject to an issue. I think by far and large, like, Tailscale has more security bulletins just because we're transparent about them. It's like, we probably have as many bugs as anybody else does. We're just lucky that people report them to us because they see us react to them so quickly, and then we're able to fix them, right? It's a net positive for everyone involved.Corey: It's one of those hard problems to solve for across the board, just because I've seen companies in the past get more or less brutalized by the tech press when they have been overly transparent. I remember that there was a Reuters article years ago about Slack, for example, because they would pull up their status history and say, “Oh, look at all of these issues here. You folks can't keep your website up.” But no, a lot of it was like, “Oh, file uploads for a small subset of our users is causing a problem,” and so on and so forth. These relatively minor issues that, in aggregate, are very hard to represent when you're using traffic light signaling.So, then you see people effectively going full-on AWS status page where there's a significant outage lasting over a day, last month, and what you see on this is if you go really looking for it is this yellow thing buried in his absolute sea of green lights, even though that was one of the more disruptive things to have happened this year. So, it's a consistent and constant balance, and I really have a lot of empathy no matter where you wind up landing on that?Maya: Yeah, I think that's—you're saying it's sort of about transparency or being able to find the right information. I completely agree. And it's also about building trust, right? If we set expectations as to how we will respond to these things then we consistently respond to them, people believe that we're going to keep doing that. And that is almost more important than, like, committing to doing that, if that makes any sense.I remember having a conversation many years ago with an eng manager I worked with, and we were debating what the SLO for a particular service should be. And he sort of made an interesting point. He's like, “It doesn't really matter what the SLO is. 
It matters what you actually do because then people are going to start expecting [laugh] what you actually do.” So, being able to point at this and say, “Yes, here's what we say and here's what we actually do in practice,” I think builds so much more trust in how we respond to these kinds of things and how seriously we take security.I think one of the other things that came out of the security work is we realized—and I think you talked to Avery, the CEO of Tailscale on a prior podcast about some of this stuff—but we realized that platforms are broken, and we don't have a great way of pushing automatic updates on a lot of platforms, right? You know, if you're using the macOS store, or the Android Play Store, or iOS or whatever, you can automatically update your client when there is a security issue. On other platforms, you're kind of stuck. And so, as a result of us wanting to make sure that the fleet is as updated as possible, we've actually built an auto-update feature that's available on all of our major clients now, so people can opt in to getting those updates as quickly as needed when there is a security issue. We want to expose people to as little risk as possible.Corey: I am not a Tailscale customer. And that bugs me because until I cross that chasm into transferring $1 every month from my bank account to yours, I'm just a whiny freeloader in many respects, which is not at all how you folks who never made me feel I want to be very clear on that. But I believe in paying for the services that empower me to do my job more effectively, and Tailscale absolutely qualifies.Maya: Yeah, understood, I think that you still provide value to us in ways that aren't your data, but then in ways that help our business. One of them is that people like you tend to bring Tailscale to work. They tend to have a good experience at home connecting to their Synology, helping their brother connect to his bank account, whatever it happens to be, and they go, “Oh.” Something kind of clicks, and then they see a problem at work that looks very similar, and then they bring it to work. That is our primary path of adoption.We are a bottom-up adoption, you know, product-led growth product [laugh]. So, we have a blog post called “How Our Free Plan Stays Free” that covers some of that. I think the second thing that I don't want to undersell that a user like you also does is, you have a problem, you hit an issue, and you write into support, and you find something that nobody else has found yet [laugh].Corey: I am very good at doing that entirely by accident.Maya: [laugh]. But that helps us because that means that we see a problem that needs to get fixed, and we can catch it way sooner than before it's deployed, you know, at scale, at a large bank, and you know, it's a critical, kind of, somebody's getting paged kind of issue, right? We have a couple of bugs like that where we need, you know, we need a couple of repros from a couple different people in a couple different situations before we can really figure out what's going on. And having a wide user base who is happy to talk to us really helps us.Corey: I would say it goes beyond that, too. I have—I see things in the world of Tailscale that started off as features that I requested. One of the more recent ones is, it is annoying to me to see on the Tailscale machines list everything I have joined to the tailnet with that silly little up arrow next to it of, “Oh, time to go back and update Tailscale to the latest,” because that usually comes with decent benefits. 
Great, I have to go through iteratively, or use Ansible, or something like that. Well, now there's a Tailscale update option where it will keep itself current on supported operating systems.For some unknown reason, you apparently can't self-update the application on iOS or macOS. Can't imagine why. But those things tend to self-update based upon how the OS works due to all the sandboxing challenges. The only challenge I've got now is a few things that are, more or less, embedded devices that are packaged by the maintainer of that embedded system, where I'm beholden to them. Only until I get annoyed enough to start building a CI/CD system to replace their package.Maya: I can't wait till you build that CI/CD system. That'll be fun.Corey: “We wrote this code last night. Straight to the bank with it.” Yeah, that sounds awesome.Maya: [laugh] You'd get a couple of term sheets for that, I'm sure.Corey: There are. I am curious, looping back to the start of our conversation, we talked about enterprise security requirements, but how do you address enterprise change management? I find that that's something an awful lot of companies get dreadfully wrong. Most recently and most noisily on my part is Slack, a service for which I paid thousands of dollars a year, decided to roll out a UI redesign that, more or less, got in the way of a tremendous number of customers and there was no way to stop it or revert it. And that made me a lot less likely to build critical-flow business processes that depended upon Slack behaving a certain way.Just, “Oh, we decided to change everything in the user interface today just for funsies.” If Microsoft pulled that with Excel, by lunchtime they'd have reverted it because an entire universe of business users would have marched on Redmond to burn them out otherwise. That carries significant cost for businesses. Yet I still see Tailscale shipping features just as fast as you ever have. How do you square that circle?Maya: Yeah. I think there's two different kinds of change management really, which is, like—because if you think about it, it's like, an enterprise needs a way to roll out a product or a feature internally and then separately, we need a way to roll out new things to customers, right? And so, I think on the Tailscale side, we have a change log that tells you about everything that's changing, including new features, and including changes to the client. We update that religiously. Like, it's a big deal, if something doesn't make it the day that it's supposed to make it. We get very kind of concerned internally about that.A couple of things that were—that are in that space, right, we just talked about auto-updates to make it really easy for you to maintain what's actually rolled out in your infrastructure, but more importantly, for us to push changes with a new client release. Like, for example, in the case of a security incident, we want to be able to publish a version and get it rolled out to the fleet as quickly as possible. Some of the things that we don't have here, but although I hear requests for is the ability to, like, gradually roll out features to a customer. 
So like, “Can we change the configuration for 10% of our network and see if anything breaks before rolling back, right before rolling forward.” That's a very traditional kind of infra change management thing, but not something I've ever seen in, sort of, the networking security space to this degree, and something that I'm hearing a lot of customers ask for.In terms of other, like, internal controls that a customer might have, we have a feature called ACL Tests. So, if you're going to change the configuration of who can access what in your network, you can actually write tests. Like, your permission file is written in HuJSON and you can write a set of things like, Corey should be able to access prod. Corey should not be able to access test, or whatever it happens to be—actually, let's flip those around—and when you have a policy change that doesn't pass those tests, you actually get told right away so you're not rolling that out and accidentally breaking a large part of your network. So, we built several things into the product to do it. In terms of how we notify customers, like I said, that the primary method that we have right now is something like a change log, as well as, like, security bulletins for security updates.Corey: Yeah, it's one of the challenges, on some level, of the problem of oh, I'm going to set up a service, and then I'm going to go sail around the world, and when I come back in a year or two—depending on how long I spent stranded on an island somewhere—now I get to figure out what has changed. And to your credit, you have to affirmatively enable all of the features that you have shipped, but you've gone from, “Oh, it's a mesh network where everything can talk to each other,” to, “I can use an exit node from that thing. Oh, now I can seamlessly transfer files from one node to another with tail drop,” to, “Oh, Tailscale Funnel. Now, I can expose my horrifying developer environment to the internet.” I used that one year to give a talk at a conference, just because why not?Maya: [crosstalk 00:27:35].Corey: Everything evolves to become [unintelligible 00:27:37] email on Microsoft Outlook, or tries to be Microsoft Excel? Oh, no, no. I want you to be building Microsoft PowerPoint for me. And we eventually get there, but that is incredibly powerful functionality, but also terrifying when you think you have a handle on what's going on in a large-scale environment, and suddenly, oh, there's a whole new vector we need to think about. Which is why your—the thought and consideration you put into that is so apparent and so, frankly, welcome.Maya: Yeah, you actually kind of made a statement there that I completely missed, which is correct, which is, we don't turn features on by default. They are opt-in features. We will roll out features by default after they've kind of baked for an incredibly long period of time and with, like, a lot of fanfare and warning. So, the example that I'll give is, we have a DNS feature that was probably available for maybe 18 months before we turned it on by default for new tailnets. So didn't even turn it on for existing folks. It's called Magic DNS.We don't want to touch your configuration or your network. We know people will freak out when that happens. Knowing, to your point, that you can leave something for a year and come back, and it's going to be the same is really important. For everyone, but for an enterprise customer as well. Actually, one other thing to mention there. 
We have a bunch of really old versions of clients that are running in production, and we want them to keep working, so we try to be as backward compatible as possible.I think the… I think we still have clients from 2019 that are running and connecting to corp that nobody's updated. And like, it'd be great if they would update them, but like, who knows what situation they're in and if they can connect to them, and all that kind of stuff, but they still work. And the point is that you can have set it up four years ago, and it should still work, and you should still be able to connect to it, and leave it alone and come back to it in a year from now, and it should still work and [laugh] still connect without anything changing. That's a very hard guarantee to be able to make.Corey: And yet, somehow you've been able to do that, just from the perspective of not—I've never yet seen you folks make a security-oriented decision that I'm looking at and rolling my eyes and amazed that you didn't make the decision the other way. There are a lot of companies that while intending very well have done, frankly, very dumb things. I've been keeping an eye on you folks for a long time, and I would have caught that in public. I just haven't seen anything like that. It's kind of amazing.Last year, I finally took the extraordinary step of disabling SSH access anywhere except the tailnet to a number of my things. It lets my logs fill up a lot less, and you've built to that level of utility-like reliability over the series of longtime experimentation. I have yet to regret having Tailscale in the mix, which is, frankly, not something I can say about almost any product.Maya: Yeah. I'm very proud to hear that. And like, maintaining that trust—back to a lot of the conversation about security and reliability and stuff—is incredibly important to us, and we put a lot of effort into it.Corey: I really appreciate your taking the time to talk to me about how things continue to evolve over there. Anything that's new and exciting that might have gotten missed? Like, what has come out in, I guess, the last six months or so that are relevant to the business and might be useful for people looking to use it themselves?Maya: I was hoping you're going to ask me what came out in the last, you know, 20 minutes while we were talking, and the answer is probably nothing, but you never know. But [laugh]—Corey: With you folks, I wouldn't doubt it. Like, “Oh, yeah, by the way, we had to do a brand treatment redo refresh,” or something on the website? Why not? It now uses telepathy just because.Maya: It could, that'd be pretty cool. No, I mean, lots has gone on in the last six months. I think some of the things that might be more interesting to your listeners, we're now in the AWS Marketplace, so if you want to purchase Tailscale through AWS Marketplace, you can. We have a Kubernetes operator that we've released, which lets you both ingress and egress from a Kubernetes cluster to things that are elsewhere in the world on other infrastructure, and also access the Kubernetes control plane and the API server via Tailscale. I mentioned auto-updates. You mentioned the VS Code extension. That's amazing, the fact that you can kind of connect directly from within VS Code to things on your tailnet. That's a lot of the exciting stuff that we've been doing. And there's boring stuff, you know, like audit log streaming, and that kind of stuff. But it's good.Corey: Yeah, that stuff is super boring until suddenly, it's very, very exciting. 
And those are not generally good days.Maya: [laugh]. Yeah, agreed. It's important, but boring. But important.Corey: [laugh]. Well, thank you so much for taking the time to talk through all the stuff that you folks are up to. If people want to learn more, where's the best place for them to go to get started?Maya: tailscale.com is the best place to go. You can download Tailscale from there, get access to our documentation, all that kind of stuff.Corey: Yeah, I also just want to highlight that you can buy my attention but never my opinion on things and my opinion on Tailscale remains stratospherically high, so thank you for not making me look like a fool, by like, “Yes. And now we're pivoting to something horrifying is a business model and your data.” Thank you for not doing exactly that.Maya: Yeah, we'll keep doing that. No, no, blockchains in our future.Corey: [laugh]. Maya Kaczorowski, Chief Product Officer at Tailscale. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. This episode has been brought to us by our friends at Tailscale. If you enjoyed this episode, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that will never actually make it back to us because someone screwed up a firewall rule somewhere on their legacy connection.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
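To ground the ACL tests Maya described earlier, here is a minimal sketch of what a policy file with a test might look like, based on Tailscale's documented HuJSON policy format (the users, groups, tags, and ports are made up for illustration; check the ACL tests documentation linked above for the exact schema).

```jsonc
// Example Tailscale policy file (HuJSON) — illustrative values only.
{
  "groups": {
    "group:eng": ["corey@example.com"]
  },
  "acls": [
    // Engineers may reach production web servers over HTTPS.
    { "action": "accept", "src": ["group:eng"], "dst": ["tag:prod:443"] }
  ],
  "tests": [
    {
      // If a later policy edit breaks either expectation, the change is rejected
      // before it rolls out, rather than breaking access in production.
      "src": "corey@example.com",
      "accept": ["tag:prod:443"],
      "deny": ["tag:test:22"]
    }
  ]
}
```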

Screaming in the Cloud
Using DevOps to Ignite a Chain Reaction of Productivity and Happiness with Dave Mangot

Screaming in the Cloud

Play Episode Listen Later Dec 14, 2023 34:03


Dave Mangot, CEO and founder of Mangoteque, joins Corey on Screaming in the Cloud to explain how leveraging DevOps improves the lives of engineers and results in stronger businesses. Dave talks about the importance of exclusively working for private equity firms that act ethically, the key difference between venture capital and private equity, and how conveying issues and ideas to your CEO using language he understands leads to faster results. Corey and Dave discuss why successful businesses are built on two things: infrastructure as code and monitoring.

About Dave
Dave Mangot, author of DevOps Patterns for Private Equity, helps portfolio companies get good at delivering software. He is a leading consultant, author, and speaker as the principal at Mangoteque. A DevOps veteran, Dave has successfully led digital, SRE, and DevOps transformations at companies such as Salesforce, SolarWinds, and Cable & Wireless. He has a proven track record of working with companies to quickly mature their existing culture to improve the speed, frequency, and resilience of their software service delivery.

Links Referenced:
Mangoteque: https://www.mangoteque.com
DevOps Patterns for Private Equity: https://www.amazon.com/DevOps-Patterns-Private-Equity-organization/dp/B0CHXVDX1K
“How to Talk Business: A Short Guide for Tech Leaders”: https://itrevolution.com/articles/how-to-talk-business-a-short-guide-for-tech-leaders/

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is someone that I have known for, well, longer than I've been doing this show. Dave Mangot is the founder and CEO at Mangoteque. Dave, thank you for joining me.

Dave: Hey, Corey, it's great to be here. Nice to see you again.

Corey: I have to say, your last name is Mangot and the name of your company is Mangoteque, spelled M-A-N-G-O-T-E-Q-U-E, if I got that correctly, which apparently I did. What an amazing name for a company. How on earth did you name a company so well?

Dave: Yeah, I don't know. I have to think back. A few years ago, I was just getting started in consulting, and I was talking to some friends of mine who were giving me a bunch of advice—because they had been doing consulting for quite some time—about what my rates should be, about all kinds of things, you know, which vendors I should work with for my legal advice. And I said, “I'm having a lot of trouble coming up with a name for the company.” And this guy, Corey Quinn, was like, “Hey, I got a name for you.” [laugh].

Corey: I like that story, just because it really goes to show the fine friends of mine over at all of the large cloud services companies—but mostly AWS—that it's not that hard to name something well. The trick, I think, is just not to do it in committee.

Dave: Yeah. And you know, it was a very small committee, obviously, of, like, three. But yeah, it's been great. I have a lot of compliments on the name of my company. And I was like, oh, “You know that guy, the QuinnyPig dude?” And they're like, “Yeah?” “Oh, yeah, it was—that was his idea.” And I liked it. And it works really well for the things that I do.

Corey: It seems to.
So, talk to you about what it is that you do because back when we first met and many, many years ago, you were an SRE manager at a now defunct observability company. This was so long ago, I don't think that they used the term observability. It was Librato, which, “What do you do?” “We do monitoring,” back when that didn't sound like some old-timey thing. Like, “Oh, yeah. Right, between the blacksmith and the cobbler.” But you've evolved significantly since you were doing the mundane, pedestrian tasks of keeping the service up and running. What do you do these days?Dave: Yeah, that was before the observability wars [laugh] [whatever you like 00:02:55] to call it. But over time, that company was owned by SolarWinds and I wound up being responsible for all the SolarWinds cloud company SRE organizations. So, started—ran a global organization there. And they were owned by a couple of private equity firms. And I got to know one of the firms rather well, and then when I left SolarWinds, I started working with private equity firm portfolio companies, especially software investments. And what I like to say is I teach people how to get good at delivering software.Corey: So, you recently wrote a book, and I know this because I make it a point to get a copy of the book—usually by buying it, but you beat me to it by gifting me one—of every guest I have on the show who's written a book. Sometimes that means I wind up with the eclectic collections of poetry, other times, I wind up with a number of different books around the DevOps and cloud space. And one of these days, I'm going to wind up talking to someone who wound up writing an encyclopedia or something, to where I have to back the truck around. But what I wanted to ask is about your title, of all things. It's called DevOps Patterns for Private Equity. And I have to ask, what makes private equity special?Dave: I think as a cloud economist, what you also just told me, is you owe me $17.99 for the book because it was gifted.Corey: Is that how expensive books are these days? My God, I was under the impression once you put the word ‘DevOps' in the title, that meant you're above 40 bucks, just as, you know, entrance starting fees here.Dave: I think I need to talk to my local cloud economist on how to price things. Yeah, the book is about things that I've basically seen at portfolio companies over the years. The thing about, you know, why private equity, I think it would be one question, just because I've been involved in the DevOps movement since pretty much the start, when John Willis calls me a DevOps OG, which I think is a compliment. But the thing that I like about working with private equity, and more specifically, private equity portfolio companies is, like I wrote in the book, they're serious. And serious means that they're not afraid to make a big investment, they're not afraid to change things quickly, they're not afraid to reorganize, or rethink, or whatever because a lot of these private equity firms have, how they describe it as a three to five year investment thesis. So, in three to five years, they want to have some kind of an exit event, which means that they can't just sit around and talk about things and try it and see what happens—Corey: In the fullness of time, 20 years from now. Yeah, it doesn't work that well. But let's back up a little bit here because something that I have noticed over the years is that, especially when it comes to financial institutions, the general level of knowledge is not terrific. 
For a time, a lot of people were very angry at Goldman Sachs, for example. But okay, fair enough. What does Goldman Sachs do? And the answer was generally incoherent.And again, I am in no way, shape or form, different from people who form angry opinions without having all of the facts. I do that myself three times before breakfast. My last startup was acquired by BlackRock, and I was the one that raised our hand internally, at the 40-person company when that was announced, as everyone was sort of sitting there stunned: “What's a BlackRock?” Because I had no idea. Well, for the next nine months, I assure you, I found out what a BlackRock is. But what is private equity? Because I see a lot of them getting beaten up for destroying companies. Everyone wants to bring up the Toys-R-Us story as a for instance. But I don't get the sense that that is the full picture. Tell me more.Dave: Yes. So, I'm probably not the best spokesperson for private equity. But—Corey: Because you don't work for a private equity firm, you only work with them, that makes you a terrific spokesperson because you're not [in 00:06:53] this position of, “Well, justify what your company does here,” situation, there's something to be said for objectivity.Dave: So, you know, like I wrote in the book, there are approximately 10,000 private equity firms in the United States. They are not all going to be ethical. That is just not a thing. I choose to work with a specific segment of private equity companies, and these private equity companies want to make a good business. That's what they're going for.And you and I, having had worked at many companies in our careers, know that there's a lot of companies out there that aren't a good business. You're like, “Why are we doing this? This doesn't make any sense. This isn't a good investment. This”—there's a lot of things and what I would call the professional level private equity firms, the ones at the top—and not all of them at the top are ethical, don't get me wrong; I have a blacklist here of companies I won't work for. I will not say who those companies are.Corey: I am in the same boat. I think that anyone who works in an industry at all and doesn't have a list of companies that they would not do business with, is, on some level, either haven't thought it through, hasn't been in business long enough, or frankly, as long as you're paying them, everything you can do is a-okay. And you know, I'm not going to sit here and say that those are terrible people, but I never wanted to do that soul-searching. I always thought the only way to really figure out where you stand is to figure it out in advance before there's money on the table. Like, do you want to go do contracting for a defense company? Well no, objectively, I don't, but that's a lot harder to say when they're sitting on the table with $20 million in front of you of, “Do you want to work with a defense company?” Because you can rationalize your way into anything when the stakes are high enough. That's where I've always stood on it. But please, continue.Dave: I'd love to be in that situation to turn down $20 million [laugh].Corey: Yeah, that's a hard situation to find yourself in, right?Dave: But regardless, there's a lot of different kinds of private equity firms. Generally the firms that I work with, they all want—not generally; the ones I work with want to make better companies. 
I have had operating partners at these companies tell me—because this always comes up with private equity—there's no way to cut your way to a good company. So, the private equity firms that I work with invest in these companies. Do they sell off unprofitable things? Of course they do. Do they try to streamline some things sometimes so that the company is only focused on X or Y, and then they tuck other companies into it—that's called a buy and build strategy or a platform strategy—yes. But the purpose of that is to make a better company.The thing that I see a lot of people in our industry—meaning, like, us tech kind of folks—get confused about is what the difference is between venture capital and private equity. And private equity, in general, is the thing that is the kind of financing that follows on after venture capital. So, in venture capital, you are trying to find product-market fit. The venture capitalists are putting all their bets down like they're in Vegas at re:Invent, and trying to figure out which bet is going to pay off, but they have no expectation that all of the bets are going to pay off. With private equity, the companies have product-market fit, they're profitable. If they're not profitable, they have a very clear line to profitability.And so, what these private equity firms are trying to do, no matter what the size of the company is, whether it's a 50-person company or a 5000-person company, they're trying to get these companies up to another level so that they're more profitable and more valuable, so that either a larger fish will gobble them up or they'll go out on the public markets, like onto the stock market, those kinds of things, but they're trying to make a company that's more valuable. And so, not everything looks so good [laugh] when you're looking at it from the outside, not understanding what these people are trying to do. That's not to say they're not complete jerks who are in private equity because there are.Corey: Because some parts are missing. Kidding. Kidding. Kidding.Dave: [laugh].Corey: It's a nuanced area, and it's complicated, just from the perspective of… finance is deceptively complicated. It looks simple, on some level, because on some level, you can always participate in finance. I have $10. I want to buy a thing that costs $7. How does that work? But it gets geometrically more complex the further you go. Financial engineering is very much a thing.And it is not at all obvious how those things interplay with different dynamics. One of the private equity outcomes, as you alluded to a few minutes ago, is the idea that they need to be able to rapidly effect change. It becomes a fast turnaround situation, and then have an exit event of some kind. So, the DevOps patterns that you write about are aligned with an idea of being effective, presumably, rather than, well, here's how you slowly introduce a sweeping cultural mindset shift across the organization. Like, that's great, but some of us don't have that kind of runway for what we're trying to achieve to be able to pull that off. So, I'm assuming that a lot of the patterns you talk about are emphasizing rapid results.Dave: Well, I think the best way to describe this, right, is what we've talked about is they want to make a better company. And for those of us who have worked in the DevOps movement for all these years, what's one great way of making a better company? Adopting DevOps principles, right? 
And so, for me, one of the things I love about my job is I get to go in and make engineers' lives better. No more working on weekends, no more we're only going to do deployments at 11 o'clock at night, no more we're going to batch things up and ship them three or four times a year, which all of us who've done DevOps stuff for years know, like, fastest way to have a catastrophe is batch up as many things as possible and release them all at once.So like, for me, I'm going in making engineers' lives better. When their lives are better, they produce better results because they're not stressed out, they're not burned out, they get to spend time with their families, all those kinds of things. When they start producing better results, the executives are happier. The executives can go to the investors and show all the great results they're getting, so the investors are happier. So, for me, I always say, like, I'm super lucky because I have a job that's win, win, win.And like, I'm helping them to make a better company, I'm helping them to ship faster, I'm helping them do things in the cloud, I'm helping them get more reliability, which helps them retain customers, all these things. Because we know from the—you know, remember the 2019 State of DevOps Report: highest performers are twice as likely to meet or exceed their organization's performance goals, and those can be customer retention, revenue, whatever those goals are. And so, I get to go in and help make a better company because I'm making people's lives better and, kind of, everybody wins. And so, for me, it's super rewarding.Corey: That's a good way of framing it. I have to ask, since the goal for private equity, as you said, is to create better companies, to effectively fix a bunch of things that, for better or worse, had not been working optimally. Let me ask the big, dumb, naive question here. Isn't that ostensibly the goal of every company? Now, everyone says it's their goal, but whether that is their goal or not, I think, is a somewhat separate question.Dave: Yeah. I—that should be the goal of every company, I agree. There are people who read my book and said, “Hey, this stuff applies far beyond private equity.” And I say, “Yeah, it absolutely does.” But there are constraints—[gold rat 00:15:10]—within private equity, about the timing, about the funding, about whatever, to get the thing to another level. And that's an interesting thing that I've seen is I've seen private equity companies take a company up to another level, have some kind of exit event, and then buy that company again years later. Which, like, what? Like, how could that be?Corey: I've seen that myself. It feels, on some level, like that company goes public, and then goes private, then goes public, then goes private to the same PE firm, and it's like, are you really a PE company or are you just secretly a giant cat, perpetually on the wrong side of a door somewhere?Dave: But that's because they will take it to a level, the company does things, things happen out in the market, and then they see another opportunity to grow them again. Where in a regular company—in theory—you're going to want to just get better all the time, forever. 
This is the Toyota thesis about continual improvement.Corey: I am curious as far as what you are seeing changing in the market with the current macroeconomic conditions, which is a polite way to say the industry going wonky after ten years of being relatively up and to the right.Dave: Yeah, well, I guess the fun thing is, we have interest rates, we had a pandemic, we had [laugh], like, all this exciting stuff. There's, you know, massive layoffs, [unintelligible 00:16:34] and then all this, kind of like, super churn-y things. I think the fun thing for me is, I went to a private equity conference in San Francisco, I don't know, a month ago or something like that, and they had all these panelists on stage pontificating about this and that and the other thing, and one of the women said something that I thought was really great, especially for someone like me. She said, “The next five to ten years in private equity are going to be about growth and operational efficiency.” And I was like, “That's DevOps. That's awesome.” [laugh].That really works well for me because, like, we want to have people twice as likely to meet or exceed their organization's performance goals. That's growth. And we want operational efficiency, right? Like, stop manually copying files around, start putting stuff in containers, do all these things that enable us to go fast speed and also do that with high quality. So, if the next five to ten years are going to be about growth and operational efficiency, I think it's a great opportunity for people to take in a lot of these DevOps principles.And so, the being on the Screaming in the Cloud podcast, like, I think cloud is a huge part of that. I think that's a big way to get growth and operational efficiency. Like, how better to be able to scale? How better to be able to Deming's PDSA cycle, right—Plan, Do, Study, Act—how better to run all these experiments to find out, like, how to get better, how to be more efficient, how to meet our customers' demands. I think that's a huge part of it.Corey: That is, I think, a very common sentiment as far as how folks are looking at things from a bigger picture these days. I want to go back as well to something you said earlier that I was joking around at the start of the episode about, “Wow, what an amazing name for the company. How did you come up with it?” And you mentioned that you had been asking a bunch of people for advice—or rather, you mentioned you had gotten advice from people. I want to clarify, you were in fact asking. I wasn't basically the human form of Clippy popping up, “It looks like you're starting a business. Let me give you unsolicited advice on what you should be doing.”What you've done, I think, is a terrific example of the do what I say not what I do type of problem, where you have focused on your positioning on a specific segment of the market: private equity firms and their portfolio companies. If I had been a little bit smarter, I would have done something similar in my own business. I would fix AWS bills for insurance companies in the Pacific Northwest or something like that, where people can hear the type of company they are reflected in the name of what it is that you do. I was just fortunate enough or foolish enough to be noisy enough in order to talk about what I do in a way that I was able to overcome that. But targeting the way that you have, I think is just so spot on. 
And it's clearly working out for you.Dave: I think a Corey Quinn Clippy would be very distracting in [laugh] my Microsoft Word, first of all [laugh]. Second of all—Corey: They're calling it Copilot now.Dave: [laugh]—there's this guy Corey and his partner Mike who turned me on to this guy, Jonathan Stark, who has his theory about your business. He calls it, like, elucidating, like, a Rolodex moment. So, if somebody's talking about X or Y, and they say, “Oh, yeah. You want to talk to Corey about that.” Or, “You want to talk to Mike about that.”And so, for me, working with private equity portfolio companies, that's a Rolodex moment. When people are like, “I'm at a portfolio company. We just got bought. They're coming in, and they want to understand what our spend is on the cloud, and this and that. Like, I don't know what I'm supposed to do here.” A lot of times people think of me because I tend to work on those kinds of problems. And so, it doesn't mean I can't work on other things, and I definitely do work on other things, I've definitely worked with companies that are not owned by private equity, but for me, that's really a place that I enjoy working, and thankfully, I get Rolodex moments from those things.Corey: That's the real value that I've found. The line I've heard is always it's not just someone at a party popping up and saying, “Oh, yeah, I have that problem.” But, “Oh, my God, you need to talk to this person I know who has that problem.” It's the introduction moment. In my case at least, it became very hard for me to find people self-identifying as having large AWS bills, just because, yeah, individual learners or small startup founders, for example, might talk about it here and there, but large companies do not tend to complain about that in Twitter because that tends to, you know, get them removed from their roles when they start going down that path. Do you find that it is easier for you to target what you do to people because it's easier to identify them in public? Because I assure you, someone with a big AWS bill is hard to spot out of a crowd.Dave: Well, I think you need to meet people where they are, I think is probably the best way of saying that. So, if you are—and this isn't something I need to explain to you, obviously, so this is more for your listeners, but like, if you're going to talk about, “Hey, I'm looking for companies with large AWS bills,” [pthhh] like that's, maybe kind of whatever. But if you say, “Hey, I want to improve your margins and your operational efficiencies,” all of a sudden, you're starting to speak their language, right? And that language is where people start to understand that, “Hey, Corey's talking about me.”Corey: A large part of how I talk about this was shaped by some of the early conversations I had. The way that I think about this stuff and the way that I talk is not necessarily what terms my customers use. Something that I found that absolutely changed my approach was having an investigative journalist—or a former investigative journalist, in this case—interview people I'd worked with to get case studies and testimonials from them. But what she would also do was get the exact phrasing that they use to describe the value that I did, and how they talked about what we'd done. Because that became something that was oh, you're effectively writing the rough draft of my marketing copy when you do that. 
Speaking in the language of your customer is so important, and I meet a lot of early-stage startups that haven't quite unlocked that bit of insight yet.Dave: And I think looking at that from a slightly different perspective is also super important. So, not only speaking the language of your customer, but let's say you're not a consultant like me or you. Let's say you work inside of a company. You need to learn to speak the language of business, right? And this is, like, something I wrote about in the beginning of the book about the guy in San Francisco who got locked up for not giving away the Cisco passwords, and Gavin Newsom had to go to his jail cell and all this other crazy stuff that happened is, technologists often think that the reason that they go to work is to play with technology. The reason we go to work is to enable the business.And—so shameless plug here I—wrote a paper that came out, like, two months ago with IT Revolution—so the people who do The Phoenix Project, and Accelerate, and The DevOps Handbook, and all that other stuff, I wrote this paper with, like, Courtney Kissler, and Paul Gaffney, and Scott Nasello, and a whole bunch of amazing technologists, but it's about speaking the language of business. And as technologists, if we want to really contribute and feel like the work that we're doing is contributing and valuable, you need to start understanding how those other people are talking. So, you and I were just talking about, like, operational efficiencies, and margins, and whatever. What is all that stuff? And figuring that out and being able to have that conversation with your CEO or whoever, those are the things that get people to understand exactly what you're trying to do, and what you're doing, and why this thing is so important.I talk to so many engineers that are like, “Ah, I talked to management and they just don't understand, and [da-dah].” Yeah, they don't understand because you're speaking technology language. They don't want to hear about, like, CNCF compliant this, that, and the—that doesn't mean anything to them. You need to understand in their lang—talk to them and their language and say like, “Hey, this is why this is good for the business.” And I think that's a really important thing for people to start to learn.Corey: So, a question that I have, given that you have been doing this stuff, I think, longer than I have, back when cloud wasn't really a thing, and then it was a thing, but it seemed really irresponsible to do. And then it went through several more iterations to the point where now it's everywhere. What's your philosophy of cloud?Dave: So, I'll go back to something that just came out, the 2023 State of DevOps Report just came out. I follow those things pretty closely. One of the things they talked about in the paper is one of the key differentiators to get your business to have what they call high organizational performance—again, this [laugh] is going back to business talk again—is what they call infrastructure flexibility. And I just don't think you can get infrastructure flexibility if you're not in the cloud. Can you do it? Absolutely.You know, back over a decade ago, I built out a bunch of stuff in a data center on what I called cloud principles. We could shoot things in the head, get new ones back, we did all kinds of things, we identified SKUs of, like, what kind of classes of machines we had. All that looks like a lot of stuff that you would just do in AWS, right? Like, I know, my C instances are compute. 
I know my M instances are memory. Like, they're all just SKUs, right?Corey: Yeah, that changed a little bit now to the point where they have so many different instance families that some of their names look like dumps of their firmware.Dave: [laugh]. That is probably true. But like, this idea that, like, I want to have this infrastructure flexibility isn't just my idea that it's going to turn out well. Like, the State of DevOps Report kind of proves it. And so, for me, like, I go back to some of the principles of the DevOps movement, and like, if you look at the DORA metrics, let's say you've got deployment frequency and lead time for changes. That's speed: how fast can I do something? And you've got time-to-recover, and you've got change failure rate. That's quality: how much can I ship without having problems, and how fast can I recover when I do?And I think this is one of the things I teach to a lot of my clients about moving into the cloud. If you want to be successful, you have to deliver with speed and quality. Speed: Infrastructure as Code, full stop. If I want to be able to go fast, I need to be able to destroy an environment, bring a new environment up, I need to be able to do that in minutes. That's speed.And then the second requirement, and the only other requirement, is build monitoring in from the start. Everything gets monitored. And that's quality. Like, if I monitor stuff, I know when I've deployed something that's spiking CPU. If that's monitored, I know that this thing is costing me a hell of a lot more than other things. I know all this stuff. And I can do capacity planning, I can do whatever the heck I want. But those are the two fundamental things: Infrastructure as Code and monitoring.And yes, like you said, I worked at a monitoring or observability company, so perhaps I'm slightly biased, but what I've seen is, like, companies that adopt those two principles, and everything else comes from that—so all my Kubernetes stuff and all those other things are not at odds with those principles—those are the people who actually wind up doing really well. And I think those are the people that have—State of DevOps Report—infrastructure flexibility, and that enables them to have higher organizational performance.Corey: I think you're onto something. Like, I still remember the days of having to figure out the number of people who you had in your ops team versus how many servers they could safely and reasonably run. And now that question has little, if any, meaning. If someone asked me, “Okay, so we're running right now 10,000 instances in our cloud environment. How many admins should it take us to run those?” The correct response is, “How the heck are you running those things?” Like, tell me more because the answer is probably terrifying. Because right now, if you do that correctly, it's you want to make a change to all of them or some subset of them? You change a parameter somewhere and computers do the heavy lifting.Dave: Yeah, I ran a content delivery network for cable and wireless. We had three types of machines. You know, it was like Windows Media Server and some squid-cache thing or whatever. And it didn't matter how many we had. It's all the same. Like, if I had 10,000 and I had 50,000, it's irrelevant. Like, they're all the same kind of crap. 
It's not that hard to manage a bunch of stuff that's all the same.If I have 10,000 servers and each one is a unique, special snowflake because I'm running in what I call a hosted configuration, I have 10,000 customers, therefore I have 10,000 servers, and each of them is completely different than the other, then that's going to be a hell of a lot harder to manage than 10,000 things that the load balancer is like [bbbrrrp bbbrrrp] [laugh] like, just lay it out. So, it's sort of a… kind of a nonsense question at this point. Like you're saying, like, it doesn't really matter how many. It's complexity. How much complexity do I have? And as we all say, in the DevOps movement, complexity isn't free. Which I'll bet is a large component of how you save companies money with The Duckbill Group.Corey: It goes even beyond that because cloud infrastructure is always less expensive than the people working on it, unless you do something terrifying. Otherwise, everything should be running an EC2 instances. Nothing higher-level built on top of it because if people's time is free, the cheapest thing you're going to get is a bunch of instances. The end. That is not really how you should be thinking about this.Dave: [laugh]. I know a lot of private equity firms that would love to find a place where time was free [laugh]. They could make a lot of money.Corey: Yeah. Pretty sure that the biggest—like, “What's your biggest competitive headwind?” You know [laugh], “Wage laws.” Like it doesn't work that way. I'm sorry, but it doesn't [laugh].I really want to thank you for taking the time to talk to me about what you're up to, how things are going over in your part of the universe. If people want to learn more, where's the best place for them to go to find you?Dave: They can go to mangoteque.com. I've got all the links to my blog, my mailing list. Definitely, if you're interested in this intersection of DevOps and private equity, sign up for the mailing list. For people who didn't get Corey's funky spelling of my last name, it is a play on the fact that it is French and I also work with technology companies. So, it's M-A-N-G-O-T-E-Q-U-E dot com.If you type that in—Mangoteque—to any search engine, obviously, you will find me. I am not difficult to find on the internet because I've been doing this for quite some time. But thank you for having me on the show. It's always great to catch up with you. I love hearing about what you're doing. I super appreciate you're asking me about the things that I'm working on, and you know, been a big help.Corey: No, it's deeply fascinating. It's neat to watch you continue to meet your market in a variety of different ways. Dave Mangot, CEO and founder of Mangoteque, which is excellently named. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice, along with an angry comment almost certainly filled with incoherent screaming because you tuned out just as soon as you heard the words ‘private equity.'Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.
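Dave's two requirements for doing cloud well (infrastructure as code for speed, monitoring built in from the start for quality) line up with the four DORA metrics he cites earlier in this episode: deployment frequency and lead time for changes measure speed, while time to recover and change failure rate measure quality. As a minimal illustration of how a team might put numbers on that framing, here is a short Python sketch that computes the four metrics from a list of deployment records. The record fields and the sample data are hypothetical; they are not drawn from the episode or from any Mangoteque tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional

@dataclass
class Deployment:
    committed_at: datetime            # when the change was committed
    deployed_at: datetime             # when the change reached production
    caused_failure: bool              # did this deploy trigger an incident?
    restored_at: Optional[datetime] = None  # when service recovered, if it failed

def dora_metrics(deploys: list[Deployment], window_days: int = 30) -> dict:
    """Compute the four DORA metrics over a reporting window (assumes a non-empty list)."""
    failed = [d for d in deploys if d.caused_failure]
    restored = [d for d in failed if d.restored_at is not None]
    return {
        # Speed: how often changes ship, and how long they take to get there.
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time_hours": median(
            (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deploys
        ),
        # Quality: how often changes break things, and how fast we recover.
        "change_failure_rate": len(failed) / len(deploys),
        "median_time_to_restore_minutes": median(
            (d.restored_at - d.deployed_at).total_seconds() / 60 for d in restored
        ) if restored else 0.0,
    }

if __name__ == "__main__":
    now = datetime(2023, 11, 1, 12, 0)
    sample = [  # hypothetical deployment history
        Deployment(now - timedelta(hours=30), now - timedelta(hours=2), False),
        Deployment(now - timedelta(hours=20), now - timedelta(hours=1), True,
                   restored_at=now - timedelta(minutes=35)),
        Deployment(now - timedelta(hours=6), now, False),
    ]
    print(dora_metrics(sample))
```

Even rough numbers like these give the "speed and quality" conversation with executives something concrete to point at, which is the business-language framing Dave argues for in the episode.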

Screaming in the Cloud
Using SRE to Solve the Obvious Problems with Laura Nolan

Screaming in the Cloud

Play Episode Listen Later Dec 12, 2023 29:46


Laura Nolan, Principal Software Engineer at Stanza, joins Corey on Screaming in the Cloud to offer insights on how to use SRE to avoid disastrous and lengthy production delays. Laura gives a rich history of her work with SREcon, why her approach to SRE is about first identifying the biggest fire instead of toiling with day-to-day issues, and why the lack of transparency in systems today actually hurts new engineers entering the space. Plus, Laura explains to Corey why she dedicates time to work against companies like Google who are building systems to help the government (inefficiently) select targets during wars and conflicts.About LauraLaura Nolan is a software engineer and SRE. She has contributed to several books on SRE, such as the Site Reliability Engineering book, Seeking SRE, and 97 Things Every SRE Should Know. Laura is a Principal Engineer at Stanza, where she is building software to help humans understand and control their production systems. Laura also serves as a member of the USENIX Association board of directors. In her copious spare time after that, she volunteers for the Campaign to Stop Killer Robots, and is half-way through the MSc in Human Factors and Systems Safety at Lund University. She lives in rural Ireland in a small village full of medieval ruins.Links Referenced: Company Website: https://www.stanza.systems/ Twitter: https://twitter.com/lauralifts LinkedIn: https://www.linkedin.com/in/laura-nolan-bb7429/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is someone that I have been low-key annoying to come onto this show for years, and finally, I have managed to wear her down. Laura Nolan is a Principal Software Engineer over at Stanza. At least that's what you're up to today, last I've heard. Is that right?Laura: That is correct. I'm working at Stanza, and I don't want to go on and on about my startup, but I'm working with Niall Murphy and Joseph Bironas and Matthew Girard and a bunch of other people who more recently joined us. We are trying to build a load management SaaS service. So, we're interested in service observability out of the box, knowing if your critical user journeys are good or bad out of the box, being able to prioritize your incoming requests by what's most critical in terms of visibility to your customers. So, an emerging space. Not in the Gartner Group Magic Circle yet, but I'm sure at some point [laugh].Corey: It is surreal to me to hear you talk about your day job because for, it feels like, the better part of a decade now, “Laura, Laura… oh, you mean USENIX Laura?” Because you are on the USENIX board of directors, and in my mind, that is what is always short-handed to what you do. It's, “Oh, right. I guess that isn't your actual full-time job.” It's weird. It's almost like seeing your teacher outside of the elementary school. You just figure that they fold themselves up in the closet there when you're not paying attention. I don't know what you do when SREcon is not in process. I assume you just sit there and wait for the next one, right?Laura: Well, no. We've run four of them in the last year, so there hasn't been very much waiting.
I'm afraid. Everything got a little bit smooshed up together during the pandemic, so we've had a lot of events coming quite close together. But no, I do have a full-time day job. But the work I do with USENIX is just as a volunteer. So, I'm on the board of directors, as you say, and I'm on the steering committee for all of the global SREcon events, and typically is often served by the program committee as well. And I'm sort of there, annoying the chairs to, “Hey, do your thing on time,” very much like an elementary school teacher, as you say.Corey: I've been a big fan of USENIX for a while. One of the best interview processes I ever saw was closely aligned with evaluating candidates along with USENIX SAGE levels to figure out what level of seniority are they in different areas. And it was always viewed through the lens of in what types of consulting engagements will the candidate shine within, not the idea of, “Oh, are you good or are you crap? And spoiler, if I'm asking the question, I'm of course defaulting myself to goading you to crap.” Like the terrible bespoke artisanal job interview process that so many companies do. I love how this company had built this out, and I asked them about it, and, “Oh, yeah, it comes—that dates back to the USENIX SAGE things.” That was one of my first encounters with what USENIX actually did. And the more I learned, the more I liked. How long have you been involved with the group?Laura: A relatively short period of time. I think I first got involved with USENIX in around 2015, going to [Lisa 00:03:29] and then going on to SREcon. And it was all by accident, of course. I fell onto the SREcon program committee somehow because I was around. And then because I was still around and doing stuff, I got eventually—you know, got co-opted into chairing and onto the steering committee and so forth.And you know, it's like everything volunteer. I mean, people who stick around and do stuff tend to be kept around. But USENIX is quite important to me. We have an open access policy, which is something that I would like to see a whole lot more of, you know, we put everything right out there for free as soon as it is ready. And we are constantly plagued by people saying, “Hey, where's my SREcon video? The conference was like two weeks ago.” And we're like, “No, no, we're still processing the videos. We'll be there; they'll be there.”We've had people, like, literally offer to pay extra money to get the videos sooner, but [laugh] we're, like, we are open access. We are not keeping the videos away from you. We just aren't ready yet. So, I love the open access policy and I think what I like about it more than anything else is the fact that it's… we are staunchly non-vendor. We're non-technology specific and non-vendor.So, it's not, like, say, AWS re:Invent for example or any of the big cloud vendor conferences. You know, we are picking vendor-neutral content by quality. And as well, as anyone who's ever sponsored SREcon or any of the other events will also tell you that that does not get you a talk in the conference program. So, the content selection is completely independent, and in fact, we have a complete Chinese wall between the sponsorship organization and the content organization. So, I mean, I really like how we've done that.I think, as well, it's for a long time been one of the family of conferences that our organizations have conferences that has had the best diversity. 
Not perfect, but certainly better than it was, although very, very unfortunately, I see conference diversity everywhere going down after the pandemic, which is—particularly gender diversity—which is a real shame.Corey: I've been a fan of the SREcon conferences for a while before someone—presumably you; I'm not sure—screwed up before the pandemic and apparently thought they were talking about someone else, and I was invited to give a keynote at SREcon in EMEA that I co-presented with John Looney. Which was fun because he and I met in person for the first time three hours beforehand, beat together our talk, then showed up an hour beforehand, found there will be no confidence monitor, went away for the next 45 minutes and basically loaded it all into short term cash and gave a talk that we could not repeat if we had to for a million dollars, just because it was so… you're throwing the ball to your partner on stage and really hoping they're going to be able to catch it. And it worked out. It was an anger subtext translator skit for a bit, which was fun. All the things that your manager says but actually means, you know, the fun sort of approach. It was zany, ideally had some useful takeaways to it.But I loved the conference. That was one of the only SREcons that I found myself not surprised to discover was coming to town the next week because for whatever reason, there's presumably a mailing list that I'm not on somewhere where I get blindsided by, “Oh, yeah, hey, didn't you know SREcon is coming up?” There's probably a notice somewhere that I really should be paying attention to, but on the plus side, I get to be delightfully surprised every time.Laura: Indeed. And hopefully, you'll be delightfully surprised in March 2024. I believe it's the 18th to the 20th, when SREcon will be coming to town in San Francisco, where you live.Corey: So historically, in addition to, you know, the work with USENIX, which is, again, not your primary occupation most days, you spent over five years at Google, which of course means that you have strong opinions on SRE. I know that that is a bit dated, where the gag was always, it's only called SRE if it comes from the Mountain View region of California, otherwise it's just sparkling DevOps. But for the initial take of a lot of the SRE stuff was, “Here's how to work at Google.” It has progressed significantly beyond that to the point where companies who have SRE groups are no longer perceived incorrectly as, “Oh, we just want to be like Google,” or, “We hired a bunch of former Google people.”But you clearly have opinions to this. You've contributed to multiple books on SRE, you have spoken on it at length. You have enabled others to speak on it at length, which in many ways, is by far the better contribution. You can only go so far scaling yourself, but scaling other people, that has a much better multiplier on it, which feels almost like something an SRE might observe.Laura: It is indeed something an SRE might observe. And also, you know, good catch because I really felt you were implying there that you didn't like my book contributions. Oh, the shock.Corey: No. 
And to be clear, I meant [unintelligible 00:08:13], strictly to speaking.Laura: [laugh].Corey: Books are also a great one-to-many multiplier because it turns out, you can only shove so many people into a conference hall, but books have this ability to just carry your words beyond the room that you're in a way that video just doesn't seem to.Laura: Ah, but open access video that was published on YouTube, like, six weeks ahead [laugh]. That scales.Corey: I wish. People say they want to write a book and I think they're all lying. I think they want to have written the book. That's my philosophy on it. I do not understand people who've written a book. Like, “So, what are you going to do now?” “I'm going to write another book.” “Okay.” I'm going to smile, not take my eyes off you for a second and back away slowly because I do not understand your philosophy on that. But you've worked on multiple books with people.Laura: I actually enjoy writing. I enjoy the process of it because I always learn something when I write. In fact, I learn a lot of things when I write, and I enjoy that crafting. I will say I do not enjoy having written things because for me, any achievement once I have achieved it is completely dead. I will never think of it again, and I will think only of my excessively lengthy-to do list, so I clearly have problems here. But nevertheless. It's exactly the same with programming projects, by the way. But back to SRE we were talking about SRE. SRE is 20 now. SRE can almost drink alcohol in the US, and that is crazy.Corey: So, 2003 was the founding of it, then.Laura: Yes.Corey: Yay, I can do simple arithmetic in my head, still. I wondered how far my math skills had atrophied.Laura: Yes. Good job. Yes, apparently invented in roughly 2003. So, the—I mean, from what I understand Google's publishing of the, “20 years of SRE at Google,” they have, in the absence of an actual definite start date, they've simply picked. Ben Treynor's start date at Google as the start date of SRE.But nevertheless, [unintelligible 00:09:58] about 20 years old. So, is it all grown up? I mean, I think it's become heavily commodified. My feeling about SRE is that it's always been this—I mean, you said it earlier, like, it's about, you know, how do I scale things? How do I optimize my systems? How do I intervene in systems to solve problems to make them better, to see where we're going to be in pain and six months, and work to prevent that?That's kind of SRE work to me is, figure out where the problems are, figure out good ways to intervene and to improve. But there's a lot of SRE as bureaucracy around at the moment where people are like, “Well, we're an SRE team, so you know, you will have your SLO Golden Signals, and you will have your Production Readiness Checklists, which will be the things that we say, no matter how different your system is from what we designed this checklist for, and that's it. We're doing SRE now. It's great.” So, I think we miss a lot there.My personal way of doing SRE is very much more about thinking, not so much about the day-to-day SLO [excursion-type 00:10:56] things because—not that they're not important; they are important, but they will always be there. I always tend to spend more time thinking about how do we avoid the risk of, you know, a giant production fire that will take you down for days, or God forbid, more than days, you know? The sort of, big Roblox fire or the time that Meta nearly took down the internet in late-2021, that kind of thing. 
So, I think that modern SRE misses quite a lot of that. It's a little bit like… so when BP, when they had the Deepwater Horizon disaster on that very same day, they received an award for minimizing occupational safety risks in their environment. So, you know, [unintelligible 00:11:41] things like people tripping and—Corey: Must have been fun the next day. “Yeah, we're going to need that back.”Laura: [laugh] people tripping and falling, and you know, hitting themselves with a hammer, they got an award because it was so safe, they had very little of that. And then this thing goes boom.Corey: And now they've tried to pivot into an optimization award for efficiency, like, we just decided to flash fry half the sea life in the Gulf at once.Laura: Yes. Extremely efficient. So, you know, I worry that we're doing SRE a little bit like BP. We're doing it back before Deepwater Horizon.Corey: I should disclose that I started my technical career as a grumpy old Unix sysadmin—because it's not like you ever see one of those who's happy or young; didn't matter that I was 23 years old, I was grumpy and old—and I have viewed the evolution since then have going from calling myself a sysadmin to a DevOps engineer to an SRE to a platform engineer to whatever we're calling it this week, I still view it as fundamentally the same job, in the sense that the responsibility has not changed, and that is keep the site or environment up. But the tools, the processes and the techniques we apply to it have evolved. Is that accurate? Does it sound like I'm spouting nonsense? You're far closer to the SRE world than I ever was, but I'm curious to get your take on that perspective. And please feel free to tell me I'm wrong.Laura: No, no. I think you're completely right. And I think one of the ways that I think is shifted, and it's really interesting, but when you and I were, when we were young, we could see everything that was happening. We were deploying on some sort of Linux box or other sort of Unix box somewhere, most likely, and if we wanted, we could go and see the entire source code of everything that our software was running on. And kids these days, they're coming up, and they are deploying their stuff on RDS and ECS and, you know, how many layers of abstraction are sitting between them and—Corey: “I run Kubernetes. That means I don't know where it runs, and neither does anyone else.” It's great.Laura: Yeah. So, there's no transparency anymore in what's happening. So, it's very easy, you get to a point where sometimes you hit a problem, and you just can't figure it out because you do not have a way to get into that system and see what's happening. You know, even at work, we ran into a problem with Amazon-hosted Prometheus. We were like, “This will be great. We'll just do that.” And we could not get some particular type of remote write operation to work. We just could not. Okay, so we'll have to do something else.So, one of the many, many things I do when I'm not, you know, trying to run the SREcon conference or do actual work or definitely not write a book, I'm studying at Lund University at the moment. I'm doing this master's degree in human factors and system safety. And one of the things I've realized since doing that program is, in tech, we missed this whole 1980s and 1990s discipline of cognitive systems theory, cognitive systems engineering. This is what people were doing. 
They were like, how can people in the control room in nuclear plants and in the cockpit in the airplane, how can they get along with their systems and build a good mental model of the automation and understand what's going on?We missed all that. We came of age when safety science was asking questions like how can we stop organizational failures like Challenger and Columbia, where people are just not making the correct decisions? And that was a whole different sort of focus. So, we've missed all of this 1980s and 1990s cognitive system stuff. And there's this really interesting idea there where you can build two types of systems: you can build a prosthesis which does all your interaction with a system for you, and you can see nothing, feel nothing, do nothing, it's just this black box, or you can have an amplifier, which lets you do more stuff than you could do just by yourself, but lets you still get into the details.And we build mostly prostheses. We do not build amplifiers. We're hiding all the details; we're building these very, very opaque abstractions. And I think it's to the detriment of—I mean, it makes our life harder in a bunch of ways, but I think it also makes life really hard for systems engineers coming up because they just can't get into the systems as easily anymore unless they're running them themselves.Corey: I have to confess that I have a certain aversion to aspects of SRE, and I'm feeling echoes of it around a lot of the human factor stuff that's coming out of that Lund program. And I think I know what it is, and it's not a problem with either of those things, but rather a problem with me. I have never been a good academic. I have an eighth grade education because school is not really for me. And what I loved about being a systems administrator for years was the fact that it was like solving puzzles every day.I got to do interesting things, I got to chase down problems, and firefight all the time. And what SRE is represented is a step away from that to being more methodical, to taking on keeping the site up as a discipline rather than an occupation or a task that you're working on. And I think that a lot of the human factors stuff plays directly into it. It feels like the field is becoming a lot more academic, which is a luxury we never had, when holy crap, the site is down, we're going to go out of business if it isn't back up immediately: panic mode.Laura: I got to confess here, I have three master's degrees. Three. I have problems, like I said before. I got what you mean. You don't like when people are speaking in generalizations and sort of being all theoretical rather than looking at the actual messy details that we need to deal with to get things done, right? I know. I know what you mean, I feel it too.And I've talked about the human factors stuff and theoretical stuff a fair bit at conferences, and what I always try to do is I always try and illustrate with the details. Because I think it's very easy to get away from the actual problems and, you know, spend too much time in the models and in the theory. And I like to do both. I will confess, I like to do both. And that means that the luxury I miss out on is mostly sleep. But here we are.Corey: I am curious as far as what you've seen as far as the human factors adoption in this space because every company for a while claimed to be focused on blameless postmortems. But then there would be issues that quickly turned into a blame Steve postmortem instead. 
And it really feels, at least from a certain point of view, that there was a time where it seemed to be gaining traction, but that may have been a zero interest rate phenomenon, as weird as that sounds. Do you think that the idea of human factors being tied to keeping systems running in a computer sense has demonstrated staying power or are you seeing a recession? It could be I'm just looking at headlines too much.Laura: It's a good question. There's still a lot of people interested in it. There was a conference in Denver last February that was decently well attended for, you know, a first initial conference that was focusing on this issue, and this very vibrant Slack community, the LFI and the Learning from Incidents in Software community. I will say, everything is a little bit stretched at the moment in industry, as you know, with all the layoffs, and a lot of people are just… there's definitely a feeling that people want to hunker down and do the basics to make sure that they're not seen as doing useless stuff and on the line for layoffs.But the question is, is this stuff actually useful or not? I mean, I contend that it is. I contend that we can learn from failures, we can learn from what we're doing day-to-day, and we can do things better. Sometimes you don't need a lot of learning because what's the biggest problem is obvious, right [laugh]? You know, in that case, yeah, your focus should just be on solving your big obvious problem, for sure.Corey: If there was a hierarchy of needs here, on some level, okay, step one, is the building—Laura: Yes.Corey: Currently on fire? Maybe solve that before thinking about the longer-term context of what this does to corporate culture.Laura: Yes, absolutely. And I've gone into teams before where people are like, “Oh, well, you're an SRE, so obviously, you wish to immediately introduce SLOs.” And I can look around and go, “Nope. Not the biggest problem right now. Actually, I can see a bunch of things are on fire. We should fix those specific things.”I actually personally think that if you want to go in and start improving reliability in a system, the best thing to do is to start a weekly production meeting if the team doesn't have that, actually create a dedicated space and time for everyone to be able to get together, discuss what's been happening, discuss concerns and risks, and get all that stuff out in the open. I think that's very useful, and you don't need to spend however long it takes to formally sit down and start creating a bunch of SLOs. Because if you're not dealing with a perfectly spherical web service where you can just use the Golden Signals and if you start getting into any sorts of thinking about data integrity, or backups, or any sorts of asynchronous processing, these sorts of things, they need SLOs that are a lot more interesting than your standard error rate and latency. Error rate and latency gets you so far, but it's really just very cookie-cutter stuff. But people know what's wrong with their systems, by and large. They may not know everything that's wrong with their systems, but they'll know the big things, for sure. Give them space to talk about it.Corey: Speaking of bigger things and turning into the idea of these things escaping beyond pure tech, you have been doing some rather interesting work in an area that I don't see a whole lot of people that I talked to communicating about. 
Specifically, you're volunteering for the campaign to stop killer robots, which ten years ago would have made you sound ridiculous, and now it makes you sound like someone who is very rationally and reasonably calling an alarm on something that is on our doorstep. What are you doing over there?Laura: Well, I mean, let's be real, it sounds ridiculous because it is ridiculous. I mean, who would let a computer fly around to the sky and choose what to shoot at? But it turns out that there are, in fact, a bunch of people who are building systems like that. So yeah, I've been volunteering with the campaign for about the last five years, since roughly around the time that I left Google, in fact, because I got interested in that around about the time that Google was doing the Project Maven work, which was when Google said, “Hey, wouldn't it be super cool if we took all of this DoD video footage of drone video footage, and, you know, did a whole bunch of machine-learning analysis on it and figured out where people are going all the time? Maybe we could click on this house and see, like, a whole timeline of people's comings and goings and which other people they are sort of in a social network with.”So, I kind of said, “Ahh… maybe I don't want to be involved in that.” And I left Google. And I found out that there was this campaign. And this campaign was largely lawyers and disarmament experts, people of that nature—philosophers—but also a few technologists. And for me, having run computer systems for a large number of years at this point, the idea that you would want to rely on a big distributed system running over some janky network with a bunch of 18-year-old kids running it to actually make good decisions about who should be targeted in a conflict seems outrageous.And I think almost every [laugh] software operations person, or in fact, software engineer that I've spoken to, tends to feel the same way. And yet there is this big practical debate about this in international relations circles. But luckily, there has just been a resolution in the UN just in the last day or two as we record this, the first committee has, by a very large majority, voted to try and do something about this. So hopefully, we'll get some international law. The specific interventions that most of us in this field think would be good would be to limit the amount of force that autonomous weapon, or in fact, an entire set of autonomous weapons in a region would be able to wield because there's a concern that should there be some bug or problem or a sort of weird factor that triggers these systems to—Corey: It's an inevitability that there will be. Like, that is not up for debate. Of course, it's going to break in 2020, the template slide deck that AWS sent out for re:Invent speakers had a bunch of clip art, and one of them was a line art drawing of a ham with a bone in it. So, I wound up taking that image, slapping it on a t-shirt, captioning it “AWS Hambone,” and selling that as a fundraiser for 826 National.Laura: [laugh].Corey: Now, what happened next is that for a while, anyone who tweeted the phrase “AWS Hambone” would find themselves banned from Twitter for the next 12 hours due to some weird algorithmic thing where it thought that was doxxing or harassment or something. And people on the other side of the issue that you're talking about are straight face-idly suggesting that we give that algorithm [unintelligible 00:24:32] tool a gun.Laura: Or many guns. 
Many guns.Corey: I'm sorry, what?Laura: Absolutely.Corey: Yes, or missiles or, heck, let's build a whole bunch of them and turn them loose with no supervision, just like we do with junior developers.Laura: Exactly. Yes, so many people think this is a great idea, or at least they purport to think this is a great idea, which is not always the same thing. I mean, there's lots of different vested interests here. Some people who are proponents of this will say, well, actually, we think that this will make targeting more accurate, less civilians will actually will die as a result of this. And the question there that you have to ask is—there's a really good book called Drone by Chamayou, Grégoire Chamayou, and he says that there's actually three meanings to accuracy.So, are you hitting what you're aiming at is one of it—one thing. And that's a solved problem in military circles for quite some time. You got, you know, laser targeting, very accurate. Then the other question is, how big is the blast radius? So, that's just a matter of, you know, how big an explosion are you going to get? That's not something that autonomy can help with.The only thing that autonomy could even conceivably help with in terms of accuracy is better target selection. So, instead of selecting targets that are not valid targets, selecting more valid targets. But I don't think there's any good reason to think that computers can solve that problem. I mean, in fact, if you read stuff that military experts write on this, and I've got, you know, lots of academic handbooks on military targeting processes, they will tell you, it's very hard and there's a lot of gray areas, a lot of judgment. And that's exactly what computers are pretty bad at. Although mind you, I'm amused by your Hambone story and I want to ask if AWS Hambone is a database?Corey: Anything is a database, if you hold it wrong.Laura: [laugh].Corey: It's fun. I went through a period of time where, just for fun, I would ask people to name an AWS service and I would talk about how you could use it incorrectly as a database. And then someone mentioned, “What about AWS Neptune,” which is their graph database, which absolutely no one understands, and the answer there is, “I give up. It's impossible to use that thing as a database.” But everything else can be. Like, you know, the tagging system. Great, that has keys and values; it's a database now. Welcome aboard. And I didn't say it was a great database, but it is a free one, and it scales to a point. Have fun with it.Laura: All I'll say is this: you can put labels on anything.Corey: Exactly.Laura: We missed you at the most recent SREcon EMEA. There was a talk about Google's internal Chubby system and how people started using it as a database. And I did summon you in Slack, but you didn't show up.Corey: No. Sadly, I've gotten a bit out of the SRE space. And also, frankly, I've gotten out of the community space for a little while, when it comes to conferences. And I have a focused effort at the start of 2024 to start changing that. I am submitting CFPs left and right.My biggest fear is that a conference will accept one of these because a couple of them are aspirational. “Here's how I built the thing with generative AI,” which spoiler, I have done no such thing yet, but by God, I will by the time I get there. I have something similar around Kubernetes, which I've never used in anger, but soon will if someone accepts the right conference talk. 
This is how I learned Git: I shot my mouth off in a CFP, and I had four months to learn the thing. It was effective, but I wouldn't say it was the best approach.Laura: [laugh]. You shouldn't feel bad about lying about having built things in Kubernetes, and with LLMs because everyone has, right?Corey: Exactly. It'll be true enough by the time I get there. Why not? I'm not submitting for a conference next week. We're good. Yeah, Future Corey is going to hate me.Laura: Have it build you a database system.Corey: I like that. I really want to thank you for taking the time to speak with me today. If people want to learn more, where's the best place for them to find you these days?Laura: Ohh, I'm sort of homeless on social media since the whole Twitter implosion, but you can still find me there. I'm @lauralifts on Twitter and I have the same tag on BlueSky, but haven't started to use it yet. Yeah, socials are hard at the moment. I'm on LinkedIn. Please feel free to follow me there if you wish to message me as well.Corey: And we will, of course, put links to that in the [show notes 00:28:31]. Thank you so much for taking the time to speak with me. I appreciate it.Laura: Thank you for having me.Corey: Laura Nolan, Principal Software Engineer at Stanza. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that soon—due to me screwing up a database system—will be transmogrified into a CFP submission for an upcoming SREcon.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Terraform and The Art of Teaching Tech with Ned Bellavance

Screaming in the Cloud

Play Episode Listen Later Dec 7, 2023 35:02


Ned Bellavance worked in the world of tech for more than a decade before joining the family profession as an educator. He joins Corey on Screaming in the Cloud to discuss his shift from engineer to educator and content creator, the intricacies of Terraform, and how changes in licensing affect the ecosystem.About NedNed is an IT professional with more than 20 years of experience in the field. He has been a helpdesk operator, systems administrator, cloud architect, and product manager. In 2019, Ned founded Ned in the Cloud LLC to work as an independent educator, creator, and consultant. In this new role, he develops courses for Pluralsight, runs multiple podcasts, writes books, and creates original content for technology vendors.Ned is a Microsoft MVP since 2017 and a HashiCorp Ambassador since 2020.Ned has three guiding principles: embrace discomfort, fail often, and be kind.Links Referenced: Ned in the Cloud: https://nedinthecloud.com/ LinkedIn: https://www.linkedin.com/in/ned-bellavance/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is Ned Bellavance, who's the founder and curious human over at Ned in the Cloud. Ned, thank you for joining me.Ned: Yeah, it's a pleasure to be here, Corey.Corey: So, what is Ned in the Cloud? There are a bunch of easy answers that I feel don't give the complete story like, “Oh, it's a YouTube channel,” or, “Oh no, it's the name that you wound up using because of, I don't know, easier to spell the URL or something.” Where do you start? Where do you stop? What are you exactly?Ned: What am I? Wow, I didn't know we were going to get this deep into philosophical territory this early. I mean, you got to ease me in with something. But so, Ned in the Cloud is the name of my blog from back in the days when we all started up a blog and hosted on WordPress and had fun. And then I was also at the same time working for a value-added reseller as a consultant, so a lot of what went on my blog was stuff that happened to me in the world of consulting.And you're always dealing with different levels of brokenness when you go to clients, so you see some interesting things, and I blogged about them. At a certain point, I decided I want to go out and do my own thing, mostly focused on training and education and content creation and I was looking for a company name. And I went through—I had a list of about 40 different names. And I showed them to my wife, and she's like, “Why don't you go Ned in the Cloud? Why are you making this more complicated than it needs to be?”And I said, “Well, I'm an engineer. That is my job, by definition, but you're probably right. I should just go with Ned in the Cloud.” So, Ned in the Cloud now is a company, just me, focused on creating educational content for technical learners on a variety of different platforms. And if I'm delivering educational content, I am a happy human, and if I'm not doing that, I'm probably out running somewhere.Corey: I like that, and I'd like to focus on education first. 
There are a number of reasons that people will go in that particular direction, but what was it for you?Ned: I think it's kind of in the heritage of my family. It's in my blood to a certain degree because my dad is a teacher, my mom is a teacher-turned-librarian, my sister is a teacher, my wife is a teacher, her mother is a teacher. So, there was definitely something in the air, and I think at a certain point, I was the black sheep in the sense that I was the engineer. Look, this guy over here. And then I ended up deciding that I really liked training people and learning and teaching, and became a teacher of sorts, and then they all went, “Welcome to the fold.”Corey: It's fun when you get to talk to people about the things that they're learning because when someone's learning something I find that it's the time when their mind is the most open. I don't think that that's something that you don't get to see nearly as much once someone already, quote-unquote, “Knows a thing,” because once that happens, why would you go back and learn something new? I have always learned the most—even about things that I've built myself—by putting it in the hands of users and seeing how they honestly sometimes hold it wrong and make mistakes that don't make sense to me, but absolutely make sense to them. Learning something—or rather, teaching something—versus building that thing is very much an orthogonal skill set, and I don't think that there's enough respect given to that understanding.Ned: It's an interesting sphere of people who can both build the thing and then teach somebody else to build the thing because you're right, it's very different skill sets. Being able to teach means that you have to empathize with the human being that you're teaching and understand that their perspective is not yours necessarily. And one of the skills that you build up as an instructor is realizing when you're making a whole bunch of assumptions because you know something really well, and that the person that you're teaching is not going to have that context, they're not going to have all those assumptions baked in, so you have to actually explain that stuff out. Some of my instruction has been purely online video courses through, like, Pluralsight; less of a feedback loop there. I have to publish the entire course, and then I started getting feedback, so I really enjoy doing live trainings as well because then I get the questions right away.And I always insist, like, if I'm delivering a lecture, and you have a question, please don't wait for the end. Please interrupt me immediately because you're going to forget what that question is, you're going to lose your train of thought, and then you're not going to ask it. And the whole class benefits when someone asks a question, and I benefit too. I learn how to explain that concept better. So, I really enjoy the live setting, but making the video courses is kind of nice, too.Corey: I learned to speak publicly and give conference talks as a traveling contract trainer for Puppet years ago, and that was an eye-opening experience, just because you don't really understand something until you're teaching other people how it works. It's how I learned Git. I gave a conference talk that explained Git to people, and that was called a forcing function because I had four months to go to learn this thing I did not fully understand and welp, they're not going to move the conference for me, so I guess I'd better hustle. I wouldn't necessarily recommend that approach. 
These days, it seems like you have a, let's say, disproportionate level of focus on the area of Infrastructure as Code, specifically you seem to be aiming at Terraform. Is that an accurate way of describing it?Ned: That is a very accurate way of describing it. I discovered Terraform while I was doing my consulting back in 2016 era, so this was pretty early on in the product's lifecycle. But I had been using CloudFormation, and at that time, CloudFormation only supported JSON, which meant it was extra punishing. And being able to describe something more succinctly and also have access to all these functions and loops and variables, I was like, “This is amazing. Where were you a year ago?” And so, I really just jumped in with both feet into Terraform.And at a certain point, I was at a conference, and I went past the Pluralsight booth, and they mentioned that they were looking for instructors. And I thought to myself, well, I like talking about things, and I'm pretty excited about this Terraform thing. Why don't I see if they're looking for someone to do a Terraform course? And so, I went through their audition process and sure enough, that is exactly what they were looking for. They had no getting started course for Terraform at the time. I published the course in 2017, and it has been in the top 50 courses ever since on Pluralsight. So, that told me that there's definitely an appetite and maybe this is an area I should focus on a little bit more.Corey: It's a difficult area to learn. About two months ago, I started using Terraform for the first time in anger in ages. I mean, I first discovered it when I was on my way back from one of those Puppet trainings, and the person next to me was really excited about this thing that we're about to launch. Turns out that was Mitchell Hashimoto and Armon was sitting next to him on the other side. Why he had a middle seat, I'll never know.But it was a really fun conversation, just talking about how he saw the world and what he was planning on doing. And a lot of that vision was realized. What I figured out a couple months ago is both that first, I'm sort of sad that Terraform is as bad as it is, but it's the best option we've got because everything else is so much worse. It is omnipresent, though. Effectively, every client I've ever dealt with on AWS billing who has a substantial estate is managing it via Terraform.It is the lingua franca of cloud across the board. I just wish it didn't require as much care and feeding, especially for the getting-started-with-a-boilerplate type of scenario. So, much of what you type feels like it's useless stuff that should be implicit. I understand why it's not, but it feels that way. It's hard to learn.Ned: It certainly can be. And you're right, there's a certain amount of boilerplate and [sigh] code that you have to write that seems pointless. Like, do I have to actually spell this all out? And sometimes the answer is yes, and sometimes the answer is you should use a module for that. Why are you writing this entire VPC configuration out yourself? And that's the sort of thing that you learn over time is that there are shortcuts, there are ways to make the code simpler and require less care and feeding.But I think ultimately, your infrastructure, just like your software, evolves, changes has new requirements, and you need to manage it in the same way that you want to manage your software. And I wouldn't tell a software developer, “Oh, you know, you could just write it once and never go back to it. 
I'm sure it's fine.” And by the same token, I wouldn't tell an infrastructure developer the same thing. Now, of course, people do that and never go back and touch it, and then somebody else inherits that infrastructure and goes, “Oh, God. Where's the state data?” And no one knows, and then you're starting from scratch. But hopefully, if you have someone who's doing it responsibly, they'll be setting up Terraform in such a way that it is maintainable by somebody else.Corey: I'd sure like to hope so. I have encountered so many horrible examples of code and wondering what malicious person wrote this. And of course, it was me, 6 or 12 months ago.Ned: Always [laugh].Corey: I get to play architect around a lot of these things. In fact, that's one of the problems that I've had historically with an awful lot of different things that I've basically built, called it feature complete, let it sit for a while using the CDK or whatnot, and then oh, I want to make a small change to it. Well, first, I got to spend half a day during the entire line dependency updates and seeing what's broken and how all of that works. It feels like for better or worse, Terraform is a lot more stable than that, as in, old versions of Terraform code from blog posts from 2016 will still effectively work. Is that accurate? I haven't done enough exploring in that direction to be certain.Ned: The good thing about Terraform is you can pin the version of various things that you're using. So, if you're using a particular version of the AWS provider, you can pin it to that specific version, and it won't automatically upgrade you to the latest and greatest. If you didn't do that, then you'll get bit by the update bug, which certainly happens to some folks when they changed the provider from version 3 to version 4 and completely changed how the S3 bucket object was created. A lot of people's scripts broke that day, so I think that was the time for everyone to learn what the version argument is and how it works. But yeah, as long as you follow that general convention of pinning versions of your modules and of your resource provider, you should be in a pretty stable place when you want to update it.Corey: Well, here's the $64,000 question for you, then. Does Dependabot on your GitHub repo begin screaming at you as soon as you've done that because in one of its dependencies in some particular weird edge cases when they're dealing with unsanitized, internet-based input could wind up taking up too many system resources, for example? Which is, I guess, in an ideal world, it wouldn't be an issue, but in practice, my infrastructure team is probably not trying to attack the company from the inside. They have better paths to get there, to be very blunt.Ned: [laugh].Corey: Turns out giving someone access to a thing just directly is way easier than making them find it. But that's been one of the frustrating parts where, especially when it encounters things like, I don't know, corporate security policies of, “Oh, you must clear all of these warnings,” which well-intentioned, poorly executed seems to be the takeaway there.Ned: Yeah, I've certainly seen some implementations of tools that do static scanning of Terraform code and will come up with vulnerabilities or violations of best practice, then you have to put exceptions in there. 
And sometimes it'll be something like, “You shouldn't have your S3 bucket public,” which in most cases, you shouldn't, but then there's the one team that's actually publishing a front-facing static website in the S3 bucket, and then they have to get, you know, special permission from on high to ignore that warning. So, a lot of those best practices that are in the scanning tools are there for very good reasons and when you onboard them, you should be ready to see a sea of red in your scan the first time and then look through that and kind of pick through what's actually real, and we should improve in our code, and what's something that we can safely ignore because we are intentionally doing it that way.Corey: I feel like there's an awful lot of… how to put this politely… implicit dependencies that are built into things. I'll wind up figuring out how to do something by implementing it and that means I will stitch together an awful lot of blog posts, things I found on Stack Overflow, et cetera, just like a senior engineer and also Chat-Gippity will go ahead and do those things. And then the reason—like, someone asks me four years later, “Why is that thing there?” And… “Well, I don't know, but if I remove it, it might stop working, so…” there was almost a cargo-culting style of, well, it's always been there. So, is that necessary? Is it not?I'm ashamed by how often I learned something very fundamental in a system that I've been using for 20 years—namely, the command line—just by reading the man page for a command that I already, quote-unquote, “Already know how to use perfectly well.” Yeah, there's a lot of hidden gems buried in those things.Ned: Oh, my goodness, I learned something about the Terraform CLI last week that I wish I'd known two years ago. And it's been there for a long time. It's like, when you want to validate your code with the terraform validate, you can initialize without initializing the back-end, and for those who are steeped in Terraform, that means something and for everybody else, I'm sorry [laugh]. But I discovered that was an option, and I was like, “Ahhh, this is amazing.” But to get back to the sort of dependency problems and understanding your infrastructure better—because I think that's ultimately what's happening when you have to describe something using Infrastructure as Code—is you discover how the infrastructure actually works versus how you thought it worked.If you look at how—and I'm going to go into Azure-land here, so try to follow along with me—if you go into Azure-land and you look at how they construct a load balancer, the load balancer is not a single resource. It's about eight different resources that are all tied together. And AWS has something similar with how you have target groups, and you have the load balancer component and the listener and the health check and all that. Azure has the same thing. There's no actual load balancer object, per se.There's a bunch of different components that get slammed together to form that load balancer. When you look in the portal, you don't see any of that. You just see a load balancer, and you might think this is a very simple resource to configure. When it actually comes time to break it out into code, you realize, oh, this is eight different components, each of which has its own options and arguments that I need to understand. 
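[Editor's note: the AWS flavor of the point Ned is making about load balancers: what a diagram draws as one box decomposes into several Terraform resources. A rough sketch only; the names, ports, and variables are placeholders.]

```hcl
variable "vpc_id" { type = string }
variable "public_subnet_ids" { type = list(string) }

# The "load balancer" from the architecture diagram is really three resources
# (more once you add security groups, certificates, and extra listener rules).
resource "aws_lb" "app" {
  name               = "example-alb"
  load_balancer_type = "application"
  subnets            = var.public_subnet_ids
}

resource "aws_lb_target_group" "app" {
  name     = "example-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  # The health check is its own block with its own arguments, not a checkbox.
  health_check {
    path                = "/healthz"
    healthy_threshold   = 2
    unhealthy_threshold = 5
  }
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.app.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}
```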
So, one of the great things that I have seen a lot of tooling pop up around is doing the import of existing infrastructure into Terraform by pointing the tool at a collection of resources—whatever they are—and saying, “Go create the Terraform code that matches that thing.” And it's not going to be the most elegant code out there, but it will give you a baseline for what all the settings actually are, and what the resource types are, and then you can tweak it as needed to add in input variables or remove some arguments that you're not using.Corey: Yeah, I remember when they first announced the importing of existing state. It's, wow, there's an awful lot of stuff that it can be aware of that I will absolutely need to control unless I want it to start blowing stuff away every time I run the—[unintelligible 00:15:51] supposedly [unintelligible 00:15:52] thing against it. And that wasn't a lot of fun. But yeah, this is the common experience of it. I was only recently reminded of a fact that I once knew and had forgotten: that a public versus private subnet in AWS is a human-based abstraction, not something that is implicit to the API or the way they envision subnets existing. Kind of nice, but also weird when you have to unlearn things that you thought you'd learned.Ned: That's a really interesting example because we think of them as very different things, and when we draw nice architecture diagrams there—these are the private subnets and these are the public ones. And when you actually go to create one using Terraform—or really another tool—there's no box that says ‘private' or ‘make this public.' It's just what does your route table look like? Are you sending that traffic out the internet gateway or are you sending it to some sort of NAT device? And how does traffic come back into that subnet? That's it. That's what makes it private versus public versus a database subnet versus any other subnet type you want to logically assign within AWS.Corey: Yeah. It's kind of fun when that stuff hits.Ned: [laugh].Corey: I am curious, as you look across the ecosystem, do you still see that learning Terraform is a primary pain point for, I guess, the modern era of cloud engineer, or has that sunk below the surface level of awareness in some ways?Ned: I think it's taken as a given to a certain degree that if you're a cloud engineer or an aspiring cloud engineer today, one of the things you're going to learn is Infrastructure as Code, and that Infrastructure as Code is probably going to be Terraform. You can still learn—there's a bunch of other tools out there; I'm not going to pretend like Terraform is the end-all be-all, right? We've got—if you want to use a general purpose programming language, you have something like Pulumi out there that will allow you to do that. If you want to use one of the cloud-native tools, you've got something like CloudFormation, or Azure has Bicep. Please don't use ARM templates because they hurt. They're still JSON only, so at least CloudFormation added YAML support in there. And while I don't really like YAML, at least it's not 10,000 lines of code to spin up, like, two domain controllers in a subnet.Corey: I personally wind up resolving the dichotomy between oh, should we go with JSON or should we go with YAML by picking the third option everyone hates more. That's why I'm a staunch advocate for XML.Ned: [laugh]. I was going to say XML. Yeah, as someone who dealt with SOAP stuff for a while, XML was particularly painful, so I'm not sad that went away. 
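[Editor's note: a bare-bones illustration of the route-table point from a moment ago, in AWS terms. The variables stand in for an existing VPC, internet gateway, and NAT gateway; treat it as a sketch rather than a complete network module.]

```hcl
variable "vpc_id" { type = string }
variable "internet_gateway_id" { type = string }
variable "nat_gateway_id" { type = string }

# Nothing on the subnet itself says "public" or "private."
resource "aws_subnet" "example" {
  vpc_id     = var.vpc_id
  cidr_block = "10.0.1.0/24"
}

# Send 0.0.0.0/0 to an internet gateway and people call the subnet public...
resource "aws_route_table" "public" {
  vpc_id = var.vpc_id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = var.internet_gateway_id
  }
}

# ...send it to a NAT gateway instead and the very same subnet is private.
resource "aws_route_table" "private" {
  vpc_id = var.vpc_id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = var.nat_gateway_id
  }
}

# The association is the only thing that changes the label humans put on it.
resource "aws_route_table_association" "example" {
  subnet_id      = aws_subnet.example.id
  route_table_id = aws_route_table.public.id # swap to .private and nothing else moves
}
```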
JSON for me, I work with it better, but YAML is more readable. So, it's like, pick your poison on that. But yeah, there's a ton of infrastructure tools out there.They all have basically the same concepts behind them, the same core concepts because they're all deploying the same thing at the end of the day and there's only so many ways you can express that concept. So, once you learn one—say you learned CloudFormation first—then Terraform is not as big of a leap. You're still declaring stuff within a file and then having it go and make those things exist. It's just nuances between the implementation of Terraform versus CloudFormation versus Bicep.Corey: I wish that there were more straightforward abstractions, but I think that as soon as you get those, that inherently limits what you're able to do, so I don't know how you square that circle.Ned: That's been a really difficult thing: people want some sort of universal cloud or infrastructure language and abstraction. I just want a virtual machine. I don't care what kind of platform I'm on. Just give me a VM. But then you end up very much caring [laugh] what kind of VM, what operating system, what the underlying hardware is when you get to a certain level.So, there are some workloads where you're like, I just need it to run somewhere in a container and I really don't care about any of the underlying stuff. And that's great. That's what Platform as a Service is for. If that's your end goal, go use that. But if you're actually standing up infrastructure for any sort of enterprise company, then you need an abstraction that gives you access to all the underlying bits when you want them.So, if I want to specify different placement groups for my VM, I need access to that setting to create a placement group. And if I have this high-level abstraction of a virtual machine, it doesn't know what a placement group is, and now I'm stuck at that level of abstraction instead of getting down to the guts, or I'm going into the portal or the CLI and modifying it outside of the tool that I'm supposed to be using.Corey: I want to change gears slightly here. One thing that has really been roiling some very particular people with very specific perspectives has been the BSL license change that Terraform has wound up rolling out. So far, the people that I've heard who have the strongest opinions on it tend to fall into one of three categories: either they work at HashiCorp—fair enough—or they work at one of HashiCorp's direct competitors—which, yeah, okay, sure—or they tend to be—how to put this delicately—open-source evangelists, of which I freely admit I used to be one and then had other challenges I needed to chase down in other ways. So, I'm curious, as someone who is not really on the vendor side of this at all, how do you see it shaking out?Ned: Well, I mean, just for some context, essentially what HashiCorp decided to do was to change the licensing from the Mozilla Public License to the BSL for, I think, eight of their products, and Terraform was amongst those. And really, this sort of tells you where people are. The only one that anybody really made any noise about was Terraform. There's plenty of people that use Vault, but I didn't see a big brouhaha over the fact that Vault changed its licensing. It's really just about Terraform. 
Which tells you how important it is to the ecosystem.And if I look at the folks that are making the most noise about it, it's like you said, they basically fall into one of two camps: it's the open-source code purists who believe everything should be licensed in completely open-source ways, or at least if you start out with an open-source license, you can't convert to something else later. And then there is a smaller subset of folks who work for HashiCorp competitors, and they really don't like the idea of having to pay HashiCorp a regular fee for what used to be ostensibly free to them to use. And so, what they ended up doing was creating a fork of Terraform, just before the licensing change happened and that fork of Terraform was originally called OpenTF, and they had an OpenTF manifesto. And I don't know about you, when I see the word ‘manifesto,' I back away slowly and try not to make any sudden moves.Corey: You really get the sense there's going to be a body count tied to this. And people are like, “What about the Agile Manifesto?” “Yeah, what about it?”Ned: [laugh]. Yeah, I'm just—when I see ‘manifesto,' I get a little bit nervous because either someone is so incredibly passionate about something that they've kind of gone off the deep end a little bit, or they're being somewhat duplicitous, and they have ulterior motives, let's say. Now, I'm not trying to cast aspersions on anybody. I can't read anybody's mind and tell you exactly what their intention was behind it. I just know that the manifesto reads a little bit like an open-source purist and a little bit like someone having a temper tantrum, and vacillating between the two.But cooler heads prevailed a little bit, and now they have changed the name to OpenTofu, and it has been accepted by the Linux Foundation as a project. So, it's now a member of the Linux Foundation, with all the gravitas that that comes with. And some people at HashiCorp aren't necessarily happy about the Linux Foundation choosing to pull that in.Corey: Yeah, I saw a whole screed, effectively, that their CEO wound up brain-dumping on that frankly, from a messaging perspective, he would have been better served as not to say anything at all, to be very honest with you.Ned: Yeah, that was a bit of a yikes moment for me.Corey: It's very rare that you will listen yourself into trouble as opposed to opening your mouth and getting yourself into trouble.Ned: Exactly.Corey: You wouldn't think I would be one of those—of all people who would have made that observation, you wouldn't think I would be on that list, yet here I am.Ned: Yeah. And I don't think either side is entirely blameless. I understand the motivations behind HashiCorp wanting to make the change. I mean, they're a publicly traded company now and ostensibly that means that they should be making some amount of money for their investors, so they do have to bear that in mind. I don't necessarily think that changing the licensing of Terraform is the way to make that money.I think in the long-term, it's not going—it may not hurt them a lot, but I don't think it's going to help them out a lot, and it's tainted the goodwill of the community to a certain degree. On the other hand, I don't entirely trust what the other businesses are saying as well in their stead. 
So, there's nobody in this that comes out a hundred percent clean [laugh] on the whole process.Corey: Yeah, I feel like, to be direct, the direct competitors to HashiCorp along its various axes are not the best actors necessarily to complain about what is their largest competitor no longer giving them access to continue to compete against them with their own product. I understand the nuances there, but it also doesn't feel like they are the best ambassadors for that. I also definitely understand where HashiCorp is coming from where, why are we investing all this time, energy, and effort for people to basically take revenue away from us? But there's also the bigger problem, which is, by and large, compared to how many sites are running Terraform and the revenues that HashiCorp puts up for it, they're clearly failing to capture the value they have delivered in a massive way. But counterpoint, if they hadn't been open-source for their life until this point, would they have ever captured that market share? Probably not.Ned: Yeah, I think ultimately, the biggest competitor to their paid offering of Terraform is their free version of Terraform. It literally has enough bells and whistles already included and plenty of options for automating those things and solving the problems that their enterprise product solves that their biggest problem is not other competitors in the Terraform landscape; it's the, “Well, we already have something, and it's good enough.” And I'm not sure how you sell to that person, that's why I'm not in marketing, but I think that is their biggest competitor is the people who already have a solution and are like, “Why do I need to pay for your thing when my thing works well enough?”Corey: That's part of the strange thing that I'm seeing as I look across this entire landscape is it feels like this is not something that is directly going to impact almost anyone out there who's just using this stuff, either the open-source version as a paying customer of any of these things, but it is going to kick up a bunch of dust. And speaking of poor messaging, HashiCorp is not really killing it this quarter, where the initial announcement led to so many questions that were unclear, such as—like, they fixed this later in the frequently asked questions list, but okay, “I'm using Terraform right now and that's fine. I'm building something else completely different. Am I going to lose my access to Terraform if you decide to launch a feature that does what my company does?” And after a couple of days, they put up an indemnity against that. Okay, fine.Like, when Mongo did this, there was a similar type of dynamic that was emerging, but a lot fewer people are writing their own database engine to then sell onward to customers that are provisioning infrastructure on behalf of their customers. And where the boundaries lay for who was considered a direct Terraform competitor was unclear. I'm still not convinced that it is clear enough to bet the business on for a lot of these folks. It comes down to say what you mean, not—instead of hedging, you're not helping your cause any.Ned: Yeah, I think out of the different products that they have, some are very clear-cut. Like, Vault is a server that runs as a service, and so that's very clear what that product is and where the lines of delineation are around Vault. If I go stand up a bunch of Vault servers and offer them as a service, then that is clearly a competitor. 
But if I have an automation pipeline service and people can technically automate Terraform deployments with my service, even if that's not the core thing that I'm looking to do, am I now a competitor? Like, it's such a fuzzy line because Terraform isn't an application, it's not a server that runs somewhere, it's a CLI tool and a programming language. So yeah, those lines are very, very fuzzy. And I… like I said, it would be better if they said what they meant, as opposed to sort of the mealy-mouthed language that they ended up using and the need to publish multiple revisions of that FAQ to clarify their position on very specific niche use cases.Corey: Yeah, I'm not trying to be difficult or insulting or anything like that. These are hard problems that everyone involved is wrestling with. It just felt a little off, and I think the messaging did them no favors when that wound up hitting. And now, everyone is sort of trying to read the tea leaves and figure out what this means because in isolation, it doesn't mean anything. It is a forward-looking thing.Whatever it is you're doing today, no changes are needed for you, until the next version comes out, in which case, okay, now do we incorporate the new thing or don't we? Today, to my understanding, whether I'm running Terraform or OpenTofu entirely comes down to which binary am I invoking to do the apply? There is no difference of which I am aware. That will, of course, change, but today, I don't have to think about that.Ned: Right. OpenTofu is a literal fork of Terraform, and they haven't really added much in the way of features, so it should be completely compatible with Terraform. The two will diverge in the future as new features get added to each one. But yeah, for folks who are using it today, they might just decide to stay on the pre-fork version and stay on that for years. I think HashiCorp has pledged 18 months of support for any minor version of Terraform, so you've got at least a year-and-a-half to decide. And we were kind of talking before the recording, 99% of people using Terraform do not care about this. It does not impact their daily workflow.Corey: No. I don't see customers caring at all. And also, “Oh, we're only going to use the pre-fork version of Terraform,” they're like, “Thanks for the air cover because we haven't updated any of that stuff in five years, so tha”—Ned: [laugh].Corey: “Oh yeah, we're doing it out of license concern. That's it. That's the reason we haven't done anything recent with it.” Because once it's working, changes are scary.Ned: Yeah.Corey: Terraform is one of those scary things, right next to databases, that if I make a change that I don't fully understand—and no one understands everything, as we've covered—then this could really ruin my week. So, I'm going to be very cautious around that.Ned: Yeah, if metrics are to be believed across the automation platforms, once an infrastructure rollout happens with a particular version of Terraform, that version does not get updated. For years. So, I have it on good authority that there's still Terraform version 0.10 and 0.11 running on these automation platforms for really old builds where people are too scared to upgrade to, like, post 0.12 where everything changed in the language.I believe that. People don't want to change it, especially if it's working. And so, for most people, this licensing change doesn't matter. 
And all the constant back and forth and bickering just makes people feel a little nervous, and it might end up pushing people away from Terraform as a platform entirely, as opposed to picking a side.Corey: Yeah, and I think that that is probably the fair way to view it at this point where right now—please, friends at HashiCorp and HashiCorp competitors don't yell at me for this—it's basically a nerd slap-fight at the moment.Ned: [laugh].Corey: And one of the big reasons that I also stay out of these debates almost entirely is that I married a corporate attorney who used to be a litigator, and I get frustrated whenever it comes down to license arguments because you suddenly see a bunch of engineers who get to cosplay as lawyers, and reading the comments is infuriating once you realize how a little bit of this stuff works, which I've had 15 years of osmotic learning on. Whenever I want to upset my wife, I just read some of these comments aloud and then our dinner conversation becomes screaming. It's wonderful.Ned: Bad legal takes? Yeah, before—Corey: Exactly.Ned: Before my father became a social studies teacher, he was a lawyer for 20 years, and so I got to absorb some of the thought process of the lawyer. And yeah, I read some of these takes, and I'm like, “That doesn't sound right. I don't think that would hold up in any court of law.” Though a lot of the open-source licensing I don't think has been tested in any sort of court of law. It's just kind of like, “Well, we hope this stands up,” but nobody really has the money to check.Corey: Yeah. This is the problem with these open-source licenses as well. Very few have ever been tested in any meaningful way because I don't know about you, but I don't have a few million dollars in legal fees lying around to prove the point.Ned: Yeah.Corey: So, it's one of those “we think this is sustainable” situations, and Lord knows, given the number of companies that have come to rely on these licenses, they're probably right. I'm certainly not going to disprove the fact—please don't sue me—but yeah, this is one of those things that we're sort of assuming is the case, even if it's potentially not. I really want to thank you for taking the time to discuss how it is you view these things and talk about what it is you're up to. If people want to learn more, where's the best place for them to find you?Ned: Honestly, just go to my website. It's nedinthecloud.com. And you can also find me on LinkedIn. I don't really go for Twitter anymore.Corey: I envy you. I wish I could wean myself off of it. But we will, of course, include a link to that in the show notes. Thank you so much for being so generous with your time. It's appreciated.Ned: It's been a pleasure. Thanks, Corey.Corey: Ned Bellavance, founder and curious human at Ned in the Cloud. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that I will then fork under a different license and claim as my own.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Creating Value in Incident Management with Robert Ross

Screaming in the Cloud

Play Episode Listen Later Dec 5, 2023 35:09


Robert Ross, CEO and Co-Founder at FireHydrant, joins Corey on Screaming in the Cloud to discuss how being an on-call engineer fighting incidents inspired him to start his own company. Robert explains how FireHydrant does more than just notify engineers of an incident, but also helps them to be able to effectively put out the fire. Robert tells the story of how he “accidentally” started a company as a result of a particularly critical late-night incident, and why his end goal at FireHydrant has been and will continue to be solving the problem, not simply choosing an exit strategy. Corey and Robert also discuss the value and pricing models of other incident-reporting solutions and Robert shares why he feels surprised that nobody else has taken the same approach FireHydrant has. About RobertRobert Ross is a recovering on-call engineer, and the CEO and co-founder at FireHydrant. As the co-founder of FireHydrant, Robert plays a central role in optimizing incident response and ensuring software system reliability for customers. Prior to founding FireHydrant, Robert previously contributed his expertise to renowned companies like Namely and Digital Ocean. Links Referenced: FireHydrant: https://firehydrant.com/ Twitter: https://twitter.com/bobbytables TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Developers are responsible for more than ever these days. Not just the code they write, but also the containers and cloud infrastructure their apps run on. And a big part of that responsibility is app security — from code to cloud. That's where Snyk comes in. Snyk is a frictionless security platform that meets teams where they are, automating application security controls across their existing tools, workflows, and the AWS application stack — including seamless integrations with AWS CodePipeline, Amazon EKS, Amazon Inspector and several others. I'm a customer myself. Deploy on AWS. Secure with Snyk. Learn more at snyk.co/scream. That's S-N-Y-K-dot-C-O/scream.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. And this featured guest episode is brought to us by our friends at FireHydrant and for better or worse, they've also brought us their CEO and co-founder, Robert Ross, better known online as Bobby Tables. Robert, thank you for joining us.Robert: Super happy to be here. Thanks for having me.Corey: Now, this is the problem that I tend to have when I've been tracking companies for a while, where you were one of the only people that I knew of at FireHydrant. And you kind of still are, so it's easy for me to imagine that, oh, it's basically your own side project that turned into a real job, sort of, side hustle that's basically you and maybe a virtual assistant or someone. I have it on good authority—and it was also signaled by your Series B—that there might be more than just you over there now.Robert: Yes, that's true. There's a little over 60 people now at the company, which is a little mind-boggling for me, starting from side projects, building this in Starbucks to actually having people using the thing and being on payroll. So, a little bit of a crazy thing for me. But yes, over 60.Corey: So, I have to ask, what is it you folks do? 
When you say ‘fire hydrant,' the first thing that I think I was when I was a kid getting yelled at by the firefighter for messing around with something I probably shouldn't have been messing around with.Robert: So, it's actually very similar where I started it because I was messing around with software in ways I probably shouldn't have and needed a fire hydrant to help put out all the fires that I was fighting as an on-call engineer. So, the name kind of comes from what do you need when you're putting out a fire? A fire hydrant. So, what we do is we help people respond to incidents really quickly, manage them from ring to retro. So, the moment you declare an incident, we'll do all the timeline tracking and eventually help you create a retrospective at the very end. And it's been a labor of love because all of that was really painful for me as an engineer.Corey: One of the things that I used to believe was that every company did something like this—and maybe they do, maybe they don't—I'm noticing these days an increasing number of public companies will never admit to an incident that very clearly ruined things for their customers. I'm not sure if they're going to talk privately to customers under NDAs and whatnot, but it feels like we're leaving an era where it was an expectation that when you had a big issue, you would do an entire public postmortem explaining what had happened. Is that just because I'm not paying attention to the right folks anymore, or are you seeing a downturn in that?Robert: I think that people are skittish of talking about how much reliability they—or issues they may have because we're having this weird moment where people want to open more incidents like the engineers actually want to say we have more incidents and officially declare those, and in the past, we had these, like, shadow incidents that we weren't officially going to say it was an incident, but was a pretty big deal, but we're not going to have a retro on it so it's like it didn't happen. And kind of splitting the line between what's a SEV1, when should we actually talk about this publicly, I think companies are still trying to figure that out. And then I think there's also opposing forces. We talk to folks and it's, you know, public relations will sometimes get involved. My general advice is, like, you should be probably talking about it no matter what. That's how you build trust.It's trust, with incidences, lost in buckets and gained back in drops, so you should be more public about it. And I think my favorite example is a major CDN had a major incident and it took down, like, the UK government website. And folks can probably figure out who I'm talking about, but their stock went up the next day. You would think that a major incident taking down a large portion of the internet would cause your stock to go down. Not the case. They were on it like crazy, they communicated about it like crazy, and lo and behold, you know, people were actually pretty okay with it as far as they could be at the end of the day.Corey: The honest thing that really struck me about that was I didn't realize that CDN that you're referencing was as broadly deployed as it was. Amazon.com took some downtime as a result of this.Robert: Yeah.Corey: It's, “Oh, wow. If they're in that many places, I should be taking them more seriously,” was my takeaway. And again, I don't tend to shame folks for incidents because as soon as you do that, they stopped talking about them. 
They still have them, but then we all lose the ability to learn from them. I couldn't help but notice that the week we're recording this, there was an incident report put out by AWS for a Lambda service event in Northern Virginia.It happened back in June; we're recording this late in October. So, it took them a little bit of time to wind up getting it out the door, but it's very thorough, very interesting as far as what it reveals about their own approach to things. Because otherwise, I have to say, it is easy as a spectator slash frustrated customer to assume the absolute worst. Like, you're sitting around there and like, “Well, we have a 15-minute SLA on this, so I'm going to sit around for 12 minutes and finish my game of solitaire before I answer the phone.” No, it does not work that way. People are scrambling behind the scenes because as systems get more complicated, understanding the interdependencies of your own system becomes monstrous.I still remember some of the very early production engineering jobs that I had where—to what you said a few minutes ago—oh, yeah, we'll just open an incident for every alert that goes off. Then we dropped a [core switch 00:05:47] and Nagios sent something like 8000 messages inside of two minutes. And we would still, 15 years later, not be done working through that incident backlog had we done such a thing. All of this stuff gets way harder than you would expect as soon as your application or environment becomes somewhat complicated. And that happens before you realize it.Robert: Yeah, much faster. I think that, in my experience, there's a moment that happens for companies where maybe it's the number of customers you have or the number of servers you're running in production, that you have this, like, “Oh, we're running a big workload right now in a very complex system that impacts people's lives, frankly.” And the moment that companies realize that is when you start to see, like, oh, process change, you build it, you own it, now we have an SRE team. Like, there's this catalyst that happens in all of these companies that triggers this. And it's—I don't know, from my perspective, it's coming at a faster rate than people probably realize.Corey: From my perspective, I have to ask you this question, and my apologies in advance if it's one of those irreverent ones, but do you consider yourself to be an observability company?Robert: Oh, great question. No. No, actually. We think that we are the baton handoff between an observability tool and our platform. So, for example, we think that that's a good way to kind of, you know, as they say, monitor the system, give reports on that system, and we are the tool for when that monitor goes off and you need to do something about it.So, for example, I think of it as like a smoke detector in some cases. Like, in our world, like that's—the smoke detector is the thing that's kind of watching the system and if something's wrong, it's going to tell you. But at that point, it doesn't really do anything that's going to help you in the next phase, which is managing the incident, calling 911, driving to the scene of the fire, whatever analogies you want to use. 
But I think the value-add for the observability tools and what they're delivering for businesses is different than ours, but we touch each other, like, very much so.Corey: Managing an incident when something happens and diagnosing what is the actual root cause of it, so to speak—quote-unquote, “Root cause.” I know people have very strong opinions on—Robert: Yeah, say the word [laugh].Corey: —that phrase—exactly—it just doesn't sound that hard. It is not that complicated. It's, more or less, a bunch of engineers who don't know what they're actually doing, and why are they running around chasing this stuff down is often the philosophy of a lot of folks who have never been in the trenches dealing with these incidents themselves. I know this because before I was exposed to scale, that's what I thought and then, oh, this is way harder than you would believe. Now, for better or worse, an awful lot of your customers and the executives at those customers did, for some strange reason, not come up through production engineering as the thing that they've done. They are executives, so it feels like it would be a challenging conversation to have with them, but one thing that you've got in your back pocket, which I always love talking to folks about, is before this, you were an engineer and then you became a CEO of a reasonably-sized company. That is a very difficult transition. Tell me about it.Robert: Yeah. Yeah, so a little of that background. I mean, I started writing code—I've been writing code for two-thirds of my life. So, I'm 32 now; I'm relatively young. And my first job out of high school—skipping college entirely—was writing code. I was 18, I was working in a web dev shop, I was making good enough money and I said, you know what? I don't want to go to college. That sounds—I'm making money. Why would I go to college?And I think it was a good decision because I got to be able—I was right kind of in the centerpiece of when a lot of really cool software things were happening. Like, DevOps was becoming a really cool term and we were seeing the cloud kind of emerge at this time and become much more popular. And it was a good opportunity to see all this confluence of technology and people and processes emerge into what is, kind of like, the base plate for a lot of how we build software today, starting in 2008 and 2009. And because I was an on-call engineer during a lot of that, and building the systems as well, that I was on call for, it meant that I had a front-row seat to being an engineer that was building things that was then breaking, and then literally merging on GitHub and then five minutes later [laugh], seeing my phone light up with an alert from our alerting tool. Like, I got to feel the entire process.And I think that that was nice because eventually one day, I snapped. And it was after a major incident, I snapped and I said, “There's no tool that helps me during this incident. There's no tool that kind of helps me run a process for me.” Because the only thing I care about in the middle of the night is going back to bed. I don't have any other priority [laugh] at 2 a.m.So, I wanted to solve the problem of getting to the fire faster and extinguishing it by automating as much as I possibly could. 
The process that was given to me in an outdated Confluence page or Google Doc, whatever it was, I wanted to automate that part so I could do the thing that I was good at as an engineer: put out the fire, take some notes, and then go back to bed, and then do a retrospective sometime next day or in that week. And it was a good way to kind of feel the problem, try to build a solution for it, tweak a little bit, and then it kind of became a company. I joke and I say on accident, actually.Corey: I'll never forget one of the first big, hairy incidents that I had to deal with in 2009, where my coworker had just finished migrating the production environment over to LDAP on a Thursday afternoon and then stepped out for a three-day weekend, and half an hour later, everything started exploding because LDAP will do that. And I only had the vaguest idea of how LDAP worked at all. This was a year into my first Linux admin job; I'd been a Unix admin before that. And I suddenly have the literal CEO of the company breathing down my neck behind me trying to figure out what's going on and I have no freaking idea of myself. And it was… feels like there's got to be a better way to handle these things.We got through. We wound up getting it back online, no one lost their job over it, but it was definitely a touch-and-go series of hours there. And that was a painful thing. And you and I went in very different directions based upon experiences like that. I took a few more jobs where I had even worse on-call schedules than I would have believed possible until I started this place, which very intentionally is centered around a business problem that only exists during business hours. There is no 2 a.m. AWS billing emergency.There might be a security issue masquerading as one of those, but you don't need to reach me out of business hours because anything that is a billing problem will be solved in Seattle's timeline over a period of weeks. You leaned into it and decided, oh, I'm going to start a company to fix all of this. And okay, on some level, some wit that used to work here, wound up once remarking that when an SRE doesn't have a better idea, they start a monitoring company.Robert: [laugh].Corey: And, on some level, there's some validity to it because this is the problem that I know, and I want to fix it. But you've differentiated yourself in a few key ways. As you said earlier, you're not an observability company. Good for you.Robert: Yeah. That's a funny quote.Corey: Pete Cheslock. He has a certain way with words.Robert: Yeah [laugh]. I think that when we started the company, it was—we kind of accidentally secured funding five years ago. And it was because this genuinely was something I just, I bought a laptop for because I wanted to own the IP. I always made sure I was on a different network, if I was going to work on the company and the tool. And I was just writing code because I just wanted to solve the problem.And then some crazy situation happened where, like, an investor somehow found FireHydrant because they were like, “Oh, this SRE thing is a big space and incidents is a big part of it.” And we got to talking and they were like, “Hey, we think what you're building is valuable and we think you should build a company here.” And I was—like, you know, the Jim Carrey movie, Yes Man? Like, that was kind of me in that moment. I was like, “Sure.” And here we are five years later. 
But I think the way that we approached the problem was let's just solve our own problem and let's just build a company that we want to work at.And you know, I had two co-founders join me in late 2018 and that's what we told ourselves. We said, like, “Let's build a company that we want to work for, that solves problems that we have had, that we care about solving.” And I think it's worked out, you know? We work with amazing companies that use our tool—much to their chagrin [laugh]—multiple times a day. It's kind of a problem when you build an incident response tool is that it's a good thing when people are using it, but a bad thing for them.Corey: I have to ask of all of the different angles to approach this from, you went with incident management as opposed to focusing on something that is more purely technical. And I don't say that in any way that is intended to be sounding insulting, but it's easier from an engineering mind to—having been one myself—to come up with, “Here's how I make one computer talk to his other computer when the following event happens.” That's a much easier problem by orders of magnitude than here's how I corral the humans interacting with that computer's failure to talk to another computer in just the right way. How did you get onto this path?Robert: Yeah. The problem that we were trying to solve for it was the getting the right people in the room problem. We think that building services that people own is the right way to build applications that are reliable and stable and easier to iterate on. Put the right people that build that software, give them, like, the skin in the game of also being on call. And what that meant for us is that we could build a tool that allowed people to do that a lot easier where allowing people to corral the right people by saying, “This service is broken, which powers this functionality, which means that these are the people that should get involved in this incident as fast as possible.”And the way we approached that is we just built up part of our functionality called Runbooks, where you can say, “When this happens, do this.” And it's catered for incidents. So, there's other tools out there, you can kind of think of as, like, we're a workflow tool, like Zapier, or just things that, like, fire webhooks at services you build and that ends up being your incident process. But for us, we wanted to make it, like, a really easy way that a project manager could help define the process in our tool. And when you click the button and say, “Declare Incident: LDAP is Broken,” and I have a CEO standing behind me, our tool just would corral the people for you.It was kind of like a bat signal in the air, where it was like, “Hey, there's this issue. I've run all the other process. I just need you to arrive at and help solve this problem.” And we think of it as, like, how can FireHydrant be a mech suit for the team that owns incidents and is responsible for resolving them?Corey: There are a few easier ways to make a product sound absolutely ridiculous than to try and pitch it to a problem that it is not designed to scale to. What is the ‘you must be at least this tall to ride' envisioning for FireHydrant? How large slash complex of an organization do you need to be before this starts to make sense? Because I promise, as one person with a single website that gets no hits, that is probably not the best place for—Robert: Probably not.Corey: To imagine your ideal user persona.Robert: Well, I'm sure you get way more hits than that. 
Come on [laugh].Corey: It depends on how controversial I'm being in a given week.Robert: Yeah [laugh].Corey: Also, I have several ridiculous, nonsense apps out there, but honestly, those are for fun. I don't charge people for them, so they can deal with my downtime till I get around to it. That's the way it works.Robert: Or, like, spite-visiting your website. No it's—for us, we think that the ‘must be this tall' is when do you have, like, sufficiently complicated incidents? We tell folks, like, if you're a ten-person shop and you have incidents, you know, just use our free tier. Like, you need something that opens a Slack channel? Fine. Use our free tier or build something that hits the Slack API [unintelligible 00:18:18] channel. That's fine.But when you start to have a lot of people in the room and multiple pieces of functionality that can break and multiple people on call, that's when you probably need to start to invest in incident management. Because it is a return on investment, but there is, like, a minimum amount of incidents and process challenges that you need to have before that return on investment actually, I would say, comes to fruition. Because if you do think of, like, an incident that takes downtime, or you know, you're a retail company and you go down for, let's say, ten minutes, and your number of sales per hour is X, it's actually relatively simple for that type of company to understand, okay, this is how much impact we would need to have from an incident management tool for it to be valuable. And that waterline is actually way—it's way lower than I think a lot of people realize, but like you said, you know, if you have a few 100 visitors a day, it's probably not worth it. And I'll be honest there, you can use our free tier. That's fine.Corey: Which makes sense. It's challenging to wind up-sizing things appropriately. Whenever I look at a pricing page, there are two things that I look for. And incidentally, when I pull up someone's website, I first make a beeline for pricing because that is the best way I found for a lot of the marketing nonsense words to drop away and it get down to brass tacks. And the two things I want are free tier or zero-dollar trial that I can get started with right now because often it's two in the morning and I'm trying to see if this might solve a problem that I'm having.And I also look for the enterprise tier ‘contact us' because there are big companies that do not do anything that is not custom nor do they know how to sign a check that doesn't have two commas in it. And whatever is between those two, okay, that's good to look at to figure out what dimensions I'm expected to grow on and how to think about it, but those are the two tent poles. And you've got that, but pricing is always going to be a dark art. What I've been seeing across the industry. And if we put it under the broad realm of things that watch your site and alert you and help manage those things, there are an increasing number of, I guess what I want to call component vendors, where you'll wind up bolting together a couple dozen of these things together into an observability pipeline-style thing, and each component seems to be getting extortionately expensive.Most of the wake-up-in-the-middle-of-the-night services that will page you—and there are a number of them out there—at a spot check of these, they all cost more per month per user than Slack, the thing that most of us to end up living within. 
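Robert's free-tier aside above, that a ten-person shop can just "build something that hits the Slack API" to open a channel, is small enough to sketch. A minimal, hypothetical version in Python might look like this; the bot token, channel naming, and responder list are placeholder assumptions, and the Slack Web API methods shown (conversations.create, conversations.invite, chat.postMessage) are the generic ones, not anything specific to FireHydrant.

```python
# Minimal sketch: open a dedicated Slack channel when an incident is declared.
# Assumes a Slack bot token (placeholder env var) with channel-management and
# chat:write scopes; names below are illustrative, not FireHydrant's implementation.
import os
import time
import requests

SLACK_API = "https://slack.com/api"
HEADERS = {"Authorization": f"Bearer {os.environ['SLACK_BOT_TOKEN']}"}


def declare_incident(title: str, responders: list[str]) -> str:
    """Create an incident channel, invite responders, and post a summary."""
    channel_name = f"inc-{int(time.time())}"  # e.g. inc-1700000000

    # 1. Open a dedicated channel for the incident.
    created = requests.post(
        f"{SLACK_API}/conversations.create",
        headers=HEADERS,
        json={"name": channel_name},
    ).json()
    channel_id = created["channel"]["id"]

    # 2. Pull in the people who own the affected service.
    requests.post(
        f"{SLACK_API}/conversations.invite",
        headers=HEADERS,
        json={"channel": channel_id, "users": ",".join(responders)},
    )

    # 3. Post the context everyone needs when they arrive groggy-eyed.
    requests.post(
        f"{SLACK_API}/chat.postMessage",
        headers=HEADERS,
        json={"channel": channel_id, "text": f":fire: {title}: runbook steps to follow."},
    )
    return channel_id
```

That covers the bat-signal step. Everything around it (on-call schedules, retries, escalation, the retrospective) is the part that gets tedious to build and maintain yourself, which is where a dedicated tool earns its keep.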
This stuff gets fiendishly expensive, fiendishly quickly, and at some point, you're looking at this going, “The outage is cheaper than avoiding the outage through all of these things. What are we doing here?” What's going on in the industry, other than ‘money printing machine stopped going brrr' in quite the same way?Robert: Yeah, I think that for alerting specifically, this is a big part of, like, the journey that we wanted to have in FireHydrant was like, we also want to help folks with the alerting piece. So, I'll focus on that, which is, I think that the industry around notifying people for incidents—texts, call, push notifications, emails, there's a bunch of different ways to do it—I think where it gets really crazy expensive as in this per-seat model that most of them seem to have landed on. And we're per-seat for, like, the core platform of FireHydrant—so you know, before people spite-visit FireHydrant, look at our pricing pitch—but we're per-seat there because the value there is, like, we're the full platform for the service catalog retrospectives, Runbooks, like, there's a whole other component of FireHydrant—status pages—but when it comes to alerting, like, in my opinion, that should be active user for a few reasons. I think that if you're going to have people responding to incidents and the value from us is making sure they get to that incident very quickly because we wake them up in the middle of the night, we text them, we call them we make their Hue lights turn red, whatever it is, then that's, like, the value that we're delivering at that moment in time, so that's how we should probably invoice you.And I think that what's happened is that the pricing for these companies, they haven't innovated on the product in a way that allows them to package that any differently. So, what's happened, I think, is that the packaging of these products has been almost restrictive in the way that they could change their pricing models because there's nothing much more to package on. It's like, cool there's an alerting aspect to this, but that's what people want to buy those tools for. They want to buy the tool so it wakes them up. But that tool is getting more expensive.There was even a price increase announced today for a big one [laugh] that I've been publicly critical of. That is crazy expensive for a tool that texts you and call you. And what peo—what's going on now are people are looking, they're looking at the pricing sheet for Twilio and going, “What the heck is going on?” Like, I—to send a text on Twilio in the United States is fractions of a penny and here we are paying $40 a user for that person to receive six texts that month because of a webhook that hit an HCP server and, like, it's supposed to call that person? That's kind of a crazy model if you think about it. Like, engineers are kind of going, “Wait a minute. What's up here?” Like, and when engineers start thinking, “I could build this on a weekend,” like, something's wrong, like, with that model. And I think that people are starting to think that way.Corey: Well engineers, to be fair, will think that about an awful lot of stuff.Robert: Anything. Yeah, they [laugh]—Corey: I've heard it said about Dropbox, Facebook, the internet—Robert: Oh, Dropbox is such a good one.Corey: BGP. Yeah okay, great. Let me know how that works out for you.Robert: What was that Dropbox comment on Hacker News years ago? Like, “Just set up NFS and host it that way and it's easy.” Right?Corey: Or rsync. 
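A rough back-of-the-envelope version of the Twilio comparison just made: the per-seat price, text count, and per-SMS rate below are illustrative assumptions drawn from the conversation (roughly $40 per user per month against well under a cent per outbound US text), not quotes from any vendor's current price sheet.

```python
# Rough comparison of per-seat alerting pricing to the raw cost of the
# notifications actually delivered. Every number here is an assumption
# taken from the conversation, not a real price list.
SEAT_PRICE_PER_MONTH = 40.00      # assumed alerting seat price, per user
TEXTS_PER_USER_PER_MONTH = 6      # the "six texts that month" example
SMS_UNIT_COST = 0.0079            # assumed US outbound SMS rate, well under a cent


def markup(seats: int) -> None:
    seat_spend = seats * SEAT_PRICE_PER_MONTH
    delivery_spend = seats * TEXTS_PER_USER_PER_MONTH * SMS_UNIT_COST
    print(f"{seats} seats: ${seat_spend:,.2f} in seat fees vs "
          f"${delivery_spend:,.2f} in delivery cost "
          f"({seat_spend / delivery_spend:,.0f}x markup)")


for team_size in (10, 50, 250):
    markup(team_size)
```

The gap is what per-seat pricing is actually charging for (scheduling, escalation, audit trails), which is exactly the packaging problem being described.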
Yeah—Robert: Yeah, it was rsync.Corey: What are you going to make with that? Like, who's going to buy that? Like, basically everyone for at least a time.Robert: And whether or not the engineers are right, I think is a different point.Corey: It's the condescension dismissal of everything that isn't writing the code that really galls, on some level.Robert: But I think when engineers are thinking about, like, “I could build this on a weekend,” like, that's a moment that you have an opportunity to provide the value in an innovative, maybe consolidated way. We want to be a tool that's your incident management ring to retro, right? You get paged in the middle of the night, we're going to wake you up, and when you open up your laptop, groggy-eyed, and like, you're about to start fighting this fire, FireHydrant's already done a lot of work. That's what we think is, like, the right model do this. And candidly, I have no idea why the other alerting tools in this space haven't done this. I've said that and people tend to nod in agreement and say like, “Yeah, it's been—it's kind of crazy how they haven't approached this problem yet.” And… I don't know, I want to solve that problem for folks.Corey: So, one thing that I have to ask, you've been teasing on the internet for a little bit now is something called Signals where you are expanding your product into the component that wakes people up in the middle of the night, which in isolation, fine, great, awesome. But there was a company whose sole stated purpose was to wake people up in the middle of the night, and then once they started doing some business things such as, oh I don't know, going public, they needed to expand beyond that to do a whole bunch of other things. But as a customer, no, no, no, you are the thing that wakes me up in the middle of the night. I don't want you to sprawl and grow into everything else because if you're going to have to pick a vendor that claims to do everything, well, I'll just stay with AWS because they already do that and it's one less throat to choke. What is that pressure that is driving companies that are spectacular at the one thing to expand into things that frankly, they don't have the chops to pull off? And why is this not you doing the same thing?Robert: Oh, man. The end of that question is such a good one and I like that. I'm not an economist. I'm not—like, that's… I don't know if I have a great comment on, like, why are people expanding into things that they don't know how to do. It seems to be, like, a common thing across the industry at a certain point—Corey: Especially particularly generative AI. “Oh, we've been experts in this for a long time.” “Yeah, I'm not that great at dodgeball, but you also don't see me mouthing off about how I've been great at it and doing it for 30 years, either.”Robert: Yeah. I mean, there was a couple ads during football games I watched. I'm like, “What is this AI thing that you just, like, tacked on the letter X to the end of your product line and now all of a sudden, it's AI?” I have plenty of rants that are good for a cocktail at some point, but as for us, I mean, we knew that we wanted to do alerting a long time ago, but it does have complications. Like, the problem with alerting is that it does have to be able to take a brutal punch to the face the moment that AWS us-east-2 goes down.Because at that moment in time, a lot of webhooks are coming your way to wake somebody up, right, for thousands of different companies. 
So, you do have to be able to take a very, very sufficient amount of volume instantaneously. So, that was one thing that kind of stopped us. In 2019 even, we wrote a product document about building an alerting tool and we kind of paused. And then we got really deep into incident management, and the thing that makes us feel very qualified now is that people are actually already integrating their alerting tools into FireHydrant today. This is a very common thing.In fact, most people are paying for a FireHydrant and an alerting tool. So, you can imagine that gets a little expensive when you have both. So, we said, well, let's help folks consolidate, let's help folks have a modern version of alerting, and let's build on top of something we've been doing very well already, which is incident management. And we ended up calling it Signals because we think that we should be able to receive a lot of signals in, do something correct with them, and then put a signal out and then transfer you into incident management. And yeah, we're are excited for it actually. It's been really cool to see it come together.Corey: There's something to be said for keeping it in a certain area of expertise. And people find it very strange when they reach out to my business partner and me asking, okay, so are you going to expand into Google Cloud or Azure or—increasingly, lately—Datadog—which has become a Fortune 500 board-level expense concern, which is kind of wild to me, but here we are—and asking if we're going to focus on that, and our answer is no because it's very… well, not very, but it is relatively easy to be the subject matter expert in a very specific, expensive, painful problem, but as soon as you start expanding that your messaging loses focus and it doesn't take long—since we do you view this as an inherent architectural problem—where we're saying, “We're the best cloud engineers and cloud architects in the world,” and then we're competing against basically everyone out there. And it costs more money a year for Accenture or Deloitte's marketing budget than we'll ever earn as a company in our entire lifetime, just because we are not externally boosted, we're not putting hundreds of people into the field. It's a lifestyle business that solves an expensive, painful problem for our customers. And that focus lends clarity. I don't like the current market pressure toward expansion and consolidation at the cost of everything, including it seems, customer trust.Robert: Yeah. That's a good point. I mean, I agree. I mean, when you see a company—and it's almost getting hard to think about what a company does based on their name as well. Like, names don't even mean anything for companies anymore. Like Datadog has expanded into a whole lot of things beyond data and if you think about some of the alerting tools out there that have names of, like, old devices that used to attach to our hips, that's just a different company name than what represents what they do.And I think for us, like, incidents, that's what we care about. That's what I know. I know how to help people manage incidents. I built software that broke—sometimes I was an arsonist—sometimes I was a firefighter, it really depends, but that's the thing that we're going to be good at and we're just going to keep building in that sphere.Corey: I think that there's a tipping point that starts to become pretty clear when companies focus away from innovating and growing and serving customers into revenue protection mode. 
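On the earlier point about Signals having to take a brutal punch to the face when a whole region goes down: the usual shape of that problem is to make the ingestion path do almost nothing (accept, enqueue, acknowledge) and let workers drain the backlog afterward. A standard-library-only sketch of that pattern follows; it is not FireHydrant's actual architecture, and a production version would put a durable queue such as SQS or Kafka between the two halves rather than an in-process one.

```python
# Sketch of burst-tolerant webhook ingestion: the HTTP handler only enqueues,
# and a worker thread does the slow part. An in-process queue is used for
# brevity; a real system would use something durable.
import json
import queue
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

EVENTS: "queue.Queue[dict]" = queue.Queue()


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        EVENTS.put(json.loads(body or b"{}"))  # accept fast, process later
        self.send_response(202)                # acknowledge even mid-storm
        self.end_headers()


def worker() -> None:
    while True:
        event = EVENTS.get()
        # Slow path: dedupe, match to a service, page the on-call, etc.
        print("processing alert:", event.get("summary", "<no summary>"))
        EVENTS.task_done()


if __name__ == "__main__":
    threading.Thread(target=worker, daemon=True).start()
    ThreadingHTTPServer(("0.0.0.0", 8080), WebhookHandler).serve_forever()
```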
And I think this is a cyclical force that is very hard to resist. But I can tell even having conversations like this with folks, when the way that a company goes about setting up one of these conversations with me, you came by yourself, not with a squadron of PR people, not with a whole giant list of talking points you wanted to go to, just, “Let's talk about this stuff. I'm interested in it.”As a company grows, that becomes more and more uncommon. Often, I'll see it at companies a third the size of yours, just because there's so much fear around everything we say must be spoken in such a way that it could never be taken in a negative way against us. That's not the failure mode. The failure mode is that no one listens to you or cares what you have to say. At some point, yeah, I get the shift, but damned if it doesn't always feel like it's depressing.Robert: Yeah. This is such great questions because I think that the way I think about it is, I care about the problem and if we solve the problem and we solve it well and people agree with us on our solution being a good way to solve that problem, then the revenue, like, happens because of that. I've gotten asked from, like, from VCs and customers, like, “What's your end goal with FireHydrant as the CEO of the company?” And what they're really asking is, like, “Do you want to IPO or be acquired?” That's always a question every single time.And my answer is, maybe, I don't know, philosophical, but it's, I think if we solve the problem, like, one of those will happen, but that's not the end goal. Because if I aim at that, we're going to come up short. It's like how they tell you to throw a ball, right? Like they don't say, aim at the glove. They say, like, aim behind the person.And that's what we want to do. We just want to aim at solving a problem and then the revenue will come. You have to be smart about it, right? It's not a field of dreams, like, if you build it, like, revenue arrives, but—so you do have to be conscious of the business and the operations and the model that you work within, but it should all be in service of building something that's valuable.Corey: I really want to thank you for taking the time to speak with me. If people want to learn more, where should they go to find you, other than, you know, to their most recent incident page?Robert: [laugh]. No, thanks for having me. So, to learn more about me, I mean, you can find me on Twitter on—or X. What do we call it now?Corey: I call it Twitter because I don't believe in deadnaming except when it's companies.Robert: Yeah [laugh]. twitter.com/bobbytables if you want to find me there. If you want to learn more about FireHydrant and what we're doing to help folks with incidents and incident response and all the fun things in there, it's firehydrant.com or firehydrant.io, but we'll redirect you to dot com.Corey: And we will, of course, put a link to all of that in the [show notes 00:33:10]. Thank you so much for taking the time to speak with me. It's deeply appreciated.Robert: Thank you for having me.Corey: Robert Ross, CEO and co-founder of FireHydrant. This featured guest episode has been brought to us by our friends at FireHydrant, and I'm Corey Quinn. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that will never see the light of day because that crappy platform you're using is having an incident that they absolutely do not know how to manage effectively.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
How MongoDB is Paving The Way for Frictionless Innovation with Peder Ulander

Screaming in the Cloud

Play Episode Listen Later Nov 30, 2023 36:08


Peder Ulander, Chief Marketing & Strategy Officer at MongoDB, joins Corey on Screaming in the Cloud to discuss how MongoDB is paving the way for innovation. Corey and Peder discuss how Peder made the decision to go from working at Amazon to MongoDB, and Peder explains how MongoDB is seeking to differentiate itself by making it easier for developers to innovate without friction. Peder also describes why he feels databases are more ubiquitous than people realize, and what it truly takes to win the hearts and minds of developers. About Peder Peder Ulander, the maestro of marketing mayhem at MongoDB, juggles strategies like a tech wizard on caffeine. As the Chief Marketing & Strategy Officer, he battles buzzwords, slays jargon dragons, and tends to developers with a wink. From pioneering Amazon's cloud heyday as Director of Enterprise and Developer Solutions Marketing to leading the brand behind cloud.com's insurgency, Peder's built a legacy as the swashbuckler of software, leaving a trail of market disruptions one vibrant outfit at a time. Peder is the Scarlett Johansson of tech marketing — always looking forward, always picking the edgy roles that drive what's next in technology.Links Referenced:MongoDB: https://mongodb.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode of Screaming in the Cloud is brought to us by my friends and yours at MongoDB, and into my veritable verbal grist mill, they have sent Peder Ulander, their Chief Marketing Officer. Peder, an absolute pleasure to talk to you again.Peder: Always good to see you, Corey. Thanks for having me.Corey: So, once upon a time, you worked in marketing over at AWS, and then you transitioned off to Mongo to, again, work in marketing. Imagine that. Almost like there's a narrative arc to your career. A lot of things change when you change companies, but before we dive into things, I just want to call out that you're a bit of an aberration in that every single person that I have spoken to who has worked within your org has nothing but good things to say about you, which means you are incredibly effective at silencing dissent. Good work.Peder: Or it just shows that I'm a good marketer and make sure that we paint the right picture that the world needs to see.Corey: Exactly. “Do you have any proof of you being a great person to work for?” “No, just word of mouth,” and everyone, “Ah, that's how marketing works.”Peder: Exactly. See, I'm glad you picked up somewhere.Corey: So, let's dive into that a little bit. Why would you leave AWS to go work at Mongo. Again, my usual snark and sarcasm would come up with a half dozen different answers, each more offensive than the last. Let's be serious for a second. At AWS, there's an incredibly powerful engine that drives so much stuff, and the breadth is enormous.MongoDB, despite an increasingly broad catalog of offerings, is nowhere near that level of just universal applicability. Your product strategy is not a Post-It note with the word ‘yes' written on it. There are things that you do across the board, but they all revolve around databases.Peder: Yeah. 
So, going back prior to MongoDB, I think you know, at AWS, I was across a number of different things, from the developer ecosystem, to the enterprise transformation, to the open-source work, et cetera, et cetera. And being privy to how customers were adopting technology to change their business or change the experiences that they were delivering to their customers or increase the value of the applications that they built, you know, there was a common thread of something that fundamentally needed to change. And I like to go back to just the evolution of tech in that sense. We could talk about going from physical on-prem systems to now we're distributed in the cloud. You could talk about application constructs that started as big fat monolithic apps that moved to virtual, then microservices, and now functions.Or you think about networking, we've gone from fixed wire line, to network edge, and cellular, and what have you. All of the tech stack has changed with the exception of one layer, and that's the data layer. And I think for the last 20 years, what's been in place has worked okay, but we're now meeting this new level of scale, this new level of reach, where the old systems are not what's going to be what the new systems are built on, or the new experiences are built on. And as I was approached by MongoDB, I kind of sat back and said, “You know, I'm super happy at AWS. I love the learning, I love the people, I love the space I was in, but if I were to put my crystal ball together”—here's a Bezos statement of looking around corners—“The data space is probably one of the biggest spaces ripe for disruption and opportunity, and I think Mongo is in an incredible position to go take advantage of that.”Corey: I mean, there's an easy number of jokes to make about AmazonBasics MongoDB, which is my disparaging name for their DocumentDB first-party offering. And for a time, it really felt like AWS's perspective toward its partners was one of outright hostility, if not antagonism. But that narrative no longer holds true in 2023. There's been a definite shift. And to be direct, part of the reason that I believe that is the things you have said both personally and professionally in your role as CMO of Mongo that has caused me to reevaluate this because despite all of your faults—a counted list of which I can provide you after the show—Peder: [laugh].Corey: You do not say things that you do not believe to be true.Peder: Correct.Corey: So, something has changed. What is it?Peder: So, I think there's an element of coopetition, right? So, I would go as far as to say the media loved to sensationalize—actually even the venture community—loved to sensationalize the screen scraping stripping of open-source communities that Amazon represented a number of years ago. The reality was their intent was pretty simple. They built an incredibly amazing IT stack, and they wanted to run whatever applications and software were important to their customers. And when you think about that, the majority of systems today, people want to run open-source because it removes friction, it removes cost, it enables them to go do cool new things, and be on the bleeding edge of technology.And Amazon did their best to work with the top open-source projects in the world to make it available to their customers. Now, for the commercial vendors that are leaning into this space, that obviously does present itself threat, right? 
And we've seen that along a number of the cohorts of whether you want to call it single-vendor open-source or companies that have a heavy, vested interest in seeing the success of their enterprise stack match the success of the open-source stack. And that's, I think, where media, analysts, venture, all kind of jumped on the bandwagon of not really, kind of, painting that bigger picture for the future. I think today when I look at Amazon—and candidly, it'll be any of the hyperscalers; they all have a clone of our database—it's an entry point. They're running just the raw open-source operational database capabilities that we have in our community edition and making that available to customers.We believe there's a bigger value in going beyond just that database and introducing, you know, anything from the distributed zones to what we do around vector search to what we do around stream processing, and encryption and all of these advanced features and capabilities that enable our customers to scale rapidly on our platform. And the dependency on delivering that is with the hyperscalers, so that's where that coopetition comes in, and that becomes really important for us when we're casting our web to engage with some of the world's largest customers out there. But interestingly enough, we become a big drag of services for an AWS or any of the other hyperscalers out there, meaning that for every dollar that goes to a MongoDB, there's, you know, three, five, ten dollars that goes to these hyperscalers. And so, they're very active in working with us to ensure that, you know, we have fair and competing offers in the marketplace, that they're promoting us through their own marketplace as well as their own channels, and that we're working together to further the success of our customers.Corey: When you take a look at the exciting things that are happening at the data layer—because you mentioned that we haven't really seen significant innovation in that space for a while—one of the things that I see happening is with the rise of Generative AI, which requires very special math that can only be handled by very special types of computers. I'm seeing at least a temporary inversion in what has traditionally been thought of as data gravity, whereas it's easier to move compute close to the data, but in this case, since the compute only lives in the, um, sparkling us-east-1 regions of Virginia, otherwise, it's just generic, sparkling expensive computers, great, you have to effectively move the mountain to Mohammed, so to speak. So, in that context, what else is happening that is driving innovation in the data space right now?Peder: Yeah, yeah. I love your analogy of, move the mountain of Mohammed because that's actually how we look at the opportunity in the whole Generative AI movement. There are a lot of tools and capabilities out there, whether we're looking at code generation tools, LLM modeling vendors, some of the other vector database companies that are out there, and they're all built on the premise of, bring your data to my tool. And I actually think that's a flawed strategy. I think that these are things that are going to be features in core application databases or operational databases, and it's going to be dependent on the reach and breadth of that database, and the integrations with all of these AI tools that will define the victor going forward.And I think that's been a big core part of our platform. 
When we look at Atlas—111 availability zones across all three hyperscalers with a single, unified, you know, interface—we're actually able to have the customers keep their operational data where it's most important to them and then apply the tools of the hyperscalers or the partners where it makes the most sense without moving the data, right? So, you don't actually have to move the mountain to Mohammed. We're literally building an experience where those that are running on MongoDB and have been running on MongoDB can gain advantage of these new tools and capabilities instantly, without having to change anything in their architectures or how they're building their applications.Corey: There was a somewhat over-excited… I guess, over-focus in the space of vector databases because whatever those are—which involves math, and I am in no way shape, or form smart enough to grasp the nuances thereof, but everyone assures me that it's necessary for Generative AI and machine learning and yadda, yadda, yadda. So, when in doubt, when I'm confronted by things I don't fully understand, I turn to people who do. And the almost universal consensus that I have picked up from people who track databases for a living—as opposed to my own role of inappropriately using everything in the world except databases as a database—is that vector is very much a feature, not a core database type.Peder: Correct. The best way to think about it—I mean, databases in general, they're dealing with structured and unstructured data, and generally, especially when you're doing searches or relevance, you're limited to the fact that those things in the rows and the columns or in the documents is text, right? And the reality is, there's a whole host of information that can be found in metadata, in images, in sounds, in all of these other sources that were stored as individual files but unsearchable. Vector, vectorization, and vector embeddings actually enable you to take things far beyond the text and numbers that you traditionally were searching against and actually apply more, kind of, intelligence to it, or apply sounds or apply sme—you know, you can vectorize smells to some extent. And what that does is it actually creates a more pleasing slash relevant experience for how you're actually building the engagements with your customers.Now, I'll make it a little more simple because that was trying to define vectors, which as you know, is not the easiest thing. But imagine being able to vectorize—let's say I'm a car company—we're actually working with a car company on this—and you're able to store all of the audio files of cars that are showing certain diagnostic issues—the putters and the spurts and the pings and the pangs—and you can actually now isolate these sounds and apply them directly to the problem and resolution for the mechanics that are working on them. Using all of this stuff together, now you actually have a faster time to resolution. You don't want mechanics knowing the mechanics of vectors in that sense, right, so you build an application that abstracts all of that complexity. You don't require them to go through PDFs of data and find all of the options for fixing this stuff.The relevance comes back and says, “Yes, we've seen that sound 20 times across this vehicle. Here's how you fix it.” Right? And that cuts significant amount of time, cost, efficiency, and complexity for those auto mechanics. 
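The engine-noise example maps onto vector search in a fairly direct way: each audio clip (or its metadata) is embedded as a vector, stored alongside the document, and queried by similarity. A hedged sketch against MongoDB Atlas Vector Search's $vectorSearch aggregation stage is below; the cluster URI, collection, index name, field names, and the embed() helper are all hypothetical placeholders rather than anything taken from the conversation.

```python
# Hypothetical sketch of a "which known engine noise does this sound like" query
# using Atlas Vector Search. Collection, index, and field names are placeholders,
# and embed() stands in for whatever model turns an audio clip into a vector.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<cluster-uri>")      # placeholder connection string
sounds = client["diagnostics"]["engine_sounds"]          # placeholder namespace


def embed(clip_path: str) -> list[float]:
    """Placeholder: call whatever embedding model you use for audio here."""
    raise NotImplementedError


def similar_faults(clip_path: str, limit: int = 5):
    query_vector = embed(clip_path)
    pipeline = [
        {
            "$vectorSearch": {
                "index": "engine_sound_index",   # assumed vector index name
                "path": "embedding",             # field holding the stored vectors
                "queryVector": query_vector,
                "numCandidates": 200,
                "limit": limit,
            }
        },
        {"$project": {"fault_code": 1, "resolution": 1, "_id": 0}},
    ]
    return list(sounds.aggregate(pipeline))
```

The design point Peder is making is that this runs as one more aggregation stage against the operational data, instead of shipping a copy of that data off to a separate vector store.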
That is such a big push forward, I think, from a technology perspective, on what the true promise of some of these new capabilities are, and why I get excited about what we're doing with vector and how we're enabling our customers to, you know, kind of recreate experiences in a way that are more human, more relevant.Corey: Now, I have to say that of course you're going to say nice things about your capabilities where vector is concerned. You would be failing in your job if you did not. So, I feel like I can safely discount every positive thing that you say about Mongo's positioning in the vector space and instead turn to, you know, third parties with no formalized relationship with you. Yesterday, Retool's State of AI report came across my desk. I am a very happy Retool customer. They've been a periodic sponsor, from time-to-time, of my ridiculous nonsense, which is neither here nor there, but I want to disclaim the relationship.And they had a Gartner Magic Quadrant equivalent that on one axis had Net Promoter Score—NPS, which is one of your people's kinds of things—and the other was popularity. And Mongo was so far up and to the right that it was almost hilarious compared to every other entrant in the space. That is a positioning that I do not believe it is possible to market your way into directly. This is something that people who are actually doing these things have to use the product, and it has to stand up. Mongo is clearly effective at doing this in a way that other entrants aren't. Why?Peder: Yeah, that's a good question. I think a big part of that goes back to the earlier statement I made that vector databases or vector technology, it's a feature, it's not a separate thing, right? And when I think about all of the new entrants, they're creating a new model where now you have to move your data out of your operational database and into their tool to get an answer and then push back in. The complexity, the integrations, the capabilities, it just slows everything down, right? And I think when you look at MongoDB's approach to take this developer data platform vision of getting all of the core tools that developers need to build compelling applications with from a data perspective, integrating it into one seamless experience, we're able to basically bring classic operational database capabilities, classic text search type capabilities, embed the vector search capabilities as well, it actually creates a richer platform and experience without all of that complexity that's associated with bolt-on sidecar Gen AI tool or vector database.Corey: I would say that that's one of those things that, again, can only really be credibly proven by what the market actually does, as opposed to, you know, lip-sticking the heck out of a pig and hoping that people don't dig too deeply into what you're saying. It's definitely something we're seeing adoption of.Peder: Yeah, I mean, this kind of goes to some of the stuff, you know, you pointed out, the Retool thing. This is not something you can market your way into. This is something that, you know, users are going to dictate the winners in this space, the developers, they're going to dictate the winners in the space. And so, what do you have to do to win the hearts and minds of developers, you have to make the tech extremely approachable, it's got to be scalable to meet their needs, not a lot of friction involved in learning these new capabilities and applying it to all of the stuff that has come before. 
All of these things put together, really focusing on that developer experience, I mean, that goes to the core of the MongoDB ethos.I mean, this is who we were when we started the company so long ago, and it's continued to drive the innovation that we do in the platform. And I think this is just yet again, another example of focusing on developer needs, making it super engaging and useful, removing the friction, and enabling them to just go create new things. That's what makes it so fun. And so when, you know, as a marketer, and I get the Retool chart across my desk, we haven't been pitching them, we haven't been marketing to them, we haven't tried to influence this stuff, so knowing that this is a true, unbiased audience, actually is pretty cool to see. To your point, it was surprising how far up and to the right that we sat, given, you know, where we were in just—we launched this thing… six months ago? We launched it in June. The amount of customers that have signed up, are using it, and engaged with us on moving forward has been absolutely amazing.Corey: I think that there has been so much that gets lost in the noise of marketing. My approach has always been to cut through so much of it—that I think AWS has always done very well with—is—almost at their detriment these days—but if you get on stage, you can say whatever you want about your company's product, and I will, naturally and lovingly, make fun of whatever it is that you say. But when you have a customer coming on stage and saying, “This is how we are using the thing that they have built to solve a very specific business problem that was causing us pain,” then I shut up, and I listen because it's very hard to wind up dismissing that without being an outright jerk about things. I think the failure mode of that is, taken too far, you lose the ability to tell your own story in a coherent way, and it becomes a crutch that becomes very hard to get rid of. But the proof is really in the pudding.For me, like, the old jokes about—in the early teens—where MongoDB would periodically lose data as configured by default. Like, “MongoDB. It's Snapchat for databases.” Hilarious joke at the time, but it really has worn thin. That's like being angry about what Microsoft did in 2005 and 2006. It's like, “Yeah, okay, you have a point, but it is also ancient history, and at some point you need to get with the modern era, get with the program.”And I think that seeing the success and breadth of MongoDB that I do—you are in virtually every customer that I talk to, in some way, shape, or form—and seeing what it is that they're doing with you folks, it is clear that you are not a passing fad, that you are not going away anytime soon.Peder: Right.Corey: And even with building things in my spare time and following various tutorials of dubious credibility from various parts of the internet—as those things tend to go—MongoDB is very often a default go-to reference when someone needs a database for which a SQLite file won't do.Peder: Right. It's fascinating to see the evolution of MongoDB, and today we're lucky to track 45,000-plus customers on our platform doing absolutely incredible things. But I think the biggest—to your point—the biggest proof is in the pudding when you get these customers to stand up on stage and talk about it. 
And even just recently, through our .local series, some of the customers that we've been highlighting are doing some amazing things using MongoDB in extremely business-critical situations.My favorite was, I was out doing our .local in Hong Kong, where Cathay Pacific got up on stage, and they talked a little bit about their flight folder. Now, if you remember going through the airport, you always see the captains come through, and they had those two big boxes of paperwork before they got onto the plane. Not only was that killing the environment with all the trees that got cut down for it, it was cumbersome, complex, and added a lot of time and friction with regards to flight operations. Now, take that from a single flight over all of the fleet that's happening across the world.We were able to work with Cathay Pacific to digitize their entire flight folder, all of their documentation, removing the need for cutting down trees and minimizing a carbon footprint form, but at the same time, actually delivering a solution where if it goes down, it grounds the entire fleet of the airline. So, imagine that. That's so business-critical, mission-critical, has to be there, reliable, resilient, available for the pilots, or it shuts down the business. Seeing that growth and that transformation while also seeing the environmental benefit for what they have achieved, to me, that makes me proud to work here.Similarly, we have companies like Ford, another big brand-name company here in the States, where their entire connected car experience and how they're basically operationalizing the connection between the car and their home base, this is all being done using MongoDB as well. So, as they think of these new ideas, recognizing that things are going to be either out at the edges or at a level of scale that you can't just bring it back into classic rows and columns, that's actually where we're so well-suited to grow our footprint. And, you know, I remember back to when I was at Sun—Sun Microsystems. I don't know if anybody remembers that company. That was an old one.But at one point, it was Jonathan that said, “Everything of value connects to the network.” Right? Those things that are connecting to the network also need applications, they need data, they need all of these services. And the further out they go, the more you need a database that basically scales to meet them where they are, versus trying to get them to come back to where your database happens to sit. And in order to do that, that's where you break the mold.That's where—I mean, that kind of goes into the core ethos of why we built this company to begin with. The original founders were not here to build a database; they were building a consumer app that needed to scale to the edges of the earth. They recognized that databases didn't solve for that, so they built MongoDB. That's actually thinking ahead. Everything connecting to the network, everything being distributed, everything basically scaling out to all the citizens of the planet fundamentally needs a new data layer, and that's where I think we've come in and succeeded exceptionally well.Corey: I would agree. Another example I like to come up with, and it's fun that the one that leaps to the top of my mind is not one of the ones that you mentioned, but HSBC—the massive bank—very publicly a few years ago, wound up consolidating, I think it was 46 relational databases onto MongoDB. And the jokes at the time wrote themselves, but let's be serious for a second. 
Despite the jokes that we all love to tell, they are a bank, a massive bank, and they don't play fast-and-loose or slap-and-tickle with transactional integrity or their data stores for these things.Because there's a definite belief across the banking sector—and I know this having worked in it myself for years—that if at some point, you have the ATMs spitting out the wrong account balances, people will begin rioting in the streets. I don't know if that's strictly accurate or hyperbole, but it's going to cause massive amounts of chaos if it happens. So, that is something that absolutely cannot happen. The fact that they're willing to engage with you folks and your technology and be public about it at that scale, that's really all you need to know from a, “Is this serious technology or clown shoes technology?”Peder: [laugh]. Well, taking that comment, now let's exponentially increase that. You know, if I sit back, and I look at my customer base, financial services is actually one of our biggest verticals as a business. And you mentioned HSBC. We had Wells Fargo on the stage last year at our world event.Nine out of the top ten world's banks are using MongoDB in some of their applications, some at the scale of HSBC, some are still just getting started. And it all comes down to the fact that we have proven ourselves, we are aligned to mission-critical business environments. And I think when it comes down to banks, especially that transactional side, you know, building in the capabilities to be able to have high frequency transactions in the banking world is a hard thing to go do, and we've been able to prove it with some of the largest banks on the planet.Corey: I also want to give you credit—although it might be that I'm giving you credit for a slow release process; I hope not—but when I visit mongodb.com, it still talks up front that you are—and I want to quote here—oh, good lord, it changes every time I load the page—but it talks about, “Build faster, build smarter,” on this particular version of the load. It talks about the data platform. You have not effectively decided to pivot everything you say in public to tie directly into the Generative AI hype bubble that we are currently experiencing. You have a bunch of different use cases, and you're not suddenly describing what you do in Gen AI terms that make it impossible to understand just what the company-slash-product-slash-services actually do.Peder: Right.Corey: So, I want to congratulate you on that.Peder: Appreciate that, right? Look, it comes down to the core basics. We are a developer data platform. We bring together all of the capabilities, tools, and functions that developers need when building apps as it pertains to their data functions or data layer, right? And that's why this integrated approach of taking our operational database and building in search, or stream processing, or vector search, all of the things that we're bringing to the platform enable developers to move faster. And what that says is, we're great for all use cases out there, not just Gen AI use cases. We're great for all use cases where customers are building applications to change the way that they're engaging with the customers.Corey: And what I like about this is that you're clearly integrating this stuff under the hood. You are talking to people who are building fascinating stuff, you're building things yourself, but you're not wrapping yourself in the mantle of, “This is exactly what we do because it's trendy right now.” And I appreciate that. 
It's still intelligible, and I wouldn't think that I had to congratulate someone on, “Wow, you build marketing that a human being can extract meaning from. That's amazing.” But in 2023, the closing days thereof, it very much is.Peder: Yep, yep. And it speaks a lot to the technology that we've built because, you know, on one side—it reminds me a lot of the early days of cloud where everything was kind of cloud-washed for a bit, we're seeing a little bit of that in the hype cycle that we have right now—sticking to our guns and making sure that we are building a technology platform that enables developers to move quickly, that removing the friction from the developer lifecycle as it pertains to the data layer, that's where the success is right, we have to stay on top of all of the trends, we have to make sure that we're enabling Gen AI, we have to make sure that we're integrating with the Amazon Bedrocks and the CodeWhisperers of the world, right, to go push this stuff forward. But to the point we made earlier, those are capabilities and features of a platform where the higher-level order is to really empower our customers to develop innovative, disruptive, or market-leading technologies for how they engage with their customers.Corey: Yeah. And that it's neat to be able to see that you are empowering companies to do that without feeling the need to basically claim their achievements as your own, which is an honest-to-God hard thing to do, especially as you become a platform company because increasingly, you are the plumbing that makes a lot of the flashy, interesting stuff possible. It's imperative, you can't have those things without the underlying infrastructure, but it's hard to talk about that infrastructure, too.Peder: You know, it's funny, I'm sure all of my colleagues would hate me for saying this, but the wheel doesn't turn without the ball bearing. Somebody still has to build the ball bearing in order for that sucker to move, right? And that's the thing. This is the infrastructure, this is the heart of everything that businesses need to build applications. And one of the—you know, another kind of snide comment I've made to some of my colleagues here is, if you think about every market-leading app, in fact, let's go to the biggest experiences you and I use on a daily basis, I'm pretty sure you're booking travel online, you're searching for stuff on Google, you're buying stuff through Amazon, you're renting a house through Airbnb, and you're listening to your music through Spotify. What are those? Those are databases with a search engine.Corey: The world is full of CRUD applications. These are, effectively, simply pretty front-ends to a database. And as much as we'd like to pretend otherwise, that's very much the reality of it. And we want that to be the case. Different modes of interaction, different requirements around them, but yeah, that is what so much of the world is. And I think to ignore that is to honestly blind yourself to a bunch of very key realities here.Peder: That kind of goes back to the original vision for when I came here. It's like, look, everything of value for us, everything that I engage with, is—to your point—it's a database with a great experience on top of it. Now, let's start to layer in this whole Gen AI push, right, what's going on there. We're talking about increased relevance in search, we're talking about new ways of thinking about sourcing information. 
We've even seen that with some of the latest ChatGPT stuff that developers are using that to get code snippets and figure out how to solve things within their platform.The era of the classic search engine is in the middle of a complete change, and the opportunity, I think, that I see as this moves forward is that there is no incumbent. There isn't somebody who owns this space, so we're just at the beginning of what probably will be the next. Google's, Airbnb's, and Uber's of the world for the next generation. And that's really exciting to see.Corey: I'm right there with you. What are the interesting founding stories at Google is that they wound up calling typical storage vendors for what they needed, got basically ‘screw on out of here, kids,' pricing, so they shrugged, and because they had no real choice to get enterprise-quality hardware, they built a bunch of highly redundant systems on top of basically a bunch of decommissioned crap boxes from the university they were able to more or less get for free or damn near it, and that led to a whole innovation in technology. One of the glorious things about cloud that I think goes under-sold is that I can build a ridiculous application tonight for maybe, what, 27 cents IT infrastructure spend, and if it doesn't work, I round up to dollar, it'll probably get waived because it'll cost more to process the credit card transaction than take my 27 cents. Conversely, if it works, I'm already building with quote-unquote, “Enterprise-grade” components. I don't need to do a massive uplift. I can keep going. And that is no small thing.Peder: No, it's not. When you step back, every single one of those stories was about abstracting that complexity to the end-user. In Google's case, they built their own systems. You or I probably didn't know that they were screwing these things together and soldering them in the back room in the middle of the night. Similarly, when Amazon got started, that was about taking something that was only accessible to a few thousand and now making it accessible to a few million with the costs of 27 cents to build an app.You removed the risk, you removed the friction from enabling a developer to be able to build. That next wave—and this is why I think the things we're doing around Gen AI, and our vector search capabilities, and literally how we're building our developer data platform is about removing that friction and limits and enabling developers to just come in and, you know, effectively do what they do best, which is innovate, versus all of the other things. You know, in the Google world, it's no longer racking and stacking. In the cloud world, it's no longer managing and integrating all the systems. Well, in the data world, it's about making sure that all of those integrations are ready to go and at your fingertips, and you just focus on what you do well, which is creating those new experiences for customers.Corey: So, we're recording this a little bit beforehand, but not by much. You are going to be at re:Invent this year—as am I—for eight nights—Peder: Yes.Corey: Because for me at least, it is crappy cloud Hanukkah, and I've got to deal with that. What have you got coming up? What do you plan to announce? Anything fun, exciting, or are you just there basically, to see how many badges you can actually scan in one day?Peder: Yeah [laugh]. Well, you know, it's shaping up to be quite an incredible week, there's no question. We'll see what brings to town. As you know, re:Invent is a huge event for us. 
We do a lot within that ecosystem, a lot of the customers that are up on stage talking about the cool things they're doing with AWS, they're also MongoDB customers. So, we go all out. I think you and I spoke before about our position there with SugarCane right on the show floor, I think we've managed to secure you a Friends of Peder all-access pass to SugarCane. So, I look forward to seeing you there, Corey.Corey: Proving my old thesis of, it really is who you know. And thank you for your generosity, please continue.Peder: [laugh]. So, we will be there in full force. We have a number of different innovation talks, we have a bunch of community-related events, working with developers, helping them understand how we play in the space. We're also doing a bunch of hands-on labs and design reviews that help customers basically build better, and build faster, build smarter—to your point earlier on some of the marketing you're getting off of our website. But we're also doing a number of announcements.I think first off, it was actually this last week, we made the announcement of our integrations with Amazon—or—yeah, Amazon CodeWhisperer. So, their code generation tool for developers has now been fully trained on MongoDB so that you can take advantage of some of these code generation tools with MongoDB Atlas on AWS. Similarly, there's been a lot of noise around what Amazon is doing with Bedrock and the ability to automate certain tasks and things for developers. We are going to be announcing our integrations with Agents for Amazon Bedrock being supported inside of MongoDB Atlas, so we're excited to see that, kind of, move forward. And then ultimately, we're really there to celebrate our customers and connect them so that they can share what they're doing with many peers and others in the space to give them that inspiration that you so eloquently talked about, which is, don't market your stuff; let your customers tell what they're able to do with your stuff, and that'll set you up for success in the future.Corey: I'm looking forward to seeing what you announce in conjunction with what AWS announces, and the interplay between those two. As always, I'm going to basically ignore 90% of what both companies say and talk instead to customers, and, “What are you doing with it?” Because that's the only way to get truth out of it. And, frankly, I've been paying increasing amounts of attention to MongoDB over the past few years, just because of what people I trust who are actually good at databases have to say about you folks. Like, my friends at RedMonk always like to say—I've stolen the line from them—“You can buy my attention, but not my opinion.”Peder: A hundred percent.Corey: You've earned the opinion that you have, at this point. Thank you for your sponsorship; it doesn't hurt, but again, you don't get to buy endorsements. I like what you're doing. Please keep going.Peder: No, I appreciate that, Corey. You've always been supportive, and definitely appreciate the opportunity to come on Screaming in the Cloud again. And I'll just push back to that Friends of Peder. There's, you know, also a little bit of ulterior motive there. It's not just who you know, but it's [crosstalk 00:34:39]—Corey: It's also validating that you have friends. I get it. I get it.Peder: Oh yeah, I know, right? And I don't have many, but I have a few. 
But the interesting thing there is we're going to be able to connect you with a number of the customers doing some of these cool things on top of MongoDB Atlas.Corey: I look forward to it. Thank you so much for your time. Peder Ulander, Chief Marketing Officer at MongoDB. I'm Cloud Economist Corey Quinn and this has been a promoted guest episode of Screaming in the Cloud, brought to us by our friends at Mongo. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review in your podcast platform of choice, along with an angry, insulting comment that I will ignore because you basically wrapped it so tightly in Generative AI messaging that I don't know what the hell your point is supposed to be.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Taking a Hybrid AI Approach to Security at Snyk with Randall Degges

Screaming in the Cloud

Play Episode Listen Later Nov 29, 2023 35:57


Randall Degges, Head of Developer Relations & Community at Snyk, joins Corey on Screaming in the Cloud to discuss Snyk's innovative AI strategy and why developers don't need to be afraid of security. Randall explains the difference between Large Language Models and Symbolic AI, and how combining those two approaches creates more accurate security tooling. Corey and Randall also discuss the FUD phenomenon to selling security tools, and Randall expands on why Snyk doesn't take that approach. Randall also shares some background on how he went from being a happy Snyk user to a full-time Snyk employee. About RandallRandall runs Developer Relations & Community at Snyk, where he works on security research, development, and education. In his spare time, Randall writes articles and gives talks advocating for security best practices. Randall also builds and contributes to various open-source security tools.Randall's realms of expertise include Python, JavaScript, and Go development, web security, cryptography, and infrastructure security. Randall has been writing software for over 20 years and has built a number of popular API services and open-source tools.Links Referenced: Snyk: https://snyk.io/ Snyk blog: https://snyk.io/blog/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn, and this featured guest episode is brought to us by our friends at Snyk. Also brought to us by our friends at Snyk is one of our friends at Snyk, specifically Randall Degges, their Head of Developer Relations and Community. Randall, thank you for joining me.Randall: Hey, what's up, Corey? Yeah, thanks for having me on the show, man. Looking forward to talking about some fun security stuff today.Corey: It's been a while since I got to really talk about a security-centric thing on this show, at least in order of recordings. I don't know if the one right before this is a security thing; things happen on the back-end that I'm blissfully unaware of. But it seems the theme lately has been a lot around generative AI, so I'm going to start off by basically putting you in the hot seat. Because when you pull up a company's website these days, the odds are terrific that they're going to have completely repositioned absolutely everything that they do in the context of generative AI. It's like, “We're a generative AI company.” It's like, “That's great.” Historically, I have been a paying customer of Snyk so that it does security stuff, so if you're now a generative AI company, who do I use for the security platform thing that I was depending upon? You have not done that. First, good work. Secondly, why haven't you done that?Randall: Great question. Also, you said a moment ago that LLMs are very interesting, or there's a lot of hype around it. 
Understatement of the last year, for sure [laugh].

Corey: Oh, my God, it has gotten brutal.

Randall: I don't know how many billions of dollars have been dumped into LLMs in the last 12 months, but I'm sure it's a very high number.

Corey: I have a sneaking suspicion that the largest models cost at least a billion each to train, just based upon—at least at retail price—the simple economics of how long it takes to do these things and how expensive that particular flavor of compute is. And the technology itself is magic. It is magic in a box and I see that, but finding the ways it applies is taking some time. But that's not stopping the hype beasts. A lot of the same terrible people who were relentlessly pushing crypto have now pivoted to relentlessly pushing generative AI, presumably because they're working through Nvidia's street team, or their referral program, or whatever it is. Doesn't matter what the rest of us do, as long as we're burning GPU cycles on it. And I want to distance myself from that exciting level of boosterism. But it's also magic.

Randall: Yeah [laugh]. Well, let's just talk about AI in security for a moment and answer your previous question. So, what's happening in the space, what's the deal, where is all the hype going, and what is Snyk doing around it? So, quite frankly—and I'm sure a lot of people on your show say the same thing—Snyk isn't new to, like, the AI space. It's been a fundamental part of our platform for many years now.

So, for those of you listening who have no idea what the heck Snyk is, and you're like, “Why are we talking about this,” Snyk is essentially a developer security company, and the core of what we do is two things. The first thing is we help scan your code, your dependencies, your containers, all the different parts of your application, and detect vulnerabilities. That's the first part. The second thing we do is we help fix those vulnerabilities. So, detection and remediation. Those are the two components of any good security tool or security company.

And in our particular case, we're very focused on developers because our whole product is really based on your application and your application security, not infrastructure and other things like this. So, with that being said, what are we doing at a high level with LLMs? Well, if you think about AI as, like, a broad spectrum, you have a lot of different technologies behind the scenes that people refer to as AI. You have lots of these large language models, which are generating text based on inputs. You also have symbolic AI, which has been around for a very long time and which is very domain specific. It's like creating specific rules and helping do pattern detection amongst things.

And those two different types of applied AI, let's say—we have large language models and symbolic AI—are the two main things that have been happening in industry for the last, you know, tens of years, really, with LLMs being the new kid on the block. So, when we're talking about security, what's important to know about just those two underlying technologies? Well, the first thing is that large language models, as I'm sure everyone listening to this knows, are really good at predicting things based on a big training set of data.
That's why companies like OpenAI and their ChatGPT tool have become so popular because they've gone out and crawled vast portions of the internet, downloaded tons of data, classified it, and then trained their models on top of this data so that they can help predict the things that people are putting into chat. And that's why they're so interesting, and powerful, and there's all these cool use cases popping up with them.However, the downside of LLMs is because they're just using a bunch of training data behind the scenes, there's a ton of room for things to be wrong. Training datasets aren't perfect, they're coming from a ton of places, and even if they weren't perfect, there's still the likelihood that things that are going to be generating output based on a statistical model isn't going to be accurate, which is the whole concept of hallucinations.Corey: Right. I wound up remarking on the livestream for GitHub Universe a week or two ago that the S in AI stood for security. One of the problems I've seen with it is that it can generate a very plausible looking IAM policy if you ask it to, but it doesn't actually do what you think it would if you go ahead and actually use it. I think that it's still squarely in the realm of, it's great at creativity, it's great at surface level knowledge, but for anything important, you really want someone who knows what they're doing to take a look at it and say, “Slow your roll there, Hasty Pudding.”Randall: A hundred percent. And when we're talking about LLMs, I mean, you're right. Security isn't really what they're designed to do, first of all [laugh]. Like, they're designed to predict things based on statistics, which is not a security concept. But secondly, another important thing to note is, when you're talking about using LLMs in general, there's so many tricks and techniques and things you can do to improve accuracy and improve things, like for example, having a ton of [contexts 00:06:35] or doing Few-Shot Learning Techniques where you prompt it and give it examples of questions and answers that you're looking for can give you a slight competitive edge there in terms of reducing hallucinations and false information.But fundamentally, LLMs will always have a problem with hallucinations and getting things wrong. So, that brings us to what we mentioned before: symbolic AI and what the differences are there. Well, symbolic AI is a completely different approach. You're not taking huge training sets and using machine learning to build statistical models. It's very different. You're creating rules, and you're parsing very specific domain information to generate things that are highly accurate, although those models will fail when applied to general-purpose things, unlike large language models.So, what does that mean? You have these two different types of AI that people are using. You have symbolic AI, which is very specific and requires a lot of expertise to create, then you have LLMs, which take a lot of experience to create as well, but are very broad and general purpose and have a capability to be wrong. Snyk's approach is, we take both of those concepts, and we use them together to get the best of both worlds. 
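To make the rule-based half of that hybrid concrete, here is a toy sketch of what a symbolic-style check can look like: a hand-written rule that walks a Python syntax tree and flags calls it considers dangerous. This is only an illustration of the general technique, not Snyk's engine; the rule list and function name are invented for the example.

```python
# Toy rule-based ("symbolic") check: walk a Python AST and flag calls that a
# hand-written rule set considers dangerous. Real engines use far richer rules,
# data-flow analysis, and curated vulnerability data; this only shows the idea.
import ast

DANGEROUS_CALLS = {"eval", "exec"}  # invented rule set for the example


def find_dangerous_calls(source: str) -> list[str]:
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DANGEROUS_CALLS:
                findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings


if __name__ == "__main__":
    sample = "user_input = input()\nresult = eval(user_input)\n"
    for finding in find_dangerous_calls(sample):
        print(finding)  # deterministic: the same input produces the same finding every time
```

The contrast with an LLM is the point: a check like this is deterministic and explainable, but it only knows about the rules somebody wrote; a language model is the opposite trade-off, which is why combining the two is attractive.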
And we can talk a little bit about that, but I think fundamentally, one of the things that separates Snyk from a lot of other companies in the space is we're just trying to do whatever the best technical solution is to solve the problem, and I think we found that with our hybrid approach.Corey: I think that there is a reasonable distrust of AI when it comes to security. I mean, I wound up recently using it to build what has been announced by the time this thing airs, which is my re:Invent photo scavenger hunt app. I know nothing about front-end, so that's okay, I've got a robot in my pocket. It's great at doing the development of the initial thing, and then you have issues, and you want to add functionality, and it feels like by the time I was done with my first draft, that ten different engineers had all collaborated on this thing without ever speaking to one another. There was no consistent idiomatic style, it used a variety, a hodgepodge of different lists and the rest, and it became a bit of a Frankenstein's monster.That can kind of work if we're talking about a web app that doesn't have any sensitive data in it, but holy crap, the idea of applying that to, “Yeah, that's how we built our bank's security policy,” is one of those, “Let me know who said that, so they can not have their job anymore,” territory when the CSO starts [hunting 00:08:55].Randall: You're right. It's a very tenuous situation to be in from a security perspective. The way I like to think about it—because I've been a developer for a long time and a security professional—and I as much as anyone out there love to jump on the hype train for things and do whatever I can to be lazy and just get work done quicker. And so, I use ChatGPT, I use GitHub Copilot, I use all sorts of LLM-based tools to help me write software. And similarly to the problems when developers are not using LLM to help them write code, security is always a concern.Like, it doesn't matter if you have a developer writing every line of code themselves or if they're getting help from Copilot or ChatGPT. Fundamentally, the problem with security and the reason why it's such an annoying part of the developer experience, in all honesty, is that security is really difficult. You can take someone who's an amazing engineer, who has 30 years of experience, like, you can take John Carmack, I'm sure, one of the most legendary developers to ever walk the Earth, you could sit over his shoulder and watch him write software, right, I can almost guarantee you that he's going to have some sort of security problem in his code, even with all the knowledge he has in his head. And part of the reason that's the case is because modern security is way complicated. Like if you're building a web app, you have front-end stuff you need to protect, you have back-end stuff you need to protect, there's databases and infrastructure and communication layers between the infrastructure and the services. It's just too complicated for one person to fully grasp.And so, what do you do? Well, you basically need some sort of assistance from automation. You have to have some sort of tooling that can take a look at your code that you're writing and say, “Hey Randall, on line 39, when you were writing this function that's taking user data and doing something with it, you forgot to sanitize the user data.” Now, that's a simple example, but let's talk about a more complex example. 
Maybe you're building some authentication software, and you're taking users' passwords, and you're hashing them using a common hashing algorithm.

And maybe the tooling is able to detect, hey, you're using the bcrypt password hashing algorithm with a work factor of ten to create this password hash, but guess what, we're in 2023 and a work factor of ten is something that older commodity CPUs can now crack at a reasonable rate, and so you need to bump that up to 13 or 14. These are the types of things where you need help over time. It's not something that anyone can reasonably assume they can just deal with in their head. The way I like to think about it is, as a developer, regardless of how you're building code, you need some sort of security checks on there to just help you be productive, in all honesty. Like, if you're not doing that, you're just asking for problems.

Corey: Oh, yeah. On some level, even the idea of, “it's just going to be very computationally expensive to wind up figuring out what that password hash is,” well, great, but one of the things that we've been aware of for a while is that given the rise of botnets and compromised computers, the attackers have what amounts to infinite computing capacity, give or take. So, if they want in, on some level, badly enough, they're going to find a way to get in there. When you say that every developer is going to sit down and write insecure code, you're right. And a big part of that is because, as imagined today, security is an incredibly high-friction process, and it's not helped, frankly, by tools that don't have nuance or understanding.

If I want to do a crap ton of busy work that doesn't feel like it moves the needle forward at all, I'll go around resolving the hundreds upon hundreds of Dependabot alerts I have for a lot of my internal services that write my weekly newsletter. Because some dependency three deep winds up having a failure mode when it gets untrusted input of the following type, it can cause resource exhaustion. It runs in a Lambda function, so, one, I don't care about the resources, and two, I'm not here providing the stuff that I write, which is the input, with an idea toward exploiting stuff. So, it's busy work, things I don't need to be aware of. But more to the point, stuff like that has a high propensity to mask things I actually do care about. Getting the signal from the noise from your misconfigured, ill-conceived alerting system is just awful. Like, a bad thing is there are no security things for you to work on, but a worse one is, “Here are 70,000 security things for you to work on.” How do you triage? How do you think about it?

Randall: A hundred percent. I mean, that's actually the most difficult thing, I would say, that security teams have to deal with in the real world. It's not having a tool to help detect issues or trying to get people to fix them. The real issue is, there's always security problems, like you said, right? Like, if you take a look and just scan any codebase out there, any reasonably-sized codebase, you're going to find a ridiculous amount of issues.

Some of those issues will be actual issues, like, you're not doing something in code hygiene that you need to do to protect stuff. A lot of those issues are meaningless things, like you said. You have a transitive dependency that some direct dependency is referring to, and maybe in some function call, there's an issue there, and it's alerting you on it even though you don't even use this function call.
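As an aside on the work-factor example above: the bcrypt cost parameter Randall mentions is an explicit knob in most libraries, and raising it is typically a one-line change. Here is a minimal sketch using the Python bcrypt package; the specific round counts are illustrative, so pick a value based on current hardware guidance rather than this example.

```python
# Hashing a password with an explicit bcrypt work factor (cost). Raising the
# cost makes every hash, and therefore every brute-force guess, more expensive.
import bcrypt


def hash_password(password: str, rounds: int = 13) -> bytes:
    # gensalt(rounds=...) bakes the work factor into the resulting hash
    return bcrypt.hashpw(password.encode("utf-8"), bcrypt.gensalt(rounds=rounds))


def verify_password(password: str, stored_hash: bytes) -> bool:
    # checkpw reads the cost back out of the stored hash, so older hashes still verify
    return bcrypt.checkpw(password.encode("utf-8"), stored_hash)


if __name__ == "__main__":
    stored = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", stored))  # True
```

Because the cost is embedded in each stored hash, existing hashes keep verifying after you raise the default; a common pattern is to re-hash at the new cost the next time a user logs in successfully.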
You're not even touching this class, or this method, or whatever it is. And it wastes a lot of time.And that's why the Holy Grail in the security industry in all honesty is prioritization and insights. At Snyk, we sort of pioneered this concept of ASPM, which stands for Application Security Posture Management. And fundamentally what that means is when you're a security team, and you're scanning code and finding all these issues, how do you prioritize them? Well, there's a couple of approaches. One approach is to use static analysis to try to figure out if these issues that are being detected are reachable, right? Like, can they be achieved in some way, but that's really hard to do statically and there's so many variables that go into it that no one really has foolproof solutions there.The second thing you can do is you can combine insights and heuristics from a lot of different places. So, you can take a look at static code analysis results, and you can combine them with agents running live that are observing your application, and then you can try to determine what stuff is actually reachable given this real world heuristic, and you know, real time information and mapping it up with static code analysis results. And that's really the holy grail of figuring things out. We have an ASPM product—or maybe it's a feature, an offering, if you will, but it's something that Snyk provides, which gives security admins a lot more insight into that type of operation at their business. But you're totally right, Corey, it's a really difficult problem to solve, and it burns a lot of goodwill in the security community and in the industry because people spend a lot of time getting false alerts, going through stuff, and just wasting millions of hours a year, I'm sure.Corey: That's part of the challenge, too, is that it feels like there are two classes of problems in the world, at least when it comes to business. And I found this by being on the wrong side of it, on some level. Here on the wrong side, it's things like caring about cost optimization, it's caring about security, it's remembering to buy fire insurance for your building. You can wind up doing all of those things—and you should be doing them, but you can over-index on them to the point where you run out of money and your business dies. The proactive side of that fence is getting features to market sooner, increasing market share, growing revenue, et cetera, and that's the stuff that people are always going to prioritize over the back burner stuff. So, striking a balance between that is always going to be a bit of a challenge, and where people land on that is going to be tricky.Randall: So, I think this is a really good bridge. You're totally right. It's expensive to waste people's time, basically, is what you're saying, right? You don't want to waste people's time, you want to give them actionable alerts that they can actually fix, or hopefully you fix it for them if you can, right? So, I'm going to lay something out, which is, in our opinion, is the Snyk way, if you will, that you should be approaching these developer security issues.So, let's take a look at two different approaches. The first approach is going to be using an LLM, like, let's say, just ChatGPT. We'll call them out because everyone knows ChatGPT. The first approach we're going to take is—Corey: Although I do insist on pronouncing it Chat-Gippity. But please, continue.Randall: [laugh]. Chat-Gippity. I love that. I haven't heard that before. Chat-Gippity. 
Sounds so much more fun, you know?Corey: It sounds more personable. Yeah.Randall: Yeah. So, you're talking to Chat-Gippity—thank you—and you paste in a file from your codebase, and you say, “Hey, Chat-Gippity. Here's a file from my codebase. Please help me identify security issues in here,” and you get back a long list of recommendations.Corey: Well, it does more than that. Let me just interject there because one of the things it does that I think very few security engineers have mastered is it does it politely and constructively, as opposed to having an unstated tone of, “You dumbass,” which I beli—I've [unintelligible 00:17:24] with prompts on this. You can get it to have a condescending, passive-aggressive tone, but you have to go out of your way to do it, as opposed to it being the default. Please continue.Randall: Great point. Also, Daniel from Unsupervised Learning, by the way, has a really good post where he shows you setting up Chat-Gippity to mimic Scarlett Johansson from the movie Her on your phone so you can talk to it. Absolutely beautiful. And you get these really fun, very nice responses back and forth around your code analysis. So, shout out there.But going back to the point. So, if you get these responses back from Chat-Gippity, and it's like, “Hey look, here's all the security issues,” a lot of those things will be false alerts, and there's been a lot of public security research done on these analysis tools just give you information. A lot of those things will be false alerts, some things will be things that maybe they're a real problem, but cannot be fixed due to transitive dependencies, or whatever the issues are, but there's a lot of things you need to do there. Now, let's take it up one notch, let's say instead of using Chat-Gippity directly, you're using GitHub Copilot. Now, this is a much better situation for working with code because now what Microsoft is doing is let's say you're running Copilot inside of VS Code. It's able to analyze all the files in your codebase, and it's able to use that additional context to help provide you with better information.So, you can talk to GitHub Copilot and say, “Hey, I'd really like to know what security issues are in this file,” and it's going to give you maybe a little bit better answers than ChatGPT directly because it has more context about the other parts of your codebase and can give you slightly better answers. However, because these things are LLMs, you're still going to run into issues with accuracy, and hallucinations, and all sorts of other problems. So, what is the better approach? And I think that's fundamentally what people want to know. Like, what is a good approach here?And on the scanning side, the right approach in my mind is using something very domain specific. Now, what we do at Snyk is we have a symbolic AI scanning engine. So, we take customers' code, and we take an entire codebase so you have access to all the files and dependencies and things like this, and you take a look at these things. And we have a security analyst team that analyzes real-world security issues and fixes that have been validated. 
So, we do this by pulling lots of open-source projects as well as other security information that we originally produced, and we define very specific rules so that we can take a look at software, and we can take a look at these codebases with a very high degree of certainty.

And we can give you a very actionable list of security issues that you need to address, and not only that, we can show you what is going to be the best way to address them. So, with that being said, I think the second side to that is okay, if that's a better approach on the scanning side, maybe you shouldn't be using LLMs for finding issues; maybe you should be using them for fixing security issues, which makes a lot of sense. So, let's say you do it the Snyk way, and you use symbolic AI engines and you sort of find these issues. Maybe you can just take that information then, in combination with your codebase, and fire off a request to an LLM and say, “Hey Chat-Gippity, please take this codebase, and take this security information that we know is accurate, and fix this code for me.” So, now you're going one step further.

Corey: One challenge that I've seen, especially as I've been building weird software projects with the help of magic robots from the future, is that a lot of components, like in React for example, get broken out into their own files. And pasting a file in is all well and good, but very often, it needs insight into the rest of the codebase. At GitHub Universe, something that they announced was Copilot Enterprise, which trains Copilot on the intricacies of your internal structures around shared libraries, all of your code, et cetera. And in some of the companies I'm familiar with, I really believe that's giving a very expensive, smart robot a form of brain damage, but that's neither here nor there. But the idea of seeing the interplay between different components that individual analysis on a per-file basis will miss feels to me like something that needs a more holistic view. Am I wrong on that? Am I oversimplifying?

Randall: You're right. There's two things we need to address. First of all, let's say you have the entire application context—so all the files, right—and then you ask an LLM to create a fix for you. This is something we do at Snyk. We actually use LLMs for this purpose. So, we take this information and we ask the LLM, “Hey, please rewrite this section of code that we know has an issue, given this security information, to remove this problem.” The problem then becomes okay, well, how do you know this fix is accurate and is not going to break people's stuff?

And that's where symbolic AI becomes useful again. Because again, what is the use case for symbolic AI? It's taking very specific domains of things that you've created very specific rule sets for and using them to validate things or to pass arbitrary checks and things like that. And it's a perfect use case for this.
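That generate-then-validate idea is concrete enough to sketch. The outline below shows the general pattern Randall describes, where a language model proposes a patch and a deterministic checker accepts or rejects it, with re-prompting on failure. The helper names (propose_fix, passes_security_rules) are stand-ins invented for the example, not a real vendor API.

```python
# Sketch of a generate-then-validate fix loop: a language model proposes a patch,
# a deterministic rule-based checker verifies it, and we re-prompt on failure.
from typing import Callable, Optional


def fix_with_validation(
    snippet: str,
    finding: str,
    propose_fix: Callable[[str, str, str], str],   # (code, finding, feedback) -> patched code
    passes_security_rules: Callable[[str], bool],  # deterministic symbolic/static check
    max_attempts: int = 5,
) -> Optional[str]:
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose_fix(snippet, finding, feedback)
        if passes_security_rules(candidate):
            return candidate  # only a validated patch is ever returned
        feedback = "previous attempt still failed the rule-based check"
    return None  # give up rather than ship an unverified patch
```

The property that matters is that the non-deterministic component never gets the last word: a patch only ships if the deterministic check passes.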
So, what we actually do with our auto-fix product, so if you're using VS Code and you have Copilot, right, and Copilot's spitting out software, as long as you have Snyk in the IDE, too, we're actually taking a look at those lines of code Copilot just inserted, and a lot of the time, we are helping you rewrite that code to be secured using our LLM stuff, but then as soon as we get that fixed created, we actually run it through our symbolic engine, and if we're saying no, it's actually not fixed, then we go back to the LLM, we re-prompt it over and over again until we get a working solution.And that's essentially how we create a much more sophisticated iteration, if you will, of using AI to really help improve code quality. But all that being said, you still had a good point, which is maybe if you're using the context from the application, and people aren't doing things properly, how does that impact what LLMs are generating for you? And an interesting thing to note is that our security team internally here, just conducted a really interesting project, and I would be angry at myself if I didn't explain it because I think it's a very cool concept.Corey: Oh, please, I'm a big fan of hearing what people get up to with these things in ways that is real-world stories, not trying to sell me anything, or also not dunking on, look what I saw on the top of Hacker News the other day, which is, “If all you're building is something that talks to Chat-Gippity's API, does some custom prompting, and returns a response, you shouldn't be building it.” I'm like, “Well, I built some things that do exactly that.” But I'm also not trying to raise $6 million in seed money to go and productize it. I'm just hoping someone does it better eventually, but I want to use it today. Please tell me a real world story about something that you've done.Randall: Okay. So, here's what we did. We went out and we found a bunch of GitHub projects, and we tried to analyze them ourselves using a bunch of different tools, including human verification, and basically give it a grade and say, “Okay, this project here has really good security hygiene. Like, there's not a lot of issues in the code, things are written in a nice way, the style and formatting is consistent, the dependencies are up-to-date, et cetera.” Then we take a look at multiple GitHub repos that are the opposite of that, right? Like, maybe projects that hadn't been maintained in a long time, or were written in a completely different style where you have bad hygienic practices, maybe you have hard-coded secrets, maybe you have unsanitized input coming from a user or something, right, but you take all these things.So, we have these known examples of good and bad projects. So, what did we do? Well, we opened them up in VS Code, and we basically got GitHub Copilot and we said, “Okay, what we're going to do is use each of these codebases, and we're going to try to add features into the projects one at a time.” And what we did is we took a look at the suggested output that Copilot was giving us in each of these cases. And the interesting thing is that—and I think this is super important to understand about LLMs, right—but the interesting thing is, if we were adding features to a project that has good security hygiene, the types of code that we're able to get out of LLMs, like, GitHub Copilot was pretty good. There weren't a ton of issues with it. 
Like, the actual security hygiene was, like, fairly good.However, for projects where there were existing issues, it was the opposite. Like we'd get AI recommendations showing us how to write things insecurely, or potentially write things with hard-coded secrets in it. And this is something that's very reproducible today in, you know, what is it right now, middle of November 2023. Now, is it going to be this case a year from now? I don't necessarily know, but right now, this is still a massive problem, so that really reinforces the idea that not only when you're talking about LLMs is the training set they used to build the model's important, but also the context in which you're using them is incredibly important.It's very easy to mislead LLMs. Another example of this, if you think about the security scanning concept we talked about earlier, imagine you're talking to Chat-Gippity, and you're [pasting 00:25:58] in a Python function, and the Python function is called, “Completely_safe_not_vulnerable_function.” That's the function name. And inside of that function, you're backdooring some software. Well, if you ask Chat-Gippity multiple times and say, “Hey, the temperature is set to 1.0. Is this code safe?”Sometimes you'll get the answer yes because the context within the request that has that thing saying this is not a vulnerable function or whatever you want to call it, that can mislead the LLM output and result in problems, you know? It's just, like, classic prompt injection type issues. But there's a lot of these types of vulnerabilities still hidden in plain sight that impact all of us, and so it's so important to know that you can't just rely on one thing, you have to have multiple layers: something that helps you with things, but also something that is helping you fix things when needed.Corey: I think that's the key that gets missed a lot is the idea of it's not just what's here, what have you put here that shouldn't be; what have you forgotten? There's a different side of it. It's easy to do a static analysis and say, “Oh, you're not sanitizing your input on this particular form.” Great. Okay—well, I say it's easy. I wish more people would do that—but then there's also a step beyond of, what is it that someone who has expertise who's been down this road before would take one look at your codebase and say, “Are you making this particular misconfiguration or common misstep?”Randall: Yeah, it's incredibly important. You know, like I said, security is just one of those things where it's really broad. I've been working in security for a very long time and I make security mistakes all the time myself.Corey: Yeah. Like, in your developer environment right now, you ran this against the production environment and didn't get permissions errors. That is suspicious. Tell me more about your authentication pattern.Randall: Right. I mean, there's just a ton of issues that can cause problems. And it's… yeah, it is what it is, right? Like, software security is something difficult to achieve. If it wasn't difficult, everyone would be doing it. Now, if you want to talk about, like, vision for the future, actually, I think there's some really interesting things with the direction I see things going.Like, a lot of people have been leaning into the whole AI autonomous agents thing over the last year. 
People started out by taking LLMs and saying, “Okay, I can get it to spit out code, I can get it to spit out this and that.” But then you go one step further and say, “All right, can I get it to write code for me and execute that code?” And OpenAI, to their credit, has done a really good job advancing some of the capabilities here, as well as a lot of open-source frameworks. You have Langchain, and Baby AGI, and AutoGPT, and all these different things that make this more feasible to give AI access to actually do real meaningful things.And I can absolutely imagine a world in the future—maybe it's a couple of years from now—where you have developers writing software, and it could be a real developer, it could be an autonomous agent, whatever it is. And then you also have agents that are taking a look at your software and rewriting it to solve security issues. And I think when people talk about autonomous agents, a lot of the time they're purely focusing on LLMs. I think it's a big mistake. I think one of the most important things you can do is focus on the very niche symbolic AI engines that are going to be needed to guarantee accuracy with these things.And that's why I think the Snyk approach is really cool, you know? We dedicated a huge amount of resources to security analysts building these very in-depth rule sets that are guaranteeing accuracy on results. And I think that's something that the industry is going to shift towards more in the future as LLMs become more popular, which is, “Hey, you have all these great tools, doing all sorts of cool stuff. Now, let's clean it up and make it accurate.” And I think that's where we're headed in the next couple of years.Corey: I really hope you're right. I think it's exciting times, but I also am leery when companies go too far into boosterism where, “Robots are going to do all of these things for us.” Maybe, but even if you're right, you sound psychotic. And that's something that I think gets missed in an awful lot of the marketing that is so breathless with anticipation. I have to congratulate you folks on not getting that draped over your message, once again.My other favorite part of your messaging when you pull up snyk.com—sorry, snyk.io. What is it these days? It's the dot io, isn't it?Randall: Dot io. It's hot.Corey: Dot io, yes.Randall: Still hot, you know?Corey: I feel like I'm turning into a boomer here where, “The internet is dot com.”Randall: [laugh].Corey: Doesn't necessarily work that way. But no, what I love is the part where you have this fear-based marketing of if you wind up not using our product, here are all the terrible things that will happen. And my favorite part about that marketing is it doesn't freaking exist. It is such a refreshing departure from so much of the security industry, where it does the fear, uncertainty, and doubt nonsense stuff that I love that you don't even hint in that direction. My actual favorite thing that is on your page, of course, is at the bottom. If you mouse over the dog in the logo at the bottom of the page, it does the quizzical tilting head thing, and I just think that is spectacular.Randall: So, the Snyk mascot, his name is Pat. He's a Doberman and everyone loves him. But yeah, you're totally right. The FUD thing is a real issue in security. Fear, uncertainty, and doubt, it's the way security companies sell products to people. 
And I think it's a real shame, you know?

I give a lot of tech talks, at programming conferences in particular, around security and cryptography, and one of the things I always start out with when I'm giving a tech talk about any sort of security or cryptography topic is I say, “Okay, how many of you have landed in a Stack Overflow thread where you're talking about a security topic and someone replies and says, ‘oh, a professional should be doing this. You shouldn't be doing it yourself?'” That comes up all the time when you're looking at security topics on the internet. Then I ask people, “How many of you feel like security is this, sort of like, obscure, mystical art that requires a lot of expertise and math knowledge, and all this stuff?” And a lot of people sort of have that impression.

The reality though is security, and to some extent, cryptography, is just like any other part of computer science. It's something that you can learn. There's best practices. It's not rocket science, you know? Maybe it is if you're developing a brand-new hashing algorithm from scratch; yes, leave that to the professionals. But using these things is something everyone needs to understand well, and there's tons of material out there explaining how to do things right. And you don't need to be afraid of this stuff, right?

And so, I think, a big part of the Snyk message is, we just want to help developers make their code better. And what is one way that you're going to do a better job at work, get more of your code through the PR review process? What is a way you're going to get more features out? A big part of that is just building things right from the start. And so, that's really our focus and our message: “Hey developers, we want to be, like, a trusted partner to help you build things faster and better.” [laugh].

Corey: It's nice to see it, just because there's so much that just doesn't work out the way that we otherwise hope it would. And historically, there's been a tremendous problem of differentiation in the security space. I often remark that at RSA, there's about 12 companies exhibiting. Now sure, there are hundreds of booths, but it's basically the same 12 things. There's, you know, the entire row of firewalls where they use different logos and different marketing words on the slides, but they're all selling fundamentally the same thing. One of the things I've always appreciated about Snyk is it has never felt that way.

Randall: Well, thanks. Yeah, we appreciate that. I mean, our whole focus is just developer security. What can we do to help developers build things securely?

Corey: I mean, you are sponsoring this episode, let's be clear, but also, we are paying customers of you folks, and that is not—those things are not related in any way. What's the line that we like to use that we stole from the RedMonk folks? “You can buy our attention, but not our opinion.” And our opinion of what you folks are up to has been stratospherically high for a long time.

Randall: Well, I certainly appreciate that as a Snyk employee who is also a happy user of the service. The way I actually ended up working at Snyk was, I'd been using the product for my open-source projects for years, and I legitimately really liked it and I thought this was cool. And yeah, I eventually ended up working here because there was a position, and you know, a friend reached out to me and stuff. But I am a genuinely happy user and just like the goal and the mission.
Like, we want to make developers' lives better, and so it's super important.Corey: I really want to thank you for taking the time to speak with me about all this. If people want to learn more, where's the best place for them to go?Randall: Yeah, thanks for having me. If you want to learn more about AI or just developer security in general, go to snyk.io. That's S-N-Y-K—in case it's not clear—dot io. In particular, I would actually go check out our [Snyk Learn 00:34:16] platform, which is linked to from our main site. We have tons of free security lessons on there, showing you all sorts of really cool things. If you check out our blog, my team and I in particular also do a ton of writing on there about a lot of these bleeding-edge topics, and so if you want to keep up with cool research in the security space like this, just check it out, give it a read. Subscribe to the RSS feed if you want to. It's fun.Corey: And we will put links to that in the [show notes 00:34:39]. Thanks once again for your support, and of course, putting up with my slings and arrows.Randall: And thanks for having me on, and thanks for using Snyk, too. We love you [laugh].Corey: Randall Degges, Head of Developer Relations and Community at Snyk. This featured guest episode has been brought to us by our friends at Snyk, and I'm Corey Quinn. If you've enjoyed this episode, please leave a five-star review on your podcast platform of choice, whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice, along with an angry comment that I will get to reading immediately. You can get me to read it even faster if you make sure your username is set to ‘Dependabot.'Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Chronosphere on Crafting a Cloud-Native Observability Strategy with Rachel Dines

Screaming in the Cloud

Play Episode Listen Later Nov 28, 2023 29:41


Rachel Dines, Head of Product and Technical Marketing at Chronosphere, joins Corey on Screaming in the Cloud to discuss why creating a cloud-native observability strategy is so critical, and the challenges that come with both defining and accomplishing that strategy to fit your current and future observability needs. Rachel explains how Chronosphere is taking an open-source approach to observability, and why it's more important than ever to acknowledge that the stakes and costs are much higher when it comes to observability in the cloud. About RachelRachel leads product and technical marketing for Chronosphere. Previously, Rachel wore lots of marketing hats at CloudHealth (acquired by VMware), and before that, she led product marketing for cloud-integrated storage at NetApp. She also spent many years as an analyst at Forrester Research. Outside of work, Rachel tries to keep up with her young son and hyper-active dog, and when she has time, enjoys crafting and eating out at local restaurants in Boston where she's based.Links Referenced: Chronosphere: https://chronosphere.io/ LinkedIn: https://www.linkedin.com/in/rdines/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's featured guest episode is brought to us by our friends at Chronosphere, and they have also brought us Rachel Dines, their Head of Product and Solutions Marketing. Rachel, great to talk to you again.Rachel: Hi, Corey. Yeah, great to talk to you, too.Corey: Watching your trajectory has been really interesting, just because starting off, when we first started, I guess, learning who each other were, you were working at CloudHealth which has since become VMware. And I was trying to figure out, huh, the cloud runs on money. How about that? It feels like it was a thousand years ago, but neither one of us is quite that old.Rachel: It does feel like several lifetimes ago. You were just this snarky guy with a few followers on Twitter, and I was trying to figure out what you were doing mucking around with my customers [laugh]. Then [laugh] we kind of both figured out what we're doing, right?Corey: So, speaking of that iterative process, today, you are at Chronosphere, which is an observability company. We would have called it a monitoring company five years ago, but now that's become an insult after the observability war dust has settled. So, I want to talk to you about something that I've been kicking around for a while because I feel like there's a gap somewhere. Let's say that I build a crappy web app—because all of my web apps inherently are crappy—and it makes money through some mystical form of alchemy. And I have a bunch of users, and I eventually realize, huh, I should probably have a better observability story than waiting for the phone to ring and a customer telling me it's broken.So, I start instrumenting various aspects of it that seem to make sense. Maybe I go too low level, like looking at all the discs on every server to tell me if they're getting full or not, like their ancient servers. Maybe I just have a Pingdom equivalent of is the website up enough to respond to a packet? 
And as I wind up experiencing different failure modes and getting yelled at by different constituencies—in my own career trajectory, my own boss—you start instrumenting for all those different kinds of breakages, you start aggregating the logs somewhere and the volume gets bigger and bigger with time. But it feels like it's sort of a reactive process as you stumble through that entire environment.And I know it's not just me because I've seen this unfold in similar ways in a bunch of different companies. It feels to me, very strongly, like it is something that happens to you, rather than something you set about from day one with a strategy in mind. What's your take on an effective way to think about strategy when it comes to observability?Rachel: You just nailed it. That's exactly the kind of progression that we so often see. And that's what I really was excited to talk with you about today—Corey: Oh, thank God. I was worried for a minute there that you'd be like, “What the hell are you talking about? Are you just, like, some sort of crap engineer?” And, “Yes, but it's mean of you to say it.” But yeah, what I'm trying to figure out is there some magic that I just was never connecting? Because it always feels like you're in trouble because the site's always broken, and oh, like, if the disk fills up, yeah, oh, now we're going to start monitoring to make sure the disk doesn't fill up. Then you wind up getting barraged with alerts, and no one wins, and it's an uncomfortable period of time.Rachel: Uncomfortable period of time. That is one very polite way to put it. I mean, I will say, it is very rare to find a company that actually sits down and thinks, “This is our observability strategy. This is what we want to get out of observability.” Like, you can think about a strategy and, like, the old school sense, and you know, as an industry analyst, so I'm going to have to go back to, like, my roots at Forrester with thinking about, like, the people, and the process, and the technology.But really what the bigger component here is like, what's the business impact? What do you want to get out of your observability platform? What are you trying to achieve? And a lot of the time, people have thought, “Oh, observability strategy. Great, I'm just going to buy a tool. That's it. Like, that's my strategy.”And I hate to bring it to you, but buying tools is not a strategy. I'm not going to say, like, buy this tool. I'm not even going to say, “Buy Chronosphere.” That's not a strategy. Well, you should buy Chronosphere. But that's not a strategy.Corey: Of course. I'm going to throw the money by the wheelbarrow at various observability vendors, and hope it solves my problem. But if that solved the problem—I've got to be direct—I've never spoken to those customers.Rachel: Exactly. I mean, that's why this space is such a great one to come in and be very disruptive in. And I think, back in the days when we were running in data centers, maybe even before virtual machines, you could probably get away with not having a monitoring strategy—I'm not going to call it observability; it's not we call the back then—you could get away with not having a strategy because what was the worst that was going to happen, right? It wasn't like there was a finite amount that your monitoring bill could be, there was a finite amount that your customer impact could be. Like, you're paying the penny slots, right?We're not on the penny slots anymore. 
We're in the $50 craps table, and it's Las Vegas, and if you lose the game, you're going to have to run down the street without your shirt. Like, the game and the stakes have changed, and we're still pretending like we're playing penny slots, and we're not anymore.Corey: That's a good way of framing it. I mean, I still remember some of my biggest observability challenges were building highly available rsyslog clusters so that you could bounce a member and not lose any log data because some of that was transactionally important. And we've gone beyond that to a stupendous degree, but it still feels like you don't wind up building this into the application from day one. More's the pity because if you did, and did that intelligently, that opens up a whole world of possibilities. I dream of that changing where one day, whenever you start to build an app, oh, and we just push the button and automatically instrument with OTel, so you instrument the thing once everywhere it makes sense to do it, and then you can do your vendor selection and what you said were decisions later in time. But these days, we're not there.Rachel: Well, I mean, and there's also the question of just the legacy environment and the tech debt. Even if you wanted to, the—actually I was having a beer yesterday with a friend who's a VP of Engineering, and he's got his new environment that they're building with observability instrumented from the start. How beautiful. They've got OTel, they're going to have tracing. And then he's got his legacy environment, which is a hot mess.So, you know, there's always going to be this bridge of the old and the new. But this was where it comes back to no matter where you're at, you can stop and think, like, “What are we doing and why?” What is the cost of this? And not just cost in dollars, which I know you and I could talk about very deeply for a long period of time, but like, the opportunity costs. Developers are working on stuff that they could be working on something that's more valuable.Or like the cost of making people work round the clock, trying to troubleshoot issues when there could be an easier way. So, I think it's like stepping back and thinking about cost in terms of dollar sense, time, opportunity, and then also impact, and starting to make some decisions about what you're going to do in the future that's different. Once again, you might be stuck with some legacy stuff that you can't really change that much, but [laugh] you got to be realistic about where you're at.Corey: I think that that is a… it's a hard lesson to be very direct, in that, companies need to learn it the hard way, for better or worse. Honestly, this is one of the things that I always noticed in startup land, where you had a whole bunch of, frankly, relatively early-career engineers in their early-20s, if not younger. But then the ops person was always significantly older because the thing you actually want to hear from your ops person, regardless of how you slice it, is, “Oh, yeah, I've seen this kind of problem before. Here's how we fixed it.” Or even better, “Here's the thing we're doing, and I know how that's going to become a problem. 
Let's fix it before it does.” It's the, “What are you buying by bringing that person in?” “Experience, mostly.”Rachel: Yeah, that's an interesting point you make, and it kind of leads me down this little bit of a side note, but a really interesting antipattern that I've been seeing in a lot of companies is that more seasoned ops person, they're the one who everyone calls when something goes wrong. Like, they're the one who, like, “Oh, my God, I don't know how to fix it. This is a big hairy problem,” I call that one ops person, or I call that very experienced person. That experience person then becomes this huge bottleneck into solving problems that people don't really—they might even be the only one who knows how to use the observability tool. So, if we can't find a way to democratize our observability tooling a little bit more so, like, just day-to-day engineers, like, more junior engineers, newer ones, people who are still ramping, can actually use the tool and be successful, we're going to have a big problem when these ops people walk out the door, maybe they retire, maybe they just get sick of it. We have these massive bottlenecks in organizations, whether it's ops or DevOps or whatever, that I see often exacerbated by observability tools. Just a side note.Corey: Yeah. On some level, it feels like a lot of these things can be fixed with tooling. And I'm not going to say that tools aren't important. You ever tried to implement observability by hand? It doesn't work. There have to be computers somewhere in the loop, if nothing else.And then it just seems to devolve into a giant swamp of different companies, doing different things, taking different approaches. And, on some level, whenever you read the marketing or hear the stories any of these companies tell you also to normalize it from translating from whatever marketing language they've got into something that comports with the reality of your own environment and seeing if they align. And that feels like it is so much easier said than done.Rachel: This is a noisy space, that is for sure. And you know, I think we could go out to ten people right now and ask those ten people to define observability, and we would come back with ten different definitions. And then if you throw a marketing person in the mix, right—guilty as charged, and I know you're a marketing person, too, Corey, so you got to take some of the blame—it gets mucky, right? But like I said a minute ago, the answer is not tools. Tools can be part of the strategy, but if you're just thinking, “I'm going to buy a tool and that's going to solve my problem,” you're going to end up like this company I was talking to recently that has 25 different observability tools.And not only do they have 25 different observability tools, what's worse is they have 25 different definitions for their SLOs and 25 different names for the same metric. And to be honest, it's just a mess. I'm not saying, like, go be Draconian and, you know, tell all the engineers, like, “You can only use this tool [unintelligible 00:10:34] use that tool,” you got to figure out this kind of balance of, like, hands-on, hands-off, you know? How much do you centralize, how much do you push and standardize? Otherwise, you end up with just a huge mess.Corey: On some level, it feels like it was easier back in the days of building it yourself with Nagios because there's only one answer, and it sucks, unless you want to start going down the world of HP OpenView. Which step one: hire a 50-person team to manage OpenView. 
Okay, that's not going to solve my problem either. So, let's get a little more specific. How does Chronosphere approach this?Because historically, when I've spoken to folks at Chronosphere, there isn't that much of a day one story, in that, “I'm going to build a crappy web app. Let's instrument it for Chronosphere.” There's a certain, “You must be at least this tall to ride,” implicit expectation built into the product just based upon its origins. And I'm not saying that doesn't make sense, but it also means there's really no such thing as a greenfield build out for you either.Rachel: Well, yes and no. I mean, I think there's no green fields out there because everyone's doing something for observability, or monitoring, or whatever you want to call it, right? Whether they've got Nagios, whether they've got the Dog, whether they've got something else in there, they have some way of introspecting their systems, right? So, one of the things that Chronosphere is built on, that I actually think this is part of something—a way you might think about building out an observability strategy as well, is this concept of control and open-source compatibility. So, we only can collect data via open-source standards. You have to send this data via Prometheus, via Open Telemetry, it could be older standards, like, you know, statsd, Graphite, but we don't have any proprietary instrumentation.And if I was making a recommendation to somebody building out their observability strategy right now, I would say open, open, open, all day long because that gives you a huge amount of flexibility in the future. Because guess what? You know, you might put together an observability strategy that seems like it makes sense for right now—actually, I was talking to a B2B SaaS company that told me that they made a choice a couple of years ago on an observability tool. It seemed like the right choice at the time. They were growing so fast, they very quickly realized it was a terrible choice.But now, it's going to be really hard for them to migrate because it's all based on proprietary standards. Now, of course, a few years ago, they didn't have the luxury of Open Telemetry and all of these, but now that we have this, we can use these to kind of future-proof our mistakes. So, that's one big area that, once again, both my recommendation and happens to be our approach at Chronosphere.Corey: I think that that's a fair way of viewing it. It's a constant challenge, too, just because increasingly—you mentioned the Dog earlier, for example—I will say that for years, I have been asked whether or not at The Duckbill Group, we look at Azure bills or GCP bills. Nope, we are pure AWS. Recently, we started to hear that same inquiry specifically around Datadog, to the point where it has become a board-level concern at very large companies. And that is a challenge, on some level.I don't deviate from my typical path of I fix AWS bills, and that's enough impossible problems for one lifetime, but there is a strong sense of you want to record as much as possible for a variety of excellent reasons, but there's an implicit cost to doing that, and in many cases, the cost of observability becomes a massive contributor to the overall cost. Netflix has said in talks before that they're effectively an observability company that also happens to stream movies, just because it takes so much effort, engineering, and raw computing resources in order to get that data do something actionable with it. 
It's a hard problem.Rachel: It's a huge problem, and it's a big part of why I work at Chronosphere, to be honest. Because when I was—you know, towards the tail end at my previous company in cloud cost management, I had a lot of customers coming to me saying, “Hey, when are you going to tackle our Dog or our New Relic or whatever?” Similar to the experience you're having now, Corey, this was happening to me three, four years ago. And I noticed that there is definitely a correlation between people who are having these really big challenges with their observability bills and people that were adopting, like Kubernetes, and microservices and cloud-native. And it was around that time that I met the Chronosphere team, which is exactly what we do, right? We focus on observability for these cloud-native environments where observability data just goes, like, wild.We see 10X 20X as much observability data and that's what's driving up these costs. And yeah, it is becoming a board-level concern. I mean, and coming back to the concept of strategy, like if observability is the second or third most expensive item in your engineering bill—like, obviously, cloud infrastructure, number one—number two and number three is probably observability. How can you not have a strategy for that? How can this be something the board asks you about, and you're like, “What are we trying to get out of this? What's our purpose?” “Uhhhh… troubleshooting?”Corey: Right because it turns into business metrics as well. It's not just about is the site up or not. There's a—like, one of the things that always drove me nuts not just in the observability space, but even in cloud costing is where, okay, your costs have gone up this week so you get a frowny face, or it's in red, like traffic light coloring. Cool, but for a lot of architectures and a lot of customers, that's because you're doing a lot more volume. That translates directly into increased revenues, increased things you care about. You don't have the position or the context to say, “That's good,” or, “That's bad.” It simply is. And you can start deriving business insight from that. And I think that is the real observability story that I think has largely gone untold at tech conferences, at least.Rachel: It's so right. I mean, spending more on something is not inherently bad if you're getting more value out of it. And it definitely a challenge on the cloud cost management side. “My costs are going up, but my revenue is going up a lot faster, so I'm okay.” And I think some of the plays, like you know, we put observability in this box of, like, it's for low-level troubleshooting, but really, if you step back and think about it, there's a lot of larger, bigger picture initiatives that observability can contribute to in an org, like digital transformation. I know that's a buzzword, but, like that is a legit thing that a lot of CTOs are out there thinking about. Like, how do we, you know, get out of the tech debt world, and how do we get into cloud-native?Maybe it's developer efficiency. God, there's a lot of people talking about developer efficiency. Last week at KubeCon, that was one of the big, big topics. I mean, and yeah, what [laugh] what about cost savings? To me, we've put observability in a smaller box, and it needs to bust out.And I see this also in our customer base, you know? Customers like DoorDash use observability, not just to look at their infrastructure and their applications, but also look at their business. 
At any given minute, they know how many Dashers are on the road, how many orders are being placed, cut by geos, down to the—actually down to the second, and they can use that to make decisions.Corey: This is one of those things that I always found a little strange coming from the world of running systems in large [unintelligible 00:17:28] environments to fixing AWS bills. There's nothing that even resembles a fast, reactive response in the world of AWS billing. You wind up with a runaway bill, they're going to resolve that over a period of weeks, on Seattle business hours. If you wind up spinning something up that creates a whole bunch of very expensive drivers behind your bill, it's going to take three days, in most cases, before that starts showing up anywhere that you can reasonably expect to get at it. The idea of near real time is a lie unless you want to start instrumenting everything that you're doing to trap the calls and then run cost extrapolation from there. That's hard to do.Observability is a very different story, where latencies start to matter, where being able to get leading indicators of certain events—be a technical or business—start to be very important. But it seems like it's so hard to wind up getting there from where most people are. Because I know we like to talk dismissively about the past, but let's face it, conference-ware is the stuff we're the proudest of. The reality is the burning dumpster of regret in our data centers that still also drives giant piles of revenue, so you can't turn it off, nor would you want to, but you feel bad about it as a result. It just feels like it's such a big leap.Rachel: It is a big leap. And I think the very first step I would say is trying to get to this point of clarity and being honest with yourself about where you're at and where you want to be. And sometimes not making a choice is a choice, right, as well. So, sticking with the status quo is making a choice. And so, like, as we get into things like the holiday season right now, and I know there's going to be people that are on-call 24/7 during the holidays, potentially, to keep something that's just duct-taped together barely up and running, I'm making a choice; you're make a choice to do that. So, I think that's like the first step is the kind of… at least acknowledging where you're at, where you want to be, and if you're not going to make a change, just understanding the cost and being realistic about it.Corey: Yeah, being realistic, I think, is one of the hardest challenges because it's easy to wind up going for the aspirational story of, “In the future when everything's great.” Like, “Okay, cool. I appreciate the need to plant that flag on the hill somewhere. What's the next step? What can we get done by the end of this week that materially improves us from where we started the week?” And I think that with the aspirational conference-ware stories, it's hard to break that down into things that are actionable, that don't feel like they're going to be an interminable slog across your entire existing environment.Rachel: No, I get it. And for things like, you know, instrumenting and adding tracing and adding OTEL, a lot of the time, the return that you get on that investment is… it's not quite like, “I put a dollar in, I get a dollar out,” I mean, something like tracing, you can't get to 60% instrumentation and get 60% of the value. You need to be able to get to, like, 80, 90%, and then you'll get a huge amount of value. 
So, it's sort of like you're trudging up this hill, you're charging up this hill, and then finally you get to the plateau, and it's beautiful. But that hill is steep, and it's long, and it's not pretty. And I don't know what to say other than there's a plateau near the top. And those companies that do this well really get a ton of value out of it. And that's the dream, that we want to help customers get up that hill. But yeah, I'm not going to lie, the hill can be steep.Corey: One thing that I find interesting is there's almost a bimodal distribution in companies that I talk to. On the one side, you have companies like, I don't know, a Chronosphere is a good example of this. Presumably you have a cloud bill somewhere and the majority of your cloud spend will be on what amounts to a single application, probably in your case called, I don't know, Chronosphere. It shares the name of the company. The other side of that distribution is the large enterprise conglomerates where they're spending, I don't know, $400 million a year on cloud, but their largest workload is 3 million bucks, and it's just a very long tail of a whole bunch of different workloads, applications, teams, et cetera.So, what I'm curious about from the Chronosphere perspective—or the product you have, not the ‘you' in this metaphor, which gets confusing—is, it feels easier to instrument a Chronosphere-like company that has a primary workload that is the massive driver of most things and get that instrumented and start getting an observability story around that than it does to try and go to a giant company and, “Okay, 1500 teams need to all implement this thing that are all going in different directions.” How do you see it playing out among your customer base, if that bimodal distribution holds up in your world?Rachel: It does and it doesn't. So, first of all, for a lot of our customers, we often start with metrics. And starting with metrics means Prometheus. And Prometheus has hundreds of exporters. It is basically built into Kubernetes. So, if you're running Kubernetes, getting Prometheus metrics out, actually not a very big lift. So, we find that we start with Prometheus, we start with getting metrics in, and we can get a lot—I mean, customers—we have a lot of customers that use us just for metrics, and they get a massive amount of value.But then once they're ready, they can start instrumenting for OTEL and start getting traces in as well. And yeah, in large organizations, it does tend to be one team, one application, one service, one department that kind of goes at it and gets all that instrumented. But I've even seen very large organizations, when they get their act together and decide, like, “No, we're doing this,” they can get OTel instrumented fairly quickly. So, I guess it's, like, a lining up. It's more of a people issue than a technical issue a lot of the time.Like, getting everyone lined up and making sure that like, yes, we all agree. We're on board. We're going to do this. But it's usually, like, it's a start small, and it doesn't have to be all or nothing. We also just recently added the ability to ingest events, which is actually a really beautiful thing, and it's very, very straightforward.It basically just—we connect to your existing other DevOps tools, so whether it's, like, a Buildkite, or a GitHub, or, like, a LaunchDarkly, and then anytime something happens in one of those tools, that gets registered as an event in Chronosphere. And then we overlay those events over your alerts. 
So, when an alert fires, then first thing I do is I go look at the alert page, and it says, “Hey, someone did a deploy five minutes ago,” or, “There was a feature flag flipped three minutes ago,” I solved the problem right then. I don't think of this as—there's not an all or nothing nature to any of this stuff. Yes, tracing is a little bit of a—you know, like I said, it's one of those things where you have to make a lot of investment before you get a big reward, but that's not the case in all areas of observability.Corey: Yeah. I would agree. Do you find that there's a significant easy, early win when customers start adopting Chronosphere? Because one of the problems that I've found, especially with things that are holistic, and as you talk about tracing, well, you need to get to a certain point of coverage before you see value. But human psychology being what it is, you kind of want to be able to demonstrate, oh, see, the Meantime To Dopamine needs to come down, to borrow an old phrase. Do you find that some of there's some easy wins that start to help people to see the light? Because otherwise, it just feels like a whole bunch of work for no discernible benefit to them.Rachel: Yeah, at least for the Chronosphere customer base, one of the areas where we're seeing a lot of traction this year is in optimizing the costs, like, coming back to the cost story of their overall observability bill. So, we have this concept of the control plane in our product where all the data that we ingest hits the control plane. At that point, that customer can look at the data, analyze it, and decide this is useful, this is not useful. And actually, not just decide that, but we show them what's useful, what's not useful. What's being used, what's high cardinality, but—and high cost, but maybe no one's touched it.And then we can make decisions around aggregating it, dropping it, combining it, doing all sorts of fancy things, changing the—you know, downsampling it. We can do this, on the trace side, we can do it both head based and tail based. On the metrics side, it's as it hits the control plane and then streams out. And then they only pay for the data that we store. So typically, customers are—they come on board and immediately reduce their observability dataset by 60%. Like, that's just straight up, that's the average.And we've seen some customers get really aggressive, get up to, like, in the 90s, where they realize we're only using 10% of this data. Let's get rid of the rest of it. We're not going to pay for it. So, paying a lot less helps in a lot of ways. It also helps companies get more coverage of their observability. It also helps customers get more coverage of their overall stack. So, I was talking recently with an autonomous vehicle driving company that recently came to us from the Dog, and they had made some really tough choices and were no longer monitoring their pre-prod environments at all because they just couldn't afford to do it anymore. It's like, well, now they can, and we're still saving the money.Corey: I think that there's also the downstream effect of the money saving to that, for example, I don't fix observability bills directly. But, “Huh, why is your CloudWatch bill through the roof?” Or data egress charges in some cases? It's oh because your observability vendor is pounding the crap out of those endpoints and pulling all your log data across the internet, et cetera. 
And that tends to mean, oh, yeah, it's not just the first-order effect; it's the second and third and fourth-order effects this winds up having. It becomes almost a holistic challenge. I think that trying to put observability in its own bucket, on some level—when you're looking at it from a cost perspective—starts to be a, I guess, a structure that makes less and less sense in the fullness of time.Rachel: Yeah, I would agree with that. I think that just looking at the bill from your vendor is one very small piece of the overall cost you're incurring. I mean, all of the things you mentioned, the egress, the CloudWatch, the other services, it's impacting, what about the people?Corey: Yeah, it sure is great that your team works for free.Rachel: [laugh]. Exactly, right? I know, and it makes me think a little bit about that viral story about that particular company with a certain vendor that had a $65 million per year observability bill. And that impacted not just them, but, like, it showed up in both vendors' financial filings. Like, how did you get there? How did you get to that point? And I think this all comes back to the value in the ROI equation. Yes, we can all sit in our armchairs and be like, “Well, that was dumb,” but I know there are very smart people out there that just got into a bad situation by kicking the can down the road on not thinking about the strategy.Corey: Absolutely. I really want to thank you for taking the time to speak with me about, I guess, the bigger picture questions rather than the nuts and bolts of a product. I like understanding the overall view that drives a lot of these things. I don't feel I get to have enough of those conversations some weeks, so thank you for humoring me. If people want to learn more, where's the best place for them to go?Rachel: So, they should definitely check out the Chronosphere website. Brand new beautiful spankin' new website: chronosphere.io. And you can also find me on LinkedIn. I'm not really on the Twitters so much anymore, but I'd love to chat with you on LinkedIn and hear what you have to say.Corey: And we will, of course, put links to all of that in the [show notes 00:28:26]. Thank you so much for taking the time to speak with me. It's appreciated.Rachel: Thank you, Corey. Always fun.Corey: Rachel Dines, Head of Product and Solutions Marketing at Chronosphere. This has been a featured guest episode brought to us by our friends at Chronosphere, and I'm Corey Quinn. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry and insulting comment that I will one day read once I finished building my highly available rsyslog system to consume it with.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.
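To make Corey's "instrument once with OTel, pick the vendor later" wish concrete, here is a minimal Python sketch. It assumes the opentelemetry-sdk and OTLP gRPC exporter packages are installed; the service name, span attributes, and collector endpoint are placeholders rather than anything from the episode.

```python
# Minimal OpenTelemetry setup: instrument the code once, then point the OTLP
# exporter at whichever backend you choose later; the instrumentation itself
# does not change when the vendor does.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider(resource=Resource.create({"service.name": "checkout"}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317"))  # placeholder collector
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)

def place_order(order_id: str) -> None:
    # One span per unit of work; attributes become the fields you query on later.
    with tracer.start_as_current_span("place_order") as span:
        span.set_attribute("order.id", order_id)
        # ... business logic ...

place_order("o-1234")
```

Swapping backends then becomes a change to the exporter endpoint, not to the application code.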
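Rachel's "open, open, open" advice, ingesting only via open standards such as Prometheus and OpenTelemetry, is straightforward to follow on the application side. Here is a small sketch with the official Prometheus Python client; the metric and route names are invented, and any Prometheus-compatible backend can scrape the endpoint it exposes.

```python
# Expose metrics in the open Prometheus exposition format; any compatible
# backend can scrape them without application changes.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("http_requests_total", "Total HTTP requests", ["route", "status"])
LATENCY = Histogram("http_request_duration_seconds", "Request latency in seconds", ["route"])

def handle_request(route: str) -> None:
    with LATENCY.labels(route=route).time():
        time.sleep(random.uniform(0.01, 0.05))  # stand-in for real work
    REQUESTS.labels(route=route, status="200").inc()

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        handle_request("/checkout")
```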
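The "shape the data before you pay to store it" idea, dropping labels nobody queries and aggregating what remains, is what drives the series reductions Rachel describes. The toy sketch below illustrates only the concept in plain Python; it is not Chronosphere's control plane, and the sample series are invented.

```python
# Toy version of "aggregate before you store": drop a high-cardinality label
# (pod name) and sum the remaining series, shrinking what gets persisted and billed.
from collections import defaultdict

samples = [
    {"metric": "http_requests_total", "labels": {"service": "checkout", "pod": "checkout-7f9c", "status": "200"}, "value": 41},
    {"metric": "http_requests_total", "labels": {"service": "checkout", "pod": "checkout-b21d", "status": "200"}, "value": 57},
    {"metric": "http_requests_total", "labels": {"service": "checkout", "pod": "checkout-7f9c", "status": "500"}, "value": 3},
]

DROP_LABELS = {"pod"}  # nobody queries by pod here, so stop paying to store it

aggregated = defaultdict(float)
for sample in samples:
    kept = tuple(sorted((k, v) for k, v in sample["labels"].items() if k not in DROP_LABELS))
    aggregated[(sample["metric"], kept)] += sample["value"]

for (metric, labels), value in aggregated.items():
    print(metric, dict(labels), value)  # three input series become two stored series
```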
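The event overlay Rachel mentions, where a firing alert shows "someone did a deploy five minutes ago," reduces to correlating change events with alert timestamps. This is a toy illustration of that idea only; the sources and timestamps are made up and no vendor API is implied.

```python
# Toy event overlay: when an alert fires, list the deploys and feature-flag flips
# that happened in the minutes leading up to it.
from datetime import datetime, timedelta

change_events = [  # normally ingested from CI/CD, GitHub, feature-flag webhooks, and so on
    {"ts": datetime(2023, 11, 20, 14, 1), "source": "buildkite", "detail": "deploy checkout v142"},
    {"ts": datetime(2023, 11, 20, 14, 3), "source": "launchdarkly", "detail": "flag new-pricing set to ON"},
]

def recent_changes(alert_time: datetime, window_minutes: int = 10):
    cutoff = alert_time - timedelta(minutes=window_minutes)
    return [e for e in change_events if cutoff <= e["ts"] <= alert_time]

alert_fired_at = datetime(2023, 11, 20, 14, 5)
for event in recent_changes(alert_fired_at):
    print(f"{event['ts']:%H:%M} {event['source']}: {event['detail']}")
```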
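On the billing side, Corey's point that AWS cost data is nowhere near real time is easy to see by pulling it yourself. Below is a hedged sketch using boto3's Cost Explorer client; it assumes credentials are already configured, and the most recent day or two of results will usually still be filling in.

```python
# Pull the last week of daily unblended cost from Cost Explorer. The data itself
# lags by a day or more, so treat this as trend analysis, not real-time detection.
from datetime import date, timedelta

import boto3

ce = boto3.client("ce")
end = date.today()
start = end - timedelta(days=7)

response = ce.get_cost_and_usage(
    TimePeriod={"Start": start.isoformat(), "End": end.isoformat()},
    Granularity="DAILY",
    Metrics=["UnblendedCost"],
)
for day in response["ResultsByTime"]:
    amount = float(day["Total"]["UnblendedCost"]["Amount"])
    print(day["TimePeriod"]["Start"], f"${amount:,.2f}")
```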

Screaming in the Cloud
Use Cases for Couchbase's New Columnar Data Stores with Jeff Morris

Screaming in the Cloud

Play Episode Listen Later Nov 27, 2023 30:22


Jeff Morris, VP of Product & Solutions Marketing at Couchbase, joins Corey on Screaming in the Cloud to discuss Couchbase's new columnar data store functionality, specific use cases for columnar data stores, and why AI gets better when it communicates with a cleaner pool of data. Jeff shares how more responsive databases could allow businesses like Dominos and United Airlines to create hyper-personalized experiences for their customers by utilizing more responsive databases. Jeff dives into the linked future of AI and data, and Corey learns about Couchbase's plans for the re:Invent conference. If you're attending re:Invent, you can visit Couchbase at booth 1095.About JeffJeff Morris is VP Product & Solutions Marketing at Couchbase (NASDAQ: BASE), a cloud database platform company that 30% of the Fortune 100 depend on.Links Referenced:Couchbase: https://www.couchbase.com/TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode of Screaming in the Cloud is brought to us by our friends at Couchbase. Also brought to us by Couchbase is today's victim, for lack of a better term. Jeff Morris is their VP of Product and Solutions Marketing. Jeff, thank you for joining me.Jeff: Thanks for having me, Corey, even though I guess I paid for it.Corey: Exactly. It's always great to say thank you when people give you things. I learned this from a very early age, and the only people who didn't were rude children and turned into worse adults.Jeff: Exactly.Corey: So, you are effectively announcing something new today, and I always get worried when a database company says that because sometimes it's a license that is going to upset people, sometimes it's dyed so deep in the wool of generative AI that, “Oh, we're now supporting vectors or whatnot.” Well, most of us don't know what that means.Jeff: Right.Corey: Fortunately, I don't believe that's what you're doing today. What have you got for us?Jeff: So, you're right. It's—well, what I'm doing is, we're announcing new stuff inside of Couchbase and helping Couchbase expand its market footprint, but we're not really moving away from our sweet spot, either, right? We like building—or being the database platform underneath applications. So, push us on the operational side of the operational versus analytic, kind of, database divide. But we are announcing a columnar data store inside of the Couchbase platform so that we can build bigger, better, stronger analytic functionality to feed the applications that we're supporting with our customers.Corey: Now, I feel like I should ask a question around what a columnar data store is because my first encounter with the term was when I had a very early client for AWS bill optimization when I was doing this independently, and I was asking them the… polite question of, “Why do you have 283 billion objects in a single S3 bucket? That is atypical and kind of terrifying.” And their answer was, “Oh, we built our own columnar data store on top of S3. This might not have been the best approach.” It's like, “I'm going to stop you there. 
With no further information, I can almost guarantee you that it was not.” But what is a columnar data store?Jeff: Well, let's start with the, everybody loves more data and everybody loves to count more things, right, but a columnar data store allows you to expedite the kind of question that you ask of the data itself by not having to look at every single row of the data while you go through it. You can say, if you know you're only looking for data that's inside of California, you just look at the column value of find me everything in California and then I'll pick all of those records to analyze. So, it gives you a faster way to go through the data while you're trying to gather it up and perform aggregations against it.Corey: It seems like it's one of those, “Well, that doesn't sound hard,” type of things, when you're thinking about it the way that I do, in terms of a database being more or less a medium to large size Excel spreadsheet. But I have it on good faith from all the customer environments. I've worked with that no, no, there are data stores that span even larger than that, which is, you know, one of those sad realities of the world. And everything at scale begins to be a heck of a lot harder. I've seen some of the value that this stuff offers and I can definitely understand a few different workloads in which case that's going to be super handy. What are you targeting specifically? Or is this one of those areas where you're going to learn from your customers?Jeff: Well, we've had analytic functionality inside the platform. It just, at the size and scale customers actually wanted to roam through the data, we weren't supporting that that much. So, we'll expand that particular footprint, it'll give us better integration capabilities with external systems, or better access to things in your bucket. But the use case problem is, I think, going to be driven by what new modern application requirements are going to be. You're going to need, we call it hyper-personalization because we tend to cater to B2C-style applications, things with a lot of account profiles built into them.So, you look at account profile, and you're like, “Oh, well Jeff likes blue, so sell him blue stuff.” And that's a great current level personalization, but with a new analytic engine against this, you can maybe start aggregating all the inventory information that you might have of all the blue stuff that you want to sell me and do that in real-time, so I'm getting better recommendations, better offers as I'm shopping on your site or looking at my phone and, you know, looking for the next thing I want to buy.Corey: I'm sure there's massive amounts of work that goes into these hyper-personalization stories. The problem is that the only time they really rise to our notice is when they fail hilariously. Like, you just bought a TV, would you like to buy another? Now statistically, you are likelier to buy a second TV right after you buy one, but for someone who just, “Well, I'm replacing my living room TV after ten years,” it feels ridiculous. Or when you buy a whole bunch of nails and they don't suggest, “Would you like to also perhaps buy a hammer?”It's one of those areas where it just seems like a human putting thought into this could make some sense. But I've seen some of the stuff that can come out of systems like this and it can be incredible. 
I also personally tend to bias towards use cases that are less, here's how to convince you to buy more things and start aiming in a bunch of other different directions where it starts meeting emerging use cases or changing situations rapidly, more rapidly than a human can in some cases. The world has, for better or worse, gotten an awful lot faster over the last few decades.Jeff: Yeah. And think of it in terms of how responsive can I be at any given moment. And so, let's pick on one of the more recent interesting failures that has popped up. I'm a Giants fan, San Francisco Giants fan, so I'll pick on the Dodgers. The Dodgers during the baseball playoffs, Clayton Kershaw—three-time MVP, Cy Young Award winner, great, great pitcher—had a first-inning meltdown of colossal magnitude: gave up 11 runs in the first inning to the Diamondbacks.Well, my customer Domino's Pizza could end up—well, let's shift the focus of our marketing. We—you know, the Dodgers are the best team in baseball this year in the National League—let's focus our attention there, but with that meltdown, let's pivot to Arizona and focus on our market in Phoenix. And they could do that within minutes or seconds, even, with the kinds of capabilities that we're coming up with here so that they can make better offers to that new environment and also do the decision intelligence behind it. Like, do I have enough dough to make a bigger offer in that big market? Do I have enough drivers or do I have to go and spin out and get one of the other food delivery folks—UberEats, or something like that—to jump on board with me and partner up on this kind of system?It's that responsiveness in real, real-time, right, that's always been kind of the conundrum between applications and analytics. You get an analytic insight, but it takes you an hour or a day to incorporate that into what the application is doing. This is intended to make all of that stuff go faster. And of course, when we start to talk about things in AI, right, AI is going to expect real-time responsiveness as best you can make it.Corey: I figure we have to talk about AI. That is a technology that has absolutely sprung to the absolute peak of the hype curve over the past year. OpenAI released Chat-Gippity, either late last year or early this year and suddenly every company seems to be falling all over itself to rebrand itself as an AI company, where, “We've been working on this for decades,” they say, right before they announce something that very clearly was crash-developed in six months. And every company is trying to drape themselves in the mantle of AI. And I don't want to sound like I'm a doubter here. I'm like most fans; I see an awful lot of value here. But I am curious to get your take on what do you think is real and what do you think is not in the current hype environment.Jeff: So yeah, I love that. I think there's a number of things that are, you know, are real is, it's not going away. It is going to continue to evolve and get better and better and better. One of my analyst friends came up with the notion that the exercise of generative AI, it's imprecise, so it gives you similarity things, and that's actually an improvement, in many cases, over the precision of a database. Databases, a transaction either works or it doesn't. 
It has failover or it doesn't, when—Corey: It's ideally deterministic when you ask it a question—Jeff: Yes.Corey: —the same question a second time, assuming it's not time-bound—Jeff: Gives you the right answer.Corey: Yeah, the sa—or at least the same answer.Jeff: The same answer. And your gen AI may not. So, that's a part of the oddity of the hype. But then it also helps me kind of feed our storyline of if you're going to try and make Gen AI closer and more accurate, you need a clean pool of data that you're dealing with, even though you've got probably—your previous design was such that you would use a relational database for transactions, a document database for your user profiles, you'd probably attach your website to a caching database because you needed speed and a lot of concurrency. Well, now you got three different databases there that you're operating.And if you're feeding data from each of those databases back to AI, one of them might be wrong or one of them might confuse the AI, yet how are you going to know? The complexity level is going to become, like, exponential. So, our premise is, because we're a multi-modal database that incorporates in-memory speed and documents and search and transactions and the like, if you start with a cleaner pool of data, you'll have less complexity that you're offering to your AI system and therefore you can steer it into becoming more accurate in its response. And then, of course, all the data that we're dealing with is on mobile, right? Data is created there for, let's say, your account profile, and then it's also consumed there because that's what people are using as their application interface of choice.So, you also want to have mobile interactivity and synchronization and local storage, kind of, capabilities built in there. So, those are kind of, you know, a couple of the principles that we're looking at of, you know, JSON is going to be a great format for it regardless of what happens; complexity is kind of the enemy of AI, so you don't want to go there; and mobility is going to be an absolute requirement. And then related to this particular announcement, large-scale aggregation is going to be a requirement to help feed the application. There's always going to be some other bigger calculation that you're going to want to do relatively in real time and feed it back to your users or the AI system that's helping them out.Corey: I think that that is a much more nuanced use case than a lot of the stuff that's grabbing customer attentions where you effectively have the Chat-Gippity story of it being an incredible parrot. Where I have run into trouble with the generative story has been people putting the thing that the robot that's magic and from the future has come up with off the cuff and just hurling that out into the universe under their own name without any human review, and that's fine sometimes sure, but it does get it hilariously wrong at some points. And the idea of sending something out under my name that has not been at least reviewed by me if not actually authored by me, is abhorrent. 
I mean, I review even the transactional, “Yes, you have successfully subscribed,” or, “Sorry to see you go,” email confirmations on stuff because there's an implicit, “Hugs and puppies, love Corey,” at the end of everything that goes out under my name.Jeff: Right.Corey: But I've gotten a barrage of terrible sales emails and companies that are trying to put the cart before the horse where either the, “Support rep,” quote-unquote, that I'm speaking to in the chat is an AI system or else needs immediate medical attention because there's something going on that needs assistance.Jeff: Yeah, they just don't understand.Corey: Right. And most big enterprise stories that I've heard so far that have come to light have been around the form of, “We get to fire most of our customer service staff,” an outcome that basically no one sensible wants. That is less compelling than a lot of the individualized consumer use cases. I love asking it, “Here's a blog post I wrote. Give me ten title options.” And I'll usually take one of them—one of them is usually not half bad and then I can modify it slightly.Jeff: And you'll change four words in it. Yeah.Corey: Yeah, exactly. That's a bit of a different use case.Jeff: It's been an interesting—even as we've all become familiar—or at least junior prompt engineers, right—is, your information is only going to be as good as you feed the AI system—the return is only going to be as good—so you're going to want to refine that kind of conversation. Now, we're not trying to end up replacing the content that gets produced or the writing of all kinds of pros, other than we do have a code generator that works inside of our environment called Capella iQ that talks to ChatGPT, but we try and put guardrails on that too, right, as always make sure that it's talking in terms of the context of Couchbase rather than, “Where's Taylor Swift this week,” which I don't want it to answer because I don't want to spend GPT money to answer that question for you.Corey: And it might not know the right answer, but it might very well spit out something that sounds plausible.Jeff: Exactly. But I think the kinds of applications that we're steering ourselves toward can be helped along by the Gen AI systems, but I don't expect all my customers are going to be writing automatic blog post generation kinds of applications. I think what we're ultimately trying to do is facilitate interactions in a way that we haven't dreamt of yet, right? One of them might be if I've opted into to loyalty programs, like my United account and my American Express account—Corey: That feels very targeted at my lifestyle as well, so please, continue.Jeff: Exactly, right? And so, what I really want the system to do is for Amex to reward me when I hit 1k status on United while I'm on the flight and you know, have the flight attendant come up and be like, “Hey, you did it. Either, here's a free upgrade from American Express”—that would be hyper-personalization because you booked your plane ticket with it, but they also happen to know or they cross-consumed information that I've opted into.Corey: I've seen them congratulate people for hitting a million miles flown mid-flight, but that's clearly something that they've been tracking and happens a heck of a lot less frequently. This is how you start scaling that experience.Jeff: Yes. But that happened because American Airlines was always watching because that was an American Airlines ad ages ago, right, but the same principle holds true. 
But I think there's going to be a lot more of these: how much information am I actually allowing to be shared amongst the, call it loyalty programs, but the data sources that I've opted into. And my God, there's hundreds of them that I've personally opted into, whether I like it or not because everybody needs my email address, kind of like what you were describing earlier.Corey: A point that I have that I think agrees largely with your point is that few things to me are more frustrating than what I'm signing up, for example, oh, I don't know, an AWS even—gee, I can't imagine there's anything like that going on this week—and I have to fill out an entire form that always asked me the same questions: how big my company is, whether we have multiple workloads on, what industry we're in. And no matter what I put into that, first, it never remembers me for the next time, which is frustrating in its own right, but two, no matter what I put in to fill that thing out, the email I get does not change as a result. At one point, I said, all right—I'm picking randomly—“I am a venture capitalist based in Sweden,” and I got nothing that is differentiated from the other normal stuff I get tied to my account because I use a special email address for those things, sometimes just to see what happens. And no, if you're going to make me jump through the hoops to give you the data, at least use it to make my experience better. It feels like I'm asking for the moon here, but I shouldn't be.Jeff: Yes. [we need 00:16:19] to make your experience better and say, you know, “Here's four companies in Malmo that you ought to be talking to. And they happen to be here at the AWS event and you can go find them because their booth is here, here, and here.” That kind of immediate responsiveness could be facilitated, and to our point, ought to be facilitated. It's exactly like that kind of thing is, use the data in real-time.I was talking to somebody else today that was discussing that most data, right, becomes stale and unvaluable, like, 50% of the data, its value goes to zero after about a day. And some of it is stale after about an hour. So, if you can end up closing that responsiveness gap that we were describing—and this is kind of what this columnar service inside of Capella is going to be like—is react in real-time with real-time calculation and real-time look-up and real-time—find out how you might apply that new piece of information right now and then give it back to the consumer or the user right now.Corey: So, Couchbase takes a few different forms. I should probably, at least for those who are not steeped in the world of exotic forms of database, I always like making these conversations more accessible to folks who are not necessarily up to speed. Personally, I tend to misuse anything as a database, if I can hold it just the wrong way.Jeff: The wrong way. I've caught that about you.Corey: Yeah, it's—everything is a database if you hold it wrong. But you folks have a few different options: you have a self-managed commercial offering; you're an open-source project, so I can go ahead and run it on my own infrastructure however I want; and you have Capella, which is Couchbase as a service. And all of those are useful and have their points, and I'm sure I'm missing at least one or two along the way. 
But do you find that the columnar use case is going to disproportionately benefit folks using Capella in ways that the self-hosted version would not be as useful for, or is this functionality already available in other expressions of Couchbase?Jeff: It's not already available in other expressions, although there is analytic functionality in the self-managed version of Couchbase. But it's, as I've mentioned I think earlier, it's just not as scalable or as really real-time as far as we're thinking. So, it's going to—yes, it's going to benefit the database as a service deployments of Couchbase available on your favorite three clouds, and still interoperable with environments that you might self-manage and self-host. So, there could be even use cases where our development team or your development team builds in AWS using the cloud-oriented features, but is still ultimately deploying and hosting and managing a self-managed environment. You could still do all of that. So, there's still a great interplay and interoperability amongst our different deployment options.But the fun part, I think, about this is not only is it going to help the Capella user, there's a lot of other things inside Couchbase that help address the developers' penchant for trading zero-cost for degrees of complexity that you're willing to accept because you want everything to be free and open-source. And Couchbase is my fifth open-source company in my background, so I'm well, well versed in the nuances of what open-source developers are seeking. But what makes Couchbase—you know, its origin story really cool too, though, is it's the peanut butter and chocolate marriage of memcached and the people behind that and membase and CouchDB from [Couch One 00:19:54]. So, I can't think of that many—maybe Red Hat—project and companies that formed up by merging two complementary open-source projects. So, we took the scale and—Corey: You have OpenTelemetry, I think, that did that once, but that—you see occasional mergers, but it's very far from common.Jeff: But it's very, very infrequent. But what that made the Couchbase people end up doing is make a platform that will scale, make a data design that you can auto partition anywhere, anytime, and then build independently scalable services on top of that, one for SQL++, the query language. Anyone who knows SQL will be able to write something in Couchbase immediately. And I've got this AI Automator, iQ, that makes it even easier; you just say, “Write me a SQL++ query that does this,” and it'll do that. But then we added full-text search, we added eventing so you can stream data, we added the analytics capability originally and now we're enhancing it, and use JSON as our kind of universal data format so that we can trade data with applications really easily.So, it's a cool design to start with, and then in the cloud, we're steering towards things like making your entry point and using our database as a service—Capella—really, really, really inexpensive so that you get that same robustness of functionality, as well as the easy cost of entry that today's developers want. And it's my analyst friends that keep telling me the cloud is where the markets going to go, so we're steering ourselves towards that hockey puck location.Corey: I frequently remark that the role of the DBA might not be vanishing, but it's definitely changing, especially since the last time I counted, if you hold them and use as directed, AWS has something on the order of 14 distinct managed database offerings. 
Some are general purpose, some are purpose-built, and if this trend keeps up, in a decade, the DBA role is going to be determining which of its 40 databases is going to be the right fit for a given workload. That seems to be the counter-approach to a general-purpose database that works across the board. Clearly you folks have opinions on this. Where do you land?Jeff: Oh, so absolutely. There's the product that is a suite of capabilities—or that are individual capabilities—and then there's ones that are, in my case, kind of multi-model and do lots of things at once. I think historically, you'll recognize—because this is—let's pick on your phone—the same holds true for, you know, your phone used to be a watch, used to be a Palm Pilot, used to be a StarTAC telephone, and your calendar application, your day planner all at the same time. Well, it's not anymore. Technology converges upon itself; it's kind of a historical truism.And the database technologies are going to end up doing that—or continue to do that, even right now. So, that notion that—it's a ten-year-old notion of use a purpose-built database for that particular workload. Maybe sometimes in extreme cases that is the appropriate thing, but in more cases than not right now, if you need transactions when you need them, that's fine, I can do that. You don't necessarily need Aurora or RDS or Postgres to do that. But when you need search and geolocation, I support that too, so you don't need Elastic. And then when you need caching and everything, you don't need ElastiCache; it's all built-in.So, that multi-model notion of operate on the same pool of data, it's a lot less complex for your developers, they can code faster and better and more cleanly, debugging is significantly easier. As I mentioned, SQL++ is our language. It's basically SQL syntax for JSON. We're a reference implementation of this language, along with—[AsteriskDB 00:23:42] is one of them, and actually, the original author of that language also wrote DynamoDB's PartiQL.So, it's a common language that you wouldn't necessarily imagine, but the ease of entry in all of this, I think, is still going to be a driving goal for people. The old people like me and you are running around worrying about, am I going to get a particular, really specific feature out of the full-text search environment, or the other one that I pick on now is, “Am I going to need a vector database, too?” And the answer to me is no, right? There's going—you know, the database vendors like ourselves—and like Mongo has announced and a whole bunch of other NoSQL vendors—we're going to support that. It's going to be just another mode, and you get better bang for your buck when you've got more modes than a single one at a time.Corey: The consensus opinion that's emerging is very much across the board that vector is a feature, not a database type.Jeff: Not a category, yeah. Me too. And yeah, we're well on board with that notion, as well. And then like I said earlier, the JSON as a vehicle to give you all of that versatility is great, right? You can have vector information inside a JSON document, you can have time series information in the document, you could have graph node locations and ID numbers in a JSON array, so you don't need index-free adjacency or some of the other cleverness that some of my former employers have done. 
It really is all converging upon itself and hopefully everybody starts to realize that you can clean up and simplify your architectures as you look ahead, so that you do—if you're going to build AI-powered applications—feed it clean data, right? You're going to be better off.Corey: So, this episode is being recorded in advance, thankfully, but it's going to release the first day of re:Invent. What are you folks doing at the show, for those who are either there and for some reason, listening to a podcast rather than going to getting marketed to by a variety of different pitches that all mention AI or might even be watching from home and trying to figure out what to make of it?Jeff: Right. So, of course we have a booth, and my notes don't have in front of me what our booth number is, but you'll see it on the signs in the airport. So, we'll have a presence there, we'll have an executive briefing room available, so we can schedule time with anyone who wants to come talk to us. We'll be showing not only the capabilities that we're offering here, we'll show off Capella iQ, our coding assistant, okay—so yeah, we're on the AI hype band—but we'll also be showing things like our mobile sync capability where my phone and your phone can synchronize data amongst themselves without having to actually have a live connection to the internet. So, long as we're on the same network locally within the Venetian's network, we have an app that we have people download from the Apple Store and then it's a color synchronization app or picture synchronization app.So, you tap it, and it changes on my screen and I tap it and it changes on your screen, and we'll have, I don't know, as many people who are around standing there, synchronizing, what, maybe 50 phones at a time. It's actually a pretty slick demonstration of why you might want a database that's not only in the cloud but operates around the cloud, operates mobile-ly, operates—you know, can connect and disconnect to your networks. It's a pretty neat scenario. So, we'll be showing a bunch of cool technical stuff as well as talking about the things that we're discussing right now.Corey: I will say you're putting an awful lot of faith in conductivity working at re:Invent, be it WiFi or the cellular network. I know that both of those have bitten me in various ways over the years. But I wish you the best on it. I think it's going to be an interesting show based upon everything I've heard in the run-up to it. I'm just glad it's here.Jeff: Now, this is the cool part about what I'm talking about, though. The cool part about what I'm talking about is we can set up our own wireless network in our booth, and we still—you'd have to go to the app store to get this application, but once there, I can have you switch over to my local network and play around on it and I can sync the stuff right there and have confidence that in my local network that's in my booth, the system's working. 
I think that's going to be ultimately our design there because oh my gosh, yes, I have a hundred stories about connectivity and someone blowing a demo because they're yanking on a cable behind the pulpit, right?Corey: I always build in a—and assuming there's no connectivity, how can I fake my demos, just because it's—I've only had to do it once, but you wind up planning in advance when you start doing a talk to a large enough or influential enough audience where you want things to go right.Jeff: There's a delightful acceptance right now of recorded videos and demonstrations that people sort of accept that way because of exactly all this. And I'm sure we'll be showing that in our booth there too.Corey: Given the non-deterministic nature of generative AI, I'm sort of surprised whenever someone hasn't mocked the demo in advance, just because yeah, gives the right answer in the rehearsal, but every once in a while, it gets completely unglued.Jeff: Yes, and we see it pretty regularly. So, the emergence of clever and good prompt engineering is going to be a big skill for people. And hopefully, you know, everybody's going to figure out how to pass it along to their peers.Corey: Excellent. We'll put links to all this in the show notes, and I look forward to seeing how well this works out for you. Best of luck at the show and thanks for speaking with me. I appreciate it.Jeff: Yeah, Corey. We appreciate the support, and I think the show is going to be very strong for us as well. And thanks for having me here.Corey: Always a pleasure. Jeff Morris, VP of Product and Solutions Marketing at Couchbase. This episode has been brought to us by our friends at Couchbase. And I'm Cloud Economist Corey Quinn. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment, but if you want to remain happy, I wouldn't ask that podcast platform what database they're using. No one likes the answer to those things.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
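Jeff's "only look at the California column" description of a columnar store can be shown as a toy contrast between row-oriented and column-oriented filtering. Real columnar engines add compression, vectorized execution, and far more; the data below is invented.

```python
# Row store vs. column store in miniature: the columnar layout lets the
# state == "CA" filter touch one column instead of every full row.
rows = [
    {"order_id": 1, "state": "CA", "total": 42.00},
    {"order_id": 2, "state": "WA", "total": 19.50},
    {"order_id": 3, "state": "CA", "total": 7.25},
]

# The same data laid out column by column.
columns = {
    "order_id": [1, 2, 3],
    "state": ["CA", "WA", "CA"],
    "total": [42.00, 19.50, 7.25],
}

# Row-oriented: scan every full row.
row_total = sum(r["total"] for r in rows if r["state"] == "CA")

# Column-oriented: scan only the "state" column to find positions, then read "total".
matches = [i for i, s in enumerate(columns["state"]) if s == "CA"]
col_total = sum(columns["total"][i] for i in matches)

assert row_total == col_total == 49.25
```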
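Jeff describes SQL++ as SQL syntax over JSON. Here is a hedged sketch of the kind of "aggregate the available inventory" query discussed in the episode; the bucket, scope, and collection names are invented, and the connection code follows the general pattern of the Couchbase Python SDK rather than a documented example.

```python
# Hedged sketch: a SQL++ aggregation over JSON documents. The connection details
# and the retail.catalog.inventory keyspace are assumptions for illustration.
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions

cluster = Cluster(
    "couchbase://localhost",
    ClusterOptions(PasswordAuthenticator("user", "password")),
)

# SQL syntax, JSON documents: count available items by color, the kind of
# aggregate an offer engine might want close to real time.
statement = """
    SELECT i.color, COUNT(*) AS in_stock
    FROM retail.catalog.inventory AS i
    WHERE i.status = "available"
    GROUP BY i.color
    ORDER BY in_stock DESC
"""

for row in cluster.query(statement).rows():
    print(row["color"], row["in_stock"])
```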
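The "one JSON document, many modes" point, with vectors, time series, and graph-style references living side by side, can be illustrated with a plain document shape. Every field name here is invented for illustration.

```python
# One JSON document carrying several "modes" at once: descriptive fields, a vector
# embedding for similarity search, a small time series, and graph-style edges.
import json

product = {
    "id": "product::1138",
    "type": "product",
    "name": "Blue mechanical keyboard",
    "embedding": [0.12, -0.48, 0.91, 0.05],  # vector used for similarity lookups
    "views_last_hour": [  # compact time-series samples
        {"t": "2023-11-20T14:00:00Z", "count": 112},
        {"t": "2023-11-20T14:05:00Z", "count": 98},
    ],
    "related_product_ids": ["product::2044", "product::3317"],  # graph edges as an array
}

print(json.dumps(product, indent=2))
```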

Screaming in the Cloud
The Man Behind the Cloud Curtain with Jeremy Tangren

Screaming in the Cloud

Play Episode Listen Later Nov 21, 2023 28:55


Jeremy Tangren, Director of Media Operations at The Duckbill Group, joins Corey on Screaming in the Cloud to discuss how he went from being a Project Manager in IT to running Media Operations at a cloud costs consultancy. Jeremy provides insight into how his background as a Project Manager has helped him tackle everything that's necessary in a media production environment, as well as what it was like to shift from a career on the IT side to working at a company that is purely cloud-focused. Corey and Jeremy also discuss the coordination of large events like re:Invent, and what attendance is really like when you're producing the highlight reels that other people get to watch from the comfort of their own homes. About JeremyWith over 15 years of experience in big tech, Jeremy brings a unique perspective to The Duckbill Group and its Media Team. Jeremy handles all things Media Operations. From organizing the team and projects to making sure publications go out on time, Jeremy does a bit of everything!Links Referenced: duckbillgroup.com: https://duckbillgroup.com requinnvent.com: https://requinnvent.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's guest is one of those behind-the-scenes type of people who generally doesn't emerge much into the public eye. Now, that's a weird thing to say about most folks, except in this case, I know for a fact that it's true because that's kind of how his job was designed. Jeremy Tangren is the Director of Media Operations here at The Duckbill Group. Jeremy, thank you for letting me drag you into the spotlight.Jeremy: Of course. I'm happy to be here, Corey.Corey: So, you've been here, what, it feels like we're coming up on the two-year mark or pretty close to it. I know that I had you on as a contractor to assist with a re:Invent a couple years back and it went so well, it's, “How do we get you in here full time? Oh, we can hire you.” And the rest sort of snowballed from there.Jeremy: Yes. January will be two years, in fact.Corey: I think that it's one of the hardest things to do for you professionally has always been to articulate the value that you bring because I've been working with you here for two years and I still do a pretty poor job of doing it, other than to say, once you get brought into a project, all of the weird things that cause a disjoint or friction along the way or cause the wheels to fall off magically go away. But I still struggle to articulate what that is in a context that doesn't just make it sound like I'm pumping up my buddy, so to speak. How do you define what it is that you do? I mean, now Director of Media Operations is one of those titles that can cover an awful lot of ground, and because of a small company, it obviously does. But how do you frame what you do?Jeremy: Well, I am a professional hat juggler, for starters. There are many moving parts and I come from a history of project management, a long, long history of project management. 
And I've worked with projects from small scale to the large scale spanning globally and I always understand that there are many moving parts that have to be tracked and handled, and there are many people involved in that process. And that's what I bring here to The Duckbill Group is that experience of managing the small details while also understanding the larger picture.Corey: It's one of those hard-to-nail-down type of roles. It's sort of one of those glue positions where, in isolation, it's well, there's not a whole lot that gets done when it is just you. I felt the same thing my entire career as a sysadmin turned other things that are basically fancy titles but still distilled down to systems administrator. And that is, well step one, I need a web property or some site or something that is going to absorb significant traffic and have developers building it. Because, “Oh, I'm going to run some servers.” “Okay, for what purpose?” “I don't know.”I was never good at coming up with the application that rode on top of these things. But give me someone else's application, I could make it scale and a bunch of exciting ways, back when that was trickier to do at smaller scale. These days, the providers out there make it a heck of a lot easier and all I really wind up doing is—these days—making fun of other people's hard work. It keeps things simpler, somehow.Jeremy: There always has to be a voice leading that development and understanding what you're trying to achieve at the end. And that's what a project manager, or in my role as Director of Media Operations, that's what I do is I see our vision to the end and I bring in the people and resources necessary to make it happen.Corey: Your background is kind of interesting. You have done a lot of things that a lot of places, mostly large companies, and mostly on the corporate IT side of the world. But to my understanding, this is the first time you've really gone into anything approaching significant depth with things that are cloud-oriented. What's it been like for you?Jeremy: It's a new experience. As you said, I've had experience all over the industry. I come from traditional data centers and networking. I'm originally trained in Cisco networking from way back in the day, and then I moved on into virtual reality development and other infrastructure management. But getting into the cloud has been something new and it's been a shift from old-school data centers in a way that is complicated to wrap your head around.Whereas in a data center before, it was really clear you had shelves of hardware, you had your racks, you had your disks, you had finite resources, and this is what you did; you built your applications on top of it and that was the end of the conversation. Now, the application is the primary part of the conversation, and scaling is third, fourth, fifth in the conversation. It's barely even mentioned because obviously we're going to put this in the cloud and obviously we're going to scale this out. And that's a power and capability that I had not seen in past companies, past infrastructures. And so, learning about the cloud, learning about the numerous AWS [laugh] services that exist and how they interact, has been a can of worms to understand and slowly take one worm out at a time and work with it and become its friend.Corey: I was recently reminded of a time before cloud where I got to go hang out with the founders at Oxide over in Oakland. 
I'd forgotten so much of the painful day-to-day minutia of what it took to get servers up and running in a data center, of the cabling nonsense, of slicing your fingers to ribbons on rack nuts, on waiting weeks on end for the server you ordered to show up, ideally in the right configuration, of getting 12 servers and 11 of them provision correctly and the 12th doesn't for whatever godforsaken reason. So, much of that had just sort of slipped my mind. And, “Oh, yeah, that's right. That's what the whole magic of cloud was.”Conversely, I've done a fair bit of IoT stuff at home for the past year or so, just out of basically looking for a hobby, and it feels different, for whatever reason, to be running something that I'm not paying a third party by the hour for. The actual money that we're talking about in either case is nothing, but there's a difference psychologically and I'm wondering how much the current cloud story is really shaping the way that an entire generation is viewing computers.Jeremy: I would believe that it is completely shifted how we view computers. If you know internet and computing history, we're kind of traveling back to the old ways of the centrally managed server and a bunch of nodes hanging off of it, and they basically being dummy nodes that access that central resource. And so, with the centralization of AWS resources and kind of a lot of the internet there, we've turned everyone into just a node that accesses this centralized resource. And with more and more applications moving to the web, like, natively the web, it's changing the need for compute on the consumer side in such a way that we've never seen, ever. We have gone from a standard two-and-a-half, three-foot tall tower sitting in your living room, and this is the family computer to everybody has their own personal computer to everyone has their own laptops to now, people are moving away from even those pieces of hardware to iPads because all of the resources that they use exist on the internet. So, now you get the youngest generation that's growing up and the only thing that they've ever known as far as computers go is an iPad in their hands. When I talk about a tower, what does that mean to them?Corey: It's kind of weird, but I feel like we went through a generation where it felt like the early days of automobiles, where you needed to be pretty close to a mechanic in order to reliably be convinced you could take a car any meaningful distance. And then they became appliances again. And in some cases, because manufacturers don't want people working on cars, you also have to be more or less a hacker of sorts to wind up getting access to your car. I think, on some level, that we've seen computers turn into appliances like that. When I was a kid, I was one of those kids that was deep into computers and would help the teachers get their overhead projector-style thing working and whatnot.And I think we might be backing away from that, on some level, just because it's not necessary to have that level of insight into how a system works to use it effectively. And I'm not trying to hold back the tide of progress. I just find it interesting as far as how we are relating with these things differently. 
It's a rising tide that absolutely lifts all ships, and that's a positive thing.Jeremy: Well, to carry your analogy further with cars, it used to be, especially in the United States, that in order to drive a car you had to understand a manual transmission, how to shift through all those gears, which gave you some understanding of what a clutch was and how the car moved. You had a basic understanding of how the car functions. And now in the United States, we all have automatic transmissions, and if I ask a regular person, “Do you understand how an engine works?” They'll just tell me straight, “No, I have no idea. My car gets me from A to B.”And computers have very much become that way, especially with this iPad generation that we're talking about, where it's a tool to access resources to get you from A to B, to get you from your fingertips to whatever the tools are that you're trying to access that are probably on the internet. And it changes the focus of what you need to learn as you're growing up and as you get into the industry. Because, say, for me, and you, Corey, we grew up with computers in their infancy and being those kids in the classroom, helping the teachers, helping our family members with whatever tech problem that they may have. Those were us. And we had to learn a lot about the technology and we had to learn a lot of troubleshooting skills in order to fix our family's problems, to help the classroom teacher, whatever it was. So, that's the set of skills that we learned through that generation of computers that the current generation isn't having to deal with as far as the complexity and the systems are concerned. So, they're able to learn different skills. They're able to interact with things more natively than you were I ever imagined.Corey: Well, I'm curious to get your perspective on how that's changed in the ways that you're interacting with teams from a project management perspective. I mean, obviously, we've seen a lot of technological advancement over the course of your career, which is basically the same length as mine, but what have you seen as far as how that affects the interplay of people on various teams? Or has it?Jeremy: It's made them more connected and less connected at the same time. I've found my most effective teams—generally—worked together in the same location and could turn around and poke the other team member in the back. And that facilitated communication all of the time. But that's not how every team can function. You have to lay on project management, you have to lay on tools and communication. And that's where this technology comes in is, how has it improved? How has it changed things?And interestingly, the web has advanced that, I think, to a significant degree because the old school, old project management style was either we're going to start planning this in Excel like so many managers do, or we're going to open up Microsoft Project and we're going to spend hours and hours and hours in this interface that only the project manager can access and show everyone. So, now we're in a point where everybody can access the project plan because it exists on the web—Smartsheets or whatever—we have instant communication via chat—whatever our chat of choice is Slack, Discord, IRC—and it allows us to work anywhere and be asynchronous. So, this team that previously I had to have sitting next to each other to poke each other, they can now be spread all over the world. I had a project a number of years ago working in virtual reality that we did exactly that. 
We had six teams spanned globally, and because we were able to hand off from each other through technology and through competent project management, the project was able to be built and successful rather than us continuing to point fingers at each other trying to understand what the next step is. So yeah, the technology has definitely helped.Corey: It's wild to me just seeing how… I guess, the techno-optimism has always been, “Oh, technology will heal the world and make things better,” as if it were this panacea that was going to magically take care of everything. And it's sort of a “Mo money, mo problems” type of situation where we've got, okay, great. Well, we found ways to make the old things that were super hard trivial, and all that's done is unlock a new level of problem because people remain people, for whatever it is. You work a lot more with people than you do with technology, despite the fact that if you look at the actual ins and outs of what you do, it's easy to look at that and say, “Oh, clearly, you're a technical person working on technology.” I would say you're a people-facing person.Jeremy: I agree with that, and that's why I refer to the people participating in my projects or on my team or what have you as people and not resources. Because people contribute to these things, not resources.Corey: So, what I'm curious about—since everyone seems to have a very disjointed opinion or perspective on how the sausage gets made over here—can you describe what your job is because I've talked to people who are surprised I have someone running media operations. Like, “How hard is it? You just sit down in front of a microphone and talk, and that's the end of it.” And I don't actually know the answer to that question because all I do is sit down in front of a microphone and talk, and that's the end of it. You have put process around things that used to vex me mightily and now I don't really know exist. So, it's sort of a weird question, but what is it you'd say it is you do exactly?Jeremy: I've actually had to answer this question a lot of times. The really, really simple version is I do everything that Corey doesn't [laugh]. Corey records and creates the content, he's the face of the company—you are the face of the company, Corey—and you do what you do. And that leaves everything else that has to be done. Okay, you record an episode of Screaming in the Cloud. What happens next?Well, it goes off to a team to be edited and then reviewed by the recording guest—to be reviewed by the guest. We have video editing that has to take place every time you go out to a shoot, we coordinate your presence on-site at events, we coordinate the arrival of other people to your events. In its shortest form, everything that is media-related that entails some kind of management or execution that is not creating content, I'm there moving things along or I have one of my teams moving things along.Corey: Before you showed up, there were times where I would record episodes like this and they wouldn't get published for three or four months because I would forget to copy the files from the recording off so that the audio processing team could handle that. And small minor process improvements have meant that I'm no longer the critical path for an awful lot of things, which is awesome. It's one of those invisible things around me that I vaguely know is there most of the time, but don't stop to think about it in quite the same way. 
Like, think of it as taking an airline trip somewhere: you get on the plane, you talk to the person at the gate, you [unintelligible 00:17:05] the flight attendants help you with your beverages or bags and whatnot, but you don't think about all the other moving parts that have to happen around aircraft maintenance, around scheduling, around logistics, around making sure that the seat is clean before you sit down, et cetera, et cetera. There's so much stuff that you're sort of aware of if you stop to think about it, but it's not something that you see on a day-to-day basis, and as a result, it's easy to forget that it's there.Jeremy: That's what happens with people working in the background and making sure that things happen. A good example of this is also re:Quinnvent coming up here in a month, where we'll be at re:Invent—my production team, Corey, et cetera—where Corey will be recording content and we will be producing it in very short order. And this is an operation that has to occur without Corey's involvement. These are things that happen in the background in order to produce the content for the audience. There's always somebody who exists behind the scenes to move things along behind the creator. Because, Corey, you're a very busy person.Corey: People forget that I also have this whole, you know, consulting side thing that I do, too—Jeremy: Yeah.Corey: You know, the primary purpose of our company?Jeremy: Yeah. You are one of the busiest people I've ever met, Corey. Your calendar is constantly full and you're constantly speaking to people. There's no way that you would have the time to go in and edit each of these audio recordings, each of these video recordings, what have you. You have to have force multipliers hiding behind you to make things happen. And that's the job of the Director of Media Operations at Last Week in AWS.Corey: I have to ask since last year was your first exposure to it—that was your first re:Invent in person—what do you think of it?Jeremy: It was a madhouse [laugh]. I had managed re:Quinnvent back in 2021 remotely and I did not have a clear understanding of how far away things are, how convoluted the casinos are, things of that nature. And so, when I was working with you in 2021, Corey, I had to make a lot of assumptions that I know better now that I've been on site. Like, it can take you 30 or 45 minutes to get across the street to one of the other re:Invent locations. It's really ridiculous.Corey: That was one of the reasons I had you and also Mike go out to re:Invent in person the first year that I was working with either of you on a full-time basis, just because otherwise it turned into, “Oh, it's just across the street. Just pop on over and say hi. It'll take you 20 minutes.” No, it'll take 90 by the time you walk through the casinos, find your way out, get over there, have your meeting, and get back. It's not one of those things that's trivial, but it's impossible to describe without sounding like a lunatic until someone has actually been there before.Jeremy: That's absolutely true. The personal experience is absolutely required in order to understand the scale of the situation, the number of people that are there, and the amount of time it's going to take to get to wherever you need to be, even if you're on the expo floor. Last year, I needed to deliver some swag to a vendor and it took me the better part of 15, 20 minutes to find that vendor on the expo floor using AWS's maps. It's a huge space and it's super convoluted.
You need all the help that you can get. And being there in person was absolutely critical in order to understand the challenges that you're facing there, Corey.Corey: People think I'm kidding when I say that, “Oh, you're not going to re:Invent. I envy you. You must be so happy.” Like, people sometimes, if they haven't been, think, “Oh, I'm losing out because I don't have the chance to go to this madhouse event.” It's not as great as you might believe and there's no way to convince people of that until they've been there.I'm disheartened to learn that Google Cloud Next is going to be in Las Vegas next year. That means that's twice a year I'm going to have to schlep there instead of once. At least they're doing it in April, which is otherwise kind of a conference deadzone. But ugh, I am not looking forward to spending even more of my life in Las Vegas than I already have to. I'm there for eight nights a year. It's like crappy Cloud Hanukkah.Jeremy: [laugh]. I second that. To be perfectly honest, San Francisco and Moscone Center, I really enjoy them as venues for these kinds of conferences, but Las Vegas is apparently able to handle things better. I don't know, I'm not real happy about the Vegas situation either, and it takes a toll.Corey: Yeah, I tend to book the next week afterwards of just me lying flat on my back not doing anything. Maybe I'll be sick like I was last year with Covid when we all got it. Maybe I will just be breathing into a bag and trying to recuperate after it. But I know that for mostly the rest of the December, I just don't want to think about cloud too heavily or do too much with it, just because even for me, it's been too much and I need some decompression time.Jeremy: I hear that. I mean, you've had three weeks of Amazon just firehosing everyone with new service releases, new updates, just constantly, and re:Invent caps it all off. And then we get back and there's just no news and everybody's exhausted from being at re:Invent. Everyone's probably sick from being in Las Vegas. To add to that Las Vegas point, hey, there's a bunch of casinos and they're cigarette smoke-filled. Like, it's a miserable place to be. Why do they insist on putting these conferences there?Corey: It drives me nuts and it's one of those things where it's—I mostly feel for the people at Amazon who have to put this show on because yeah, I complain that I don't get much of a Thanksgiving because I have this whole looming event happening, but there are large squadrons of people that they send out in advance for weeks at a time to do things like build out the wireless networks, get everything set up, handle logistics, all of it, and those people forget having, [I think 00:23:35], something hanging over their head during Thanksgiving; they're spending Thanksgiving at… you know, a hotel. That's not fun.Jeremy: No, that's not fun at all. And I understand the stresses that they're under and what these event coordinators are having to deal with. This is a huge event and it's super thankless. That networking team, if things don't work absolutely perfectly and everybody has maximum bandwidth at all times, that poor networking team is going to catch hell, and they just spent weeks getting ready for this. That sucks. I don't really envy them, but I do applaud them and their effort.We've spent the last two [laugh] Thanksgivings planning our own event to make sure things happen smoothly. These big events take a lot of planning, a lot of coordination, and a lot of people. 
And I think that folks always underestimate that. They underestimate the level of involvement, the level of investment, and what it takes to put on a big show like this.Corey: I mean, there is the counterpoint as well, where we still go because it is the epicenter of the AWS universe. Despite all the complaints I have about it, I like the opportunity to talk to people who are doing interesting things who are building stuff that I'm going to be either using or have inflicted upon me over the next year. And even the community folks, just talking to people who are in the trenches as well, figuring out, okay, AWS built this thing and now I've got to work with it. There's really something to be said for having the opportunity to talk to those people face-to-face. I don't have a whole lot of excuses to go to all the places these people are from, but for one week a year, we all find ourselves in Las Vegas. So, that's at least the silver lining for me. Did you find any silver linings last time or was it simply, “I finally got to go home?”Jeremy: [laugh]. No, actually, I did enjoy it. To your point, getting to speak to the service owners, these people who've written the code, is an amazing opportunity. For example, I got to run into the DeepRacer folks last year before they set up for the tournament, and they were super helpful and super encouraging to get into the DeepRacer program. I explained, “I don't know how to code,” and they said, “That's fine. You can still get into it, you can still learn the basics.”And that's super endearing, that's really supportive, and that's really emblematic of the community that's coming to re:Invent. So, this is a great place to be for this experience, to meet these people, and to associate with other users like yourself. In fact, we're hosting the Atomic Liquors Drink-Up on November 29th for our community who's coming to re:Invent, and we want everybody who's able to come so that we can say hi, pay for your drinks and, you know, talk to us.Corey: Yeah, it starts at 7 p.m. We're co-hosting with RedMonk. No badge needed, no one will scan anything or try to sell you anything. It's just if you want to schlep the three miles from the strip out to Atomic Liquors to hang out with people who are like-minded, it's one of my favorite parts of the show every year. Please, if you're hearing this, you're welcome to come.Jeremy: Absolutely. It's open. No tickets required. It's totally free. I'll be there. Corey will be there—Corey is always there—and it'll be a great time, so I look forward to seeing you there.Corey: Indeed. Jeremy, thank you so much for taking the time out of your increasingly busy day as re:Invent looms ever closer to chat with me for about this stuff. If people want to learn more about what we're up to, where should they go to keep up? I lose track of what URL to send people to.Jeremy: [laugh]. Yeah, thank you for having me, Corey. And the best place to learn about what we're doing at re:Invent is actually requinnvent.com. That's R-E-Q-U-I-N-N-V-E-N-T dot com.Corey: And we'll put a link to that in the [show notes 00:27:33] for sure. Or at least your people will. I have nothing to do with it.Jeremy: Yes, I'll make sure they take care of that. Visit the website. That's where we've got our schedule, all the invites, anything you need to know about what we're doing at re:Invent that week is available on requinnvent.com.Corey: Jeremy, thank you for taking the time to speak with me. 
I really appreciate it.Jeremy: Thank you, Corey.Corey: Jeremy Tangren, Director of Media Operations here at The Duckbill Group. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that one of these days someone on Jeremy's team will make it a point to put in front of me. But that day is not today.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
An Open-Source Mindset in Cloud Security with Alex Lawrence

Screaming in the Cloud

Play Episode Listen Later Nov 16, 2023 32:50


Alex Lawrence, Field CISO at Sysdig, joins Corey on Screaming in the Cloud to discuss how he went from studying bioluminescence and mycology to working in tech, and his stance on why open source is the future of cloud security. Alex draws an interesting parallel between the creative culture at companies like Pixar and the iterative and collaborative culture of open-source software development, and explains why iteration speed is crucial in cloud security. Corey and Alex also discuss the pros and cons of having so many specialized tools that tackle specific functions in cloud security, and the different postures companies take towards their cloud security practices. About AlexAlex Lawrence is a Field CISO at Sysdig. Alex has an extensive history working in the datacenter as well as with the world of DevOps. Prior to moving into a solutions role, Alex spent a majority of his time working in the world of OSS on identity, authentication, user management and security. Alex's educational background has nothing to do with his day-to-day career; however, if you'd like to have a spirited conversation on bioluminescence or fungus, he'd be happy to oblige.Links Referenced: Sysdig: https://sysdig.com/ sysdig.com/opensource: https://sysdig.com/opensource falco.org: https://falco.org TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by our friends over at Sysdig, and they have brought to me Alexander Lawrence, who's a principal security architect over at Sysdig. Alexander, thank you for joining me.Alex: Hey, thanks for having me, Corey.Corey: So, we all have fascinating origin stories. Invariably you talk to someone, no one in tech emerged fully-formed from the forehead of some God. Most of us wound up starting off doing this as a hobby, late at night, sitting in the dark, rarely emerging. You, on the other hand, studied mycology, so watching the rest of us sit in the dark and growing mushrooms was basically how you started, is my understanding of your origin story. Accurate, not accurate at all, or something in between?Alex: Yeah, decently accurate. So, I was in school during the wonderful tech bubble burst, right, high school era, and I always told everybody, there's no way I'm going to go into technology. There's tons of people out there looking for a job. Why would I do that? And let's face it, everybody expected me to, so being an angsty teenager, I couldn't have that. So, I went into college looking into whatever I thought was interesting, and it turned out I had a predilection to go towards fungus and plants.Corey: Then you realized some of them glow and that wound up being too bright for you, so all right, we're done with this; time to move into tech?Alex: [laugh]. Strangely enough, my thesis, my capstone, was on the coevolution of bioluminescence across aquatic and terrestrial organisms. 
And so, did a lot of focused work on specifically bioluminescent fungus and bioluminescing fish, like Photoblepharon palpebratus and things like that.Corey: When I talk to people who are trying to figure out, okay, I don't like what's going on in my career, I want to do something different, and their assumption is, oh, I have to start over at square one. It's no, find the job that's halfway between what you're doing now and what you want to be doing, and make lateral moves rather than starting over five years in or whatnot. But I have to wonder, how on earth did you go from A to B in this context?Alex: Yeah, so I had always done tech. My first job really was in tech at the school districts that I went to in high school. And so, I went into college doing tech. I volunteered at the ELCA and other organizations doing tech, and so it basically funded my college career. And by the time I finished up through grad school, I realized my life was going to be writing papers so that other people could do the research that I was coming up with, and I thought that sounded like a pretty miserable life.And so, it became a hobby, and the thing I had done throughout my entire college career was technology, and so that became my new career and vocation. So, I was kind of doing both, and then ended up landing in tech for the job market.Corey: And you've effectively moved through the industry to the point where you're now in security architecture over at Sysdig, which, when I first saw Sysdig launch many years ago, it was, this is an interesting tool. I can see observability stories, I can see understanding what's going on at a deep level. I liked it as a learning tool, frankly. And it makes sense, with the benefit of hindsight, that oh, yeah, I suppose it does make some sense that there are security implications thereof. But one of the things that you've said that I really want to dig into that I'm honestly in full support of because it'll irritate just the absolute worst kinds of people is—one of the core beliefs that you espouse is that security when it comes to cloud is inherently open-source-based or at least derived. I don't want to misstate your position on this. How do you view it?Alex: Yeah. Yeah, so basically, the stance I have here is that the future of security in cloud is open-source. And the reason I say that is that it's a bunch of open standards that have basically produced a lot of the technologies that we're using in that stack, right, your web servers, your automation tooling, all of your different components are built on open stacks, and people are looking to other open tools to augment those things. And the reality is, is that the security environment that we're in is changing drastically in the cloud as opposed to what it was like in the on-premises world. On-prem was great—it still is great; a lot of folks still use it and thrive on it—but as we look at the way software is built and the way we interface with infrastructure, the cloud has changed that dramatically.Basically, things are a lot faster than they used to be. The model we have to use in order to make sure our security is good has dramatically changed, right, and all that comes down to speed and how quickly things evolve. I tend to take a position that one single brain—one entity, so to speak—can't keep up with that rapid evolution of things. Like, a good example is Log4j, right? When Log4j hit this last year, that was a pretty broad attack that affected a lot of people. 
You saw open tooling out there, like Falco and others, they had a policy to detect and help triage that within a couple of hours of it hitting the internet. Other proprietary tooling, it took much longer than two hours.Corey: Part of me wonders what the root cause behind that delay is because it's not that the engineers working at these companies are somehow worse than folks in the open communities. In some cases, they're the same people. It feels like it's almost corporate process ossification of, “Okay, we built a thing. Now, we need to make sure it goes through branding and legal and marketing and we need to bring in 16 other teams to make this work.” Whereas in the open-source world, it feels like there's much more of a, “I push the deploy button and it's up. The end.” There is no step two.Alex: [laugh]. Yeah, so there is certainly a certain element of that. And I think it's just the way different paradigms work. There's a fantastic book out there called Creativity, Inc., and it's basically a book about how Pixar manages itself, right? How do they deal with creating movies? How do they deal with doing what they do, well?And really, what it comes down to is fostering a culture of creativity. And that typically revolves around being able to fail fast, take risks, see if it sticks, see if it works. And it's not that corporate entities don't do that. They certainly do, but again, if you think about the way the open-source world works, people are submitting, you know, PRs, pull requests, they're putting out different solutions, different fixes to problems, and the ones that end up solving it the best are often the ones that end up coming to the top, right? And so, it's just—the way you iterate is much more akin to that kind of creativity-based mindset that I think you get out of traditional organizations and corporations.Corey: There's also, I think—I don't know if this is necessarily the exact point, but it feels like it's at least aligned with it—where there was for a long time—by which I mean, pretty much 40 years at this point—a debate between open disclosure and telling people of things that you have found in vendors products versus closed disclosure; you only wind—or whatever the term is where you tell the vendor, give them time to fix it, and it gets out the door. But we've seen again and again and again, where researchers find something, report it, and then it sits there, in some cases for years, but then when it goes public and the company looks bad as a result, they scramble to fix it. I wish it were not this way, but it seems that in some cases, public shaming is the only thing that works to get companies to secure their stuff.Alex: Yeah, and I don't know if it's public shaming, per se, that does it, or it's just priorities, or it's just, you know, however it might go, there's always been this notion of, “Okay, we found a breach. Let's disclose appropriately, you know, between two entities, give time to remediate.” Because there is a potential risk that if you disclose publicly that it can be abused and used in very malicious ways—and we certainly don't want that—but there also is a certain level of onus once the disclosure happens privately that we got to go and take care of those things. And so, it's a balancing act.I don't know what the right solution is. I mean, if I did, I think everybody would benefit from things like that, but we just don't know the proper answer. 
The workflow is complex, it is difficult, and I think doing our due diligence to make sure that we disclose appropriately is the right path to go down. When we get those disclosures we need to take them seriously; that's what it comes down to.Corey: What I find interesting is your premise that the future of cloud security is open-source. Like, I could make a strong argument that today, we definitely have an open-source culture around cloud security and need to, but you're talking about that shifting along the fourth dimension. What's the change? What do you see evolving?Alex: Yeah, I think for me, it's about the collaboration. I think there are segments of industries that communicate with each other very, very well, and I think there's others who do a decent job, you know, behind closed doors, and I think there's others, again, that don't communicate at all. So, all of my background predominantly has been in higher-ed, K-12, academia, and I find that a lot of those organizations do an extremely good job of partnering together, working together to move towards, kind of, a greater good, a greater goal. An example of that would be a group out in the Pacific Northwest called NWACC—the NorthWest Academic Computing Consortium. And so, it's every university in the Northwest all come together to have CIO Summits, to have Security Summits, to trade knowledge, to work together, basically, to have a better overall security posture.And they do it pretty much out in the open and collaborating with each other, even though they are also direct competitors, right? They all want the same students. It's a little bit of a different way of thinking, and they've been doing it for years. And I'm finding that to be a trend that's happening more and more outside of just academia. And so, when I say the future is open, if you think about the tooling academia typically uses, it is very open-source-oriented, it is very collaborative.There's specifications on things like eduPerson to be able to go and define what a user looks like. There's things like, you know, CAS and Shibboleth to do account authorization and things like that. They all collaborate on tooling in that regard. We're seeing more of that in the commercial space as well. And so, when I say the future of security in cloud is open-source, it's models like this that I think are becoming more and more effective, right?It's not just the larger entities talking to each other. It's everybody talking with each other, everybody collaborating with each other, and having an overall better security posture. The reality is, is that the folks we're defending ourselves against, they already are communicating, they already are using that model to work together to take down who they view as their targets: us, right? We need to do the same to be able to keep up. We need to be able to have those conversations openly, work together openly, and be able to set that security posture across that kind of overall space.Corey: There's definitely a concern that if okay, you have all these companies and community collaborating around security aspects in public, that well won't the bad actors be able to see what they're looking at and how they're approaching it and, in some cases, move faster than they can or, in other cases, effectively wind up polluting the conversation by claiming to be good actors when they're not. And there's so many different ways that this can manifest.
It feels like fear is always the thing that stops people from going down this path, but there is some instance of validity to that I would imagine.Alex: Yeah, no. And I think that certainly is true, right? People are afraid to let go of, quote-unquote, “The keys to their kingdom,” their security posture, their things like that. And it makes sense, right? There's certain things that you would want to not necessarily talk about openly, like, specifically, you know, what Diffie–Hellman key exchange you're using or something like that, but there are ways to have these conversations about risks and posture and tooling and, you know, ways you approach it that help everybody else out, right?If someone finds a particularly novel way to do a detection with some sort of piece of tooling, they probably should be sharing that, right? Let's not keep it to ourselves. Traditionally, just because you know the tool doesn't necessarily mean that you're going to have a way in. Certainly, you know, it can give you a path or a vector to go after, but if we can at least have open standards about how we implement and how we can go about some of these different concepts, we can all gain from that, so to speak.Corey: Part of me wonders if the existing things that the large companies are collaborating on lead to a culture that specifically pushes back against this. A classic example from my misspent youth is that an awful lot of the anti-abuse departments at these large companies are in constant communication. Because if you work at Microsoft, or Google or Amazon, your adversary, as you see it, in the Trust and Safety Group is not those other companies. It's bad actors attempting to commit fraud. So, when you start seeing particular bad actors emerging from certain parts of the network, sharing that makes everything better because there's an understanding there that it's not, “Oh, Microsoft has bad security this week,” or, “Google will wind up approving fraudulent accounts that start spamming everyone.”Because the takeaway by the customers is not that this one company is bad; it's oh, the cloud isn't safe. We shouldn't use cloud. And that leads to worse outcomes for basically everyone. But they're als—one of the most carefully guarded secrets at all these companies is how they do fraud prevention and spam detection because if adversaries find that out, working around them becomes a heck of a lot easier. I don't know, for example, how AWS determines whether a massive account overage in a free-tier account is considered to be a bad actor or someone who made a legitimate mistake. I can guess, but the actual signal that they use is something that they would never in a million years tell me. They probably won't even tell each other specifics of that.Alex: Certainly, and I'm not advocating that they let all of the details out, per se, but I think it would be good to be able to have more of an open posture in terms of, like, you know, what tooling do they use? How do they accomplish that feat? Like, are they looking at a particular metric? How do they basically handle that posture going forward? Like, what can I do to replicate a similar concept?I don't need to know all the details, but it would be nice if they embrace, you know, open tooling, like say a Trivy or a Falco or whatever the thing is, right, they're using to do this process and then contribute back to that project to make it better for everybody.
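As a concrete sketch of the kind of open tooling Alex is pointing at, the snippet below shells out to Trivy's image scan and tallies findings by severity. It assumes the trivy binary is installed locally and that its JSON output follows the Results/Vulnerabilities layout of recent versions; the image name is only a placeholder.

```python
import json
import subprocess
from collections import Counter

def scan_image(image: str) -> Counter:
    """Run a Trivy image scan and tally vulnerabilities by severity.

    Assumes the `trivy` binary is on PATH and emits JSON with a top-level
    "Results" list, each entry carrying a "Vulnerabilities" list.
    """
    raw = subprocess.run(
        ["trivy", "image", "--format", "json", "--quiet", image],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    report = json.loads(raw)

    counts: Counter = Counter()
    for result in report.get("Results", []):
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts

if __name__ == "__main__":
    # Placeholder image; swap in whatever you actually ship.
    print(scan_image("nginx:latest"))
```

The same wrapper shape works for Clair or any other scanner that emits JSON, which is part of why contributing improvements back upstream pays off for everyone.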
When you kind of keep that stuff closed-source, that's when you start running into that issue where, you know, they have that, quote-unquote, “Advantage,” that other folks aren't getting. Maybe there's something we can do better in the community, and if we can all be better, it's better for everybody.Corey: There's a constant customer pain in the fact that every cloud provider, for example, has its own security perspective—the way that identity is managed, the way that security boundaries exist, the way that telemetry from these things winds up getting represented—where a number of companies that are looking at doing things that have to work across cloud for a variety of reasons—some good, some not so good—have decided that, okay, we're just going to basically treat all these providers as, more or less, dumb pipes and dumb infrastructure. Great, we're just going to run Kubernetes on all these things, and then once it's inside of our cluster, then we'll build our own security overlay around all of these things. They shouldn't have to do that. There should be a unified set of approaches to these things. At least, I wish there were.Alex: Yeah, and I think that's where you see a lot of the open standards evolving. A lot of the different CNCF projects out there are basically built on that concept. Like, okay, we've got Kubernetes. We've got a particular pipeline, we've got a particular type of implementation of a security measure or whatever it might be. And so, there's a lot of projects built around how do we standardize those things and make them work cross-functionally, regardless of where they're running.It's actually one of the things I quite like about Kubernetes: it makes it be a little more abstract for the developers or the infrastructure folks. At one point in time, you had your on-premises stuff and you built your stuff towards how your on-prem looked. Then you went to the cloud and started building yourself to look like what that cloud look like. And then another cloud showed up and you had to go use that one. Got to go refactor your application to now work in that cloud.Kubernetes has basically become, like, this gigantic API ball to interface with the clouds, and you don't have to build an application four different ways anymore. You can build it one way and it can work on-prem, it can work in Google, Azure, IBM, Oracle, you know, whoever, Amazon, whatever it needs to be. And then that also enables us to have a standard set of tools. So, we can use things like, you know, Rego or we can use things like Falco or we can use things that allow us to build tooling to secure those things the same way everywhere we go. And the benefit of most of those tools is that they're also configured, you know, via some level of codification, and so we can have a repository that contains our posture: apply that posture to that cluster, apply it to the other cluster in the other environment. It allows us to automate these things, go quicker, build the posture at the very beginning, along with that application.Corey: One of the problems I feel as a customer is that so many of these companies have a model for interacting with security issues that's frankly obnoxious. 
I am exhausted by the amount of chest-thumping you'll see on keynote stages, all on the theme of “We're the best at security.” And whenever a vulnerability researcher reports something of a wide variety of different levels of severity, it always feels like the first concern from the company is not to fix the issue, but rather to control the messaging around it.Whenever there's an issue, it's very clear that they will lean on people to rephrase things, not use certain words. It's, I don't know if the words used to describe this cross-tenant vulnerability are the biggest problem you should be focusing on right now. Yes, I understand that you can walk and chew gum at the same time as a big company, but it almost feels like the researchers are first screaming into a void, and then they're finally getting attention, but from all the people they don't want to get the attention from. It feels like this is not a welcoming environment for folks to report these things in good faith.Alex: [sigh]. Yeah, it's not. And I don't know what the solution is to that particular problem. I have opinions about why that exists. I won't go into those here, but it's cumbersome. It's difficult. I don't envy a lot of those research organizations.They're fantastic people coming up with great findings, they find really interesting stuff that comes out, but when you have to report and do that due diligence, that portion is not that fun. And then doing, you know, the fallout component, right: okay, now we have this thing we have to report, we have to go do something to fix it, you're right. I mean, people do often get really spun up on the verbiage or the implications and not just go fix the problem. And so again, if you have ways to mitigate that are more standards-based, that aren't specific to a particular cloud, like, you can use an open-source tool to mitigate, that can be quite the advantage.Corey: One of the challenges that I see across a wide swath of tooling and approaches to it has been that when I was trying to get some stuff to analyze CloudTrail logs in my own environment, I was really facing a bimodal distribution of options. On one end of the spectrum, it's a bunch of crappy stuff—or good stuff; hard to say—but it's all coming off of GitHub, open-source, build it yourself, et cetera. Good luck. And that's okay, awesome, but there's business value here and I'm thrilled to pay experts to make this problem go away.The other end of the spectrum is commercial security tooling, and it is almost impossible in my experience to find anything that costs less than $1,000 a month to start providing insight from a security perspective. Now, I understand the market forces that drive this. Truly I do, and I'm sympathetic to them. It is just as easy to sell $50,000 worth of software as it is $5,000 worth to an awful lot of companies, so yeah, go where the money is. But it also means that at the small end of the market, as hobbyists and startups are just getting started, there is a price barrier to engaging in the quote-unquote, “Proper way,” to do security.So, the posture suffers. “We'll bolt security on later when it becomes important” is the philosophy, and we've all seen how well that plays out in the fullness of time. How do you square that circle? I think the answer has to be open-source improving to the point where it's not just random scripts, but renowned projects.Alex: Correct, yeah, and I'd agree with that. And so, we're kind of in this interesting phase.
So, if you think about, like, raw Linux applications, right, Linux always has the tenet that you build an application to do one thing and do that one thing really, really, really well. And then you ended up with this thing called, like, you know, the Cacti monitoring stack. And so, you ended up having, like, 600 tools you strung together to get this one monitoring function done.We're kind of in a similar spot in a lot of ways right now, in the open-source security world where, like, if you want to do scanning, you can do, like, Clair or you can do Trivy or you have a couple different choices, right? If you want to do posture, you've got things like kube-bench that are out there. If you want to go do runtime security stuff, you've got something like Falco. So, you've got all these tools to string together, right, to give you all of these different components. And if you want, you can build it yourself, and you can run it yourself and it can be very fun and effective.But at some point in your life, you probably don't want to be care-and-feeding your child that you built, right? It's 18 years later now, and you want to go back to having your life, and so you end up buying a tool, right? That's why Gartner made this whole CNAPP category, right? It's this humongous category of products that are putting all of these different components together into one gigantic package. And the whole goal there is just to make lives a little bit easier because running all the tools yourself, it's fun, I love it, I did it myself for a long time, but eventually, you know, you want to try to work on some other stuff, too.Corey: At one point, I wound up running the numbers of all of the first-party security offerings that AWS offered, and for most use cases of significant scale, the cost for those security services was more than the cost of the theoretical breach that they'd be guarding against. And I think that there's a very dangerous incentive that arises when you start turning security observability into your own platform as a profit center. Because it's, well, we could make a lot of money if we don't actually fix the root issue and just sell tools to address and mitigate some of it—not that I think that's the intentional direction that these companies are taking these things and I don't want to ascribe malice to them, but you can feel that start to be the trend that some decisions get pushed in.Alex: Yeah, I mean, everything comes down to data, right? It has to be stored somewhere, processed somewhere, analyzed somewhere. That always has a cost with it. And so, that's always this notion of the shared security model, right? We have to have someone have ownership over that data, and most of the time, that's the end-user, right? It's their data, it's their responsibility.And so, these offerings become things that they have that you can tie into to work within the ecosystem, work within their infrastructure to get that value out of your data, right? You know, where is the security model going? Where do I have issues? Where do I have misconfigurations? But again, someone has to pay for that processing time. And so, that ends up having a pretty extreme cost to it.And so, it ends up being a hard problem to solve. And it gets even harder if you're multi-cloud, right? You can't necessarily use the tooling of AWS inside of Azure or inside of Google. And other products are trying to do that, right?
They're trying to be able to let you integrate their security center with other clouds as well.And it's kind of created this really interesting dichotomy where you almost have frenemies, right, where you've got, you know, a big Azure customer who's also a big AWS customer. Well, they want to go use Defender on all of their infrastructure, and Microsoft is trying to do their best to allow you to do that. Conversely, not all clouds operate in that same capacity. And you're correct, they all come at extremely different costs, they have different price models, they have different ways of going about it. And it becomes really difficult to figure out what is the best path forward.Generally, my stance is anything is better than nothing, right? So, if your only choice is using Defender to do all your stuff and it cost you an arm or leg, unfortunate, but great; at least you got something. If the path is, you know, go use this random open-source thing, great. Go do that. Early on, when I'd been at—was at Sysdig about five years ago, my big message was, you know, I don't care what you do. At least scan your containers. If you're doing nothing else in life, use Clair; scan the darn things. Don't do nothing.That's not really a problem these days, thankfully, but now we're more to a world where it's like, well, okay, you've got your containers, you've got your applications running in production. You've scanned them, that's great, but you're doing nothing at runtime. You're doing nothing in your posture world, right? Do something about it. So, maybe that is buy the enterprise tool from the cloud you're working in, buy it from some other vendor, use the open-source tool, do something.Thankfully, we live in a world where there are plenty of open tools out there we can adopt and leverage. You used the example of CloudTrail earlier. I don't know if you saw it, but there was a really, really cool talk at SharkFest last year from Gerald Combs where they leveraged Wireshark to be able to read CloudTrail logs. Which I thought was awesome.Corey: That feels more than a little bit ridiculous, just because it's—I mean I guess you could extract the JSON object across the wire then reassemble it. But, yeah, I need to think on that one.Alex: Yeah. So, it's actually really cool. They took the plugins from Falco that exist and they rewired Wireshark to leverage those plugins to read the JSON data from the CloudTrail and then wired it into the Wireshark interface to be able to do a visual inspect of CloudTrail logs. So, just like you could do, like, a follow this IP with a PCAP, you could do the same concept inside of your cloud log. So, if you look up Logray, you'll find it on the internet out there. You'll see demos of Gerald showing it off. 
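As a rough illustration of that follow-the-IP idea outside of Wireshark, here is a short Python sketch. It assumes you have already pulled down one of CloudTrail's gzipped JSON delivery files, which carry a top-level "Records" array; the file path and IP address below are placeholders.

```python
import gzip
import json

def events_from_ip(log_path: str, source_ip: str):
    """Yield (eventTime, eventName, identity ARN) for records matching an IP.

    CloudTrail delivers gzipped JSON files whose top-level "Records" list
    holds one object per API call, including "sourceIPAddress".
    """
    with gzip.open(log_path, "rt") as handle:
        records = json.load(handle).get("Records", [])

    for record in records:
        if record.get("sourceIPAddress") == source_ip:
            yield (
                record.get("eventTime"),
                record.get("eventName"),
                record.get("userIdentity", {}).get("arn", "unknown"),
            )

if __name__ == "__main__":
    # Placeholder path and address; point these at a real trail delivery.
    for event in events_from_ip("trail-log.json.gz", "203.0.113.10"):
        print(event)
```

It is nowhere near a Logray-style visual inspection, but it shows how little plumbing sits between the raw log format and a useful pivot.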
Logray was a pretty darn cool way to use a visualization that, let's be honest, most security professionals already know how to use, in a more modern infrastructure.Corey: One last topic that I want to go into with you before we call this an episode is something that's been bugging me more and more over the years—and it annoyed me a lot when I had to deal with this stuff as a SOC 2 control owner and it's gotten exponentially worse every time I've had to deal with it ever since—and that is the seeming view of compliance and security as being one and the same, to the point where in one of my accounts that I secured rather well, I thought, I installed Security Hub and finally jumped through all those hoops and paid the taxes and the rest and then waited 24 hours to gather some data, then 24 hours to gather more. Awesome. Applied the AWS-approved foundational security benchmark to it and it started shrieking its bloody head off about all of the things that were insecure and not configured properly. One of them, okay, great, it complained that the ‘Block all S3 Public Access' setting was not turned on for the account. So, I turned that on. Great.Now, it's still complaining that I have not gone through and also enabled the ‘Block Public Access Setting' on each and every S3 bucket within it. That is not improving your security posture in any meaningful way. That is box-checking so that someone in a compliance role can check that off and move on to the next thing on the clipboard. Now, originally, they started off being good-intentioned, but the result is I'm besieged by these things that don't actually matter and that means I'm not going to have time to focus on the things that actually do. Please tell me I'm wrong on some of this.Alex: [laugh].Corey: I really need to hear that.Alex: I can't. Unfortunately, I agree with you that a lot of that seems erroneous. But let's be honest, auditors have a job for a reason.Corey: Oh, I'm not besmirching the role of the auditor. Far from it. The problem I run into is that it's the Human Nessus report that dumps out, “Here's the 700 things to go fix in your environment,” as opposed to, “Here's the five things you can do right now that will meaningfully improve your security posture.”Alex: Yeah. And so, I think that's a place we see a lot of vendors moving, and I think that is the right path forward. Because we are in a world where we generate reports that are miles and miles long, we throw them over a wall to somebody, and that person says, “Are you crazy?” Like, “You want me to go do what with my time?” Like, “No. I can't. No. This is way too much.”And so, if we can narrow these things down to what matters the most today, and then what can we get rid of tomorrow, that makes life better for everybody. There are certainly ways to accomplish that across a lot of different dimensions, be that vulnerability management, or configuration management stuff, runtime stuff, and that is certainly the way we should approach it. Unfortunately, not all frameworks allow us to look at it that way.Corey: I mean, even AWS's thing here is yelling at me for a number of services not having encryption-at-rest turned on, like CloudTrail logs, or SNS topics. It's okay, let's be very clear what that is defending against: someone stealing drives out of a data center and taking them off to view the data. Is that something that I need to worry about in a public cloud provider context? Not unless I'm the CIA or something pretty close to that.
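For anyone facing the same pair of Security Hub findings Corey describes, here is a minimal, illustrative boto3 sketch, assuming credentials that can call sts, s3control, and s3: one call handles the account-level ‘Block all S3 Public Access' setting, and a loop handles each bucket. Treat it as a sketch to adapt and test, not a drop-in remediation.

```python
import boto3

# All four flags correspond to the console's "Block all public access" toggle.
BLOCK_ALL = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

def block_public_access_everywhere() -> None:
    """Enable S3 Block Public Access at the account level and on every bucket.

    The account-level call addresses the account-wide finding; the per-bucket
    loop addresses the follow-up per-bucket findings.
    """
    account_id = boto3.client("sts").get_caller_identity()["Account"]

    boto3.client("s3control").put_public_access_block(
        AccountId=account_id,
        PublicAccessBlockConfiguration=BLOCK_ALL,
    )

    s3 = boto3.client("s3")
    for bucket in s3.list_buckets()["Buckets"]:
        s3.put_public_access_block(
            Bucket=bucket["Name"],
            PublicAccessBlockConfiguration=BLOCK_ALL,
        )

if __name__ == "__main__":
    block_public_access_everywhere()
```

Whether flipping those flags meaningfully improves a given posture is, of course, exactly the judgment call being debated here.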
I mean, if you can get my data out of an AWS data center and survive, congratulations, I kind of feel like you've earned it at this point. But that obscures things I need to be doing that I'm not.Alex: Back in the day, I had a customer who used to have—they had storage arrays and their storage arrays' logins were the default login that they came with the array. They never changed it. You just logged in with admin and no password. And I was like, “You know, you should probably fix that.” And he sent a message back saying, “Yeah, you know, maybe I should, but my feeling is that if it got that far into my infrastructure where they can get to that interface, I'm already screwed, so it doesn't really matter to me if I set that admin password or not.”Corey: Yeah, there is a defense-in-depth argument to be made. I am not disputing that, but the Cisco world is melting down right now because of a bunch of very severe vulnerabilities that have been disclosed. But everything to exploit these things always requires, well you need access to the management interface. Back when I was a network administrator at Chapman University in 2006, even then, I knew, “Well, we certainly don't want to put the management interfaces on the same VLAN that's passing traffic.”So, is it good that there's an unpatched vulnerability there? No, but Shodan, the security vulnerability search engine shows over 80,000 instances that are affected on the public internet. It would never have occurred to me to put the management interface of important network gear on the public internet. That just is… I don't understand that.Alex: Yeah.Corey: So, on some level, I think the lesson here is that there's always someone who has something else to focus on at a given moment, and… where it's a spectrum: no one is fully secure, but ideally, you don't want to be the lowest of low-hanging fruit.Alex: Right, right. I mean, if you were fully secure, you'd just turn it off, but unfortunately, we can't do that. We have to have it be accessible because that's our jobs. And so, if we're having it be accessible, we got to do the best we can. And I think that is a good point, right? Not being the worst should be your goal, at the very, very least.Doing bare minimums, looking at those checks, deciding if they're relevant for you or not, just because it says the configuration is required, you know, is it required in your use case? Is it required for your requirements? Like, you know, are you a FedRAMP customer? Okay, yeah, it's probably a requirement because, you know, it's FedRAMP. They're going to tell you got to do it. But is it your dev environment? Is it your demo stuff? You know, where does it exist, right? There's certain areas where it makes sense to deal with it and certain areas where it makes sense to take care of it.Corey: I really want to thank you for taking the time to talk me through your thoughts on all this. If people want to learn more, where's the best place for them to find you?Alex: Yeah, so they can either go to sysdig.com/opensource. A bunch of open-source resources there. They can go to falco.org, read about the stuff on that site, as well. Lots of different ways to kind of go and get yourself educated on stuff in this space.Corey: And we will, of course, put links to that into the show notes. Thank you so much for being so generous with your time. I appreciate it.Alex: Yeah, thanks for having me. I appreciate it.Corey: Alexander Lawrence, principal security architect at Sysdig. 
I'm Cloud Economist Corey Quinn, and this episode has been brought to us by our friends, also at Sysdig. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that I will then read later when I pick it off the wire using Wireshark.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
How Couchbase is Using AI to Enhance the User Experience with Laurent Doguin

Screaming in the Cloud

Play Episode Listen Later Nov 14, 2023 31:52


Laurent Doguin, Director of Developer Relations & Strategy at Couchbase, joins Corey on Screaming in the Cloud to talk about the work that Couchbase is doing in the world of databases and developer relations, as well as the role of AI in their industry and beyond. Together, Corey and Laurent discuss Laurent's many different roles throughout his career including what made him want to come back to a role at Couchbase after stepping away for 5 years. Corey and Laurent dig deep on how Couchbase has grown in recent years and how it's using artificial intelligence to offer an even better experience to the end user.About LaurentLaurent Doguin is Director of Developer Relations & Strategy at Couchbase (NASDAQ: BASE), a cloud database platform company that 30% of the Fortune 100 depend on.Links Referenced: Couchbase: https://couchbase.com XKCD #927: https://xkcd.com/927/ dbdb.io: https://dbdb.io DB-Engines: https://db-engines.com/en/ Twitter: https://twitter.com/ldoguin LinkedIn: https://www.linkedin.com/in/ldoguin/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Are you navigating the complex web of API management, microservices, and Kubernetes in your organization? Solo.io is here to be your guide to connectivity in the cloud-native universe!Solo.io, the powerhouse behind Istio, is revolutionizing cloud-native application networking. They brought you Gloo Gateway, the lightweight and ultra-fast gateway built for modern API management, and Gloo Mesh Core, a necessary step to secure, support, and operate your Istio environment.Why struggle with the nuts and bolts of infrastructure when you can focus on what truly matters - your application. Solo.io's got your back with networking for applications, not infrastructure. Embrace zero trust security, GitOps automation, and seamless multi-cloud networking, all with Solo.io.And here's the real game-changer: a common interface for every connection, in every direction, all with one API. It's the future of connectivity, and it's called Gloo by Solo.io.DevOps and Platform Engineers, your journey to a seamless cloud-native experience starts here. Visit solo.io/screaminginthecloud today and level up your networking game.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. This promoted guest episode is brought to us by our friends at Couchbase. And before we start talking about Couchbase, I would rather talk about not being at Couchbase. Laurent Doguin is the Director of Developer Relations and Strategy at Couchbase. First, Laurent, thank you for joining me.Laurent: Thanks for having me. It's a pleasure to be here.Corey: So, what I find interesting is that this is your second time at Couchbase, where you were a developer advocate there for a couple of years, then you had five years of, we'll call it wilderness I suppose, and then you return to be the Director of Developer Relations. Which also ties into my personal working thesis of, the best way to get promoted at a lot of companies is to leave and then come back. But what caused you to decide, all right, I'm going to go work somewhere else? And what made you come back?Laurent: So, I've joined Couchbase in 2014. Spent about two or three years as a DA. 
And during those three years as a developer advocate, I've been advocating SQL database and I—at the time, it was mostly DBAs and ops I was talking to. And DBA and ops are, well, recent, modern ops are writing code, but they were not the people I wanted to talk to you when I was a developer advocate. I came from a background of developer, I've been a platform engineer for an enterprise content management company. I was writing code all day.And when I came to Couchbase, I realized I was mostly talking about Docker and Kubernetes, which is still cool, but not what I wanted to do. I wanted to talk about developers, how they use database to be better app, how they use key-value, and those weird thing like MapReduce. At the time, MapReduce was still, like, a weird thing for a lot of people, and probably still is because now everybody's doing SQL. So, that's what I wanted to talk about. I wanted to… engage with people identify with, really. And so, didn't happen. Left. Built a Platform as a Service company called Clever Cloud. They started about four or five years before I joined. We went from seven people to thirty-one LFs, fully bootstrapped, no VC. That's an interesting way to build a company in this age.Corey: Very hard to do because it takes a lot of upfront investment to build software, but you can sort of subsidize that via services, which is what we've done here in some respects. But yeah, that's a hard road to walk.Laurent: That's the model we had—and especially when your competition is AWS or Azure or GCP, so that was interesting. So entrepreneurship, it's not for everyone. I did my four years there and then I realized, maybe I'm going to do something else. I met my former colleagues of Couchbase at a software conference called Devoxx, in France, and they told me, “Well, there's a new sheriff in town. You should come back and talk to us. It's all about developers, we are repositioning, rehandling the way we do marketing at Couchbase. Why not have a conversation with our new CMO, John Kreisa?”And I said, “Well, I mean, I don't have anything to do. I actually built a brewery during that past year with some friends. That was great, but that's not going to feed me or anything. So yeah, let's have a conversation about work.” And so, I talked to John, I talked to a bunch of other people, and I realized [unintelligible 00:03:51], he actually changed, like, there was a—they were purposely going [against 00:03:55] developer, talking to developer. And that was not the case, necessarily, five, six years before that.So, that's why I came back. The product is still amazing, the people are still amazing. It was interesting to find a lot of people that still work there after, what, five years. And it's a company based in… California, headquartered in California, so you would expect people to, you know, jump around a bit. And I was pleasantly surprised to find the same folks there. So, that was also one of the reasons why I came back.Corey: It's always a strong endorsement when former employees rejoin a company. Because, I don't know about you, but I've always been aware of those companies you work for, you leave. Like, “Aw, I'm never doing that again for love or money,” just because it was such an unpleasant experience. So, it speaks well when you see companies that do have a culture of boomerangs, for lack of a better term.Laurent: That's the one we use internally, and there's a couple. 
More than a couple.Corey: So, one thing that seems to have been a thread through most of your career has been an emphasis on developer experience. And I don't know if we come at it from the same perspective, but to me, what drives nuts is honestly, with my work in cloud, bad developer experience manifests as the developer in question feeling like they're somehow not very good at their job. Like, they're somehow not understanding how all this stuff is supposed to work, and honestly, it leads to feeling like a giant fraud. And I find that it's pernicious because even when I intellectually know for a fact that I'm not the dumbest person ever to use this tool when I don't understand how something works, the bad developer experience manifests to me as, “You're not good enough.” At least, that's where I come at it from.Laurent: And also, I [unintelligible 00:05:34] to people that build these products because if we build the products, the user might be in the same position that we are right now. And so, we might be responsible for that experience [unintelligible 00:05:43] a developer, and that's not a great feeling. So, I completely agree with you. I've tried to… always on software-focused companies, whether it was Nuxeo, Couchbase, Clever Cloud, and then Couchbase. And I guess one of the good thing about coming back to a developer-focused era is all the product alignments.Like, a lot of people talk about product that [grows 00:06:08] and what it means. To me what it means was, what it meant—what it still means—building a product that developer wants to use, and not just want to, sometimes it's imposed to you, but actually are happy to use, and as you said, don't feel completely stupid about it in front of the product. It goes through different things. We've recently revamped our Couchbase UI, Couchbase Capella UI—Couchbase Capella is a managed cloud product—and so we've added a lot of in-product getting started guidelines, snippets of code, to help developers getting started better and not have that feeling of, “What am I doing? Why is it not working and what's going on?”Corey: That's an interesting decision to make, just because historically, working with a bunch of tools, the folks who are building the documentation working with that tool, tend to generally be experts at it, so they tend to optimize for improving things for the experience of someone has been using it for five years as opposed to the newcomer. So, I find that the longer a product is in existence, in many cases, the worse the new user experience becomes because companies tend to grow and sprawl in different ways, the product does likewise. And if you don't know the history behind it, “Oh, your company, what does it do?” And you look at the website and there's 50 different offerings that you have—like, the AWS landing page—it becomes overwhelming very quickly. So, it's neat to see that emphasis throughout the user interface on the new developer experience.On the other side of it, though, how are the folks who've been using it for a while respond to those changes? Because it's frustrating for me at least, when I log into a new account, which happens periodically within AWS land, and I have this giant series of onboarding pop-ups that I have to click to make go away every single time. How are they responding to it?Laurent: Yeah, it's interesting. One of the first things that struck me when I joined Couchbase the first time was the size of the technical documentation team. 
Because the whole… well, not the whole point, but part of the reason why they exist is to do that, to make sure that you understand all the differences and that it doesn't feel like the [unintelligible 00:08:18] what the documentation or the product pitch or everything. Like, they really, really, really emphasize on this from the very beginning. So, that was interesting.So, when you get that culture built into the products, well, the good thing is… when people try Couchbase, they usually stick with Couchbase. My main issue as a Director of the Developer Relations is not to make people stick with Couchbase because that works fairly well with the product that we have; it's to make them aware that we exist. That's the biggest issue I have. So, my goal as DevRel is to make sure that people get the trial, get through the trial, get all that in-app context, all that helps, get that first sample going, get that first… I'm not going to say product built because that's even a bit further down the line, but you know, get that sample going. We have a code playground, so when you're in the application, you get to actually execute different pieces of code, different languages. And so, we get those numbers and we're happy to see that people actually try that. And that's a, well, that's a good feeling.Corey: I think that there's a definite lack of awareness almost industry-wide around the fact that as the diversity of your customers increases, you have to have different approaches that meet them at various points along the journey. Because things that I've seen are okay, it's easy to ass—even just assuming a binary of, “Okay, I've done this before a thousand times; this is the thousand and first, I don't need the Hello World tutorial,” versus, “Oh, I have no idea what I'm doing. Give me the Hello World tutorial,” there are other points along that continuum, such as, “Oh, I used to do something like this, but it's been three years. Can you give me a refresher,” and so on. I think that there's a desire to try and fit every new user into a predefined persona and that just doesn't work very well as products become more sophisticated.Laurent: It's interesting, we actually have—we went through that work of defining those personas because there are many. And that was the origin of my departure. I had one person, ops slash DBA slash the person that maintain this thing, and I wanted to talk to all the other people that built the application space in Couchbase. So, we broadly segment things into back-end, full-stack, and mobile because Couchbase is also a mobile database. Well, we haven't talked too much about this, so I can explain you quickly what Couchbase is.It's basically a distributed JSON database with an integrated caching layer, so it's reasonably fast. So it does cache, and when the key-value is JSON, then you can create with SQL, you can do full-text search, you can do analytics, you can run user-defined function, you get triggers, you get all that actual SQL going on, it's transactional, you get joins, ANSI joins, you get all those… windowing function. It's modern SQL on the JSON database. So, it's a general-purpose database, and it's a general-purpose database that syncs.I think that's the important part of Couchbase. We are very good at syncing cluster of databases together. So, great for multi-cloud, hybrid cloud, on-prem, whatever suits you. 
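To make the "key-value plus SQL on JSON" model Laurent describes here concrete, the sketch below uses the Couchbase Python SDK against the travel-sample dataset that ships with the product. The connection string, credentials, and document key are placeholders, and option class names can vary between SDK versions, so treat this as an illustration rather than copy-paste configuration.

from datetime import timedelta

from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.options import ClusterOptions, QueryOptions

# Placeholders: point this at your own Capella or self-managed cluster.
cluster = Cluster(
    "couchbases://cb.example.com",
    ClusterOptions(PasswordAuthenticator("app_user", "app_password")),
)
cluster.wait_until_ready(timedelta(seconds=10))

# Key-value access: fetch a single JSON document directly by its key.
airlines = cluster.bucket("travel-sample").scope("inventory").collection("airline")
doc = airlines.get("airline_10").content_as[dict]  # placeholder key from the sample data
print(doc["name"])

# The same data queried with SQL++ (SQL over JSON), using a positional parameter.
result = cluster.query(
    "SELECT a.name, a.country "
    "FROM `travel-sample`.inventory.airline AS a "
    "WHERE a.country = $1 LIMIT 5",
    QueryOptions(positional_parameters=["United States"]),
)
for row in result:
    print(row)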
And we also sync on the device, there's a thing called Couchbase Mobile, which is a local database that runs in your phone, and it will sync automatically to the server. So, a general-purpose database that syncs and that's quite modern.We try to fit as much way of growing data as possible in our database. It's kind of a several-in-one database. We call that a data platform. It took me a while to warm up to the word platform because I used to work for an enterprise content management platform and then I've been working for a Platform as a Service and then a data platform. So, it took me a bit of time to warm up to that term, but it explained fairly well, the fact that it's a several-in-one product and we empower people to do the trade-offs that they want.Not everybody needs… SQL. Some people just need key-value, some people need search, some people need to do SQL and search in the same query, which we also want people to do. So, it's about choices, it's about empowering people. And that's why the word platform—which can feel intimidating because it can seem complex, you know, [for 00:12:34] a lot of choices. And choices is maybe the enemy of a good developer experience.And, you know, we can try to talk—we can talk for hours about this. The more services you offer, the more complicated it becomes. What's the sweet spots? We did—our own trade-off was to have good documentation and good in-app help to fix that complexity problem. That's the trade-off that we did.Corey: Well, we should probably divert here just to make sure that we cover the basic groundwork for those who might not be aware: what exactly is Couchbase? I know that it's a database, which honestly, anything is a database if you hold it incorrectly enough; that's my entire shtick. But what is it exactly? Where does it start? Where does it stop?Laurent: Oh, where does it start? That's an interesting question. It's a… a merge—some people would say a fork—of Apache CouchDB, and membase. Membase was a distributed key-value store and CouchDB was this weird Erlang and C JSON REST API database that was built by Damian Katz from Lotus Notes, and that was in 2006 or seven. That was before Node.js.Let's not care about the exact date. The point is, a JSON and REST API-enabled database before Node.js was, like, a strong [laugh] power move. And so, those two merged and created the first version of Couchbase. And then we've added all those things that people want to do, so SQL, full-text search, analytics, user-defined function, mobile sync, you know, all those things. So basically, a general-purpose database.Corey: For what things is it not a great fit? This is always my favorite question to ask database folks because the zealot is going to say, “It's good for every use case under the sun. Use it for everything, start to finish”—Laurent: Yes.Corey: —and very few databases can actually check that box.Laurent: It's a very interesting question because when I pitch like, “We do all the things,” because we are a platform, people say, “Well, you must be doing lots of trade-offs. Where is the trade-off?” The trade-off is basically the way you store something is going to determine the efficiency of your [growing 00:14:45]—or the way you [grow 00:14:47] it. And that's one of the first thing you learn in computer science. You learn about data structure and you know that it's easier to get something in a hashmap when you have the key than passing your whole list of elements and checking your data, is it right one? 
It's the same for databases.So, our different services are different ways to store the data and to query it. So, where is it not good, it's where we don't have an index or a service that answer to the way you want to query data. We don't have a graph service right now. You can still do recursive common table expression for the SQL nerds out there, that will allow you to do somewhat of a graph way of querying your data, but that's not, like, actual—that's not a great experience for people were expecting a graph, like a Neo4j or whatever was a graph database experience.So, that's the trade-off that we made. We have a lot of things at the same place and it can be a little hard, intimidating to operate, and the developer experience can be a little, “Oh, my God, what is this thing that can do all of those features?” At the same time, that's just, like, one SDK to learn for all of the features we've just talked about. So, that's what we did. That's a trade-off that we did.It sucks to operate—well, [unintelligible 00:16:05] Couchbase Capella, which is a lot like a vendor-ish thing to say, but that's the value props of our managed cloud. It's hard to operate, we'll operate this for you. We have a Kubernetes operator. If you are one of the few people that wants to do Kubernetes at home, that's also something you can do. So yeah, I guess what we cannot do is the thing that Route 53 and [Unbound 00:16:26] and [unintelligible 00:16:27] DNS do, which is this weird DNS database thing that you like so much.Corey: One thing that's, I guess, is a sign of the times, but I have to confess that I'm relatively skeptical around, when I pull up couchbase.com—as one does; you're publicly traded; I don't feel that your company has much of a choice in this—but the first thing it greets me with is Couchbase Capella—which, yes, that is your hosted flagship product; that should be the first thing I see on the website—then it says, “Announcing Capella iQ, AI-powered coding assistance for developers.” Which oh, great, not another one of these.So, all right, give me the pitch. What is the story around, “Ooh, everything that has been a problem before, AI is going to make it way better.” Because I've already talked to you about developer experience. I know where you stand on these things. I have a suspicion you would not be here to endorse something you don't believe in. How does the AI magic work in this context?Laurent: So, that's the thing, like, who's going to be the one that get their products out before the other? And so, we're announcing it on the website. It's available on the private preview only right now. I've tried it. It works.How does it works? The way most chatbot AI code generation work is there's a big model, large language model that people use and that people fine-tune into in order to specialize it to the tasks that they want to do. The way we've built Couchbase iQ is we picked a very famous large language model, and when you ask a question to a bot, there's a context, there's a… the size of the window basically, that allows you to fit as much contextual information as possible. 
The way it works and the reason why it's integrated into Couchbase Capella is we make sure that we preload that context as much as possible and fine-tune that model, that [foundation 00:18:19] model, as much as possible to do whatever you want to do with Couchbase, which usually falls into several—a couple of categories, really—well maybe three—you want to write SQL, you want to generate data—actually, that's four—you want to generate data, you want to generate code, and if you paste some SQL code or some application code, you want to ask that model, what does do? It's especially true for SQL queries.And one of the questions that many people ask and are scared of with chatbot is how does it work in terms of learning? If you give a chatbot to someone that's very new to something, and they're just going to basically use a chatbot like Stack Overflow and not really think about what they're doing, well it's not [great 00:19:03] right, but because that's the example that people think most developer will do is generate code. Writing code is, like, a small part of our job. Like, a substantial part of our job is understanding what the code does.Corey: We spend a lot more time reading code than writing it, if we're, you know—Laurent: Yes.Corey: Not completely foolish.Laurent: Absolutely. And sometimes reading big SQL query can be a bit daunting, especially if you're new to that. And one of the good things that you get—Corey: Oh, even if you're not, it can still be quite daunting, let me assure you.Laurent: [laugh]. I think it's an acquired taste, let's be honest. Some people like to write assembly code and some people like to write SQL. I'm sort of in the middle right now. You pass your SQL query, and it's going to tell you more or less what it does, and that's a very nice superpower of AI. I think that's [unintelligible 00:19:48] that's the one that interests me the most right now is using AI to understand and to work better with existing pieces of code.Because a lot of people think that the cost of software is writing the software. It's maintaining the codebase you've written. That's the cost of the software. That's our job as developers should be to write legacy code because it means you've provided value long enough. And so, if in a company that works pretty well and there's a lot of legacy code and there's a lot of new people coming in and they'll have to learn all those things, and to be honest, sometimes we don't document stuff as much as we should—Corey: “The code is self-documenting,” is one of the biggest lies I hear in tech.Laurent: Yes, of course, which is why people are asking retired people to go back to COBOL again because nobody can read it and it's not documented. Actually, if someone's looking for a company to build, I guess, explaining COBOL code with AI would be a pretty good fit to do in many places.Corey: Yeah, it feels like that's one of those things that would be of benefit to the larger world. The counterpoint to that is you got that many business processes wrapped around something running COBOL—and I assure you, if you don't, you would have migrated off of COBOL long before now—it's making sure that okay well, computers, when they're in the form of AI, are very, very good at being confident-sounding when they talk about things, but they can also do that when they're completely wrong. It's basically a BS generator. And that is a scary thing when you're taking a look at something that broad. 
I mean, I'll use the AI coding assistance for things all the time, but those things look a lot more like, “Okay, I haven't written CloudFormation from scratch in a while. Build out the template, just because I forget the exact sequence.” And it's mostly right on things like that. But then you start getting into some of the real nuanced areas like race conditions and the rest, and often it can make things worse instead of better. That's the scary part, for me, at least.Laurent: Most coding assistants are… and actually, each time you ask its opinion to an AI, they say, “Well, you should take this with a grain of salt and we are not a hundred percent sure that this is the case.” And this is, make sure you proofread that, which again, from a learning perspective, can be a bit hard to give to new students. Like, you're giving something to someone and might—that assumes is probably as right as Wikipedia but actually, it's not. And it's part of why it works so well. Like, the anthropomorphism that you get with chatbots, like, this, it feels so human. That's why it get people so excited about it because if you think about it, it's not that new. It's just the moment it took off was the moment it looked like an assertive human being.Corey: As you take a look through, I guess, the larger ecosystem now, as well as the database space, given that is where you specialize, what do you think people are getting right and what do you think people are getting wrong?Laurent: There's a couple of ways of seeing this. Right now, when I look at from the outside, every databases is going back to SQL, I think there's a good reason for that. And it's interesting to put into perspective with AI because when you generate something, there's probably less chance to generate something wrong with SQL than generating something with code directly. And I think five generation—was it four or five generation language—there some language generation, so basically, the first innovation is assembly [into 00:23:03] in one and then you get more evolved languages, and at some point you get SQL. And SQL is a way to very shortly express a whole lot of business logic.And I think what people are doing right now is going back to SQL. And it's been impressive to me how even new developers that were all about [ORMs 00:23:25] and [no-DMs 00:23:26], and you know, avoiding writing SQL as much as possible, are actually back to it. And that's, for an old guy like me—well I mean, not that old—it feels good. I think SQL is coming back with a vengeance and that makes me very happy. I think what people don't realize is that it also involves doing data modeling, right, and stuff because database like Couchbase that are schemaless exist. You should store your data without thinking about it, you should still do data modeling. It's important. So, I think that's the interesting bits. What are people doing wrong in that space? I'm… I don't want to say bad thing about other databases, so I cannot even process that thought right now.Corey: That's okay. I'm thrilled to say negative things about any database under the sun. They all haunt me. I mean, someone wants to describe SQL to me is the chess of the programming world and I feel like that's very accurate. I have found that it is far easier in working with databases to make mistakes that don't wash off after a new deployment than it is in most other realms of technology. 
And when you're lucky and have a particular aura, you tend to avoid that stuff, at least that was always my approach.Laurent: I think if I had something to say, so just like the XKCD about standards: like, “there's 14 standards. I'm going to do one that's going to unify them all.” And it's the same with database. There's a lot… a [laugh] lot of databases. Have you ever been on a website called dbdb.io?Corey: Which one is it? I'm sorry.Laurent: Dbdb.io is the database of databases, and it's very [laugh] interesting website for database nerds. And so, if you're into database, dbdb.io. And you will find Couchbase and you will find a whole bunch of other databases, and you'll get to know which database is derived from which other database, you get the history, you get all those things. It's actually pretty interesting.Corey: I'm familiar with DB-Engines, which is sort of like the ranking databases by popularity, and companies will bend over backwards to wind up hitting all of the various things that they want in that space. The counterpoint with all of it is that it's… it feels historically like there haven't exactly been an awful lot of, shall we say, huge innovations in databases for the past few years. I mean, sure, we hear about vectors all the time now because of the joy that's AI, but smarter people than I are talking about how, well that's more of a feature than it is a core database. And the continual battle that we all hear about constantly is—and deal with ourselves—of should we use a general-purpose database, or a task-specific database for this thing that I'm doing remains largely unsolved.Laurent: Yeah, what's new? And when you look at it, it's like, we are going back to our roots and bringing SQL again. So, is there anything new? I guess most of the new stuff, all the interesting stuff in the 2010s—well, basically with the cloud—were all about the distribution side of things and were all about distributed consensus, Zookeeper, etcd, all that stuff. Couchbase is using an RAFT-like algorithm to keep every node happy and under the same cluster.I think that's one of the most interesting things we've had for the past… well, not for the past ten years, but between, basically, 20 or… between the start of AWS and well, let's say seven years ago. I think the end of the distribution game was brought to us by the people that have atomic clock in every data center because that's what you use to synchronize things. So, that was interesting things. And then suddenly, there wasn't that much innovation in the distributed world, maybe because Aphyr disappeared from Twitter. That might be one of the reason. He's not here to scare people enough to be better at that.Aphyr was the person behind the test called the Jepsen Test [shoot 00:27:12]. I think his blog engine was called Call Me Maybe, and he was going through every distributed system and trying to break them. And that was super interesting. And it feels like we're not talking that much about this anymore. It really feels like database have gone back to the status of infrastructure.In 2010, it was not about infrastructure. It was about developer empowerment. It was about serving JSON and developer experience and making sure that you can code faster without some constraint in a distributed world. And like, we fixed this for the most part. And the way we fixed this—and as you said, lack of innovation, maybe—has brought databases back to an infrastructure layer.Again, it wasn't the case 15 years a—well, 2023—13 years ago. And that's interesting. 
When you look at the new generation of databases, sometimes it's just a gateway on top of a well-known database and they call that a database, but it provides higher-level services, provides higher-level bricks, better developer experience to developer to build stuff faster. We've been trying to do this with Couchbase App Service and our sync gateway, which is basically a gateway on top of a Couchbase cluster that allow you to manage authentication, authorization, that allows you to manage synchronization with your mobile device or with websites. And yeah, I think that's the most interesting thing to me in this industry is how it's been relegated back to infrastructure, and all the cool stuff, new stuff happens on the layer above that.Corey: I really want to thank you for taking the time to speak with me. If people want to learn more, where's the best place for them to find you?Laurent: Thanks for having me and for entertaining this conversation. I can be found anywhere on the internet with these six letters: L-D-O-G-U-I-N. That's actually 7 letters. Ldoguin. That's my handle on pretty much any social network. Ldoguin. So X, [BlueSky 00:29:21], LinkedIn. I don't know where to be anymore.Corey: I hear you. We'll put links to all of it in the [show notes 00:29:27] and let people figure out where they want to go on that. Thank you so much for taking the time to speak with me today. I really do appreciate it.Laurent: Thanks for having me.Corey: Laurent Doguin, Director of Developer Relations and Strategy at Couchbase. I'm Cloud Economist Corey Quinn and this episode has been brought to us by our friends at Couchbase. If you enjoyed this episode, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that you're not going to be able to submit properly because that platform of choice did not pay enough attention to the experience of typing in a comment.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Building a Strong Company Culture at Honeycomb with Mike Goldsmith

Screaming in the Cloud

Play Episode Listen Later Nov 9, 2023 32:31


Mike Goldsmith, Staff Software Engineer at Honeycomb, joins Corey on Screaming in the Cloud to talk about Open Telemetry, company culture, and the pros and cons of Go vs. .NET. Corey and Mike discuss why OTel is such an important tool, while pointing out its double-edged sword of being fully open-source and community-driven. Opening up about Honeycomb's company culture and how to find a work-life balance as a fully-remote employee, Mike points out how core-values and social interaction breathe life into a company like Honeycomb.About MikeMike is an OpenSource focused software engineer that builds tools to help users create, shape and deliver system & application telemetry. Mike contributes to a number of OpenTelemetry initiatives including being a maintainer for Go Auto instrumentation agent, Go proto packages and an emeritus .NET SDK maintainer..Links Referenced: Honeycomb: https://www.honeycomb.io/ Twitter: https://twitter.com/Mike_Goldsmith Honeycomb blog: https://www.honeycomb.io/blog LinkedIn: https://www.linkedin.com/in/mikegoldsmith/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by our friends at Honeycomb who I just love talking to. And we've gotten to talk to these folks a bunch of different times in a bunch of different ways. They've been a recurring sponsor of this show and my other media nonsense, they've been a reference customer for our consulting work at The Duckbill Group a couple of times now, and we just love working with them just because every time we do we learn something from it. I imagine today is going to be no exception. My guest is Mike Goldsmith, who's a staff software engineer over at Honeycomb. Mike, welcome to the show.Mike: Hello. Thank you for having me on the show today.Corey: So, I have been familiar with Honeycomb for a long time. And I'm still trying to break myself out of the misapprehension that, oh, they're a small, scrappy, 12-person company. You are very much not that anymore. So, we've gotten to a point now where I definitely have to ask the question: what part of the observability universe that Honeycomb encompasses do you focus on?Mike: For myself, I'm very focused on the telemetry side, so the place where I work on the tools that customers deploy in their own infrastructure to collect all of that useful data and make—that we can then send on to Honeycomb to make use of and help identify where the problems are, where things are changing, how we can best serve that data.Corey: You've been, I guess on some level, there's—I'm trying to make this not sound like an accusation, but I don't know if we can necessarily avoid that—you have been heavily involved in OpenTelemetry for a while, both professionally, as well as an open-source contributor in your free time because apparently you also don't know how to walk away from work when the workday is done. So, let's talk about that a little bit because I have a number of questions. 
Starting at the very beginning, for those who have not gone trekking through that particular part of the wilderness-slash-swamp, what is OpenTelemetry?Mike: So, OpenTelemetry is a vendor-agnostic set of tools that allow anybody to collect data about their system and then send it to a target back-end to make use of that data. The data, the visualization tools, and the tools that make use of that data are a variety of different things, so whether it's tracing data or metrics or logs, and then it's trying to take value from that. The big thing what OpenTelemetry is aimed at doing is making the collection of the data and the transit of the data to wherever you want to send it a community-owned resource, so it's not like you get vendor lock-in by going to using one competitor and then go to a different—you want to go and try a different tool and you've got to re-instrument or change your application heavily to make use of that. OpenTelemetry abstracts all that away, so all you need to know about is what you're instrumented with, what [unintelligible 00:03:22] can make of that data, and then you can send it to one or multiple different tools to make use of that data. So, you can even compare some tools side-by-side if you wanted to.Corey: So, given that it's an open format, from the customer side of the world, this sounds awesome. Is it envisioned that this is something—an instrument that gets instrumented at the application itself or once I send it to another observability vendor, is it envisioned that okay, if I send this data to Honeycomb, I can then instrument what Honeycomb sees about that and then send that onward somewhere else, maybe my ancient rsyslog server, maybe a different observability vendor that has a different emphasis. Like, how is it envisioned unfolding within the ecosystem? Like, in other words, can I build a giant ring of these things that just keep building an infinitely expensive loop?Mike: Yeah. So ideally, you would try and try to pick one or a few tools that will provide the most value that you can send to, and then it could answer all of the questions for you. So, at Honeycomb, we try to—we are primarily focused on tracing because we want to do application-level information to say, this user had this interaction, this is the context of what happened, these are the things that they clicked on, this is the information that flowed through your back-end system, this is the line-item order that was generated, the email content, all of those things all linked together so we know that person did this thing, it took this amount of time, and then over a longer period of time, from the analytics point of view, you can then say, “These are the most popular things that people are doing. This is typically how long it takes.” And then we can highlight outliers to say, “Okay, this person is having an issue.” This individual person, we can identify them and say, “This is an issue. This is what's different about what they're doing.”So, that's quite a unique tracing tool or opportunity there. So, that lets you really drive what's happening rather than what has happened. So, logs and metrics are very backward-looking to say, “This is the thing that this thing happened,” and tries to give you the context about it. 
Tracing tries to give you that extra layer of context to say that this thing happened and it had all of these things related to it, and why is it interesting?Corey: It's odd to me that vendors would be putting as much energy into OpenTelemetry—or OTel, as it seems to always be abbreviated as when I encounter it, so I'm using the term just so people, “Oh, wait, that's that thing I keep seeing. What is that?” Great—but it seems odd to me that vendors would be as embracing of that technology as they have been, just because historically, I remember whenever I had an application when I was using production in anger—which honestly, ‘anger' is a great name for the production environment—whenever I was trying to instrument things, it was okay, you'd have to grab this APM tools library and instrument there, and then something else as well, and you wound up with an order of operations where which one wrapped the other. And sometimes that caused problems. And of course, changing vendors meant you had to go and redeploy your entire application with different instrumentation and hope nothing broke. There was a lock-in story that was great for the incumbents back when that was state of the art. But even some of those incumbents are now embracing OTel. Why?Mike: I think it's because it's showing that there's such a diverse group of tools there, and [unintelligible 00:06:32] being the one that you've selected a number of years ago and then they could hold on to that. The momentum slowed because they were able to move at a slower pace because they were the organizations that allowed us—they were the de facto tooling. And then once new companies and competitors came around and we're open to trying to get a part of that market share, it's given the opportunity to then really pick the tool that is right for the job, rather than just the best than what is perceived to be the best tool because they're the largest one or the ones that most people are using. OpenTelemetry allows you to make an organization and a tool that's providing those tools focus on being the best at it, rather than just the biggest one.Corey: That is, I think, a more enlightened perspective than frankly, I expect a number of companies out there to have taken, just because it seems like lock-in seems to be the order of the day for an awful lot of companies. Like, “Okay, why are customers going to stay with us?” “Because we make it hard to leave,” is… I can understand the incentive, but that only works for so long if you're not actively solving a problem that customers have. One of the challenges that I ran into, even with OTel, was back when I was last trying to instrument a distributed application—which was built entirely on Lambda—is the fact that I was doing this for an application that was built entirely on Lambda. And it felt like the right answer was to, oh, just use an OTel layer—a Lambda layer that wound up providing the functionality you cared about.But every vendor seemed to have their own. Honeycomb had one, Lightstep had one, AWS had one, and now it's oh, dear, this is just the next evolution of that specific agent problem. How did that play out? Is that still the way it works? Is there other good reasons for this? Or is this just people trying to slap a logo on things?Mike: Yeah, so being a fully open-source project and a community-driven project is a double-edged sword in some ways. 
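As a concrete sketch of the application-level instrumentation Mike is describing, the Python snippet below creates a span for one user interaction, attaches business context as attributes, and exports it over OTLP. It assumes the opentelemetry-sdk and OTLP gRPC exporter packages are installed; the endpoint and header follow Honeycomb's documented OTLP ingest conventions but are placeholders here, and any OTLP-compatible backend could be swapped in by changing only the exporter configuration.

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Wire the SDK to an OTLP endpoint; swapping vendors means changing only this block.
provider = TracerProvider(resource=Resource.create({"service.name": "checkout-api"}))
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint="https://api.honeycomb.io:443",       # placeholder backend
            headers={"x-honeycomb-team": "YOUR_API_KEY"},  # placeholder credential
        )
    )
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer(__name__)

# One span per user interaction, with business context attached as attributes.
with tracer.start_as_current_span("place-order") as span:
    span.set_attribute("user.id", "user-1234")   # hypothetical attribute names
    span.set_attribute("order.line_items", 3)
    span.set_attribute("order.total_usd", 42.50)
    # Downstream calls made here become child spans in the same trace.

Registering a second BatchSpanProcessor with a different exporter is how you would send the same traces to two backends at once, which is the side-by-side comparison scenario Corey asks about.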
One it gives the opportunity for everybody to participate, everybody can move between tools a lot easier and you can try and find the best fit for you. The unfortunate part around open-source-driven projects like that means that it's extremely configuration-heavy because it can do anything; it's not opinionated, which means that if you want to have the opportunity to do everything, every possible use case is available to everyone all of the time. So, if you might have a very narrow use case, say, “I want to learn about this bit of information,” like, “I'm working with the [unintelligible 00:09:00] SDK. I want to talk about—I've got an [unintelligible 00:09:03] application and I want to collect data that's running in a Lambda layer.” The OpenTelemetry SDK that has to serve all of the [other 00:09:10] JavaScript projects across all the different instrumentations, possibly talking about auto-instrumentation, possibly talking about lots of other tools that can be built into that project, it just leads to a very highly configurable but very complicated tool.So, what the vendor specifics, what you've suggested there around like Honeycomb, or other organizations providing the layers, they're trying to simplify the usage of the SDK to make some of those assumptions for you that you are going to be sending telemetry to Honeycomb, you are going to be talking about an API key that is going to be in a particular format, it is easier to pass that information into the SDK so it knows how to communicate rather than—as well as where it's going to communicate that data to.Corey: There's a common story that I tend to find myself smacking into almost against my will, where I have found myself at the perfect intersection of a variety of different challenges, and for some reason, I have stumbled blindly and through no ill intent into ‘this is terrible' territory. I wound to finally getting blocked and getting distracted by something else shiny on this project about two years ago because the problem I was getting into was, okay, I got to start sending traces to various places and that was awesome, but now I wanted to annotate each span with a user identity that could be derived from code, and the way that it interfaced with the various Lambda layers at that point in time was, ooh, that's not going to be great. And I think there were a couple of GitHub issues opened on it as feature enhancements for a couple of layers. And then I, again, was still distracted by shiny things and never went back around to it. But I was left with the distinct impression that building something purely out of Lambda functions—and also probably popsicle sticks—is something of an edge case. Is there a particular software architecture or infrastructure architecture that OTel favors?Mike: I don't think it favors any in particular, but it definitely suffers because it's, as I said earlier, it's trying to do that avail—the single SDK is available to many different use cases, which has its own challenges because then it has to deal with so many different options. But I don't think OpenTelemetry has a specific, like, use case in mind. It's definitely focused on, like—sorry, telemetry tracing—tracing is focused on application telemetry. So, it's focused on about your code that you build yourself and then deploy. 
There are other tools that can collect operational data, things like the OpenTelemetry Collector is then available to sit outside of that process and say, what's going on in my system?But yeah, I wouldn't say that there's a specific infrastructure that it's aimed at doing. A lot of the cloud operators and tools are trying to make sure that that information is available and OpenTelemetry SDKs are available. But yeah, at the moment, it does require some knowledge around what's best for your application if you're not in complete control of all of the infrastructure that it's running in.Corey: It feels that with most things that are sort of pulled into the orbit of the CNCF—and OTel is no exception to this—that there's an idea that oh, well, everything is going to therefore be running in containers, on top of Kubernetes. And that might be unfair, but it also, frankly, winds up following pretty accurately what a lot of applications I'm seeing in client environments have been doing. Don't take it as a criticism. But it does seem like it is designed with an eye toward everything being microservices running on containers, scheduled which, from a infrastructure perspective, what appears to be willy-nilly abandoned, and how do you wind up gathering useful information out of that without drowning in data? That seems to be, from at least my brief experience with OTel, the direction it heads in. Is that directionally correct?Mike: Yeah, I think so. I think OpenTelemetry has a quite strong relationship with CNCF and therefore Kubernetes. That is a use case that we see as a very common with customers that we engage with, both at the prospect level and then just initial conversations, people using something like Kubernetes to do the application orchestration is very, very common. It's something that OpenTelemetry and Honeycomb are wanting to improve on as well. We want to get by a very good experience because it is so common when we come up to it that we want to have a very good, strong opinion around, well, if you're running in Kubernetes, these are the tools and these are the right ways to use OpenTelemetry to get the best out of it.Corey: I want to change gears a little bit. Something that's interested me about Honeycomb for a while has been its culture. Your founders have been very public about their views on a variety of different things that are not just engineering-centric, but tangential to it, like, engineering management: how not to be terrible at it. And based on a huge number of conversations I've had with folks over there, I'm inclined to agree that the stories they tell in public do align with how things go internally. Or at least if they're not, I would not expect you to admit it on the record, so either way, we'll just take that as a given.What I'm curious about is that you are many timezones away from their very nice office here in San Francisco. What's it like working remote in a company that is not fully distributed? Which is funny, we talk about distributed applications as if they're a given but distributed teams are still something we're wrangling with.Mike: Yeah, it's something that I've dealt with for quite a while, for maybe seven or eight years is worked with a few different organizations that are not based in my timezone. There's been a couple, primarily based in San Francisco area, so Pacific Time. An eight-hour time difference for the UK is challenging, it has its own challenges, but it also has a lot of benefits, too. 
So typically, I get to really have a lot of focus time on a morning. That means that I can start my day, look through whatever I think is appropriate for that morning, and not get interrupted very easily.I get a lot of time to think and plan and I think that's helped me at, like, the tech lead level because I can really focus on something and think it through without that level of interruption that I think some people do if you're working in the same timezone or even in the same office as someone. That approachability is just not naturally there. But the other side of that is that I have a very limited amount of natural overlap with people I work with on a day-to-day basis, so it's typically meetings from 2 till 5 p.m. most days to try and make sure that I build those social relationships, I'm talking to the right people, giving status updates, planning and that sort of thing. But it works for me. I really enjoy that balance of some ty—like, having a lot of focus time and having, like, then dedicated time to spend with people.And I think that's really important, as well is that a distributed team naturally means that you don't get to spend a lot of time with people and a lot of, like, one-on-one time with people, so that's something that I definitely focus on is doing a lot of social interaction as well. So, it's not just I have a meeting, we've got to stand up, we've got 15 minutes, and then everyone goes and does their own thing. I like to make sure that we have time so we can talk, we can connect to each other, we know each other, things that would—[unintelligible 00:16:35] that allow a space for conversations to happen that would naturally happen if you were sat next to somebody at a desk, or like, the more traditional, like, water cooler conversations. You hear somebody having a conversation, you go talk to them, that naturally evolves.Corey: That was where I ran into a lot of trouble with it myself. My first outing as a manager, I had—most of the people on my team were in the same room as I was, and then we had someone who was in Europe. And as much as we tried to include this person in all of our meetings, there was an intrinsic, “Let's go get a cup of coffee,” or, “Let's have a discussion and figure things out.” And sometimes it's four in the afternoon, we're going to figure something out, and they have long since gone to bed or have a life, hopefully. And it was one of those areas where despite a conscious effort to avoid this problem, it was very clear that they did not have an equal voice in the team dynamic, in the team functioning, in the team culture, and in many cases, some of the decisions we ultimately reached as an outgrowth of those sidebar conversations. This led to something of an almost religious belief for me, for at least a while, was that either everyone's distributed or no one is because otherwise you wind up with the unequal access problem. But it's clearly worked for you folks. How have you gotten around that?Mike: For Honeycomb, it was a conscious decision not long before the Covid pandemic that the team would be distributed first; the whole organization will be distributed first. So, a number of months before that happened, the intention was that anybody across the organization—which at the time, was only North America-based staff—would be able to do their job outside of the office. 
Because I think around the end of 2019 to the beginning of 2020, a lot of the staff were based in the San Francisco area and that was starting to grow, and want more staff to come into the business. And there were more opportunities for people outside of that area to join the business, so the business decided that if we're going to do this, if we're going to hire people outside of the local area, then we do want to make sure that, as you said, that everybody has an equal access, everyone has equal opportunity, they can participate, and everybody has the same opportunity to do those things. And that has definitely fed through pandemic, and then even when the office reopened and people can go back into the office. More than—I think there's only… maybe 25% of the company now is even in Pacific Time Zone. And then the office space itself is not very large considering the size of the company, so we couldn't fit everybody into our office space if we wanted to.Corey: Yeah, that's one of the constant growing challenges, too, that I understand that a lot of companies do see value in the idea of getting everyone together in a room. I know that I, for example, I'm a lot more effective and productive when I'm around other people. But I'm really expensive to their productivity because I am Captain Interrupter, which, you know, we have to recognize our limitations as we encounter them. But that also means that the office expense exceeds the AWS bill past a certain point of scale, and that is not a small thing. Like, I try not to take too much of a public opinion on should we be migrating everyone back to return-to-office as a mandate, yes, no, et cetera.I can see a bunch of different perspectives on this that are nuanced and I don't think it lends itself to my usual reactionary take on the Twitters, as it were, but it's a hard problem with no easy answer to it. Frankly, I also think it's a big mistake to do full-remote only for junior employees, just because so much of learning how the workforce works is through observation. You don't learn a lot about those unspoken dynamics in any other way than observing it directly.Mike: Yes, I fully agree. I think the stage that Honeycomb was at when I joined and has continued to be is that I think a very junior person joining an organization that is fully distributed is more challenging. It has different challenges, but it has more challenges because it doesn't have those… you can't just see something happening and know that that's the norm or that that's the expectation. You've got to push yourself into those in those different arenas, those different conversations, and it can be quite daunting when you're new to an organization, especially if you are not experienced in that organization or experienced in the role that you're currently occupying. Yeah, I think the distributed organizations is—fully distributed has its challenges and I think that's something that we do at Honeycomb is that we intentionally do that twice a year, maybe three times a year, bring in the people that do work very closely, bringing them together so they have that opportunity to work together, build those social interactions like I mentioned earlier, and then do some work together as well.And it builds a stronger trust relationship because of that, as well because you're reinforcing the social side with the work side in a face-to-face context. And there's just, there's no direct replacement for face-to-face. 
If you worked for somebody and never met them for over a year, it'd be very difficult to then just be in a room together and have a normal conversation.Corey: It takes a lot of effort because there's so much to a company culture that is not meetings or agenda-driven or talking about the work. I mean, companies get this wrong with community all the time where they think that a community is either a terrible option of people we can sell things to or more correctly, a place where users of our product or service or offering or platform can gather together to solve common challenges and share knowledge with each other. But where they fall flat often is it also has to have a social element. Like ohh, having a conversation about your lives is not on topic for this community Slack team is, great, that strangles community before it can even form, in many cases. And work is no different.Mike: Yeah, I fully agree. We see that with the Honeycomb Pollinators Slack channel. So, we use that as a primary way of community members to participate, talk to each other, share their experiences, and we can definitely see that there is a high level of social interaction alongside of that. They connect because they've got a shared interest or a shared tool or a shared problem that they're trying to solve, but we do see, like, people, the same people, reconnecting or re-communicating with each other because they have built that social connection there as well.And I think that's something that as organizations—like, OpenTelemetry is a community is more welcoming to that. And then you can participate with something that then transcends different organizations that you may work for as well because you're already part of this community. So, if that community then reaches to another organization, there's an opportunity to go, to move between organizations and then maintain a level of connection.Corey: That seems like one of the better approaches that people can have to this stuff. It's just a—the hard part, of course, is how do you change culture? I think the easy way to do it—the only easy way to do it—is you have to build the culture from the beginning. Every time I see companies bringing in outsiders to change the corporate culture, I can't help but feel that they're setting giant piles of money on fire. Culture is one of those things that's organic and just changing it by fiat doesn't work. If I knew how to actually change culture, I would have a much more lucrative target for my consultancy than I do today. You think AWS bills are a big problem? Everyone has a problem with company cultures.Mike: Yeah, I fully agree. I think that culture is something that you're right is very organic, it naturally happens. I think the value when organizations go through, like, a retrospective, like, what is our culture? How would we define it? What are the core values of that and how do we articulate that to people that might be coming into the organization, that's very valuable, too, because those core values are very useful to communicate to people.So, one of the bigger core values that we've got at Honeycomb is that—we refer to as, “We hire adults,” meaning that when somebody needs to do something, they just can go and do it. You don't have to report to somebody, you don't have to go and tell somebody, “I need a doctor appointment,” or, “I've got to go and pick up the kids from school,” or something like that. You're trusted to do your job to the highest level, and if you need additional help, you can ask for it. 
If somebody requires something of you they ask for it. They do it in a humane way and they expect to be treated like a human and an adult all of the time.Corey: On some level, I've always found, for better or worse, that people will largely respond to how you treat them and live up or down to the expectation placed upon them. You want a bunch of cogs who are going to have to raise their hand to go to the bathroom? Okay, you can staff that way if you want, but don't be surprised when those teams don't volunteer to come up with creative solutions to things either. You can micromanage people to death.Mike: Yeah. Yeah, definitely. I've been in organizations, like, fresh out of college and had to go to work at a particular place and it was very time-managed. And I had inbound sales calls and things like that and it was very, like, you've spent more than three minutes on a wrap-up call from having a previous call, and if you don't finish that call within three minutes, your manager will call your phone to say, “You need to go on to the next call.” And it's… you could have had a really important call or you could have had a very long call. They didn't care. They just wanted—you've had your time now move on to the next one and they didn't care.Corey: One last question I want to ask you about before we wind up calling this an episode, and it distills down to I guess, effectively, your history, for lack of a better term. You have done an awful lot of Go maintenance work—Go meaning the language, not the imperative command, to be clear—but you also historically were the .NET SDK maintainer for something or other. Do you find those languages to be similar or… how did that come to be? I mean, to be clear, my programming languages of choice are twofold: both brute force and enthusiasm. Most people take a slightly different path.Mike: Yeah, I worked with .NET for a very long time, so that was, like, the place—the first place that I joined as a real organization after finishing college was .NET and it just sort of stuck. I enjoyed the language. At the time, sort of, what 15 year—12, 15 years ago, the language itself was moving pretty well, there was things being added to it, it was enjoyable to use.Over the last maybe four or five years, I've had the opportunity to work a lot more in Go. And they are very different. So, Go is much more focused on simplicity and not hiding anything from anybody and just being very efficient at what you can see it does. .NET and many other languages such as Java, Ruby, JavaScript, Python, all have a level of magic to them, so if you're not part of the ecosystem or if you don't know particular really common packages that can do things for you, not knowing something about the ecosystem causes pain.I think Go takes away some of that because if you don't know those ecosystems or if you don't know those tools, you can still solve the problem fairly quickly and fairly simply. Tools will help but they're not required. .NET is probably on the boundary for me. It's still very easy to use, I enjoy using it, but it just… I found that it's not that long ago, I would say that I've switched from thinking like a .NET developer, so whenever I'm forming code in my head, like, how I would solve a problem, for a very long time, it was in .NET and C#.I'd probably say in the last 12 months or so, it's definitely moved more to Go just because of the simplicity. 
And it's also the tool that is most used within Honeycomb, especially, so if you're talking about Go code, you've got a wider audience to bounce ideas off, to talk to, communicate, get ideas from. .NET is not a very well used language within Honeycomb and probably even, like… even maybe West Coast-based organizations, it seems to be very high-level organizations that are willing to pay their money up for, like, Microsoft support. Like, Go is something that a lot of developers use because it's very simple, very quick, can move quick.Corey: I found that it was very easy for me to pick up Go to build out something ridiculous a few years back when I need to control my video camera through its ‘API' to use the term charitably. And it just works in a way that made an awful lot of sense. But I still find myself reaching for Python or for—God help me—TypeScript if I'm doing some CDK work these days. And honestly, they all tend to achieve more or less the same outcome. It's just different approaches to—well, to be unkind—dependency management in some cases, and also the ecosystem around it and what is done for you.I don't think there's a bad language to learn. I don't want this to be interpreted as language snobbery, but I haven't touched anything in the Microsoft ecosystem for a long time in production, so .NET was just never on my radar. But it's clear they have an absolutely massive community ecosystem built around it and that is no small thing. I'd say it rivals Java.Mike: Yeah definitely. I think over the last ten years or so, the popularity of .NET as a language to be built from enterprise, especially at larger-scale organizations have taken it on, and then, like, six, seven years ago, they introduced the .NET Core Framework, which allowed it to run on non-Windows platforms, and that accelerated the language dramatically, so they have a consistent API that can be used on Windows, on Linux, Mac, and that makes a huge difference for creating a larger audience for people to interact with it. And then also, with Azure becoming much more popular, they can have all of these—this language that people are typically used to using Linux as an operating system that runs infrastructure, but not being forced to use Windows is probably quite a big thing for Azure as well.Corey: I really want to thank you for taking the time to talk about what you're up to over there. If people want to learn more, where's the best place for them to go find you?Mike: Typically, I use Twitter, so it's Mike_Goldsmith. I create blogs on the Honeycomb blog website, which I've done a few different things; I've got a new one coming up soon to talk about different ways of collecting data. So yeah, those are the two main places. LinkedIn is usual as ever, but that's a little bit more work-focused.Corey: It does seem to be. And we'll put links to all of that in the [show notes 00:31:11]. Thank you so much for being so generous with your time, and of course, thank you Honeycomb for sponsoring this episode of my ridiculous podcast.Mike: Yeah, thank you very much for having me on.Corey: Mike Goldsmith, staff software engineer at Honeycomb. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment that we will then have instrumented across the board with a unified observability platform to keep our days eventful.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Learnings From A Lifelong Career in Open-Source with Amir Szekely

Screaming in the Cloud

Play Episode Listen Later Nov 7, 2023 38:47


Amir Szekely, Owner at CloudSnorkel, joins Corey on Screaming in the Cloud to discuss how he got his start in the early days of cloud and his solo project, CloudSnorkel. Throughout this conversation, Corey and Amir discuss the importance of being pragmatic when moving to the cloud, and the different approaches they see in developers from the early days of cloud to now. Amir shares what motivates him to develop open-source projects, and why he finds fulfillment in fixing bugs and operating CloudSnorkel as a one-man show. About AmirAmir Szekely is a cloud consultant specializing in deployment automation, AWS CDK, CloudFormation, and CI/CD. His background includes security, virtualization, and Windows development. Amir enjoys creating open-source projects like cdk-github-runners, cdk-turbo-layers, and NSIS.Links Referenced: CloudSnorkel: https://cloudsnorkel.com/ lasttootinaws.com: https://lasttootinaws.com camelcamelcamel.com: https://camelcamelcamel.com github.com/cloudsnorkel: https://github.com/cloudsnorkel Personal website: https://kichik.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and this is an episode that I have been angling for for longer than you might imagine. My guest today is Amir Szekely, who's the owner at CloudSnorkel. Amir, thank you for joining me.Amir: Thanks for having me, Corey. I love being here.Corey: So, I've been using one of your open-source projects for an embarrassingly long amount of time, and for the longest time, I make the critical mistake of referring to the project itself as CloudSnorkel because that's the word that shows up in the GitHub project that I can actually see that jumps out at me. The actual name of the project within your org is cdk-github-runners if I'm not mistaken.Amir: That's real original, right?Corey: Exactly. It's like, “Oh, good, I'll just mention that, and suddenly everyone will know what I'm talking about.” But ignoring the problems of naming things well, which is a pain that everyone at AWS or who uses it knows far too well, the product is basically magic. Before I wind up basically embarrassing myself by doing a poor job of explaining what it is, how do you think about it?Amir: Well, I mean, it's a pretty simple project, which I think what makes it great as well. It creates GitHub runners with CDK. That's about it. It's in the name, and it just does that. And I really tried to make it as simple as possible and kind of learn from other projects that I've seen that are similar, and basically learn from my pain points in them.I think the reason I started is because I actually deployed CDK runners—sorry, GitHub runners—for one company, and I ended up using the Kubernetes one, right? So, GitHub in themselves, they have two projects they recommend—and not to nudge GitHub, please recommend my project one day as well—they have the Kubernetes controller and they have the Terraform deployer. And the specific client that I worked for, they wanted to use Kubernetes. And I tried to deploy it, and, Corey, I swear, I worked three days; three days to deploy the thing, which was crazy to me. 
And every single step of the way, I had to go and read some documentation, figure out what I did wrong, and apparently the order the documentation was was incorrect.And I had to—I even opened tickets, and they—you know, they were rightfully like, “It's open-source project. Please contribute and fix the documentation for us.” At that point, I said, “Nah.” [laugh]. Let me create something better with CDK and I decided just to have the simplest setup possible.So usually, right, what you end up doing in these projects, you have to set up either secrets or SSM parameters, and you have to prepare the ground and you have to get your GitHub token and all those things. And that's just annoying. So, I decided to create a—Corey: So much busy work.Amir: Yes, yeah, so much busy work and so much boilerplate and so much figuring out the right way and the right order, and just annoying. So, I decided to create a setup page. I thought, “What if you can actually install it just like you install any app on GitHub,” which is the way it's supposed to be right? So, when you install cdk-github-runners—CloudSnorkel—you get an HTML page and you just click a few buttons and you tell it where to install it and it just installs it for you. And it sets the secrets and everything. And if you want to change the secret, you don't have to redeploy. You can just change the secret, right? You have to roll the token over or whatever. So, it's much, much easier to install.Corey: And I feel like I discovered this project through one of the more surreal approaches—and I had cause to revisit it a few weeks ago when I was redoing my talk for the CDK Community Day, which has since happened and people liked the talk—and I mentioned what CloudSnorkel had been doing and how I was using the runners accordingly. So, that was what I accidentally caused me to pop back up with, “Hey, I've got some issues here.” But we'll get to that. Because once upon a time, I built a Twitter client for creating threads because shitposting is my love language, I would sit and create Twitter threads in the middle of live keynote talks. Threading in the native client was always terrible, and I wanted to build something that would help me do that. So, I did.And it was up for a while. It's not anymore because I'm not paying $42,000 a month in API costs to some jackass, but it still exists in the form of lasttootinaws.com if you want to create threads on Mastodon. But after I put this out, some people complained that it was slow.To which my response was, “What do you mean? It's super fast for me in San Francisco talking to it hosted in Oregon.” But on every round trip from halfway around the world, it became a problem. So, I got it into my head that since this thing was fully stateless, other than a Lambda function being fronted via an API Gateway, that I should deploy it to every region. It didn't quite fit into a Cloudflare Worker or into one of the Edge Lambda functions that AWS has given up on, but okay, how do I deploy something to every region?And the answer is, with great difficulty because it's clear that no one was ever imagining with all those regions that anyone would use all of them. It's imagined that most customers use two or three, but customers are different, so which two or three is going to be widely varied. So, anything halfway sensible about doing deployments like this didn't work out. 
Again, because this thing was also a Lambda function and an API Gateway, it was dirt cheap, so I didn't really want to start spending stupid amounts of money doing deployment infrastructure and the rest.So okay, how do I do this? Well, GitHub Actions is awesome. It is basically what all of AWS's code offerings wish that they were. CodeBuild is sad and this was kind of great. The problem is, once you're out of the free tier, and if you're a bad developer where you do a deploy on every iteration, suddenly it starts costing for what I was doing in every region, something like a quarter of per deploy, which adds up when you're really, really bad at programming.Amir: [laugh].Corey: So, their matrix jobs are awesome, but I wanted to do some self-hosted runners. How do I do that? And I want to keep it cheap, so how do I do a self-hosted runner inside of a Lambda function? Which led me directly to you. And it was nothing short of astonishing. This was a few years ago. I seem to recall that it used to be a bit less well-architected in terms of its elegance. Did it always use step functions, for example, to wind up orchestrating these things?Amir: Yeah, so I do remember that day. We met pretty much… basically as a joke because the Lambda Runner was a joke that I did, and I posted on Twitter, and I was half-proud of my joke that starts in ten seconds, right? But yeah, no, the—I think it always used functions. I've been kind of in love with the functions for the past two years. They just—they're nice.Corey: Oh, they're magic, and AWS is so bad at telling their story. Both of those things are true.Amir: Yeah. And the API is not amazing. But like, when you get it working—and you know, you have to spend some time to get it working—it's really nice because then you have nothing to manage, ever. And they can call APIs directly now, so you don't have to even create Lambdas. It's pretty cool.Corey: And what I loved is you wind up deploying this thing to whatever account you want it to live within. What is it, the OIDC? I always get those letters in the wrong direction. OIDC, I think, is correct.Amir: I think it's OIDC, yeah.Corey: Yeah, and it winds up doing this through a secure method as opposed to just okay, now anyone with access to the project can deploy into your account, which is not ideal. And it just works. It spins up a whole bunch of these Lambda functions that are using a Docker image as the deployment environment. And yeah, all right, if effectively my CDK deploy—which is what it's doing inside of this thing—doesn't complete within 15 minutes, then it's not going to and the thing is going to break out. We've solved the halting problem. After 15 minutes, the loop will terminate. The end.But that's never been a problem, even with getting ACM certificates spun up. It completes well within that time limit. And its cost to me is effectively nothing. With one key exception: that you made the choice to use Secrets Manager to wind up storing a lot of the things it cares about instead of Parameter Store, so I think you wind up costing me—I think there's two of those different secrets, so that's 80 cents a month. Which I will be demanding in blood one of these days if I ever catch you at re:Invent.Amir: I'll buy you beer [laugh].Corey: There we go. That'll count. That'll buy, like, several months of that. That works—at re:Invent, no. The beers there are, like, $18, so that'll cover me for years. We're set.Amir: We'll split it [laugh].Corey: Exactly. Problem solved. 
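For readers curious about the "deploy it to every region" part, the usual CDK shape for that is to instantiate the same stateless stack once per region and let CI (for example, a GitHub Actions matrix job running one cdk deploy per region on those self-hosted runners) fan the deploys out. The sketch below is a generic illustration rather than how lasttootinaws is actually built; the stack name, asset path, and region list are invented for the example.

// bin/app.ts: the same stateless stack, instantiated once per region.
// "ThreadAppStack", the asset path, and the region list are placeholders.
import { App, Stack, StackProps } from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as apigw from 'aws-cdk-lib/aws-apigateway';

class ThreadAppStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    // One stateless Lambda function fronted by API Gateway, as described above.
    const handler = new lambda.Function(this, 'Handler', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      code: lambda.Code.fromAsset('dist'),
    });
    new apigw.LambdaRestApi(this, 'Api', { handler });
  }
}

const app = new App();
// ...extend this list with every region you actually want to serve from.
const regions = ['us-east-1', 'us-west-2', 'eu-west-1', 'ap-southeast-2'];
for (const region of regions) {
  new ThreadAppStack(app, `thread-app-${region}`, {
    env: { account: process.env.CDK_DEFAULT_ACCOUNT, region },
  });
}

A matrix workflow then runs something like cdk deploy thread-app-<region> for each entry in parallel, which is exactly where the cheap, disposable self-hosted runners discussed next come in.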
But I like the elegance of it, I like how clever it is, and I want to be very clear, though, it's not just for shitposting. Because it's very configurable where, yes, you can use Lambda functions, you can use Spot Instances, you can use CodeBuild containers, you can use Fargate containers, you can use EC2 instances, and it just automatically orchestrates and adds these self-hosted runners to your account, and every build gets a pristine environment as a result. That is no small thing.Amir: Oh, and I love making things configurable. People really appreciate it I feel, you know, and gives people kind of a sense of power. But as long as you make that configuration simple enough, right, or at least the defaults good defaults, right, then, even with that power, people still don't shoot themselves in the foot and it still works really well. By the way, we just added ECS recently, which people really were asking for because it gives you the, kind of, easy option to have the runner—well, not the runner but at least the runner infrastructure staying up, right? So, you can have auto-scaling group backing ECS and then the runner can start up a lot faster. It was actually very important to other people because Lambda, as fast that it is, it's limited, and Fargate, for whatever reason, still to this day, takes a minute to start up.Corey: Yeah. What's wild to me about this is, start to finish, I hit a deploy to the main branch and it sparks the thing up, runs the deploy. Deploy itself takes a little over two minutes. And every time I do this, within three minutes of me pushing to commit, the deploy is done globally. It is lightning fast.And I know it's easy to lose yourself in the idea of this being a giant shitpost, where, oh, who's going to do deployment jobs in Lambda functions? Well, kind of a lot of us for a variety of reasons, some of which might be better than others. In my case, it was just because I was cheap, but the massive parallelization ability to do 20 simultaneous deploys in a matrix configuration that doesn't wind up smacking into rate limits everywhere, that was kind of great.Amir: Yeah, we have seen people use Lambda a lot. It's mostly for, yeah, like you said, small jobs. And the environment that they give you, it's kind of limited, so you can't actually install packages, right? There is no sudo, and you can't actually install anything unless it's in your temp directory. But still, like, just being able to run a lot of little jobs, it's really great. Yeah.Corey: And you can also make sure that there's a Docker image ready to go with the stuff that you need, just by configuring how the build works in the CDK. I will admit, I did have a couple of bug reports for you. One was kind of useful, where it was not at all clear how to do this on top of a Graviton-based Lambda function—because yeah, that was back when not everything really supported ARM architectures super well—and a couple of other times when the documentation was fairly ambiguous from my perspective, where it wasn't at all clear, what was I doing? I spent four hours trying to beat my way through it, I give up, filed an issue, went to get a cup of coffee, came back, and the answer was sitting there waiting for me because I'm not convinced you sleep.Amir: Well, I am a vampire. My last name is from the Transylvania area [laugh]. So—Corey: Excellent. Excellent.Amir: By the way, not the first time people tell me that. 
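To make the wiring concrete, here is a minimal sketch of deploying the construct with two providers. The class and property names (GitHubRunners, LambdaRunnerProvider, CodeBuildRunnerProvider, labels) are written from memory of the project's README and have shifted between releases, so treat them as assumptions to verify against the current docs rather than the definitive API.

// Rough sketch of self-hosted runners via the construct discussed here.
// Names below are recalled from the cdk-github-runners README and may have
// drifted between versions; check the project docs before relying on them.
import { App, Stack } from 'aws-cdk-lib';
import {
  GitHubRunners,
  LambdaRunnerProvider,
  CodeBuildRunnerProvider,
} from '@cloudsnorkel/cdk-github-runners';

const app = new App();
const stack = new Stack(app, 'github-runners');

new GitHubRunners(stack, 'runners', {
  providers: [
    // Small, fast jobs (like the CDK deploys above) land on Lambda...
    new LambdaRunnerProvider(stack, 'lambda-runner', { labels: ['lambda'] }),
    // ...while heavier builds go to CodeBuild; the Fargate, ECS, and EC2
    // providers mentioned in the conversation follow the same shape.
    new CodeBuildRunnerProvider(stack, 'codebuild-runner', { labels: ['codebuild'] }),
  ],
});

After cdk deploy, the stack points you at the setup page Amir describes for installing the GitHub app and storing the token, and individual workflows opt in with runs-on: [self-hosted, lambda] (or whichever label matches the provider they want).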
But anyway [laugh].Corey: There's something to be said for getting immediate responsiveness because one of the reasons I'm always so loath to go and do a support ticket anywhere is this is going to take weeks. And then someone's going to come back with a, “I don't get it.” And try and, like, read the support portfolio to you. No, you went right into yeah, it's this. Fix it and your problem goes away. And sure enough, it did.Amir: The escalation process that some companies put you through is very frustrating. I mean, lucky for you, CloudSnorkel is a one-man show and this man loves solving bugs. So [laugh].Corey: Yeah. Do you know of anyone using it for anything that isn't ridiculous and trivial like what I'm using it for?Amir: Yeah, I have to think whether or not I can… I mean, so—okay. We have a bunch of dedicated users, right, the GitHub repo, that keep posting bugs and keep posting even patches, right, so you can tell that they're using it. I even have one sponsor, one recurring sponsor on GitHub that uses it.Corey: It's always nice when people thank you via money.Amir: Yeah. Yeah, it is very validating. I think [BLEEP] is using it, but I also don't think I can actually say it because I got it from the GitHub.Corey: It's always fun. That's the beautiful part about open-source. You don't know who's using this. You see what other things people are working on, and you never know, is one of their—is this someone's side project, is it a skunkworks thing, or God forbid, is this inside of every car going forward and no one bothered to tell me about that. That is the magic and mystery of open-source. And you've been doing open-source for longer than I have and I thought I was old. You were originally named in some of the WinAMP credits, for God's sake, that media player that really whipped the llama's ass.Amir: Oh, yeah, I started real early. I started about when I was 15, I think. I started off with Pascal or something or even Perl, and then I decided I have to learn C and I have to learn Windows API. I don't know what possessed me to do that. Win32 API is… unique [laugh].But once I created those applications for myself, right, I think there was—oh my God, do you know the—what is it called, Sherlock in macOS, right? And these days, for PowerToys, there is the equivalent of it called, I don't know, whatever that—PowerBar? That's exactly—that was that. That's a project I created as a kid. I wanted something where I can go to the Run menu of Windows when you hit Winkey R, and you can just type something and it will start it up, right?I didn't want to go to the Start menu and browse and click things. I wanted to do everything with the keyboard. So, I created something called Blazerun [laugh], which [laugh] helped you really easily create shortcuts that went into your path, right, the Windows path, so you can really easily start them from Winkey R. I don't think that anyone besides me used it, but anyway, that thing needed an installer, right? Because Windows, you got to install things. So, I ended up—Corey: Yeah, these days on Mac OS, I use Alfred for that which is kind of long in the tooth, but there's a launch bar and a bunch of other stuff for it. What I love is that if I—I can double-tap the command key and that just pops up whatever I need it to and tell the computer what to do. It feels like there's an AI play in there somewhere if people can figure out how to spend ten minutes on building AI that does something other than lets them fire their customer service staff.Amir: Oh, my God. 
Please don't fire customer service staff. AI is so bad.Corey: Yeah, when I reach out to talk to a human, I really needed a human.Amir: Yes. Like, I'm not calling you because I want to talk to a robot. I know there's a website. Leave me alone, just give me a person.Corey: Yeah. Like, you already failed to solve my problem on your website. It's person time.Amir: Exactly. Oh, my God. Anyway [laugh]. So, I had to create an installer, right, and I found it was called NSIS. So, it was a Nullsoft “SuperPiMP” installation system. Or in the future, when Justin, the guy who created Winamp and NSIS, tried to tone down a little bit, Nullsoft Scriptable Installation System. And SuperPiMP is—this is such useless history for you, right, but SuperPiMP is the next generation of PiMP which is Plug-in Mini Packager [laugh].Corey: I remember so many of the—like, these days, no one would ever name any project like that, just because it's so off-putting to people with sensibilities, but back then that was half the stuff that came out. “Oh, you don't like how this thing I built for free in the wee hours when I wasn't working at my fast food job wound up—you know, like, how I chose to name it, well, that's okay. Don't use it. Go build your own. Oh, what you're using it anyway. That's what I thought.”Amir: Yeah. The source code was filled with profanity, too. And like, I didn't care, I really did not care, but some people would complain and open bug reports and patches. And my policy was kind of like, okay if you're complaining, I'm just going to ignore you. If you're opening a patch, fine, I'm going to accept that you're—you guys want to create something that's sensible for everybody, sure.I mean, it's just source code, you know? Whatever. So yeah, I started working on that NSIS. I used it for myself and I joined the forums—and this kind of answers to your question of why I respond to things so fast, just because of the fun—I did the same when I was 15, right? I started going on the forums, you remember forums? You remember that [laugh]?Corey: Oh, yeah, back before they all became terrible and monetized.Amir: Oh, yeah. So, you know, people were using NSIS, too, and they had requests, right? They wanted. Back in the day—what was it—there was only support for 16-bit colors for the icon, so they want 32-bit colors and big colors—32—big icon, sorry, 32 pixels by 32 pixels. Remember, 32 pixels?Corey: Oh, yes. Not well, and not happily, but I remember it.Amir: Yeah. So, I started just, you know, giving people—working on that open-source and creating up a fork. It wasn't even called ‘fork' back then, but yeah, I created, like, a little fork of myself and I started adding all these features. And people were really happy, and kind of created, like, this happy cycle for myself: when people were happy, I was happy coding. And then people were happy by what I was coding. And then they were asking for more and they were getting happier, the more I responded.So, it was kind of like a serotonin cycle that made me happy and made everybody happy. So, it's like a win, win, win, win, win. And that's how I started with open-source. And eventually… NSIS—again, that installation system—got so big, like, my fork got so big, and Justin, the guy who works on WinAMP and NSIS, he had other things to deal with. You know, there's a whole history there with AOL. I'm sure you've heard all the funny stories.Corey: Oh, yes. 
In fact, one thing that—you want to talk about weird collisions of things crossing, one of the things I picked up from your bio when you finally got tired of telling me no and agreed to be on the show was that you're also one of the team who works on camelcamelcamel.com. And I keep forgetting that's one of those things that most people have no idea exists. But it's very simple: all it does is it tracks Amazon products that you tell it to and alerts you when there's a price drop on the thing that you're looking at.It's something that is useful. I try and use it for things of substance or hobbies because I feel really pathetic when I'm like, get excited emails about a price drop in toilet paper. But you know, it's very handy just to keep an idea for price history, where okay, am I actually being ripped off? Oh, they claim it's their big Amazon Deals day and this is 40% off. Let's see what camelcamelcamel has to say.Oh, surprise. They just jacked the price right beforehand and now knocked 40% off. Genius. I love that. It always felt like something that was going to be blown off the radar by Amazon being displeased, but I discovered you folks in 2010 and here you are now, 13 years later, still here. I will say the website looks a lot better now.Amir: [laugh]. That's a recent change. I actually joined camel, maybe two or three years ago. I wasn't there from the beginning. But I knew the guy who created it—again, as you were saying—from the Winamp days, right? So, we were both working in the free—well, it wasn't freenode. It was not freenode. It was a separate IRC server that, again, Justin created for himself. It was called landoleet.Corey: Mmm. I never encountered that one.Amir: Yeah, no, it was pretty private. The only people that cared about WinAMP and NSIS ended up joining there. But it was a lot of fun. I met a lot of friends there. And yeah, I met Daniel Green there as well, and he's the guy that created, along with some other people in there that I think want to remain anonymous so I'm not going to mention, but they also were on the camel project.And yeah, I was kind of doing my poor version of shitposting on Twitter about AWS, kind of starting to get some traction and maybe some clients and talk about AWS so people can approach me, and Daniel approached me out of the blue and he was like, “Do you just post about AWS on Twitter or do you also do some AWS work?” I was like, “I do some AWS work.”Corey: Yes, as do all of us. It's one of those, well crap, we're getting called out now. “Do you actually know how any of this stuff works?” Like, “Much to my everlasting shame, yes. Why are you asking?”Amir: Oh, my God, no, I cannot fix your printer. Leave me alone.Corey: Mm-hm.Amir: I don't want to fix your Lambdas. No, but I do actually want to fix your Lambdas. And so, [laugh] he approached me and he asked if I can help them move camelcamelcamel from their data center to AWS. So, that was a nice big project. So, we moved, actually, all of camelcamelcamel into AWS. And this is how I found myself not only in the Winamp credits, but also in the camelcamelcamel credits page, which has a great picture of me riding a camel.Corey: Excellent. But one of the things I've always found has been that when you take an application that has been pre-existing for a while in a data center and then move it into the cloud, you suddenly have to care about things that no one sensible pays any attention to in the land of the data center. 
Because it's like, “What do I care about how much data passes between my application server and the database? Wait, what do you mean that in this configuration, that's a chargeable data transfer? Oh, dear Lord.” And things that you've never had to think about optimizing are suddenly things you are very much optimizing. Because let's face it, when it comes to putting things in racks and then running servers, you aren't auto-scaling those things, so everything tends to be running over-provisioned, for very good reasons. It's an interesting education. Anything you picked out from that process that you think it'd be useful for folks to bear in mind if they're staring down the barrel of the same thing? Amir: Yeah, for sure. I think… in general, right, not just here. But in general, you always want to be pragmatic, right? You don't want to take steps that are huge, right? So, the thing we did was not necessarily rewrite everything and change everything to AWS and move everything to Lambda and move everything to Docker. Basically, we did a mini lift-and-shift, but not exactly lift-and-shift, right? We didn't take it as is. We moved to RDS, we moved to ElastiCache, right, we obviously made use of security groups and session connect and we dropped SSH Sage and we improved the security a lot and we locked everything down, all the permissions and all that kind of stuff, right? But like you said, there's stuff that you start having to pay attention to. In our case, it was less the data transfer because we have a pretty good CDN. There was more of IOPS. So—and IOPS, specifically for a database. We had a huge database with about one terabyte of data and a lot of it is that price history that you see, right? So, all those nice little graphs that we create in—what do you call them, charts—that we create in camelcamelcamel off the price history. There's a lot of data behind that. And what we always want to do is actually remove that from MySQL, which has been kind of struggling with it even before the move to AWS, but after the move to AWS, where everything was no longer over-provisioned and we couldn't just buy a few more NVMes on Amazon for 100 bucks when they were on sale—back when we had to pay Amazon—Corey: And you know, when they're on sale. That's the best part. Amir: And we know [laugh]. We get good prices on NVMe. But yeah, on Amazon—on AWS, sorry—you have to pay for io1 or something, and that adds up real quick, as you were saying. So, part of that move was also to move to something that was a little better for that data structure. And we actually removed just that data, the price history, the price points from MySQL to DynamoDB, which was a pretty nice little project. Actually, I wrote about it in my blog. There is, kind of, lessons learned from moving one terabyte from MySQL to DynamoDB, and I think the biggest lesson was about the hidden price of storage in DynamoDB. But before that, I want to talk about what you asked, which was the way that other people should make that move, right? So again, be pragmatic, right? If you Google, “How do I move stuff from MySQL to DynamoDB,” everybody's always talking about their cool project using Lambda and how you throttle Lambda and how you get throttled from DynamoDB and how you set it up with an SQS, and this and that. You don't need all that. Just fire up an EC2 instance, write some quick code to do it. I used, I think it was Go with some limiter code from Uber, and that was it. And you don't need all those Lambdas and SQS and the complication.
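Amir's actual one-off job ran in Go on a single EC2 instance, throttled with Uber's rate limiter (go.uber.org/ratelimit). Purely to illustrate the shape of that kind of pragmatic, run-once migration, and to keep the examples in one language, here is a TypeScript sketch using the AWS SDK v3; the table, column, and key names are invented for the example, not camelcamelcamel's real schema.

// One-off, throttled copy of price-history rows from MySQL into DynamoDB.
// Table, column, and key names are invented; the real job was written in Go.
import mysql from 'mysql2/promise';
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, BatchWriteCommand } from '@aws-sdk/lib-dynamodb';

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

async function migrate(): Promise<void> {
  const db = await mysql.createConnection({ host: 'localhost', user: 'camel', database: 'camel' });
  let lastId = 0;

  for (;;) {
    // Keyset pagination keeps reads cheap on a big table; 25 rows matches
    // DynamoDB's BatchWriteItem limit.
    const [rows] = (await db.execute(
      'SELECT id, product_id, observed_at, price FROM price_history WHERE id > ? ORDER BY id LIMIT 25',
      [lastId],
    )) as [any[], unknown];
    if (rows.length === 0) break;

    await ddb.send(new BatchWriteCommand({
      RequestItems: {
        'price-history': rows.map((r) => ({
          PutRequest: {
            Item: { pk: `product#${r.product_id}`, sk: String(r.observed_at), price: r.price },
          },
        })),
      },
    }));
    // A production job would check UnprocessedItems on the response and retry;
    // omitted here for brevity.

    lastId = rows[rows.length - 1].id;
    // Crude fixed-delay throttle standing in for a proper rate limiter, so the
    // table's write capacity isn't blown through.
    await sleep(100);
  }

  await db.end();
}

migrate().catch((err) => { console.error(err); process.exit(1); });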
That thing was a one-time thing anyway, so it doesn't need to be super… super-duper serverless, you know?Corey: That is almost always the way that it tends to play out. You encounter these weird little things along the way. And you see so many things that are tied to this is how architecture absolutely must be done. And oh you're not a real serverless person if you don't have everything running in Lambda and the rest. There are times where yeah, spin up an EC2 box, write some relatively inefficient code in ten minutes and just do the thing, and then turn it off when you're done. Problem solved. But there's such an aversion to that. It's nice to encounter people who are pragmatists more than they are zealots.Amir: I mostly learned that lesson. And both Daniel Green and me learned that lesson from the Winamp days. Because we both have written plugins for Winamp and we've been around that area and you can… if you took one of those non-pragmatist people, right, and you had them review the Winamp code right now—or even before—they would have a million things to say. That code was—and NSIS, too, by the way—and it was so optimized. It was so not necessarily readable, right? But it worked and it worked amazing. And Justin would—if you think I respond quickly, right, Justin Frankel, the guy who wrote Winamp, he would release versions of NSIS and of Winamp, like, four versions a day, right? That was before [laugh] you had CI/CD systems and GitHub and stuff. That was just CVS. You remember CVS [laugh]?Corey: Oh, I've done multiple CVS migrations. One to Git and a couple to Subversion.Amir: Oh yeah, Subversion. Yep. Done ‘em all. CVS to Subversion to Git. Yep. Yep. That was fun.Corey: And these days, everyone's using Git because it—we're beginning to have a monoculture.Amir: Yeah, yeah. I mean, but Git is nicer than Subversion, for me, at least. I've had more fun with it.Corey: Talk about damning with faint praise.Amir: Faint?Corey: Yeah, anything's better than Subversion, let's be honest here.Amir: Oh [laugh].Corey: I mean, realistically, copying a bunch of files and directories to a.bak folder is better than Subversion.Amir: Well—Corey: At least these days. But back then it was great.Amir: Yeah, I mean, the only thing you had, right [laugh]?Corey: [laugh].Amir: Anyway, achieving great things with not necessarily the right tools, but just sheer power of will, that's what I took from the Winamp days. Just the entire world used Winamp. And by the way, the NSIS project that I was working on, right, I always used to joke that every computer in the world ran my code, every Windows computer in the world when my code, just because—Corey: Yes.Amir: So, many different companies use NSIS. And none of them cared that the code was not very readable, to put it mildly.Corey: So, many companies founder on those shores where they lose sight of the fact that I can point to basically no companies that died because their code was terrible, yeah, had an awful lot that died with great-looking code, but they didn't nail the business problem.Amir: Yeah. I would be lying if I said that I nailed exactly the business problem at NSIS because the most of the time I would spend there and actually shrinking the stub, right, there was appended to your installer data, right? So, there's a little stub that came—the executable, basically, that came before your data that was extracted. I spent, I want to say, years of my life [laugh] just shrinking it down by bytes—by literal bytes—just so it stays under 34, 35 kilobytes. 
It was kind of a—it was a challenge and something that people appreciated, but not necessarily the thing that people appreciate the most. I think the features—Corey: Well, no I have to do the same thing to make sure something fits into a Lambda deployment package. The scale changes, the problem changes, but somehow everything sort of rhymes with history.Amir: Oh, yeah. I hope you don't have to disassemble code to do that, though because that's uh… I mean, it was fun. It was just a lot.Corey: I have to ask, how much work went into building your cdk-github-runners as far as getting it to a point of just working out the door? Because I look at that and it feels like there's—like, the early versions, yeah, there wasn't a whole bunch of code tied to it, but geez, the iterative, “How exactly does this ridiculous step functions API work or whatnot,” feels like I'm looking at weeks of frustration. At least it would have been for me.Amir: Yeah, yeah. I mean, it wasn't, like, a day or two. It was definitely not—but it was not years, either. I've been working on it I think about a year now. Don't quote me on that. But I've put a lot of time into it. So, you know, like you said, the skeleton code is pretty simple: it's a step function, which as we said, takes a long time to get right. The functions, they are really nice, but their definition language is not very straightforward. But beyond that, right, once that part worked, it worked. Then came all the bug reports and all the little corner cases, right? We—Corey: Hell is other people's use cases. Always is. But that's honestly better than a lot of folks wind up experiencing where they'll put an open-source project up and no one ever knows. So, getting users is often one of the biggest barriers to a lot of this stuff. I've found countless hidden gems lurking around on GitHub with a very particular search for something that no one had ever looked at before, as best I can tell.Amir: Yeah.Corey: Open-source is a tricky thing. There needs to be marketing brought into it, there needs to be storytelling around it, and has to actually—dare I say—solve a problem someone has.Amir: I mean, I have many open-source projects like that, that I find super useful, I created for myself, but no one knows. I think cdk-github-runners, I'm pretty sure people know about it only because you talked about it on Screaming in the Cloud or your newsletter. And by the way, thank you for telling me that you talked about it last week in the conference because now we know why there was a spike [laugh] all of a sudden. People Googled it.Corey: Yeah. I put links to it as well, but it's the, yeah, I use this a lot and it's great. I gave a crappy explanation on how it works, but that's the trick I've found between conference talks and, dare I say, podcast episodes, you gives people a glimpse and a hook and tell them where to go to learn more. Otherwise, you're trying to explain every nuance and every intricacy in 45 minutes. And you can't do that effectively in almost every case. All you're going to do is drive people away. Make it sound exciting, get them to see the value in it, and then let them go.Amir: You have to explain the market for it, right? That's it.Corey: Precisely.Amir: And I got to say, I somewhat disagree with your—or I have a different view when you say that, you know, open-source projects needs marketing and all those things. It depends on what open-source is for you, right? I don't create open-source projects so they are successful, right? 
It's obviously always nicer when they're successful, but—and I do get that cycle of happiness that, like I was saying, people create bugs and I have to fix them and stuff, right? But not every open-source project needs to be a success. Sometimes it's just fun. Corey: No. When I talk about marketing, I'm talking about exactly what we're doing here. I'm not talking about taking out an AdWords campaign or something horrifying like that. It's you build something that solved the problem for someone. The big problem that worries me about these things is how do you not lose sleep at night about the fact that you solved someone's problem and they don't know that it exists? Because that drives me nuts. I've lost count of the number of times I've been beating my head against a wall and asked someone like, “How would you handle this?” Like, “Oh, well, what's wrong with this project?” “What do you mean?” “Well, this project seems to do exactly what you want it to do.” And no one has it all stuffed in their head. But yeah, then it seems like open-source becomes a little more corporatized and it becomes a lead gen tool for people to wind up selling their SaaS services or managed offerings or the rest. Amir: Yeah. Corey: And that feels like the increasing corporatization of open-source that I'm not a huge fan of. Amir: Yeah. I mean, I'm not going to lie, right? Like, part of why I created this—or I don't know if it was part of it, but like, I had a dream that, you know, I'm going to get, oh, tons of GitHub sponsors, and everybody's going to use it and I can retire on an island and just make money out of this, right? Like, that's always a dream, right? But it's a dream, you know? And I think bottom line open-source is… just a tool, and some people use it for, like you were saying, driving sales into their SaaS, some people, like, may use it just for fun, and some people use it for other things. Or some people use it for politics, even, right? There's a lot of politics around open-source. I got to tell you a story. Back in the NSIS days, right—talking about politics—so this is not even about politics of open-source. People made NSIS a battleground for their politics. We would have translations, right? People could upload their translations. And I, you know, or other people that worked on NSIS, right, we don't speak every language of the world, so there's only so much we can do about figuring out if it's a real translation, if it's good or not. Back in the day, Google Translate didn't exist. Like, these days, we check Google Translate, we kind of ask a few questions to make sure they make sense. But back in the day, we did the best that we could. At some point, we got a patch for Catalan language, I'm probably mispronouncing it—but the separatist people in Spain, I think, and I didn't know anything about that. I was a young kid and… I just didn't know. And I just included it, you know? Someone submitted a patch, they worked hard, they wanted to be part of the open-source project. Why not? Sure I included it. And then a few weeks later, someone from Spain wanted to change Catalan into Spanish to make sure that doesn't exist for whatever reason. And then they just started fighting with each other and started making demands of me. Like, you have to do this, you have to do that, you have to delete that, you have to change the name. And I was just so baffled by why would someone fight so much over a translation of an open-source project. 
Like, these days, I kind of get what they were getting at, right?Corey: But they were so bad at telling that story that it was just like, so basically, screw, “You for helping,” is how it comes across.Amir: Yeah, screw you for helping. You're a pawn now. Just—you're a pawn unwittingly. Just do what I say and help me in my political cause. I ended up just telling both of them if you guys can agree on anything, I'm just going to remove both translations. And that's what I ended up doing. I just removed both translations. And then a few months later—because we had a release every month basically, I just added both of them back and I've never heard from them again. So sort of problem solved. Peace the Middle East? I don't know.Corey: It's kind of wild just to see how often that sort of thing tends to happen. It's a, I don't necessarily understand why folks are so opposed to other people trying to help. I think they feel like there's this loss of control as things are slipping through their fingers, but it's a really unwelcoming approach. One of the things that got me deep into the open-source ecosystem surprisingly late in my development was when I started pitching in on the SaltStack project right after it was founded, where suddenly everything I threw their way was merged, and then Tom Hatch, the guy who founded the project, would immediately fix all the bugs and stuff I put in and then push something else immediately thereafter. But it was such a welcoming thing.Instead of nitpicking me to death in the pull request, it just got merged in and then silently fixed. And I thought that was a classy way to do it. Of course, it doesn't scale and of course, it causes other problems, but I envy the simplicity of those days and just the ethos behind that.Amir: That's something I've learned the last few years, I would say. Back in the NSIS day, I was not like that. I nitpicked. I nitpicked a lot. And I can guess why, but it just—you create a patch—in my mind, right, like you create a patch, you fix it, right?But these days I get, I've been on the other side as well, right? Like I created patches for open-source projects and I've seen them just wither away and die, and then five years later, someone's like, “Oh, can you fix this line to have one instead of two, and then I'll merge it.” I'm like, “I don't care anymore. It was five years ago. I don't work there anymore. I don't need it. If you want it, do it.”So, I get it these days. And these days, if someone creates a patch—just yesterday, someone created a patch to format cdk-github-runners in VS Code. And they did it just, like, a little bit wrong. So, I just fixed it for them and I approved it and pushed it. You know, it's much better. You don't need to bug people for most of it.Corey: You didn't yell at them for having the temerity to contribute?Amir: My voice is so raw because I've been yelling for five days at them, yeah.Corey: Exactly, exactly. I really want to thank you for taking the time to chat with me about how all this stuff came to be and your own path. If people want to learn more, where's the best place for them to find you?Amir: So, I really appreciate you having me and driving all this traffic to my projects. If people want to learn more, they can always go to cloudsnorkel.com; it has all the projects. github.com/cloudsnorkel has a few more. And then my private blog is kichik.com. So, K-I-C-H-I-K dot com. 
I don't post there as much as I should, but it has some interesting AWS projects from the past few years that I've done.Corey: And we will, of course, put links to all of that in the show notes. Thank you so much for taking the time. I really appreciate it.Amir: Thank you, Corey. It was really nice meeting you.Corey: Amir Szekely, owner of CloudSnorkel. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment. Heck, put it on all of the podcast platforms with a step function state machine that you somehow can't quite figure out how the API works.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
How Tech Will Influence the Future of Podcasting with Chris Hill

Screaming in the Cloud

Play Episode Listen Later Oct 31, 2023 34:35


Chris Hill, owner of HumblePod and host of the We Built This Brand podcast, joins Corey on Screaming in the Cloud to discuss the future of podcasting and the role emerging technologies will play in the podcasting space. Chris describes why AI is struggling to make a big impact in the world of podcasting, and also emphasizes the importance of authenticity and finding a niche when producing a show. Corey and Chris discuss where video podcasting works and where it doesn't, and why it's more important to focus on the content of your podcast than the technical specs of your gear. Chris also shares insight on how to gauge the health of your podcast audience with his Podcast Listener Lifecycle evaluation tool.About ChrisChris Hill is a Knoxville, TN native and owner of the podcast production company, HumblePod. He helps his customers create, develop, and produce podcasts and is working with clients in Knoxville as well as startups and entrepreneurs across the United States, Silicon Valley, and the world.In addition to producing podcasts for nationally-recognized thought leaders, Chris is the co-host and producer of the award-winning Our Humble Beer Podcast and the host of the newly-launched We Built This Brand podcast. He also lectures at the University of Tennessee, where he leads non-credit courses on podcasts and marketing.  He received his undergraduate degree in business at the University of Tennessee at Chattanooga where he majored in Marketing & Entrepreneurship, and he later received his MBA from King University.Chris currently serves his community as the President of the American Marketing Association in Knoxville. In his spare time, he enjoys hanging out with the local craft beer community, international travel, exploring the great outdoors, and his many creative pursuits.Links Referenced: HumblePod: https://www.humblepod.com/ HumblePod Quick Edit: https://humblepod.com/services/quick-edit Podcast Listener Lifecycle: https://www.humblepod.com/podcast/grow-your-podcast-with-the-listener-lifecycle/ Twitter: https://twitter.com/christopholies Transcript:Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Are you navigating the complex web of API management, microservices, and Kubernetes in your organization? Solo.io is here to be your guide to connectivity in the cloud-native universe!Solo.io, the powerhouse behind Istio, is revolutionizing cloud-native application networking. They brought you Gloo Gateway, the lightweight and ultra-fast gateway built for modern API management, and Gloo Mesh Core, a necessary step to secure, support, and operate your Istio environment.Why struggle with the nuts and bolts of infrastructure when you can focus on what truly matters - your application. Solo.io's got your back with networking for applications, not infrastructure. Embrace zero trust security, GitOps automation, and seamless multi-cloud networking, all with Solo.io.And here's the real game-changer: a common interface for every connection, in every direction, all with one API. It's the future of connectivity, and it's called Gloo by Solo.io.DevOps and Platform Engineers, your journey to a seamless cloud-native experience starts here. 
Visit solo.io/screaminginthecloud today and level up your networking game.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My returning guest probably knows more about this podcast than I do. Chris Hill is not only the CEO of HumblePod, but he's also the producer of a lot of my various media endeavors, ranging from the psychotic music videos that I wind up putting out to mock executives on their birthdays to more normal videos that I wind up recording when I'm forced into the studio and can't escape because they bar the back exits, to this show. Chris, thank you for joining me, it's nice to see you step into the light.Chris: It's a pleasure to be here, Corey.Corey: So, you have been, effectively, producing this entire podcast after I migrated off of a previous vendor, what four years ago? Five?Chris: About four or five years ago now, yeah. It's been a while.Corey: Time is a flat circle. It's hard to keep track of all of that. But it's weird that you and I don't get to talk nearly as much as we used to, just because, frankly, the process is working and therefore, you disappear into the background.Chris: Yeah.Corey: One of the dangerous parts of that is that the only time I ever wind up talking to you is when something has gone wrong somewhere and frankly, that does not happen anymore. Which means we don't talk.Chris: Yeah. And I'm okay with that. I'm just kidding. I love talking to you, Corey.Corey: Oh, I tolerate you. And every once in a while, you irritate me massively, which is why I'm punishing you this year by—Chris: [laugh].Corey: Making you tag along for re:Invent.Chris: I'm really excited about that one. It's going to be fun to be there with you and Jeremy and Mike and everybody. Looking forward to it.Corey: You know how I can tell that you've never been to re:Invent before?Chris: “I'm looking forward to it.”Corey: Exactly. You still have life in your eyes and a spark in your step. And yeah… that'll change. That'll change. So, a lot of this show is indirectly your fault because this is a weird thing for a podcaster to admit, but I genuinely don't listen to podcasts. I did when I was younger, back when I had what the kids today call ‘commute' or ‘RTO' as they start slipping into the office, but I started working from home almost a decade ago, and there aren't too many podcasts that fit into the walk from the kitchen to my home office. Like great, give me everything you want me to know in about three-and-a-half seconds. Go… and we're done. It doesn't work. So, I'm a producer, but I don't consume my own content, which I think generally is something you only otherwise see in, you know, drug dealers.Chris: Yeah. Well, and I mean, I think a lot of professional media, like, you get to a point where you're so busy and you're creating so much content that it's hard to sit down and review your own stuff. I mean, even at HumblePod, I'm in a place where we're producing our own show now called We Built This Brand, and I end up in a place where some weeks I'm like, “I can't review this. I approve it. You send it out, I trust you.” So, Corey, I'm starting to echo you in a lot of ways and it's just—it makes me laugh from time to time.Corey: Somewhat recently, I wound up yet again, having to do a check on, “Hey, you use HumblePod for your podcasting work. Do you like them?” And it's fun. It's almost like when someone reaches out about someone you used to work with. Like, “We're debating hiring this person. 
Should we?” And I love being able to give the default response for the people I've worked with for this long, which is, “Shut up and hire them. Why are you talking to me and not hiring them faster? Get on with it.”Because I'm a difficult customer. I know that. The expectations I have are at times unreasonably high. And the fact that I don't talk to you nearly as much as I used to shows that this all has been working. Because there was a time we talked multiple times a day back—Chris: Mm-hm.Corey: When I had no idea what I was doing. Now, 500-some-odd episodes in, I still have no idea what I'm doing, but by God, I've gotten it down to a science.Chris: Absolutely you have. And you know, technically we're over 1000 episodes together, I think, at this point because if you combine what you're doing with Screaming in the Cloud, with Last Week in AWS slash AWS Morning Brief, yeah, we've done a lot with you. But yes, you've come a long way.Corey: Yes, I have become the very whitest of guys. It works out well. It's like, one podcast isn't enough. We're going to have two of them. But it's easy to talk about the past. Let's talk instead about the future a little bit. What does the future of podcasting look like? I mean, one easy direction to go in with this, as you just mentioned, there's over 1000 episodes of me flapping my gums in the breeze. That feels like it's more than enough data to train an AI model to basically be me without all the hard work, but somehow I kind of don't see it happening anytime soon.Chris: Yeah, I think listeners still value authenticity a lot and I think that's one of the hard things you're seeing in podcasting as a whole is that these organizations come in and they're like, “We're going to be the new podcast killer,” or, “We're going to be the next thing for podcasting,” and if it's too overproduced, too polished, like, I think people can detect that and see that inauthenticity, which is why, like, AI coming in and taking over people's voices is so crazy. One of the things that's happening right now at Spotify is that they are beta testing translation software so that Screaming in the Cloud could automatically be in Spanish or Last Week in AWS could automatically be in French or what have you. It's just so surreal to me that they're doing this, but they're doing exactly what you said. It's language learning models that understand what the host is saying and then they're translating it into another language.The problem is, what if that automation gets that word wrong? You know how bad one wrong word could be, translating from Spanish or French or any other language from English. So, there's a lot of challenges to be met there. And then, of course, you know, once they've got your voice, what do they do with it? There's a lot of risk there.Corey: The puns don't translate very well, most of the time, either.Chris: Oh, yes.Corey: Especially when I mis-intentionally mispronounce words like Ku-BER-netees.Chris: Exactly. I mean, it's going to be auto-translated into text at some point before it's then put out as, you know, an audio source, and so if you say something wrong, it's going to be an issue. And Ku-BER-netees or Chat-Gippity or any of those great terms that you have, they're going to also be translated wrong as well, and that creates its own can of worms so to speak.Corey: Well, let me ask you something because you have always been one to embrace emerging technologies. 
It's one of the things I appreciate about you; you generally don't recommend solutions from the Dark Ages when it comes to what equipment should I have and how should I maintain it and the rest. But there are a lot of services out there that will now do automatic transcription and the service that you use at the moment remains a woman named Cecilia, who's remarkably good at what she does. But why have you not replaced her with a robot?Chris: [laugh]. Very simply put, I mean, it kind of goes back to what I was just saying about language translation. AI does not understand context for human words as well as humans do, and so words are wrong a lot of times in auto transcription. I mean, I can remember a time when, you know, we first started working with you all were, if there was one thing wrong in a transcript, an executive at AWS would potentially make fun of you on Twitter for it. And so, we knew we had to be on our A-game when it came to that, so finding someone who had that niche expertise of being able to translate not just words and understand words, but also understand tech terminology, you know, I think that that's, that's its own animal and its own challenge. So yeah, I mean, you could easily get away with something—Corey: Especially with my attentional mispronunciation where she's, “I don't quite know what you're saying here, and neither does the entire rest of the industry.” Like, “Postgres-squ—do you mean Postgres? Who the hell calls it Postgres-squeal?” I do. I call it that. Two warring pronunciations, I will unify them by coming up with a third that is far worse. It's kind of my shtick. The problem is, at some point, it becomes too inside-jokey when I have 15 words that I'm doing that too, and suddenly no one knows what the hell I'm talking about and the joke gets old quickly.Chris: Yep.Corey: So, I've tried to scale that back. But there are still a few that I… I can't help but play with.Chris: Yeah. And it's always fun bringing someone new in to work on—work with you all because they're always like, “What is he saying? Does he mean this?” And [laugh] it's always an adventure.Corey: It keeps life fun though.Chris: Absolutely.Corey: So, one thing that you did for a while, back when I was starting out, it almost felt like you were in cahoots with Big Microphone because once I would wind up getting a setup all working and ready for the recording, like, “Great. Everything working terrifically? Cool, throw it away. It's time for generation three of this.” I think I'm on, like, gen six, or gen seven now, but it's been relatively static for the past few years. Are the checks not as big as they used to be? I mean, if we hit a point of equilibrium? What's going on?Chris: Yeah, unfortunately, Big Microphone isn't paying what they used to. The economy and interest rates and all that, it's just making it hard. But once you get to a certain level of gear, it's going to be more important that you have good content than better and better gear. Could we keep going? Sure. If you wanted to buy a studio and you wanted to get Neumann microphones or something like that, we could keep going. But again, Big Microphone is not paying what they used to.Corey: When people reach out because they're debating starting a podcast and they ask me for advice, other than hire HumblePod, the next question they usually get around to is gear. And I don't think that they are expecting my answer, which is, it does not matter. Because if the content is good, the listeners will forgive an awful lot. 
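As context for the automatic transcription services mentioned above, here is a minimal sketch of what an off-the-shelf, fully automated transcription pass looks like, using the open-source openai-whisper package as an assumed example. This is explicitly not HumblePod's workflow (they use a human transcriptionist), and the filename and model choice are hypothetical.

```python
# A minimal sketch of automated transcription with the open-source
# openai-whisper package (assumed installed via `pip install openai-whisper`).
# The audio filename is hypothetical.
import whisper

model = whisper.load_model("base")           # small, fast model; larger models improve accuracy
result = model.transcribe("episode-42.mp3")  # returns a dict with "text" and timestamped "segments"

print(result["text"][:500])  # preview the first 500 characters of the raw transcript

# Niche technical terms and deliberate mispronunciations ("Ku-BER-netees",
# "Postgres-squeal") are exactly where a model like this tends to guess wrong,
# which is why a human editor still reviews technical transcripts.
```

The output is plain text with no speaker labels or domain-aware corrections, which is roughly the gap between automated output and the edited transcripts discussed here.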
You could record it into your iPhone in a quiet room and they will put up with that. Whereas if the content isn't good, it doesn't matter what the production value is because people are constantly being offered better things to do with their time. You've got to grab them, you have to be compelling to your target audience or the rest of it does not matter.Chris: Yeah. And I think that's the big challenge with audio is a lot of people get excited, especially I find this true of people in the tech industry of like, “Okay, I want to learn all the tech stuff, I love all the cool tech stuff, and so I'm going to go out and buy all this equipment first.” And then they spend $5,000 on equipment and they never record a single episode because they put all their time and energy into researching and buying gear and never thought about the content of the show. The truth is, you could start with your iPhone and that's it. And while I don't necessarily advise that, you'd be surprised at the quality of audio on an iPhone.I've had a client have to re-record something while they were traveling remotely and I said, “You just need to get your iPhone out.” They took their AirPods, plugged them in and, I said, “No. Take them out, use the microphone on the iPhone.” And you can start with something as simple as that. Now, once you want to start making it better, sure, that's a great way to grow and that does influence people staying with your podcast over time, but I think in the long run, content trumps all.Corey: One of the problems I keep seeing is that people also want to record a podcast because they have a great idea for a few episodes. My rule of thumb—because I've gotten this wrong before—is, okay, if you want to do a whole new podcast, come up with the first 12 episodes. Because two, three, four, of course, you've got your ideas. And then by the—you'll find in many cases, you're going to have a problem by the end of it. Years ago, I did a mini-series inside of AMB called “Networking in the Cloud” where it was sponsored by, at the time, ThousandEyes, before Cisco bought them and froze them in amber for all eternity.But it was fun for the first six episodes and then I realized I'd said all I needed to say about networking, and I was on the hook for six more. And Ivan Pepeinjak, who's his own influencer type in the BGP IP space was like, “This is why you should stay in your lane. He's terrible. He got it all wrong.” Like, “Great. Come on and tell me exactly how I got it wrong,” because I was trying to approach it from a very surface topical area, but BGP is one of those areas where I get very wrapped around my own axle just because I've never used it in anger. Being able to pivot the show format is what saved me on that. But if I had started doing this as its own individual podcast and launched, it would have died on the vine, just because it would not have had enough staying power and I didn't have the interest to continue working on it. Could someone else come up with a networking-in-the-cloud podcast that had hundreds of episodes? Absolutely, but those people are what we call competent and good at things in a way that I very much am not.Chris: Yep. And I completely agree. I mean, 12 is my default number, so—I'm not going to take credit for your saying 12, but I know we've talked about that before. And—Corey: It was a 12-episode miniseries is why. And I remember by ten, I had completely scraped the bottom of the barrel. 
Then Ivan saved me on one of them, and then I did, I think, a mini-series-in-review, which is cheating but worked.Chris: Yeah. I remember that, the trials and travails of giving that out. It was fun, though. But with that, yeah, like, 12 is a good number because, like, to your point, if you have 12 and you want to do a monthly show, you've got a year's worth of content, if you do bi-weekly, that's six months, and if it's a weekly show, it's at least a quarter's worth of content. So, it does help you think through and at least come up with any potential roadblocks you might have by at least listing out, here's what episodes one, two, three, four, five and so on would be. And so, I do think that's a great approach.Corey: And don't be an idiot like I was and launch a newsletter and then podcast that focus on last week's news because you can't work ahead on that. If you can, why are you not a multi-billionaire for playing the markets? If you can predict the future, there's a more lucrative career for you than podcasting, I promise. But that means that I have to be on the treadmill on some level. I've gotten it down to a point where I can stretch it to ten days. I can take ten days off if I preload, do it as early as I possibly can beforehand and then as late as I possibly can when I return. Anything more than that, I'm either skipping a week or delaying the show or have to get a guest author or artist in.Chris: Yeah. And you definitely need that time off, and so that's the one big challenge, I think with podcasting, too, is like you create this treadmill for yourself that you constantly have to fill content after content after content. I think that's one of the big challenges in podcasting and one of the reasons we see so many podcasts fade out. I don't know if you're familiar, but there is a term called podfade, which is just that: people burning out, fading out in their excitement for a podcast. And most podcasters fade out by episode seven or eight, somewhere in that range, so to see someone go for say, like, you have 500 episodes plus, we're talking about a ton of good content. You've found your rhythm, you've found your groove. That can do it. But yeah, it's always, always a challenge staying motivated.Corey: One thing that consistently surprises me is that the things I care about as the creator and the things the audience cares about are not the same. And you have to be respectful of your audience's time. I've done the numbers on the shows that I put out and it's something on the order of over a year of human time for every episode that I put out. If I'm going to take a year from humanity's collective lifetimes in order to say my inane thoughts, then I have to be respectful of the audience's time. Which means, “Oh, I'm going to have a robot do it so I don't have to put the work in.” It doesn't work that way. That's not how you sustain.Chris: Right. In and again, it takes out that humanity that makes podcasting so special and makes that connection with even the listener so special. And I'm sure you've experienced this too. When you go to re:Invent, like, we're going to have here in just a few short months, people know you, and they probably say things and bring up things that you haven't even thought about. And you're like, “Where did you even learn that I did that?” And then you realize, “Oh, I said that on a podcast episode.”Corey: Yeah. What's weird is I don't get much feedback online for it, but people will talk to me in depth about the show. 
They'll come up to me near constantly and talk about it. They don't reach out the same way, which I guess makes sense. There are a couple of podcasts that I've really admired and listened to on and off in the car for years, but I've never reached out to the creators because I feel like I would sound ridiculous. It's not true. I know intellectually it's not true, but it feels weird to do it.Chris: One of the ways I got into podcasting was a podcast that just invited me to—you know, invited their listeners to sign up and engage with them. And I think that's something in the medium that does make it interesting is once you do engage, you find out that these creators respond. And where else do you get that, you know? If you're watching a big TV show and you tweet at somebody online that you admire in the show, the chance of them even liking what you said about them online is very slim to none. But with podcasting, there's just a different level of accessibility I find with most productions and most shows that makes it really something special.Corey: One thing that still surprises me—and I don't think I've ever been this explicit about it on the show, but why the hell not I have nothing to hide—Thursday evening, 5 p.m. Pacific time. That's when the automation fires and rotates everything for the newsletter and the AWS Morning Brief. Anything that comes in after that, unless I manually do an override, will not be in the next week's issue; it'll be the week after.That applies to Security as well, which means 5 p.m. on Thursday, it seals it, I write and record it and it goes ou—that particular one goes out Thursday morning the following week. And no one has ever said anything about this seems awfully late. Occasionally, there's been news the day before and someone said, “Oh, why didn't you include this?”And it's because, believe it or not, I don't just type this in and hit the send button. There's a bit more to it than that these days. But people don't need the sense of immediacy. This idea of striving to be first is not sustainable and it leads to terrible outcomes. My entire philosophy has not been to have the first take but rather the best take.Chris: Mm-hm.Corey: Sometimes I even get it right.Chris: And I mean in podcasting, too. Like, it's about, you serve a certain niche, right? Like, the people who are interested in AWS services and in this world of cloud computing listen to what you say, listen to the people you interview, and really enjoy those conversations. But that's not everybody in the world. That's not a very broad audience. And so, I think that those niches really serve a purpose.And the way I've always thought about it is, like, if you go to the grocery store, you know how you always have that rack of magazines with the most random interests? That's essentially what podcasting is. It's like each podcast is a different magazine that serves someone's random—and hyper-specific sometimes—niche interest in things. I mean, the number of things you can find podcasts on is just ridiculous. And I think the same is true for this. But the people who do follow, they're very serious, they're very dedicated, they do listen, and yeah, I think it's just a fascinating, fascinating thing.Corey: The way that I see it has been that I've been learning more from the audience and the things that people say that most people would believe, but… I make a lot of mistakes doing this, but talking to people does tend to shine a light on a lot of this. But enough about the past. 
Most of my episodes are about things that have previously happened. What does the future of podcasting look like? Where's it going from here?Chris: Oh, man. Well, I think the big question on everybody's mind is, do I need a video podcast? And I think that for most people, that's where the big question lies right now. I get a lot of questions about it, I get people reaching out, and I think the short answer to that is… not really. Or to answer a question I know you love, Corey, it depends.And the reason for that is, there's a lot with the tech of podcasting that just isn't going to distribute to everywhere, all at once anymore. The beauty of podcasting is that it's all based on an RSS feed. If you build an RSS feed and you put it in Apple Podcasts and Spotify, that RSS feed will distribute everywhere and it will distribute your audio everywhere. And what we see happening right now, and really one of the bigger challenges in podcasting, is that the RSS feed only provides audio. Technically, that's not accurate, but it does for most services.So, YouTube has recently come out and said that they are going to start integrating RSS feeds, so you'll be able to do those audiogram-esque things that a lot of people have done through apps like Headliner and stuff for a long time, or even their podcast host may automatically translate a version of their audio podcast into a video and just do, like, a waveform. They're going to have that in YouTube. TikTok is taking a similar approach. And they're both importing just the audio. And the reason I said earlier, that's technically not accurate is because RSS feeds can also support MP4s, but neither service is going to accept that or ingest it directly into their service from what you provide outbound.So, it's a very interesting time because it feels like we're getting there with video, but we're still not there, and we're still probably several years off from it. So, there's a lot of interest in video and I think the future is going to be video, but I think it's going to be a combination, too, with audio because who wants to sit and watch something for an hour-and-a-half when you're used to listening to it your commute or while you do the dishes or any number of other things that don't involve having your eyeballs directly on the content.Corey: We've tried it with this show. I found that it made the recording process a bit more onerous because everyone is suddenly freaking out about how they look and I can't indulge my constant nose-picking habit. Kidding. So, it was more work, I had to gussy myself up a bit more than dressing like a slob like I do some mornings because I do have young children and a deadline to get them to school by. But I never saw the audience to materialize there and be worth it.Because watching a video of two people talking with each other, it feels too much like a Zoom call that you can't participate in, so what's the point?Chris: Right.Corey: So, there's that. There's the fact that I also have very intentionally built most of what I do around newsletters and podcasts because at least so far, those are not dependent upon algorithmic discovery in the same way. I don't have to bias for things that YouTube likes this month. Instead, I can focus on the content that people originally signed up to hear me put out and I don't have to worry about it in the same way. 
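The RSS mechanics Chris describes above come down to the feed's enclosure element: one media file per episode that podcast apps download. The sketch below builds a toy feed in Python to make that concrete; the titles, URLs, and byte sizes are invented, and a real podcast feed would carry considerably more metadata than this.

```python
# Minimal podcast RSS sketch (illustrative only; titles, URLs, and sizes are invented).
# The <enclosure> element is what podcast apps actually fetch. It is usually
# audio/mpeg, though RSS itself would also accept a video/mp4 enclosure, which is
# the gap between "technically possible" and "what directories ingest" noted above.
import xml.etree.ElementTree as ET

rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
ET.SubElement(channel, "title").text = "Example Cloud Podcast"
ET.SubElement(channel, "link").text = "https://example.com/podcast"
ET.SubElement(channel, "description").text = "Conversations about cloud."

item = ET.SubElement(channel, "item")
ET.SubElement(item, "title").text = "Episode 1: Billing Horrors"
ET.SubElement(item, "enclosure", {
    "url": "https://example.com/audio/episode-1.mp3",
    "length": "31457280",   # file size in bytes
    "type": "audio/mpeg",   # swap to "video/mp4" for a video enclosure
})

print(ET.tostring(rss, encoding="unicode"))
```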
Email predates me, it'll be here long after I'm gone, and that seems to make sense.I also look at how I have consumed podcasts, and times when I do, it's almost always while I'm doing something else. And if I have to watch a screen, that becomes significantly more distracting, and harder for me to find the time to do it with.Chris: I think what you're seeing is that, like, there's some avenues to where video podcasting is really good and really interesting, and I think the real place where that works best right now is in-person interviews. So, Corey, if you went out and interviewed Andy Jassy in person in Seattle, that to me would be something that would warrant bringing the cameras out for and putting online because people would want to see you in the office interacting with him. That would be interesting. To your point, during the Zoom calls and things like that, you end up in a place where people just aren't as interested in sitting and watching the Zoom call. And I think that's something that is a clear distinction to make.Entertainment, comedy, doing things in person, I think that's where the real interest in video is and that's why I don't think video will be for everybody all the time. The thing that is starting to come up as well is discoverability, and that has always been a challenge, but as we get into—and we probably don't want to go down this rabbit hole, but you know, what's happened to Twitter and X, like, discoverability is becoming more of a challenge because they're limiting access to that platform. They're limiting discoverability if you're not willing to pay for a blue checkmark. They're doing all these things to make it harder for small independent podcasts to grow.And the places that are opening up for it to grow are places like YouTube, places like TikTok, that have the ability to not only just put your full podcasts online now, but you can actually do, like, YouTube shorts or highlighted clips, and directly link those back to the long-form content that you're producing. So, there is some value in it, there is a technology and a future there for it, but it's just a very complicated time to be in podcasting and figuring out where to go to grow. That's probably the biggest challenge that we face and I think ultimately, that just comes down to developing an audience outside of these social media channels.Corey: One thing that you were talking about a while back in a conversation that I don't think I've ever followed up with you on—and there's no time like in front of a bunch of public people to do that—Chris: [laugh].Corey: You were talking to me about something that you were calling the Podcast Listener Lifecycle.Chris: Yes.Corey: What's your point on that?Chris: So, the Listener Lifecycle is something I developed, just to be frank, working with you guys, learning from you all, and also my background in marketing, and in building audiences and things, from my own podcasts and other things that I did prior to building HumblePod, led me to a place of going, how can we best explain to a client where their podcast is? How does it exist? Where does it exist? All that good stuff. And basically, the Listener Lifecycle is just that.It's a design—and we'll have links to it because I actually did a whole podcast season on the Listener Lifecycle from beginning to end, so that's probably the easiest way to talk about it. 
But essentially, it's the idea of, you're curious about a show, and how do you go from being curious about a show to exploring a podcast, to then becoming a follower of the podcast, literally clicking the Follow button. What does it take to get through each one of those stages? How can you identify where your audience is? And basically, it's a tool you can use to say, “Well, this is where my listener is in the stages.” And then once they get to be a follower, how do I build them into something more?Well, get them to be a subscriber, subscribe to a newsletter, subscribe to a Patreon or Substack or whatever that subscription service is that you prefer to use, and get them off of just being on social media and following you there and following you in a podcast audio form. Because things can happen: your podcast host could break and you'd lose your audience, right? We've seen Twitter, which we may have thought years ago that it would never go away, and now we don't know how long it's going to be there. It could be gone by the time we're done with this conversation for all we know. I've got all my notifications turned off, so we're basically in a liminal space at this point.But with that said, there's a lot of risk in audiences changing and things like that, so audience portability is really important. So, the more you can collect email addresses, collect contact information, and communicate with that group of people, the better your audience is going to be. And so, that's what it's about is helping people get to that stage where they can do that so that they don't lose audiences and so that they can even build and grow audiences beyond that to the point where they get to the last phase, which is the ‘true fan' phase. And that's where you get people who love your show, retweet everything you do, repost everything you do, and share it with all their friends every time you're creating new content. And that's ultimately what you want: those die-hard people that come up to you and know everything about you at re:Invent, those are the people that you want to create more of because they're going to help you grow your show and your audience, ultimately. So, that's what it's about. I know that's a lot. But again, like, we'll have a link in the show notes to where you can learn more about it.Corey: Indeed, we will. Normally I'm the one that says, “And we'll include a link to that in the show notes.” But you're the one that has to actually make all that happen. Here's another glimpse behind the curtain. I have a Calendly link that I pass out to people to book time on the show. They fill out the form, which is relatively straightforward and low effort by design, and the next time I think about it is ten minutes beforehand when it pops up with, “Hey, you have a recording to go to.” Great. I book an hour for a half-hour recording. I wind up going through this entire conversation. When we're done, we close out the episode, we chat a bit, I close the tab, and I don't think about it again, it's passed off to you folks entirely. It is the very whitest of white glove treatments. Because I, once again, am the very whitest of white guys.Chris: We aim to please [laugh].Corey: Exactly. Because I remember before this, I used to have things delayed by months because I would forget to copy the freaking file into Dropbox, of all things. 
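The Listener Lifecycle described above is essentially an ordered funnel. The sketch below models it with the stage names taken from the conversation; the dataclass-free structure, audience counts, and conversion math are invented purely to show how you might spot where listeners drop off.

```python
# Stage names come from the Listener Lifecycle discussion above; the counts
# are hypothetical and only illustrate finding the biggest drop-off.
from enum import IntEnum

class ListenerStage(IntEnum):
    CURIOUS = 1      # heard about the show
    EXPLORER = 2     # sampled an episode or two
    FOLLOWER = 3     # clicked Follow in a podcast app
    SUBSCRIBER = 4   # joined the newsletter / Patreon / Substack
    TRUE_FAN = 5     # shares and reposts everything you publish

funnel = {
    ListenerStage.CURIOUS: 5000,
    ListenerStage.EXPLORER: 1800,
    ListenerStage.FOLLOWER: 900,
    ListenerStage.SUBSCRIBER: 220,
    ListenerStage.TRUE_FAN: 40,
}

stages = list(funnel)
for current, nxt in zip(stages, stages[1:]):
    rate = funnel[nxt] / funnel[current]
    print(f"{current.name:>10} -> {nxt.name:<10} conversion: {rate:.0%}")
```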
And that was just wild to me.Chris: And we stay on you about that because we want to make sure that your show gets out and—Corey: And now it automatically transfers and I—when the automation works—I don't have to think about it again. What is fun to me is despite all the time that I spend in enterprise cloud services, we still use things that are prosumer, like Dropbox and other things that are human-centric because for some reason, most of your team are not also highly competent cloud developers. And I still think it is such a miss that something like S3, which would be perfect for this, requires that level of engineering. And I have more self-respect than that. I'd have to build some stuff in order to make that work effectively on my end, let alone folks who have actual jobs that don't involve messing around with cloud services all day.But it blows my mind that there's still such this gulf between things that sound like you would have one of your aging parents deal with versus something that is extraordinarily capable and state-of-the-art. I know they're launching a bunch of things like Amazon's IVS, which is a streaming offering, a lot of their elemental offerings for media packaging, but I look at it, it's like wow, not only is this expensive, it doesn't solve any problems that we actually have and would add significant extra steps to every part of it. Thanks, but no thanks. And sure, maybe we're not the target market, but I can't shake the feeling that there are an awful lot of people like us that fit that profile.Chris: Yeah. And I mean, you bring up a good point about not using S3, things like that. It has occurred to me as well that, hey, maybe we should find somebody to help us develop a technology like this to make it easier on us on the back end to do all the recording and the production in one place, one database, and be able to move on. So, at some point I would love to get there. That's probably a conversation for after the podcast, Corey, but definitely is something that we've been thinking about at HumblePod is, how do we reach that next step of making it even easier on our clients?Corey: Well, it is certainly appreciated. But again, remember, your task is to continue to perform the service excellently, not be the poster child for cloud services with dumb names.Chris: [laugh]. Yes, yes. And I'm sure we could come up with a bunch.Corey: One last question before we wind up calling in an episode. I know that I've been emphasizing the white glove treatment that I get—and let's be clear, you are not inexpensive, but you're also well worth it; you deliver value extraordinarily for our needs—do you offer things that are not quite as, we'll call it, high-touch and comprehensive?Chris: Yes, we do actually. We just recently launched a new service called Quick Edit and it's just that. It's still humans touching the service, so it's not a bunch of automated, hey, we're just running this through an AI program and it's going to spit it out on the other end. We actually have a human that touches your audio, cleans it up, and sends it back. And yeah, we're there to make sure that we can clean things up quickly and easily and affordably for those folks that are just in a pinch.Maybe you edit most weeks and you're just tired of doing the editing, maybe you're close to podfading and you just want an extra boost to see if you can keep the show going. That's what we have the Quick Edit service for. And that starts at $150 an episode and we'll edit up to 45 minutes of audio for you within that. 
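Corey's point above is that S3 would be a natural home for raw episode audio but demands more glue than a prosumer tool like Dropbox. A hedged sketch of that glue follows: a pre-signed URL lets a collaborator upload a recording without holding AWS credentials. The bucket and object names are made up, and this is one possible approach rather than anything either company actually runs.

```python
# A sketch of the "level of engineering" S3 asks for compared with Dropbox:
# generate a pre-signed URL that a podcast host can PUT a raw recording to
# without having AWS credentials. Bucket and key names are hypothetical.
import boto3

s3 = boto3.client("s3")

upload_url = s3.generate_presigned_url(
    ClientMethod="put_object",
    Params={"Bucket": "example-podcast-raw-audio", "Key": "2023/10/episode-500.wav"},
    ExpiresIn=3600,  # link is valid for one hour
)

print("Send this to the host; they can upload with a single HTTP PUT:")
print(upload_url)
# e.g. curl -X PUT --upload-file episode-500.wav "<upload_url>"
```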
And yeah, there's some other options available as well if you start to add more stuff, but just come check us out. You can go to humblepod.com/services/quick-edit and find that there.Corey: And we will, of course, put links to that in the show notes. Or at least you will. I certainly won't.Chris: [laugh].Corey: Chris, thank you so much for taking the time to speak with me. If people want to learn more, other than hunting you down at re:Invent, which they absolutely should do, where's the best place for them to find you?Chris: I mean@HumblePod anywhere is the quickest, easiest way to find me anywhere—or at least find the business—and you can find me at @christopholies. And we'll have a link to that in the show notes for sure because it's not worth spelling out on the podcast.Corey: I would have pronounced it chris-to-files, but that's all right. That's how it works.Chris: [laugh].Corey: Thank you so much, Chris for everything that you do, as well as suffering my nonsensical slings and arrows for the last half hour. We'll talk soon.Chris: You're welcome, Corey.Corey: Chris Hill, CEO at HumblePod. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you hated this episode, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that I'm sure Chris or one of his colleagues will spend time hunting down from all corners of the internet to put into a delightful report, which I will then never read.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Solving the Case of the Infinite Cloud Spend with John Wynkoop

Screaming in the Cloud

Play Episode Listen Later Oct 24, 2023 29:56


John Wynkoop, Cloud Economist & Platypus Herder at The Duckbill Group, joins Corey on Screaming in the Cloud to discuss why he decided to make a career move and become an AWS billing consultant. Corey and John discuss how once you're deeply familiar with one cloud provider, those skills become transferable to other cloud providers as well. John also shares the trends he has seen post-pandemic in the world of cloud, including the increased adoption of a multi-cloud strategy and the need for cost control even for VC-funded startups.

About John

With over 25 years in IT, John's done almost every job in the industry, from running cable and answering helpdesk calls to leading engineering teams and advising the C-suite. Before joining The Duckbill Group, he worked across multiple industries including private sector, higher education, and national defense. Most recently he helped IGNW, an industry leading systems integration partner, get acquired by industry powerhouse CDW. When he's not helping customers spend smarter on their cloud bill, you can find him enjoying time with his family in the beautiful Smoky Mountains near his home in Knoxville, TN.

Links Referenced:

The Duckbill Group: https://duckbillgroup.com
LinkedIn: https://www.linkedin.com/in/jlwynkoop/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. And the times, they are changing. My guest today is John Wynkoop. John, how are you?

John: Hey, Corey, I'm doing great. Thanks for having me.

Corey: So, big changes are afoot for you. You've taken a new job recently. What are you doing now?

John: Well [laugh], so I'm happy to say I have joined The Duckbill Group as a cloud economist. So, came out of the big company world, and have dived back in—or dove back into the startup world.

Corey: It's interesting because when we talk to those big companies, they always identify us as oh, you're a startup, which is hilarious on some level because our AWS account hangs out in AWS's startup group, but if you look at the spend being remarkably level from month to month to month to year to year to year, they almost certainly view us as they're a startup, but they suck at it. They completely failed. And so, many of the email stuff that you get from them presupposes that you're venture-backed, that you're trying to conquer the entire world. We don't do that here. We have this old-timey business model that our forebears would have understood of, we make more money than we spend every month and we continue that trend for a long time. So first, thanks for joining us, both on the show and at the company. We like having you around.

John: Well, thanks. And yeah, I guess that's—maybe a startup isn't the right word to describe what we do here at The Duckbill Group, but as you said, it seems to fit into the industry classification.
But that was one of the things I actually really liked about the—that was appealing about joining the team was, we do spend less than we make and we're not after hyper-growth and we're not trying to consume everything.Corey: So, it's interesting when you put a job description out into the world and you see who applies—and let's be clear, for those who are unaware, job descriptions are inherently aspirational shopping lists. If you look at a job description and you check every box on the thing and you've done all the things they want, the odds are terrific you're going to be bored out of your mind when you wind up showing up to do these… whatever that job is. You should be learning stuff and growing. At least that's always been my philosophy to it. One of the interesting things about you is that you checked an awful lot of boxes, but there is one that I think would cause people to raise an eyebrow, which is, you're relatively new to the fun world of AWS.John: Yeah. So, obviously I, you know, have been around the block a few times when it comes to cloud. I've used AWS, built some things in AWS, but I wouldn't have classified myself as an AWS guru by any stretch of the imagination. I spent the last probably three years working in Google Cloud, helping customers build and deploy solutions there, but I do at least understand the fundamentals of cloud, and more importantly—at least for our customers—cloud costs because at the end of the day, they're not all that different.Corey: I do want to call out that you have a certain humility to you which I find endearing. But you're not allowed to do that here; I will sing your praises for you. Before they deprecated it like they do almost everything else, you were one of the relatively few Google Cloud Certified Fellows, which was sort of like their Heroes program only, you know, they killed it in favor of something else like there's a Champion program or whatnot. You are very deep in the world of both Kubernetes and Google Cloud.John: Yeah. So, there was a few of us that were invited to come out and help Google pilot that program in, I believe it was 2019, and give feedback to help them build the Cloud Fellows Program. And thankfully, I was selected based on some of our early experience with Anthos, and specifically, it was around Certified Fellow in what they call hybrid multi-cloud, so it was experience around Anthos. Or at the time, they hadn't called it Anthos; they were calling it CSP or Cloud Services Platform because that's not an overloaded acronym. So yeah, definitely, was very humbled to be part of that early on.I think the program, as you said, grew to about 70 or so maybe 100 certified individuals before they transitioned—not killed—transitioned to that program into the Cloud Champions program. So, those folks are all still around, myself included. They've just now changed the moniker. But we all get to use the old title still as well, so that's kind of cool.Corey: I have to ask, what would possess you to go from being one of the best in the world at using Google Cloud over here to our corner of the AWS universe? Because the inverse, if I were to somehow get ejected from here—which would be a neat trick, but I'm sure it's theoretically possible—like, “What am I going to do now?” I would almost certainly wind up doing something in the AWS ecosystem, just due to inertia, if nothing else. You clearly didn't see things quite that way. Why make the switch?John: Well, a couple of different reasons. 
So, being at a Google partner presents a lot of challenges and one of the things that was supremely interesting about coming to Duckbill is that we're independent. So, we're not an AWS partner. We are an independent company that is beholden only to our customers. And there isn't anything like that in the Google ecosystem today.There's, you know, there's Google partners and then there's Google customers and then there's Google. So, that was part of the appeal. And the other thing was, I enjoy learning new things, and honestly, learning, you know, into the depths of AWS cost hell is interesting. There's a lot to learn there and there's a lot of things that we can extract and use to help customers spend less. So, that to me was super interesting.And then also, I want to help build an organization. So, you know, I think what we're doing here at The Duckbill Group is cool and I think that there's an opportunity to grow our services portfolio, and so I'm excited to work with the leadership team to see what else we can bring to market that's going to help our customers, you know, not just with cost optimization, not just with contract negotiation, but you know, through the lifecycle of their AWS… journey, I guess we'll call it.Corey: It's one of those things where I always have believed, on some level, that once you're deep in a particular cloud provider, if there's reason for it, you can rescale relatively quickly to a different provider. There are nuances—deep nuances—that differ from provider to provider, but the underlying concepts generally all work the same way. There's only so many ways you can have data go from point A to point B. There's only so many ways to spin up a bunch of VMs and whatnot. And you're proof-positive that theory was correct.You'd been here less than a week before I started learning nuances about AWS billing from you. I think it was something to do with the way that late fees are assessed when companies don't pay Amazon as quickly as Amazon desires. So, we're all learning new things constantly and no one stuffs this stuff all into their head. But that, if nothing else, definitely cemented that yeah, we've got the right person in the seat.John: Yeah, well, thanks. And certainly, the deeper you go on a specific cloud provider, things become fresh in your memory, you know, other cached so to speak. So, coming up to speed on AWS has been a little bit more documentation reading than it would have been, if I were, say, jumping right into a GCP engagement. But as he said, at the end of the day, there's a lot of similarities. Obviously understanding the nuances of, for example, account organization versus, you know, GCP's Project and Folders. Well, that's a substantial difference and so there's a lot of learning that has to happen.Thankfully, you know, all these companies, maybe with the exception of Oracle, have done a really good job of documenting all of the concepts in their publicly available documentation. And then obviously, having a team of experts here at The Duckbill Group to ask stupid questions of doesn't hurt. But definitely, it's not as hard to come up to speed as one may think, once you've got it understood in one provider.Corey: I took a look recently and was kind of surprised to discover that I've been doing this—as an independent consultant prior to the formation of The Duckbill Group—for seven years now. And it's weird, but I've gone through multiple industry cycles and changes as a part of this. 
And it feels like I haven't been doing it all that long, but I guess I have. One thing that's definitely changed is that it used to be that companies would basically pick one provider and almost everything would live there. At any reasonable point of scale, everyone is using multiple things.I see Google in effectively every client that we have. It used to be that going to Google Cloud Next was a great place to hang out with AWS customers. But these days, it's just as true to say that a great reason to go to re:Invent is to hang out with Google Cloud customers. Everyone uses everything, and that has become much more clear over the last few years. What have you seen change over the… I guess, since the start of the pandemic, just in terms of broad cycles?John: Yeah. So, I think there's a couple of different trends that we're seeing. Obviously, one is that as you said, especially as large enterprises make moves to the cloud, you see independent teams or divisions within a given organization leveraging… maybe not the right tool for the job because I think that there's a case to be made for swapping out a specific set of tools and having your team learn it, but we do see what I like to refer to as tool fetishism where you get a team that's super, super deep into BigQuery and they're not interested in moving to Redshift, or Snowflake, or a competitor. So, you see, those start to crop up within large organizations where the distributed—the purchasing power, rather—is distributed. So, that's one of the trends is the multi-cloud adoption.And I think the big trend that I like to emphasize around multi-cloud is, just because you can run it anywhere doesn't mean you should run it everywhere. So Kubernetes, as you know, right, as it took off 2019 timeframe, 2020, we started to see a lot of people using that as an excuse to try to run their production application in two, three public cloud providers and on-prem. And unless you're a SaaS customer—or SaaS company with customers in every cloud, there's very little reason to do that. But having that flexibility—that's the other one, is we've seen that AWS has gotten a little difficult to negotiate with, or maybe Google and Microsoft have gotten a little bit more aggressive. So obviously, having that flexibility and being able to move your workloads, that was another big trend.Corey: I'm seeing a change in things that I had taken as givens, back when I started. And that's part of the reason, incidentally, I write the Last Week in AWS newsletter because once you learn a thing, it is very easy not to keep current with that thing, and things that are not possible today will be possible tomorrow. How do you keep abreast of all of those changes? And the answer is to write a deeply sarcastic newsletter that gathers in everything from the world of AWS. But I don't recommend that for most people. One thing that I've seen in more prosaic terms that you have a bit of background in is that HPC on cloud was, five, six years ago, met with, “Oh, that's a good one; now pull the other one, it has bells on it,” into something that, these days, is extremely viable. How'd that happen?John: So, [sigh] I think that's just a—again, back to trends—I think that's just a trend that we're seeing from cloud providers and listening to their customers and continuing to improve the service. So, one of the reasons that HPC was—especially we'll call it capacity-level HPC or large HPC, right—you've always been able to run high throughput; the cloud is a high throughput machine, right? 
You can run a thousand disconnected VMs no problem, auto-scaling, anybody who runs a massive web front-end can attest to that. But what we saw with HPC—and we used to call those [grid 00:12:45] jobs, right, the small, decoupled computing jobs—but what we've seen is a huge increase in the quality of the underlying fabric—things like RDMA being made available, things like improved network locality, where you now have predictive latency between your nodes or between your VMs—and I think those, combined with the huge investment that companies like AWS have made in their file systems, the huge investment companies like Google have made in their data storage systems have made HPC viable, especially at a small-scale—for cloud-based HPC specifically—viable for organizations.And for a small engineering team, who's looking to run say, computer-aided engineering simulation or who's looking to prototype some new way of testing or doing some kind of simulation, it's a huge, huge improvement in speed because now they don't have to order a dozen or two dozen or five dozen nodes, have them shipped, rack them, stack them, cool them, power them, right? They can just spin up the resource in the cloud, test it out, try their simulation, try out the new—the software that they want, and then spin it all down if it doesn't work. So, that elasticity has also been huge. And again, I think the big—to kind of summarize, I think the big driver there is the improvement in this the service itself, right? We're seeing cloud providers taking that discipline a little bit more seriously.Corey: I still see that there are cases where the raw math doesn't necessarily add up for sustained, long-term use cases. But I also see increasingly that with HPC, that's usually not what the workload looks like. With, you know, the exception of we're going to spend the next 18 months training some new LLM thing, but even then the pricing is ridiculous. What is it their new P6 or whatever it is—P5—the instances that have those giant half-rack Nvidia cards that are $800,000 and so a year each if you were to just rent them straight out, and then people running fleets of these things, it's… wow that's more commas in that training job than I would have expected. But I can see just now the availability for driving some of that, but the economics of that once you can get them in your data center doesn't strike me as being particularly favoring the cloud.John: Yeah, there's a couple of different reasons. So, it's almost like an inverse curve, right? There's a crossover point or a breakeven point at which—you know, and you can make this argument with almost any level of infrastructure—if you can keep it sufficiently full, whether it's AI training, AI inference, or even traditional HPC if you can keep the machine or the group of machines sufficiently full, it's probably cheaper to buy it and put it in your facility. But if you don't have a facility or if you don't need to use it a hundred percent of the time, the dividends aren't always there, right? It's not always worth, you know, buying a $250,000 compute system, you know, like say, an Nvidia, as you—you know, like, a DGX, right, is a good example.The DGX H100, I think those are a couple $100,000. If you can't keep that thing full and you just need it for training jobs or for development and you have a small team of developers that are only going to use it six hours a day, it may make sense to spin that up in the cloud and pay for a fractional use, right? 
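The "inverse curve" John describes is really a breakeven calculation on utilization. Here is a worked back-of-the-envelope sketch: the $250,000 purchase price echoes the figure in the conversation, while the cloud hourly rate, overhead, and useful life are assumptions, so the specific numbers are illustrative only.

```python
# Back-of-the-envelope rent-vs-buy breakeven for a training box, per the
# "inverse curve" above. The $250,000 purchase price echoes the conversation;
# the cloud rate, power/hosting overhead, and 3-year life are assumptions.
PURCHASE_PRICE = 250_000        # on-prem system, USD
ANNUAL_OVERHEAD = 30_000        # assumed power, cooling, rack space, support
USEFUL_LIFE_YEARS = 3
CLOUD_RATE_PER_HOUR = 60.0      # assumed comparable cloud instance, USD/hour

owned_cost_per_year = PURCHASE_PRICE / USEFUL_LIFE_YEARS + ANNUAL_OVERHEAD

# Hours per year at which renting costs the same as owning.
breakeven_hours = owned_cost_per_year / CLOUD_RATE_PER_HOUR
utilization = breakeven_hours / (365 * 24)

print(f"Owning costs ~${owned_cost_per_year:,.0f}/year")
print(f"Breakeven at ~{breakeven_hours:,.0f} cloud hours/year (~{utilization:.0%} utilization)")

# With these assumptions, breakeven lands near 1,900 hours a year. A small team
# using the box six hours a weekday sits well below that, so fractional cloud use
# tends to win; sustained, near-full utilization tips the math toward buying.
```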
It's no different than what HPC has been doing for probably the past 50 years with national supercomputing centers, which is where my background came from before cloud, right? It's just a different model, right? One is public economies of, you know, insert your credit card and spend as much as you want and the other is grant-funded and supporting academic research, but the economy of scales is kind of the same on both fronts.Corey: I'm also seeing a trend that this is something that is sort of disturbing when you realize what I've been doing and how I've been going about things, that for the last couple of years, people actually started to care about the AWS bill. And I have to say, I felt like I was severely out of sync with a lot of the world the first few years because there's giant savings lurking in your AWS bill, and the company answer in many cases was, “We don't care. We'd rather focus our energies on shipping faster, building something new, expanding, capturing market.” And that is logical. But suddenly those chickens are coming home to roost in a big way. Our phone is ringing off the hook, as I'm sure you've noticed and your time here, and suddenly money means something again. What do you think drove it?John: So, I think there's a couple of driving factors. The first is obviously the broader economic conditions, you know, with the economic growth in the US, especially slowing down post-pandemic, we're seeing organizations looking for opportunities to spend less to be able to deliver—you know, recoup that money and deliver additional value. But beyond that, right—because, okay, but startups are probably still lighting giant piles of VC money on fire, and that's okay, but what's happening, I think, is that the first wave of CIOs that said cloud-first, cloud-only basically got their comeuppance. And, you know, these enterprises saw their explosive cloud bills and they saw that, oh, you know, we moved 5000 servers to AWS or GCP or Azure and we got the bill, and that's not sustainable. And so, we see a lot of cloud repatriation, cloud optimization, right, a lot of second-gen… cloud, I'll call them second-gen cloud-native CIOs coming into these large organizations where their predecessor made some bad financial decisions and either left or got asked to leave, and now they're trying to stop from lighting their giant piles of cash on fire, they're trying to stop spending 3X what they were spending on-prem.Corey: I think an easy mistake for folks to make is to get lost in the raw infrastructure cost. I'm not saying it's not important. Obviously not, but you could save a giant pile of money on your RDS instances by running your own database software on top of EC2, but I don't generally recommend folks do it because you also need engineering time to be focusing on getting those things up, care and feeding, et cetera. And what people lose sight of is the fact that the payroll expense is almost universally more than the cloud bill at every company I've ever talked to.So, there's a consistent series of, “Well, we're just trying to get to be the absolute lowest dollar figure total.” It's the wrong thing to emphasize on, otherwise, “Cool, turn everything off and your bill drops to zero.” Or, “Migrate to another cloud provider. AWS bill becomes zero. Our job is done.” It doesn't actually solve the problem at all. It's about what's right for the business, not about getting the absolute lowest possible score like it's some kind of code golf tournament.John: Right. 
So, I think that there's a couple of different ways to look at that. One is obviously looking at making your workloads more cloud-native. I know that's a stupid buzzword to some people, but—
Corey: The problem I have with the term is that it means so many different things to different people.
John: Right. But I think the gist of that is taking advantage of what the cloud is good at. And so, what we saw was that excess capacity on-prem was effectively free once you bought it, right? There was no accountability for burning through extra vCPUs or extra RAM. And then you had—
Corey: Right. You spin something up in your data center and the question is, “Is the physical capacity there?” And very few companies had a reaping process until they were suddenly seeing capacity issues and suddenly everyone starts asking you a whole bunch of questions about it. But that was a natural forcing function that existed. Now, S3 has infinite storage, or it might as well. They can add capacity faster than you can fill it—I know this; I've tried—and the problem that you have then is that it's always just a couple more cents per gigabyte and it keeps on going forever. There's no “we need to make an investment decision because the SAN is at 80% capacity.” Do you need all those 16 copies of the production data that you haven't touched since 2012? No, I probably don't.
John: Yeah, there's definitely a forcing function when you're doing your own capacity planning. And the cloud, for the most part, as you've alluded to, is infinite capacity for most organizations. So, when they're looking at AWS or they're looking at any of the public cloud providers, it's a potentially infinite bill. Now, that scares a lot of organizations, and because they didn't have the forcing function of, hey, we're out of CPUs, or we're out of hard disk space, or we're out of network ports—and because the cloud was a buzzword that a lot of shareholders and boards wanted to see in IT status reports and IT strategic plans—I think we grew a little bit further than we should have, from an enterprise perspective. And I think a lot of that's now being clawed back as organizations are maturing and looking to manage cost. Obviously, the huge growth of just the term FinOps from a search perspective over the last three years has cemented that, right? We're seeing a much more cost-conscious consumer—cloud consumer—than we saw three years ago.
Corey: I think that the baseline level of understanding has also risen. It used to be that I would go into a client environment prepared to deploy all kinds of radical stuff that these days looks like context-aware architecture and things that would automatically turn down developer environments when developers were done for the day or whatnot. And I would discover that, oh, you haven't bought Reserved Instances in three years. Maybe start there with the easy thing. And now you don't see those big misconfigurations or big oversights the way that you once did.
People are getting better at this, which is a good thing. I'm certainly not having a problem with this. It means that we get to focus on things that are more architecturally nuanced, which I love. And I think that it forces us to continue innovating rather than just doing something that basically any random software stack could provide.
John: Yeah, I think to your point, the easy wins are being exhausted or have been exhausted already, right?
Very rarely do we walk into a customer and see that they haven't bought a, you know, Reserved Instance, or a Savings Plan. That's just not a thing. And the proliferation of software tools to help with those things, of course, in some cases, dubious proposition of, “We'll fix your cloud bill automatically for a small percentage of the savings,” that some of those software tools have, I think those have kind of run their course. And now you've got a smarter populace or smarter consumer and it does come into the more nuanced stuff, right.All right, do you really need to replicate data across AZs? Well, not if your workloads aren't stateful. Well, so some of the old things—and Kubernetes is a great example of this, right—the age old adage of, if I'm going to spin up an EKS cluster, I need to put it in three AZs, okay, why? That's going to cost you money [laugh], the cross-AZ traffic. And I know cross-AZ traffic is a simple one, but we still see that. We still see, “Well, I don't know why I put it across all three AZs.”And so, the service-to-service communication inside that cluster, the control plane traffic inside that cluster, is costing you money. Now, it might be minimal, but as you grow and as you scale your product or the services that you're providing internally, that may grow to a non-trivial sum of money.Corey: I think that there's a tipping point where an unbounded growth problem is always going to emerge as something that needs attention and needs to be focused on. But I should ask you this because you have a skill set that is, as you know, extremely in demand. You also have that rare gift that I wish wasn't as rare as it is where you can be thrown into the deep end knowing next to nothing about a particular technology stack, and in a remarkably short period of time, develop what can only be called subject matter expertise around it. I've seen you do this years past with Kubernetes, which is something I'm still trying to wrap my head around. You have a natural gift for it which meant that, from many respects, the world was your oyster. Why this? Why now?John: So, I think there's a couple of things that are unique at this thing, at this time point, right? So obviously, helping customers has always been something that's fun and exciting for me, right? Going to an organization and solving the same problem I've solved 20 different times, for example, spinning up a Kubernetes cluster, I guess I have a little bit of a little bit of squirrel syndrome, so to speak, and that gets—it gets boring. I'd rather just automate that or build some tooling and disseminate that to the customers and let them do that. So, the thing with cost management is, it's always a different problem.Yeah, we're solving fundamentally the same problem, which is, I'm spending too much, but it's always a different root cause, you know? In one customer, it could be data transfer fees. In another customer, it could be errant development growth where they're not controlling the spend on their development environments. In yet another customer, it could be excessive object storage growth. So, being able to hunt and look for those and play detective is really fun, and I think that's one of the things that drew me to this particular area.The other is just from a timing perspective, this is a problem a lot of organizations have, and I think it's underserved. I think that there are not enough companies—service providers, whatever—focusing on the hard problem of cost optimization. 
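Going back to John's example of cross-AZ chatter inside a three-AZ Kubernetes cluster, and to data transfer fees as one of his root causes: a quick sanity check is plain arithmetic. The traffic volume below is a made-up input, and the per-gigabyte rate is the commonly cited figure for intra-region cross-AZ transfer—verify it against current pricing before relying on it.

```python
# Back-of-the-envelope cost of cross-AZ service-to-service traffic.
# Traffic volume is an assumption; the rate is the commonly cited
# $0.01/GB charged in each direction (roughly $0.02/GB round trip).

CROSS_AZ_RATE_EACH_WAY = 0.01   # $/GB, verify against current pricing
monthly_cross_az_gb = 50_000    # assumption: ~50 TB/month of east-west traffic

monthly_cost = monthly_cross_az_gb * CROSS_AZ_RATE_EACH_WAY * 2
print(f"~${monthly_cost:,.0f}/month just for traffic crossing AZ boundaries")
# Keeping calls inside one AZ where the workload allows it shrinks the multiplier;
# a genuinely stateless, single-AZ deployment removes it entirely.
```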
There's too many people who think it's a finance problem and not enough people who think it's an engineering problem. And so, I wanted to do work on a place where we think it's an engineering problem.Corey: It's been a very… long road. And I think that engineering problems and people problems are both fascinating to me, and the AWS bill is both. It's often misunderstood as a finance problem, and finance needs to be consulted absolutely, but they can't drive an optimization project, and they don't know what the context is behind an awful lot of decisions that get made. It really is breaking down bridges. But also, there's a lot of engineering in here, too. It scratches my itch in that direction, anyway.John: Yeah, it's one of the few business problems that I think touches multiple areas. As you said, it's obviously a people problem because we want to make sure that we are supporting and educating our staff. It's a process problem. Are we making costs visible to the organization? Are we making sure that there's proper chargeback and showback methodologies, et cetera? But it's also a technology problem. Did we build this thing to take advantage of the architecture or did we shoehorn it in a way that's going to cost us a small fortune? And I think it touches all three, which I think is unique.Corey: John, I really want to thank you for taking the time to speak with me. If people want to learn more about what you're up to in a given day, where's the best place for them to find you?John: Well, thanks, Corey, and thanks for having me. And, of course obviously, our website duckbillgroup.com is a great place to find out what we're working on, what we have coming. I also, I'm pretty active on LinkedIn. I know that's [laugh]—I'm not a huge Twitter guy, but I am pretty active on LinkedIn, so you can always drop me a follow on LinkedIn. And I'll try to post interesting and useful content there for our listeners.Corey: And we will, of course, put links to that in the [show notes 00:28:37], which in my case, is of course extremely self-aggrandizing. But that's all right. We're here to do self-promotion. Thank you so much for taking the time to chat with me, John. I appreciate it. Now, get back to work.John: [laugh]. All right, thanks, Corey. Have a good one.Corey: John Wynkoop, cloud economist at The Duckbill Group. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice while also taking pains to note how you're using multiple podcast platforms these days because that just seems to be the way the world went.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Making an Affordable Event Data Solution with Seif Lotfy

Screaming in the Cloud

Play Episode Listen Later Oct 19, 2023 27:49


Seif Lotfy, Co-Founder and CTO at Axiom, joins Corey on Screaming in the Cloud to discuss how and why Axiom has taken a low-cost approach to event data. Seif describes the events that led to him helping co-found a company, and explains why the team wrote all their code from scratch. Corey and Seif discuss their views on AWS pricing, and Seif shares his views on why AWS doesn't have to compete on price. Seif also reveals some of the exciting new products and features that Axiom is currently working on.

About Seif
Seif is the bubbly Co-founder and CTO of Axiom, where he has helped build the next generation of logging, tracing, and metrics. His background is at Xamarin and Deutsche Telekom, and he is the kind of deep technical nerd that geeks out on white papers about emerging technology and then goes to see what he can build.

Links Referenced:
Axiom: https://axiom.co/
Twitter: https://twitter.com/seiflotfy

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by my friends, and soon to be yours, over at Axiom. Today I'm talking with Seif Lotfy, who's the co-founder and CTO of Axiom. Seif, how are you?
Seif: Hey, Corey, I am very good, thank you. It's pretty late here, but it's worth it. I'm excited to be on this interview. How are you today?
Corey: I'm not dead yet. It's weird, I see you at a bunch of different conferences, and I keep forgetting that you do in fact live half a world away. Is the entire company based in Europe? And where are you folks? Where do you start and where do you stop geographically? Let's start there. Everyone dives right into product. No, no, no. I want to know where in the world people sit because apparently, that's the most important thing about a company in 2023.
Seif: Unless you ask Zoom, because they're undoing whatever they did. We're from New Zealand, all the way to San Francisco, and everything in between. So, we have people in Egypt and Nigeria, all around Europe, all around the US… and the UK, if you don't consider it Europe anymore.
Corey: Yeah, it really depends. There's a lot of unfortunate naming that needs to get changed in the wake of that.
Seif: [laugh].
Corey: But enough about geopolitics. Let's talk about industry politics. I've been a fan of Axiom for a while and I was somewhat surprised to realize how long it had been around because I only heard about you folks a couple of years back. What is it you folks do? Because I know how I think about what you're up to, but you've also gone through some messaging iteration, and it is a near certainty that I am behind the times.
Seif: Well, at this point, we just define ourselves as the best home for event data. So, Axiom is the best home for event data. We try to deal with everything that is event-based, so time-series. So, we can talk metrics, logs, traces, et cetera. And right now we're predominantly serving engineering and security. And we're trying to be—or we are—the first cloud-native time-series platform to provide streaming search, reporting, and monitoring capabilities. And we're built from the ground up, by the way.
Like, we didn't actually—we're not using Parquet [unintelligible 00:02:36] thing. We're completely everything from the ground up.Corey: When I first started talking to you folks a few years back, there were two points to me that really stood out, and I know at least one of them still holds true. The first is that at the time, you were primarily talking about log data. Just send all your logs over to Axiom. The end. And that was a simple message that was simple enough that I could understand it, frankly.Because back when I was slinging servers around and you know breaking half of them, logs were effectively how we kept track of what was going on, where. These days, it feels like everything has been repainted with a very broad brush called observability, and the takeaway from most company pitches has been, you must be smarter than you are to understand what it is that we're up to. And in some cases, you scratch below the surface and realize it no, they have no idea what they're talking about either and they're really hoping you don't call them on that.Seif: It's packaging.Corey: Yeah. It is packaging and that's important.Seif: It's literally packaging. If you look at it, traces and logs, these are events. There's a timestamp and just data with it. It's a timestamp and data with it, right? Even metrics is all the way to that point.And a good example, now everybody's jumping on [OTel 00:03:46]. For me, OTel is nothing else, but a different structure for time series, for different types of time series, and that can be used differently, right? Or at least not used differently but you can leverage it differently.Corey: And the other thing that you did that was interesting and is a lot, I think, more sustainable as far as [moats 00:04:04] go, rather than things that can be changed on a billboard or whatnot, is your economic position. And your pricing has changed around somewhat, but I ran a number of analyses on your cost that you were passing on to customers and my takeaway was that it was a little bit more expensive to store data for logs in Axiom than it was to store it in S3, but not by much. And it just blew away the price point of everything else focused around logs, including AWS; you're paying 50 cents a gigabyte to ingest CloudWatch logs data over there. Other companies are charging multiples of that and Cisco recently bought Splunk for $28 billion because it was cheaper than paying their annual Splunk bill. How did you get to that price point? Is it just a matter of everyone else being greedy or have you done something different?Seif: We looked at it from the perspective of… so there's the three L's of logging. I forgot the name of the person at Netflix who talked about that, but basically, it's low costs, low latency, large scale, right? And you will never be able to fulfill all three of them. And we decided to work on low costs and large scale. And in terms of low latency, we won't be low as others like ClickHouse, but we are low enough. Like, we're fast enough.The idea is to be fast enough because in most cases, I don't want to compete on milliseconds. I think if the user can see his data in two seconds, he's happy. Or three seconds, he's happy. I'm not going to be, like, one to two seconds and make the cost exponentially higher because I'm one second faster than the other. 
And that's, I think, that the way we approached this from day one.And from day one, we also started utilizing the idea of existence of Open—Object Storage, we have our own compressions, our own encodings, et cetera, from day one, too, so and we still stick to that. That's why we never converted to other existing things like Parquet. Also because we are a Schema-On-Read, which Parquet doesn't allow you really to do. But other than that, it's… from day one, we wanted to save costs by also making coordination free. So, ingest has to be coordination free, right, because then we don't run a shitty Kafka, like, honestly a lot—a lot of the [logs 00:06:19] companies who running a Kafka in front of it, the Kafka tax reflects in what they—the bill that you're paying for them.Corey: What I found fun about your pricing model is it gets to a point that for any reasonable workload, how much to log or what to log or sample or keep everything is no longer an investment decision; it's just go ahead and handle it. And that was originally what you wound up building out. Increasingly, it seems like you're not just the place to send all the logs to, which to be honest, I was excited enough about that. That was replacing one of the projects I did a couple of times myself, which is building highly available, fault-tolerant, rsyslog clusters in data centers. Okay, great, you've gotten that unlocked, the economics are great, I don't have to worry about that anymore.And then you started adding interesting things on top of it, analyzing things, replaying events that happen to other players, et cetera, et cetera, it almost feels like you're not just a storage depot, but you also can forward certain things on under a variety of different rules or guises and format them as whatever on the other side is expecting them to be. So, there's a story about integrating with other observability vendors, for example, and only sending the stuff that's germane and relevant to them since everyone loves to charge by ingest.Seif: Yeah. So, we did this one thing called endpoints, the number one. Endpoints was a beginning where we said, “Let's let people send us data using whatever API they like using, let's say Elasticsearch, Datadog, Honeycomb, Loki, whatever, and we will just take that data and multiplex it back to them.” So, that's how part of it started. This allows us to see, like, how—allows customers to see how we compared to others, but then we took it a bit further and now, it's still in closed invite-only, but we have Pipelines—codenamed Pipelines—which allows you to send data to us and we will keep it as a source of truth, then we will, given specific rules, we can then ship it anywhere to a different destination, right, and this allows you just to, on the fly, send specific filter things out to, I don't know, a different vendor or even to S3 or you could send it to Splunk. But at the same time, you can—because we have all your data, you can go back in the past, if the incident happens and replay that completely into a different product.Corey: I would say that there's a definite approach to observability, from the perspective of every company tends to visualize stuff a little bit differently. And one of the promises of OTel that I'm seeing that as it grows is the idea of oh, I can send different parts of what I'm seeing off to different providers. But the instrumentation story for OTel is still very much emerging. 
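Stepping back to the Pipelines idea Seif described a moment ago—keep everything as the source of truth, then forward slices of it to other destinations based on rules—the general shape is predicate-plus-destination routing. The sketch below is a toy, in-memory illustration of that pattern; the event fields and destinations are made up, and this is not Axiom's implementation or API.

```python
# Toy rule-based event router: keep every event, forward matching slices elsewhere.
# Illustrates the general pattern only; field names and destinations are invented.
from typing import Callable

Event = dict
Rule = tuple[Callable[[Event], bool], Callable[[Event], None]]

archive: list[Event] = []   # stand-in for the durable source of truth

def send_to_siem(event: Event) -> None:      # stand-in destination
    print("SIEM <-", event["message"])

def send_to_s3(event: Event) -> None:        # stand-in destination
    print("S3   <-", event["message"])

RULES: list[Rule] = [
    (lambda e: e.get("level") == "error",     send_to_siem),
    (lambda e: e.get("service") == "billing", send_to_s3),
]

def ingest(event: Event) -> None:
    archive.append(event)                 # always retained
    for matches, forward in RULES:
        if matches(event):
            forward(event)                # only matching events incur downstream ingest fees

ingest({"level": "error", "service": "api", "message": "timeout talking to database"})
ingest({"level": "info", "service": "billing", "message": "invoice generated"})
```

The economic appeal of the pattern is in the last comment: the expensive downstream tools only ever see the filtered slice, while the full history stays somewhere cheap for replay.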
Logs are kind of eternal, and the only real change we've seen to logs over the past decade or so has been that instead of just being plain text where positional parameters would define what was what—if it's in this column, it's an IP address, and if it's in this column, it's a return code, and that just wound up being ridiculous—now you see them having schemas; they are structured in a variety of different ways. Which, okay, it's a little harder to wind up just cat'ing a file together and piping it to grep, but there are trade-offs that make it worth it, in my experience. This is one of those transitional products that not only is great once you get to where you're going, from my playing with it, but also it meets you where you already are to get started because everything you've got is emitting logs somewhere, whether you know it or not.
Seif: Yes. And that's why we picked up on OTel, right? Like, one of the first things, we now support—we have an OTel endpoint natively, as a first-class citizen, because we wanted to build this experience around OTel in general. Whether we like it or not—and there are more reasons to like it—OTel is a standard that's going to stay and it's going to move us forward. I think OTel will have the same effect, if not bigger, as [unintelligible 00:10:11] back in the day, but now it's not just metrics—it's metrics, logs, and traces. Traces are, for me, very interesting because I think OTel is the first one to push it in a standard way. There were several attempts to make standardized [logs 00:10:25], but I think traces was something that OTel really pushed into a proper standard that we can follow. It annoys me that everybody uses different bits and pieces of it and adds something to it, but I think it's also because it's not that mature yet, so people are trying to figure out how to deliver the best experience and package it in a way that it's actually interesting for a user.
Corey: What I have found is that there's a lot in this space that is just simply noise. Whenever I spend a protracted time period working on basically anything and I'm still confused by the way people talk about that thing, months or years later, I start to get the realization that maybe I'm not the problem here. And I'm not—I don't mean this to be insulting, but one of the things I've loved about you folks is I've always understood what you're saying. Now, you can hear that as, “Oh, you mean we talk like simpletons?” No, it means what you're talking about resonates with at least a subset of the people who have the problem you solve. That's not nothing.
Seif: Yes. We've tried really hard because one of the things we've tried to do is actually bring observability to people who are not always busy with it or for whom it's not part of their day to day. So, we try to bring in [Vercel 00:11:37] developers, right, by doing a Vercel integration. And all of a sudden, now they have their logs, and they have metrics, and they have some traces. So, all of a sudden, they're doing the observability work. Or they have actual observability for their Vercel-based, [unintelligible 00:11:54]-based product. And we try to meet the people where they are, so we try to—instead of actually telling people, “You should send us data”—I mean, that's what they do now—we try to find, okay, what product are you using and how can we grab data from there and send it to us to make your life easier? You see that we did that with Vercel, we did that with Cloudflare.
AWS, we have extensions, Lambda extensions, et cetera, but we're doing it for more things. For Netlify, it's a one-click integration, too, and that's what we're trying to do to actually make the experience and the journey easier.Corey: I want to change gears a little bit because something that we spent a fair bit of time talking about—it's why we became friends, I would think anyway—is that we have a shared appreciation for several things. One of which, at most notable to anyone around us is whenever we hang out, we greet each other effusively and then immediately begin complaining about costs of cloud services. What is your take on the way that clouds charge for things? And I know it's a bit of a leading question, but it's core and foundational to how you think about Axiom, as well as how you serve customers.Seif: They're ripping us off. I'm sorry [laugh]. They just—the amount of money they make, like, it's crazy. I would love to know what margins they have. That's a big question I've always had. I'm like, what are the margins they have at AWS right now?Corey: Across the board, it's something around 30 to 40%, last time I looked at it.Seif: That's a lot, too.Corey: Well, that's also across the board of everything, to be clear. It is very clear that some services are subsidized by other services. As it should be. If you start charging me per IAM call, we're done.Seif: And also, I mean, the machine learning stuff. Like, they won't be doing that much on top of it right now, right, [else nobody 00:13:32] will be using it.Corey: But data transfer? Yeah, there's a significant upcharge on that. But I hear you. I would moderate it a bit. I don't think that I would say that it's necessarily an intentional ripoff. My problem with most cloud services that they offer is not usually that they're too expensive—though there are exceptions to that—but rather that the dimensions are unpredictable in advance. So, you run something for a while and see what it costs. From where I sit, if a customer uses your service and then at the end of usage is surprised by how much it cost them, you've kind of screwed up.Seif: Look, if they can make egress free—like, you saw how Cloudflare just did the egress of R2 free? Because I am still stuck with AWS because let's face it, for me, it is still my favorite cloud, right? Cloudflare is my next favorite because of all the features that are trying to develop and the pace they're picking, the pace they're trying to catch up with. But again, one of the biggest things I liked is R2, and R2 egress is free. Now, that's interesting, right?But I never saw anything coming back from S3 from AWS on S3 for that, like you know. I think Amazon is so comfortable because from a product perspective, they're simple, they have the tools, et cetera. And the UI is not the flashiest one, but you know what you're doing, right? The CLI is not the flashiest one, but you know what you're doing. It is so cool that they don't really need to compete with others yet.And I think they're still dominantly the biggest cloud out there. I think you know more than me about that, but [unintelligible 00:14:57], like, I think they are the biggest one right now in terms of data volume. Like, how many customers are using them, and even in terms of profiles of people using them, it's very, so much. I know, like, a lot of the Microsoft Azure people who are using it, are using it because they come from enterprise that have been always Microsoft… very Microsoft friendly. 
And eventually, Microsoft also came into Europe in all these different weird ways. But I feel sometimes ripped off by AWS because I see Cloudflare trying to reduce the prices and AWS just looking, like, “Yeah, you're not a threat to us so we'll keep our prices as they are.”
Corey: I have it on good authority from folks who know that there are reasons behind the economic structures of both of those companies based—in terms of the primary direction the traffic flows and the rest. But across the board, they've done such a poor job of articulating this that, frankly, I think the confusion is on them to clear up, not us.
Seif: True. True. And the reason I picked R2 and S3 to compare there, and not look at Workers and Lambdas, is because I look at it as R2 is S3-compatible from an API perspective, right? So, they're giving me something that I already use. Everything else I'm using, I'm using inside Amazon, so it's in a VPC, but just the idea. Let me dream. Let me dream that S3 egress will be free at some point.
Corey: I can dream.
Seif: That's like Christmas. It's better than Christmas.
Corey: What I'm surprised about is how reasonable your pricing is in turn. You wind up charging on the basis of ingest, which is basically the only thing that really makes sense for how your company is structured. But it's predictable in advance; the free tier is, what, 500 gigs a month of ingestion, and before people think, “Oh, that doesn't sound like a lot,” I encourage you to just go back and think how much data that really is in the context of logs for any toy project. Like, “Well, our production environment spits out way more than that.” Yes, and by the word production that you just used, you probably shouldn't be using a free trial of anything as your critical path observability tooling. Become a customer, not a user. I'm a big believer in that philosophy, personally. For all of my toy projects that are ridiculous, this is ample.
Seif: People always tend to overestimate how much logs they're going to be sending. Like, so there's one thing. What you said is right: people who already have something going on, they already know how much logs they'll be sending around. But then eventually they're sending too much, and that's why we're back here and they're talking to us. Like, “We want to try your tool, but you know, we'll be sending more than that.” So, if you don't like our pricing, go find something else because I think we are the cheapest out there right now. We're competitively the cheapest out there right now.
Corey: If there is one that is less expensive, I'm unaware of it.
Seif: [laugh].
Corey: And I've been looking, let's be clear. That's not just me saying, “Well, nothing has skittered across my desk.” No, no, no, I pay attention to this space.
Seif: Hey, where's—Corey, we're friends. Loyalty.
Corey: Exactly.
Seif: If you find something, you tell me.
Corey: Oh, if I find something, I'll tell everyone.
Seif: No, no, no, you tell me first and you tell me in a nice way so I can reduce the prices on my site [laugh].
Corey: This is how we start a price war, industry-wide, and I would love to see it.
Seif: [laugh]. But there are enough channels that we share at this point across different Slacks and messaging apps that you should be able to ping me if you find one. Also, get me the name of the CEO and the CTO while you're at it.
Corey: And where they live. Yes, yes, of course. The dire implications will be awesome.
Seif: That was you, not me.
That was your suggestion.Corey: Exactly.Seif: I will not—[laugh].Corey: Before we turn into a bit of an old thud and blunder, let's talk about something else that I'm curious about here. You've been working on Axiom for something like seven years now. You come from a world of databases and events and the like. Why start a company in the model of Axiom? Even back then, when I looked around, my big problem with the entire observability space could never have been described as, “You know what we need? More companies that do exactly this.” What was it that you saw that made you say, “Yeah, we're going to start a company because that sounds easy.”Seif: So, I'll be very clear. Like, I'm not going to, like, sugarcoat this. We kind of got in a position where it [forced counterweighted 00:19:10]. And [laugh] by that I mean, we came from a company where we were dealing with logs. Like, we actually wrote an event crash analytics tool for a company, but then we ended up wanting to use stuff like Datadog, but we didn't have the budget for that because Datadog was killing us.So, we ended up hosting our own Elasticsearch. And Elasticsearch, it costs us more to maintain our Elasticsearch cluster for the logs than to actually maintain our own little infrastructure for the crash events when we were getting, like, 1 billion crashes a month at this point. So eventually, we just—that was the first burn. And then you had alert fatigue and then you had consolidating events and timestamps and whatnot. The whole thing just seemed very messy.So, we started off after some company got sold, we started off by saying, “Okay, let's go work on a new self-hosted version of the [unintelligible 00:20:05] where we do metrics and logs.” And then that didn't go as well as we thought it would, but we ended up—because from day one, we were working on cloud na—because we d—we cloud ho—we were self-hosted, so we wanted to keep costs low, we were working on and making it stateless and work against object store. And this is kind of how we started. We realized, oh, our cost, we can host this and make it scale, and won't cost us that much.So, we did that. And that started gaining more attention. But the reason we started this was we wanted to start a self-hosted version of Datadog that is not costly, and we ended up doing a Software as a Service. I mean, you can still come and self-hosted, but you'll have to pay money for it, like, proper money for that. But we do as a SaaS version of this and instead of trying to be a self-hosted Datadog, we are now trying to compete—or we are competing with Datadog.Corey: Is the technology that you've built this on top of actually that different from everything else out there, or is this effectively what you see in a lot of places: “Oh, yeah, we're just going to manage Elasticsearch for you because that's annoying.” Do you have anything that distinguishes you from, I guess, the rest of the field?Seif: Yeah. So, very just bluntly, like, I think Scuba was the first thing that started standing out, and then Honeycomb came into the scene and they start building something based on Scuba, the [unintelligible 00:21:23] principles of Scuba. Then one of the authors of actual Scuba reached out to me when I told him I'm trying to build something, and he's gave me some ideas, and I start building that. And from day one, I said, “Okay, everything in S3. All queries have to be serverless.”So, all the queries run on functions. There's no real disks. It's just all on S3 right now. 
And the biggest issue—achievement we got to lower our cost was to get rid of Kafka, and have—let's say, in behind the scenes we have our own coordination-free mechanism, but the idea is not to actually have to use Kafka at all and thus reduce the costs incredibly. In terms of technology, no, we don't use Elasticsearch.We wrote everything from the ground up, from scratch, even the query language. Like, we have our own query language that's based—modeled after Kusto—KQL by Microsoft—so everything we have is built from absolutely from the ground up. And no Elastic. I'm not using Elastic anymore. Elastic is a horror for me. Absolutely horror.Corey: People love the API, but no, I've never met anyone who likes managing Elasticsearch or OpenSearch, or whatever we're calling your particular flavor of it. It is a colossal pain, it is subject to significant trade-offs, regardless of how you work with it, and Amazon's managed offering doesn't make it better; it makes it worse in a bunch of ways.Seif: And the green status of Elasticsearch is a myth. You'll only see it once: the first time you start that cluster, that's what the Elasticsearch cluster is green. After that, it's just orange, or red. And you know what? I'm happy when it's orange. Elasticsearch kept me up for so long. And we had actually a very interesting situation where we had Elasticsearch running on Azure, on Windows machines, and I would have server [unintelligible 00:23:10]. And I'd have to log in and every day—you remember, what's it called—RP… RP Something. What was it called?Corey: RDP? Remote Desktop Protocol, or something else?Seif: Yeah, yeah. Where you have to log in, like, you actually have visual thing, and you have to go in and—Corey: Yep.Seif: And visually go in and say, “Please don't restart.” Every day, I'd have to do that. Please don't restart, please don't restart. And also a lot of weird issues, and also at that point, Azure would decide to disconnect the pod, wanted to try to bring in a new pod, and all these weird things were happening back then. So, eventually, end up with a [unintelligible 00:23:39] decision. I'm talking 2013, '14, so it was back in the day when Elasticsearch was very young. And so, that was just a bad start for me.Corey: I will say that Azure is the most cost-effective cloud because their security is so clown shoes, you can just run whatever you want in someone else's account and it's free to you. Problem solved.Seif: Don't tell people how we save costs, okay?Corey: [laugh]. I love that.Seif: [laugh]. Don't tell people how we do this. Like, Corey, come on [laugh], you're exposing me here. Let me tell you one thing, though. Elasticsearch is the reason I literally use a shock collar or a shock bracelet on myself every time it went down—which was almost every day, instead of having PagerDuty, like, ring my phone.And, you know, I'd wake up and my partner back then would wake up. I bought a Bluetooth collar off of Alibaba that would tase me every time I'd get a notification, regardless of the notification. So, some things are false alarm, but I got tased for at least two, three weeks before I gave up. Every night I'd wake up, like, to a full discharge.Corey: I would never hook myself up to a shocker tied to outages, even if I owned a company. There are pleasant ways to wake up, unpleasant ways to wake up, and even worse. So, you're getting shocked for some—so someone else can wind up effectively driving the future of the business. 
You're, more or less, the monkey that gets shocked awake to go ahead and fix the thing that just broke.Seif: [laugh]. Well, the fix to that was moving from Azure to AWS without telling anybody. That got us in a lot of trouble. Again, that wasn't my company.Corey: They didn't notice that you did this, or it caused a lot of trouble because suddenly nothing worked where they thought it would work?Seif: They—no, no, everything worked fine on AWS. That's how my love story began. But they didn't notice for, like, six months.Corey: That's kind of amazing.Seif: [laugh]. That was specta—we rewrote everything from C# to Node.js and moved everything away from Elasticsearch, started using Redshift, Redis and a—you name it. We went AWS all the way and they didn't even notice. We took the budget from another department to start filling that in.But we cut the costs from $100,000 down to, like, 40, and then eventually down to $30,000 a month.Corey: More than a little wild.Seif: Oh, God, yeah. Good times, good times. Next time, just ask me to tell you the full story about this. I can't go into details on this podcast. I'll get in a lot—I think I'll get in trouble. I didn't sign anything though.Corey: Those are the best stories. But no, I hear you. I absolutely hear you. Seif, I really want to thank you for taking the time to speak with me. If people want to learn more, where should they go?Seif: So, axiom.co—not dot com. Dot C-O. That's where they learn more about Axiom. And other than that, I think I have a Twitter somewhere. And if you know how to write my name, you'll—it's just one word and find me on Twitter.Corey: We will put that all in the [show notes 00:26:33]. Thank you so much for taking the time to speak with me. I really appreciate it.Seif: Dude, that was awesome. Thank you, man.Corey: Seif Lotfy, co-founder and CTO of Axiom, who has brought this promoted guest episode our way. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that one of these days, I will get around to aggregating in some horrifying custom homebrew logging system, probably built on top of rsyslog.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Keeping Workflows Secure in an Ever-Changing Environment with Adnan Khan

Screaming in the Cloud

Play Episode Listen Later Oct 17, 2023 34:42


Adnan Khan, Lead Security Engineer at Praetorian, joins Corey on Screaming in the Cloud to discuss software bill of materials and supply chain attacks. Adnan describes how simple pull requests can lead to major security breaches, and how to best avoid those vulnerabilities. Adnan and Corey also discuss the rapid innovation at Github Actions, and the pros and cons of having new features added so quickly when it comes to security. Adnan also discusses his view on the state of AI and its impact on cloud security. About AdnanAdnan is a Lead Security Engineer at Praetorian. He is responsible for executing on Red-Team Engagements as well as developing novel attack tooling in order to meet and exceed engagement objectives and provide maximum value for clients.His past experience as a software engineer gives him a deep understanding of where developers are likely to make mistakes, and has applied this knowledge to become an expert in attacks on organization's CI/CD systems.Links Referenced: Praetorian: https://www.praetorian.com/ Twitter: https://twitter.com/adnanthekhan Praetorian blog posts: https://www.praetorian.com/author/adnan-khan/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Are you navigating the complex web of API management, microservices, and Kubernetes in your organization? Solo.io is here to be your guide to connectivity in the cloud-native universe!Solo.io, the powerhouse behind Istio, is revolutionizing cloud-native application networking. They brought you Gloo Gateway, the lightweight and ultra-fast gateway built for modern API management, and Gloo Mesh Core, a necessary step to secure, support, and operate your Istio environment.Why struggle with the nuts and bolts of infrastructure when you can focus on what truly matters - your application. Solo.io's got your back with networking for applications, not infrastructure. Embrace zero trust security, GitOps automation, and seamless multi-cloud networking, all with Solo.io.And here's the real game-changer: a common interface for every connection, in every direction, all with one API. It's the future of connectivity, and it's called Gloo by Solo.io.DevOps and Platform Engineers, your journey to a seamless cloud-native experience starts here. Visit solo.io/screaminginthecloud today and level up your networking game.Corey: As hybrid cloud computing becomes more pervasive, IT organizations need an automation platform that spans networks, clouds, and services—while helping deliver on key business objectives. Red Hat Ansible Automation Platform provides smart, scalable, sharable automation that can take you from zero to automation in minutes. Find it in the AWS Marketplace.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. I've been studiously ignoring a number of buzzword, hype-y topics, and it's probably time that I addressed some of them. One that I've been largely ignoring, mostly because of its prevalence at Expo Hall booths at RSA and other places, has been software bill of materials and supply chain attacks. Finally, I figured I would indulge the topic. Today I'm speaking with Adnan Khan, lead security engineer at Praetorian. 
Adnan, thank you for joining me.Adnan: Thank you so much for having me.Corey: So, I'm trying to understand, on some level, where the idea of these SBOM or bill-of-material attacks have—where they start and where they stop. I've seen it as far as upstream dependencies have a vulnerability. Great. I've seen misconfigurations in how companies wind up configuring their open-source presences. There have been a bunch of different, it feels almost like orthogonal concepts to my mind, lumped together as this is a big scary thing because if we have a big single scary thing we can point at, that unlocks budget. Am I being overly cynical on this or is there more to it?Adnan: I'd say there's a lot more to it. And there's a couple of components here. So first, you have the SBOM-type approach to security where organizations are looking at which packages are incorporated into their builds. And vulnerabilities can come out in a number of ways. So, you could have software actually have bugs or you could have malicious actors actually insert backdoors into software.I want to talk more about that second point. How do malicious actors actually insert backdoors? Sometimes it's compromising a developer. Sometimes it's compromising credentials to push packages to a repository, but other times, it could be as simple as just making a pull request on GitHub. And that's somewhere where I've spent a bit of time doing research, building off of techniques that other people have documented, and also trying out some attacks for myself against two Microsoft repositories and several others that have reported over the last few months that would have been able to allow an attacker to slip a backdoor into code and expand the number of projects that they are able to attack beyond that.Corey: I think one of the areas that we've seen a lot of this coming from has been the GitHub Action space. And I'll confess that I wasn't aware of a few edge-case behaviors around this. Most of my experience with client-side Git configuration in the .git repository—pre-commit hooks being a great example—intentionally and by design from a security perspective, do not convey when you check that code in and push it somewhere, or grab someone else's, which is probably for the best because otherwise, it's, “Oh yeah, just go ahead and copy your password hash file and email that to something else via a series of arcane shell script stuff.” The vector is there. I was unpleasantly surprised somewhat recently to discover that when I cloned a public project and started running it locally and then adding it to my own fork, that it would attempt to invoke a whole bunch of GitHub Actions flows that I'd never, you know, allowed it to do. That was… let's say, eye-opening.Adnan: [laugh]. Yeah. So, on the particular topic of GitHub Actions, the pull request as an attack vector, like, there's a lot of different forms that an attack can take. So, one of the more common ones—and this is something that's been around for just about as long as GitHub Actions has been around—and this is a certain trigger called ‘pull request target.' What this means is that when someone makes a pull request against the base repository, maybe a branch within the base repository such as main, that will be the workflow trigger.And from a security's perspective, when it runs on that trigger, it does not require approval at all. And that's something that a lot of people don't really realize when they're configuring their workflows. 
Because normally, when you have a pull request trigger, the maintainer can check a box that says, “Oh, require approval for all external pull requests.” And they think, “Great, everything needs to be approved.” If someone tries to add malicious code to run that's on the pull request target trigger, then they can look at the code before it runs and they're fine.But in a pull request target trigger, there is no approval and there's no way to require an approval, except for configuring the workflow securely. So, in this case, what happens is, and in one particular case against the Microsoft repository, this was a Microsoft reusable GitHub Action called GPT Review. It was vulnerable because it checked out code from my branch—so if I made a pull request, it checked out code from my branch, and you could find this by looking at the workflow—and then it ran tests on my branch, so it's running my code. So, by modifying the entry points, I could run code that runs in the context of that base branch and steal secrets from it, and use those to perform malicious Actions.Corey: Got you. It feels like historically, one of the big threat models around things like this is al—[and when 00:06:02] you have any sort of CI/CD exploit—is either falls down one of two branches: it's either the getting secret access so you can leverage those credentials to pivot into other things—I've seen a lot of that in the AWS space—or more boringly, and more commonly in many cases, it seems to be oh, how do I get it to run this crypto miner nonsense thing, with the somewhat large-scale collapse of crypto across the board, it's been convenient to see that be less prevalent, but still there. Just because you're not making as much money means that you'll still just have to do more of it when it's all in someone else's account. So, I guess it's easier to see and detect a lot of the exploits that require a whole bunch of compute power. The, oh by the way, we stole your secrets and now we're going to use that to lateral into an organization seem like it's something far more… I guess, dangerous and also sneaky.Adnan: Yeah, absolutely. And you hit the nail on the head there with sneaky because when I first demonstrated this, I made a test account, I created a PR, I made a couple of Actions such as I modified the name of the release for the repository, I just put a little tag on it, and didn't do any other changes. And then I also created a feature branch in one of Microsoft's repositories. I don't have permission to do that. That just sat there for about almost two weeks and then someone else exploited it and then they responded to it.So, sneaky is exactly the word you could describe something like this. And another reason why it's concerning is, beyond the secret disclosure for—and in this case, the repository only had an OpenAI API key, so… okay, you can talk to ChatGPT for free. But this was itself a Github Action and it was used by another Microsoft machine-learning project that had a lot more users, called SynapseML, I believe was the name of the other project. 
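A rough sketch of the kind of check that catches the pattern Adnan lays out above—pull_request_target combined with checking out the pull request's own head—plus a related hygiene check for actions that aren't pinned to a full commit SHA. The regexes are simplifications that will miss variants and produce false positives, and this is not Adnan's actual code-search query.

```python
# Naive lint for two risky GitHub Actions patterns:
#  1) pull_request_target workflows that check out the PR's own head (attacker-controlled code)
#  2) third-party actions referenced by mutable tag/branch instead of a full commit SHA
# The regexes are simplifications; treat hits as leads to review, not verdicts.
import re
from pathlib import Path

TRIGGER = re.compile(r"\bpull_request_target\b")
PR_HEAD_CHECKOUT = re.compile(r"github\.event\.pull_request\.head\.(?:sha|ref)")
USES = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)@([^\s#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")

def lint(repo_root: str = ".") -> list[str]:
    findings = []
    for wf in Path(repo_root, ".github", "workflows").glob("*.y*ml"):
        text = wf.read_text(encoding="utf-8", errors="replace")
        if TRIGGER.search(text) and PR_HEAD_CHECKOUT.search(text):
            findings.append(f"{wf}: pull_request_target runs with the PR head checked out")
        for action, ref in USES.findall(text):
            if action.startswith(("./", "docker://")):
                continue  # local and Docker actions aren't covered by this check
            if not FULL_SHA.match(ref):
                findings.append(f"{wf}: {action}@{ref} is not pinned to a commit SHA")
    return findings

if __name__ == "__main__":
    print("\n".join(lint()) or "no obvious issues found by this naive check")
```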
So, what someone could do is backdoor this Action by creating a commit in a feature branch, which they can do by stealing the built-in GitHub token—and this is something that all GitHub Action runs have; the permissions for it vary, but in this case, it had the right permissions—the attacker could create a new branch, modify code in that branch, and then modify the tag—in Git, tags are mutable, so you can just change the commit the tag points to—and now, every time that other Microsoft repository runs GPT Review to review a pull request, it's running attacker-controlled code, and then that could potentially backdoor that other repository and steal secrets from that repository. So that's, you know, one of the scary parts of backdooring a GitHub Action in particular. And I believe there was a very informative Black Hat talk this year—I'm forgetting the name of the author, but it was a very good watch—about how Actions can be vulnerable, and this is kind of an example of that—it just happened to be that this was an Action as well.
Corey: That feels like this is an area of exploit that is becoming increasingly common. I tie it almost directly to the rise of GitHub Actions as the default CI/CD system that a lot of folks have been using. For the longest time, it seemed like the way that people would exploit things was a poorly configured Jenkins box hanging out somewhere in your environment—the exception to the Infrastructure as Code rule because everyone has access to it, configures it by hand, and invariably it has access to production. For a while, you had CircleCI and Travis-CI, before Travis imploded and Circle did a bunch of layoffs. Who knows where they're at these days?
But it does seem that the common point now has been GitHub Actions, and a .github folder within that Git repo with a workflows YAML file effectively means that a whole bunch of stuff can happen that you might not be fully aware of when you're cloning or following along with someone's tutorial somewhere. That has caught me out in a couple of strange ways, but nothing disastrous because I do believe in realistic security boundaries. I just worry how much of this is the emerging factor of having a de facto standard around this versus something that Microsoft has actively gotten wrong. What's your take on it?
Adnan: Yeah. So, my take here is that GitHub could absolutely be doing a lot more to help prevent users from shooting themselves in the foot. Their documentation is very clear and, quite frankly, very good, but people aren't warned when they make certain configuration settings in their workflows. I mean, GitHub will happily take the settings and, you know, they hit commit, and now the workflow could be vulnerable. There's no automatic linting of workflows, or a little suggestion box popping up like, “Hey, are you sure you want to configure it this way?”
The technology to detect that is there. There's a lot of third-party utilities that will lint Actions workflows. Heck, for looking for a lot of these pull request target-type vulnerabilities, I use a GitHub code search query. It's just a regular expression. So, having something that at least nudges users to not make that mistake would go really far in helping people not make these mistakes—you know, not add vulnerabilities to their projects.
Corey: It seems like there have also been issues around the GitHub Actions integration approach where OIDC has not been scoped correctly a bunch of times.
I've seen a number of articles come across my desk in that context and fortunately, when I wound up passing out the ability for one of my workflows to deploy to my AWS account, I got it right because I had no idea what I was doing and carefully followed the instructions. But I can totally see overlooking that one additional parameter that leaves things just wide open for disaster.Adnan: Yeah, absolutely. That's one where I haven't spent too much time actually looking for that myself, but I've definitely read those articles that you mentioned, and yeah, it's very easy for someone to make that mistake, just like, it's easy for someone to just misconfigure their Action in general. Because in some of the cases where I found vulnerabilities, there would actually be a commit saying, “Hey, I'm making this change because the Action needs access to these certain secrets. And oh, by the way, I need to update the checkout steps so it actually checks out the PR head so that it's [testing 00:12:14] that PR code.” Like, people are actively making a decision to make it vulnerable because they don't realize the implication of what they've just done.And in the second Microsoft repository that I found the bug in, was called Microsoft Confidential Sidecar Containers. That repository, the developer a week prior to me identifying the bug made a commit saying that we're making a change and it's okay because it requires approval. Well, it doesn't because it's a pull request target.Corey: Part of me wonders how much of this is endemic to open-source as envisioned through enterprises versus my world of open-source, which is just eh, I've got this weird side project in my spare time, and it seemed like it might be useful to someone else, so I'll go ahead and throw it up there. I understand that there's been an awful lot of commercialization of open-source in recent years; I'm not blind to that fact, but it also seems like there's a lot of companies playing very fast and loose with things that they probably shouldn't be since they, you know, have more of a security apparatus than any random contributors standing up a clone of something somewhere will.Adnan: Yeah, we're definitely seeing this a lot in the machine-learning space because of companies that are trying to move so quickly with trying to build things because OpenAI AI has blown up quite a bit recently, everyone's trying to get a piece of that machine learning pie, so to speak. And another thing of what you're seeing is, people are deploying self-hosted runners with Nvidia, what is it, the A100, or—it's some graphics card that's, like, $40,000 apiece attached to runners for running integration tests on machine-learning workflows. And someone could, via a pull request, also just run code on those and mine crypto.Corey: I kind of miss the days when exploiting computers is basically just a way for people to prove how clever they were or once in a blue moon come up with something innovative. Now, it's like, well, we've gone all around the mulberry bush just so we can basically make computers solve a sudoku form, and in return, turn that into money down the road. It's frustrating, to put it gently.Adnan: [laugh].Corey: When you take a look across the board at what companies are doing and how they're embracing the emerging capabilities inherent to these technologies, how do you avoid becoming a cautionary tale in the space?Adnan: So, on the flip side of companies having vulnerable workflows, I've also seen a lot of very elegant ways of writing secure workflows. 
And some of the repositories are using deployment environments—which is the GitHub Actions feature—to enforce approval checks. So, workflows that do need to run on pull request target because of the need to access secrets for pull requests will have a step that requires a deployment environment to complete, and that deployment environment is just an approval and it doesn't do anything. So essentially, someone who has permissions to the repository will go in, approve that environment check, and only then will the workflow continue. So, that adds mandatory approvals to pull requests where otherwise they would just run without approval.And this is on, particularly, the pull request target trigger. Another approach is making it so the trigger is only running on the label event and then having a maintainer add a label so the tests can run and then remove the label. So, that's another approach where companies are figuring out ways to write secure workflows and not leave their repositories vulnerable.Corey: It feels like every time I turn around, Github Actions has gotten more capable. And I'm not trying to disparage the product; it's kind of the idea of what we want. But it also means that there's certainly not an awareness in the larger community of how these things can go awry that has kept up with the pace of feature innovation. How do you balance this without becoming the Department of No?Adnan: [laugh]. Yeah, so it's a complex issue. I think GitHub has evolved a lot over the years. Actions, it's—despite some of the security issues that happen because people don't configure them properly—is a very powerful product. For a CI/CD system to work at the scale it does and allow so many repositories to work and integrate with everything else, it's really easy to use. So, it's definitely something you don't want to take away or have an organization move away from something like that because they are worried about the security risks.When you have features coming in so quickly, I think it's important to have a base, kind of like, a mandatory reading. Like, if you're a developer that writes and maintains an open-source software, go read through this document so you can understand the do's and don'ts instead of it being a patchwork where some people, they take a good security approach and write secure workflows and some people just kind of stumble through Stack Overflow, find what works, messes around with it until their deployment is working and their CI/CD is working and they get the green checkmark, and then they move on to their never-ending list of tasks that—because they're always working on a deadline.Corey: Reminds me of a project I saw a few years ago when it came out that Volkswagen had been lying to regulators. It was a framework someone built called ‘Volkswagen' that would detect if it was running inside of a CI/CD environment, and if so, it would automatically make all the tests pass. I have a certain affinity for projects like that. Another one was a tool that would intentionally degrade the performance of a network connection so you could simulate having a latent or stuttering connection with packet loss, and they call that ‘Comcast.' Same story. I just thought that it's fun seeing people get clever on things like that.Adnan: Yeah, absolutely.Corey: When you take a look now at the larger stories that are emerging in the space right now, I see an awful lot of discussion coming up that ties to SBOMs and understanding where all of the components of your software come from. 
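The two gating patterns Adnan describes at the top of this exchange, a reviewer-protected deployment environment or a trigger that only fires on a maintainer-applied label, can also be checked for mechanically. The sketch below operates on a plain dict standing in for a parsed workflow, and its acceptance rules are simplified assumptions rather than a complete policy.

```python
"""Sketch: is a pull_request_target workflow gated before secrets flow?

Accepts a workflow that either (a) runs every job in a named deployment
environment, which can require manual approval, or (b) only fires on the
`labeled` activity type, so a maintainer has to apply a label first.
Both rules are simplifications for illustration.
"""

def is_gated(workflow: dict) -> bool:
    triggers = workflow.get("on")
    trigger = triggers.get("pull_request_target") if isinstance(triggers, dict) else None
    label_gated = isinstance(trigger, dict) and trigger.get("types") == ["labeled"]

    jobs = workflow.get("jobs", {})
    env_gated = bool(jobs) and all("environment" in job for job in jobs.values())
    return label_gated or env_gated

# Hand-written stand-in for a parsed, label-gated workflow.
example = {
    "on": {"pull_request_target": {"types": ["labeled"]}},
    "jobs": {"test": {"runs-on": "ubuntu-latest", "steps": []}},
}
print(is_gated(example))  # True
```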
But I chased some stuff down for fun once, and I gave up after 12 dependency leaps from just random open-source frameworks. I mean, I see the Dependabot problem that this causes as well, where whenever I put something on GitHub and then don't touch it for a couple of months—because that's how I roll—I come back and there's a whole bunch of terrifyingly critical updates that it's warning me about, but given the nature of how these things get used, it's never going to impact anything that I'm currently running. So, I've learned to tune it out and just ignore it when it comes in, which is probably the worst of all possible approaches. Now, if I worked at a bank, I should probably take a different perspective on this, but I don't. Adnan: Mm-hm. Yeah. And that's kind of a problem you see, not just with SBOMs. It's just security alerting in general, where anytime you have some sort of signal and people who are supposed to respond to it are getting too much of it, you just start to tune all of it out. It's like that human element that applies to so much in cybersecurity. And I think for the particular SBOM problem, where, yeah, you're correct, like, a lot of it… you don't have reachability because you're using a library for one particular function and that's it. And this is somewhere where I'm not that much of an expert, in terms of doing more static source analysis and reachability testing, but I'm certain there are products and tools that offer that feature to actually prioritize SBOM-based alerts based on actual reachability versus just having it as a dependency or not. [midroll 00:20:00] Corey: I feel like, on some level, wanting people to be more cautious about what they're doing is almost shouting into the void because I'm one of the only folks I found that has made the assertion that oh yeah, companies don't actually care about security. Yes, they email you all the time after they failed to protect your security, telling you how much they care about security, but when you look at where they invest, feature velocity always seems to outpace investment in security approaches. And take a look right now at the hype we're seeing across the board when it comes to generative AI. People are excited about the capabilities and security is a distant afterthought around an awful lot of these things. I don't know how you drive a broader awareness of this in a way that sticks, but clearly, we haven't collectively found it yet. Adnan: Yeah, it's definitely a concern. When you see things on—like for example, you can look at GitHub's roadmap, and there's, like, a feature there that's, oh, automatic AI-based pull request handling. Okay, so does that mean one day, you'll have a GitHub-powered LLM just approve PRs based on whether it determines that it's a good improvement or not? Like, obviously, that's not something that's the case now, but looking forward to maybe five, six years in the future, in the pursuit of that ever-increasing velocity, could you ever have a situation where actual code contributions are reviewed fully by AI and then approved and merged? Like yeah, that's scary because now you have a threat actor that could potentially specifically tailor contributions to trick the AI into thinking they're great, but then it could turn around and be a backdoor that's being added to the code. Obviously, that's very far in the future and I'm sure a lot of things will happen before that, but it starts to make you wonder, like, if things are heading that way.
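The reachability triage Adnan mentions can be approximated crudely for a Python codebase by asking whether a flagged package is ever imported at all before escalating the alert. Real tools build call graphs; this is only a grep-level sketch, and the assumption that module names match package names, along with the example package names themselves, is hypothetical.

```python
"""Sketch: deprioritize dependency alerts for packages that are never imported.

A crude stand-in for real reachability analysis: given packages flagged
by an SBOM- or Dependabot-style scanner, report whether they appear in
any import statement. Assumes module names match package names, which
is frequently untrue in practice.
"""
import re
from pathlib import Path

IMPORT_RE = re.compile(r"^\s*(?:from|import)\s+([A-Za-z_]\w*)", re.MULTILINE)

def imported_modules(root: str) -> set[str]:
    found: set[str] = set()
    for py in Path(root).rglob("*.py"):
        found |= set(IMPORT_RE.findall(py.read_text(errors="replace")))
    return found

def triage(flagged_packages: list[str], root: str = ".") -> None:
    used = imported_modules(root)
    for pkg in flagged_packages:
        verdict = "reachable, prioritize" if pkg in used else "never imported, deprioritize"
        print(f"{pkg}: {verdict}")

triage(["requests", "some-flagged-package"])  # hypothetical alert list
```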
Or will people realize that you need to look at security at every step of the way instead of just thinking that these newer AI systems can just handle everything?Corey: Let's pivot a little bit and talk about your day job. You're a lead security engineer at what I believe to be a security-focused consultancy. Or—Adnan: Yeah.Corey: If you're not a SaaS product. Everything seems to become a SaaS product in the fullness of time. What's your day job look like?Adnan: Yeah, so I'm a security engineer on Praetorian's red team. And my day-to-day, I'll kind of switch between application security and red-teaming. And that kind of gives me the opportunity to, kind of, test out newer things out in the field, but then also go and do more traditional application security assessments and code reviews, and reverse engineering to kind of break up the pace of work. Because red-teaming can be very fast and fast-paced and exciting, but sometimes, you know, that can lead to some pretty late nights. But that's just the nature of being on a red team [laugh].Corey: It feels like as soon as I get into the security space and start talking to cloud companies, they get a lot more defensive than when I'm making fun of, you know, bad service naming or APIs that don't make a whole lot of sense. It feels like companies have a certain sensitivity around the security space that applies to almost nothing else. Do you find, as a result, that a lot of the times when you're having conversations with companies and they figure out that, oh, you're a red team for a security researcher, oh, suddenly, we're not going to talk to you the way we otherwise might. We thought you were a customer, but nope, you can just go away now.Adnan: [laugh]. I personally haven't had that experience with cloud companies. I don't know if I've really tried to buy a lot. You know, I'm… if I ever buy some infrastructure from cloud companies as an individual, I just kind of sign up and put in my credit card. And, you know, they just, like, oh—you know, they just take my money. So, I don't really think I haven't really, personally run into anything like that yet [laugh].Corey: Yeah, I'm curious to know how that winds up playing out in some of these, I guess, more strategic, larger company environments. I don't get to see that because I'm basically a tiny company that dabbles in security whenever I stumble across something, but it's not my primary function. I just worry on some level one of these days, I'm going to wind up accidentally dropping a zero-day on Twitter or something like that, and suddenly, everyone's going to come after me with the knives. I feel like [laugh] at some point, it's just going to be a matter of time.Adnan: Yeah. I think when it comes to disclosing things and talking about techniques, the key thing here is that a lot of the things that I'm talking about, a lot of the things that I'll be talking about in some blog posts that have coming out, this is stuff that these companies are seeing themselves. Like, they recognize that these are security issues that people are introducing into code. They encourage people to not make these mistakes, but when it's buried in four links deep of documentation and developers are tight on time and aren't digging through their security documentation, they're just looking at what works, getting it to work and moving on, that's where the issue is. So, you know, from a perspective of raising awareness, I don't feel bad if I'm talking about something that the company itself agrees is a problem. 
It's just a lot of the times, their own engineers don't follow their own recommendations.Corey: Yeah, I have opinions on these things and unfortunately, it feels like I tend to learn them in some of the more unfortunate ways of, oh, yeah, I really shouldn't care about this thing, but I only learned what the norm is after I've already done something. This is, I think, the problem inherent to being small and independent the way that I tend to be. We don't have enough people here for there to be a dedicated red team and research environment, for example. Like, I tend to bleed over a little bit into a whole bunch of different things. We'll find out. So far, I've managed to avoid getting it too terribly wrong, but I'm sure it's just a matter of time.So, one area that I think seems to be a way that people try to avoid cloud issues is oh, I read about that in the last in-flight magazine that I had in front of me, and the cloud is super insecure, so we're going to get around all that by running our own infrastructure ourselves, from either a CI/CD perspective or something else. Does that work when it comes to this sort of problem?Adnan: Yeah, glad you asked about that. So, we've also seen open-s—companies that have large open-source presence on GitHub just opt to have self-hosted Github Actions runners, and that opens up a whole different Pandora's box of attacks that an attacker could take advantage of, and it's only there because they're using that kind of runner. So, the default GitHub Actions runner, it's just an agent that runs on a machine, it checks in with GitHub Actions, it pulls down builds, runs them, and then it waits for another build. So, these are—the default state is a non-ephemeral runner with the ability to fork off tasks that can run in the background. So, when you have a public repository that has a self-hosted runner attached to it, it could be at the organization level or it could be at the repository level.What an attacker can just do is create a pull request, modify the pull request to run on a self-hosted runner, write whatever they want in the pull request workflow, create a pull request, and now as long as they were a previous contributor, meaning you fixed a typo, you… that could be a such a, you know, a single character typo change could even cause that, or made a small contribution, now they create the pull request. The arbitrary job that they wrote is now picked up by that self-hosted runner. They can fork off it, process it to run in the background, and then that just continues to run, the job finishes, their pull request, they'll just—they close it. Business as usual, but now they've got an implant on the self-hosted runner. And if the runners are non-ephemeral, it's very hard to completely lock that down.And that's something that I've seen, there's quite a bit of that on GitHub where—and you can identify it just by looking at the run logs. And that's kind of comes from people saying, “Oh, let's just self-host our runners,” but they also don't configure that properly. And that opens them up to not only tampering with their repositories, stealing secrets, but now depending on where your runner is, now you potentially could be giving an attacker a foothold in your cloud environment.Corey: Yeah, that seems like it's generally a bad thing. 
I found that cloud tends to be more secure than running it yourself in almost every case, with the exception that once someone finds a way to break into it, there's suddenly a lot more eggs in a very large, albeit more secure, basket. So, it feels like it's a consistent trade-off. But as time goes on, it feels like it is less and less defensible, I think, to wind up picking out an on-prem strategy from a pure security point of view. I mean, there are reasons to do it. I'm just not sure. Adnan: Yeah. And I think the distinction to be made there, in particular with CI/CD runners, is there's full cloud, meaning you let your CI/CD provider host your infrastructure as well; there's kind of that hybrid approach you mentioned, where you're using a CI/CD provider, but then you're bringing your own cloud infrastructure that you think you could secure better; or you have your runners sitting in vCenter in your own data center. And all of those could end up being equally vulnerable, whether the runner is in your cloud or in your data center, if you're not segmenting builds properly. And that's the core issue when you have a self-hosted runner: if it's not ephemeral, it's very hard to cut off all attack paths. There's always something an attacker can do to tamper with another build that'll have some kind of security impact. You need to just completely isolate your builds, and that's essentially what you see in a lot of these newer guidance documents like the [unintelligible 00:30:04] framework; that's kind of the core recommendation of it, like, one build, one clean runner. Corey: Yeah, that seems to be the common wisdom. I've been doing a lot of work with my own self-hosted runners that run inside of Lambda. Definitionally those are, of course, ephemeral. And there's a state machine that winds up handling that and screams bloody murder if there's a problem with it. So far, crossing fingers hoping it works out well. And I have it bound to a very limited set of role permissions and, of course, its own account to constrain blast radius. But there's still—there are no guarantees in this. The reason I build it the way I do is that, all right, worst case someone can get access to this. The only thing they're going to have the ability to do is, frankly, run up my AWS bill, which is an area I have some small amount of experience with. Adnan: [laugh]. Yeah, yeah, that's always kind of the core thing where if you get into someone's cloud, like, well, just sit there and use their compute resources [laugh]. Corey: Exactly. I kind of miss when that was the worst failure mode you had for these things. Adnan: [laugh]. Corey: I really want to thank you for taking the time to speak with me today. If people want to learn more, where's the best place for them to find you? Adnan: I do have a Twitter account. Well, I guess you can't call it Twitter anymore, but, uh—Corey: Watch me. Sure I can. Adnan: [laugh]. Yeah, so I'm on Twitter, and it's @adnanthekhan. So, it's like my first name with ‘the' and then K-H-A-N because, you know, my full name probably got taken up, like, years before I ever made a Twitter account. So, occasionally I tweet about GitHub Actions there. And on Praetorian's website, I've got a couple of blog posts. I have one—the one that really goes in-depth talking about the two Microsoft repository pull request attacks, and a couple other ones that are disclosed, will hopefully drop on the twenty—what is that, Tuesday?
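Corey's "worst case, they run up my bill" posture only holds if the role the runner assumes is tightly scoped inside its own account. A toy blast-radius check over an IAM policy document might look like this; the inlined policy is hypothetical, and wildcard spotting is obviously not a substitute for real policy analysis.

```python
"""Toy blast-radius check for the IAM policy attached to a CI runner's role.

The policy below is a hypothetical example for illustration. The check
only looks for wildcard actions and resources, the quickest way for
'they can only run up my bill' to become 'they own the account'.
"""

RUNNER_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:PutObject"], "Resource": "arn:aws:s3:::build-artifacts/*"},
        {"Effect": "Allow", "Action": "*", "Resource": "*"},  # the kind of statement to catch
    ],
}

def wildcard_findings(policy: dict) -> list[str]:
    findings = []
    for i, stmt in enumerate(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        actions = [actions] if isinstance(actions, str) else list(actions)
        resources = stmt.get("Resource", [])
        resources = [resources] if isinstance(resources, str) else list(resources)
        if "*" in actions or any(a.endswith(":*") for a in actions):
            findings.append(f"statement {i}: wildcard action {actions}")
        if "*" in resources:
            findings.append(f"statement {i}: wildcard resource")
    return findings

for finding in wildcard_findings(RUNNER_POLICY):
    print(f"[!] {finding}")
```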
That's going to be the… that's the 26th. So, it should be airing on the Praetorian blog then. So, if you—Corey: Excellent. It should be out by the time this is published, so we will, of course, put a link to that in the [show notes 00:32:01]. Thank you so much for taking the time to speak with me today. I appreciate it.Adnan: Likewise. Thank you so much, Corey.Corey: Adnan Khan, lead security engineer at Praetorian. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that's probably going to be because your podcast platform of choice is somehow GitHub Actions.Adnan: [laugh].Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
When Data is Your Brand and Your Job with Joe Karlsson

Screaming in the Cloud

Play Episode Listen Later Oct 12, 2023 33:42


Joe Karlsson, Data Engineer at Tinybird, joins Corey on Screaming in the Cloud to discuss what it's like working in the world of data right now and how he manages the overlap between his social media presence and career. Corey and Joe chat about the rise of AI and whether or not we're truly seeing advancements in that realm or just trendy marketing plays, and Joe shares why he feels data is getting a lot more attention these days and what it's like to work in data at this time. Joe also shares insights into how his mental health has been impacted by having a career and social media presence that overlaps, and what steps he's taken to mitigate the negative impact. About JoeJoe Karlsson (He/They) is a Software Engineer turned Developer Advocate at Tinybird. He empowers developers to think creatively when building data intensive applications through demos, blogs, videos, or whatever else developers need.Joe's career has taken him from building out database best practices and demos for MongoDB, architecting and building one of the largest eCommerce websites in North America at Best Buy, and teaching at one of the most highly-rated software development boot camps on Earth. Joe is also a TEDx Speaker, film buff, and avid TikToker and Tweeter.Links Referenced: Tinybird: https://www.tinybird.co/ Personal website: https://joekarlsson.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Are you navigating the complex web of API management, microservices, and Kubernetes in your organization? Solo.io is here to be your guide to connectivity in the cloud-native universe!Solo.io, the powerhouse behind Istio, is revolutionizing cloud-native application networking. They brought you Gloo Gateway, the lightweight and ultra-fast gateway built for modern API management, and Gloo Mesh Core, a necessary step to secure, support, and operate your Istio environment.Why struggle with the nuts and bolts of infrastructure when you can focus on what truly matters - your application. Solo.io's got your back with networking for applications, not infrastructure. Embrace zero trust security, GitOps automation, and seamless multi-cloud networking, all with Solo.io.And here's the real game-changer: a common interface for every connection, in every direction, all with one API. It's the future of connectivity, and it's called Gloo by Solo.io.DevOps and Platform Engineers, your journey to a seamless cloud-native experience starts here. Visit solo.io/screaminginthecloud today and level up your networking game.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn and I am joined today by someone from well, we'll call it the other side of the tracks, if I can—Joe: [laugh].Corey: —be blunt and disrespectful. Joe Karlsson is a data engineer at Tinybird, but I really got to know who he is by consistently seeing his content injected almost against my will over on the TikToks. Joe, how are you?Joe: I'm doing so well and I'm so sorry for anything I've forced down your throat online. Thanks for having me, though.Corey: Oh, it's always a pleasure to talk to you. No, the problem I've got with it is that when I'm in TikTok mode, I don't want to think about computers anymore. 
I want to find inane content that I can just swipe six hours away without realizing it because that's how I roll.Joe: TikTok is too smart, though. I think it knows that you are doing a lot of stuff with computers and even if you keep swiping away, it's going to keep serving it up to you.Corey: For a long time, it had me pinned as a lesbian, which was interesting. Which I suppose—Joe: [laugh]. It happened to me, too.Corey: Makes sense because I follow a lot of women who are creators in comics and the rest, but I'm not interested in the thirst trap approach. So, it's like, “Mmm, this codes as lesbian.” Then they started showing me ads for ADHD, which I thought was really weird until I'm—oh right. I'm on TikTok. And then they started recommending people that I'm surprised was able to disambiguate until I realized these people have been at my house and using TikTok from my IP address, which probably is going to get someone murdered someday, but it's probably easy to wind up doing an IP address match.Joe: I feel like I have to, like, separate what is me and what is TikTok, like, trying to serve it up because I've been on lesbian TikTok, too, ADHD, autism, like TikTok. And, like, is this who I am? I don't know. [unintelligible 00:02:08] bring it to my therapist.Corey: You're learning so much about yourself based upon an algorithm. Kind of wild, isn't it?Joe: [laugh]. Yeah, I think we may be a little, like, neuro-spicy, but I think it might be a little overblown with what TikTok is trying to diagnose us with. So, it's always good to just keep it in check, you know?Corey: Oh, yes. So, let's see, what's been going on lately? We had Google Next, which I think the industry largely is taking not seriously enough. For years, it felt like a try-hard, me too version of re:Invent. And this year, it really feels like it's coming to its own. It is defining itself as something other than oh, us too.Joe: I totally agree. And that's where you and I ran into recently, too. I feel like post-Covid I'm still, like, running into people I met on the internet in real life, and yeah, I feel like, yeah, re:Invent and Google Next are, like, the big ones.I totally agree. It feels like—I mean, it's definitely, like, heavily inspired by it. And it still feels like it's a little sibling in some ways, but I do feel like it's one of the best conferences I've been to since, like, a pre-Covid 2019 AWS re:Invent, just in terms of, like… who was there. The energy, the vibes, I feel like people were, like, having fun. Yeah, I don't know, it was a great conference this year.Corey: Usually, I would go to Next in previous years because it was a great place to go to hang out with AWS customers. These days, it feels like it's significantly more than that. It's, everyone is using everything at large scale. I think that is something that is not fully understood. You talk to companies that are, like, Netflix, famously all in on AWS. Yeah, they have Google stuff, too.Everyone does. I have Google stuff. I have a few things in Azure, for God's sake. It's one of those areas where everything starts to diffuse throughout a company as soon as you hire employee number two. And that is, I think, the natural order of things. The challenge, of course, is the narrative people try and build around it.Joe: Yep. Oh, totally. Multi-cloud's been huge for you know, like, starting to move up. And it's impossible not to. It was interesting seeing, like, Google trying to differentiate itself from Azure and AWS. 
And, Corey, I feel like you'd probably agree with this, too, AI was like, definitely the big buzzword that kept trying to, like—Corey: Oh, God. Spare me. And I say that, as someone who likes AI, I think that there's a lot of neat stuff lurking around and value hiding within generative AI, but the sheer amount of hype around it—and frankly—some of the crypto bros have gone crashing into the space, make me want to distance myself from it as far as humanly possible, just because otherwise, I feel like I get lumped in with that set. And I don't want that.Joe: Yeah, I totally agree. I know it feels like it's hard right now to, like, remain ungrifty, but, like, still, like—trying—I mean, everyone's trying to just, like, hammer in an AI perspective into every product they have. And I feel like a lot of companies, like, still don't really have a good use case for it. You're still trying to, like, figure that out. We're seeing some cool stuff.Honestly, the hard part for me was trying to differentiate between people just, like, bragging about OpenAI API addition they added to the core product or, like, an actual thing that's, like, AI is at the center of what it actually does, you know what I mean? Everything felt like it's kind of like tacked on some sort of AI perspective to it.Corey: One of the things that really is getting to me is that you have these big companies—Google and Amazon most notably—talk about how oh, well, we've actually been working with AI for decades. At this point, they keep trying to push out how long it's been. It's like, “Okay, then not for nothing, then why does”—in Amazon's case—“why does Alexa suck? If you've been working on it for this long, why is it so bad at all the rest?” It feels like they're trying to sprint out with a bunch of services that very clearly were not conceptualized until Chat-Gippity's breakthrough.And now it's oh, yeah, we're there, too. Us, too. And they're pivoting all the marketing around something that, frankly, they haven't demonstrated excellence with. And I feel like they're leaving a lot of their existing value proposition completely in the dust. It's, your customers are not using you because of the speculative future, forward-looking AI things; it's because you are able to solve business problems today in ways that are not highly speculative and are well understood. That's not nothing and there needs to be more attention paid to that. And I feel like there's this collective marketing tripping over itself to wrap itself in hype that does them no services.Joe: I totally agree. I feel like honestly, just, like, a marketing perspective, I feel like it's distracting in a lot of ways. And I know it's hot and it's cool, but it's like, I think it's harder right now to, like, stay focused to what you're actually doing well, as opposed to, like, trying to tack on some AI thing. And maybe that's great. I don't know.Maybe that's—honestly, maybe you're seeing some traction there. I don't know. But I totally agree. I feel like everyone right now is, like, selling a future that we don't quite have yet. I don't know. I'm worried that what's going to happen again, is what happened back in the IBM Watson days where everyone starts making bold—over-promising too much with AI until we see another AI winter again.Corey: Oh, the subtext is always, we can't wait to fire our entire customer service department. That one—Joe: Yeah.Corey: Just thrills me.Joe: [laugh].Corey: It's like, no, we're just going to get rid of junior engineers and just have senior engineers. 
Yeah, where do you think those people come from, by the way? We aren't—they aren't just emerging fully formed from the forehead of some god somewhere. And we're also seeing this wild divergence from reality. Remember, I fix AWS bills for a living. I see very large companies, very large AWS spend.The majority of spend remains on EC2 across the board. So, we don't see a lot of attention paid to that at re:Invent, even though it's the lion's share of everything. When we do contract negotiations, we talk about generative AI plan and strategy, but no one's saying, oh, yeah, we're spending 100 million a year right now on AWS but we should commit 250 because of all this generative AI stuff we're getting into. It's all small-scale experimentation and seeing if there's value there. But that's a far cry from being the clear winner what everyone is doing.I'd further like to point out that I can tell that there's a hype cycle in place and I'm trying to be—and someone's trying to scam me. As soon as there's a sense of you have to get on this new emerging technology now, now, now, now, now. I didn't get heavily into cloud till 2016 or so and I seem to have done all right with that. Whenever someone is pushing you to get into an emerging thing where it hasn't settled down enough to build a curriculum yet, I feel like there's time to be cautious and see what the actual truth is. Someone's selling something; if you can't spot the sucker, chances are, it's you.Joe: [laugh]. Corey, have you thought about making an AI large language model that will help people with their cloud bills? Maybe just feed it, like, your invoices [laugh].Corey: That has been an example, I've used a number of times with a variety of different folks where if AI really is all it's cracked up to be, then the AWS billing system is very much a bounded problem space. There's a lot of nuance and intricacy to it, but it is a finite set of things. Sure, [unintelligible 00:08:56] space is big. So, training something within those constraints and within those confines feels like it would be a terrific proof-of-concept for a lot of these things. Except that when I've experimented a little bit and companies have raised rounds to throw into this, it never quite works out because there's always human context involved. The, oh yeah, we're going to wind up turning off all those idle instances, except they're in idle—by whatever metric you're using—for a reason. And the first time you take production down, you're not allowed to save money anymore.Joe: Nope. That's such a good point. I agree. I don't know about you, Corey. I've been fretting about my job and, like, what I'm doing. I write a lot, I do a lot of videos, I'm programming a lot, and I think… obviously, we've been hearing a lot about, you know, if it's going to replace us or not. I honestly have been feeling a lot better recently about my job stability here. I don't know. I totally agree with you. There's always that, like, human component that needs to get added to it. But who knows, maybe it's going to get better. Maybe there'll be an AI-automated billing management tool, but it'll never be as good as you, Corey. Maybe it will. I don't know. [laugh].Corey: It knows who I am. When I tell it to write in the style of me and give it a blog post topic and some points I want to make, almost everything it says is wrong. But what I'll do is I'll copy that into a text editor, mansplain-correct the robot for ten minutes, and suddenly I've got the bones of a decent rough draft because. 
And yeah, I'll wind up plagiarizing three or four words in a row at most, but that's okay. I'm plagiarizing the thing that's plagiarizing from me and there's a beautiful symmetry to that. What I don't understand is some of the outreach emails and other nonsensical stuff I'll see where people are letting unsupervised AI just write things under their name and sending it out to people. That is anathema to me.Joe: I totally agree. And it might work today, it might work tomorrow, but, like, it's just a matter of time before something blows up. Corey, I'm curious. Like, personally, how do you feel about being in the ChatGPT, like, brain? I don't know, is that flattering? Does that make you nervous at all?Corey: Not really because it doesn't get it in a bunch of ways. And that's okay. I found the same problem with people. In my time on Twitter, when I started live-tweet shitposting about things—as I tend to do as my first love language—people will often try and do exactly that. The problem that I run into is that, “The failure mode of ‘clever' is ‘asshole,'” as John Scalzi famously said, and as a direct result of that, people wind up being mean and getting it wrong in that direction.It's not that I'm better than they are. It's, I had a small enough following, and no one knew who I was in my mean years, and I realized I didn't feel great making people sad. So okay, you've got to continue to correct the nosedive. But it is perilous and it is difficult to understand the nuance. I think occasionally when I prompt it correctly, it comes up with some amazing connections between things that I wouldn't have seen, but that's not the same thing as letting it write something completely unfettered.Joe: Yeah, I totally agree. The nuance definitely gets lost. It may be able to get, like, the tone, but I think it misses a lot of details. That's interesting.Corey: And other people are defending it when that hallucinates. Like, yeah, I understand there are people that do the same thing, too. Yeah, the difference is, in many cases, lying to me and passing it off otherwise is a firing offense in a lot of places. Because if you're going to be 19 out of 20 times, you're correct, but 5% wrong, you're going to bluff, I can't trust anything you tell me.Joe: Yeah. It definitely, like, brings your, like—the whole model into question.Corey: Also, remember that my medium for artistic creation is often writing. And I think that, on some level, these AI models are doing the same things that we do. There are still turns of phrase that I use that I picked up floating around Usenet in the mid-90s. And I don't remember who said it or the exact context, but these words and phrases have entered my lexicon and I'll use them and I don't necessarily give credit to where the first person who said that joke 30 years ago. But it's a—that is how humans operate. We are influenced by different styles of writing and learn from the rest.Joe: True.Corey: That's a bit different than training something on someone's artistic back catalog from a painting perspective and then emulating it, including their signature in the corner. Okay, that's a bit much.Joe: [laugh]. I totally agree.Corey: So, we wind up looking right now at the rush that is going on for companies trying to internalize their use of enterprise AI, which is kind of terrifying, and it all seems to come back to data.Joe: Yes.Corey: You work in the data space. How are you seeing that unfold?Joe: Yeah, I do. I've been, like, making speculations about the future of AI and data forever. 
I've had dreams of tools I've wanted forever, and I… don't have them yet. I don't think they're quite ready yet. I don't know, we're seeing things like—tha—I think people are working on a lot of problems.For example, like, I want AI to auto-optimize my database. I want it to, like, make indexes for me. I want it to help me with queries or optimizing queries. We're seeing some of that. I'm not seeing anyone doing particularly well yet. I think it's up in the air.I feel like it could be coming though soon, but that's the thing, though, too, like, I mean, if you mess up a query, or, like, a… large language model hallucinates a really shitty query for you, that could break your whole system really quickly. I feel like there still needs to be, like, a human being in the middle of it to, like, kind of help.Corey: I saw a blog post recently that AWS put out gave an example that just hard-coded a credential into it. And they said, “Don't do this, but for demonstration purposes, this is how it works.” Well, that nuance gets lost when you use that for AI training and that's, I think, in part, where you start seeing a whole bunch of the insecure crap these things spit out.Joe: Yeah, I totally agree. Well, I thought the big thing I've seen, too, is, like, large language models typically don't have a secure option and you're—the answer is, like, help train the model itself later on. I don't know, I'm sure, like, a lot of teams don't want to have their most secret data end up public on a large language model at some point in the future. Which is, like, a huge issue right now.Corey: I think that what we're seeing is that you still need someone with expertise in a given area to review what this thing spits out. It's great at solving a lot of the busy work stuff, but you still need someone who's conversant with the concepts to look at it. And that is, I think, something that turns into a large-scale code review, where everyone else just tends to go, “Oh, okay. We're—do this with code review.” “Oh, how big is the diff?” “50,000 lines.” “Looks good to me.” Whereas, “Three lines.” “I'm going to criticize that thing with four pages of text.” People don't want to do the deep-dive stuff, and—when there's a huge giant project that hits. So, they won't. And it'll be fine, right up until it isn't.Joe: Corey, you and I know people and developers, do you think it's irresponsible to put out there an example of how to do something like that, even with, like, an asterisk? I feel like someone's going to still go out and try to do that and probably push that to production.Corey: Of course they are.Joe: [laugh].Corey: I've seen this with some of my own code. I had something on Docker Hub years ago with a container that was called ‘Terrible Ideas.' And I'm sure being used in, like—it was basically the environment I use for a talk I gave around Git, which makes sense. And because I don't want to reset all the repositories back to the way they came from with a bunch of old commands, I just want a constrained environment that will be the same every time I give the talk. Awesome.I'm sure it's probably being run in production at a bank somewhere because why wouldn't it be? That's people. That's life. You're not supposed to just copy and paste from Chat-Gippity. You're supposed to do that from Stack Overflow like the rest of us. Where do you think your existing code's coming from in a lot of these shops?Joe: Yep. No, I totally agree. Yeah, I don't know. 
It'll be interesting to see how this shakes out with, like, people going to doing this stuff, or how honest they're going to be about it, too. I'm sure it's happening. I'm sure people are tripping over themselves right now, [adding 00:16:12].Corey: Oh, yeah. But I think, on some level, you're going to see a lot more grift coming out of this stuff. When you start having things that look a little more personalized, you can use it for spam purposes, you can use it for, I'm just going to basically copy and paste what this says and wind up getting a job on Upwork or something that is way more than I could handle myself, but using this thing, I'm going to wind up coasting through. Caveat emptor is always the case on that.Joe: Yeah, I totally agree.Corey: I mean, it's easy for me to sit here and talk about ethics. I believe strongly in doing the right thing. But I'm also not worried about whether I'm able to make rent this month or put food on the table. That's a luxury. At some point, like, a lot of that strips away and you do what you have to do to survive. I don't necessarily begrudge people doing these things until it gets to a certain point of okay, now you're not doing this to stay alive anymore. You're doing this to basically seek rent.Joe: Yeah, I agree. Or just, like, capitalize on it. I do think this is less—like, the space is less grifty than the crypto space, but as we've seen over and over and over and over again, in tech, there's a such a fine line between, like, a genuinely great idea, and somebody taking advantage of it—and other people—with that idea.Corey: I think that's one of those sad areas where you're not going to be able to fix human nature, regardless of the technology stack you bring to bear.Joe: Yeah, I totally agree.Corey: So, what else are you seeing these days that interesting? What excites you? What do you see that isn't getting enough attention in the space?Joe: I don't know, I guess I'm in the data space, I'm… the thing I think I do see a lot of is huge interest in data. Data right now is the thing that's come up. Like, I don't—that's the thing that's training these models and everyone trying to figure out what to do with these data, all these massive databases, data lakes, whatever. I feel like everyone's, kind of like, taking a second look at all of this data they've been collecting for years and haven't really known what to do with it and trying to figure out either, like, if you can make a model out of that, if you try to, like… level it up, whatever. Corey, you and I were joking around recently—you've had a lot of data people on here recently, too—I feel like us data folks are just getting extra loud right now. Or maybe there's just the data spaces, that's where the action's at right now.I don't know, the markets are really weird. Who knows? But um, I feel like data right now is super valuable and more so than ever. And even still, like, I mean, we're seeing, like, companies freaking out, like, Twitter and Reddit freaking out about accessing their data and who's using it and how. I don't know, I feel like there's a lot of action going on there right now.Corey: I think that there's a significant push from the data folks where, for a long time data folks were DBAs—Joe: Yeah.Corey: —let's be direct. And that role has continued to evolve in a whole bunch of different ways. It's never been an area I've been particularly strong in. 
I am not great at algorithmic complexity, it turns out, you can saturate some beefy instances with just a little bit of data if your queries are all terrible. And if you're unlucky—as I tend to be—and have an aura of destroying things, great, you probably don't want to go and make that what you do.Joe: [laugh]. It's a really good point. I mean, I don't know about, like, if you blow up data at a company, you're probably going to be in big trouble. And especially the scale we're talking about with most companies these days, it's super easy to either take down a server or generate an insane bill off of some shitty query.Corey: Oh, when I was at Reach Local years and years ago—my first Linux admin job—when I broke the web server farm, it was amusing; when I broke part of the data warehouse, nobody was laughing.Joe: [laugh]. I wonder why.Corey: It was a good faith mistake and that's fair. It was a convoluted series of things that set up and honestly, the way the company and my boss responded to me at the time set the course of the rest of my career. But it was definitely something that got my attention. It scares me. I'm a big believer in backups as a direct result.Joe: Yeah. Here's the other thing, too. Actually, our company, Tinybird, is working on versioning with your data sources right now and treating your data sources like Git, but I feel like even still today, most companies are just run by some DBA. There's, like, Mike down the hall is the one responsible keeping their SQL servers online, keeping them rebooted, and like, they're manually updating any changes on there.And I feel like, generally speaking across the industry, we're not taking data seriously. Which is funny because I'm with you on there. Like, I get terrified touching production databases because I don't want anything bad to happen to them. But if we could, like, make it easier to rollback or, like, handle that stuff, that would be so much easier for me and make it, like, less scary to deal with it. I feel like databases and, like, treating it as, like, a serious DevOps practice is not really—I'm not seeing enough of it. It's definitely, people are definitely doing it. Just, I want more.Corey: It seems like with data, there's a lack of iterative approaches to it. A line that someone came up with when I was working with them a decade and change ago was that you can talk about agile all you want, but when it comes to payments, everyone's doing waterfall. And it feels like, on some level, data's kind of the same.Joe: Yeah. And I don't know, like, how to fix it. I think everyone's just too scared of it to really touch it. Migrating over to a different version control, trying to make it not as manual, trying to iterate on it better, I think it's just—I don't blame them. It's hard, it really takes a long time, making sure everything, like, doesn't blow up while you're doing a migration is a pain in the ass. But I feel like that would make everyone's lives so much easier if, like, you could, like, treat it—understand your data and be able to rollback easier with it.Corey: When you take a look across the ecosystem now, are you finding that things have improved since the last time I was in the space, where the state of the art was, “Oh, we need some developer data. 
We either have this sanitized data somewhere or it's a copy of production that we move around, but only a small bit.” Because otherwise, we always found that oh, that's an extra petabyte of storage was going on someone's developer environment they messed up on three years ago, they haven't been here for two, and oops.Joe: I don't. I have not seen it. Again, that's so tricky, too. I think… yeah, the last time I, like, worked doing that was—usually you just have a really crappy version of production data on staging or development environments and it's hard to copy those over. I think databases are getting better for that.I've been working on, like, the real-time data space for a long time now, so copying data over and kind of streaming that over is a lot easier. I do think seeing, like, separating storage and compute can make it easier, too. But it depends on your data stack. Everyone's using everything all the time and it's super complicated to do that. I don't know about you, Corey, too. I'm sure you've seen, like, services people running, but I feel like we've made a switch as an industry from, like, monoliths to microservices.Now, we're kind of back in the monolith era, but I'm not seeing that happen in the database space. We're seeing, like, data meshing and lots of different databases. I see people who, like, see the value of data monoliths, but I don't see any actual progress in moving back to a single source of [truth of the data 00:23:02]. And I feel like the cat's kind of out of the bag on all the data existing everywhere, all the time, and trying to wrangle that up.Corey: This stuff is hard and there's no easy solution here. There just isn't.Joe: Yeah, there's no way. And embracing that chaos, I think, is going to be huge. I think you have to do it right now. Or trying to find some tool that can, like, wrangle up a bunch of things together and help work with them all at once. Products need to meet people where they're at, too. And, like, data is all over the place and I feel like we kind of have to, like, find tooling that can kind of help work with what you have.Corey: It's a constant challenge, but also a joy, so we'll give it that.Joe: [laugh].Corey: So, I have to ask. Your day job has you doing developer advocacy at Tinybird—Joe: Yes.Corey: But I had to dig in to find that out. It wasn't obvious based upon the TikToks and the Twitter nonsense and the rest. How do you draw the line between day job and you as a person shitposting on the internet about technology?Joe: Corey, I'd be curious to hear your thoughts on this, too. I don't know. I feel like I've been in different places where, like, my job is my life. You know what I mean? There's a very thin line there. Personally, I've been trying to take a step back from that, just from a mental health perspective. Having my professional life be so closely tied to, like, my personal value and who I am has been really bad for my brain.And trying to make that clear at my company is, like, what is mine and what I can help with has been really huge. I feel like the boundaries between myself and my job has gotten too thin. And for a while, I thought that was a great idea; it turns out that was not a great idea for my brain. It's so hard. 
So, I've been a software engineer and I've done full-time developer advocacy, and I felt like I had a lot more freedom to say what I wanted as, like, a full-time software engineer as opposed to being a developer advocate and kind of representing the company.Because the thing is, I'm always representing the company [online 00:24:56], but I'm not always working, which is kind of like—that—it's kind of a hard line. I feel like there's been, like, ways to get around it though with, like, less private shitposting about things that could piss off a CEO or infringe on an NDA or, you know, whatever, you know what I mean? Yeah, trying to, like, find that balance or trying to, like, use tools to try to separate that has been big. But I don't know, I've been—personally, I've been trying to step—like, start trying to make more of a boundary for that.Corey: Yeah. I don't have much of one, but I also own the company, so my approach doesn't necessarily work for other people. I don't advertise in public that I fix AWS bills very often. That's not the undercurrent to most of my jokes and the rest. Because the people who have that painful problem aren't generally in the audience directly and they certainly don't talk about it extensively.It's word of mouth. It's being fun and engaging so people stick around. And when I periodically do mention it that sort of sticks with them. And in the fullness of time, it works as a way of, “Oh, yeah, everyone knows what you're into. And yeah, when we have this problem, reaching out to you is our first thought.” But I don't know that it's possible to measure its effectiveness. I just know that works.Joe: Yeah. For me, it's like, don't be an asshole and teach don't sell are like, the two biggest things that I'm trying to do all the time. And the goal is not to, like, trick people into, like, thinking I'm not working for a company. I think I try to be transparent, or if, like, I happen to be talking about a product that I'm working for, I try to disclose that. But yeah, I don't know. For me, it's just, like, trying to build up a community of people who, like, understand what I'm trying to put out there. You know what I mean?Corey: Yeah, it's about what you want to be known for, on some level. Part of the problem that I've had for a long time is that I've been pulled in so many directions. [They're 00:26:34] like, “Oh, you're great. Where do I go to learn more?” It's like, “Well, I have this podcast, I have the newsletter, I have the other podcast that I do in the AWS Morning Brief. I have the duckbillgroup.com. I have lastweekinaws.com. I have a Twitter account. I had a YouTube thing for a while.”It's like, there's so many different ways to send people. It's like, what is the top-of-funnel? And for me, my answer has been, sign up for the newsletter at lastweekinaws.com. That keeps you apprised of everything else and you can dial it into taste. It's also, frankly, one of those things that doesn't require algorithmic blessing to continue to show up in people's inboxes. So far at least, we haven't seen algorithms have a significant impact on that, except when they spam-bin something. And it turns out when you write content people like, the providers get yelled at by their customers of, “Hey, I'm trying to read this. What's going on?” I had a couple of reach out to me asking what the hell happened. It's kind of fun.Joe: I love that. And, Corey, I think that's so smart, too. 
It's definitely been a lesson, I think, for me and a lot of people on—that are terminally online that, like, we don't own our social following on other platforms. With, like, the downfall of Twitter, like, I'm still posting on there, but we still have a bunch of stuff on there, but my… that following is locked in. I can't take that home. But, like, you still have your email newsletter. And I even feel it for tech companies who might be listening to this, too. I feel like owning your email list is, like, not the coolest thing, but I feel like it's criminally underrated, as, like, a way of talking to people.Corey: It doesn't matter what platforms change, what my personal situation changes, I am—like, whatever it is that I wind up doing next, whenever next happens, I'll need a platform to tell people about, and that's what I've been building. I value newsletter subscribers in a metric sense far more highly and weight them more heavily than I do Twitter followers. Anyone can click a follow and then never check Twitter again. Easy enough. Newsletters? Well, that winds up requiring a little bit extra work because we do confirmed opt-ins, for obvious reasons.And we never sell the list. We never—you can't transfer permission for, like that, and we obviously respect it when people say I don't want to hear from your nonsense anymore. Great. Cool. I don't want to send this to people that don't care. Get out of here.Joe: [laugh]. No, I think that's so smart.Corey: Podcasts are impossible on the other end, but I also—you know, I control the domain and that's important to me.Joe: Yeah.Corey: Why don't you build this on top of Substack? Because as soon as Substack pivots, I'm screwed.Joe: Yeah, yeah. Which we've—I think we've seen that they've tried to do, even with the Twitter clone that tried to build last couple years. I've been burned by so many other publishing platforms over and over and over again through the years. Like, Medium, yeah, I criminally don't trust any sort of tech publishing platform anymore that I don't own. [laugh]. But I also don't want to maintain it. It's such a fine line. I just want to, like, maintain something without having to, like, maintain all the infrastructure all the time, and I don't think that exists and I don't really trust anything to help me with that.Corey: You can on some level, I mean, I wind up parking in the newsletter stuff over at ConvertKit. But I can—I have moved it twice already. I could move it again if I needed to. It's about controlling the domain. I have something that fires off once or twice a day that backs up the entire subscriber list somewhere.I don't want to build my own system, but I can also get that in an export form wherever I need it to go. Frankly, I view it as the most valuable asset that I have here because I can always find a way to turn relationships and an audience into money. I can't necessarily find a way to go the opposite direction of, well have money. Time to buy an audience. Doesn't work that way.Joe: [laugh]. No, I totally agree. You know what I do like, though, is Threads, which has kind of fallen off, but I do love the idea of their federated following [and be almost 00:30:02] like, unlock that a little bit. I do think that that's probably going to be the future. And I have to say, I just care as someone who, like, makes shit online. I don't think 98% of people don't really care about that future, but I do. 
Just getting burned so often on social media platforms, it helps to then have a little bit of flexibility there.Corey: Oh, yeah. And I wish it were different. I feel like, at some level, Elon being Elon has definitely caused a bit of a diaspora of social media and I think that's a good thing.Joe: Yeah. Yeah. I hope it settles down a little bit, but it definitely got things moving again.Corey: Oh, yes. I really want to thank you for taking the time to go through how you view these things. Where's the best place for people to go to follow you learn more, et cetera? Just sign up for TikTok and you'll be all over them, apparently.Joe: Go to the website that I own joekarlsson.com. It's got the links to everything on there. Opt in or out of whatever you find you want. Otherwise, I'm just going to quick plug for the company I work for: tinybird.co. If you're trying to make APIs on top of data, definitely want to check out Tinybird. We work with Kafka, BigQuery, S3, all the data sources could pull it in. [unintelligible 00:31:10] on it and publishes it as an API. It's super easy. Or you could just ignore me. That's fine, too. You could—that's highly encouraged as well.Corey: Always a good decision.Joe: [laugh]. Yeah, I agree. I'm biased, but I agree.Corey: Thanks, Joe. I appreciate your taking the time to speak with me and we'll, of course, put links to all that in the [show notes 00:31:26]. And please come back soon and regale us with more stories.Joe: I will. Thanks, Corey.Corey: Joe Karlsson, data engineer at Tinybird. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that I'll never read because they're going to have a disk problem and they haven't learned the lesson of backups yet.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Yeah, where do you think those people come from, by the way? We aren't—they aren't just emerging fully formed from the forehead of some god somewhere. And we're also seeing this wild divergence from reality. Remember, I fix AWS bills for a living. I see very large companies, very large AWS spend.The majority of spend remains on EC2 across the board. So, we don't see a lot of attention paid to that at re:Invent, even though it's the lion's share of everything. When we do contract negotiations, we talk about generative AI plan and strategy, but no one's saying, oh, yeah, we're spending 100 million a year right now on AWS but we should commit 250 because of all this generative AI stuff we're getting into. It's all small-scale experimentation and seeing if there's value there. But that's a far cry from being the clear winner what everyone is doing.I'd further like to point out that I can tell that there's a hype cycle in place and I'm trying to be—and someone's trying to scam me. As soon as there's a sense of you have to get on this new emerging technology now, now, now, now, now. I didn't get heavily into cloud till 2016 or so and I seem to have done all right with that. Whenever someone is pushing you to get into an emerging thing where it hasn't settled down enough to build a curriculum yet, I feel like there's time to be cautious and see what the actual truth is. Someone's selling something; if you can't spot the sucker, chances are, it's you.Joe: [laugh]. Corey, have you thought about making an AI large language model that will help people with their cloud bills? Maybe just feed it, like, your invoices [laugh].Corey: That has been an example, I've used a number of times with a variety of different folks where if AI really is all it's cracked up to be, then the AWS billing system is very much a bounded problem space. There's a lot of nuance and intricacy to it, but it is a finite set of things. Sure, [unintelligible 00:08:56] space is big. So, training something within those constraints and within those confines feels like it would be a terrific proof-of-concept for a lot of these things. Except that when I've experimented a little bit and companies have raised rounds to throw into this, it never quite works out because there's always human context involved. The, oh yeah, we're going to wind up turning off all those idle instances, except they're in idle—by whatever metric you're using—for a reason. And the first time you take production down, you're not allowed to save money anymore.Joe: Nope. That's such a good point. I agree. I don't know about you, Corey. I've been fretting about my job and, like, what I'm doing. I write a lot, I do a lot of videos, I'm programming a lot, and I think… obviously, we've been hearing a lot about, you know, if it's going to replace us or not. I honestly have been feeling a lot better recently about my job stability here. I don't know. I totally agree with you. There's always that, like, human component that needs to get added to it. But who knows, maybe it's going to get better. Maybe there'll be an AI-automated billing management tool, but it'll never be as good as you, Corey. Maybe it will. I don't know. [laugh].Corey: It knows who I am. When I tell it to write in the style of me and give it a blog post topic and some points I want to make, almost everything it says is wrong. But what I'll do is I'll copy that into a text editor, mansplain-correct the robot for ten minutes, and suddenly I've got the bones of a decent rough draft because. 
And yeah, I'll wind up plagiarizing three or four words in a row at most, but that's okay. I'm plagiarizing the thing that's plagiarizing from me and there's a beautiful symmetry to that. What I don't understand is some of the outreach emails and other nonsensical stuff I'll see where people are letting unsupervised AI just write things under their name and sending it out to people. That is anathema to me.Joe: I totally agree. And it might work today, it might work tomorrow, but, like, it's just a matter of time before something blows up. Corey, I'm curious. Like, personally, how do you feel about being in the ChatGPT, like, brain? I don't know, is that flattering? Does that make you nervous at all?Corey: Not really because it doesn't get it in a bunch of ways. And that's okay. I found the same problem with people. In my time on Twitter, when I started live-tweet shitposting about things—as I tend to do as my first love language—people will often try and do exactly that. The problem that I run into is that, “The failure mode of ‘clever' is ‘asshole,'” as John Scalzi famously said, and as a direct result of that, people wind up being mean and getting it wrong in that direction.It's not that I'm better than they are. It's, I had a small enough following, and no one knew who I was in my mean years, and I realized I didn't feel great making people sad. So okay, you've got to continue to correct the nosedive. But it is perilous and it is difficult to understand the nuance. I think occasionally when I prompt it correctly, it comes up with some amazing connections between things that I wouldn't have seen, but that's not the same thing as letting it write something completely unfettered.Joe: Yeah, I totally agree. The nuance definitely gets lost. It may be able to get, like, the tone, but I think it misses a lot of details. That's interesting.Corey: And other people are defending it when that hallucinates. Like, yeah, I understand there are people that do the same thing, too. Yeah, the difference is, in many cases, lying to me and passing it off otherwise is a firing offense in a lot of places. Because if you're going to be 19 out of 20 times, you're correct, but 5% wrong, you're going to bluff, I can't trust anything you tell me.Joe: Yeah. It definitely, like, brings your, like—the whole model into question.Corey: Also, remember that my medium for artistic creation is often writing. And I think that, on some level, these AI models are doing the same things that we do. There are still turns of phrase that I use that I picked up floating around Usenet in the mid-90s. And I don't remember who said it or the exact context, but these words and phrases have entered my lexicon and I'll use them and I don't necessarily give credit to where the first person who said that joke 30 years ago. But it's a—that is how humans operate. We are influenced by different styles of writing and learn from the rest.Joe: True.Corey: That's a bit different than training something on someone's artistic back catalog from a painting perspective and then emulating it, including their signature in the corner. Okay, that's a bit much.Joe: [laugh]. I totally agree.Corey: So, we wind up looking right now at the rush that is going on for companies trying to internalize their use of enterprise AI, which is kind of terrifying, and it all seems to come back to data.Joe: Yes.Corey: You work in the data space. How are you seeing that unfold?Joe: Yeah, I do. I've been, like, making speculations about the future of AI and data forever. 
I've had dreams of tools I've wanted forever, and I… don't have them yet. I don't think they're quite ready yet. I don't know, we're seeing things like—tha—I think people are working on a lot of problems.For example, like, I want AI to auto-optimize my database. I want it to, like, make indexes for me. I want it to help me with queries or optimizing queries. We're seeing some of that. I'm not seeing anyone doing particularly well yet. I think it's up in the air.I feel like it could be coming though soon, but that's the thing, though, too, like, I mean, if you mess up a query, or, like, a… large language model hallucinates a really shitty query for you, that could break your whole system really quickly. I feel like there still needs to be, like, a human being in the middle of it to, like, kind of help.Corey: I saw a blog post recently that AWS put out gave an example that just hard-coded a credential into it. And they said, “Don't do this, but for demonstration purposes, this is how it works.” Well, that nuance gets lost when you use that for AI training and that's, I think, in part, where you start seeing a whole bunch of the insecure crap these things spit out.Joe: Yeah, I totally agree. Well, I thought the big thing I've seen, too, is, like, large language models typically don't have a secure option and you're—the answer is, like, help train the model itself later on. I don't know, I'm sure, like, a lot of teams don't want to have their most secret data end up public on a large language model at some point in the future. Which is, like, a huge issue right now.Corey: I think that what we're seeing is that you still need someone with expertise in a given area to review what this thing spits out. It's great at solving a lot of the busy work stuff, but you still need someone who's conversant with the concepts to look at it. And that is, I think, something that turns into a large-scale code review, where everyone else just tends to go, “Oh, okay. We're—do this with code review.” “Oh, how big is the diff?” “50,000 lines.” “Looks good to me.” Whereas, “Three lines.” “I'm going to criticize that thing with four pages of text.” People don't want to do the deep-dive stuff, and—when there's a huge giant project that hits. So, they won't. And it'll be fine, right up until it isn't.Joe: Corey, you and I know people and developers, do you think it's irresponsible to put out there an example of how to do something like that, even with, like, an asterisk? I feel like someone's going to still go out and try to do that and probably push that to production.Corey: Of course they are.Joe: [laugh].Corey: I've seen this with some of my own code. I had something on Docker Hub years ago with a container that was called ‘Terrible Ideas.' And I'm sure being used in, like—it was basically the environment I use for a talk I gave around Git, which makes sense. And because I don't want to reset all the repositories back to the way they came from with a bunch of old commands, I just want a constrained environment that will be the same every time I give the talk. Awesome.I'm sure it's probably being run in production at a bank somewhere because why wouldn't it be? That's people. That's life. You're not supposed to just copy and paste from Chat-Gippity. You're supposed to do that from Stack Overflow like the rest of us. Where do you think your existing code's coming from in a lot of these shops?Joe: Yep. No, I totally agree. Yeah, I don't know. 
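The hard-coded credential example Corey mentions is worth spelling out, since it is exactly the kind of snippet that gets copied verbatim and later scraped into training data. A minimal sketch of the safer habit (the bucket name is made up; boto3 is the real AWS SDK for Python and will pick up credentials from its default chain):

    import boto3

    # Anti-pattern: a key baked into source code ends up in git history, CI logs,
    # and whatever corpus scrapes them next.
    # s3 = boto3.client(
    #     "s3",
    #     aws_access_key_id="AKIAEXAMPLEEXAMPLE",      # never commit real keys
    #     aws_secret_access_key="not-a-real-secret",
    # )

    # Safer: no credentials in code at all. boto3 falls back to environment
    # variables, the shared ~/.aws/credentials file, or an attached IAM role.
    s3 = boto3.client("s3")

    for obj in s3.list_objects_v2(Bucket="my-example-bucket").get("Contents", []):
        print(obj["Key"])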
It'll be interesting to see how this shakes out with, like, people going to doing this stuff, or how honest they're going to be about it, too. I'm sure it's happening. I'm sure people are tripping over themselves right now, [adding 00:16:12].Corey: Oh, yeah. But I think, on some level, you're going to see a lot more grift coming out of this stuff. When you start having things that look a little more personalized, you can use it for spam purposes, you can use it for, I'm just going to basically copy and paste what this says and wind up getting a job on Upwork or something that is way more than I could handle myself, but using this thing, I'm going to wind up coasting through. Caveat emptor is always the case on that.Joe: Yeah, I totally agree.Corey: I mean, it's easy for me to sit here and talk about ethics. I believe strongly in doing the right thing. But I'm also not worried about whether I'm able to make rent this month or put food on the table. That's a luxury. At some point, like, a lot of that strips away and you do what you have to do to survive. I don't necessarily begrudge people doing these things until it gets to a certain point of okay, now you're not doing this to stay alive anymore. You're doing this to basically seek rent.Joe: Yeah, I agree. Or just, like, capitalize on it. I do think this is less—like, the space is less grifty than the crypto space, but as we've seen over and over and over and over again, in tech, there's a such a fine line between, like, a genuinely great idea, and somebody taking advantage of it—and other people—with that idea.Corey: I think that's one of those sad areas where you're not going to be able to fix human nature, regardless of the technology stack you bring to bear.Joe: Yeah, I totally agree.[midroll 00:17:30]Corey: So, what else are you seeing these days that interesting? What excites you? What do you see that isn't getting enough attention in the space?Joe: I don't know, I guess I'm in the data space, I'm… the thing I think I do see a lot of is huge interest in data. Data right now is the thing that's come up. Like, I don't—that's the thing that's training these models and everyone trying to figure out what to do with these data, all these massive databases, data lakes, whatever. I feel like everyone's, kind of like, taking a second look at all of this data they've been collecting for years and haven't really known what to do with it and trying to figure out either, like, if you can make a model out of that, if you try to, like… level it up, whatever. Corey, you and I were joking around recently—you've had a lot of data people on here recently, too—I feel like us data folks are just getting extra loud right now. Or maybe there's just the data spaces, that's where the action's at right now.I don't know, the markets are really weird. Who knows? But um, I feel like data right now is super valuable and more so than ever. And even still, like, I mean, we're seeing, like, companies freaking out, like, Twitter and Reddit freaking out about accessing their data and who's using it and how. I don't know, I feel like there's a lot of action going on there right now.Corey: I think that there's a significant push from the data folks where, for a long time data folks were DBAs—Joe: Yeah.Corey: —let's be direct. And that role has continued to evolve in a whole bunch of different ways. It's never been an area I've been particularly strong in. 
I am not great at algorithmic complexity, it turns out, you can saturate some beefy instances with just a little bit of data if your queries are all terrible. And if you're unlucky—as I tend to be—and have an aura of destroying things, great, you probably don't want to go and make that what you do.Joe: [laugh]. It's a really good point. I mean, I don't know about, like, if you blow up data at a company, you're probably going to be in big trouble. And especially the scale we're talking about with most companies these days, it's super easy to either take down a server or generate an insane bill off of some shitty query.Corey: Oh, when I was at Reach Local years and years ago—my first Linux admin job—when I broke the web server farm, it was amusing; when I broke part of the data warehouse, nobody was laughing.Joe: [laugh]. I wonder why.Corey: It was a good faith mistake and that's fair. It was a convoluted series of things that set up and honestly, the way the company and my boss responded to me at the time set the course of the rest of my career. But it was definitely something that got my attention. It scares me. I'm a big believer in backups as a direct result.Joe: Yeah. Here's the other thing, too. Actually, our company, Tinybird, is working on versioning with your data sources right now and treating your data sources like Git, but I feel like even still today, most companies are just run by some DBA. There's, like, Mike down the hall is the one responsible keeping their SQL servers online, keeping them rebooted, and like, they're manually updating any changes on there.And I feel like, generally speaking across the industry, we're not taking data seriously. Which is funny because I'm with you on there. Like, I get terrified touching production databases because I don't want anything bad to happen to them. But if we could, like, make it easier to rollback or, like, handle that stuff, that would be so much easier for me and make it, like, less scary to deal with it. I feel like databases and, like, treating it as, like, a serious DevOps practice is not really—I'm not seeing enough of it. It's definitely, people are definitely doing it. Just, I want more.Corey: It seems like with data, there's a lack of iterative approaches to it. A line that someone came up with when I was working with them a decade and change ago was that you can talk about agile all you want, but when it comes to payments, everyone's doing waterfall. And it feels like, on some level, data's kind of the same.Joe: Yeah. And I don't know, like, how to fix it. I think everyone's just too scared of it to really touch it. Migrating over to a different version control, tr
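Corey's point above about saturating beefy instances with a tiny amount of data and terrible queries is easy to demonstrate on a laptop. A toy sketch using only the standard library (the table, column names, and row counts are invented for illustration): the same query goes from a full table scan to an index search once the index exists, which is roughly the difference between melting a database server and not noticing the query at all.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i % 1000, i * 0.5) for i in range(100_000)],
    )

    query = "SELECT SUM(total) FROM orders WHERE customer_id = ?"

    # Without an index the planner reports a full scan over every row.
    print(conn.execute(f"EXPLAIN QUERY PLAN {query}", (42,)).fetchall())

    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

    # With the index, the plan becomes a search on idx_orders_customer instead.
    print(conn.execute(f"EXPLAIN QUERY PLAN {query}", (42,)).fetchall())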

Screaming in the Cloud
Storytelling Over Feature Dumping with Jeff Geerling



Play Episode Listen Later Oct 10, 2023 36:00


Jeff Geerling, Owner of Midwestern Mac, joins Corey on Screaming in the Cloud to discuss the importance of storytelling, problem-solving, and community in the world of cloud. Jeff shares how and why he creates content that can appeal to anybody, rather than focusing solely on the technical qualifications of his audience, and how that strategy has paid off for him. Corey and Jeff also discuss the impact of leading with storytelling as opposed to features in product launches, and what's been going on in the Raspberry Pi space recently. Jeff also expresses the impact that community has on open-source companies, and reveals his take on the latest moves from Red Hat and Hashicorp. About JeffJeff is a father, author, developer, and maker. He is sometimes called "an inflammatory enigma".Links Referenced:Personal webpage: https://jeffgeerling.com/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. A bit off the beaten path of the usual cloud-focused content on this show, today I'm speaking with Jeff Geerling, YouTuber, author, content creator, enigma, and oh, so much more. Jeff, thanks for joining me.Jeff: Thanks for having me, Corey.Corey: So, it's hard to figure out where you start versus where you stop, but I do know that as I've been exploring a lot of building up my own home lab stuff, suddenly you are right at the top of every Google search that I wind up conducting. I was building my own Kubernete on top of a Turing Pi 2, and sure enough, your teardown was the first thing that I found that, to be direct, was well-documented, and made it understandable. And that's not the first time this year that that's happened to me. What do you do exactly?Jeff: I mean, I do everything. And I started off doing web design and then I figured that design is very, I don't know, once it started transitioning to everything being JavaScript, that was not my cup of tea. So, I got into back-end work, databases, and then I realized to make that stuff work well, you got to know the infrastructure. So, I got into that stuff. And then I realized, like, my home lab is a great place to experiment on this, so I got into Raspberry Pis, low-power computing efficiency, building your own home lab, all that kind of stuff.So, all along the way, with everything I do, I always, like, document everything like crazy. That's something my dad taught me. He's an engineer in radio. And he actually hired me for my first job, he had me write an IT operations manual for the Radio Group in St. Louis. And from that point forward, that's—I always start with documentation. So, I think that was probably what really triggered that whole series. It happens to me too; I search for something, I find my old articles or my own old projects on GitHub or blog posts because I just put everything out there.Corey: I was about to ask, years ago, I was advised by Scott Hanselman to—the third time I find myself explaining something, write a blog post about it because it's easier to refer people back to that thing than it is for me to try and reconstruct it on the fly, and I'll drop things here and there. 
And the trick is, of course, making sure it doesn't sound dismissive and like, “Oh, I wrote a thing. Go read.” Instead of having a conversation with people. But as a result, I'll be Googling how to do things from time to time and come up with my own content as a result.It's at least a half-step up from looking at forums and the rest, where I realized halfway through that I was the one asking the question. Like, “Oh, well, at least this is useful for someone.” And I, for better or worse, at least have a pattern of going back and answering how I solved a thing after I get there, just because otherwise, it's someone asked the question ten years ago and never returns, like, how did you solve it? What did you do? It's good to close that loop.Jeff: Yeah, and I think over 50% of what I do, I've done before. When you're setting up a Kubernetes cluster, there's certain parts of it that you're going to do every time. So, whatever's not automated or the tricky bits, I always document those things. Anything that is not in the readme, is not in the first few steps, because that will help me and will help others. I think that sometimes that's the best success I've found on YouTube is also just sharing an experience.And I think that's what separates some of the content that really drives growth on a YouTube channel or whatever, or for an organization doing it because you bring the experience, like, I'm a new person to this Home Assistant, for instance, which I use to automate things at my house. I had problems with it and I just shared those problems in my video, and that video has, you know, hundreds of thousands of views. Whereas these other people who know way more than I could ever know about Home Assistant, they're pulling in fewer views because they just get into a tutorial and don't have that perspective of a beginner or somebody that runs into an issue and how do you solve that issue.So, like I said, I mean, I just always share that stuff. Every time that I have an issue with anything technological, I put it on GitHub somewhere. And then eventually, if it's something that I can really formulate into an outline of what I did, I put a blog post up on my blog. I still, even though I write I don't know how many words per week that goes into my YouTube videos or into my books or anything, I still write two or three blog posts a week that are often pretty heavy into technical detail.Corey: One of the challenges I've always had is figuring out who exactly I'm storytelling for when I'm putting something out there. Because there's a plethora, at least in cloud, of beginner content of, here's how to think about cloud, here's what the service does, here's why you should use it et cetera, et cetera. And that's all well and good, but often the things that I'm focusing on presuppose a certain baseline level of knowledge that you should have going into this. If you're trying to figure out the best way to get some service configured, I probably shouldn't have to spend the first half of the article talking about what AWS is, as a for instance. And I think that inherently limits the size of the potential audience that would be interested in the content, but it's also the kind of stuff that I wish was out there.Jeff: Yeah. There's two sides to that, too. One is, you can make content that appeals to anybody, even if they have no clue what you're talking about, or you can make content that appeals to the narrow audience that knows the base level of understanding you need. 
So, a lot of times with—especially on my YouTube channel, I'll put things in that is just irrelevant to 99% of the population, but I get so many comments, like, “I have no clue what you said or what you're doing, but this looks really cool.” Like, “This is fun or interesting.” Just because, again, it's bringing that story into it.Because really, I think on a base level, a lot of programmers especially don't understand—and infrastructure engineers are off the deep end on this—they don't understand the interpersonal nature of what makes something good or not, what makes something relatable. And trying to bring that into technical documentation a lot of times is what differentiates a project. So, one of the products I love and use and recommend everywhere and have a book on—a best-selling book—is Ansible. And one of the things that brought me into it and has brought so many people is the documentation started—it's gotten a little bit more complex over the years—but it started out as, “Here's some problems. Here's how you solve them.”Here's, you know, things that we all run into, like how do you connect to 12 servers at the same time? How do you have groups of servers? Like, it showed you all these little examples. And then if you wanted to go deeper, there was more documentation linked out of that. But it was giving you real-world scenarios and doing it in a simple way. And it used some little easter eggs and fun things that made it more interesting, but I think that that's missing from a lot of technical discussion and a lot of technical documentation out there is that playfulness, that human side, the get from Point A to Point B and here's why and here's how, but here's a little interesting way to do it instead of just here's how it's done.Corey: In that same era, I was one of the very early developers behind SaltStack, and I think one of the reasons that Ansible won in the market was that when you started looking into SaltStack, it got wrapped around its own axle talking about how it uses ZeroMQ for a full mesh between all of the systems there, as long—sorry [unintelligible 00:07:39] mesh network that all routes—not really a mesh network at all—it talks through a single controller that then talks to all of its subordinate nodes. Great. That's awesome. How do I use this to install a web server, is the question that people had. And it was so in love with its own cleverness in some ways. Ansible was always much more approachable in that respect and I can't understate just how valuable that was for someone who just wants to get the problem solved.Jeff: Yeah. I also looked at something like NixOS. It's kind of like the arch of distributions of—Corey: You must be at least this smart to use it in some respects—Jeff: Yeah, it's—Corey: —has been the every documentation I've had with that.Jeff: [laugh]. There's, like, this level of pride in what it does, that doesn't get to ‘and it solves this problem.' You can get there, but you have to work through the barrier of, like, we're so much better, or—I don't know what—it's not that. Like, it's just it doesn't feel like, “You're new to this and here's how you can solve a problem today, right now.” It's more like, “We have this golden architecture and we want you to come up to it.” And it's like, well, but I'm not ready for that. I'm just this random developer trying to solve the problem.Corey: Right. 
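The "connect to a dozen servers at the same time" example that Jeff credits the early Ansible docs with is worth seeing spelled out. A sketch of doing it by hand in plain Python (the hostnames are placeholders and it assumes passwordless SSH is already configured) mostly shows how much plumbing a tool like Ansible hides behind an inventory file and a one-line ad-hoc command:

    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    # A "group" of twelve web servers, by hand. Hostnames are placeholders.
    WEBSERVERS = [f"web{i:02d}.example.com" for i in range(1, 13)]

    def run(host: str, command: str) -> str:
        """Run one command on one host over SSH and return a one-line summary."""
        result = subprocess.run(
            ["ssh", host, command],
            capture_output=True,
            text=True,
            timeout=30,
        )
        return f"{host}: {result.stdout.strip() or result.stderr.strip()}"

    # Fan the command out to all twelve hosts in parallel.
    with ThreadPoolExecutor(max_workers=len(WEBSERVERS)) as pool:
        for line in pool.map(lambda h: run(h, "uptime"), WEBSERVERS):
            print(line)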
Like, they should have someone hanging out in their IRC channel and just watch for a week of who comes in and what questions do they have when they're just getting started and address those. Oh, you want to wind up just building a Nix box EC2 for development? Great, here's how you do that, and here's how to think about your workflow as you go. Instead, I found that I had to piece it together from a bunch of different blog posts and the rest and each one supposed that I had different knowledge coming into it than the others. And I felt like I was getting tangled up very easily.Jeff: Yeah, and I think it's telling that a lot of people pick up new technology through blog posts and Substack and Medium and whatever [Tedium 00:09:19], all these different platforms because it's somebody that's solving a problem and relating that problem, and then you have the same problem. A lot of times in the documentation, they don't take that approach. They're more like, here's all our features and here's how to use each feature, but they don't take a problem-based approach. And again, I'm harping on Ansible here with how good the documentation was, but it took that approach is you have a bunch of servers, you want to manage them, you want to install stuff on them, and all the examples flowed from that. And then you could get deeper into the direct documentation of how things worked.As a polar opposite of that, in a community that I'm very much involved in still—well, not as much as I used to be—is Drupal. Their documentation was great for developers but not so great for beginners and that was always—it still is a difficulty in that community. And I think it's a difficulty in many, especially open-source communities where you're trying to build the community, get more people interested because that's where the great stuff comes from. It doesn't come from one corporation that controls it, it comes from the community of users who are passionate about it. And it's also tough because for something like Drupal, it gets more complex over time and the complexity kind of kills off the initial ability to think, like, wow, this is a great little thing and I can get into it and start using it.And a similar thing is happening with Ansible, I think. We were at when I got started, there were a couple hundred modules. Now there's, like, 4000 modules, or I don't know how many modules, and there's all these collections, and there's namespaces now, all these things that feel like Java overhead type things leaking into it. And that diminishes that ability for me to see, like, oh, this is my simple tool that solving these problems.Corey: I think that that is a lost art in the storytelling side of even cloud marketing, where they're so wrapped around how they do what they do that they forget, customers don't care. Customers care very much about their problem that they're trying to solve. If you have an answer for solving that problem, they're very interested. Otherwise, they do not care. That seems to be a missing gap.Jeff: I think, like, especially for AWS, Google, Azure cloud platforms, when they build their new services, sometimes you're, like, “And that's for who?” For some things, it's so specialized, like, Snowmobile from Amazon, like, there's only a couple customers on the planet in a given year that needs something like that. But it's a cool story, so it's great to put that into your presentation. 
But some other things, like, especially nowadays with AI, seems like everybody's throwing tons of AI stuff—spaghetti—at the wall, seeing what will stick and then that's how they're doing it. But that really muddies up everything.If you have a clear vision, like with Apple, they just had their presentation on the new iPhone and the new neural engine and stuff, they talk about, “We see your heart patterns and we tell you when your heart is having problems.” They don't talk about their AI features or anything. I think that leading with that story and saying, like, here's how we use this, here's how customers can build off of it, those stories are the ones that are impactful and make people remember, like, oh Apple is the company that saves people's lives by making watches that track their heart. People don't think that about Google, even though they might have the same feature. Google says we have all these 75 sensors in our thing and we have this great platform and Android and all that. But they don't lead with the story.And that's something where I think corporate Apple is better than some of the other organizations, no matter what the technology is. But I get that feeling a lot when I'm watching launches from Amazon and Google and all their big presentations. It seems like they're tech-heavy and they're driven by, like, “What could we do with this? What could you do with this new platform that we're building,” but not, “And this is what we did with this other platform,” kind of building up through that route.Corey: Something I've been meaning to ask someone who knows for a while, and you are very clearly one of those people, I spend a lot of time focusing on controlling cloud costs and I used to think that Managed NAT Gateways were very expensive. And then I saw the current going rates for Raspberries Pi. And that has been a whole new level of wild. I mean, you mentioned a few minutes ago that you use Home Assistant. I do too.But I was contrasting the price between a late model, Raspberry Pi 4—late model; it's three years old if this point of memory serves, maybe four—versus a used small form factor PC from HP, and the second was less expensive and far more capable. Yeah it drags a bit more power and it's a little bit larger on the shelf, but it was basically no contest. What has been going on in that space?Jeff: I think one of the big things is we're at a generational improvement with those small form-factor little, like, tiny-size almost [nook-sized 00:13:59] PCs that were used all over the place in corporate environments. I still—like every doctor's office you go to, every hospital, they have, like, a thousand of these things. So, every two or three or four years, however long it is on their contract, they just pop all those out the door and then you get an E-waste company that picks up a thousand of these boxes and they got to offload them. So, the nice thing is that it seems like a year or two ago, that really started accelerating to the point where the price was driven down below 100 bucks for a fully built-out little x86 Mini PC. 
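Since the trade-off in this comparison is purchase price against power draw, the arithmetic is worth doing once. A quick sketch using the rough wattages that come up in this conversation and an assumed electricity price (adjust the rate for your utility):

    # Annual running-cost difference between a used mini PC and a Raspberry Pi.
    # Wattages are the rough figures from the conversation; the price per kWh
    # is an assumption.
    PI_WATTS = 2.5        # Raspberry Pi 4, roughly two to three watts
    MINI_PC_WATTS = 7.0   # used x86 mini PC, roughly six to eight watts
    PRICE_PER_KWH = 0.15  # assumed US dollars per kilowatt-hour

    HOURS_PER_YEAR = 24 * 365

    def annual_cost(watts: float) -> float:
        """Cost of running a device continuously for a year at the assumed rate."""
        return watts / 1000 * HOURS_PER_YEAR * PRICE_PER_KWH

    extra = annual_cost(MINI_PC_WATTS) - annual_cost(PI_WATTS)
    print(f"Pi:         ${annual_cost(PI_WATTS):.2f} per year")
    print(f"Mini PC:    ${annual_cost(MINI_PC_WATTS):.2f} per year")
    print(f"Difference: ${extra:.2f} per year")  # about six dollars at these numbers

At those assumptions the gap is a few dollars a year, which is why the sub-hundred-dollar mini PC usually wins on total cost despite the higher draw.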
Sure, it's, you know, like you said, a few generations old and it pulls a little bit more power, usually six to eight watts at least, versus a Raspberry Pi at two to three watts, but especially for those of us in the US, electricity is not that expensive so adding two or three watts to your budget for a home lab computer is not that bad.The other part of that is, for the past two-and-a-half years because of the global chip shortages and because of the decisions that Raspberry Pi made, there were so few Raspberry Pis available that their prices shot up through the roof if you wanted to get one in any timely fashion. So, that finally is clearing up, although I went to the Micro Center near me yesterday, and they said that they have not had stock of Raspberry Pi 4s for, like, two months now. So, they're coming, but they're not distributed evenly everywhere. And still, the best answer, especially if you're going to run a lot of things on it, is probably to buy one of those little mini PCs if you're starting out a home lab.Or there's some other content creators who build little Kubernetes clusters with multiple mini PCs. Three of those stack up pretty nicely and they're still super quiet. I think they're great for home labs. I have two of them over on my shelf that I'm using for testing and one of them is actually in my rack. And I have another one on my desk here that I'm trying to set up for a five gigabit home router since I finally got fiber internet after years with cable and I'm still stuck on my old gigabit router.Corey: Yeah, I wound up switching to a Protectli, I think is what it's called for—it's one of those things I've installed pfSense on. Which, I'm an old FreeBSD hand and I haven't kept up with it, but that's okay. It feels like going back in time ten years, in some respects—Jeff: [laugh].Corey: —so all right. And I have a few others here and there for various things that I want locally. But invariably, I've had the WiFi controller; I've migrated that off. That lives on an EC2 box in Ohio now. And I do wind up embracing cloud services when I don't want it to go down and be consistently available, but for small stuff locally, I mean, I have an antenna on the roof doing an ADS-B receiver dance that's plugged into a Pi Zero.I have some backlogged stuff on this, but they've gotten expensive as alternatives have dropped in price significantly. But what I'm finding as I'm getting more into 3D printing and a lot of hobbyist maker tools out there, everything is built with the Raspberry Pi in mind; it has the mindshare. And yeah, I can get something with similar specs that are equivalent, but then I've got to do a whole bunch of other stuff as soon as it gets into controlling hardware via GPIO pins or whatnot. And I have to think about it very differently.Jeff: Yeah, and that's the tough thing. And that's the reason why Raspberry Pis, even though they're three years old, even though they're hard to get, they still are fetching—on the used market—way more than the original MSRP. It's just crazy. But the reason for that is the Raspberry Pi organization. 
And there's two: there's the Raspberry Pi Foundation that's goals are to increase educational computing and accessibility for computers for kids and learning and all that, then there's the Raspberry Pi trading company that makes the Raspberry Pis.The Trading Company has engineers who sit there 24/7 working on the software, working on the kernel drivers, working on hardware bugs, listening to people on the forums and in GitHub and everywhere, and they're all English-speaking people there—they're over in the UK—and they manufacture their own boards. So, there's a lot of things on top of that, even though they're using some silicons of Broadcom chips that are a little bit locked down and not completely open-source like some other chips might be, they're a phone number you could call if you need the support or there's a forum that has activity that you can get help in and their software that's supported. And there's a newer Linux kernel and the kernel is updated all the time. So, all those advantages mean you get a little package that will work, it'll sip two watts of power, sitting 24/7. It's reliable hardware.There's so many people that use it that it's so well tested that almost any problem you could ever run into, someone else has and there's a blog post or a forum post talking about it. And even though the hardware is not super powerful—it's three years old—you can add on a Coral TPU and do face recognition and object recognition. And throw in Frigate for Home Assistant to get notifications on your phone when your mom walks up to the door. There's so many things you can do with them and they're so flexible that they're still so valuable. I think that they really knocked it out of the park with that model, the Raspberry Pi 4, and the compute module 4, which is still impossible to get. I have not been able to buy one for two years now. Luckily, I bought 12 two-and-a-half years ago [laugh] otherwise I would be running out for all my projects that I do.Corey: Yeah. I got two at the moment and two empty slots in the Turing Pi 2, which I'll care more about if I can actually get the thing up and booted. But it presupposes you have a Windows computer or otherwise, ehh, watch this space; more coming. Great. Like, do I build a virtual machine on top of something else? It leads down the path super quickly of places I thought I'd escaped from.Jeff: Yeah, you know, outside of the Pi realm, that's the state of the communities. It's a lot of, like, figuring out your own things. I did a project—I don't know if you've heard of Mr. Beast—but we did a project for him that involves a hundred single-board computers. We couldn't find Raspberry Pi's so we had to use a different single-board computer that was available.And so, I bought an older one thinking, oh, this is, like, three or four years old—it's older than the Pi 4—and there must be enough support now. But still, there's, like, little rough edges everywhere I went and we ended up making them work, but it took us probably an extra 30 to 40 hours of development work to get those things running the same way as a Raspberry Pi. And that's just the way of things. There's so much opportunity.If one of these Chinese manufacturers that makes most of these things, if one of them decided, you know what? We're going to throw tons of money into building support for these things, get some English-speaking members of these forums to build up the community, all that stuff, I think that they could have a shot at Raspberry Pi's giant portion of the market. 
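Corey's GPIO comment a little earlier is the concrete version of those thirty to forty extra hours: on a Pi, toggling a pin is a few lines with gpiozero, a library that ships with Raspberry Pi OS, while on many other boards the first step is hunting down and often patching a vendor library. A minimal sketch (the pin number and timing are arbitrary choices for illustration):

    from time import sleep

    from gpiozero import LED  # ships with Raspberry Pi OS

    led = LED(17)  # BCM pin numbering; wire an LED or relay to GPIO 17

    # Blink whatever is attached: half a second on, half a second off.
    for _ in range(10):
        led.on()
        sleep(0.5)
        led.off()
        sleep(0.5)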
But so far, I haven't really seen that happen. So far, they're spamming hardware. And it's like, the hardware is awesome. These chips are great if you know how to deal with them and how to get the software running and how to deal with Linux issues, but if you don't, then they're not great because you might not even get the thing to boot.Corey: I want to harken back to something you said a minute ago, where there's value in having a community around something, where you can see everyone else has already encountered a problem like this. I think that folks who weren't around for the rise of cloud have no real insight into how difficult it used to be just getting servers into racks and everything up, and okay, they're identical, and seven of them are working, but that eighth one isn't for some strange reason. And you spend four hours troubleshooting what turns out to be a bad cable or something not seated properly and it's awful. Cloud got away from a lot of that nonsense. But it's important—at least to me—to not be Captain Edgecase, where if you pick some new cloud provider and Google for how to set up a load balancer and no one's done it before you, that's not great. Whereas if I'm googling now in the AWS realm and no one has done, the thing I'm trying to do, that should be something of a cautionary flag of maybe this isn't how most people go about approaching production. Really think twice about this.Jeff: Yep. Yeah, we ran into that on a project I was working on was using Magento—which I don't know if anybody listening uses Magento, but it's not fun—and we ran into some things where it's like, “We're doing this, and it says that they do this on their official supported platform, but I don't know how they are because the code just doesn't exist here.” So, we ran into some weird edge cases on AWS with some massive infrastructure for the databases, and I ran into scaling issues. But even there, there were forum posts in AWS here and there that had little nuggets that helped us to figure out a way to get around it. And like you say, that is a massive advantage for AWS.And we ran into an issue with, we were one of the first customers trying out the new Lambda functions for RDS—or I don't remember exactly what it was called initially—but we ended up not using that. But we ran into some of these issues and figured out we were the first customer running into this weird scaling thing when we had a certain size of database trying to use it with these Lambda calls. And eventually, they got those things solved, but with AWS, they've seen so many things and some other cloud providers haven't seen these things. So, when you have certain types of applications that need to scale in certain ways, that is so valuable and the community of users, the ability to pull from that community when you need to hire somebody in an emergency, like, we need somebody to help us get this project done and we're having this issue, you can find somebody that is, like, okay, I know how to get you from Point A to Point B and get this project out the door. You can't do that on certain platforms.And open-source projects, too. We've always had that problem in Drupal. The amount of developers who are deep into Drupal to help with the hard problems is not vast, so the ones who can do that stuff, they're all hired off and paid a handsome sum. And if you have those kinds of problems you realize, I either going to need to pay a ton of money or we're just going to have to not do that thing that we wanted to do. 
And that's tough.Corey: What I've found, sort of across the board, has been that there's a lot of, I guess, open-source community ethos that has bled into a lot of this space and I wanted to make sure that we have time to talk about this because I was incensed a while back when Red Hat decided, “Oh, you know that whole ten-year commitment on CentOS? That project that we acquired and are now basically stabbing in the face?”—disclosure. I used to be part of the CentOS project years ago when I was on network staff for the Freenode IRC network—then it was, “Oh yeah, we're just going to basically undermine our commitments to you and now you can pay us if you want to get that support there.” And that really set me off. Was nice to see you were right there as well in almost lockstep with me, pointing out that this is terrible, just as far as breaking promises you've made to customers. Has your anger cooled any? Because mine hasn't.Jeff: It has not. My temper has cooled. My anger has not. I don't think that they get it. After all the backlash that they got after that, I don't think that the VP-level folks at Red Hat understand that this is already impacting them and will impact them much more in the future because people like me and you, people who help other people build infrastructure and people who recommend operating systems and people who recommend patterns and things, we're just going to drop off using CentOS because it doesn't exist. It does exist and some other people are saying, “Oh, it's actually better to use this new CentOS, you know, Stream. Stream is amazing.” It's not. It's not the same thing. It's different. And—Corey: I used to work at a bank. That was not an option. I mean, granted at the bank for the production systems it was always [REL 00:25:18], but being able to spin up a pre-production environment without having to pay license fees on every VM. Yeah.Jeff: Yeah. And not only that, they did this announcement and framed it a certain way, and the community immediately saw. You know, I think that they're just angry about something, and whether it was a NASA contract with Rocky Linux, or whether it was something Oracle did, who knows, but it seems petty in retrospect, especially in comparison to the amount of backlash that came out of it. And I really don't think that they understand the thing that they had with that Red Hat Enterprise Linux is not a massive growth opportunity for Red Hat. It's, in some ways, a dying product in terms of compared to using cloud stuff, it doesn't matter.You could use CoreOS, you could use NixOS, and you could use anything, it doesn't really matter. For people like you and me, we just want to deploy our software. And if it's containers, it really doesn't matter. It's just the people in government or in certain organizations that have these roles that you have to use whatever FIPS and all that kind of stuff. So, it's not like it's a hyper-growth opportunity for them.CentOS was, like, the only reason why all the software, especially on the open-source side, was compatible with Red Hat because we could use CentOS and it was easy and simple. They took that—well, they tried to take that away and everybody's like, “That's—what are you doing?” Like, I posted my blog post and I think that sparked off quite a bit of consternation, to the point where there was a lot of personal stuff going on. 
I basically said, “I'm not supporting Red Hat Enterprise Linux for any of my work anymore.” Like, “From this point forward, it's not supported.”I'll support OpenELA, I'll support Rocky Linux or Oracle Linux or whatever because I can get free versions that I don't have to sign into a portal and get a license and download the license and integrate it with my CI work. I'm an open-source developer. I'm not going to pay for stuff or use 16 free licenses. Or I was reached out to and they said, “We'll give you more licenses. We'll give you extra.” And it's like, that's not how this works. Like, I don't have to call Debian and Ubuntu and [laugh] I don't even have to call Oracle to get licenses. I can just download their software and run it.So, you know, I don't think they understood the fact that they had that. And the bigger problem for me was the two-layer approach to destroying all the trust that the community had. First was in, I think it was 2019 when they said—we're in the middle of CentOS 8's release cycle—they said, “We're dropping CentOS 8. It's going to be Stream now.” And everybody was up in arms.And then Rocky Linux and [unintelligible 00:27:52] climbed in and gave us what we wanted: basically, CentOS. So, we're all happy and we had a status quo, and Rocky Linux 9 and [unintelligible 00:28:00] Linux nine came out after Red Hat 9, and the world was a happy place. And then they just dumped this thing on us and it's like, two major release cycles in a row, they did it again. Like, I don't know what this guy's thinking, but in one of the interviews, one of the Red Hat representatives said, “Well, we wanted to do this early in Red Hat 9's release cycle because people haven't started migrating.” It's like, well, I already did all my automation upgrades for CI to get all my stuff working in Rocky Linux 9 which was compatible with Red Hat Enterprise Linux 9. Am I not one of the people that's important to you?Like, who's important to you? Is it only the people who pay you money or is it also the people that empower your operating system to be a premier Enterprise Linux operating system? So, I don't know. You can tell. My anger has not died down. The amount of temper that I have about it has definitely diminished because I realize I'm talking at a wall a lot of times, when I'm having conversations on Twitter, private conversations and email, things like that.Corey: People come to argue; they don't come to actually have a discussion.Jeff: Yeah. I think that they just, they don't see the community aspect of it. They just see the business aspect. And the business aspect, if they want to figure out ways that they can get more people to pay them for their software, then maybe they should provide more value and not just cut off value streams. It doesn't make sense to me from a long-term business perspective.From a short term, maybe there were some clients who said, “Oh, shoot. We need this thing stable. We're going to pay for some more licenses.” But the engineers that those places are going to start making plans of, like, how do we make this not happen again. And the way to not make that happen, again is to use, maybe Ubuntu or maybe [unintelligible 00:29:38] or something. Who knows? But it's not going to be increasing our spend with Red Hat.Corey: That's what I think a lot of companies are missing when it comes to community as well, where it's not just a place to go to get support for whatever it is you're doing and it's not a place [where 00:29:57] these companies view prospective customers. 
There's more to it than that. There has to be a social undercurrent on this. I look at the communities I spend time in and in some of them dating back long enough, I've made lifelong significant friendships out of those places, just through talking about our lives, in addition to whatever the community is built around. You have to make space for that, and companies don't seem to fully understand that.Jeff: Yeah, I think that there's this thing that a community has to provide value and monetizable value, but I don't think that you get open-source if you think that that's what it is. I think some people in corporate open-source think that corporate open-source is a value stream opportunity. It's a funnel, it's something that is going to bring you more customers—like you say—but they don't realize that it's a community. It's like a group of people. It's friends, it's people who want to make the world a better place, it's people who want to support your company by wearing your t-shirt to conferences, people want to put on your red fedora because it's cool. Like, it's all of that. And when you lose some of that, you lose what makes your product differentiated from all the other ones on the market.Corey: That's what gets missed. I think that there's a goodwill aspect of it. People who have used the technology and understand its pitfalls are likelier to adopt it. I mean, if you tell me to get a website up and running, I am going to build an architecture that resembles what I've run before on providers that I've run on before because I know what the failure modes look like; I know how to get things up and running. If I'm in a hurry, trying to get something out the door, I'm going to choose the devil that I know, on some level.Don't piss me off as a community member and incentivize me to change that estimation the next time I've got something to build. Well, that doesn't show up on this quarter's numbers. Well, we have so little visibility into how decisions get made many companies that you'll never know that you have a detractor who's still salty about something you did five years ago and that's the reason the bank decided not to because that person called in their political favors to torpedo that deal and have a sweetheart offer from your competitor, et cetera and so on and so forth. It's hard to calculate the actual cost of alienating goodwill. But—Jeff: Yeah.Corey: I wish companies had a longer memory for these things.Jeff: Yeah. I mean, and thinking about that, like, there was also the HashiCorp incident where they kind of torpedoed all developer goodwill with their Terraform and other—Terraform especially, but also other products. Like, I probably, through my book and through my blog posts and my GitHub examples have brought in a lot of people into the HashiCorp ecosystem through Vagrant use, and through Packer and things like that. At this point, because of the way that they treated the open-source community with the license change, a guy like me is not going to be enthusiastic about it anymore and I'm going to—I already had started looking at alternatives for Vagrant because it doesn't mesh with modern infrastructure practices for local development as much, but now it's like that enthusiasm is completely gone. 
Like I had that goodwill, like you said earlier, and now I don't have that goodwill and I'm not going to spread that, I'm not going to advocate for them, I'm not going to wear their t-shirt [laugh], you know when I go out and about because it just doesn't feel as clean and cool and awesome to me as it did a month ago.And I don't know what the deal is. It's partly the economy, money's drying up, things like that, but I don't understand how the people at the top can't see these things. Maybe it's just their organization isn't set up to show the benefits from the engineers underneath, who I know some of these engineers are, like, “Yeah, I'm sorry. This was dumb. I still work here because I get a paycheck, but you know, I can't say anything on social media, but thank you for saying what you did on Twitter.” Or X.Corey: Yeah. It's nice being independent where you don't really have to fear the, well if I say this thing online, people might get mad at me and stop doing business with me or fire me. It's well, yeah, I mean, I would have to say something pretty controversial to drive away every client and every sponsor I've got at this point. And I don't generally have that type of failure mode when I get it wrong. I really want to thank you for taking the time to talk with me. If people want to learn more, where's the best place for them to find you?Jeff: Old school, my personal website, jeffgeerling.com. I link to everything from there, I have an About page with a link to every profile I've ever had, so check that out. It links to my books, my YouTube, all that kind of stuff.Corey: There's something to be said for picking a place to contact you that will last the rest of your career as opposed to, back in the olden days, my first email address was the one that my ISP gave me 25 years ago. I don't use that one anymore.Jeff: Yep.Corey: And having to tell everyone I corresponded with that it was changing was a pain in the butt. We'll definitely put a link to that one in the [show notes 00:34:44]. Thank you so much for taking the time to speak with me. I appreciate it.Jeff: Yeah, thanks. Thanks so much for having me.Corey: Jeff Geerling, YouTuber, author, content creator, and oh so very much more. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that we will, of course, read [in action 00:35:13], just as soon as your payment of compute modules for Raspberries Pi show up in a small unmarked bag.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Making a Difference Through Technology in the Public Sector with Dmitry Kagansky

Screaming in the Cloud

Play Episode Listen Later Oct 5, 2023 33:04


Dmitry Kagansky, State CTO and Deputy Executive Director for the Georgia Technology Authority, joins Corey on Screaming in the Cloud to discuss how he became the CTO for his home state and the nuances of working in the public sector. Dmitry describes his focus on security and reliability, and why they are both equally important when working with state government agencies. Corey and Dmitry describe AWS's infamous GovCloud, and Dmitry explains why he's employing a multi-cloud strategy but that it doesn't work for all government agencies. Dmitry also talks about how he's focusing on hiring and training for skills, and the collaborative approach he's taking to working with various state agencies.About DmitryMr. Kagansky joined GTA in 2021 from Amazon Web Services where he worked for over four years helping state agencies across the country in their cloud implementations and migrations.Prior to his time with AWS, he served as Executive Vice President of Development for Star2Star Communications, a cloud-based unified communications company. Previously, Mr. Kagansky was in many technical and leadership roles for different software vending companies. Most notably, he was Federal Chief Technology Officer for Quest Software, spending several years in Europe working with commercial and government customers.Mr. Kagansky holds a BBA in finance from Hofstra University and an MBA in management of information systems and operations management from the University of Georgia.Links Referenced: Twitter: https://twitter.com/dimikagi LinkedIn: https://www.linkedin.com/in/dimikagi/ GTA Website: https://gta.ga.gov TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: In the cloud, ideas turn into innovation at virtually limitless speed and scale. To secure innovation in the cloud, you need Runtime Insights to prioritize critical risks and stay ahead of unknown threats. What's Runtime Insights, you ask? Visit sysdig.com/screaming to learn more. That's S-Y-S-D-I-G.com/screaming.My thanks as well to Sysdig for sponsoring this ridiculous podcast.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Technical debt is one of those fun things that everyone gets to deal with, on some level. Today's guest apparently gets to deal with 235 years of technical debt. Dmitry Kagansky is the CTO of the state of Georgia. Dmitry, thank you for joining me.Dmitry: Corey, thank you very much for having me.Corey: So, I want to just begin here because this has caused confusion in my life; I can only imagine how much it's caused for you folks. We're talking Georgia the US state, not Georgia, the sovereign country?Dmitry: Yep. Exactly.Corey: Excellent. It's always good to triple-check those things because otherwise, I feel like the shipping costs are going to skyrocket in one way or the other. So, you have been doing a lot of very interesting things in the course of your career. You're former AWS, for example, you come from commercial life working in industry, and now it's yeah, I'm going to go work in state government. How did this happen?Dmitry: Yeah, I've actually been working with governments for quite a long time, both here and abroad. 
So, way back when, I've been federal CTO for software companies, I've done other work. And then even with AWS, I was working with state and local governments for about four, four-and-a-half years. But came to Georgia when the opportunity presented itself, really to try and make a difference in my own home state. You mentioned technical debt at the beginning and it's one of the things I'm hoping that helped the state pay down and get rid of some of it.Corey: It's fun because governments obviously are not thought of historically as being the early adopters, bleeding edge when it comes to technical innovation. And from where I sit, for good reason. You don't want code that got written late last night and shoved into production to control things like municipal infrastructure, for example. That stuff matters. Unlike a lot of other walks of life, you don't usually get to choose your government, and, “Oh, I don't like this one so I'm going to go for option B.”I mean you get to do at the ballot box, but that takes significant amounts of time. So, people want above all else—I suspect—their state services from an IT perspective to be stable, first and foremost. Does that align with how you think about these things? I mean, security, obviously, is a factor in that as well, but how do you see, I guess, the primary mandate of what you do?Dmitry: Yeah. I mean, security is obviously up there, but just as important is that reliance on reliability, right? People take time off of work to get driver's licenses, right, they go to different government agencies to get work done in the middle of their workday, and we've got to have systems available to them. We can't have them show up and say, “Yeah, come back in an hour because some system is rebooting.” And that's one of the things that we're trying to fix and trying to have fewer of, right?There's always going to be things that happen, but we're trying to really cut down the impact. One of the biggest things that we're doing is obviously a move to the cloud, but also segmenting out all of our agency applications so that agencies manage them separately. Today, my organization, Georgia Technology Authority—you'll hear me say GTA—we run what we call NADC, the North Atlanta Data Center, a pretty large-scale data center, lots of different agencies, app servers all sitting there running. And then a lot of times, you know, an impact to one could have an impact to many. And so, with the cloud, we get some partitioning and some segmentation where even if there is an outage—a term you'll often hear used that we can cut down on the blast radius, right, that we can limit the impact so that we affect the fewest number of constituents.Corey: So, I have to ask this question, and I understand it's loaded and people are going to have opinions with a capital O on it, but since you work for the state of Georgia, are you using GovCloud over in AWS-land?Dmitry: So… [sigh] we do have some footprint in GovCloud, but I actually spent time, even before coming to GTA, trying to talk agencies out of using it. I think there's a big misconception, right? People say, “I'm government. They called it GovCloud. Surely I need to be there.”But back when I was with AWS, you know, I would point-blank tell people that really I know it's called GovCloud, but it's just a poorly named region. There are some federal requirements that it meets; it was built around the ITAR, which is International Traffic of Arms Regulations, but states aren't in that business, right? 
They are dealing with HIPAA data, with various criminal justice data, and other things, but all of those things can run just fine on the commercial side. And truthfully, it's cheaper and easier to run on the commercial side. And that's one of the concerns I have is that if the commercial regions meet those requirements, is there a reason to go into GovCloud, just because you get some extra certifications? So, I still spend time trying to talk agencies out of going to GovCloud. Ultimately, the agencies with their apps make the choice of where they go, but we have been pretty good about reducing the footprint in GovCloud unless it's absolutely necessary.Corey: Has this always been the case? Because my distant recollection around all of this has been that originally when GovCloud first came out, it was a lot harder to run a whole bunch of workloads in commercial regions. And it feels like the commercial regions have really stepped up as far as what compliance boxes they check. So, is one of those stories where five or ten years ago, whenever it GovCloud first came out, there were a bunch of reasons to use it that no longer apply?Dmitry: I actually can't go past I'll say, seven or eight years, but certainly within the last eight years, there's not been a reason for state and local governments to use it. At the federal level, that's a different discussion, but for most governments that I worked with and work with now, the commercial regions have been just fine. They've met the compliance requirements, controls, and everything that's in place without having to go to the GovCloud region.Corey: Something I noticed that was strange to me about the whole GovCloud approach when I was at the most recent public sector summit that AWS threw is whenever I was talking to folks from AWS about GovCloud and adopting it and launching new workloads and the rest, unlike in almost any other scenario, they seemed that their first response—almost a knee jerk reflex—was to pass that work off to one of their partners. Now, on the commercial side, AWS will do that when it makes sense, and each one becomes a bit of a judgment call, but it just seemed like every time someone's doing something with GovCloud, “Oh, talk to Company X or Company Y.” And it wasn't just one or two companies; there were a bunch of them. Why is that?Dmitry: I think a lot of that is because of the limitations within GovCloud, right? So, when you look at anything that AWS rolls out, it almost always rolls out into either us-east-1 or us-west-2, right, one of those two regions, and it goes out worldwide. And then it comes out in GovCloud months, sometimes even years later. And in fact, sometimes there are features that never show up in GovCloud. So, there's not parity there, and I think what happens is, it's these partners that know what limitations GovCloud has and what things are missing and GovCloud they still have to work around.Like, I remember when I started with AWS back in 2016, right, there had been a new console, you know, the new skin that everyone's now familiar with. But that old console, if you remember that, that was in GovCloud for years afterwards. I mean, it took them at least two more years to get GovCloud to even look like the current commercial console that you see. 
So, it's things like that where I think AWS themselves want to keep moving forward and having to do anything with kind of that legacy platform that doesn't have all the bells and whistles is why they say, “Go get a partner [unintelligible 00:08:06] those things that aren't there yet.”Corey: That's it makes a fair bit of sense. What I was always wondering how much of this was tied to technical challenges working within those, and building solutions that don't depend upon things. “Oh, wait, that one's not available in GovCloud,” versus a lack of ability to navigate the acquisition process for a lot of governments natively in the same way that a lot of their customers can.Dmitry: Yeah, I don't think that's the case because even to get a GovCloud account, you have to start off with a commercial account, right? So, you actually have to go through the same purchasing steps and then essentially, click an extra button or two.Corey: Oh, I've done that myself already. I have a shitposting account and a—not kidding—Ministry of Shitposting GovCloud account. But that's also me just kicking the tires on it. As I went through the process, it really felt like everything was built around a bunch of unstated assumption—because of course you've worked within GovCloud before and you know where these things are. And I kept tripping into a variety of different aspects of that. I'm wondering how much of that is just due to the fact that partners are almost always the ones guiding customers through that.Dmitry: Yeah. It is almost always that. There's very few people, even in the AWS world, right, if you look at all the employees they have there, it's small subset that work with that environment, and probably an even smaller subset of those that understand what it's really needed for. So, this is where if there's not good understanding, you're better off handing it off to a partner. But I don't think it is the purchasing side of things. It really is the regulatory things and just having someone else sign off on a piece of paper, above and beyond just AWS themselves.Corey: I am curious, since it seems that people love to talk about multi-cloud in a variety of different ways, but I find there's a reality that, ehh, basically, on a long enough timeline, everyone uses everything, versus the idea of, “Oh, we're going to build everything so we can seamlessly flow from one provider to another.” Are you folks all in on AWS? Are you using a bunch of different cloud providers for different workloads? How are you approaching a cloud strategy?Dmitry: So, when you say ‘you guys,' I'll say—as AWS will always say—“It depends.” So, GTA is multi-cloud. We support AWS, we support OCI, we support Azure, and we are working towards getting Google in as well, GCP. However, on the agency side, I am encouraging agencies to pick a cloud. And part of that is because you do have limited staff, they are all different, right?They'll do similar things, but if it's done in a different way and you don't have people that know those little tips and tricks, kind of how to navigate certain cloud vendors, it just makes things more difficult. So, I always look at it as kind of the car analogy, right? Most people are not multi-car, right? You go you buy a car—Toyota, Ford, whatever it is—and you're committed to that thing for the next 4 or 5, 10 years, however long you own it, right? 
You may not like where the cupholder is or you need to get used to something, you know, being somewhere else, but you do commit to it.And I think it's the same thing with cloud that, you know, do you have to be in one cloud for the rest of your life? No, but know that you're not going to hop from cloud to cloud. No one really does. No one says, “Every six months, I'm going to go move my application from one cloud to another.” It's a pretty big lift and no one really needs to do that. Just find the one that's most comfortable for you.Corey: I assume that you have certain preferences as far as different cloud providers go. But I've found even in corporate life that, “Well, I like this company better than the other,” is generally not the best basis for making sweeping decisions around this. What frameworks do you give various departments to consider where a given workload should live? Like, how do you advise them to think about this?Dmitry: You know, it's funny, we actually had a call with an agency recently that said, “You know, we don't know cloud. What do you guys think we should do?” And it was for a very small, I don't want to call it workload; it was really for some DNS work that they wanted to do. And really came down to, for that size and scale, right, we're looking at a few dollars, maybe a month, they picked it based on the console, right? They liked one console over another.Not going to get into which cloud they picked, but we wound up them giving them a demo of here's what this looks like in these various cloud providers. And they picked that just because they liked the buttons and the layout of one console over another. Now, having said that, for obviously larger workloads, things that are more important, there is criteria. And in many cases, it's also the vendors. Probably about 60 to 70% of the applications we run are all vendor-provided in some way, and the vendors will often dictate platforms that they'll support over others, right?So, that supportability is important to us. Just like you were saying, no one wants code rolled out overnight and surprise all the constituents one day. We take our vendor relations pretty seriously and we take our cue from them. If we're buying software from someone and they say, “Look, this is better in AWS,” or, “This is better in OCI,” for whatever reasons they have, will go in that direction more often than not.Corey: I made a crack at the beginning of the episode where the state was founded 235 years ago, as of this recording. So, how accurate is that? I have to imagine that back in those days, they didn't really have a whole lot of computers, except probably something from IBM. How much technical debt are you folks actually wrestling with?Dmitry: It's pretty heavy. One of the biggest things we have is, we ourselves, in our data center, still have a mainframe. That mainframe is used for a lot of important work. Most notably, a lot of healthcare benefits are really distributed through that system. So, you're talking about federal partnerships, you're talking about, you know, insurance companies, health care providers, all somehow having—Corey: You're talking about things that absolutely, positively cannot break.Dmitry: Yep, exactly. We can't have outages, we can't have blips, and they've got to be accurate. So, even that sort of migration, right, that's not something that we can do overnight. 
It's something we've been working on for well over a year, and right now we're targeting probably roughly another year or so to get that fully migrated out. And even there, we're doing what would be considered a traditional lift-and-shift. We're going to mainframe emulation, we're not going cloud-native, we're not going to do a whole bunch of refactoring out of the gate. It's just picking up what's working and running and just moving it to a new venue.Corey: Did they finally build an AWS/400 that you can run that out? I didn't realize they had a mainframe emulation offering these days.Dmitry: They do. There's actually several providers that do it. And there's other agencies in the state that have made this sort of move as well, so we're also not even looking to be innovators in that respect, right? We're not going to be first movers to try that out. We'll have another agency make that move first and now we're doing this with our Department of Human Services.But yeah, a lot of technical debt around that platform. When you look at just the cost of operating these platforms, that mainframe costs the state roughly $15 million a year. We think in the cloud, it's going to wind up costing us somewhere between 3 to 4 million. Even if it's 5 million, that's still considerable savings over what we're paying for today. So, it's worth making that move, but it's still very deliberate, very slow, with a lot of testing along the way. But yeah, you're talking about that workload has been in the state, I want to say, for over 20, 25 years.Corey: So, what's the reason to move it? Because not for nothing, but there's an old—the old saw, “Well, don't fix it if it ain't broke.” Well, what's broke about it?Dmitry: Well, there's a couple of things. First off, the real estate that it takes up as an issue. It is a large machine sitting on a floor of a data center that we've got to consolidate to. We actually have some real estate constraints and we've got to cut down our footprint by next year, contractually, right? We've agreed, we're going to move into a smaller space.The other part is the technical talent. While yes, it's not broke, things are working on it, there are fewer and fewer people that can manage it. What we've found was doing a complete refactor while doing a move anywhere, is really too risky, right? Rewriting everything with a bunch of Lambdas is kind of scary, as well as moving it into another venue. So, there are mainframe emulators out there that will run in the cloud. We've gotten one and we're making this move now. So, we're going to do that lift-and-shift in and then look to refactor it piecemeal.Corey: Specifics are always going to determine, but as a general point, I felt like I am the only voice in the room sometimes advocating in favor of lift-and-shift. Because people say, “Oh, it's terrible for reasons X, Y, and Z.” It's, “Yes, all of your options are terrible and for the common case, this is the one that I have the sneaking suspicion, based upon my lived experience, is going to be the least bad have all of those various options.” Was there a thought given to doing a refactor in flight?Dmitry: So… from the time I got here, no. But I could tell you just having worked with the state even before coming in as CTO, there were constant conversations about a refactor. And the problem is, no one actually has an appetite for it. 
Everyone talks about it, but then when you say, “Look, there's a risk to doing this,”—right, governments are about minimizing risk—when you say, “Look, there's a risk to rewriting and moving code at the same time and it's going to take years longer,” right, that refactoring every time, I've seen an estimate, it would be as small as three years, as large as seven or eight years, depending on who was doing the estimate. Whereas the lift-and-shift, we're hoping we can get it done in two years, but even if it's two-and-a-half, it's still less than any of the estimates we've seen for a refactor and less risky. So, we're going with that model and we'll tinker and optimize later. But we just need to get out of that mainframe so that we can have more modern technology and more modern support.Corey: It seems like the right approach. I'm sorry, I didn't mean to frame that is quite as insulting as it might have come across. Like, “Did anyone consider other options just out of curi—” of course. Whenever you're making big changes, we're going to throw a dart at a whiteboard. It's not what appears to be Twitter's current product strategy we're talking about here. This is stuff that's very much measure twice, cut once.Dmitry: Yeah. Very much so. And you see that with just about everything we do here. I know, when the state, what now, three years ago, moved their tax system over to AWS, not only did they do two or three trial runs of just the data migration, we actually wound up doing six, right? You're talking about adding two months of testing just to make sure every time we did the data move, it was done correctly and all the data got moved over. I mean, government is very, very much about measure three, four times, cut once.Corey: Which is kind of the way you'd want it. One thing that I found curious whenever I've been talking to folks in the public sector space around things that they care about—and in years past, I periodically tried to, “Oh, should we look at doing some cost consulting for folks in this market?” And by and large, there have been a couple of exceptions, but—generally, in our experience with sovereign governments, more so than municipal or state ones—but saving money is not usually one of the top three things that governments care about when it comes to their AWS's state. Is cost something that's on your radar? And how do you conceptualize around this? And I should also disclose, this is not in any way, shape, or form intended to be a sales pitch.Dmitry: Yeah, no, cost actually, for GTA. Is a concern. But I think it's more around the way we're structured. I have worked with other governments where they say, “Look, we've already gotten an allotment of money. It costs whatever it costs and we're good with it.”With the way my organization is set up, though, we're not appropriated funds, meaning we're not given any tax dollars. We actually have to provide services to the agencies and they pay us for it. And so, my salary and everyone else's here, all the work that we do, is basically paid for by agencies and they do have a choice to leave. They could go find other providers. It doesn't have to be GTA always.So, cost is a consideration. But we're also finding that we can get those cost savings pretty easily with this move to the cloud because of the number of available tools that we now have available. We have—that data center I talked about, right? 
That data center is obviously locked down, secured, very limited access, you can't walk in, but that also prevents agencies from doing a lot of day-to-day work that now in the cloud, they can do on their own. And so, the savings are coming just from this move of not having to have as much locks away from the agency, but having more locks from the outside world as well, right? There's definitely scaling up in the number of tools that they have available to them to work around their applications that they didn't have before.Corey: It's, on some level, a capability story, I think, when it comes to cloud. But something I have heard from a number of folks is that even more so than in enterprises, budgets tend to be much more fixed things in the context of cloud in government. Often in enterprises, what you'll see is sprawl: someone leaves something running and oops, the bill wound up going up higher than we projected for this given period of time. When we start getting into the realm of government, that stops being a you broke budgeting policy and starts to resemble things that are called crimes. How do you wind up providing governance as a government around cloud usage to avoid, you know, someone going to prison over a Managed NAT Gateway?Dmitry: Yeah. So, we do have some pretty stringent monitoring. I know, even before the show, we talked about fact that we do have a separate security group. So, on that side of it, they are keeping an eye on what are people doing in the cloud. So, even though agencies now have more access to more tooling, they can do more, right, GTA hasn't stepped back from it and so, we're able to centrally manage things.We've put in a lot of controls. In fact, we're using Control Tower. We've got a lot of guardrails put in, even basic things like you can't run things outside of the US, right? We don't want you running things in the India region or anywhere in South America. Like, that's not even allowed, so we're able to block that off.And then we've got some pretty tight financial controls where we're watching the spend on a regular basis, agency by agency. Not enforcing any of it, obviously, agencies know what they're doing and it's their apps, but we do warn them of, “Hey, we're seeing this trend or that trend.” We've been at this now for about a year-and-a-half, and so agencies are starting to see that we provide more oversight and a lot less pressure, but at the same time, there's definitely a lot more collaboration assistance with one another.Corey: It really feels like the entire procurement model is shifted massively. As opposed to going out for a bunch of bids and doing all these other things, it's consumption-based. And that has been—I know for enterprises—a difficult pill for a lot of their procurement teams to wind up wrapping their heads around. I can only imagine what that must be like for things that are enshrined in law.Dmitry: Yeah, there's definitely been a shift, although it's not as big as you would think on that side because you do have cloud but then you also have managed services around cloud, right? So, you look at AWS, OCI, Azure, no one's out there putting a credit card down to open an environment anymore, you know, a tenant or an account. It is done through procurement rules. Like, we don't actually buy AWS directly from AWS; we go through a reseller, right, so there's some controls there as well from the procurement side. So, there's still a lot of oversight.But it is scary to some of our procurement people. 
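For readers curious what the region restriction Dmitry describes looks like in practice, the usual mechanism is a service control policy attached through AWS Organizations (Control Tower surfaces this as its Region deny control). The sketch below is only an illustration, not GTA's actual configuration: the allowed-region list, policy name, and OU ID are assumptions, and a production policy would exempt a longer list of global services.

```python
# Minimal sketch of a "US regions only" guardrail, in the spirit of the
# Control Tower region restriction described above. The OU ID, policy name,
# and allowed-region list are illustrative placeholders, not real values.
import json
import boto3

org = boto3.client("organizations")

ALLOWED_REGIONS = ["us-east-1", "us-east-2", "us-west-2"]  # assumption: US-only footprint

region_deny_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideAllowedRegions",
            "Effect": "Deny",
            # Global services (IAM, Organizations, Route 53, etc.) are normally
            # carved out via NotAction; the list is kept short here for illustration.
            "NotAction": [
                "iam:*",
                "organizations:*",
                "route53:*",
                "support:*",
                "budgets:*",
            ],
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"aws:RequestedRegion": ALLOWED_REGIONS}
            },
        }
    ],
}


def apply_region_guardrail(target_ou_id: str) -> str:
    """Create the SCP and attach it to an OU; returns the new policy ID."""
    policy = org.create_policy(
        Content=json.dumps(region_deny_scp),
        Description="Deny API calls outside approved US regions",
        Name="us-regions-only",
        Type="SERVICE_CONTROL_POLICY",
    )
    policy_id = policy["Policy"]["PolicySummary"]["Id"]
    org.attach_policy(PolicyId=policy_id, TargetId=target_ou_id)
    return policy_id


if __name__ == "__main__":
    # "ou-xxxx-xxxxxxxx" is a placeholder OU ID.
    print(apply_region_guardrail("ou-xxxx-xxxxxxxx"))
```

The point is simply that "you can't run things outside of the US" ends up as one policy object attached at the organization or OU level, rather than a setting each agency has to remember to configure per account.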
Like, AWS Marketplace is a very, very scary place for them, right? The fact that you can go and—you can hire people at Marketplace, you could buy things with a single button-click. So, we've gone out of our way, in my agency, to go through and lock that down to make sure that before anyone clicks one of those purchase buttons, that we at least know about it, they've made the request, and we have to go in and unlock that button for that purchase. So, we've got to put in more controls in some cases. But in other cases, it has made things easier.Corey: As you look across the landscape of effectively, what you're doing is uprooting an awful lot of technical systems that have been in place for decades at this point. And we look at cloud and I'm not saying it's not stable—far from it—but it also feels a little strange to be, effectively, making a similar timespan of commitment—because functionally a lot of us are—when we look at these platforms. Was that something that had already been a pre-existing appetite for when you started the role or is that something that you've found that you've had to socialize in the last couple years?Dmitry: It's a little bit of both. It's been lumpy, agency by agency, I'll say. There are some agencies that are raring to go, they want to make some changes, do a lot of good, so to speak, by upgrading their infrastructure. There are others that will sit and say, “Hey, I've been doing this for 20, 30 years. It's been fine.” That whole, “If it ain't broke, don't fix it,” mindset.So, for them, there's definitely been, you know, a lot more friction to get them going in that direction. But what I'm also finding is the people with their hands on the keyboards, right, the ones that are doing the work, are excited by this. This is something new for them. In addition to actually going to cloud, the other thing we've been doing is providing a lot of different training options. And so, that's something that's perked people up and definitely made them much more excited to come into work.I know, down at the, you know, the operator level, the administrators, the managers, all of those folks, are pretty pleased with the moves we're making. You do get some of the folks in upper management in the agencies that do say, “Look, this is a risk.” We're saying, “Look, it's a risk not to do this.” Right? You've also got to think about staffing and what people are willing to work on. Things like the mainframe, you know, you're not going to be able to hire those people much longer. They're going to be fewer and far between. So, you have to retool. I do tell people that, you know, if you don't like change, IT is probably not the industry to be in, even in government. You probably want to go somewhere else, then.Corey: That is sort of the next topic I want to get into, where companies across the board are finding it challenging to locate and source talent to work in their environments. How has the process of recruiting cloud talent gone for you?Dmitry: It's difficult. Not going to sugarcoat that. It's, it's—Corey: [laugh]. I'm not sure anyone would say otherwise, no matter where you are. You can pay absolutely insane, top-of-market money and still have that exact same response. No one says, “Oh, it's super easy.” Everyone finds it hard. But please continue [laugh].Dmitry: Yeah, but it's also not a problem that we can even afford to throw money at, right? So, that's not something that we'd ever do. 
But what I have found is that there's actually a lot of people, really, that I'll say are tech adjacent, that are interested in making that move. And so, for us, having a mentoring and training program that bring people in and get them comfortable with it is probably more important than finding the talent exactly as it is, right? If you look at our job descriptions that we put out there, we do want things like cloud certs and certain experience, but we'll drop off things like certain college requirements. Say, “Look, do you really need a college degree if you know what you're doing in the cloud or if you know what you're doing with a database and you can prove that?”So, it's re-evaluating who we're bringing in. And in some cases, can we also train someone, right, bring someone in for a lower rate, but willing to learn and then give them the experience, knowing that they may not be here for 15, 20 years and that's okay. But we've got to retool that model to say, we expect some attrition, but they walk away with some valuable skills and while they're here, they learn those skills, right? So, that's the payoff for them.Corey: I think that there's a lot of folks exploring that where there are people who have the interest and the aptitude that are looking to transition in. So, much of the discussion points around filling the talent pipeline have come from a place of, oh, we're just going to talk to all the schools and make sure that they're teaching people the right way. And well, colleges aren't really aimed at being vocational institutions most of the time. And maybe you want people who can bring an understanding of various aspects of business, of workplace dynamics, et cetera, and even the organization themselves, you can transition them in. I've always been a big fan of helping people lateral from one part of an organization to another. It's nice to see that there's actual formal processes around that for you, folks.Dmitry: Yeah, we're trying to do that and we're also working across agencies, right, where we might pull someone in from another agency that's got that aptitude and willingness, especially if it's someone that already has government experience, right, they know how to work within the system that we have here, it certainly makes things easier. It's less of a learning curve for them on that side. We think, you know, in some cases, the technical skills, we can teach you those, but just operating in this environment is just as important to understand the soft side of it.Corey: No, I hear you. One thing that I've picked up from doing this show and talking to people in the different places that you all tend to come from, has been that everyone's working with really hard problems and there's a whole universe of various constraints that everyone's wrestling with. The biggest lie in our industry across the board that I'm coming to realize is any whiteboard architecture diagram. Full stop. The real world is messy.Nothing is ever quite like it looks like in that sterile environment where you're just designing and throwing things up there. The world is built on constraints and trade-offs. I'm glad to see that you're able to bring people into your organization. I think it gives an awful lot of folks hope when they despair about seeing what some of the job prospects are for folks in the tech industry, depending on what direction they want to go in.Dmitry: Yeah. I mean, I think we've got the same challenge as everyone else does, right? It is messy. 
The one thing that I think is also interesting is that we also have to have transparency but to some degree—and I'll shift; I know this wasn't meant to kind of go off into the security side of things, but I think one of the things that's most interesting is trying to balance a security mindset with that transparency, right?You have private corporations, other organizations that they do whatever they do, they're not going to talk about it, you don't need to know about it. In our case, I think we've got even more of a challenge because on the one hand, we do want to lock things down, make sure they're secure and we protect not just the data, but how we do things, right, some are mechanisms and methods. But same time, we've got a responsibility to be transparent to our constituents. They've got to be able to see what we're doing, what are we spending money on? And so, to me, that's also one of the biggest challenges we have is how do we make sure we balance that out, that we can provide people and even our vendors, right, a lot of times our vendors [will 00:30:40] say, “How are you doing something? We want to know so that we can help you better in some areas.” And it's really become a real challenge for us.Corey: I really want to thank you for taking the time to speak with me about what you're doing. If people want to learn more, where's the best place for them to find you?Dmitry: I guess now it's no longer called Twitter, but really just about anywhere. Twitter, Instagram—I'm not a big Instagram user—LinkedIn, Dmitry Kagansky, there's not a whole lot of us out there; pretty easy to do a search. But also you'll see there's my contact info, I believe, on the GTA website, just gta.ga.gov.Corey: Excellent. We will, of course, put links to that in the [show notes 00:31:20]. Thank you so much for being so generous with your time. I really appreciate it.Dmitry: Thank you, Corey. I really appreciate it as well.Corey: Dmitry Kagansky, CTO for the state of Georgia. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment telling me that I've got it all wrong and mainframes will in fact rise again.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
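One practical footnote to the agency-by-agency spend monitoring Dmitry mentions earlier in the episode: in a consolidated-billing organization, that kind of report typically comes out of Cost Explorer grouped by linked account. The snippet below is a minimal sketch rather than anything GTA actually runs; the date range is a placeholder, and larger reporting programs usually build on Cost and Usage Reports instead.

```python
# Minimal sketch: last month's unblended cost per linked (agency) account,
# pulled from Cost Explorer. The date range is a placeholder.
import boto3

ce = boto3.client("ce")  # Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-09-01", "End": "2023-10-01"},  # placeholder month
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "LINKED_ACCOUNT"}],
)

for period in response["ResultsByTime"]:
    for group in period["Groups"]:
        account_id = group["Keys"][0]
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(f"{account_id}: ${amount:,.2f}")
```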

Screaming in the Cloud
Ask Me Anything with Corey Quinn

Screaming in the Cloud

Play Episode Listen Later Oct 3, 2023 53:56


In this special live-recorded episode of Screaming in the Cloud, Corey interviews himself— well, kind of. Corey hosts an AMA session, answering both live and previously submitted questions from his listeners. Throughout this episode, Corey discusses misconceptions about his public persona, the nature of consulting on AWS bills, why he focuses so heavily on AWS offerings, his favorite breakfast foods, and much, much more. Corey shares insights into how he monetizes his public persona without selling out his genuine opinions on the products he advertises, his favorite and least favorite AWS services, and some tips and tricks to get the most out of re:Invent.About CoreyCorey is the Chief Cloud Economist at The Duckbill Group. Corey's unique brand of snark combines with a deep understanding of AWS's offerings, unlocking a level of insight that's both penetrating and hilarious. He lives in San Francisco with his spouse and daughters.Links Referenced: lastweekinaws.com/disclosures: https://lastweekinaws.com/disclosures duckbillgroup.com: https://duckbillgroup.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: As businesses consider automation to help build and manage their hybrid cloud infrastructures, deployment speed is important, but so is cost. Red Hat Ansible Automation Platform is available in the AWS Marketplace to help you meet your cloud spend commitments while delivering best-of-both-worlds support.Corey: Well, all right. Thank you all for coming. Let's begin and see how this whole thing shakes out, which is fun and exciting, and for some godforsaken reason the lights like to turn off, so we're going to see if that continues. I've been doing Screaming in the Cloud for about, give or take, 500 episodes now, which is more than a little bit ridiculous. And I figured it would be a nice change of pace if I could, instead of reaching out and talking to folks who are innovative leaders in the space and whatnot, if I could instead interview my own favorite guest: myself.Because the entire point is, I'm usually the one sitting here asking questions, so I'm instead going to now gather questions from you folks—and feel free to drop some of them into the comments—but I've solicited a bunch of them, I'm going to work through them and see what you folks want to know about me. I generally try to be fairly transparent, but let's have fun with it. To be clear, if this is your first exposure to my Screaming in the Cloud podcast show, it's generally an interview show talking with people involved with the business of cloud. It's not intended to be snarky because not everyone enjoys thinking on their feet quite like that, but rather a conversation of people about what they're passionate about. I'm passionate about the sound of my own voice. That's the theme of this entire episode.So, there are a few that have come through that are in no particular order. I'm going to wind up powering through them, and again, throw some into the comments if you want to have other ones added. If you're listening to this in the usual Screaming in the Cloud place, well, send me questions and I am thrilled to wind up passing out more of them. 
The first one—a great one to start—comes with someone asked me a question about the video feed. “What's with the Minecraft pickaxe on the wall?” It's made out of foam.One of my favorite stories, and despite having a bunch of stuff on my wall that is interesting and is stuff that I've created, years ago, I wrote a blog post talking about how machine learning is effectively selling digital pickaxes into a gold rush. Because the cloud companies pushing it are all selling things such as, you know, they're taking expensive compute, large amounts of storage, and charging by the hour for it. And in response, Amanda, who runs machine learning analyst relations at AWS, sent me that by way of retaliation. And it remains one of my absolute favorite gifts. It's, where's all this creativity in the machine-learning marketing? No, instead it's, “We built a robot that can think. But what are we going to do with it now? Microsoft Excel.” Come up with some of that creativity, that energy, and put it into the marketing side of the world.Okay, someone else asks—Brooke asks, “What do I think is people's biggest misconception about me?” That's a good one. I think part of it has been my misconception for a long time about what the audience is. When I started doing this, the only people who ever wound up asking me anything or talking to me about anything on social media already knew who I was, so I didn't feel the need to explain who I am and what I do. So, people sometimes only see the witty banter on Twitter and whatnot and think that I'm just here to make fun of things.They don't notice, for example, that my jokes are never calling out individual people, unless they're basically a US senator, and they're not there to make individual humans feel bad about collectively poor corporate decision-making. I would say across the board, people think that I'm trying to be meaner than I am. I'm going to be honest and say it's a little bit insulting, just from the perspective of, if I really had an axe to grind against people who work at Amazon, for example, is this the best I'd be able to do? I'd like to think that I could at least smack a little bit harder. Speaking of, we do have a question that people sent in in advance.“When was the last time that Mike Julian gave me that look?” Easy. It would have been two days ago because we were both in the same room up in Seattle. I made a ridiculous pun, and he just stared at me. I don't remember what the pun is, but I am an incorrigible punster and as a result, Mike has learned that whatever he does when I make a pun, he cannot incorrige me. Buh-dum-tss. That's right. They're no longer puns, they're dad jokes. A pun becomes a dad joke once the punch line becomes a parent. Yes.Okay, the next one is what is my favorite AWS joke? The easy answer is something cynical and ridiculous, but that's just punching down at various service teams; it's not my goal. My personal favorite is the genie joke where a guy rubs a lamp, Genie comes out and says, “You can have a billion dollars if you can spend $100 million in a month, and you're not allowed to waste it or give it away.” And the person says, “Okay”—like, “Those are the rules.” Like, “Okay. Can I use AWS?” And the genie says, “Well, okay, there's one more rule.” I think that's kind of fun.Let's see, another one. 
A hardball question: given the emphasis on right-sizing for meager cost savings and the amount of engineering work required to make real architectural changes to get costs down, how do you approach cost controls in companies largely running other people's software? There are not as many companies as you might think where dialing in the specifics of a given application across the board is going to result in meaningful savings. Yes, yes, you're running something in hyperscale, it makes an awful lot of sense, but most workloads don't do that. The mistakes you most often see are misconfigurations for not knowing this arcane bit of AWS trivia, as a good example. There are often things you can do with relatively small amounts of effort. Beyond a certain point, things are going to cost what they're going to cost without a massive rearchitecture and I don't advise people do that because no one is going to be happy rearchitecting just for cost reasons. Doesn't go well.Someone asks, “I'm quite critical of AWS, which does build trust with the audience. Has AWS tried to get you to market some of their services, and would I be open to do that?” That's a great question. Yes, sometimes they do. You can tell this because they wind up buying ads in the newsletter or the podcast and they're all disclaimed as a sponsored piece of content.I do have an analyst arrangement with a couple of different cloud companies, as mentioned lastweekinaws.com/disclosures, and the reason behind that is because you can buy my attention to look at your product and talk to you in-depth about it, but you cannot buy my opinion on it. And those engagements are always tied to, let's talk about what the public is seeing about this. Now, sometimes I write about the things that I'm talking about because that's where my mind goes, but it's not about okay, now go and talk about this because we're paying you to, and don't disclose that you have a financial relationship.No, that is called fraud. I figure I can sell you as an audience out exactly once, so I better be able to charge enough money to never have to work again. Like, when you see me suddenly talk about multi-cloud being great and I became a VP at IBM, about three to six months after that, no one will ever hear from me again because I love nesting doll yacht money. It'll be great.Let's see. The next one I have on my prepared list here is, “Tell me about a time I got AWS to create a pie chart.” I wish I'd see less of it. Every once in a while I'll talk to a team and they're like, “Well, we've prepared a PowerPoint deck to show you what we're talking about.” No, Amazon is famously not a PowerPoint company and I don't know why people feel the need to repeatedly prove that point to me because slides are not always the best way to convey complex information.I prefer to read documents and then have a conversation about them as Amazon tends to do. The visual approach and the bullet lists and all the rest are just frustrating. If I'm going to do a pie chart, it's going to be in service of a joke. It's not going to be anything that is the best way to convey information in almost any sense.“How many internal documents do I think reference me by name at AWS,” is another one. And I don't know the answer to documents, but someone sent me a screenshot once of searching for my name in their Slack internal nonsense thing, and it was about 10,000 messages referenced me that it found. I don't know what they were saying. 
I have to assume, on some level, just something that does a belt feed from my Twitter account where it lists my name or something. But I choose to believe that no, they actually are talking about me to that level of… of extreme.Let's see, let's turn back to the chat for a sec because otherwise it just sounds like I'm doing all prepared stuff. And I'm thrilled to do that, but I'm also thrilled to wind up fielding questions from folks who are playing along on these things. “I love your talk, ‘Heresy in the Church of Docker.' Do I have any more speaking gigs planned?” Well, today's Wednesday, and this Friday, I have a talk that's going out at the CDK Community Day.I also have a couple of things coming up that are internal corporate presentations at various places. But at the moment, no. I suspect I'll be giving a talk if they accept it at SCALE in Pasadena in March of next year, but at the moment, I'm mostly focused on re:Invent, just because that is eight short weeks away and I more or less destroy the second half of my year because… well, holidays are for other people. We're going to talk about clouds, as Amazon and the rest of us dance to the tune that they play.“Look in my crystal ball; what will the industry look like in 5, 10, or 20 years?” Which is a fun one. You shouldn't listen to me on this. At all. I was the person telling you that virtualization was a flash in the pan, that cloud was never going to catch on, that Kubernetes and containers had a bunch of problems that were unlikely to be solved, and I'm actually kind of enthused about serverless which probably means it's going to flop.I am bad at predicting overall trends, but I have no problem admitting that wow, I was completely wrong on that point, which apparently is a rarer skill than it should be. I don't know what the future the industry holds. I know that we're seeing some AI value shaping up. I think that there's going to be a bit of a downturn in that sector once people realize that just calling something AI doesn't mean you make wild VC piles of money anymore. But there will be use cases that filter out of it. I don't know what they're going to look like yet, but I'm excited to see it.Okay, “Have any of the AWS services increased costs in the last year? I was having a hard time finding historical pricing charts for services.” There have been repricing stories. There have been SMS charges in India that have—and pinpointed a few other things—that wound up increasing because of a government tariff on them and that cost was passed on. Next February, they're going to be charging for public IPV4 addresses.But those tend to be the exceptions. The way that most costs tend increase have been either, it becomes far cheaper for AWS to provide a service and they don't cut the cost—data transfer being a good example—they'll also often have stories in that they're going to start launching a bunch of new things, and you'll notice that AWS bills tend to grow in time. Part of that growth, part of that is just cruft because people don't go back and clean things up. But by and large, I have not seen, “This thing that used to cost you $1 is now going to cost you $2.” That's not how AWS does pricing. Thankfully. Everyone's always been scared of something like that happening. I think that when we start seeing actual increases like that, that's when it's time to start taking a long, hard look at the way that the industry is shaping up. I don't think we're there yet.Okay. 
“Any plans for a Last Week in Azure or a Last Week in GCP?” Good question. If so, I won't be the person writing it. I don't think that it's reasonable to expect someone to keep up with multiple large companies and their releases. I'd also say that Azure and GCP don't release updates to services with the relentless cadence that AWS does.The reason I built the thing to start with is simply because it was difficult to gather all the information in one place, at least the stuff that I cared about with an economic impact, and by the time I'd done that, it was, well, this is 80% of the way toward republishing it for other people. I expected someone was going to point me at a thing so I didn't have to do it, and instead, everyone signed up. I don't see the need for it. I hope that in those spaces, they're better at telling their own story to the point where the only reason someone would care about a newsletter would be just my sarcasm tied into whatever was released. But that's not something that I'm paying as much attention to, just because my customers are on AWS, my stuff is largely built on AWS, it's what I have to care about.Let's see here. “What do I look forward to at re:Invent?” Not being at re:Invent anymore. I'm there for eight nights a year. That is shitty cloud Chanukah come to life for me. I'm there to set things up in advance, I'm there to tear things down at the end, and I'm trying to have way too many meetings in the middle of all of that. I am useless for the rest of the year after re:Invent, so I just basically go home and breathe into a bag forever.I had a revelation last year about re:Play, which is that I don't have to go to it if I don't want to go. And I don't like the cold, the repetitive music, the giant crowds. I want to go read a book in a bathtub and call it a night, and that's what I hope to do. In practice, I'll probably go grab dinner with other people who feel the same way. I also love the Drink Up I do there every year over at Atomic Liquors. I believe this year, we're partnering with the folks over at RedMonk because a lot of the people we want to talk to are in the same groups.It's just a fun event: show up, let us buy you drinks. There's no badge scan or any nonsense like that. We just want to talk to people who care to come out and visit. I love doing that. It's probably my favorite part of re:Invent other than not being at re:Invent. It's going to be on November 29th this year. If you're listening to this, please come on by if you're unfortunate enough to be in Las Vegas.Someone else had a good question I want to talk about here. “I'm a TAM for AWS. Cost optimization is one of our functions. What do you wish we would do better after all the easy button things such as picking the right instance and family, savings plans RIs, turning off or delete orphan resources, watching out for inefficient data transfer patterns, et cetera?” I'm going to back up and say that you're begging the question here, in that you aren't doing the easy things, at least not at scale, not globally.I used to think that all of my customer engagements would be, okay after the easy stuff, what's next? I love those projects, but in so many cases, I show up and those easy things have not been done. “Well, that just means that your customers haven't been asking their TAM.” Every customer I've had has asked their TAM first. “Should we ask the free expert or the one that charges us a large but reasonable fixed fee? Let's try the free thing first.”The quality of that advice is uneven. 
I wish that there were at least a solid baseline. I would love to get to a point where I can assume that I can go ahead and be able to just say, “Okay, you've clearly got your RI stuff, you're right-sizing, you're deleting stuff you're not using, taken care of. Now, let's look at the serious architecture stuff.” It's just rare that I get to see it.“What tool, feature, or widget do I wish AWS would build into the budget console?” I want to be able to set a dollar figure, maybe it's zero, maybe it's $20, maybe it is irrelevant, but above whatever I set, the account will not charge me above that figure, period. If that means they have to turn things off if that means they had to delete portions of data, great. But I want that assurance because even now when I kick the tires in a new service, I get worried that I'm going to wind up with a surprise bill because I didn't understand some very subtle interplay of the dynamics. And if I'm worried about that, everyone else is going to wind up getting caught by that stuff, too.I want the freedom to experiment and if it smacks into a wall, okay, cool. That's $20. That was worth learning that. Whatever. I want the ability to not be charged unreasonable overages. And I'm not worried about it turning from 20 into 40. I'm worried about it turning from 20 into 300,000. Like, there's the, “Oh, that's going to have a dent on the quarterlies,” style of [numb 00:16:01]—All right. Someone also asked, “What is the one thing that AWS could do that I believe would reduce costs for both AWS and their customers. And no, canceling re:Invent doesn't count.” I don't think about it in that way because believe it or not, most of my customers don't come to me asking to reduce their bill. They think they do at the start, but what they're trying to do is understand it. They're trying to predict it.Yes, they want to turn off the waste in the rest, but by and large, there are very few AWS offerings that you take a look at and realize what you're getting for it and say, “Nah, that's too expensive.” It can be expensive for certain use cases, but the dangerous part is when the costs are unpredictable. Like, “What's it going to cost me to run this big application in my data center?” The answer is usually, “Well, run it for a month, and then we'll know.” But that's an expensive and dangerous way to go about finding things out.I think that customers don't care about reducing costs as much as they think; they care about controlling them, predicting them, and understanding them. So, how would they make things less expensive? I don't know. I suspect that data transfer if they were to reduce that at least cross-AZ or eliminate it ideally, you'd start seeing a lot more compute usage in multiple AZs. I've had multiple clients who are not spinning things up in multi-AZ, specifically because they'll take the reliability trade-off over the extreme cost of all the replication flowing back and forth. Aside from that, they mostly get a lot of the value right in how they price things, which I don't think people have heard me say before, but it is true.Someone asked a question here of, “Any major trends that I'm seeing in EDP/PPA negotiations?” Yeah, lately, in particular. Used to be that you would have a Marketplace as the fallback, where it used to be that 50 cents of every dollar you spent on Marketplace would count. Now, it's a hundred percent up to a quarter of your commit. 
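On the budget console wish above: AWS does not offer a true "never charge me past this figure" switch today, but budget actions get part of the way there by stopping named instances once actual spend crosses a threshold. A minimal sketch follows; the account ID, role ARN, instance IDs, and dollar figure are placeholders, and storage and other charges keep accruing after compute stops, so this is damage limitation rather than a hard cap.

```python
# A partial stand-in for a spending cap: an AWS Budgets action that stops
# specific EC2 instances when actual spend crosses the budget. All IDs and
# amounts below are placeholders; the execution role needs permission to
# run the SSM action.
import boto3

ACCOUNT_ID = "111111111111"                                        # placeholder
ROLE_ARN = "arn:aws:iam::111111111111:role/BudgetsStopInstances"   # placeholder
INSTANCE_IDS = ["i-0123456789abcdef0"]                             # placeholder

budgets = boto3.client("budgets", region_name="us-east-1")

budgets.create_budget(
    AccountId=ACCOUNT_ID,
    Budget={
        "BudgetName": "experiment-hard-ish-cap",
        "BudgetLimit": {"Amount": "20", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
)

budgets.create_budget_action(
    AccountId=ACCOUNT_ID,
    BudgetName="experiment-hard-ish-cap",
    NotificationType="ACTUAL",
    ActionType="RUN_SSM_DOCUMENTS",
    ActionThreshold={"ActionThresholdValue": 100.0, "ActionThresholdType": "PERCENTAGE"},
    Definition={
        "SsmActionDefinition": {
            "ActionSubType": "STOP_EC2_INSTANCES",
            "Region": "us-east-1",
            "InstanceIds": INSTANCE_IDS,
        }
    },
    ExecutionRoleArn=ROLE_ARN,
    ApprovalModel="AUTOMATIC",   # fire without waiting for a human to approve
    Subscribers=[{"SubscriptionType": "EMAIL", "Address": "you@example.com"}],
)
```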
Great. But when you have a long-term commitment deal with Amazon, now they're starting to push to put all your other vendors onto the AWS Marketplace so you can have a bigger commit and thus a bigger discount, which incidentally, the discount does not apply to Marketplace spend. A lot of folks are uncomfortable with having Amazon as the middleman between all of their vendor relationships. And a lot of the vendors aren't super thrilled with having to pay percentages of existing customer relationships to Amazon for what they perceive to be remarkably little value. That's the current one.
I'm not seeing generative AI play a significant role in this yet. People are still experimenting with it. I'm not seeing, "Well, we're spending $100 million a year, but make that 150 because of generative AI." It's expensive to play with gen-AI stuff, but it's not driving the business spend yet. But that's the big trend that I'm seeing over the past, eh, I would say, few months.
"Do I use AWS for personal projects?" The first problem there is, well, what's a personal project versus a work thing? My life is starting to flow in a bunch of weird different ways. The answer is yes. Most of the stuff that I build for funsies is on top of AWS, though there are exceptions. "Should I?" is the follow-up question and the answer to that is, "It depends."
The person is worrying about cost overruns. So am I. I tend to not be a big fan of uncontrolled downside risk when something winds up getting exposed. I think that there are going to be a lot of caveats there. I know what I'm doing and I also have the backstop, in my case, of figuring I can have a big billing screw-up where I have to bend the knee and apologize and beg for a concession from AWS, once.
It'll probably be on a billboard or something one of these days. Lord knows I have it coming to me. That's something I can use as a get-out-of-jail-free card. Most people can't make that guarantee, and so, depending on the environment that you know and what you want to build, there are a lot of other options: buying a fixed-fee VPS somewhere, if that's how you tend to think about things, might very well be cost-effective for you, depending on what you're building. There's no straight answer to this.
"Do I think Azure will lose any market share with recent cybersecurity kerfuffles specific to Office 365 and nation-state actors?" No, I don't. And the reason behind that is that a lot of Azure spend is not necessarily Azure usage; it's being rolled into enterprise agreements customers negotiate as part of their on-premises stuff, their operating system licenses, their Office licensing, and the rest. The business world is not going to stop using Excel and Word and PowerPoint and Outlook. They're not going to stop putting Windows on desktop stuff. And largely, customers don't care about security.
They say they do, they often believe that they do, but I see where the bills are. I see what people spend on feature development, I see what they spend on core infrastructure, and I see what they spend on security services. And I have conversations about budgeting and what they're doing with a lot of these things. The companies generally don't care about this until right after they really should have cared. And maybe that's a rational effect.
I mean, take a look at most breaches. And a year later, their stock price is larger than it was when they disclosed the breach. Sure, maybe they're burning through their ablative CISO, but the business itself tends to succeed.
I wish that there were bigger consequences for this. I have talked to folks who will not put specific workloads on Azure as a result of this. "Will you talk about that publicly?" "No, because who can afford to upset Microsoft?"
I used to have guests from Microsoft on my show regularly. They don't talk to me and haven't for a couple of years. Scott Guthrie, the head of Azure, has been on this show. The problem I have is that once you start criticizing their security posture, they go quiet. They clearly don't like me.
But their options are basically to either ice me out or play around with my seven seats for Office licensing, which, okay, whatever. They don't have a stick to hit me with in the way that they do most companies. And whether or not it's true that they're going to lash out like that, companies don't want to take the risk of calling Microsoft out in public. Too big to be criticized is sort of how that works.
Let's see, someone else asks, "How can a startup get the most out of its startup status with AWS?" You're not going to get what you think you want from AWS in this context. "Oh, we're going to be a featured partner so they market us." I've yet to hear a story about how being featured by AWS for something has dramatically changed the fortunes of a startup. Usually, they'll do that when there's either a big social mission and you never hear about the company again, or they're a darling of the industry that's taking the world by fire and already on an upward swing, and AWS wants to hang out with those successful people in public and be seen to do so.
The actual way that startup stuff is going to manifest itself well for you from AWS is largely in the form of credits as you go through Activate or one of their other programs. But be careful. Treat them like actual money, not this free thing you don't have to worry about. One day they expire or run out and suddenly you're going from having no dollars going to AWS to ten grand a month, and people aren't prepared for that. It's, "Wait. So you mean this costs money? Oh, my God."
You have to approach it with a sense of discipline. But yeah, if you can do that, free money and a free cloud bill for a few years? That's not nothing. I also would question the idea of asking a giant company that's worth a trillion-and-a-half dollars for advice on how to be a startup. I find that one's always a little on the humorous side myself.
"What do I think is the most underrated service or feature release from 2023? Full disclosure, this means I'll make some content about it," says Brooke over at AWS. Oh, that's a good question. I'm trying to remember when various things have come out and it all tends to run together. I think that people are criticizing AWS for charging for IPV4 an awful lot, and I think that that is a terrific change, just because I've seen how wasteful companies are with public IP addresses, which are basically an exhausted or rapidly exhausting resource.
And they just—you wind up with tens or hundreds of thousands of these things and don't really think about it. It'll be one of the best things that we've seen for IPV6 adoption once AWS figures out how to make that work. And I would say there's a lot to be said for it: since, you know, IPV4 is exhausted already and now we're talking about whether we can get addresses on the secondary markets, you need a reasonable IP plan to get some of those.
And… "Well, we just give them to customers and they throw them away." I want AWS to continue to be able to get those for the stuff that the rest of us are working on, not have one big company use a million of them just because, "Oh, what do you mean, private IP addresses? What might those be?" That's part of it.
I would say that there's also been… thinking back on this, it's unsung, but the Compute Optimizer is doing a lot better at recommending things than it used to. It was originally just giving crap advice, and over time, it started giving advice that's actually solid and backs up what I've seen. It's not perfect, and I keep forgetting it's there because, for some godforsaken reason, it's its own standalone service, rather than living in the billing console where it belongs. But no one's excited about a service like that to the point where they talk about it or create content about it, but it's good, and it's getting better all the time. That's probably a good one. They recently announced the ability for it to do GPU instances which, okay, great, for people who care about that, awesome, but it's not exciting. Even I don't think I paid much attention to it in the newsletter.
Okay, "Does it make economic sense to bring your own IP addresses to AWS instead of paying their fees?" Bring your own IP—if you bring your own allocation to AWS—costs you nothing in terms of AWS costs. If you take a look at the market rate per IP address versus what AWS charges, you'll hit break-even within your first year if you do it. So yeah, it makes perfect economic sense to do it if you have the allocation and if you have the resourcing, as well as the ability to throw people at the problem to do the migration. It can be a little hairy if you're not careful. But the economic benefit is clear on that once you account for those variables.
Let's see here. We've also got tagging. "Everyone nods their heads that they know it's the key to controlling things, but how effective are people at actually tagging, especially when new to cloud?" They're terrible at it. They're never going to tag things appropriately. Automation is the way to do it because otherwise, you're going to spend the rest of your life chasing developers and asking them to tag things appropriately, and then they won't, and then they'll feel bad about it. No one enjoys that conversation.
So, having derived tags and the rest, or failing that, having some deployment gate as early in the process as possible of, "Oh, what's the tag for this?" is the only way you're going to start to see coverage on this. And ideally, someday you'll go back and tag a bunch of pre-existing stuff. But it's honestly the thing that everyone hates the most on this. I have never seen a company that says, "We are thrilled with our tag coverage. We're nailing it." The only time you see that is pure greenfield, everything done without ClickOps, and those environments are vanishingly rare.
"Outside of telecom, are customers using local zones more, or at all?" Very, very limited as far as what their usage looks like on that. Because that's… it doesn't buy you as much as you'd think for most workloads. The real benefit is that it's in specific cities where there are not AWS regions—though it's a little more expensive—and at least in the United States, where the majority of my clients are, there are not meaningful latency differences, for example, from Los Angeles up to Oregon, since no one should be using the Northern California region because it's really expensive.
It's a 20-millisecond round trip, which in most cases, for most workloads, is fine.
Gaming companies are a big exception to this. Getting anything they can as close to the customer as possible is their entire goal, which very often means they don't even go with some of the cloud providers in some places. That's one of those actual multi-cloud workloads that you want to be able to run anywhere that you can get a baseline computer up to run a container or a golden image or something. That is the usual case. The rest of local zone usage is largely going to be driven by specific one-off weird things. Good question.
Let's see, "Is S3 intelligent tiering good enough or is it worth trying to do it yourself?" Your default choice for almost everything should be intelligent tiering in 2023. It winds up costing you more only in very specific circumstances that are unlikely to be anything other than a corner case for what you're doing. And the exceptions to this are large workloads that are running a lot of S3 stuff where the lifecycle is very well understood, and environments where you're not going to be storing your data for more than 30 days in any case and you can do a lifecycle policy around it. Other than those use cases, yeah, the monitoring fee is not significant in any environment I've ever seen.
And people view—touch their data a lot less than they believe. So okay, there's a monitoring fee per object, yes, but it also cuts your raw storage cost in half for things that aren't frequently touched. So, you know, think about it. Run your own numbers and also be aware that the first month, as objects transition in, you're going to see massive transition charges per object, but once it's in intelligent tiering, there are no further transition charges, which is nice.
Let's see here. "We're all-in on serverless"—oh good, someone drank the Kool-Aid, too—"And for our use cases, it works great. Do I find other customers moving to it and succeeding?" Yeah, I do when they're moving to it because for certain workloads, it makes an awful lot of sense. For others, it requires a complete reimagining of whatever it is that you're doing.
The early successes were just doing these periodic jobs. Now, we're seeing full applications built on top of event-driven architectures, which is really neat to see. But trying to retrofit something that was never built with that in mind can be more trouble than it's worth. And there are corner cases where building something on serverless would cost significantly more than building it in a server-ful way. But its time has come for an awful lot of stuff. Now, what I don't subscribe to is this belief that oh, if you're not building something serverless you're doing it totally wrong. No, that is not true. That has never been true.
Let's see what else have we got here? Oh, "Following up on local zones, how about Outposts? Do I see much adoption? What's the primary use case or cases?" My customers inherently are coming to me because of a large AWS bill. If they're running Outposts, it is extremely unlikely that they are putting significant portions of their spend through the Outpost. It tends to be something of a rounding error, which means I don't spend a lot of time focusing on it.
They obviously have some existing data center workloads and data center facilities where they're going to take an AWS-provided rack and slap it in there, but it's not going to be in the top 10 or even top 20 list of service spend in almost every case as a result, so it doesn't come up.
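Circling back to the Intelligent-Tiering answer a moment ago: making it the default for a bucket is usually just a lifecycle rule that transitions every new object into the class, or setting the storage class at upload time. A minimal sketch, with the bucket name and object key as placeholders.

```python
# Make Intelligent-Tiering the default for a bucket via a lifecycle rule,
# or set the storage class per object at write time. Bucket and key are
# placeholders.
import boto3

BUCKET = "example-bucket"   # placeholder

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "default-to-intelligent-tiering",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to every object
                "Transitions": [
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}
                ],
            }
        ]
    },
)

# Or skip the transition charge entirely by writing new objects into the
# class directly:
s3.put_object(
    Bucket=BUCKET,
    Key="reports/2023-10.csv",
    Body=b"example,data\n",
    StorageClass="INTELLIGENT_TIERING",
)
```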
One of the big secrets of how we approach things is we start with a big number first and then work our way down instead of going alphabetically. So yes, I've seen customers using them, and the customers I've talked to at re:Invent who are using them are very happy with them for the use cases, but it's not a common approach. I'm not a huge fan of the rest.
"Someone said that Basecamp saved a million-and-a-half a year by leaving AWS. I know you say repatriation isn't a thing people are doing, but has my view changed at all since you've published that blog post?" No, because everyone's asking me about Basecamp and its repatriation, and that's the only use case that they've got for this. Let's further point out that a million-and-a-half a year is not as many engineers as you might think it is when you wind up tying that all together. And now those engineers are spending time running that environment.
Does it make sense for them? Probably. I don't know their specific context. I know that a million-and-a-half dollars a year—even if they had to spend that just for the marketing coverage that they're getting as a result of this—makes perfect sense. But cloud has never been about raw cost savings. It's about feature velocity.
If you have a data center and you move it to the cloud, you're not going to recoup that investment for at least five years. Migrations are inherently expensive. They do not create the benefits that people often believe they do. That becomes a painful problem for folks. I would say that there's a lot more noise than there are real-world stories hanging out about these things.
Now, I do occasionally see a specific workload that is moved back to a data center for a variety of reasons—occasionally cost, but not always—and I see proof-of-concept projects that they don't pursue and then turn off. Some people like to call that a repatriation. No, I call it, "We tried it and it didn't do what we wanted it to do, so we didn't proceed." Like, if you try that with any other project, no one says, "Oh, you're migrating off of it." No, you're not. You tested it, it didn't do what it needed to do. I do see net-new workloads going into data centers, but that's not the same thing.
Let's see. "Are the talks at re:Invent worth it anymore? I went to a lot of the early re:Invents and haven't in about five years. I found back then that even the level 400 talks left a lot to be desired." Okay. I'm not a fan of attending conference talks most of the time, just because there are so many things I need to do at all of these events that I would rather spend the time building relationships and having conversations.
It's the lessons learned type of thing.Whenever you have as much control as re:Invent exhibits over its speakers, you know that a lot of those anecdotes are going to be significantly watered down. This is not to impugn any of the speakers themselves; this is the corporate mind continuing to grow to a point where risk mitigation and downside protection becomes the primary driving goal.Let's pull up another one from the prepared list here. “My most annoying, overpriced, or unnecessary charge service in AWS.” AWS Config. It's a tax on using the cloud as the cloud. When you have a high config bill, it's because it charges you every time you change the configuration of something you have out there. It means you're spinning up and spinning down EC2 instances, whereas you're going to have a super low config bill if you, you know, treat it like a big dumb data center.It's a tax on accepting the promises under which cloud has been sold. And it's necessary for a number of other things like Security Hub. Control Towers magic-deploys it everywhere and makes it annoying to turn off. And I think that that is a pure rent-seeking charge because people aren't incurring config charges if they're not already using a lot of AWS things. Not every service needs to make money in a vacuum. It's, “Well, we don't charge anything for this because our users are going to spend an awful lot of money on storing things in S3 to use our service.” Great. That's a good thing. You don't have to pile charge upon charge upon charge upon charge. It drives me a little bit nuts.Let's see what else we have here as far as questions go. “Which AWS service delights me the most?” Eesh, depends on the week. S3 has always been a great service just because it winds up turning big storage that usually—used to require a lot of maintenance and care into something I don't think about very much. It's getting smarter and smarter all the time. The biggest lie is the ‘Simple' in its name: ‘Simple Storage Service.' At this point, if that's simple, I really don't want to know what you think complex would look like.“By following me on Twitter, someone gets a lot of value from things I mention offhandedly as things everybody just knows. For example, which services are quasi-deprecated or outdated, or what common practices are anti-patterns? Is there a way to learn this kind of thing all in one go, as in a website or a book that reduces AWS to these are the handful of services everybody actually uses, and these are the most commonly sensible ways to do it?” I wish. The problem is that a lot of the stuff that everyone knows, no, it's stuff that at most, maybe half of the people who are engaging with it knew.They find out by hearing from other people the way that you do or by trying something and failing and realizing, ohh, this doesn't work the way that I want it to. It's one of the more insidious forms of cloud lock-in. You know how a service works, how a service breaks, what the constraints are around when it starts and it stops. And that becomes something that's a hell of a lot scarier when you have to realize, I'm going to pick a new provider instead and relearn all of those things. The reason I build things on AWS these days is honestly because I know the ways it sucks. I know the painful sharp edges. I don't have to guess where they might be hiding. 
I'm not saying that these sharp edges aren't painful, but when you know they're there in advance, you can do an awful lot to guard against that.“Do I believe the big two—AWS and Azure—cloud providers have agreed between themselves not to launch any price wars as they already have an effective monopoly between them and [no one 00:36:46] win in a price war?” I don't know if there's ever necessarily an explicit agreement on that, but business people aren't foolish. Okay, if we're going to cut our cost of service, instantly, to undercut a competitor, every serious competitor is going to do the same thing. The only reason to do that is if you believe your margins are so wildly superior to your competitors that you can drive them under by doing that or if you have the ability to subsidize your losses longer than they can remain a going concern. Microsoft and Amazon are—and Google—are not in a position where, all right, we're going to drive them under.They can both subsidize losses basically forever on a lot of these things and they realize it's a game you don't win in, I suspect. The real pricing pressure on that stuff seems to come from customers, when all right, I know it's big and expensive upfront to buy a SAN, but when that starts costing me less than S3 on a per-petabyte basis, that's when you start to see a lot of pricing changing in the market. The one thing I haven't seen that take effect on is data transfer. You could be forgiven for believing that data transfer still cost as much as it did in the 1990s. It does not.“Is AWS as far behind in AI as they appear?” I think a lot of folks are in the big company space. And they're all stammering going, “We've been doing this for 20 years.” Great, then why are all of your generative AI services, A, bad? B, why is Alexa so terrible? C, why is it so clear that everything you have pre-announced and not brought to market was very clearly not envisioned as a product to be going to market this year until 300 days ago, when Chat-Gippity burst onto the scene and OpenAI [stole a march 00:38:25] on everyone?Companies are sprinting to position themselves as leaders in the AI space, despite the fact that they've gotten lapped by basically a small startup that's seven years old. Everyone is trying to work the word AI into things, but it always feels contrived to me. Frankly, it tells me that I need to just start tuning the space out for a year until things settle down and people stop describing metric math or anomaly detection is AI. Stop it. So yeah, I'd say if anything, they're worse than they appear as far as from behind goes.“I mostly focus on AWS. Will I ever cover Azure?” There are certain things that would cause me to do that, but that's because I don't want to be the last Perl consultancy is the entire world has moved off to Python. And effectively, my focus on AWS is because that's where the painful problems I know how to fix live. But that's not a suicide pact. I'm not going to ride that down in flames.But I can retool for a different cloud provider—if that's what the industry starts doing—far faster than AWS can go from its current market-leading status to irrelevance. There are certain triggers that would cause me to do that, but at the time, I don't see them in the near term and I don't have any plans to begin covering other things. 
As mentioned, people want me to talk about the things I'm good at not the thing that makes me completely nonsensical.“Which AWS services look like a good idea, but pricing-wise, they're going to kill you once you have any scale, especially the ones that look okay pricing-wise but aren't really and it's hard to know going in?” CloudTrail data events, S3 Bucket Access logging any of the logging services really, Managed NAT Gateways in a bunch of cases. There's a lot that starts to get really expensive once you hit certain points of scale with a corollary that everyone thinks that everything they're building is going to scale globally and that's not true. I don't build things as a general rule with the idea that I'm going to get ten million users on it tomorrow because by the time I get from nothing to substantial workloads, I'm going to have multiple refactors of what I've done. I want to get things out the door as fast as possible and if that means that later in time, oh, I accidentally built Pinterest. What am I going to do? Well, okay, yeah, I'm going to need to rebuild a whole bunch of stuff, but I'll have the user traffic and mindshare and market share to finance that growth.Early optimization on stuff like this causes a lot more problems than it solves. “Best practices and anti-patterns in managing AWS costs. For context, you once told me about a role that I had taken that you'd seen lots of companies tried to create that role and then said that the person rarely lasts more than a few months because it just isn't effective. You were right, by the way.” Imagine that I sometimes know what I'm talking about.When it comes to managing costs, understand what your goal is here, what you're actually trying to achieve. Understand it's going to be a cross-functional work between people in finance and people that engineering. It is first and foremost, an engineering problem—you learn that at your peril—and making someone be the human gateway to spin things up means that they're going to quit, basically, instantly. Stop trying to shame different teams without understanding their constraints.Savings Plans are a great example. They apply biggest discount first, which is what you want. Less money going out the door to Amazon, but that makes it look like anything with a low discount percentage, like any workload running on top of Microsoft Windows, is not being responsible because they're always on demand. And you're inappropriately shaming a team for something completely out of their control. There's a point where optimization no longer makes sense. Don't apply it to greenfield projects or skunkworks. Things you want to see if the thing is going to work first. You can optimize it later. Starting out with a, ‘step one: spend as little as possible' is generally not a recipe for success.What else have we got here? I've seen some things fly by in the chat that are probably worth mentioning here. Some of it is just random nonsense, but other things are, I'm sure, tied to various questions here. “With geopolitics shaping up to govern tech data differently in each country, does it make sense to even build a globally distributed B2B SaaS?” Okay, I'm going to tackle this one in a way that people will probably view as a bit of an attack, but it's something I see asked a lot by folks trying to come up with business ideas.At the outset, I'm a big believer in, if you're building something, solve it for a problem and a use case that you intrinsically understand. 
That is going to mean the customers with whom you speak. Very often, the way business is done in different countries and different cultures means that in some cases, this thing that's a terrific idea in one country is not going to see market adoption somewhere else. The better approach is to build for the market you have and the one you're addressing, rather than doing aspirational builds. I would also say that it potentially makes sense if there are certain things you know are going to happen, like okay, we validated our marketing and yeah, it turns out that we're building an image resizing site. Great. People in Germany and in the US both need to resize images.
But you know, going in, that there's going to be a data residency requirement, so architecting from day one with the idea that you can have a partition that winds up storing its data separately is always going to be to your benefit. I find aligning whatever you're building with the idea of not being creepy is often a great plan. And there's always the bring-your-own-storage approach of, great, as a customer, you can decide where your data gets stored in your account—charge more for that, sure—but then it becomes their problem. Anything that gets you out of the regulatory critical path is usually a good idea. But with all the problems I would have building a business, that is so far down the list for almost any use case I could ever see pursuing that it's just one of those things: you have a half-hour conversation with someone who's been down the path before if you think it might apply to what you're doing, but then get back to the hard stuff. Like, worry about the first two or three steps rather than step 90, just because you'll get there eventually. You don't want to make your future life harder, but you also don't want to spend all your time optimizing early, before you've validated you're actually building something useful.
"What unique feature of AWS do I most want to see on other cloud providers, and vice versa?" The vice versa is easy. I love that Google Cloud by default has it so that everything in this project—which is their account equivalent—can talk to everything else, which means that humans aren't just allowing permissions to the universe because it's hard. And I also like that billing is tied to an individual project. 'Terminate all billable resources in this project' is a button-click away and that's great.
Now, what do I wish other cloud providers would take from AWS? Quite honestly, the customer obsession. It's still real. I know it sounds like it's a funny talking point, or that the people who talk about this the most are the cultists, but they care about customer problems. Back when no one had ever heard of me before and my AWS bill was seven bucks, whenever I had a problem with a service and I talked about this in passing to folks, Amazonians showed up out of nowhere to help make sure that my problem got answered, that I was taken care of, that I understood what I was misunderstanding, or in some cases, the feedback went to the product team.
I see too many companies across the board convinced that they themselves know best about what customers need. That occasionally can be true, but not consistently. When customers are screaming for something, give them what they need, or frankly, get out of the way so someone else can. I mean, I know someone's expecting me to name a service or something, but we've gotten past the point, to my mind, of trying to do an apples-to-oranges comparison in terms of different service offerings.
If you want to build a website using any reasonable technology, there's a whole bunch of companies now that have the entire stack for you. Pick one. Have fun.We've got time for a few more here. Also, feel free to drop more questions in. I'm thrilled to wind up answering any of these things. Have I seen any—here's one that about Babelfish, for example, from Justin [Broadly 00:46:07]. “Have I seen anyone using Babelfish in the wild? It seems like it was a great idea that didn't really work or had major trade-offs.”It's a free open-source project that translates from one kind of database SQL to a different kind of database SQL. There have been a whole bunch of attempts at this over the years, and in practice, none of them have really panned out. I have seen no indications that Babelfish is different. If someone at AWS works on this or is a customer using Babelfish and say, “Wait, that's not true,” please tell me because all I'm saying is I have not seen it and I don't expect that I will. But I'm always willing to be wrong. Please, if I say something at some point that someone disagrees with, please reach out to me. I don't intend to perpetuate misinformation.“Purely hypothetically”—yeah, it's always great to ask things hypothetically—“In the companies I work with, which group typically manages purchasing savings plans, the ops team, finance, some mix of both?” It depends. The sad answer is, “What's a savings plan,” asks the company, and then we have an educational path to go down. Often it is individual teams buying them ad hoc, which can work, cannot as long as everyone's on the same page. Central planning, in a bunch of—a company that's past a certain point in sophistication is where everything winds up leading to.And that is usually going to be a series of discussions, ideally run by that group in a cross-functional way. They can be cost engineering, they can be optimization engineering, I've heard it described in a bunch of different ways. But that is—increasingly as the sophistication of your business and the magnitude of your spend increases, the sophistication of how you approach this should change as well. Early on, it's the offense of some VP of engineering at a startup. Like, “Oh, that's a lot of money,” running the analyzer and clicking the button to buy what it says. That's not a bad first-pass attempt. And then I think getting smaller and smaller buys as you continue to proceed means you can start to—it no longer becomes the big giant annual decision and instead becomes part of a frequently used process. That works pretty well, too.Is there anything else that I want to make sure I get to before we wind up running this down? To the folks in the comments, this is your last chance to throw random, awkward questions my way. I'm thrilled to wind up taking any slings, arrows, et cetera, that you care to throw my way a going once, going twice style. Okay, “What is the most esoteric or shocking item on the AWS bill that you ever found with one of your customers?” All right, it's been long enough, and I can say it without naming the customer, so that'll be fun.My personal favorite was a high five-figure bill for Route 53. I joke about using Route 53 as a database. It can be, but there are better options. I would say that there are a whole bunch of use cases for Route 53 and it's a great service, but when it's that much money, it occasions comment. 
It turned out that—we discovered, in fact, a data exfiltration in progress which made it now a rather clever security incident.And, “This call will now be ending for the day and we're going to go fix that. Thanks.” It's like I want a customer testimonial on that one, but for obvious reasons, we didn't get one. But that was probably the most shocking thing. The depressing thing that I see the most—and this is the core of the cost problem—is not when the numbers are high. It's when I ask about a line item that drives significant spend, and the customer is surprised.I don't like it when customers don't know what they're spending money on. If your service surprises customers when they realize what it costs, you have failed. Because a lot of things are expensive and customers know that and they're willing to take the value in return for the cost. That's fine. But tricking customers does not serve anyone well, even your own long-term interests. I promise.“Have I ever had to reject a potential client because they had a tangled mess that was impossible to tackle, or is there always a way?” It's never the technology that will cause us not to pursue working with a given company. What will is, like, if you go to our website at duckbillgroup.com, you're not going to see a ‘Buy Here' button where you ‘add one consulting, please' to your shopping cart and call it a day.It's a series of conversations. And what we will try to make sure is, what is your goal? Who's aligned with it? What are the problems you're having in getting there? And what does success look like? Who else is involved in this? And it often becomes clear that people don't like the current situation, but there's no outcome with which they would be satisfied.Or they want something that we do not do. For example, “We want you to come in and implement all of your findings.” We are advisory. We do not know the specifics of your environment and—or your deployment processes or the rest. We're not an engineering shop. We charge a fixed fee and part of the way we can do that is by controlling the scope of what we do. “Well, you know, we have some AWS bills, but we really want to—we really care about is our GCP bill or our Datadog bill.” Great. We don't focus on either of those things. I mean, I can just come in and sound competent, but that's not what adding value as a consultant is about. It's about being authoritatively correct. Great question, though.“How often do I receive GovCloud cost optimization requests? Does the compliance and regulation that these customers typically have keep them from making the needed changes?” It doesn't happen often and part of the big reason behind that is that when we're—and if you're in GovCloud, it's probably because you are a significant governmental entity. There's not a lot of private sector in GovCloud for almost every workload there. Yes, there are exceptions; we don't tend to do a whole lot with them.And the government procurement process is a beast. We can sell and service three to five commercial engagements in the time it takes to negotiate a single GovCloud agreement with a customer, so it just isn't something that we focused. We don't have the scale to wind up tackling that down. Let's also be clear that, in many cases, governments don't view money the same way as enterprise, which in part is a good thing, but it also means that, “This cloud thing is too expensive,” is never the stated problem. Good question.“Waffles or pancakes?” Is another one. I… tend to go with eggs, personally. 
It just feels like empty filler in the morning. I mean, you could put syrup on anything if you're bold enough, so if it's just a syrup delivery vehicle, there are other paths to go.And I believe we might have exhausted the question pool. So, I want to thank you all for taking the time to talk with me. Once again, I am Cloud Economist Corey Quinn. And this is a very special live episode of Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review wherever you can—or a thumbs up, or whatever it is, like and subscribe obviously—whereas if you've hated this podcast, same thing: five-star review, but also go ahead and leave an insulting comment, usually around something I've said about a service that you deeply care about because it's tied to your paycheck.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
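One practical footnote on that Route 53 story and the point about customers being surprised by their own line items: a recurring per-service pull from Cost Explorer is the cheap version of that habit, and it is often how an odd spike like that one first gets noticed. A minimal sketch; the threshold is a placeholder, and each Cost Explorer API call itself carries a small charge.

```python
# Pull month-to-date spend by service and flag anything over a threshold
# you can't explain. The threshold is a placeholder.
import boto3
from datetime import date, timedelta

THRESHOLD_USD = 500.0   # placeholder: the "worth a conversation" level

ce = boto3.client("ce")  # Cost Explorer

today = date.today()
start = today.replace(day=1)
if start == today:  # on the 1st, look at last month instead
    start = (start - timedelta(days=1)).replace(day=1)

response = ce.get_cost_and_usage(
    TimePeriod={"Start": start.isoformat(), "End": today.isoformat()},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for group in response["ResultsByTime"][0]["Groups"]:
    service = group["Keys"][0]
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    if amount >= THRESHOLD_USD:
        print(f"{service:55s} ${amount:>12,.2f}")
```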

Screaming in the Cloud
Using Empathy to Solve Customer Challenges with David Colebatch

Screaming in the Cloud

Play Episode Listen Later Sep 28, 2023 34:00


David Colebatch, CEO of Tidal, joins Corey on Screaming in the Cloud to discuss Tidal's recent shift to a product-led approach and why empathizing with customers is always their most important job. David describes what it was like to grow the company from scratch on a boot-strapped basis, and how customer feedback and challenges inform the company strategy. Corey and David discuss the cost-savings measures cloud customers are now embarking on, and David discusses how constant migrations are the new normal. Corey and David also discuss the impact that generative AI is having not just on tech, but also on creative content and interactions in our everyday lives. About David David is the CEO & Founder of Tidal.  Tidal is empowering businesses to transform from traditional on-premises IT-run organizations to lean-agile-cloud powered machines.Links Referenced: Company website: https://tidal.cloud LinkedIn: https://www.linkedin.com/in/david-colebatch/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Returning guest today, David Colebatch is still the CEO at Tidal. David, how have you been? It's been a hot second.David: Thanks, Corey. Yeah, it's been a fantastic summer for me up here in Toronto.Corey: Yeah, last time I saw you, was it New York or was it DC? They all start to run together to me.David: I think it was DC. Yeah.Corey: That's right. Public Sector Summit where everything was just a little bit stranger than most of my conversations. It's, “Wait, you're telling me there's a whole bunch of people who use the cloud but don't really care about money? What—how does that work?” And I say that not from the position of harsh capitalism, but from the position of we're a government; saving costs is nowhere in our mandate. Or it is, but it's way above my pay grade and I run the cloud and call it good. It seems like that attitude is evolving, but slowly, which is kind of what you want to see. Titanic shifts in governing are usually not something you want to see done on a whim, overnight.David: No, absolutely. A lot of the excitement at the DC summit was around new capabilities. And I was actually really intrigued. It was my first time in the DC summit, and it was packed, from the very early stages of the morning, great attendance throughout the day. And I was just really impressed by some of the new capabilities that customers are leveraging now and the new use cases that they're bringing to market. So, that was a good time for me.Corey: Yeah. So originally, you folks were focused primarily on migrations and it seems like that's evolving a little bit. You have a product now for starters, and the company's name is simply Tidal, without a second word. So, brevity is very much the soul of wit, it would seem. What are you doing these days?David: Absolutely. Yeah, you can find us at tidal.cloud. Yeah, we're focused on migrations as a primary means to help a customer achieve new capabilities. We're about accelerating their journey to cloud and optimizing once they're in cloud as well. 
Yeah, we're focused on identifying the different personas in an enterprise that are trying to take that cloud journey on with people like project, program managers, developers, as well as network people, now.Corey: It seems, on some level, like you are falling victim to the classic trap that basically all of us do, where you have a services company—which is how I thought of you folks originally—now, on some level, trying to become a product or a platform company. And then you have on the other side of it—places that we're—“Oh, we're a SaaS company. This is hard. We're going to do services instead.” And it seems like no one's happy. We're all cats, perpetually on the wrong side of a given door. Is that an accurate assessment for where you are? Or am I misreading the tea leaves on this one?David: A little misread, but close—Corey: Excellent.David: You're right. We bootstrapped our product company with services. And from day one, we supported our customers, as well as channel partners, many of the [larger size 00:03:20] that you know, we supported them in helping their customers be successful. And that was necessary for us as we bootstrapped the company from zero. But lately, and certainly in the last 12 months, it's very much a product-led company. So, leading with what customers are using our software for first, and then supporting that with our customer success team.Corey: So, it's been an interesting year. We've seen simultaneously a market correction, which I think has been sorely needed for a while, but that's almost been overshadowed in a lot of conversations I've had by the meteoric rise and hype around generative AI. Have you folks started rebranding everything with a fresh coat of paint labeled generative AI yet as it seems like so many folks have? What's your take on it?David: We haven't. You won't see a tidal.ai from us. Look, our thoughts are leveraging the technology as we always had to provide better recommendations and suggestions to our users, so we'll continue to embrace generative AI as it applies to specific use cases within our product. We're not going to launch a brand new product just around the AI theme.Corey: Yeah, but even that seems preferable to what a lot of folks are doing, which is suddenly pivoting their entire market positioning and then act, “Oh, we've been working in generative AI for 5, 10, 15 years,” in some cases. Google and Amazon most notably have talked about how they've been doing this for decades. It's, “Cool. Then why did OpenAI beat you all to the punch on this?” And in many cases, also, “You've been working on this for decades? Huh. Then why is Alexa so terrible?” And they don't really have a good talking point for that yet, but it's the truth.David: Absolutely. Yeah. I will say that the world changed with the OpenAI launch, of course, and we had a new way to interact with this technology now that just sparked so much interest from everyday people, not just developers. And so, that got our juices flowing and creativity mode as well. And so, we started thinking about, well, how can we recommend more to other users of our system as opposed to just cloud architects?You know, how can we support project managers that are, you know, trying to summarize where they're at, by leveraging some of this technology? 
And I'm not going to say we have all the answers for this baked yet, but it's certainly very exciting to start thinking outside the box with a whole new bunch of capabilities that are available to us.Corey: I tried doing some architecture work with Chat-Gippity—yes, that is how I pronounce it—and it has led me down the primrose path a little bit because what it says is often right. Mostly. But there are some edge-case exceptions of, “Ohh, it doesn't quite work that way.” It reminds me at some level of a junior engineer who doesn't know the answer, so they bluff. And that's great, but it's also a disaster.Because if I can't trust the things you tell me and you to call it out when you aren't sure on something, then I've got to second guess everything you tell me. And it feels like when it comes to architecture and migrations in particular, the devil really is in the details. It doesn't take much to design a greenfield architecture on a whiteboard, whereas being able to migrate something from one place to another and not have to go down in the process? That's a lot of work.David: Absolutely. I have used AI successfully to do a lot of research very quickly across broad market terms and things like that, but I do also agree with you that we have to be careful using it as the carte blanche force multiplier for teams, especially in migration scenarios. Like, if you were to throw Chat-Gippity—as you say—a bunch of COBOL code and say, “Hey, translate this,” it can do a pretty good job, but the devil is in that detail and you need to have an experienced person actually vet that code to make sure it's suitable. Otherwise, you'll find yourself creating buggy things downstream. I've run into this myself, you know, “Produce some Terraform for me.” And when I generated some Terraform for an architecture I was working on, I thought, “This is pretty good.” But then I realized, it's actually two years old and that's about how old my skills were as well. So, I needed to engage someone else on my team to help me get that job done.Corey: So, migrations have been one of those things that people have been talking about for well, as long as we've had more than one data center on the planet. “How do we get our stuff from over here to over there?” And so, on and so forth. But the context and tenor of those conversations has changed dramatically. What have you seen this past year or so as far as emerging trends? What is the industry doing that might not be obvious from the outside?David: Well, cost optimization has been number one on people's minds, and migrating with financial responsibility in mind has been refreshing. So, working backwards from what their customer outcomes are is still number one in our book, and when we see increasingly customers say, “Hey, I want to migrate to cloud to close a data center or avoid some capital outlay,” that's the first thing we hear, but then we work backwards from what was their three-year plan. And then what we've seen so far is that customers have changed from a very IT-centric view of cloud and what they're trying to deliver to much more business-centric. Now, they'll say things like, “I want to be able to bring new capabilities to market more quickly. I want to be able to operate and leverage some of these new generative AI technologies.” So, they actually have that as a driving force for migrations, as opposed to an afterthought.Corey: What I have found is that, for whatever reason, not giving a shit about the AWS bill in my business was a zero-interest-rate phenomenon. 
Suddenly people care an awful lot. But they're caring is bounded. If there's a bunch of easy stuff to do that saves a giant pile of money, great, yeah, most folks are going to do that. But then it gets into the idea of opportunity cost and trade-offs. And there's been a shift there that I've seen where people are willing to invest more in that cost-cutting work than they were in previous years.It makes sense, but it's also nice to finally have a moment to validate what I've assumed for seven years now that, yeah, in a recession or a retraction of the broader industry, suddenly, this is going to be top-of-mind for a lot of folks. And it's nice to see that that approach was vindicated because the earlier approach that I saw when we saw something like this was at the start of Covid. And at that point, no one knew what was happening week-to-week and consulting leads basically stopped for six months. And that was oh, maybe we don't have a counter-cyclical business. But no, it turns out that when money means something again as interest rates rise, people care about it more.David: Yeah. It is nice to see that. And people are trying to do more with less and become more efficient in an advanced pace these days. I don't know about you, but I've seen the trends towards the low-hanging fruit being done at this point so people have already started using savings plans and capabilities like that, and now they're embarking in more re-architecture of applications. But I think one stumbling block that we've noticed is that customers are still struggling to know where to apply those transformations across their portfolio. They'll have one or two target apps that everybody knows because they're the big ones on the bill, but beneath that, the other 900 applications in their portfolio, which ones do I do next? And that's still a question that we're seeing come up, time and again.Corey: One thing that I'm starting to see people talking about from my perspective, has been suddenly they really care about networking in a way that they did not previously. And I mean, this in the TCP/IP sense, not the talking to interesting people and doing interesting things. That's been basically steady-state for a while. But from my perspective, the conversations I'm having are being driven by, “Wait a minute. AWS is going to start charging $3.50 a month per assigned IPV4 address. Oh, dear. We have been careless in our approach to this.” Is that something that you're seeing shaping the conversations you're having with folks?David: Oh yeah, absolutely. I mean right off the bat, our team went through very quickly and inventoried our IPV4, and certainly, customers are doing that as well. I found that, you know, in the last seven years, the migration conversations were having become broader across an enterprise customer. So, we've mapped out different personas now, and the networking teams playing a bigger role for migrations, but also optimizations in the cloud. And I'll give you an example.So, one large enterprise, their networking team approached us at the same time as their cloud architects who were trying to work on a migration approached us. And the networking team had a different use case. They wanted to inventory all the IP addresses on-premises, and some that they already had in the cloud. So, they actually leveraged—shameless plug here—but they leveraged out a LightMesh IPAM solution to do that. 
And what that brought to light for us was that the integration of these different teams working together now, as opposed to working around each other. And I do think that's a bit of a trend change for us.Corey: IPAM has always been one of those interesting things to me because originally, the gold standard in this space was—let's not kid ourselves—a Microsoft Excel spreadsheet. And then there are a bunch of other offerings that entered into the space. And for a while I thought most of these were ridiculous because the upgrade was, you know, Google Sheets so you can collaborate. But having this done in a way with particular permissions and mapping in a way that's intuitive and doesn't require everyone to not mess up when they're looking at it, especially as you get into areas of shared responsibility between different divisions or different team members who are in different time zones and whatnot, this becomes a more and more intractable problem. It's one of those areas where small, scrappy startups don't understand what the fuss is about, and big enterprises absolutely despair of finding something that works for them.AWS launched their VPC IPAM offering a while back and if you look at it from the perspective of competing with Google Sheets, its pricing is Looney Tunes. But I've met an awful lot of people who have sworn by it in the process, as they look at these things. Now, of course, the caveat is that like most AWS offerings, it's great in a pure AWS native environment, but as soon as you start getting into other providers and whatnot, it gets very tricky very quickly.David: No, absolutely. And usability of an IP address management solution is something to consider. So, you know, if you're trying to get on board with IPAM, do you want to do three easy steps or do you want to follow 150? And I think that's a really big barrier to entry for a lot of networking teams, especially those that are not too familiar with cloud already. But yeah, where we've seen the networking folks get more involved is around, like, identifying endpoints and devices that must be migrated to cloud, but also managing those subnets and planning their VPC designs upfront.You've probably seen this before yourself where customers have allocated a whole bunch of address space over time—an overlapping address space, I should say—only to then later want to [peer 00:13:47] those networks. And that's something that if you think you're going to be doing downstream, you should really plan for that ahead of time and make sure your address space is allocated correctly. Problems vary. Like, everyone's architecture is different, of course, but we've certainly noticed that being one of the top-button items. And then that leads into a migration itself. You're not migrating to cloud now; you're migrating within the cloud and trying to reorganize address spaces, which is a whole other planning activity to consider.Corey: When you take a look at, I guess the next step in these things, what's coming next in the world of migrations? I recently got to talk to someone who was helping their state migrate from, effectively, mainframes in many cases into a cloud environment. And it seems, on some level, like everyone on a mainframe, one, is very dependent on that workload; those things are important, so that's why they're worth the extortionate piles of money, but it also feels like they've been trying to leave the mainframe for decades in many cases. 
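Picking back up on the overlapping-address-space point David raised above: the overlap check itself is easy to automate long before any peering work starts. This is a minimal sketch under stated assumptions—it is not Tidal tooling, the CIDR list is invented, and in practice the list would come from an IPAM export or from describe_vpcs across accounts.

from ipaddress import ip_network
from itertools import combinations

def find_overlaps(cidrs):
    # Return every pair of CIDR blocks that would collide if the VPCs were peered.
    networks = [ip_network(c) for c in cidrs]
    return [(str(a), str(b)) for a, b in combinations(networks, 2) if a.overlaps(b)]

if __name__ == "__main__":
    # Hypothetical allocations gathered from two business units and a legacy range.
    allocations = ["10.0.0.0/16", "10.0.128.0/17", "10.1.0.0/16", "172.31.0.0/16"]
    for a, b in find_overlaps(allocations):
        print(f"Overlap: {a} and {b} -- re-address one side before peering")

Catching overlaps at this stage is what planning the VPC design upfront amounts to in practice: an overlap found on the whiteboard is a renumbering exercise, while one found after workloads ship becomes the second migration David mentions.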
Now, there's a sense that for a lot of these folks, the end is nigh for their mainframe's lifespan, so they're definitely finally taking the steps to migrate. What's the next big frontier once the, I guess, either the last holdouts from that side of the world wind up getting into a cloud or decide they never will? It always felt to me like migrations are one of those things that's going to taper off and it's not going to be something that is going to be a growth industry because the number of legacy workloads is, at least theoretically, declining. Not so sure that's accurate, though.David: I don't think it is either. If we look back at past migrations, you know, 90, 95% of them are often lift-and-shift to EC2 or x86 on VMware in the cloud. And a lot of the work that we're seeing now is being described as optimization. Like, “Look at my EC2 workloads and come up with cloud-native or transformative processes for me.” But those are migrations as well because we run the same set of software, the same processes over those workloads to determine how we can re-platform and refactor them into more native services.So, I think, you know, the big shift for us is just recognizing that the term ‘migrations' needs to be well-defined and communicated with folks. Migrations are actually constant now and I would argue we're doing more migrations within customers now than we have in the past because the rate of change is just so much faster. And I should add, on the topic of mainframe and legacy systems, we have seen this pivot away from teams looking for emulation layers for those technologies, you know, where they want to forklift the functionality, but they don't want to really roll up their sleeves and do any coding work. So, they're previously looking to automatically translate code or emulate that compute layer in the cloud, and the big pivot we've seen in the last 12 months, I'd say, is that customers are more willing to actually understand how to rebuild their applications in the cloud. And that's a fantastic story because it means they're not kicking that technology debt can down the road any further. They're really trying to embrace cloud and leverage some of these new capabilities that have come to market.Corey: What do you see as, I guess, the reason that a number of holdouts have not yet done a migration? Like, historically, I've seen some that are pretty obvious: the technology wasn't there. Well, cloud has gotten to a point now where it is hard to identify a capability that isn't there in some form. And there's always been the sunk cost fallacy where, “Well, we've already bought all this stuff, and it's running here, so if we're not replacing it anytime soon, there's no cost benefit for us to replace it.” And that's actually correct. That's not a fallacy there. But there's also the, “Well, it would be too much work to move.” Sometimes true, sometimes not. Are you seeing a shift in the reasons that people are giving to not migrate?David: No, I haven't. It's been those points mostly. And I'd say one of the biggest inhibitors to people actually getting it done is this misconception that it costs a lot of money to transform and to adopt cloud tools. You've seen this through the technology keeps getting easier and easier to adopt and cheaper to use. 
When you can provision services for $0 a month and then scale with usage patterns, there's really no reason not to try today because the opportunity cost is so low.So, I think that one of the big inhibitors that comes up, though, is this cultural barrier within organizations where teams haven't been empowered to try new things. And that's the one thing that I think is improving nowadays, as more of this how-to-build-in-the-cloud capability becomes permeated throughout the organization. People are saying, “Well, why can't we do that?” As opposed to, “We can't do that.” You know what I mean? It's a subtle difference, but once leadership starts to say, “Why can't we do this modern thing in the cloud? Why can't we leverage AI?” Teams are given more rope to try and experiment, and fail, of course. And I think ultimately, that culture shift is starting to take root across enterprise and across public sector as well.Corey: One of the things that I find surprising is the enthusiasm with which different market segments jump onto different aspects of cloud. Lambda is a classic example, in that it might be one of the services that is more quickly adopted by enterprises than by startups and a lot of cases. But there's also the idea of, “Oh, we built this thing last night, and it's awesome.” And enterprises, like you know, including banks and insurance companies don't want to play those games, for obvious reasons.Generative AI seems to be a mixed bag around a lot of these things. Have you had conversations with a number of your clients around the generative AI stuff? Because I've seen Amazon, for example, talking about it, “Oh, all our customers are asking us about it.” And, mmm, I don't know. Because I definitely have questions about and I'm exploring it, but I don't know that I'm turning to Amazon, of all companies, to answer those questions, either.David: Yeah. We've certainly had customer conversations about it. And it depends, again, on those personas. On the IT side, the conversations are mostly around how can they do their jobs better. They're not thinking forwards about the business capabilities. So, IT comes to us and they want to know how can we use generative AI to create Lambda functions and create stateless applications more quickly as a part of a migration effort. And that's great. That's a really cool use case. We've used that generative AI approach to create code ourselves.But on the business side, they're looking forwards, they want to use generative AI in the, again, the sample size of my customer conversations, but they see that the barrier to entry is getting their data in a place that they can leverage it. And to them, to the business, that's what's driving the migration conversations they're having with us, is, “How do I exfil my data and get it into the cloud where I can start to leverage these great AI tools?”Corey: Yeah, I'm still looking at use cases that I think are a little less terrifying. Like, I want to wind up working on a story or something. Or I'll use it to write blog posts; I have a great approach. It's, “Write a blog post about this topic and here are some salient points and do it in the style of Corey Quinn.” I'll ask Chat-Gippity to do that and it spits out something that is, frankly, garbage.And I get angry at it and I basically copy it into a text editor and spent 20 minutes mansplain-correcting the robot. And by the time it's done, I have, like, a structure of an article that talks about the things I want to talk about correctly. 
And there may be three words in a sequence that were originally there. And frankly, I'm okay with plagiarizing from the thing that is plagiarizing from me. It's a beautiful circle of ripping things off that that's glorious for me.But that's also not something that I could see being useful at any kind of scale, where I see companies getting excited about a lot of this stuff, it all seems to be a thin veneer over, “And then we can fire our customer service people,” which from a labor perspective is not great, but ignoring that entirely, as a customer, I don't want that. Because by the time I have to reach out to a company's customer service apparatus, something has gone wrong and it isn't going to be solved by the standard list of frequently asked questions that I clicked on. It's something that is off the beaten path and anomalous and requires human judgment. Making it harder for me to get to people who can fix those things does not thrill and delight me.David: I agree. I'm with you there. Where I get excited about it, though, is how much of a force multiplier it can be on that human interaction. So, for example, in that customer's service case you mentioned, you know, if that customer service rep is empowered by an AI dashboard that's listening to my conversation and taking notes and automatically looking up in my knowledge base how to support that customer, then that customer success person can be more successful more quickly, I think they can be more responsive to customer needs and maybe improve the quality, not just the volume of work they do but improve the quality, too.Corey: That's part of the challenge, too. There have been a number of companies that have gotten basically rapped across the snout for just putting out articles as content, written by AI without any human oversight. And these don't just include, you know, small, scrappy content mills; they include Microsoft, and I believe CNN, if I'm not mistaken, had something similar with that going on. I'm not certain on that last one. I don't want to defame them, but I know for a fact Microsoft did.David: Yeah, and I think some of the email generators are plugging into AI now, too, because my spam count has gone through the roof lately.Corey: Oh, my God. I got one recently saying, “Hey, I noticed at The Duckbill Group that you fix AWS bills. Great. That's awesome and super valuable for your clients.” And then try to sell me bill optimization and process improvement stuff. And it was signed by the CEO of the company that was reaching out.And then there was like—I expand the signature view, and it's all just very light gray text make it harder to read, saying, “This is AI generated, yadda, yadda, yadda.” Called the company out on Twitter, and they're like, “Oh, we only have a 0.15% error rate.” That sounds suspiciously close to email marketing response rates. “Welp, that means 99% of it was perfect.” No, it means that you didn't get in front of most of those people. They just ignored it without reading it the way we do most email outreach. So, that bugs me a fair bit. Because my perspective on it is if you don't care enough to actually craft a message to send me, why should I care enough to read it?David: Completely agree. I think a lot of people are out there looking for that asymmetric, you know, leverage that you can get over the market, and generating content, to them, has been a blocker for so long and now they're just opening up the fire hose and drowning us all with it. So I'm, like, with you. 
I think that I personally don't expect to get value back from someone unless I put value into that relationship. That's my starting point coming into it, so I would maybe use AI to help assist forming a message to someone, but I'm not going to blast the internet with content. I just think that's a cheeky low-value way to go about it.Corey: I don't track the numbers anymore, but I know that at this point, through the size of my audience and the content that I put out, I have taken, collectively, millennia of human time focusing on—that has been spent consuming the content that I put out. And as a result of that, I have a guiding principle here, which is first and foremost, you've got to respect your audience. And I'm just going to have a robot phone it in is not respecting your audience. I have no problem with AI assistants, but it requires human oversight before it goes out. I would never in a million years send anything out to the audience that I hadn't at least read or validated first.But yeah, some of the signups that go out, the automatic things that you click a button and sign up for my newsletter at lastweekinaws.com, you get an auto message that comes out. Yeah, it comes out under my name and I either wrote it or reviewed it, depending on what generation of system we're on these days, because it has my name attached to it. That's the way that this works. Your credibility is important and having a robot spout off complete nonsense and you get the credit or blame for it? No thanks. I want to be doomed from my own sins, not the ones that a computer makes on my behalf.David: [laugh]. Yeah, I'm with you. It's unfortunate that so many people expect the emails from you are generated now. We have the same thing when people sign up for Tidal Accelerator or Tidal LightMesh, they get a personal email from me. They'll get the automated one as well, but I generally get in there through our CRM, and I send them a message, too. And sometimes they'll respond and say, “This isn't really David, is it?” No, no, it's me. You don't have to respond. I wanted to let you know that I'm thankful for you trialing our software.Corey: Oh, yeah. You can hit reply to any email I send out. It comes from corey@lastweekinaws.com and it goes to my inbox. The reason that works, frankly, at this scale is because no one does it. People don't believe that that'll actually work. So, on a busy week, I'll get maybe a dozen email replies to it or one or two misconfigured bounces from systems that aren't set up properly to do those things. And I weed those out because they drive me nuts.But it's a yeah, the only emails that I get to that address, honestly, are the test copies of those messages that go out, too, because I'm on my own newsletter list. Who knew? I have two at the moment. I have—yes, I have two specific addresses on that, so I guess technically, I'm inflating the count of subscribers by two, if advertisers ask. But you know, at 32,000 and change, I will take the statistical fudging.David: Absolutely. We all expect that.Corey: No, the depressing part, when I think about that is, there's a number of readers I have on the list that I know for a fact that I've been acquainted with who have passed away. They're never going to unsubscribe from these things until the email starts bouncing at some and undefinable point in the future. But it's also—it feels morbid, but on some level, if I continue doing this for the rest of my life, I'm going to have a decent proportion of the subscriber base who's died. 
At least when people leave their jobs, like, their email address gets turned off, things start bouncing and cool that gets turned off automatically because even when people leave voluntarily, no one bothers to go through an unsubscribe from all this stuff. So, automated systems have to do it. That's great. I'm not saying computers shouldn't make life better. I am saying that they can't replace a fundamental aspect of human caring.David: So, Corey Quinn, who has influence over the living and the dead. It's impressive.Corey: Oh, absolutely. Honestly, if I were to talk to whoever came up with IBM's marketing strategy, I feel like I'd need to conduct a seance because they're probably 300 years old if they're still alive.David: [laugh]. Absolutely.Corey: No, I get passionate about this stuff because so much of a lot of the hype now has been shifting away from letting people expand their reach further and doing things in intentional ways and instead toward absolute garbage, such as, “Cool, we want to get a whole bunch of clicks so we can show ads to them, so we're going to just generate all bunch of crap to your content and throw it out there.” Everything I write, even stuff that admittedly, from time to time, is aimed for SEO purposes for specific things that we're doing, but that's always done from a perspective of okay, my primary SEO strategy is write compelling, original content and then people presumably link to it. And it works. It's about respecting the audience and so many things get that wrong.David: Yeah, absolutely. It's kind of scary now because I always thought that podcasts and video were the last refuge of authentic content. And now people are generating that as well. You know, you're watching a video and you realize hey, that voice sounds exactly consistent, you know, all the way through. And then it turns out, it's generated. And there's a YouTube channel I follow because I'm an avid sailor, called World On Water. And recently, I've noticed that voice changed, and I'm pretty sure they're using AI to generate it now.Corey: Here's a story I don't think you probably know about yourself. So, for those who are unaware, David, I hang out from time to time in various places. There's a international boundary between us, but occasionally one of us will broach it, and good for us. And we have social conversations where somehow one of us doesn't have a microphone in front of our face. Imagine that. I don't know what that's like most weeks.And like, at some level, the public face comes off and people start acting like human beings. And something I've always noticed about you, David, is that you don't commit the cardinal sin, for an awful lot of people I meet, which is displaying contempt for your customers. When I have found people who do that, I think less of them in almost every case and I lose so much interest in whatever it is that they're doing. If you don't like the problem space that you're in and don't have respect for the people paying you to make these problems go away, you shouldn't be doing it. Like, I'll laugh at silly AWS misconfigurations, but my customers are there because they have a problem and they're bringing me in to fix it. And would I be making fun of? “Ha ha ha, you didn't spend eight months of your life learning the ins and outs of how exactly reserved instances apply in this particular context? What a fool is you.” That's not how it ever works. 
I wish I could say it wasn't quite as rare as it is but I'm tired of talking to people who have just nothing but contempt for their market. Good work on that.David: Thank you. Yeah, I appreciate that. You know, I had a penny-drop moment when I was doing a lot of consulting work as an independent contractor, working with different customers at different stages of their own journey and different levels of technology capabilities. You know, you work with management, with project people, with software engineers, and you start to realize everybody's coming from a different place. So, you have to empathize with where they're at.They're coming to you usually because you have a level of expertise, that you've got some specialization and they want to tap into that capability that you've created. And that's great. I love having people come to me and ask me questions. Sometimes they don't come to me nicely asking questions, they make some assumptions about me and might challenge me right off the bat, but you have to realize that that's just where they're coming from at that point in time. And once you connect with them, they'll open up a little bit more, too; they'll empathize with yourself. So yeah, I've always found that it's really important for myself personally, but also for our team to empathize with customers, meet them where they're at, understand that they're coming from a different level of experience, and then help them solve their problems. That's job number one.Corey: And I'm a firm believer that if you don't respect your customer's business, they shouldn't be your customer. It's happened remarkably few times in the however many years I've been doing this, but there have been a couple of folks that have reached out I always very politely decline to work with them when this happens. Because you don't want to make people feel obnoxious for reaching out and, like, “Can you help me with my problem?” “How dare you? Who do you think you are?”No, no, no, no, no, none of that. But if there's a value misalignment or I don't think that your product is going to benefit people who use it as directed, I will not let you sponsor what I do as an easy example. Because I can always find another sponsor and make more money, but once I start losing the audience's trust, I'll never get that back, and I know that. It's the entire reason I do things the way that I do them. And maybe, on some level, from purely capitalist perspective, I'm being an absolute fool, but you know, if you have to pick a way to fail and assume you're going to get it wrong, how do you want to be wrong? I'll take this way.David: Yeah, I agree. Keep your ethics high, keep your morals high, and the rest will fall into place.Corey: I love how we started having ethical and morality discussions that started as, “So, cloud migrations. How are they going for you?”David: Yeah [laugh]. Certainly wandered into some uncharted territories on that one.Corey: Exactly. We started off in one place; wound up someplace completely removed from anything we could have reasonably expected at the start. Why? Because this entire episode has been a beautiful metaphor for cloud migrations. I really want to thank you for taking the time to chat with me on this stuff. If people want to learn more, where should they go to find you?David: tidal.cloud or LinkedIn, I'm very active on LinkedIn these days.Corey: And we will, of course, put links to both of those in the show notes. Thank you so much for going down this path with me. 
I didn't expect it to lead where it did, but I'm glad we went there.David: Like the tides ebbing and flowing. I'll be back soon, Corey.Corey: [laugh]. I will take you up on that and hold you to it.David: [laugh]. Sounds great.Corey: David Colebatch, CEO at Tidal. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, upset comment that doesn't actually make cohesive sense because you outsourced it to a robot.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
How AWS Educates Learners on Cloud Computing with Valerie Singer

Screaming in the Cloud

Play Episode Listen Later Sep 26, 2023 35:56


Valerie Singer, GM of Global Education at AWS, joins Corey on Screaming in the Cloud to discuss the vast array of cloud computing education programs AWS offers to people of all skill levels and backgrounds. Valerie explains how she manages such a large undertaking, and also sheds light on what AWS is doing to ensure their programs are truly valuable both to learners and to the broader market. Corey and Valerie discuss how generative AI is applicable to education, and Valerie explains how AWS's education programs fit into a K-12 curriculum as well as job seekers looking to up-skill.

About Valerie
As General Manager for AWS's Global Education team, Valerie is responsible for leading strategy and initiatives for higher education, K-12, EdTechs, and outcome-based education worldwide. Her Skills to Jobs team enables governments, education systems, and collaborating organizations to deliver skills-based pathways to meet the acute needs of employers around the globe, match skilled job seekers to good-paying jobs, and advance the adoption of cloud-based technology.
In her ten-year tenure at AWS, Valerie has held numerous leadership positions, including driving strategic customer engagement within AWS's Worldwide Public Sector and Industries. Valerie established and led the AWS's public sector global partner team, AWS's North American commercial partner team, was the leader for teams managing AWS's largest worldwide partnerships, and incubated AWS's Aerospace & Satellite Business Group. Valerie established AWS's national systems integrator program and promoted partner competency development and practice expansion to migrate enterprise-class, large-scale workloads to AWS.
Valerie currently serves on the board of AFCEA DC where, as the Vice President of Education, she oversees a yearly grant of $250,000 in annual STEM scholarships to high school students with acute financial need.
Prior to joining AWS, Valerie held senior positions at Quest Software, Adobe Systems, Oracle Corporation, BEA Systems, and Cisco Systems. She holds a B.S. in Microbiology from the University of Maryland and a Master in Public Administration from the George Washington University.

Links Referenced:
AWS: https://aws.amazon.com/
GetIT: https://aws.amazon.com/education/aws-getit/
Spark: https://aws.amazon.com/education/aws-spark/
Future Engineers: https://www.amazonfutureengineer.com/
code.org: https://code.org
Academy: https://aws.amazon.com/training/awsacademy/
Educate: https://aws.amazon.com/education/awseducate/
Skill Builder: https://skillbuilder.aws/
Labs: https://aws.amazon.com/training/digital/aws-builder-labs/
re/Start: https://aws.amazon.com/training/restart/
AWS training and certification programs: https://www.aws.training/

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. A recurring theme of this show in the, what is it, 500 some-odd episodes since we started doing this many years ago, has been around where does the next generation come from. And ‘next generation' doesn't always mean young folks graduating school or whatnot.
It's people transitioning in, it's career changers, it's folks whose existing jobs evolve into embracing the cloud industry a lot more readily than they have in previous years. My guest today arguably knows that better than most. Valerie Singer is the GM of Global Education at AWS. Valerie, thank you for agreeing to suffer my slings and arrows. I appreciate it.Valerie: And thank you for having me, Corey. I'm looking forward to the conversation.Corey: So, let's begin. GM, General Manager is generally a term of art which means you are, to my understanding, the buck-stops-here person for a particular division within AWS. And Global Education sounds like one of those, quite frankly, impossibly large-scoped type of organizations. What do you folks do? Where do you start? Where do you stop?Valerie: So, my organization actually focuses on five key areas, and it really does take a look at the global strategy for Amazon Web Services in higher education, research, our K through 12 community, our community of ed-tech providers, which are software providers that are specifically focused on the education sector, and the last plinth of the Global Education Team is around skills to jobs. And we care about that a lot because as we're talking to education providers about how they can innovate in the cloud, we also want to make sure that they're thinking about the outcomes of their students, and as their students become more digitally skilled, that there is placement for them and opportunities for them with employers so that they can continue to grow in their careers.Corey: Early on, when I was starting out my career, I had an absolutely massive chip on my shoulder when it came to formal education. I was never a great student for many of the same reasons I was never a great employee. And I always found that learning for me took the form of doing something and kicking the tires on it, and I had to care. And doing rote assignments in a ritualized way never really worked out. So, I never fit in in academia. On paper, I still have an eighth-grade education. One of these days, I might get the GED.But I really had problems with degree requirements in jobs. And it's humorous because my first tech job that was a breakthrough was as a network administrator at Chapman University. And that honestly didn't necessarily help improve my opinion of academia for a while, when you're basically the final tier escalation for support desk for a bunch of PhDs who are troubled with some of the things that they're working on because they're very smart in one particular area, but have challenges with broad tech. So, all of which is to say that I've had problems with the way that education historically maps to me personally, and it took a little bit of growth for me to realize that I might not be the common, typical case that represents everyone. So, I've really come around on that. What is the current state of how AWS views educating folks? You talk about working with higher ed; you also talk about K through 12. 
Where does this, I guess, pipeline start for you folks?Valerie: So, Amazon Web Services offers a host of education programs at the K-12 level where we can start to capture learners and capture their imagination for digital skills and cloud-based learning early on, programs like GetIT and Spark make sure that our learners have a trajectory forward and continue to stay engaged.Amazon Future Engineers also provides experiential learning and data center-based experiences for K through 12 learners, too, so that we can start to gravitate these learners towards skills that they can use later in life and that they'll be able to leverage. That said—and going back to what you said—we want to capture learners where they learn and how they learn. And so, that often happens not in a K through 12 environment and not in a higher education environment. It can happen organically, it can happen through online learning, it can happen through mentoring, and through other types of sponsorship.And so, we want to make sure that our learners have the opportunities to micro-badge, to credential, and to experience learning in the cloud particularly, and also develop digital skills wherever and however they learn, not just in a prescriptive environment like a higher education environment.Corey: During the Great Recession, I found that as a systems administrator—which is what we called ourselves in the style of the time—I was relatively weak when it came to networking. So, I took a class at the local community college where they built the entire curriculum around getting some Cisco certifications by the time that the year ended. And half of that class was awesome. It was effectively networking fundamentals in an approachable, constructive way, and that was great. The other half of the class—at least at the time—felt like it was extraordinarily beholden to, effectively—there's no nice way to say this—Cisco marketing.It envisioned a world where all networking equipment was Cisco-driven, using proprietary Cisco protocols, and it left a bad smell for a number of students in the class. Now, I've talked to an awful lot of folks who have gone through the various AWS educational programs in a variety of different ways and I've yet to hear significant volume of complaint around, “Oh, it's all vendor captured and it just feels like we're being indoctrinated into the cult of AWS.” Which honestly is to your credit. How did you avoid that?Valerie: It's a great question, and how we avoid it is by starting with the skills that are needed for jobs. And so, we actually went back to employers and said, “What are your, you know, biggest and most urgent needs to fill in early-career talent?” And we categorized 12 different job categories, the four that were most predominant were cloud support engineer, software development engineer, cyber analyst, and data analyst. And we took that mapping and developed the skills behind those four different job categories that we know are saleable and that our learners can get employed in, and then made modifications as our employers took a look at what the skills maps needed to be. 
We then took the skills maps—in one case—into City University of New York and into their computer science department, and mapped those skills back to the curriculum that the computer science teams have been providing to students.And so, what you have is, your half-awesome becomes full-awesome because we're providing them the materials through AWS Academy to be able to proffer the right set of curriculum and right set of training that gets provided to the students, and provides them with the opportunity to then become AWS Certified. But we do it in a way that isn't all marketecture; it's really pragmatic. It's how do I automate a sequence? How do I do things that are really saleable and marketable and really point towards the skills that our employers need? And so, when you have this book-end of employers telling the educational teams what they need in terms of skills, and you have the education teams willing to pull in that curriculum that we provide—that is, by the way, current and it maintains its currency—we have a better throughway for early-career talent to find the jobs that they need, and the guarantee that the employers are getting the skills that they've asked for. And so, you're not getting that half of the beholden that you had in your experience; you're getting a full-on awesome experience for a learner who can then go and excite himself and herself or theirself into a new position and career opportunity.Corey: One thing that caught me a little bit by surprise, and I think this is an industry-wide phenomenon is, whenever folks who are working with educational programs—as you are—talk about, effectively, public education and the grade school system, you refer to it as ‘K through 12.' Well, last year, my eldest daughter started kindergarten and it turns out that when you start asking questions about cloud computing curricula to a kindergarten teacher, they look at you like you are deranged and possibly unsafe. And yeah, it turns out that for almost any reasonable measure, exposing—in my case—a now six-year-old to cloud computing concepts feels like it's close cousins to child abuse. So—Valerie: [laugh].Corey: So far, I'm mostly keeping the kids away from that for now. When does that start? You mentioned middle school a few minutes ago. I'm curious as to—is that the real entry point or are there other ways that you find people starting to engage at earlier and earlier ages?Valerie: We are seeing people engage it earlier and earlier ages with programs like Spark, as I mentioned, which is more of a gamified approach to K through 12 learning around digital skills in the cloud. code.org also has a tremendous body of work that they offer K through 12 learners. That's more modularized and building block-based so that you're not asking a six-year-old to master the art of cloud computing, but you're providing young learners with the foundations to understand how the building blocks of technology sit on top of each other to actually do something meaningful.And so, gears and pulleys and all kinds of different artifacts that learners can play with to understand how the inner workings of a computer program come together, for instance, are really experientially important and foundationally important so that they understand the concepts on which that's built later. So, we can introduce these concepts very early, Corey, and kids really enjoy playing with those models because they can make things happen, right? 
They can make things turn and they can make things—they can actually, you know, modify behaviors of different programming elements and really have a great experience working in those different programs and environments like code.org and Spark.
Corey: There are, of course, always exceptions to this. I remember—I think it was the 2019 public sector summit that you folks put on—you had a speaker, Karthick Arun, who at the time was ten years old and was the youngest person to pass the certification test to become a cloud practitioner. I mean, power to him. Obviously, that is the sort of thing that happens when a kid has passion and is excited about a particular direction. I have not inflicted that on my kids. I'm not trying to basically raise whatever the cloud computing sad version is of an Olympian by getting them into whatever it is that I want them to focus on before they have any agency in the matter. But I definitely remember when I was a kid, I was always frustrated by the fact that it felt like there were guardrails keeping me from working with any of these things that I found interesting and wanted to get exposure to. It feels like in many ways the barriers are coming down.
Valerie: They are. In that particular example, actually, Andy Jassy interceded because we did have age requirements at that time for taking the exam.
Corey: You still do, by the way. It even applies to attending summits and whatnot. So, you have to be 18, but at some point, I will be looking into what exceptions have to happen for that because I'm not there to basically sign them up for the bar crawl or have them get exposure to, like, all the marketing stuff, but if they're interested in this, it seems like the sort of thing that should be made more accessible.
Valerie: We do bring learners on, you know, into re:Invent and into our summits. We definitely invite our learners in. I mean, I think you mentioned, there are a lot of other places our learners are not going to go, like bar crawls, but our learners under the age of 18 can definitely take advantage of the programs that we have on offer. AWS Academy is available to 16 and up. And again, you know, GetIT and Spark and Educate are all available to learners as well. We also have programs like Skill Builder, with an enormous free tier of learning modules that teams can take advantage of as well. And then Labs for subscription and fee-based access. But there's over 500 courses in that free tier currently, and so there's plenty of places for our, you know, early learners to play and to experiment and to learn.
Corey: This is a great microcosm of some career advice I recently had cause to revisit, which is: make friends in different parts of the organization you work within and get to know people in other companies who do different things, because you can't reason with policy; you can have conversations productively with human beings. And I was basing my entire “You must be 18 or you're not allowed in, full stop” stance solely on a sign that I saw at the entrance when I was attending a summit: “You must be 18 to enter.” Ah. Clearly, there's no wiggle room here, and no—it's an across-the-board, absolute hard-and-fast rule. Very few things are. This is a perfect example of that. So today, I learned. Thank you.
Valerie: Yeah. You're very welcome. We want to make sure that we get the information, we get materials, we get experiences out to as many people as possible.
One thing I would also note, and I had the opportunity to spend time in our skill centers, and these are really great places, too, for early learners to get experience and exposure to different models. And so earlier, when we were talking, you held up a DeepRacer car, which is a very, very cool, smaller-scale car that learners can use AI tools to help to drive.And learners can go into the skill centers in Seattle and in the DC area, now in Cape Town and in other places where they're going to be opening, and really have that, like, direct-line experience with AWS technology and see the value of it tangibly, and what happens when you for instance, model to move a car faster or in the right direction or not hitting the side of a wall. So, there's lots of ways that early learners can get exposure in just a few ways and those centers are actually a really great way for learners to just walk in and just have an experience.Corey: Switching gears a little bit, one of my personal favorite hobby horses is to go on Twitter—you know, back when that was more of a thing—and mock companies for saying things that I perceived to be patently ridiculous. I was gentle about it because I think it's a noble cause, but one of the more ridiculous things that I've heard from Amazon was in 2020, you folks announced a plan to help 29 million people around the world grow their tech skills by 2025. And the reason that I thought that was ridiculous is because it sounded like it was such an over-the-top, grandiose vision, I didn't see a way that you could possibly get anywhere even close. But again, I was gentle about this because even if you're half-wrong, it means that you're going to be putting significant energy, resourcing, et cetera, into educating people about how this stuff works to help lowering bar to entry, about lowering gates that get kept. I have to ask, though, now that we are, at the time of this recording, coming up in the second half of 2023, how closely are you tracking to that?Valerie: We're tracking. So, as of October, which is the last time I saw the tracking on this data, we had already provided skills-based learning to 13-and-a-half million learners worldwide and are very much on track to exceed the 2025 goal of 29 million. But I got to tell you, like, there's a couple of things in there that I'm sure you're going to ask as a follow-up, so I'll go ahead and talk about it practically, and that is, what are people doing with the learning? And then how are they using that learning and applying it to get jobs? And so, you know, 29 million is a big number, but what does it mean in terms of what they're doing with that information and what they're doing to apply it?So, we do have on my team an employer engagement team that actually goes out and works with local employers around the world, builds virtual job fairs and on-prem job fairs, sponsors things like DeepRacer League and Cloud Quests and Jam days so that early-career learners can come in and get hands-on and employers can look at what the potential employees are doing so that they can make sure that they have the experience that they actually say they have. And so, since the beginning of this year, we have already now recruited 323 what we call talent shapers, which are the employer community who are actually consuming the talent that we are proffering to them and that we're bringing into these job fairs. We have 35,000 learners who have come through our job fairs since the beginning of the year. 
And then we also rely—as you know, like, we're very security conscious, so we rely on self-reported data, but we have over 3500 employed early-career talent self-reported job hires. And so, for us, the 29 million is important, but how it then portrays itself into AWS-focused employment—that's not just to AWS; these are by the way those 3500 learners who are employed went to other companies outside of AWS—but we want to make sure that the 29 million actually results in something. It's not just, you know, kind of an academic exercise. And so, that's what we're doing on our site to make sure that employers are actually engaged in this process as well.Corey: I want to bring up a topic that has been top-of-mind in relation to this, where there has been an awful lot of hue and cry about generative AI lately, and to the point where I'm a believer in this. I think it is awesome, I think it is fantastic. And even for me, the hype is getting to be a little over the top. When everyone's talking about it transforming every business and that entire industries seem to be pivoting hard to rebrand themselves with the generative AI brush, it is of some concern. But I'm still excited by the magic inherent to aspects of what this is.It is, on some level—at least the way I see it—a way of solving the cloud education problem that I see, which is that, today if I want to start a company and maybe I just got out of business school, maybe I dropped out of high school, doesn't really matter. If it involves software, as most businesses seem to these days, I would have to do a whole lot of groundwork first. I have to go and take a boot camp class somewhere for six months and learn just enough code to build something horrible enough to get funding so that then I can hire actual professional engineers who will make fun of what I've written behind my back and then tear it all out and replace it. On some level, it really feels like the way to teach people cloud skills is to lower the bar for those cloud skills themselves, to help reduce the you must be at least this smart to ride this amusement park ride style of metering stick.And generative AI seems like it has strong potential for doing some of these things. I've used it that way myself, if we can get past some of the hallucination problems where it's very confident and also wrong—just like, you know, many of the white engineers I've worked with who are of course, men, in the course of my career—it will be even better. But I feel like this is the interface to an awful lot of cloud, if it's done right. How are you folks thinking about generative AI in the context of education, given the that field seems to be changing every day?Valerie: It's an interesting question and I see a lot of forward movement and positive movement in education. I'll give you an example. One company in the Bay Area, Khan Academy is using Khanmigo, which is one of their ChatGPT and generative AI-based products to be able to tutor students in a way that's directive without giving them the answers. And so, you know, when you look at the Bloom's sigma problem, which is if you have an intervention with a student who's kind of on the fence, you can move them one standard deviation to the right by giving them, sort of, community support. You can move them two standard deviations to the right if you give them one-to-one mentoring.And so, the idea is that these interventions through generative AI are actually moving that Bloom's sigma model for students to the right, right? 
So, you're getting students who might fall through the cracks not falling through the cracks anymore. Groups like Houston Community College are using generative AI to make sure that they are tracking their students in a way that they're going into the classes that they need to go into and they're using the prerequisites so that they can then benefit themselves through the community college system and have the most efficient path towards graduation. There's other models that we're using generative AI for to be able to do better data analysis in educational institutions, not just for outcomes, but also for, you know, funding mechanisms and for ways in which educational institutions [even operationalized 00:21:21]. And so, I think there's a huge power in generative AI that is being used at all levels within education.Now, there's a couple of other things, too, that I think that you touched on, and one is how do we train on generative AI, right? It goes so fast. And how are we doing? So, I'll tell you one thing that I think is super interesting, and that's that generative AI does hold the promise of actually offering us greater diversity, equity, and inclusion of the people who are studying generative AI. And what we're seeing early on is that the distribution in the mix of men and women is far better for studying of generative AI and AI-based learning modules for that particular outcome than we have seen in computer science in the past.And so, that's super encouraging, that we're going to have more people from more diverse backgrounds participating with skills for generative AI. And what that will also mean, of course, is that models will likely be less biased, we'll be able to have better fidelity in generative AI models, and more applicability in different areas when we have more diverse learners with that experience. So, the second piece is, what is AWS doing to make sure that these modules are being integrated into curriculum? And that's something that our training and certification team is launching as we speak, both through our AWS Academy modules, but also through Skill Builder so those can be accessed by people today. So, I'm with you. I think there's more promise than hue and cry and this is going to be a super interesting way that our early-career learners are going to be able to interact with new learning models and new ways of just thinking about how to apply it.Corey: My excitement is almost entirely on the user side of this as opposed to the machine-learning side of it. It feels like an implementation detail from the things that I care about. I asked the magic robot in a box how to do a thing and it tells me, or ideally does it for me. One of the moments in which I felt the dumbest in recent memory has been when I first started down the DeepRacer, “Oh, you just got one. Now, here's how to do it. Step one, open up this console. Good. Nice job. Step two”—and it was, basically get a PhD in machine learning concepts from Berkeley and then come back. Which is a slight exaggeration, but not by much.It feels it is, on some level—it's a daunting field, where there's an awful lot of terms of art being bandied around, there's a lot that needs to be explained in particular ways, and it's very different—at least from my perspective—on virtually any other cloud service offering. And that might very well be a result of my own background. But using the magic thing, like, CodeWhisperer that suggests code that I want to complete is great. 
Build something like CodeWhisperer, I'm tapping out by the end of that sentence.Valerie: Yeah. I mean, the question in there is, you know, how do we make sure that our learners know how to leverage CodeWhisperer, how to leverage Bedrock, how to leverage SageMaker, and how to leverage Greengrass, right, to build models that I think are going to be really experientially sound but also super innovative? And so, us getting that learning into education early and making sure that learners who are being educated, whether they are currently in jobs and are being re-skilled or they're coming up through traditional or non-traditional educational institutions, have access to all of these services that can help them do innovative things is something that we're really committed to doing. And we've been doing it for a long time. I may think you know that, right?So, Greengrass and SageMaker and all of the AI and ML tools have been around for a long period of time. Bedrock, CodeWhisperer, other services that AWS will continue to launch to support generative AI models, of course, are going to be completely available not just to users, but also for learners who want to re-skill, up-skill, and to skill on generative AI models.Corey: One last area I want to get into is a criticism, or at least an observation I've been making for a while about Kubernetes, but it could easily be extended to cloud in general, which is that, at least today, as things stand—this is starting to change, finally—running Kubernetes in production is challenging and fraught and requires a variety of skills and a fair bit of experience having done this previously. Before the last year or so of weird market behavior, if you had Kubernetes in production experience, you could relatively easily command a couple $100,000 a year in terms of salary. Now, as companies are embracing modern technologies and the rest, I'm wondering how they're approaching the problem of up-leveling their existing staff from two sides. The first is that no matter how much training and how much you wind up giving a lot of those folks, some of them either will not be capable or will not have the desire to learn the new thing. And secondly, once you get those people there, how do you keep them from effectively going down the street with that brand new shiny skill set for, effectively, three times what they were making previously, now that they have those skills that are in wild demand across the board?Because that's simply not sustainable for a huge swath of companies out there for whom they're not technology companies, they just use technology to do the thing that their business does. It feels like everything is becoming very expensive in a personnel perspective if you're not careful. You obviously talk to governments who are famously not known for paying absolute top-of-market figures for basically any sort of talent—for obvious reasons—but also companies for whom the bottom line matters incredibly. How do you square that circle?Valerie: There's a lot in that circle, so I'll talk about a specific, and then I'll talk about what we're also doing to help learners get that experience. So, you talked specifically about Kubernetes, but that could be extracted, as you said, to a lot of other different areas, including cyber, right? So, when we talk about somebody with an expertise in cybersecurity, it's very unlikely that a new learner coming out of university is going to be as appealing to an employer than somebody who has two to three years of experience. 
And so, how do we close that gap of experience—in either of those two examples—to make sure that learners have an on-ramp to new positions and new career opportunities? So, the first answer I'll give you is with some of our largest systems integrators, one of which is Tata Consulting Services, which is actually using AWS education programs to upskill its employees internally and has upskilled 19,000 of its employees using education programs including AWS Educate, to make sure that their group of consultants has absolutely the latest set of skills. And so, we're seeing that across the board; most of our, if not all of our customers, are looking at training to make sure that they can train not only their internal tech teams and their early-career talent coming in, but they can also train back office to understand what the next generation of technology is going to mean. And so, for instance, one of our largest customers, a telco provider, has asked us to provide modules for their HR teams because without understanding what AI and ML is, what it does, and how to look for it, they might not be able to then, you know, extract the right sets of talent that they need to bring into the organization. So, we're seeing this training requirement across the business and not just in technical requirements. But you know, bridging that gap with early-career learners, I think, is really important too. And so, we are experimenting, especially at places like Miami Dade College and City University of New York, with virtual internships so that we can provide early-career learners with experiential learning that they can then bring to employers as proof that they have actually done the thing they've said they can do. And so, companies like Parker Dewey and Riipen and Forage and virtual internships are offering those experiences online so that our learners have the opportunity to then prove what they say that they can do. So, there's lots of ways that we can go about making sure learners have that broad base of learning and that they can apply it. And I'll tell you one more thing, and that's retention. We find that when learners approach their employer with an internship or an apprenticeship, their stickiness with that employer—because they understand the culture, they understand the project work, they've been mentored, they've been sponsored—is actually far greater than if they came and went. And so, it's important and incumbent on employers, I think, to build that strong connective tissue with their early-skilled learners—and their upskilled learners—to make sure that the skills don't leave the house, right? And that is all about making sure that the culture aligns with the skills, aligns with the project work, and that it continues to be interesting, whether you're a new learner or you're a re-skilled learner, to stay in-house.
Corey: My last question for you—and I understand that this might be fairly loaded—but I can't even come up with a partial list that does it any justice to encapsulate the sheer number of educational programs that you have in flight for a variety of different folks. The details and nuances of these are not something that I store in RAM, so I find that it's very easy to talk about one of these things and wind up bleeding into another. How do you folks keep it all straight? And how should people think about it? Not to say that you are not people. How should people who do not work for AWS?
There we go. We are all humans here. Please, go [laugh] ahead.

Valerie: It's a good question. So, the way that I break it down—and by the way, you know, AWS is also part of Amazon, so you know, I understand the question. And we have a lot of offerings across Amazon and AWS. There are five AWS education programs specifically. And those five programs, a few of which I've mentioned today (AWS Academy, AWS Educate, AWS re/Start, GetIT, and Spark), are free, no-fee programs that we offer both the community and our education providers to build curriculum and to offer digital, cloud-based skills curriculum to learners.

We have another product that I'm a huge fan of called Skill Builder. And Skill Builder is, as I mentioned before, an online educational platform that anybody can take advantage of, with over 500 classes in the free tier. There's learning plans for a lot of different things, and some I think you'd be interested in, like cost optimization and, you know, financial modeling for cloud, and all kinds of other more technically-oriented free courses. And then if learners want to get more experience in a lab environment, or more detailed learning that would lead to, for instance, a, you know, certification in solutions architecture, they can use the subscription model, which is very affordable and provides learners an opportunity to work within that platform. So, if I'm breaking it down, it really is, am I being educated in a way that is more formalized, or am I going to go and take these courses when I want them and when I need them, both in the free tier and the subscription tier.

So, that's basically the difference between education programs and Skill Builder. But I would say that if people are working with AWS teams, they can also ask those teams where the best place is to avail themselves of education curriculum. And we're all passionate about this topic and all of us can point users in the right direction as well.

Corey: I really want to thank you for taking the time to go through all the things that you folks are up to these days. If people want to learn more, where should they go?

Valerie: So, the first destination, if they want cloud-based learning, is really to take a look at AWS training and certification programs, which are easy to find on aws.com. I would also point our teams—if they're interested in the tech alliances and how we're formulating the tech alliances—towards a recent announcement between City University of New York, the New York Jobs CEO Council, and the New York Mayor's Office for more details about how we can help teams in the US and outside the US—we also have tech alliances underway in Egypt and Spain and other countries coming on board as well—to really, you know, earmark how government and educational institutions and employers can work together.

And then lastly, if employers are listening to this, the one output to all of this is what you pointed out, and that's that our learners need hands-on learning and they need the on-ramp to internships, to apprenticeships, and jobs that really are promotional for, like, career talent. And so, it's incumbent, I think, on all of us to start looking at the next generation of learners, whether they come out of traditional or non-traditional means, and recognize that talent can live in a lot of different places. And we're very happy to help and happy to do that matchup. But I encourage employers to dig deeper there too.

Corey: And we will, of course, put links to that in the show notes.
Thank you so much for taking the time out of your day to speak with me about all this. I really appreciate it.

Valerie: Thank you, Corey. It's always fun to talk to you.

Corey: [laugh]. Valerie Singer, GM of Global Education at AWS. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with a comment telling me exactly which AWS service I should make my six-year-old learn about as my next step in punishing her.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Building Computers for the Cloud with Steve Tuck

Screaming in the Cloud

Play Episode Listen Later Sep 21, 2023 42:18


Steve Tuck, Co-Founder & CEO of Oxide Computer Company, joins Corey on Screaming in the Cloud to discuss his work to make modern computers cloud-friendly. Steve describes what it was like going through early investment rounds, and the difficult but important decision he and his co-founder made to build their own switch. Corey and Steve discuss the demand for on-prem computers that are built for cloud capability, and Steve reveals how Oxide approaches their product builds to ensure the masses can adopt their technology wherever they are.

About Steve
Steve is the Co-founder & CEO of Oxide Computer Company. He previously was President & COO of Joyent, a cloud computing company acquired by Samsung. Before that, he spent 10 years at Dell in a number of different roles.

Links Referenced:
Oxide Computer Company: https://oxide.computer/
On The Metal Podcast: https://oxide.computer/podcasts/on-the-metal

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is brought to us in part by our friends at RedHat. As your organization grows, so does the complexity of your IT resources. You need a flexible solution that lets you deploy, manage, and scale workloads throughout your entire ecosystem. The Red Hat Ansible Automation Platform simplifies the management of applications and services across your hybrid infrastructure with one platform. Look for it on the AWS Marketplace.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. You know, I often say it—but not usually on the show—that Screaming in the Cloud is a podcast about the business of cloud, which is intentionally overbroad so that I can talk about basically whatever the hell I want to with whoever the hell I'd like. Today's guest is, in some ways of thinking, about as far in the opposite direction from Cloud as it's possible to go and still be involved in the digital world. Steve Tuck is the CEO at Oxide Computer Company. You know, computers, the things we all pretend aren't underpinning those clouds out there that we all use and pay by the hour, gigabyte, second-month-pound or whatever it works out to. Steve, thank you for agreeing to come back on the show after a couple years, and once again suffer my slings and arrows.

Steve: Much appreciated. Great to be here. It has been a while. I was looking back, I think three years. This was like, pre-pandemic, pre-interest rates, pre… Twitter going totally sideways.

Corey: And I have to ask to start with that, it feels, on some level, like toward the start of the pandemic, when everything was flying high and we'd had low interest rates for a decade, that there was a lot of… well, lunacy lurking around in the industry, my own business saw it, too. It turns out that not giving a shit about the AWS bill is in fact a zero interest rate phenomenon. And with all that money or concentrated capital sloshing around, people decided to do ridiculous things with it. I would have thought, on some level, that, “We're going to start a computer company in the Bay Area making computers,” would have been one of those, but given that we are a year into the correction, and things seem to be heading up into the right for you folks, that take was wrong.
How'd I get it wrong?Steve: Well, I mean, first of all, you got part of it right, which is there were just a litany of ridiculous companies and projects and money being thrown in all directions at that time.Corey: An NFT of a computer. We're going to have one of those. That's what you're selling, right? Then you had to actually hard pivot to making the real thing.Steve: That's it. So, we might as well cut right to it, you know. This is—we went through the crypto phase. But you know, our—when we started the company, it was yes, a computer company. It's on the tin. It's definitely kind of the foundation of what we're building. But you know, we think about what a modern computer looks like through the lens of cloud.I was at a cloud computing company for ten years prior to us founding Oxide, so was Bryan Cantrill, CTO, co-founder. And, you know, we are huge, huge fans of cloud computing, which was an interesting kind of dichotomy. Instead of conversations when we were raising for Oxide—because of course, Sand Hill is terrified of hardware. And when we think about what modern computers need to look like, they need to be in support of the characteristics of cloud, and cloud computing being not that you're renting someone else's computers, but that you have fully programmable infrastructure that allows you to slice and dice, you know, compute and storage and networking however software needs. And so, what we set out to go build was a way for the companies that are running on-premises infrastructure—which, by the way, is almost everyone and will continue to be so for a very long time—access to the benefits of cloud computing. And to do that, you need to build a different kind of computing infrastructure and architecture, and you need to plumb the whole thing with software.Corey: There are a number of different ways to view cloud computing. And I think that a lot of the, shall we say, incumbent vendors over in the computer manufacturing world tend to sound kind of like dinosaurs, on some level, where they're always talking in terms of, you're a giant company and you already have a whole bunch of data centers out there. But one of the magical pieces of cloud is you can have a ridiculous idea at nine o'clock tonight and by morning, you'll have a prototype, if you're of that bent. And if it turns out it doesn't work, you're out, you know, 27 cents. And if it does work, you can keep going and not have to stop and rebuild on something enterprise-grade.So, for the small-scale stuff and rapid iteration, cloud providers are terrific. Conversely, when you wind up in the giant fleets of millions of computers, in some cases, there begin to be economic factors that weigh in, and for some on workloads—yes, I know it's true—going to a data center is the economical choice. But my question is, is starting a new company in the direction of building these things, is it purely about economics or is there a capability story tied in there somewhere, too?Steve: Yeah, it's actually economics ends up being a distant third, fourth, in the list of needs and priorities from the companies that we're working with. 
When we talk about—and just to be clear we're—our demographic, that kind of the part of the market that we are focused on are large enterprises, like, folks that are spending, you know, half a billion, billion dollars a year in IT infrastructure, they, over the last five years, have moved a lot of the use cases that are great for public cloud out to the public cloud, and who still have this very, very large need, be it for latency reasons or cost reasons, security reasons, regulatory reasons, where they need on-premises infrastructure in their own data centers and colo facilities, et cetera. And it is for those workloads in that part of their infrastructure that they are forced to live with enterprise technologies that are 10, 20, 30 years old, you know, that haven't evolved much since I left Dell in 2009. And, you know, when you think about, like, what are the capabilities that are so compelling about cloud computing, one of them is yes, what you mentioned, which is you have an idea at nine o'clock at night and swipe a credit card, and you're off and running. And that is not the case for an idea that someone has who is going to use the on-premises infrastructure of their company. And this is where you get shadow IT and 16 digits to freedom and all the like.Corey: Yeah, everyone with a corporate credit card winds up being a shadow IT source in many cases. If your processes as a company don't make it easier to proceed rather than doing it the wrong way, people are going to be fighting against you every step of the way. Sometimes the only stick you've got is that of regulation, which in some industries, great, but in other cases, no, you get to play Whack-a-Mole. I've talked to too many companies that have specific scanners built into their mail system every month looking for things that look like AWS invoices.Steve: [laugh]. Right, exactly. And so, you know, but if you flip it around, and you say, well, what if the experience for all of my infrastructure that I am running, or that I want to provide to my software development teams, be it rented through AWS, GCP, Azure, or owned for economic reasons or latency reasons, I had a similar set of characteristics where my development team could hit an API endpoint and provision instances in a matter of seconds when they had an idea and only pay for what they use, back to kind of corporate IT. And what if they were able to use the same kind of developer tools they've become accustomed to using, be it Terraform scripts and the kinds of access that they are accustomed to using? How do you make those developers just as productive across the business, instead of just through public cloud infrastructure?At that point, then you are in a much stronger position where you can say, you know, for a portion of things that are, as you pointed out, you know, more unpredictable, and where I want to leverage a bunch of additional services that a particular cloud provider has, I can rent that. And where I've got more persistent workloads or where I want a different economic profile or I need to have something in a very low latency manner to another set of services, I can own it. And that's where I think the real chasm is because today, you just don't—we take for granted the basic plumbing of cloud computing, you know? Elastic Compute, Elastic Storage, you know, networking and security services. And us in the cloud industry end up wanting to talk a lot more about exotic services and, sort of, higher-up stack capabilities. 
None of that basic plumbing is accessible on-prem.

Corey: I also am curious as to where exactly Oxide lives in the stack because I used to build computers for myself in 2000, and it seems like having gone down that path a bit recently, yeah, that process hasn't really improved all that much. The same off-the-shelf components still exist and that's great. We always used to disparagingly call spinning hard drives spinning rust in racks. You named the company Oxide; you're talking an awful lot about the Rust programming language in public a fair bit of the time, and I'm starting to wonder if maybe words don't mean what I thought they meant anymore. Where do you folks start and stop, exactly?

Steve: Yeah, that's a good question. And when we started, we sort of thought the scope of what we were going to do and then what we were going to leverage was smaller than it has turned out to be. And by that I mean, man, over the last three years, we have hit a bunch of forks in the road where we had questions about do we take something off the shelf or do we build it ourselves. And we did not try to build everything ourselves. So, to give you a sense of kind of where the dotted line is around the Oxide product, what we're delivering to customers is a rack-level computer. So, the minimum size comes in rack form. And I think your listeners are probably pretty familiar with this. But, you know, a rack is—

Corey: You would be surprised. It's basically, what are they, about seven feet tall?

Steve: Yeah, about eight feet tall.

Corey: Yeah, yeah. Seven, eight feet, weighs a couple thousand pounds, you know, make an insulting joke about—

Steve: Two feet wide.

Corey: —NBA players here. Yeah, all kinds of these things.

Steve: Yeah. And big hunk of metal. And in the cases of on-premises infrastructure, it's kind of a big hunk of metal hole, and then a bunch of 1U and 2U boxes crammed into it. What the hyperscalers have done is something very different. They started looking at, you know, at the rack level, how can you get much more dense, power-efficient designs, doing things like using a DC bus bar down the back, instead of having 64 power supplies with cables hanging all over the place in a rack, which I'm sure is what you're more familiar with.

Corey: Tremendous amount of weight as well because you have the metal chassis for all of those 1U things, which in some cases, you wind up with, what, 46U in a rack, assuming you can even handle the cooling needs of all that.

Steve: That's right.

Corey: You have so much duplication, and so much of the weight is just metal separating one thing from the next thing down below it. And there are opportunities for massive improvement, but you need to be at a certain point of scale to get there.

Steve: You do. You do. And you also have to be taking on the entire problem. You can't pick at parts of these things. And that's really what we found. So, we started at this sort of—the rack level as sort of the design principle for the product itself and found that that gave us the ability to get to the right geometry, to get as much CPU horsepower and storage and throughput and networking into that kind of chassis for the least amount of wattage required, kind of the most power-efficient design possible.

So, it ships at the rack level and it ships complete with both our server sled systems in Oxide, a pair of Oxide switches. This is—when I talk about, like, design decisions, you know, do we build our own switch, it was a big, big, big question early on.
We were fortunate even though we were leaning towards thinking we needed to go do that, we had this prospective early investor who was early at AWS and he had asked a very tough question that none of our other investors had asked to this point, which is, “What are you going to do about the switch?”And we knew that the right answer to an investor is like, “No. We're already taking on too much.” We're redesigning a server from scratch in, kind of, the mold of what some of the hyperscalers have learned, doing our own Root of Trust, we're doing our own operating system, hypervisor control plane, et cetera. Taking on the switch could be seen as too much, but we told them, you know, we think that to be able to pull through all of the value of the security benefits and the performance and observability benefits, we can't have then this [laugh], like, obscure third-party switch rammed into this rack.Corey: It's one of those things that people don't think about, but it's the magic of cloud with AWS's network, for example, it's magic. You can get line rate—or damn near it—between any two points, sustained.Steve: That's right.Corey: Try that in the data center, you wind into massive congestion with top-of-rack switches, where, okay, we're going to parallelize this stuff out over, you know, two dozen racks and we're all going to have them seamlessly transfer information between each other at line rate. It's like, “[laugh] no, you're not because those top-of-rack switches will melt and become side-of-rack switches, and then bottom-puddle-of-rack switches. It doesn't work that way.”Steve: That's right.Corey: And you have to put a lot of thought and planning into it. That is something that I've not heard a traditional networking vendor addressing because everyone loves to hand-wave over it.Steve: Well so, and this particular prospective investor, we told him, “We think we have to go build our own switch.” And he said, “Great.” And we said, “You know, we think we're going to lose you as an investor as a result, but this is what we're doing.” And he said, “If you're building your own switch, I want to invest.” And his comment really stuck with us, which is AWS did not stand on their own two feet until they threw out their proprietary switch vendor and built their own.And that really unlocked, like you've just mentioned, like, their ability, both in hardware and software to tune and optimize to deliver that kind of line rate capability. And that is one of the big findings for us as we got into it. Yes, it was really, really hard, but based on a couple of design decisions, P4 being the programming language that we are using as the surround for our silicon, tons of opportunities opened up for us to be able to do similar kinds of optimization and observability. And that has been a big, big win.But to your question of, like, where does it stop? So, we are delivering this complete with a baked-in operating system, hypervisor, control plane. And so, the endpoint of the system, where the customer meets is either hitting an API or a CLI or a console that delivers and kind of gives you the ability to spin up projects. And, you know, if one is familiar with EC2 and EBS and VPC, that VM level of abstraction is where we stop.Corey: That, I think, is a fair way of thinking about it. And a lot of cloud folks are going to pooh-pooh it as far as saying, “Oh well, just virtual machines. That's old cloud. 
That just treats the cloud like a data center.” And in many cases, yes, it does because there are ways to build modern architectures that are event-driven on top of things like Lambda, and API Gateway, and the rest, but you take a look at what my customers are doing and what drives the spend, it is invariably virtual machines that are largely persistent.Sometimes they scale up, sometimes they scale down, but there's always a baseline level of load that people like to hand-wave away the fact that what they're fundamentally doing in a lot of these cases, is paying the cloud provider to handle the care and feeding of those systems, which can be expensive, yes, but also delivers significant innovation beyond what almost any company is going to be able to deliver in-house. There is no way around it. AWS is better than you are—whoever you happen to—be at replacing failed hard drives. That is a simple fact. They have teams of people who are the best in the world of replacing failed hard drives. You generally do not. They are going to be better at that than you. But that's not the only axis. There's not one calculus that leads to, is cloud a scam or is cloud a great value proposition for us? The answer is always a deeply nuanced, “It depends.”Steve: Yeah, I mean, I think cloud is a great value proposition for most and a growing amount of software that's being developed and deployed and operated. And I think, you know, one of the myths that is out there is, hey, turn over your IT to AWS because we have or you know, a cloud provider—because we have such higher caliber personnel that are really good at swapping hard drives and dealing with networks and operationally keeping this thing running in a highly available manner that delivers good performance. That is certainly true, but a lot of the operational value in an AWS is been delivered via software, the automation, the observability, and not actual people putting hands on things. And it's an important point because that's been a big part of what we're building into the product. You know, just because you're running infrastructure in your own data center, it does not mean that you should have to spend, you know, 1000 hours a month across a big team to maintain and operate it. And so, part of that, kind of, cloud, hyperscaler innovation that we're baking into this product is so that it is easier to operate with much, much, much lower overhead in a highly available, resilient manner.Corey: So, I've worked in a number of data center facilities, but the companies I was working with, were always at a scale where these were co-locations, where they would, in some cases, rent out a rack or two, in other cases, they'd rent out a cage and fill it with their own racks. They didn't own the facilities themselves. Those were always handled by other companies. So, my question for you is, if I want to get a pile of Oxide racks into my environment in a data center, what has to change? What are the expectations?I mean, yes, there's obviously going to be power and requirements at the data center colocation is very conversant with, but Open Compute, for example, had very specific requirements—to my understanding—around things like the airflow construction of the environment that they're placed within. How prescriptive is what you've built, in terms of doing a building retrofit to start using you folks?Steve: Yeah, definitely not. And this was one of the tensions that we had to balance as we were designing the product. 
For all of the benefits of hyperscaler computing, some of the design center for you know, the kinds of racks that run in Google and Amazon and elsewhere are hyperscaler-focused, which is unlimited power, in some cases, data centers designed around the equipment itself. And where we were headed, which was basically making hyperscaler infrastructure available to, kind of, the masses, the rest of the market, these folks don't have unlimited power and they aren't going to go be able to go redesign data centers. And so no, the experience should be—with exceptions for folks maybe that have very, very limited access to power—that you roll this rack into your existing data center. It's on standard floor tile, that you give it power, and give it networking and go.And we've spent a lot of time thinking about how we can operate in the wide-ranging environmental characteristics that are commonplace in data centers that focus on themselves, colo facilities, and the like. So, that's really on us so that the customer is not having to go to much work at all to kind of prepare and be ready for it.Corey: One of the challenges I have is how to think about what you've done because you are rack-sized. But what that means is that my own experimentation at home recently with on-prem stuff for smart home stuff involves a bunch of Raspberries Pi and a [unintelligible 00:19:42], but I tend to more or less categorize you the same way that I do AWS Outposts, as well as mythical creatures, like unicorns or giraffes, where I don't believe that all these things actually exist because I haven't seen them. And in fact, to get them in my house, all four of those things would theoretically require a loading dock if they existed, and that's a hard thing to fake on a demo signup form, as it turns out. How vaporware is what you've built? Is this all on paper and you're telling amazing stories or do they exist in the wild?Steve: So, last time we were on, it was all vaporware. It was a couple of napkin drawings and a seed round of funding.Corey: I do recall you not using that description at the time, for what it's worth. Good job.Steve: [laugh]. Yeah, well, at least we were transparent where we were going through the race. We had some napkin drawings and we had some good ideas—we thought—and—Corey: You formalize those and that's called Microsoft PowerPoint.Steve: That's it. A hundred percent.Corey: The next generative AI play is take the scrunched-up, stained napkin drawing, take a picture of it, and convert it to a slide.Steve: Google Docs, you know, one of those. But no, it's got a lot of scars from the build and it is real. In fact, next week, we are going to be shipping our first commercial systems. So, we have got a line of racks out in our manufacturing facility in lovely Rochester, Minnesota. Fun fact: Rochester, Minnesota, is where the IBM AS/400s were built.Corey: I used to work in that market, of all things.Steve: Really?Corey: Selling tape drives in the AS/400. I mean, I still maintain there's no real mainframe migration to the cloud play because there's no AWS/400. A joke that tends to sail over an awful lot of people's heads because, you know, most people aren't as miserable in their career choices as I am.Steve: Okay, that reminds me. So, when we were originally pitching Oxide and we were fundraising, we [laugh]—in a particular investor meeting, they asked, you know, “What would be a good comp? 
Like how should we think about what you are doing?” And fortunately, we had about 20 investor meetings to go through, so burning one on this was probably okay, but we may have used the AS/400 as a comp, talking about how [laugh] mainframe systems did such a good job of building hardware and software together. And as you can imagine, there were some blank stares in that room.But you know, there are some good analogs to historically in the computing industry, when you know, the industry, the major players in the industry, were thinking about how to deliver holistic systems to support end customers. And, you know, we see this in the what Apple has done with the iPhone, and you're seeing this as a lot of stuff in the automotive industry is being pulled in-house. I was listening to a good podcast. Jim Farley from Ford was talking about how the automotive industry historically outsourced all of the software that controls cars, right? So, like, Bosch would write the software for the controls for your seats.And they had all these suppliers that were writing the software, and what it meant was that innovation was not possible because you'd have to go out to suppliers to get software changes for any little change you wanted to make. And in the computing industry, in the 80s, you saw this blow apart where, like, firmware got outsourced. In the IBM and the clones, kind of, race, everyone started outsourcing firmware and outsourcing software. Microsoft started taking over operating systems. And then VMware emerged and was doing a virtualization layer.And this, kind of, fragmented ecosystem is the landscape today that every single on-premises infrastructure operator has to struggle with. It's a kit car. And so, pulling it back together, designing things in a vertically integrated manner is what the hyperscalers have done. And so, you mentioned Outposts. And, like, it's a good example of—I mean, the most public cloud of public cloud companies created a way for folks to get their system on-prem.I mean, if you need anything to underscore the draw and the demand for cloud computing-like, infrastructure on-prem, just the fact that that emerged at all tells you that there is this big need. Because you've got, you know, I don't know, a trillion dollars worth of IT infrastructure out there and you have maybe 10% of it in the public cloud. And that's up from 5% when Jassy was on stage in '21, talking about 95% of stuff living outside of AWS, but there's going to be a giant market of customers that need to own and operate infrastructure. And again, things have not improved much in the last 10 or 20 years for them.Corey: They have taken a tone onstage about how, “Oh, those workloads that aren't in the cloud, yet, yeah, those people are legacy idiots.” And I don't buy that for a second because believe it or not—I know that this cuts against what people commonly believe in public—but company execs are generally not morons, and they make decisions with context and constraints that we don't see. Things are the way they are for a reason. And I promise that 90% of corporate IT workloads that still live on-prem are not being managed or run by people who've never heard of the cloud. There was a decision made when some other things were migrating of, do we move this thing to the cloud or don't we? And the answer at the time was no, we're going to keep this thing on-prem where it is now for a variety of reasons of varying validity. But I don't view that as a bug. 
I also, frankly, don't want to live in a world where all the computers are basically run by three different companies.Steve: You're spot on, which is, like, it does a total disservice to these smart and forward-thinking teams in every one of the Fortune 1000-plus companies who are taking the constraints that they have—and some of those constraints are not monetary or entirely workload-based. If you want to flip it around, we were talking to a large cloud SaaS company and their reason for wanting to extend it beyond the public cloud is because they want to improve latency for their e-commerce platform. And navigating their way through the complex layers of the networking stack at GCP to get to where the customer assets are that are in colo facilities, adds lag time on the platform that can cost them hundreds of millions of dollars. And so, we need to think behind this notion of, like, “Oh, well, the dark ages are for software that can't run in the cloud, and that's on-prem. And it's just a matter of time until everything moves to the cloud.”In the forward-thinking models of public cloud, it should be both. I mean, you should have a consistent experience, from a certain level of the stack down, everywhere. And then it's like, do I want to rent or do I want to own for this particular use case? In my vast set of infrastructure needs, do I want this to run in a data center that Amazon runs or do I want this to run in a facility that is close to this other provider of mine? And I think that's best for all. And then it's not this kind of false dichotomy of quality infrastructure or ownership.Corey: I find that there are also workloads where people will come to me and say, “Well, we don't think this is going to be economical in the cloud”—because again, I focus on AWS bills. That is the lens I view things through, and—“The AWS sales rep says it will be. What do you think?” And I look at what they're doing and especially if involves high volumes of data transfer, I laugh a good hearty laugh and say, “Yeah, keep that thing in the data center where it is right now. You will thank me for it later.”It's, “Well, can we run this in an economical way in AWS?” As long as you're okay with economical meaning six times what you're paying a year right now for the same thing, yeah, you can. I wouldn't recommend it. And the numbers sort of speak for themselves. But it's not just an economic play.There's also the story of, does this increase their capability? Does it let them move faster toward their business goals? And in a lot of cases, the answer is no, it doesn't. It's one of those business process things that has to exist for a variety of reasons. You don't get to reimagine it for funsies and even if you did, it doesn't advance the company in what they're trying to do any, so focus on something that differentiates as opposed to this thing that you're stuck on.Steve: That's right. And what we see today is, it is easy to be in that mindset of running things on-premises is kind of backwards-facing because the experience of it is today still very, very difficult. I mean, talking to folks and they're sharing with us that it takes a hundred days from the time all the different boxes land in their warehouse to actually having usable infrastructure that developers can use. And our goal and what we intend to go hit with Oxide as you can roll in this complete rack-level system, plug it in, within an hour, you have developers that are accessing cloud-like services out of the infrastructure. 
And that—God, countless stories of firmware bugs that would send all the fans in the data center nonlinear and soak up 100 kW of power.Corey: Oh, God. And the problems that you had with the out-of-band management systems. For a long time, I thought Drax stood for, “Dell, RMA Another Computer.” It was awful having to deal with those things. There was so much room for innovation in that space, which no one really grabbed onto.Steve: There was a really, really interesting talk at DEFCON that we just stumbled upon yesterday. The NVIDIA folks are giving a talk on BMC exploits… and like, a very, very serious BMC exploit. And again, it's what most people don't know is, like, first of all, the BMC, the Baseboard Management Controller, is like the brainstem of the computer. It has access to—it's a backdoor into all of your infrastructure. It's a computer inside a computer and it's got software and hardware that your server OEM didn't build and doesn't understand very well.And firmware is even worse because you know, firmware written by you know, an American Megatrends or other is a big blob of software that gets loaded into these systems that is very hard to audit and very hard to ascertain what's happening. And it's no surprise when, you know, back when we were running all the data centers at a cloud computing company, that you'd run into these issues, and you'd go to the server OEM and they'd kind of throw their hands up. Well, first they'd gaslight you and say, “We've never seen this problem before,” but when you thought you've root-caused something down to firmware, it was anyone's guess. And this is kind of the current condition today. And back to, like, the journey to get here, we kind of realized that you had to blow away that old extant firmware layer, and we rewrote our own firmware in Rust. Yes [laugh], I've done a lot in Rust.Corey: No, it was in Rust, but, on some level, that's what Nitro is, as best I can tell, on the AWS side. But it turns out that you don't tend to have the same resources as a one-and-a-quarter—at the moment—trillion-dollar company. That keeps [valuing 00:30:53]. At one point, they lost a comma and that was sad and broke all my logic for that and I haven't fixed it since. Unfortunate stuff.Steve: Totally. I think that was another, kind of, question early on from certainly a lot of investors was like, “Hey, how are you going to pull this off with a smaller team and there's a lot of surface area here?” Certainly a reasonable question. Definitely was hard. The one advantage—among others—is, when you are designing something kind of in a vertical holistic manner, those design integration points are narrowed down to just your equipment.And when someone's writing firmware, when AMI is writing firmware, they're trying to do it to cover hundreds and hundreds of components across dozens and dozens of vendors. And we have the advantage of having this, like, purpose-built system, kind of, end-to-end from the lowest level from first boot instruction, all the way up through the control plane and from rack to switch to server. That definitely helped narrow the scope.Corey: This episode has been fake sponsored by our friends at AWS with the following message: Graviton Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton. Thank you for your l-, lack of support for this show. Now, AWS has been talking about Graviton an awful lot, which is their custom in-house ARM processor. 
Apple moved over to ARM and instead of talking about benchmarks they won't publish and marketing campaigns with words that don't mean anything, they've let the results speak for themselves. In time, I found that almost all of my workloads have moved over to ARM architecture for a variety of reasons, and my laptop now gets 15 hours of battery life when all is said and done. You're building these things on top of x86. What is the deal there? I do not accept that you hadn't heard of ARM until just now because, as mentioned, Graviton, Graviton, Graviton.

Steve: That's right. Well, so why x86, to start? And I say to start because we have just launched our first generation products. And our first-generation or second-generation products that we are now underway working on are going to be x86 as well. We've built this system on AMD Milan silicon; we are going to be launching a Genoa sled.

But when you're thinking about what silicon to use, obviously, there's a bunch of parts that go into the decision. You're looking at the kind of applicability to workload, performance, power management, for sure, and if you carve up what you are trying to achieve, x86 is still a terrific fit for the broadest set of workloads that our customers are trying to solve for. And choosing which x86 architecture was certainly an easier choice, come 2019. At this point, AMD had made a bunch of improvements in performance and energy efficiency in the chip itself. We've looked at other architectures and I think as we are incorporating those in the future roadmap, it's just going to be a question of what are you trying to solve for.

You mentioned power management, and that has kind of commonly been the case: you know, low power systems is where folks have gone beyond x86. As we're looking forward to hardware acceleration products and future products, we'll certainly look beyond x86, but x86 has a long, long road to go. It still is kind of the foundation for what, again, is a general-purpose cloud infrastructure for being able to slice and dice for a variety of workloads.

Corey: True. I have to look around my environment and realize that Intel is not going anywhere. And that's not just an insult to their lack of progress on committed roadmaps that they consistently miss. But—

Steve: [sigh].

Corey: Enough on that particular topic because we want to keep this, you know, polite.

Steve: Intel has definitely had some struggles for sure. They're very public ones, I think. We were really excited and continue to be very excited about their Tofino silicon line. And this came by way of the Barefoot Networks acquisition. I don't know how much you had paid attention to Tofino, but what was really, really compelling about Tofino is the focus on both hardware and software and programmability.

So, great chip. And P4 is the programming language that surrounds that. And we have gotten very, very deep on P4, and that is some of the best tech to come out of Intel lately. But from a core silicon perspective for the rack, we went with AMD. And again, that was a pretty straightforward decision at the time.
And we're planning on having this anchored around AMD silicon for a while now.Corey: One last question I have before we wind up calling it an episode, it seems—at least as of this recording, it's still embargoed, but we're not releasing this until that winds up changing—you folks have just raised another round, which means that your napkin doodles have apparently drawn more folks in, and now that you're shipping, you're also not just bringing in customers, but also additional investor money. Tell me about that.Steve: Yes, we just completed our Series A. So, when we last spoke three years ago, we had just raised our seed and had raised $20 million at the time, and we had expected that it was going to take about that to be able to build the team and build the product and be able to get to market, and [unintelligible 00:36:14] tons of technical risk along the way. I mean, there was technical risk up and down the stack around this [De Novo 00:36:21] server design, this the switch design. And software is still the kind of disproportionate majority of what this product is, from hypervisor up through kind of control plane, the cloud services, et cetera. So—Corey: We just view it as software with a really, really confusing hardware dongle.Steve: [laugh]. Yeah. Yes.Corey: Super heavy. We're talking enterprise and government-grade here.Steve: That's right. There's a lot of software to write. And so, we had a bunch of milestones that as we got through them, one of the big ones was getting Milan silicon booting on our firmware. It was funny it was—this was the thing that clearly, like, the industry was most suspicious of, us doing our own firmware, and you could see it when we demonstrated booting this, like, a year-and-a-half ago, and AMD all of a sudden just lit up, from kind of arm's length to, like, “How can we help? This is amazing.” You know? And they could start to see the benefits of when you can tie low-level silicon intelligence up through a hypervisor there's just—Corey: No I love the existing firmware I have. Looks like it was written in 1984 and winds up having terrible user ergonomics that hasn't been updated at all, and every time something comes through, it's a 50/50 shot as whether it fries the box or not. Yeah. No, I want that.Steve: That's right. And you look at these hyperscale data centers, and it's like, no. I mean, you've got intelligence from that first boot instruction through a Root of Trust, up through the software of the hyperscaler, and up to the user level. And so, as we were going through and kind of knocking down each one of these layers of the stack, doing our own firmware, doing our own hardware Root of Trust, getting that all the way plumbed up into the hypervisor and the control plane, number one on the customer side, folks moved from, “This is really interesting. We need to figure out how we can bring cloud capabilities to our data centers. Talk to us when you have something,” to, “Okay. We actually”—back to the earlier question on vaporware, you know, it was great having customers out here to Emeryville where they can put their hands on the rack and they can, you know, put your hands on software, but being able to, like, look at real running software and that end cloud experience.And that led to getting our first couple of commercial contracts. So, we've got some great first customers, including a large department of the government, of the federal government, and a leading firm on Wall Street that we're going to be shipping systems to in a matter of weeks. 
And as you can imagine, along with that, that drew a bunch of renewed interest from the investor community. Certainly, a different climate today than it was back in 2019, but what was great to see is, you still have great investors that understand the importance of making bets in the hard tech space and in companies that are looking to reinvent certain industries. And so, we added—our existing investors all participated. We added a bunch of terrific new investors, both strategic and institutional.And you know, this capital is going to be super important now that we are headed into market and we are beginning to scale up the business and make sure that we have a long road to go. And of course, maybe as importantly, this was a real confidence boost for our customers. They're excited to see that Oxide is going to be around for a long time and that they can invest in this technology as an important part of their infrastructure strategy.Corey: I really want to thank you for taking the time to speak with me about, well, how far you've come in a few years. If people want to learn more and have the requisite loading dock, where should they go to find you?Steve: So, we try to put everything up on the site. So, oxidecomputer.com or oxide.computer. We also, if you remember, we did [On the Metal 00:40:07]. So, we had a Tales from the Hardware-Software Interface podcast that we did when we started. We have shifted that to Oxide and Friends, which the shift there is we're spending a little bit more time talking about the guts of what we built and why. So, if folks are interested in, like, why the heck did you build a switch and what does it look like to build a switch, we actually go to depth on that. And you know, what does bring-up on a new server motherboard look like? And it's got some episodes out there that might be worth checking out.Corey: We will definitely include a link to that in the [show notes 00:40:36]. Thank you so much for your time. I really appreciate it.Steve: Yeah, Corey. Thanks for having me on.Corey: Steve Tuck, CEO at Oxide Computer Company. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice, along with an angry ranting comment because you are in fact a zoology major, and you're telling me that some animals do in fact exist. But I'm pretty sure of the two of them, it's the unicorn.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
The Complex World of Microsoft Licensing with Wes Miller

Screaming in the Cloud

Play Episode Listen Later Sep 19, 2023 37:11


Wes Miller, Research VP at Directions on Microsoft, joins Corey on Screaming in the Cloud to discuss the various intricacies and pitfalls of Microsoft licensing. Wes and Corey discuss what it's like to work closely with a company like Microsoft in your day-to-day career, while also looking out for the best interest of your mutual customers. Wes explains his history of working both at and with Microsoft, and the changes he's seen to their business models and the impact that has on their customers. About WesWes Miller analyzes and writes about Microsoft security, identity, and systems management technologies, as well as Microsoft product licensing.Before joining Directions on Microsoft in 2010, Wes was a product manager and development manager for several Austin, TX, start-ups, including Winternals Software, acquired by Microsoft in 2006. Prior to that, Wes spent seven years at Microsoft working as a program manager in the Windows Core Operating System and MSN divisions.Wes received a B.A. in psychology from the University of Alaska Fairbanks.Links Referenced: Directions on Microsoft Website: https://www.directionsonmicrosoft.com/ Twitter: https://twitter.com/getwired LinkedIn: https://www.linkedin.com/in/wmiller/ Directions on Microsoft Training: https://www.directionsonmicrosoft.com/training TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. So, I write a newsletter called Last Week in AWS, which has always felt like it's flying a little bit too close to the sun just because having AWSes name in the title of what I do feels like it's playing with copyright fire. It's nice periodically to talk to someone—again—who is in a similar boat. Wes Miller is a Research VP at Directions on Microsoft. To be clear, Directions on Microsoft is an analyst firm that talks primarily about Microsoft licensing and is not, in fact, part of Microsoft itself. Have I disclaimed that appropriately, Wes?Wes: You have. You have. And in fact, the company, when it was first born, was actually called Microsoft Directions. And they had a reasonably good relationship with Microsoft at the time and Microsoft cordially asked them, “Hey, could you at least reverse that so it corrects it in terms of trademark.” So yes, we're blessed in that regard. Something you probably would never get away with now, but that was 30 years ago.Corey: [laugh]. And now it sounds like it might as well be a product. So, I have to ask, just because the way I think of you is, you are the folks to talk to, full stop, when you have a question about anything that touches on Microsoft licensing. Is that an accurate depiction of what it is you folks do or is that just my particular corner of the world and strange equivalence that gets me there?Wes: That is our parts of the Venn diagram intersecting because that's what I spend a lot of time talking about and thinking about because I teach that with our company founder, Rob Horwitz. But we also spend an inordinate amount of time taking what Microsoft is talking about shipping, maybe servicing, and help customers understand really, as we say, the ‘So, what?' What does this mean to me as a customer? 
Should I be using this? Should I be waiting? Should I upgrade? Should I stay? Those sorts of things.So, there's a whole roadmapping side. And then we have a [laugh]—because licensing doesn't end with a license, we have a whole side of negotiation that we spend a lot of time, we have a dedicated team that focuses on helping enterprise agreement customers get the most successful deal for their organization, basically, every three years.Corey: We do exactly that with AWS ourselves. I have to ask before we dive into this. In the early days, I felt like I had a much better relationship with Microsoft. Scott Guthrie, the head of Azure, was on this show. A number of very highly placed Microsoft folks were here. And over the years, they more or less have stopped talking to me.And that leaves me in a position where all I can see is their actions and their broad public statements without getting any nuance or context around any of it. And I don't know if this is just a commentary on human nature or me in particular, but I tend to always assume the worst when things like that happen. So, my approach to Microsoft has grown increasingly cynical over the years as a result. That said, I don't actually have an axe to grind with them from any other perspective than as a customer, and occasionally that feels like ‘victim' for a variety of different things. What's your take on Microsoft as far as, I guess, your feelings toward the company?Wes: So, a lot of people—in fact, it used to be more so, but not as much anymore, people would assume I hate Microsoft or I want to demonize Microsoft. But the irony actually is, you know, I want people to remember I worked there for seven-and-a-half years, I shipped—I was on the team that shipped Windows XP, Server 2003, and a bunch of other products that people don't remember. And I still care about the company, but the company and I are obviously in different trajectories now. And also, my company's customers today are also Microsoft's customers today, and we actually have—our customers—our mutual customers—best interest in mind with basically everything we do. Are we helping them be informed? Are we helping them color within the financial lines?And sometimes, we may say things that help a customer that aren't helping the bottom line or helping a marketing direction and I don't think that resonates well within Microsoft. So sure, sometimes we even hear from them, “Hey, it'd be great if you guys might want to, you know, say something nice once in a while.” But it's not necessarily our job to say nice things. I do it once in a while. I want to note that I said something nice about AAD last week, but the reality is that we are there to help our mutual customers.And what I found is, I have found the same thing to be true that you're finding true that, unfortunately, outbound communications from them, in particular from the whole company, have slowed. I think everybody's busier, they've got a very specific set of directions they're going on things, and as a result, we hear very little. And even getting, trying to get clarification on things sometimes, “Did we read that right?” It takes a while, and it has to go through several different rungs of people to get the answer.Corey: I have somewhat similar relationships over the years with AWS, where they—in many cases, a lot of their executives prefer not to talk to me at all. Which again, is fair. I'm not—I don't require any of them to do it. 
But there's something in the Amazonian ethos that requires them to talk to customers, especially when customers are having a rough time. And I'm, for better or worse, the voice of the customer.I am usually not the dumbest person in the universe when it comes to trying to understand a service or make it do something that, to me, it seems that it should be able to do. And when I actually start having in-depth conversations, people are surprised. “Wow, you were super pleasant and fun to work with. We thought you were just going to be a jerk.” It's, yeah, it turns out I don't go through every meeting like it's Twitter. What a concept.Wes: Yeah, a lot of people, I've had this happen for myself when you meet people in person, when they meet your Twitter persona, especially for someone who I think you and I both come across as rather boisterous, gregarious, and sometimes people take that as our personas. And I remember meeting a friend in the UK for the first time years ago, he's like, “You're very different in person.” I'm like, “I know. I know.”Corey: I usually get the, “You're just like Twitter.” In many respects, I am. Because people don't always see what I'm putting down. I make it a point to be humorous and I have a quick quip for a lot of things, but it's never trying to make the person I'm engaging with feel worse for it. And that's how I work.People are somewhat surprised when I'm working in client meetings that I'm fun and I have a similar sense of humor and personality, as you would see on Twitter. Believe it or not, I haven't spent all this time just doing a bit. But they're also surprised that it tends to drive toward an actual business discussion.Wes: Sure.Corey: Everything fun is contextual.Wes: Absolutely. That's the same sort of thing we get on our side when we talk to customers. I think I've learned so much from talking with them that sometimes I do get to share those things with Microsoft when they're willing to listen.Corey: So, what I'm curious about in the context of Microsoft licensing is something that, once again, it has intruded upon my notice lately with a bunch of security disclosures in which Microsoft has said remarkably little, and that is one of the most concerning things out there. They casually tried to slide past, “Oh, yeah, we had a signing key compromised.” Which is one of those, “Oh, [laugh] and by the way, the building's on fire. But let's talk about our rent [unintelligible 00:07:44] for the next year.” Like, “Whoa, whoa, whoa. Hold on. What?”That was one of those horrifying moments. And it came out—I believe I learned about this from you—that you needed something called E3 licensing—sorry, E5 licensing—in order to look at those audit logs, where versus E3, which sounded like the more common case. And after a couple of days of, “Explain this,” Microsoft very quickly wound up changing that. What do all these things mean? This is sort of a foreign concept to me because AWS, for better or worse, does not play games with licensing in the same way that Microsoft does.Wes: Sure. Microsoft has, over the years, you know, they are a master of building suites. This is what they've done for over 30 years. And they will build a suite, they'll sell you that suite, they'll come back around in three to six years and sell you a new version of that suite. Sometimes they'll sell you a higher price version of that suite, et cetera.And so, you'll see products evolve. 
And did a great podcast with my colleagues Rob and Mary Jo Foley the other day where we talked about what we've seen over the last, now for me, 11 years of teaching boot camps. And I think in particular, one of the changes we have seen is exactly what you're being exposed to on the outside and what a lot of people have been complaining about, which is, products don't sit still anymore. So, Microsoft actually makes very few products today. Almost everything they sell you is a service. There are a handful of products still.These services all evolve, and about every triennium or two—so every three to six years—you'll see a price increase and something will be added, and a price increase and something will be added. And so, all this began with the BPOS, the first version of Office 365, which became Office 365 E3, then Microsoft 365 E3 then Microsoft 365 E5. And for people who aren't in the know, basically, that means they went from Office as a subscription to Office, Windows, and a bunch of management tools as a subscription, to E5, basically, it took all of the security and compliance tools that many of us feel should have been baked into the fundamentals, into E3, the thing that everybody buys, what I refer to still today as the hero SKU and those security and compliance fundamentals should have been baked in. But no, in fact, a lot of customers when this AAD issue came out—and I think a lot discovered this ad hoc for the same reason, “Hey, we've been owned, how far back in the logs can we look?” And the answer is, you know, no farther than 90 days, a lot of customers hit that reality of, what do you mean we didn't pay for the premium thing that has all the logging that we need?Corey: Since you sat on this for eight months before mentioning it to us? Yeah.Wes: Exactly, exactly. And it's buried. And it's one of those things that, like, when we teach the licensing boot camp, I specifically call out because of my security background, it's an area of focus and interest to me. I call out to customers that a lot of the stuff we've been showing you has not questionable valuable, but kind of squishy value.This piece right here, this is both about security and compliance. Don't cheap out. If you're going to buy anything, buy this because you're going to need it later. And I've been saying that for, like, three years, but obviously only the people who were in the boot camp would hear that and then shake their head;, “Why does it have to be this difficult?” But yeah. Everything becomes a revenue opportunity if it's a potential to upsell somebody for the next tier.Corey: The couple of times I've been asked to look at Azure bills, I backed away slowly as soon as I do, just because so much of it is tied to licensing and areas that are very much outside of my wheelhouse. Because I view, in the cloud context, that cost and architecture tend to be one of the same. But when you bolt an entire layer of seat licensing and what this means for your desktop operating systems on as well as the actual cloud architecture, it gets incredibly confusing incredibly quickly. And architectural advice of the type that I give to AWS customers and would give to GCP customers is absolutely going to be harmful in many respects.I just don't know what I don't know and it's not an area that interests me, as far as learning that competency, just to jump through hoops. 
I mean, I frankly used to be a small business Windows admin, with the products that you talked about, back when XP and Server 2003 and a few others, I sort of ruled the roost. But I got so tired of surprise audit-style work. It felt like busy work that wasn't advancing what I was trying to get done in any meaningful way that, in a fit of rage, one day, I wound up exploring the whole Unix side of the world in 2006 and never went back.Wes: [whispering] That's how it happened.Corey: Yep.Wes: It's unfortunate that it's become so commonplace, but when Vista kind of stalled out and they started exploring other revenue opportunities, you have Vista Ultimate Enterprise, all the crazy SKUing that Vista had, I think it sort of created a mindset within the company that this is what we have to do in order to keep growing revenue up and to the right, and you know, shareholder value be the most important thing, that's what you've got to do. I agree entirely, though, the biggest challenge I could see for someone coming into our space is the fact that yes, you've got to understand Azure, Azure architecture, development architecture, and then as soon as you feel like you understand that, somebody comes along and says, “Well, yeah, but because we have an EA, we have to do it this way or we only get a discount on this thing.” And yeah, it just makes things more cumbersome. And I think that's why we still see a lot of customers who come to our boot camps who are still very dedicated AWS customers because that's where they were, and it's easier in many regards, and they just want to go with what they know.Corey: And I think that that's probably fair. I think that there is an evolution that grows here that I think catches folks by surprise. I'm fortunate in that my Microsoft involvement, if we set things like GitHub aside because I like them quite a bit and my Azure stuff as well—which is still small enough to fit in the free tier, given that I use it for one very specific, very useful thing—but the rest of it is simply seat licenses for Office 365 for my team. And I just tend to buy the retail-priced one on the internet that's licensed for business use, and I don't really think about it again. Because I don't need, as you say, in-depth audit logs for Microsoft Word. I really don't. I'm sorry, but I have a hard time believing that that's true. But something that immediately crops up when you say this is when you talk about E3 versus E5 licensing, is that organization-wide or is that on a per-seat basis?Wes: It's even worse than that. It usually comes down to per-user licensing. The whole world used to be per device licensing in Microsoft and it switched to per user when they subscript-ified everything—that's a word I made up a while ago—so when they subscript-ified everything, they changed it over to per user. And for better or worse, today, you could—there's actually four different tiers of Microsoft 365. You could go for any one of those four for any distinct user.You could have one of them on F1, F3, E3, and E5. Now, if you do that, you create some other license non-compliance issues that we spend way too much time having to talk about during the boot camp, but the point is, you can buy to fit; it's not one-size-fits-all necessarily. 
But you run into, very rapidly, if you deploy E5 for some number of users because the products that are there, the security services and compliance services ironically don't do license compliance in most cases, customers can actually wind up creating new license compliance problems, thereby basically having to buy E5 for everybody. So, it's a bit of a trapdoor that customers are not often aware of when they initially step into dabbling in Microsoft 365 E5.Corey: When you take a look at this across the entire board, what is your guidance to customers? Because honestly, this feels like it is a full-time job. At scale, a full-time job for a department simply keeping up with all of the various Microsoft licensing requirements, and changes because, as you say, it's not static. And it just feels like an overwhelming amount of work that to my understanding, virtually no other vendor makes customers jump through. Sure there's Oracle, but that tends to be either in a database story or a per developer, or on rare occasions, per user when you build internal Java apps. But it's not as pervasive and as tricky as this unless I'm missing something.Wes: No, you're not. You're not missing anything. It's very true. It's interesting to think back over the years at the boot camp. There's names I've heard that I don't hear anymore in terms of companies that were as bad. But the reality is, you hear the names of the same software companies but, exactly to your point, they're all departmental. The people who make [Roxio 00:16:26] still, they're very departmentalized. Oracle, IBM, yeah, we hear about them still, but they are all absolutely very departmentalized.And Microsoft, I think one of the reason why we do get so many—for better or worse, for them—return visitors to our licensing boot camps that we do every two months, is for that exact reason, that some people have found they like outsourcing that part of at least trying to keep up with what's going on, what's the record? And so, they'll come back every two, three, or four years and get an update. And we try to keep them updated on, you know, how do I color within the lines? Should it be like this? No. But it is this way.In fact, it's funny, I think back, it was probably one of the first few boot camps I did with Rob. We were in New York and we had a very large customer who had gotten a personalized message from Microsoft talking about how they were going to simplify licensing. And we went to a cocktail hour afterwards, as we often do on the first day of the boot camp, to help people, you know, with the pain after a boot camp, and this gentleman asks us well, “So, what are you guys going to do once Microsoft simplifies licensing?” And Rob and I just, like, looked at each other, smiled, looked back at the guy, and laughed. We're like, “We will cross that bridge when we get to it.”Corey: Yeah, people ask us that question about AWS billing. What if they fix the billing system? Like, we should be so lucky to live that long.Wes: I have so many things I'd rather be doing. Yes.Corey: Mm-hm. Exactly. It's one of those areas where, “Well, what happens in a post-scarcity world?” Like, “I couldn't tell you. I can't even imagine what such a thing would look like.”Wes: Exactly [laugh]. 
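To put rough numbers on the per-user, buy-to-fit model Wes describes above, and on the E5 trapdoor that can follow it, here is a minimal sketch of how a licensing team might model the blended monthly cost of a mixed-tier Microsoft 365 estate. The per-user prices and the head-count split are invented placeholders, not Microsoft's actual list prices; only the tier names come from the conversation.

```python
# Illustrative only: per-user monthly prices are made-up placeholders,
# not Microsoft's actual list prices.
TIER_PRICES = {
    "F1": 2.25,   # hypothetical frontline tier
    "F3": 8.00,
    "E3": 36.00,
    "E5": 57.00,
}

# Hypothetical head count by assigned tier ("buy to fit").
assignments = {"F1": 400, "F3": 250, "E3": 1200, "E5": 150}

def monthly_cost(assignments: dict[str, int]) -> float:
    """Total monthly licensing cost for a mixed-tier assignment."""
    return sum(TIER_PRICES[tier] * seats for tier, seats in assignments.items())

def cost_if_everyone_on(tier: str, assignments: dict[str, int]) -> float:
    """What the bill becomes if compliance issues force every user onto one tier."""
    total_seats = sum(assignments.values())
    return TIER_PRICES[tier] * total_seats

mixed = monthly_cost(assignments)
all_e5 = cost_if_everyone_on("E5", assignments)
print(f"Buy-to-fit monthly cost: ${mixed:,.2f}")
print(f"Everyone forced to E5:   ${all_e5:,.2f}  ({all_e5 / mixed:.1f}x)")
```

The gap between the two totals is the trapdoor in numeric form: a partial E5 deployment that trips a compliance rule can quietly become an all-E5 bill.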
Exactly.

Corey: So, the last time we spoke way back, I think in 2019, Microsoft had wound up doing some unfortunate and fairly underhanded-appearing licensing changes, where it was more expensive to run a bunch of Microsoft things, such as server software, most notably SQL Server, on clouds that were not Azure. And then, because you know, you look up the word chutzpah in the dictionary, you'll find the Microsoft logo there in response, as part of the definition, they ran an advertising campaign saying that, oh, running many cloud workloads on Azure was five times cheaper than on AWS. As if they cracked some magic secret to cloud economics. Rather than no, we just decided to play dumb games that win worse prizes with cloud licensing. How did that play out?

Wes: Well, so they made those changes in October of 2019, and I kind of wish they'd become a bigger deal. And I wish they'd become a bigger deal earlier so that things could have been, maybe, reversed when it was easier. But you're absolutely right. So, it—for those who don't know, it basically made licensing changes on only AWS, GCP, and Alibaba—who I never had anybody ask me about—but those three. It also added them for Azure, but then they created loopholes for themselves to make Azure actually get beneficial licensing, even better than you could get with any other cloud provider [sigh].

So, the net takeaway is that every Microsoft product that matters—so traditionally, SQL Server, Windows Server, Windows client, and Office—is not impossible to use on AWS, but it is markedly more expensive. That's the first note. To your point, then they did do that marketing campaign that I know you and I probably had exchanges about at the time, and it drove me nuts as well because what they will classically do is when they tout the savings of running something on Azure, not only are they flouting the rules that they created, you know, they're basically gloating, “Look, we got a toy that they didn't,” but they're also often removing costs from the equation. So, for example, in order for you to get those discounts on Azure, you have to maintain what's called Software Assurance. You basically have to have a subscription by another name.

If you don't have Software Assurance, those opportunities are not available to you. Fine. That's not my point. My point is this: Software Assurance is basically 75% of the cost of the next version. So, it's not free, but if you look at those 5x claims that they made during that time frame, they actually were hand-waving and waving away the [assay 00:20:45] costs.

So, if you actually sat down and did the math, the 5x number was a lie. It was not just not very nice, it was wrong, literally mathematically wrong. And, as my colleague likes to say, I'm a ‘colors person,' not a numbers person, so coming from a colors person like me, that's pretty bad. If I can see the error in your math, that's bad math.

Corey: It just feels like it's one of those taxes on not knowing some of the intricacies of what the heck is going on in the world of Microsoft licensing. And I think every sufficiently complex vendor with, shall we say, non-trivial pricing dimensions, could be accused of the same thing. But it always felt particularly worrisome from the Microsoft perspective. Back in the days of BSA audits—which I don't know at all if they're still a thing or not because I got out of that space—every executive that I ever spoke to, in any company, lived in fear of them, not because they were pirating software or had decided, “You know what?
We have a corporate policy of now acting unethically when it comes to licensing software,” but because of the belief that no matter what they came up with or whatever good faith effort they made to remain compliant, of course, something was not going to work the way they thought it would and they were going to be smacked with a fine. Is that still the case?Wes: Absolutely. In fact, I think it's worse now than it ever was before. I will often say to customers that you are wildly uncompliant while also being wildly overcompliant because per your point about how broad and deep Microsoft is, there's so many products. Like, every company today, every company that has Project and Visio still in place today, that still pays for it, you are over-licensed. You have more of it than you need.That's just one example, but on the other side, SQL Server, odds are, every organization is subtly under-licensed because they think the rule is to do this, but the rules are actually more restrictive than they expect. So, and that's why Microsoft is, you know, the first place they look, the first rug they look under when they do walk in and do an audit, which they're entitled to do as a part of an organization's enterprise agreement. So BSA, I think they do still have those audits, but Microsoft now they have their own business that does that, or at least they have partners that do that for them. And places like SQL Server are the first places that they look.Why? Because it's big, found money, and because it's extremely hard to get right. So, there's a reason why, when we focus on our boot camps, we'll often tell people, you know, “Our goal is to save you enough money to pay for the class,” because there's so much money to be found in little mistakes that if you do a big thing wrong with Microsoft software, you could be wildly out of compliance and not know about it until Microsoft-or more likely, a Microsoft partner—points it out to you.Corey: It feels like it's an inevitability. And, on some level, it's the cost of doing business. But man, does that leave a sour taste in someone's mouth.Wes: Mm-hm. It absolutely does. It absolutely does. And I think—you know, I remember, gosh, was it Munich that was talking about, “We're going to switch to Linux,” and then they came back into the fold. I think the reality is, it absolutely does put a bad taste.And it doesn't leave customers with good hope for where they go from here. I mean, okay, fine. So, we got burned on that thing in the Microsoft 365 stack. Now, they want us to pay 30 bucks for Copilot for Microsoft 365. What? And we'd have no idea what they're even buying, so it's hard to give any kind of guidance. So, it's a weird time.Corey: I'm curious to see what the ultimate effect of this is going to be. Well, one thing I've noticed over the past decade and change—and I think everyone has as well—increasingly, the local operating system on people's laptops or desktops—or even phones, to some extent—is not what it once was. Increasingly, most of the tools that I find myself using on a daily basis are just web use or in a browser entirely. And that feels like it's an ongoing problem for a company like Microsoft when you look at it through the lens of OS. Which at some level, makes perfect sense why they would switch towards everything as a service. But it's depressing, too.Wes: Yeah. I think that's one of the reasons why, particularly after Steve left, they changed focus a lot and really begin focusing on Microsoft 365 as the platform, for better or worse. 
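Stepping back for a moment to the “five times cheaper” campaign discussed just above, a rough back-of-the-envelope calculation shows how leaving Software Assurance out of the comparison skews the headline number. Every dollar figure below is an invented placeholder; the only input taken from the conversation is Wes's rule of thumb that Software Assurance runs roughly 75% of the cost of the next version.

```python
# Hypothetical three-year comparison. All dollar amounts are placeholders.
license_cost = 100_000          # assumed up-front license cost for the workload
sa_rate = 0.75                  # rule of thumb: SA ~= 75% of the next version's cost
azure_infra_3yr = 60_000        # assumed 3-year Azure infrastructure spend
aws_total_3yr = 300_000         # assumed 3-year AWS total as presented in the claim

# The headline comparison omits Software Assurance, even though SA is required
# to qualify for the Azure licensing benefits in the first place.
azure_as_marketed = azure_infra_3yr
naive_ratio = aws_total_3yr / azure_as_marketed        # the "5x cheaper" figure

# Adding the SA cost back in shrinks the advantage considerably.
azure_with_sa = azure_infra_3yr + license_cost * sa_rate
honest_ratio = aws_total_3yr / azure_with_sa

print(f"Marketed ratio:   {naive_ratio:.1f}x")   # 5.0x with these placeholders
print(f"With SA included: {honest_ratio:.1f}x")  # ~2.2x with these placeholders
```

The specific numbers are made up; the point is that a required subscription large enough to approach the license cost itself cannot be waved away without changing the multiple.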
How do we make Microsoft 365 sticky? How do we make Office 365 sticky? And the thing about, like, the Microsoft 365 E5 security stuff we were talking about, it often doesn't matter what the user is accessing it through. The user could be accessing it only through a phone, they could be a frontline worker, they could be standing at a sales kiosk all day, they could be using Office every single day, or they could be an exec who's only got an iPad.The point is, you're in for a penny, in for a pound at that point that you'll still have to license the user. And so, Microsoft will recoup it either way. In some ways, they've learned to stop caring as much about, is everyone actively using our technology? And on the other side, with things like Teams, and as we're seeing very, very slowly, with the long-delayed Outlook here, you know, they're also trying to switch things to have that less Win32 surface that we're used to and focus more on the web as well. But I think that's a pretty fundamental change for Microsoft to try and take broadly and I don't anticipate, for example, Office will ever be fully replaced with a fat client like it has on Windows and the Mac OS.Corey: Yeah, part of me wonders what the future that all looks like because increasingly, it feels more than a little silly that I'm spending, like, all of this ever-increasing dollar figure on a per-seat basis every year for all of Microsoft 365. Because we don't use their email system. We don't use so much of what they offer. We need basically Word and Excel and once in a blue moon PowerPoint, I guess. But that's it. Our fundamental needs have not materially shifted since Office 2003. Other than the fact that everything uses different extensions now and there's, of course, the security story on top of it, too. We just need some fairly basic stuff.Wes: And I think that's the case for a lot of—I mean, we're the exact same way at Directions. And I think that's the case for a lot of small and even into mid-size companies. Microsoft has traditionally with the, like, Small Business Premium, they have an offering that they intentionally only scale up to 300 people. And sometimes they'll actually give you perks there that they wouldn't give away in the enterprise suite, so you arguably get more—if they let you have it, you get more than you would if you've got E5. On the other side, they've also begun, for enterprises, honing in on opportunities that they may have historically ignored.And when I was at Microsoft, you'd have an idea, like, “Hey, Bob. I got an idea. Can we try to make a new product?” He's like, “Okay, is it a billion-dollar business?” And you get waved away if it wasn't all a billion-dollar business. And I don't think that's the case anymore today, particularly if you can make the case, this thing I'm building makes Microsoft 365 sticky or makes Azure sticky. So, things like the Power Platform, which is subtly and slowly replacing Access at a minimum, but a lot of other tools.Power BI, which has come from behind. You know, people would look at it and say, “Oh, it's no Excel.” And now it, I think, far exceeds Excel for that type of user. And Copilot, as I talked about, you know, Microsoft is definitely trying to throw things in that are beyond Office, beyond what we think of as Microsoft. And why are they doing that? Because they're trying to make their platform more sticky. 
They're trying to put enough value in there so you need to subscribe for every user in your organization.And even things, as we call them, ‘Batteries not Included' like Copilot, that you're going to buy E5 and that you're still going to have to buy something else beyond that for some number of users. So, you may even have a picture in your head of how much it's going to cost, but it's like buying a BMW 5 Series; it's going to cost more than you think.Corey: I wish that there were a better path forward on this. Honestly, I wish that they would stop playing these games, let you know Azure compete head-to-head against AWS and let it win on some of its merits. To be clear, there are several that are great. You know, if they could get out of their own way from a security perspective, lately. But there seems to be a little appetite for that. Increasingly, it seems like even customers asking them questions tends to hit a wall until, you know, a sitting US senator screams at them on Twitter.Wes: Mm-hm. No, and then if you look carefully at—Microsoft is very good at pulling just enough off of the sweater without destroying the sweater. And for example, what they did, they gave enough away to potentially appease, but they didn't actually resolve the problem. They didn't say, “All right, everybody gets logging if they have Microsoft 365 E3,” or, “Everybody gets logging, period.” They basically said, “Here's the kind of logging you can get, and we're going to probably tweak it a little bit more in the future,” and they will not tweak it more in the future. If anything, they'll tighten it back up.This is very similar to the 2019 problem we talked about earlier, too, that you know, they began with one set of rules and they've had to revisit it a couple of times. And most of the time, when they've had an outcry, primarily from the EU, from smaller cloud providers in the EU who felt—justifiably—that Microsoft was being not—uncompetitive with Azure vis-à-vis every other cloud provider. Well, Microsoft turned around and last year changed the rules such that most of these smaller cloud providers get rules that are, ehh, similar to what Azure can provide. There are still exclusives that only Azure gets. So, what you have now is basically, if you're a customer, the best set and cheapest set is with Azure, then these smaller cloud providers give you a secondary—it's close to Azure, but still not quite as good. Then AWS, GCP, and Alibaba.So, the rules have been switched such that you have to know who you're going to in order to even know what the rules are and to know whether you can comply with those rules with the thing you want to build. And I find it most peculiar that, I believe it was the first of last month that Microsoft made the change that said, “You'll be able to run Office on AWS,” which was Amazon WorkSpaces, in particular. Which I think is huge and it's very important and I'm glad they made this change, but it's weird because it creates almost a fifth category because you can't run it anywhere else in Amazon, like if you were spinning something up in VMware on Amazon, but within Amazon WorkSpaces, you can. This is great because customers now can run Office for a fee. 
And it's a fee that's more than you'd pay if you were running the same thing on Microsoft's cloud.But it also was weird because let's say Google had something competitive in VDI, but they don't really, but if they had something competitive in VDI, now this is the benefit that Amazon has that's not quite as good as what Microsoft has, that Google doesn't get it at all. So, it's just weird. And it's all an attempt to hold… to both hold a market strategy and an attempt to grow market share where they're still behind. They are markedly behind in several areas. And I think the reality is, Amazon WorkSpaces is a really fine offering and a lot of customers use it.And we had a customer at our last in-person boot camp in Atlanta, and I was really impressed—she had been to one boot camp before, but I was really impressed at how much work she'd put into making sure we know, “We want to keep using Amazon WorkSpaces. We're very happy with it. We don't want to move anywhere else. Am I correct in understanding that this, this, this, and this? If we do these things will be aboveboard?” And so, she knew how much more she'd have to pay to stay on Amazon WorkSpaces, but it was that important to the company that they'd already bet the farm on the technology, and they didn't want to shift to somebody else that they didn't know.Corey: I'm wondering how many people have installed Office just through a standard Microsoft 365 subscription on a one-off Amazon WorkSpace, just because they had no idea that that was against license terms. I recall spinning up an Amazon WorkSpace back when they first launched, or when they wound up then expanding to Amazon Linux; I forget the exact timeline on this. I have no idea if I did something like that or not. Because it seems like it'd be a logical thing. “Oh, I want to travel with just an iPad. Let me go ahead and run a full desktop somewhere in the cloud. Awesome.”That feels like exactly the sort of thing an audit comes in and then people are on the hook for massive fines as a result. It just feels weird, as opposed to, there are a number of ways to detect you're running on a virtual machine that isn't approved for this. Stop the install. But of course, that doesn't happen, does it?Wes: No. When we teach at the boot camp, Rob will often point out that, you know, licensing is one of the—and it's true—licensing is one of the last things that comes in when Microsoft is releasing a product. It was that way when he was at the company before I was—he shipped Word 1.0 for the Mac, to give you an idea of his epoch—and I was there for XP, like I said, which was the first version that used activation—which was a nightmare—there was a whole dedicated team on. And that team was running down to the wire to get everything installed.And that is still the case today because marketing and legal make decisions about how a product gets sold. Licensing is usually tacked on at the very end if it gets tacked on at all. And in fact, in a lot of the security, compliance, and identity space within Microsoft 365, there is no license compliance. Microsoft will show you a document that, “Hey, we do this,” but it's very performative. You can't actually rely on it, and if you do rely on it, you'll get in trouble during an audit because you've got non-compliance problems. So yeah, it's—you would hope that it keeps you from coloring outside the lines, but it very much does not.Corey: It's just a tax on going about your business, in some ways [sigh].Wes: Exactly. 
“Don't worry, we'll be back to fix it for you later.”Corey: [laugh]. I really appreciate your taking the time to go through this with me. If people want to learn more, where's the best place for them to keep up with what you're up to?Wes: Well, obviously, I'm on Twitter, and—oh, sorry, X, whatever.Corey: No, we're calling it Twitter.Wes: Okay, I'm on—I'm on—[laugh] thank you. I'm on Twitter at @getwired. Same alias over on [BlueSky 00:35:27]. And they can also find me on LinkedIn, if they're looking for a professional question beyond that and want to send a quiet message.The other thing is, of course, go to directionsonmicrosoft.com. And directionsonmicrosoft.com/training if they're interested in one of our licensing boot camps. And like I said, Rob, and I do those every other month. We're increasingly doing them in person. We got one in Bellevue coming up in just a few weeks. So, there's opportunities to learn more.Corey: Excellent. And we will, of course, put links to that in the [show notes 00:35:59]. Thank you so much for taking the time to chat with me again, Wes. It's appreciated.Wes: Thank you for having me.Corey: Wes Miller, Research VP at Directions on Microsoft. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that will no doubt be taken down because you did not sign up for that podcasting platform's proper license level.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Using Data to Tell Stories with Thomas LaRock

Screaming in the Cloud

Play Episode Listen Later Sep 14, 2023 31:37


Thomas LaRock, Principal Developer Evangelist at Selector AI, joins Corey on Screaming in the Cloud to discuss why he loves having a career in data and his most recent undertaking at Selector AI. Thomas explains how his new role aligned perfectly with his career goals in his recent job search, and why Selector AI is not in competition with other data analysis tools. Corey and Thomas discuss the benefits and drawbacks to going back to school for additional degrees, and why it's important to maintain a healthy balance of education and practical experience. Thomas also highlights the impact that data can have on peoples' lives, and why he finds his career in data so meaningful. About ThomasThomas' career and life experiences are best described as follows: he takes things that are hard and makes them simple for others to understand. Thomas is a highly experienced data professional with over 25 years of expertise in diverse roles, from individual contributor to team lead. He is passionate about simplifying complex challenges for others and leading with empathy, challenging assumptions, and embracing a systems-thinking approach. Thomas has strong analytical reasoning skills and expertise to identify trends and opportunities for significant impact, and is a builder of cohesive teams by breaking down silos resulting in increased efficiencies and collective success. He has a track record of driving revenue growth, spearheading industry-leading events, and fostering valuable relationships with major tech players like Microsoft and VMware. Links Referenced: Selector: https://www.selector.ai/ LinkedIn: https://www.linkedin.com/in/sqlrockstar/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Do you wish there were cheat codes for database optimization? Well, there are – no seriously. If you're using Postgres or MySQL on Amazon Aurora or RDS, OtterTune uses AI to automatically optimize your knobs and indexes and queries and other bits and bobs in databases. OtterTune applies optimal settings and recommendations in the background or surfaces them to you and allows you to do it. The best part is that there's no cost to try it. Get a free, thirty-day trial to take it for a test drive. Go to ottertune dot com to learn more. That's O-T-T-E-R-T-U-N-E dot com.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. There are some guests I have been nagging-slash-angling to have on this show for years on end, and that you almost give up, until they wind up having a job change. At which point, there's no better opportunity to pounce like some sort of scavenger or hyena or whatnot in order to get them on before their new employer understands what I am, and out of an overabundance of caution, decides not to talk with me. Thomas LaRock is a recently minted Principal Developer Evangelist at Selector. Thomas, thank you for finally deigning to appear on the show. It is deeply appreciated.Thomas: Oh, thanks for having me. Thanks for extending invitation. I'm sorry. It's my fault I haven't come here before now; it's just been one of those scheduling things. And I always think I'm going to see you. 
Like, I'll go to re:Invent, and I'm like, “I'll see Corey there.” And then, nah, Corey is a little busy.Corey: Yeah, I have no recollection of basically anything that ever happens at re:Invent, just because it is eight days of ridiculous Cloud Chanukah and thing to thing to thing to thing to thing. It's just overload and I wind up effectively blocking all of it out. You are one of those very interesting people where, depending upon the context in which someone encounters you, it's difficult to actually put a finger on where you start and where you stop. You are, for example, a Microsoft MVP, which means you presumably have a fair depth of experience with at least some subset of Microsoft products. You have been working at SolarWinds for a while now, and you also have the username of SQLRockstar on a number of social media environments, which leads me to think, oh, you're a database person. What are you exactly? Where do you start? Where do you stop?Thomas: Yeah, in my heart-of-hearts, a data professional. And that can mean a lot of things to a lot of different people. My latest thing I've taken from a friend where I just call myself a data janitor because that's pretty much what I do all day, right? I'll clean data up, I'll move it around, it's a pile here and a pile there. But that's my heart of hearts. I've been a database administrator, I've been the data advocate. I've done a lot of roles, but it's always been heavily focused on data.Corey: So, these days, your new role—let's start at the present and see if we work our way backwards or not—you've been, at the time of this recording, in your role for a week where you are a principal developer evangelist at Selector, which to my understanding, is an AIOps or MLOps or whatever buzzword that we're sprinkling on top of things today is, which of course presupposes having some amount of data to wind up operating on. What do you folks do over there?Thomas: That's a great question. I'm hoping to figure that out eventually. No. So, here's the thing, Corey. So, when I started my unforced sabbatical this past June, I was, of course, doing what everybody does: panicking. And I was looking for job opportunities just about anywhere.But I, again, data professional. I really wanted a role that would allow me to use my math skills—I have a master's in mathematics—I wanted to use those math and analytical skills and go beyond the data into the application of the data. So, in the past five, six years, I've been earning a lot of data science certifications, I've been just getting back into my roots, right, statistical analysis, even my Six Sigma training is suddenly relevant again. So, what happened was I was on LinkedIn and friend had posted a note and mentioned Selector. I clicked on the link, and [all of sudden 00:04:17] I read, I go, “So, here's a company that is literally building new tools and it's data-science-centric. Is data-science-first.”It is, “We are going to find a way to go through your data and truly build out a better set of correlations to get you a signal through the noise.” Traditional monitoring tools, you know, collect a lot of things and then they kind of tell you what's wrong. Or you're collecting a lot of different things, so they slap, like, I don't know, timestamps in there and they guess at correlations. And these people are like, “No, no, no. 
We're going to go through everything and we will tell you what the data really says about your environment.”And I thought it was crazy how at the moment I was looking for a role that involve data and advocacy, the moment I'm looking for that role, that company was looking for someone like me. And so, I reached out immediately. They wanted not just a resume, but they're like, where's your portfolio? Have you spoken before? I'm like, “Yeah, I've spoken in a couple places,” right?So, I gave them everything, I reached right out to the recruiter. I said, “In case it doesn't arrive, let me know. I'll send it again. But this sounds very interesting.” And it didn't take more than—Corey: Exactly. [unintelligible 00:05:24] delivery remains hard.Thomas: Yeah. And it didn't take more than a couple of weeks. And I had gone through four or five interviews, they said that they were going to probably fly me out to Santa Clara to do, like, a last round or whatever. That got changed at some point and we went from, “Hey, we'll have you fly out,” to, “Hey, here's the offer. Why don't you just sign?” And I'm like, “Yeah, I'll start Monday. Let's go.”Corey: Fantastic. I imagine at some point, you'll be out in this neck of the woods just for an off-site or an all-hands or basically to stare someone down when you have a sufficiently large disagreement.Thomas: Yes, I do expect to be out there at some point. Matter of fact, I think one of my trips coming up might be to San Diego if you happen to head down south.Corey: Oh, I find myself all over the place these days, which is frankly, a welcome change after a few years of seclusion during the glorious pandemic years. What I like about Selector's approach, from what I can tell at least, is that it doesn't ask all of its customers to, “Hey, you know, all that stuff that you've instrumented over the last 20 years with a variety of different tools in the observability pipeline? Yeah, rip them all out and replace them with our new shiny thing.” Which never freaking happens. It feels like it's a better step toward meeting folks where they are.Thomas: Yeah. So, we're finding—I talk like I've been there forever: “What we're finding,”—in the past 40 hours of my work experience there, what we're finding, if you just look at the companies that are listed on the website, you'll get an idea for the scale that we're talking about. So no, we're not there to rip and replace. We're not going to show up and tell you, “Yeah, get rid of everything. We're going to do that for you.”Matter of fact, we think it's great you have all of those different things because it just reflects the complexity of your environment right now, is that you've grown, you've got so many disparate systems, you've got some of the technologies trying to monitor it all, and you're really hoping to have everything rolled into one big dashboard, right? Instead of right now, you've got to go through three, four, or five dashboards, to even think you have an idea of the problem. And you never really—you guess. We all guess. We think we know where it is, and you start looking and then you figure it out.But yeah, we take kind of a different approach right from the start, and we say, “Great, you've got all that data? Ingest it. Bring it right to us, okay? 
We don't care where it comes from, we can bring it in, and we can start going through it and start giving you true actionable insights.” We can filter out the noise, right: instead of one node going down triggering a thousand alerts, we can just filter all of that out for you and just let you focus on the things that you need to be looking at right now.

Corey: One of the things that I think gets overlooked in this space a lot is, “Well, we have this tool that does way better than that legacy tool that you're using right now and it's super easy to do just a drop-in replacement with our new awesomeness.” Great. What that completely misses is that there are other business units who perhaps care about data interchange and the idea that yeah, the thing's a legacy piece of junk and replacing it would take an afternoon. And then it would take 14 years to wind up redoing all the other reports that other things are generating downstream of that, because they integrate with that thing. So yeah, it's easy to replace the thing itself, but not in a way that anything else can take advantage of it.

Thomas: Right.

Corey: And it also turns out that when you sit there making fun of people's historical technological decisions, they don't really like becoming customers. This was something of a shock for an awful lot of very self-assured startup founders in the early days.

Thomas: Right.

Corey: People make decisions based upon human aspects, not about arithmetic, in most cases. I will say, taking a glance at the website, a couple of things are very promising. One, your picture and profile are already up there, which is good. No one is still on the fence about that. And further, as a bonus, they've taken your job role down off the website. It's always disconcerting when you're in the seat and the listing is still up: “Why is that job still open?” “Oh, we're preserving optionality. Don't you worry your head about that. We've got it.” No one finds that a reassuring story when it's about the role that they're in. So, good selection.

Thomas: I went to—after I signed, it was within the day, I went to send somebody the link to the job req. Like, they're like, “What are”—I go, “Here, let me show you.” It was already down.
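As a purely illustrative sketch of the noise reduction Thomas describes earlier in this exchange, where one node failing triggers a thousand downstream alerts, here is a toy example of collapsing a flood of symptom alerts behind a single probable cause. This is not Selector's actual pipeline or API; the alert fields, the upstream dependency attribute, and the grouping rule are assumptions invented for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Alert:
    timestamp: float     # seconds since epoch
    device: str          # device that raised the alert
    upstream: str        # assumed field: the node this device depends on
    message: str

def correlate(alerts: list[Alert], window_s: float = 60.0) -> dict[str, list[Alert]]:
    """Group alerts that share an upstream dependency and arrive close together."""
    groups: dict[str, list[Alert]] = defaultdict(list)
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        # Start a fresh bucket when the last alert for this upstream
        # is older than the correlation window.
        bucket_key = alert.upstream
        bucket = groups[bucket_key]
        if bucket and alert.timestamp - bucket[-1].timestamp > window_s:
            bucket_key = f"{alert.upstream}@{int(alert.timestamp)}"
        groups[bucket_key].append(alert)
    return groups

# Toy data: one core switch failing takes three downstream devices with it.
alerts = [
    Alert(1000.0, "leaf-1", "core-sw-7", "BGP session down"),
    Alert(1001.5, "leaf-2", "core-sw-7", "BGP session down"),
    Alert(1002.0, "leaf-3", "core-sw-7", "interface flapping"),
    Alert(1500.0, "edge-9", "core-sw-2", "high latency"),
]
for cause, related in correlate(alerts).items():
    print(f"probable cause {cause}: {len(related)} related alert(s)")
```

In a real system the dependency information would come from ingested topology and telemetry rather than a hand-labeled upstream field, but the shape of the reduction, many symptom alerts in and a handful of probable causes out, is the point being made.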
The ink was even dry on the DocuSign and it was already down. So, I thought that—Corey: Good on them.Thomas: —was a good sign, too.Corey: Oh, yeah. Now, looking at the rest of your website, I do see a couple of things that lead to natural questions. One of the first things I look at on a web page is, okay, how is this thing priced? Because you always want to see the free tier option when I'm trying to solve a problem the middle of the night that I can just sign up for and see if it works for a small use case, but you also, in a big company definitely want to have the ‘Contact Us' option because we're procurement and we don't know how to sign a deal that doesn't have two commas in it with a bunch of special terms that ride along with it. Selector does not at the time of this recording, have a pricing page at all, which usually indicates if you have to ask, it might not be for you.Then I look at your customer case studies and they talk about very large enterprises, such as a major cable operator, for example, or TracFone. And oh okay, yeah, that is probably not the scale that I tend to be operating at. So, if I were to envision this as a carnival ride and there's a sign next to it, “You must be at least this tall to ride,” how tall should someone be?Thomas: That is a great way of putting it and I would—I can't really go into specifics because I'm still kind of new. But my understanding—Corey: Oh yeah. Make sweeping policy statements about your new employer 40 hours in. What could possibly go wrong?Thomas: My understanding is the companies that we—that are our target market today are fairly large enterprises with real data challenges, real monitoring data challenges. And so no, we're not doing—it's not transactional. You can't just come to our website and say, “Here, click this, you'll be up and running.” Because the volumes of data we're talking about, this requires a little bit of specialty in helping make sure that things are getting set up and correct.Think of it this way. Like if somebody said, “Here, do the statistical analysis on whatever, and here's Excel and go at it and get me that report by the end of the day and tell me how we're doing,” most people would be like, “I don't have enough information on that. Can you help me?” So, we're still at that, hey, we're going to need to help you through this and make sure it's correctly configured. And it's doing what you expect. So, how tall are you? I think that goes both ways. I think you're at a height where you still need some supervision [laugh]. Does that make sense?Corey: I think that's probably a good way of framing it. It's a—again, I'm not saying that you should never ever, ever, ever have a ‘you must contact us to get started.' There are a bunch of products like that out there. It turns out that even at The Duckbill Group here, we always want to have a series of conversations first. We don't have a shopping cart that's, “One consulting, please,” just because we'll get into trouble with that.Though I think our first pass offering of a two-day engagement might have one of those somewhere still lurking around. Don't quote me on that. Hell is other people's websites. It's great. But your own yeah, whoever reads that thing“. Wait, we're saying what?” Don't quote me on any of that, my God.Thomas: But I think that's a good way of putting it. Like, you want to have some conversations first. Yeah, so you—and again, we're still, we're fairly young. 
We've only—we're Series A, so we've been around 16 months, like… you know, the other website you're looking at is probably going to change within the next six or eight weeks just because information gets outdated—Corey: It already has. It put your picture on it.Thomas: Right. But I mean, things are going to things move pretty fast with startups, especially this one. So, I just expect that over time, I envision some type of a free tier, but we're not there yet.Corey: That's one of those challenges as far as in some cases moving down market. I found that anything that acts like a security tool, for example, has to, on some level, charge enough to be worth the squeeze. One of the challenges there is, I'm either limited for anything that does CloudTrail analysis over in AWS-land, for example. I can either find a bunch of janky things off GitHub or I can spend what starts at $1,000 a month and increases rapidly from there, which is about twice the actual AWS bill that it would wind up alerting on. Not that the business value isn't there, but because a complex sale is, in many cases, always going to be attendant with some of these products, so why not go after the larger companies where the juice is worth the squeeze rather than the folks who are not going to see the value and it'd be just as challenging to wind up launching a sale into?The corollary, of course, is that some of those small companies do in fact, grow meteorically. But it's a bit of a lottery.Thomas: Yep.Corey: Ugh. So, I have to ask as well, while we're talking about strange decisions that people might have made, in the world of tech, in many cases, when someone gets promoted—like, “So, does that mean extra money?” “No, not really. We just get extra adjectives added to our job title.” Good for us. You have decided to add letters in a different way, by going back for a second master's degree. What on earth would possess you to do such a thing?Thomas: I—man, that is—you know, so I got my first master's degree because I thought I was going to, I thought I was be a math teacher and basketball coach. And I had a master's degree in math and I thought that was going to be a thing. I'll get a job, you know, coaching and teaching at some small school somewhere. But then I realized that I enjoyed things like eating and keeping the wind off me, and so I realized I had to go get a jobby-job. And so, I took my masters in math, I ended—I got a job as a software analyst, and just rolled that from one thing to another until where I am today.But about four years ago, when I started falling back in love with my roots in math, and statistical analysis became a real easy thing for people to really start doing for themselves—well actually, that was about eight years ago—but the past four or five years, I've been earning more certifications in data science technologies. And then I found this program at Georgia Tech. So, Georgia Tech has an online masters of science and data analytics. And it's extremely affordable. So, I looked at a lot of programs, Corey, over the past few years, especially during the pandemic.I had some free time, so I browsed the love these places, and they were charging 50, $60,000 and you had to do it within two, three years. And in one case, the last class you had to take, your practicum, had to be all done on campus. So, you had to go, like, live somewhere. And I'm looking at all—none of that was practical. And all of a sudden, somebody shows up and goes, “So, you can go online, fully online, Georgia Tech, $275 a credit. 
Costs ten grand for the entire program.”And you can—it's geared towards a working professional and you can take anywhere from two to six years. So, you take, like, one class a semester if you want, or two or even three if they allow you, but they usually restrict you. So, it just blew my mind. Like, this exists today that I can start earning another Master's degree in data analytics and I'll say, be… classically trained in how—it's funny because when I learn things in class, I'm like, I feel like I'm Thornton Melon in Back to School, and I'm just like, “Oh, you left out a bunch of stuff. That isn't how you do it all,” right?That's kind of my reaction. I'm like, “Calm down. I'm sure the professor has point. I'll hear [laugh] him out.” But to me, you asked why, and I just the challenge. Am I really good at what I do? Like, I feel I am. I already have a master's degree. I'm not worried about the level of work and the commitment involved in earning another one.I just wanted to show to myself that could—I want to learn and make sure I can do things like code in Python. If anybody has a chance to take a programming class, a graduate-level programming classes at Georgia Tech, you should do it. You should see where your skills rate at that level, right? So, it was for the challenge. I want to know if I can do it. I'm three classes in. I just started my fourth, actually, today was the start of the fall semester.And so, I'm about halfway through, and I'm loving it. It's not too taxing. It's just the right speed for me. I get to do it in my leisure hours as they were. Yeah, so I did it for the challenge. I'm really glad I'm doing it. I encourage anybody interested in obtaining a degree in data analytics to look at the Georgia Tech program. It's well worth it. Georgia Tech's not a bad school. Like, if you had to go to school in the South, it's all right.Corey: I always find it odd, just, you had your first master's degree in, you know, mathematics, and now you're going for data analytics, which sounds like mathematics with extra steps.Thomas: It is.Corey: Were there opportunities that you were hoping to pursue that were not available to you with just the one master's degree?Thomas: So, it's interesting you say that because I'm so old that when I went to school, all we had was math, that was it. It was pure mathematics. I could have been a statistics major, I think, and computer science was a thing. And one day I met a guy who transferred into math from computer science. I'm like, “Why would you do that? What are you going to do with the degree in math?”And his response is, “What am I going to do with a degree in computer science?” And I look back and I realized how we were both right. So, I think at the time if there had been a course in applied mathematics, that would have piqued my interest. Like, what am I going to do with this math degree other than become an actuary because that was about all I knew at the time. You were a teacher or an actuary, and that was about it.So, the idea now that they have these programs in data analytics or data science that are little more narrow of focus, like, “This is what we're going to do: we're going to apply a little bit of math, some calculus, some stats; we're going to show you how to build your own simulations; we're going to show you how to ask the right questions of the data.” To give you a little bit of training. Because they can't teach you everything. 
You really have to have real-world experience in whatever domain you're going to focus on, be it finance or marketing or whatever. All these bright financial operations, that's just analytics for finance, marketing operations, that's analytics for marketing. It's just, to me, I think just the opportunity to have that focus would have been great back then and it didn't exist. And I want to take advantage of it now.Corey: I've always been a fan of advising people who ask me, “Should I go back to school,” because usually, there's something else driving that. Like, I am honestly not much of a career mentor. My value basically comes in as being a horrible warning to others. On paper, I have an eighth-grade education. I am not someone to follow for academic approaches.But when someone early or mid-career asks, “Should I get another degree?” Unpacking that is always a bit of a fun direction for me to go in. Because at some level, we've sold entire generations a bill of goods, where oh, if you don't know what to do, just get more credentials and then your path will be open to you in a bunch of new and exciting ways. Okay, great. I'm not saying that's inherently wrong, but talk to people doing the thing you'd want to do after you have that degree, maybe, you know, five or six years down the professional line from where you are and get their take on it.Because in some cases, yeah, there are definite credentials you're going to need—I don't want you to be a self-taught surgeon, for example—but there are other things where it doesn't necessarily open doors. People are just reflexively deciding that I'm going to go after that instead. And then you can start doing the math of, okay, assume that you have whatever the cost of the degree is in terms of actual cost and opportunity cost. Is this the best path forward for you to wind up getting where you want to go? It sounds like in your particular case, this is almost a labor of love or a hobby style of approach, as opposed to, “Well, I really want Job X, but I just can't get it without the right letters after my name.” Is that a fair assessment?Thomas: It's not unfair. It is definitely fair, but I would also say, you know, if somebody came and said, “Hey Tom, we need somebody to run our data science team or our data engineering team,” I've got the experience for—the only thing I would be lacking is, you know, production experience, like, with machine-learning pipelines or something. I don't have that today.Corey: Which is basically everyone else, too, but that's a little—bit of a quiet secret in the industry.Thomas: Yeah, that's—okay. Bad example. But you know what I'm saying is that the only thing I'd be lacking would be that practical experience, so this is one way that—to at least start that little bit of experience, especially with the end result being the practicum that we'll be doing. It's, like, six credits at the very end. So yes, it's a fair thing.I wouldn't—hobby isn't really the right—this is really something that makes me get out of bed in the morning. I get to work with data today and I'm going to get—I'm going to tell a great story using data today. I really do enjoy those things. But then at the tail end of this, if it happens to lead to a position that somebody says, “Hey, we need somebody, vice president of data engineering. This a really good”—honestly, the things I look for are the roles and the roles I want are to have a role that allows me to really have an impact on other people's lives.And that's one of the things about Selector. 
The things that we're able to do for these admins that are just drowning in data, the data is just in their way, and that we can help them make sense of it all, to me, that's impactful. So, those are the types of roles that I will be looking for as well in the future, especially at the high level of something data science-y.Corey: I think that that is a terrific example of what I'm talking about. Because I've met a number of folks, especially in the very early-20s range where, okay, they've gotten the degree, but now they don't know what to do because every time they're applying for jobs, it doesn't seem to work for them. You've been around this industry for 25 years. Everyone needs a piece of paper that says they know certain things, and in your case, it long ago transitioned into being—I would assume—your resumé, the history of things you have done that look equivalent. Part of me, on some level, wonders if there isn't an academic snobbery going on at some level, where a number of teams are, “Oh, we'd love to have you in, but you don't have a PhD.” And then people get the PhD. “From the right school, in the right area of concentration.” It's like, you just keep moving these very expensive goalposts super quickly. Remember, I have an eighth-grade education. I'm not coming at this from a place of snobbery, and I'm also not one of those folks who says, well, it didn't work for me, therefore, it won't work for anyone else either, because that's equally terrible in a different direction. It's just making sure that people are going into these things with their eyes open. With you, it's never been a concern. You've been around this industry so long that it is extremely unlikely to me [laugh] that you'd go, “Oh, wait. You mean a degree won't magically solve all of my problems and regrow some of my hair and make me two inches taller, et cetera, et cetera?” But yeah, I do remember in the early days just how insipid and how omnipresent that pressure was.Thomas: Yeah. I've been at companies where we've brought in people because of the education and—or I'm sorry. Let's be more specific. I've been at companies where we've sent current employees off to—as we used to call it—charm school, which is basically [MBA 00:25:44].Corey: [laugh].Thomas: And I swear, so many of them came back and they just forgot how to think, how to have common sense. Like, they were very much focused on one particular thing and this is just it, and they forgot there were maybe humans involved, and maybe look for a human answer instead of the statistically correct one. So, I think that was a good thing for me as well to be around that because, yeah, somebody put it to me best years ago: “Education by itself isn't enough. If you combine education with motivation, now you've really got something.” And in your case, I don't know where you went for eighth grade, it could have been the best eighth-grade program ever, but you definitely have the motivation through the years to overcome anything that might have been lacking in the form of education. So, it's really the combination—Corey: Oh, you'd be surprised. A lot of those things are still readily apparent to people who work with me, so I've done a good job of camouflaging them. Huzzah.Thomas: It's just, you've got to have both. You can't just rely on one or the other.Corey: So, last question, given that you are the data guy and SQLRockstar is your username in a bunch of places. What's the best database? 
I mean, I would always say it's Route 53, but I understand that can be controversial for some folks, given that their SQL implementation is not yet complete. What's your take?Thomas: So clearly, I'm partial to anything inside the Microsoft data platform, with the exception being Access. I think if Access disappeared from the universe… society might be better off. But that's for a different day. I think the best database is the one that does the job you need it to do. Honestly, the database shouldn't really matter. It's just an abstraction. The database engine is just something in between you and the data you need, right? So, whatever you're using, if it's doing the job that you need it to do, then that's the best database you could have. I learned a long time ago to not pick sides or choose fiefdoms. Like, it just didn't matter. It's all kind of the same. And in a lot of cases, if you go to, like, the DB-Engines Rankings, you'll see how, with many of these systems these days, there's a lot of overlap. They offer all the same features and the differences between them are getting smaller and smaller in a lot of cases. So yeah, it's… you've got a database, it does what you need it to do? That's great. That's the best database.Corey: Especially since any database, I suspect, can be made to perform a given task, even if sub-optimally. Which goes back to my core ethos of, quite frankly, anything is a database if you hold it wrong.Thomas: Yeah, it really is. I mean, we've had those discussions. I kid about Access because it's just a painful thing for a lot of different reasons. But is Excel a database? And I would say no but, you know—because it can't do certain things that I would expect a relational engine to do. And then you find out, well, I can make it do those things. So, now is it a database? And, yeah…Corey: [laugh]. Yeah. Well, what if I apply some brute force? Will it count then? Like, you have information, Thomas. Can I query you?Thomas: Yes. Yes, yes, [laugh] you can. I also have latency.Corey: Exactly. That means you are a suboptimal database.Thomas: [laugh].Corey: Good job. I really want to thank you for taking the time to talk about what you're up to these days and finally coming on the show. If people want to learn more, where's the best place for them to find you?Thomas: Well, I'm becoming more active on LinkedIn. So, it's linkedin/in/sqlrockstar. Just search for SQLRockstar, you'll find me everywhere. I mean, I do have a blog. I rarely blog these days. Most of the posts I do are over at LinkedIn. And you might find me at some networking events coming up since Selector really does focus on network observability. So, you could see me there. And you know what? I'm also going to have an appearance on the Screaming in the Cloud podcast, so you can listen to me there.Corey: Excellent. And I imagine that's the one we don't have to put into these [show notes 00:29:44]. Thank you so much for taking the time to speak with me. I really do appreciate it.Thomas: Thanks for having me, Corey. I look forward to coming back.Corey: As I look forward to seeing you again over here. Thomas LaRock, Principal Developer Evangelist at Selector. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment because then we're going to use all those together as a distributed database.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
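The Route 53 gag above has a concrete, if deeply inadvisable, form. What follows is a minimal sketch—assuming Python with boto3 installed, a hosted zone you actually control, and purely hypothetical zone ID and record names—of abusing TXT records as a tiny key-value store. It illustrates the "anything is a database if you hold it wrong" ethos; it is not a recommendation.

```python
import boto3

route53 = boto3.client("route53")
ZONE_ID = "Z0123456789EXAMPLE"   # hypothetical hosted zone ID
DOMAIN = "db.example.com"        # hypothetical zone you control


def put(key: str, value: str) -> None:
    """'INSERT' a row by upserting a TXT record; TXT values must be wrapped in quotes."""
    route53.change_resource_record_sets(
        HostedZoneId=ZONE_ID,
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": f"{key}.{DOMAIN}",
                "Type": "TXT",
                "TTL": 60,
                "ResourceRecords": [{"Value": f'"{value}"'}],
            },
        }]},
    )


def get(key: str) -> str | None:
    """'SELECT' a row by listing record sets starting at the key's name."""
    resp = route53.list_resource_record_sets(
        HostedZoneId=ZONE_ID,
        StartRecordName=f"{key}.{DOMAIN}",
        StartRecordType="TXT",
        MaxItems="1",
    )
    for rrset in resp["ResourceRecordSets"]:
        # Route 53 returns names with a trailing dot; strip it before comparing.
        if rrset["Name"].rstrip(".") == f"{key}.{DOMAIN}" and rrset["Type"] == "TXT":
            return rrset["ResourceRecords"][0]["Value"].strip('"')
    return None


if __name__ == "__main__":
    put("customer-42", "pays their AWS bill on time")
    print(get("customer-42"))
```

The "SQL implementation" is, as noted, not yet complete: there are no joins, no transactions, and your query planner is a DNS resolver.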

Screaming in the Cloud
Defining a Database with Tony Baer

Screaming in the Cloud

Play Episode Listen Later Sep 12, 2023 30:20


Tony Baer, Principal at dbInsight, joins Corey on Screaming in the Cloud to discuss his definition of what is and isn't a database, and the trends he's seeing in the industry. Tony explains why it's important to try and have an outsider's perspective when evaluating new ideas, and the growing awareness of the impact data has on our daily lives. Corey and Tony discuss the importance of working towards true operational simplicity in the cloud, and Tony also shares why explainability in generative AI is so crucial as the technology advances. About TonyTony Baer, the founder and CEO of dbInsight, is a recognized industry expert in extending data management practices, governance, and advanced analytics to address the desire of enterprises to generate meaningful value from data-driven transformation. His combined expertise in both legacy database technologies and emerging cloud and analytics technologies shapes how clients go to market in an industry undergoing significant transformation. During his 10 years as a principal analyst at Ovum, he established successful research practices in the firm's fastest growing categories, including big data, cloud data management, and product lifecycle management. He advised Ovum clients regarding product roadmap, positioning, and messaging and helped them understand how to evolve data management and analytic strategies as the cloud, big data, and AI moved the goal posts. Baer was one of Ovum's most heavily-billed analysts and provided strategic counsel to enterprises spanning the Fortune 100 to fast-growing privately held companies.With the cloud transforming the competitive landscape for database and analytics providers, Baer led deep dive research on the data platform portfolios of AWS, Microsoft Azure, and Google Cloud, and on how cloud transformation changed the roadmaps for incumbents such as Oracle, IBM, SAP, and Teradata. While at Ovum, he originated the term “Fast Data” which has since become synonymous with real-time streaming analytics.Baer's thought leadership and broad market influence in big data and analytics has been formally recognized on numerous occasions. Analytics Insight named him one of the 2019 Top 100 Artificial Intelligence and Big Data Influencers. Previous citations include Onalytica, which named Baer as one of the world's Top 20 thought leaders and influencers on Data Science; Analytics Week, which named him as one of 200 top thought leaders in Big Data and Analytics; and by KDnuggets, which listed Baer as one of the Top 12 top data analytics thought leaders on Twitter. While at Ovum, Baer was Ovum's IT's most visible and publicly quoted analyst, and was cited by Ovum's parent company Informa as Brand Ambassador in 2017. In raw numbers, Baer has 14,000 followers on Twitter, and his ZDnet “Big on Data” posts are read 20,000 – 30,000 times monthly. He is also a frequent speaker at industry conferences such as Strata Data and Spark Summit.Links Referenced:dbInsight: https://dbinsight.io/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is brought to us in part by our friends at RedHat.As your organization grows, so does the complexity of your IT resources. 
You need a flexible solution that lets you deploy, manage, and scale workloads throughout your entire ecosystem. The Red Hat Ansible Automation Platform simplifies the management of applications and services across your hybrid infrastructure with one platform. Look for it on the AWS Marketplace.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Back in my early formative years, I was an SRE sysadmin type, and one of the areas I always avoided was databases, or frankly, anything stateful because I am clumsy and unlucky and that's a bad combination to bring within spitting distance of anything that, you know, can't be spun back up intact, like databases. So, as a result, I tend not to spend a lot of time historically living in that world. It's time to expand horizons and think about this a little bit differently. My guest today is Tony Baer, principal at dbInsight. Tony, thank you for joining me.Tony: Oh, Corey, thanks for having me. And by the way, we'll try and basically knock down your primal fear of databases today. That's my mission.Corey: We're going to instill new fears in you. Because I was looking through a lot of your work over the years, and the criticism I have—and always the best place to deliver criticism is massively in public—is that you take a very conservative, stodgy approach to defining a database, whereas I'm on the opposite side of the world. I contain information. You can ask me about it, which we'll call querying. That's right. I'm a database.But I've never yet found myself listed in any of your analyses around various database options. So, what is your definition of databases these days? Where do they start and stop? Tony: Oh, gosh.Corey: Because anything can be a database if you hold it wrong.Tony: [laugh]. I think one of the last things I've ever been called is conservative and stodgy, so this is certainly a way to basically put the thumbtack on my chair.Corey: Exactly. I'm trying to normalize my own brand of lunacy, so we'll see how it goes.Tony: Exactly because that's the role I normally play with my clients. So, now the shoe is on the other foot. What I view a database is, is basically a managed collection of data, and it's managed to the point where essentially, a database should be transactional—in other words, when I basically put some data in, I should have some positive information, I should hopefully, depending on the type of database, have some sort of guidelines or schema or model for how I structure the data. So, I mean, database, you know, even though you keep hearing about unstructured data, the fact is—Corey: Schemaless databases and data stores. Yeah, it was all the rage for a few years.Tony: Yeah, except that they all have schemas, just that those schemaless databases just have very variable schema. They're still schema.Corey: A question that I have is you obviously think deeply about these things, which should not come as a surprise to anyone. It's like, “Well, this is where I spend my entire career. Imagine that. I might think about the problem space a little bit.” But you have, to my understanding, never worked with databases in anger yourself. 
You don't have a history as a DBA or as an engineer—Tony: No.Corey: —but what I find very odd is that unlike a whole bunch of other analysts that I'm not going to name, but people know who I'm talking about regardless, you bring actual insights into this that I find useful and compelling, instead of reverting to the mean of well, I don't actually understand how any of these things work in reality, so I'm just going to believe whoever sounds the most confident when I ask a bunch of people about these things. Are you just asking the right people who also happen to sound confident? But how do you get away from that very common analyst trap?Tony: Well, a couple of things. One is I purposely play the role of outside observer. In other words, like, the idea is that if basically an idea is supposed to stand on its own legs, it has to make sense. If I've been working inside the industry, I might take too many things for granted. And a good example of this goes back, actually, to my early days—actually this goes back to my freshman year in college where I was taking an organic chem course for non-majors, and it was taught as a logic course not as a memorization course.And we were given the option at the end of the term to either, basically, take a final or  do a paper. So, of course, me being a writer I thought, I can BS my way through this. But what I found—and this is what fascinated me—is that as long as certain technical terms were defined for me, I found a logic to the way things work. And so, that really informs how I approach databases, how I approach technology today is I look at the logic  on how things work. That being said, in order for me to understand that, I need to know twice as much as the next guy in order to be able to speak that because I just don't do this in my sleep.Corey: That goes a big step toward, I guess, addressing a lot of these things, but it also feels like—and maybe this is just me paying closer attention—that the world of databases and data and analytics have really coalesced or emerged in a very different way over the past decade-ish. It used to be, at least from my perspective, that oh, that the actual, all the data we store, that's a storage admin problem. And that was about managing NetApps and SANs and the rest. And then you had the database side of it, which functionally from the storage side of the world was just a big file or series of files that are the backing store for the database. And okay, there's not a lot of cross-communication going on there.Then with the rise of object store, it started being a little bit different. And even the way that everyone is talking about getting meaning from data has really seem to be evolving at an incredibly intense clip lately. Is that an accurate perception, or have I just been asleep at the wheel for a while and finally woke up?Tony: No, I think you're onto something there. And the reason is that, one, data is touching us all around ourselves, and the fact is, I mean, I'm you can see it in the same way that all of a sudden that people know how to spell AI. They may not know what it means, but the thing is, there is an awareness the data that we work with, the data that is about us, it follows us, and with the cloud, this data has—well, I should say not just with the cloud but with smart mobile devices—we'll blame that—we are all each founts of data, and rich founts of data. 
And people in all walks of life, not just in the industry, are now becoming aware of it and there's a lot of concern about can we have any control, any ownership over the data that should be ours? So, I think that phenomenon has also happened in the enterprise, where essentially, where we used to think that the data was the DBAs' issue, it's become the app developers' issue, it's become the business analysts' issue. Because the answers that we get, we're ultimately accountable for. It all comes from the data.Corey: It also feels like there's this idea of databases themselves becoming more contextually aware of the data contained within them. Originally, this used to be in the realm of, “Oh, we know what's been accessed recently and we can tier out where it lives for storage optimization purposes.” Okay, great, but what I'm seeing now almost seems to be a sense of, people like to talk about pouring ML into their database offerings. And I'm not able to tell whether that is something that adds actual value, or if it's marketing-ware.Tony: Okay. First off, let me kind of dispel a couple of things. First of all, it's not a question of the database becoming aware. A database is not sentient.Corey: Neither are some engineers, but that's neither here nor there.Tony: That would be true, but then again, I don't want anyone with shotguns lining up at my door after this—Corey: [laugh].Tony: —after this interview is published. But [laugh] more to the point, though, is that I can see a couple of roles for machine learning in databases. One is that the database itself—the logs—are an incredible font of data, of operational data. And you can look at trends in terms of when this—when the pattern of these logs goes this way, that is likely to happen. So, the thing is that I could very easily say we're already seeing it: machine learning being used to help optimize the operation of databases—if you're Oracle, you'd say, “Hey, we can have a database that runs itself.” The other side of the coin is being able to run your own machine-learning models in database as opposed to having to go out into a separate cluster and move the data, and that's becoming more and more of a checkbox feature. However, that's going to be for essentially, probably, like, the low-hanging fruit, like the 80/20 rule. It'll be like the 20% of an ana—of relatively rudimentary, you know, let's say, predictive analyses that we can do inside the database. If you're going to be doing something more ambitious, such as a, you know, a large language model, you probably do not want to run that in the database itself. So, there's a difference there.Corey: One would hope. I mean, one of the inappropriate uses of technology that I go for all the time is finding ways to—as directed or otherwise—in off-label uses find ways of tricking different services into running containers for me. It's kind of a problem; this is probably why everyone is very grateful I no longer write production code for anyone. But it does seem that there's been an awful lot of noise lately. I'm lazy. I take shortcuts very often, and one of those is that whenever AWS talks about something extensively through multiple marketing cycles, it becomes usually a pretty good indicator that they're on their back foot on that area. And for a long time, they were doing that about data and how it's very important to gather data, it unlocks the key to your business, but it always felt a little hollow-slash-hypocritical to me because you're going to some of the same events that I have that AWS throws on. 
You notice how you have to fill out the exact same form with a whole bunch of mandatory fields every single time, but there never seems to be anything that gets spat back out to you that demonstrates that any human or system has ever read—Tony: Right.Corey: Any of that? It's basically a, “Do what we say, not what we do,” style of story. And I always found that to be a little bit disingenuous.Tony: I don't want to just harp on AWS here. Of course, we can always talk about the two-pizza box rule and the fact that you have lots of small teams there, but I'd rather generalize this. And I think you really—what you're just describing has been my trip through the healthcare system. I had some sports-related injuries this summer, so I've been through a couple of surgeries to repair sports injuries. And it's amazing that every time you go to the doctor's office, you're filling out the same HIPAA information over and over again, even with healthcare systems that use the same electronic health records software. So, it's more a function of—it's not just that the technologies are siloed, it's that the organizations are siloed. That's what you're saying.Corey: That is fair. And I think at some level—I don't know if this is a weird extension of Conway's Law or whatnot—but these things all have different backing stores as far as data goes. And there's a—the hard part, it seems, in a lot of companies once they hit a certain point of maturity is not just getting the data in—because they've already done that to some extent—but it's also then making it actionable and helping various data stores internal to the company reconcile with one another and start surfacing things that are useful. It increasingly feels like it's less of a technology problem and more of a people problem.Tony: It is. I mean, put it this way, I spent a lot of time last year, I burned a lot of brain cells working on data fabrics, which is an idea that's in the eye of the beholder. But the ideal of a data fabric is that it's not the tool that necessarily governs your data or secures your data or moves your data or transforms your data, but it's supposed to be the master orchestrator that brings all that stuff together. And maybe sometime 50 years in the future, we might see that. I think the problem here is both technical and organizational. [unintelligible 00:11:58] a promise, you have all these what we used to call island silos. We still call them silos or islands of information. And actually, ironically, even though in the cloud we have technologies where we can integrate this, the cloud has actually exacerbated this issue because there's so many islands of information, you know, coming up, and there's so many different little parts of the organization that have their hands on that. That's also a large part of why there's such a big discussion about, for instance, data mesh last year: everybody is concerned about owning their own little piece of the pie, and there's a lot of question in terms of how do we get some consistency there? How do we all read from the same sheet of music? That's going to be an ongoing problem. You and I are going to get very old before that ever gets solved.Corey: Yeah, there are certain things that I am content to die knowing that they will not get solved. 
If they ever get solved, I will not live to see it, and there's a certain comfort in that, on some level.Tony: Yeah.Corey: But it feels like this stuff is also getting more and more complicated than it used to be, and terms aren't being used in quite the same way as they once were. Something that a number of companies have been saying for a while now has been that customers overwhelmingly are preferring open-source. Open source is important to them when it comes to their database selection. And I feel like that's a conflation of a couple of things. I've never yet found an ideological, purity-driven customer decision around that sort of thing.What they care about is, are there multiple vendors who can provide this thing so I'm not going to be using a commercially licensed database that can arbitrarily start playing games with seat licenses and wind up distorting my cost structure massively with very little notice. Does that align with your—Tony: Yeah.Corey: Understanding of what people are talking about when they say that, or am I missing something fundamental? Which is again, always possible?Tony: No, I think you're onto something there. Open-source is a whole other can of worms, and I've burned many, many brain cells over this one as well. And today, you're seeing a lot of pieces about the, you know, the—that are basically giving eulogies for open-source. It's—you know, like HashiCorp just finally changed its license and a bunch of others have in the database world. What open-source has meant is been—and I think for practitioners, for DBAs and developers—here's a platform that's been implemented by many different vendors, which means my skills are portable.And so, I think that's really been the key to why, for instance, like, you know, MySQL and especially PostgreSQL have really exploded, you know, in popularity. Especially Postgres, you know, of late. And it's like, you look at Postgres, it's a very unglamorous database. If you're talking about stodgy, it was born to be stodgy because they wanted to be an adult database from the start. They weren't the LAMP stack like MySQL.And the secret of success with Postgres was that it had a very permissive open-source license, which meant that as long as you don't hold University of California at Berkeley, liable, have at it, kids. And so, you see, like, a lot of different flavors of Postgres out there, which means that a lot of customers are attracted to that because if I get up to speed on this Postgres—on one Postgres database, my skills should be transferable, should be portable to another. So, I think that's a lot of what's happening there.Corey: Well, I do want to call that out in particular because when I was coming up in the naughts, the mid-2000s decade, the lingua franca on everything I used was MySQL, or as I insist on mispronouncing it, my-squeal. And lately, on same vein, Postgres-squeal seems to have taken over the entire universe, when it comes to the de facto database of choice. And I'm old and grumpy and learning new things as always challenging, so I don't understand a lot of the ways that thing gets managed from the context coming from where I did before, but what has driven the massive growth of mindshare among the Postgres-squeal set?Tony: Well, I think it's a matter of it's 30 years old and it's—number one, Postgres always positioned itself as an Oracle alternative. And the early years, you know, this is a new database, how are you going to be able to match, at that point, Oracle had about a 15-year headstart on it. 
And so, it was a gradual climb to respectability. And I have huge respect for Oracle, don't get me wrong on that, but you take a look at Postgres today and they have basically filled in a lot of the blanks. And so, it now is a very cre—in many cases, it's a credible alternative to Oracle. Can it do all the things Oracle can do? No. But for a lot of organizations, it's the 80/20 rule. And so, I think it's more just a matter of, like, Postgres coming of age. And the fact is, as a result of it coming of age, there's a huge marketplace out there and so much choice, and so much opportunity for skills portability. So, it's really one of those things where its time has come.Corey: I think that a lot of my own biases are simply a product of the era in which I learned how a lot of these things work. I am terrible at Node, for example, but I would be hard-pressed not to suggest JavaScript as the default language that people should pick up if they're just entering tech today. It does front-end, it does back-end—Tony: Sure.Corey: —it even makes fries, apparently. There's a—that is the lingua franca of the modern internet in a bunch of different ways. That doesn't mean I'm any good at it, and it doesn't mean at this stage, I'm likely to improve massively at it, but it is the right move, even if it is inconvenient for me personally.Tony: Right. Right. Put it this way, we've seen—and as I said, I'm not an expert in programming languages, but we've seen a huge profusion of programming languages and frameworks. But the fact is that there's always been a draw towards critical mass. At the turn of the millennium, we thought it was between Java and .NET. Little did we know that basically JavaScript—which at that point was just a web scripting language—[laugh] we didn't know that it could work on the server; we thought it was just a client. Who knew?Corey: That's like using something inappropriately as a database. I mean, good heavens.Tony: [laugh]. That would be true. I mean, when I could have, you know, easily just used a spreadsheet or something like that. But so, I mean, who knew? I mean, just like for instance, Java itself was originally conceived for a set-top box. You never know how this stuff is going to turn out. It's the same thing that happened with Python. Python was also a web scripting language. Oh, by the way, it happens to be really powerful and flexible for data science. And whoa, you know, now Python is—in terms of data science languages—has become the new SAS.Corey: It really took over in a bunch of different ways. Before that, Perl was great, and I go, “Why would I use—why write in Python when Perl is available?” It's like, “Okay, you know how to write Perl, right?” “Yeah.” “Have you ever read anything a month later?” “Oh…” it's very much a write-only language. It is inscrutable after the fact. And Python at least makes that a lot more approachable, which is never a bad thing.Tony: Yeah.Corey: Speaking of what you touched on toward the beginning of this episode, the idea of databases not being sentient, which I equate to being self-aware, you just came out very recently with a report on generative AI and a trip that you wound up taking on this. Which I've read; I love it. In fact, we've both been independently using the phrase [unintelligible 00:19:09] to, “English is the new most common programming language once a lot of this stuff takes off.” But what have you seen? 
What have you witnessed as far as both the ground truth reality as well as the grandiose statements that companies are making as they trip over themselves trying to position themselves as the forefront leader in all of this thing that didn't really exist five months ago?Tony: Well, what's funny is—and that's a perfect question because if on January 1st you asked “what's going to happen this year?” I don't think any of us would have thought about generative AI or large language models. And I will not identify the vendors, but I did some—I was on some advanced briefing calls back around the January, February timeframe. They were talking about things like serverless, they were talking about in-database machine learning and so on and so forth. They weren't saying anything about generative. And all of a sudden, April, it changed. And it's essentially just another case of the tail wagging the dog. Consumers were flocking to ChatGPT and enterprises had to take notice. And so, what I saw, in the spring was—and I was at a conference from SAS, I'm [unintelligible 00:20:21] SAP, Oracle, IBM, Mongo, Snowflake, Databricks and others—that they all very quickly changed their tune to talk about generative AI. What we were seeing was, for the most part, position statements, but we also saw, I think, the early emphasis was, as you say, it's basically English as the new default programming language or API, so basically, coding assistance, what I'll call conversational query. I don't want to call it natural language query because we had stuff like Tableau Ask Data, which was very robotic. So, we're seeing a lot of that. And we're also seeing a lot of attention towards foundation models because I mean, what organization is going to have the resources of a Google or an OpenAI to develop their own foundation model? Yes, some of the Wall Street houses might, but I think most of them are just going to say, “Look, let's just use this as a starting point.” I also saw a very big theme of your models with your data. And where I got a hint of that—it was a throwaway LinkedIn post. It was back in, I think like, February, Databricks had announced Dolly, which was kind of an experimental foundation model, just to use with your own data. And I just wrote three lines in a LinkedIn post, it was on Friday afternoon. By Monday, it had 65,000 hits. I've never seen anything—I mean, yes, I had a lot—I used to say ‘data mesh' last year, and it would—but didn't get anywhere near that. So, I mean, that really hit a nerve. And another thing that I saw was, you know, starting to look at vector storage and how that was going to be supported—was it going to be a new type of database, and hey, let's have AWS come up with, like, an, you know, an [ADF 00:21:41] database here, or is this going to be a feature? I think for the most part, it's going to be a feature. And of course, under all this, everybody's just falling in love, falling all over themselves to get in the good graces of Nvidia. In a capsule, that's kind of what I saw.Corey: That feels directionally accurate. And I think databases are a great area to point out one thing that's always been more than a little disconcerting for me. 
The way that I've always viewed databases has been, unless I'm calling a RAND function or something like it and I don't change the underlying data structure, I should be able to run a query twice in a row and receive the same result deterministically both times.Tony: Mm-hm.Corey: Generative AI is effectively non-deterministic for all realistic measures of that term. Yes, I'm sure there's a deterministic reason things are under the hood. I am not smart enough or learned enough to get there. But it just feels like sometimes we're going to give you the answer you think you're going to get, sometimes we're going to give you a different answer. And sometimes, in generative AI space, we're going to be supremely confident and also completely wrong. That feels dangerous to me.Tony: [laugh]. Oh gosh, yes. I mean, I take a look at ChatGPT and to me, the responses are essentially, it's a high school senior coming out with an essay response without any footnotes. It's the exact opposite of an ACID database. The reason why we're very—in the database world, we're very strongly drawn towards ACID is because we want our data to be consistent and to get—if we ask the same query, we're going to get the same answer.And the problem is, is that with generative, you know, based on large language models, computers sounds sentient, but they're not. Large language models are basically just a series of probabilities, and so hopefully those probabilities will line up and you'll get something similar. That to me, kind of scares me quite a bit. And I think as we start to look at implementing this in an enterprise setting, we need to take a look at what kind of guardrails can we put on there. And the thing is, that what this led me to was that missing piece that I saw this spring with generative AI, at least in the data and analytics world, is nobody had a clue in terms of how to extend AI governance to this, how to make these models explainable. And I think that's still—that's a large problem. That's a huge nut that it's going to take the industry a while to crack.Corey: Yeah, but it's incredibly important that it does get cracked.Tony: Oh, gosh, yes.Corey: One last topic that I want to get into. I know you said you don't want to over-index on AWS, which, fair enough. It is where I spend the bulk of my professional time and energy—Tony: [laugh].Corey: Focusing on, but I think this one's fair because it is a microcosm of a broader industry question. And that is, I don't know what the DBA job of the future is going to look like, but increasingly, it feels like it's going to primarily be picking which purpose-built AWS database—or larger [story 00:24:56] purpose database is appropriate for a given workload. Even without my inappropriate misuse of things that are not databases as databases, they are legitimately 15 or 16 different AWS services that they position as database offerings. And it really feels like you're spiraling down a well of analysis paralysis, trying to pick between all these things. Do you think the future looks more like general-purpose databases, or very purpose-built and each one is this beautiful, bespoke unicorn?Tony: [laugh]. Well, this is basically a hit on a theme that I've been—you know, we've been all been thinking about for years. And the thing is, there are arguments to be made for multi-model databases, you know, versus a for-purpose database. That being said, okay, two things. 
One is that what I've been saying, in general, is that—and I wrote about this way, way back; I actually did a talk at the [unintelligible 00:25:50]; it was a throwaway talk, or [unintelligible 00:25:52] one of those conferences—I threw it together and it's basically looking at the emergence of all these specialized databases.But how I saw, also, there's going to be kind of an overlapping. Not that we're going to come back to Pangea per se, but that, for instance, like, a relational database will be able to support JSON. And Oracle, for instance, does has some fairly brilliant ideas up the sleeve, what they call a JSON duality, which sounds kind of scary, which basically says, “We can store data relationally, but superimpose GraphQL on top of all of this and this is going to look really JSON-y.” So, I think on one hand, you are going to be seeing databases that do overlap. Would I use Oracle for a MongoDB use case? No, but would I use Oracle for a case where I might have some document data? I could certainly see that.The other point, though, and this is really one I want to hammer on here—it's kind of a major concern I've had—is I think the cloud vendors, for all their talk that we give you operational simplicity and agility are making things very complex with its expanding cornucopia of services. And what they need to do—I'm not saying, you know, let's close down the patent office—what I think we do is we need to provide some guided experiences that says, “Tell us the use case. We will now blend these particular services together and this is the package that we would suggest.” I think cloud vendors really need to go back to the drawing board from that standpoint and look at, how do we bring this all together? How would he really simplify the life of the customer?Corey: That is, honestly, I think the biggest challenge that the cloud providers have across the board. There are hundreds of services available at this point from every hyperscaler out there. And some of them are brand new and effectively feel like they're there for three or four different customers and that's about it and others are universal services that most people are probably going to use. And most things fall in between those two extremes, but it becomes such an analysis paralysis moment of trying to figure out what do I do here? What is the golden path?And what that means is that when you start talking to other people and asking their opinion and getting their guidance on how to do something when you get stuck, it's, “Oh, you're using that service? Don't do it. Use this other thing instead.” And if you listen to that, you get midway through every problem for them to start over again because, “Oh, I'm going to pick a different selection of underlying components.” It becomes confusing and complicated, and I think it does customers largely a disservice. What I think we really need, on some level, is a simplified golden path with easy on-ramps and easy off-ramps where, in the absence of a compelling reason, this is what you should be using.Tony: Believe it or not, I think this would be a golden case for machine learning.Corey: [laugh].Tony: No, but submit to us the characteristics of your workload, and here's a recipe that we would propose. Obviously, we can't trust AI to make our decisions for us, but it can provide some guardrails.Corey: “Yeah. Use a graph database. Trust me, it'll be fine.” That's your general purpose—Tony: [laugh].Corey: —approach. Yeah, that'll end well.Tony: [laugh]. 
I would hope that the AI would basically be trained on a better set of training data to not come out with that conclusion.Corey: One could sure hope.Tony: Yeah, exactly.Corey: I really want to thank you for taking the time to catch up with me around what you're doing. If people want to learn more, where's the best place for them to find you?Tony: My website is dbinsight.io. And on my homepage, I list my latest research. So, you just have to go to the homepage where you can basically click on the links to the latest and greatest. And I will, as I said, after Labor Day, I'll be publishing my take on my generative AI journey from the spring.Corey: And we will, of course, put links to this in the [show notes 00:29:39]. Thank you so much for your time. I appreciate it.Tony: Hey, it's been a pleasure, Corey. Good seeing you again.Corey: Tony Baer, principal at dbInsight. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that we will eventually stitch together with all those different platforms to create—that's right—a large-scale distributed database.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
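The ACID-versus-probabilities contrast Tony draws in this conversation can be made concrete with a toy sketch. This is plain Python, not modeled on any real database engine or language model; it only shows why a keyed lookup over unchanged data answers the same way every time, while sampling from a probability distribution—roughly what token-by-token generation does—generally does not.

```python
import random

# A "database": the same query against unchanged data returns the same answer.
table = {"best_database": "the one that does the job you need it to do"}

def query(key: str) -> str:
    return table[key]  # deterministic: identical result on every call

# A toy "language model": the next output is sampled from a probability distribution.
vocabulary = ["PostgreSQL", "MySQL", "Route 53", "an Excel spreadsheet"]
weights    = [0.5, 0.3, 0.15, 0.05]

def generate(prompt: str, temperature: float = 1.0) -> str:
    # Higher temperature flattens the distribution; as temperature approaches zero,
    # the most probable answer dominates and the output becomes near-deterministic.
    scaled = [w ** (1.0 / max(temperature, 1e-6)) for w in weights]
    return random.choices(vocabulary, weights=scaled, k=1)[0]

if __name__ == "__main__":
    print([query("best_database") for _ in range(3)])                   # three identical answers
    print([generate("What is the best database?") for _ in range(3)])   # answers will likely differ
```

None of this makes the sampled answer wrong, but it does explain why the same prompt can come back supremely confident and completely different on the next run, which is exactly the governance and explainability gap discussed above.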

Screaming in the Cloud
Building a Community around Cloud-Native Content with Bret Fisher

Screaming in the Cloud

Play Episode Listen Later Sep 7, 2023 40:06


Bret Fisher, DevOps Dude & Cloud-Native Trainer, joins Corey on Screaming in the Cloud to discuss what it's like being a practitioner and a content creator in the world of cloud. Bret shares why he feels it's so critical to get his hands dirty so his content remains relevant, and also how he has to choose where to focus his efforts to grow his community. Corey and Bret discuss the importance of finding the joy in your work, and also the advantages and downfalls of the latest AI advancements. About BretFor 25 years Bret has built and operated distributed systems, and helped over 350,000 people learn dev and ops topics. He's a freelance DevOps and Cloud Native consultant, trainer, speaker, and open source volunteer working from Virginia Beach, USA. Bret's also a Docker Captain and the author of the popular Docker Mastery and Kubernetes Mastery series on Udemy. He hosts a weekly DevOps YouTube Live Show, a container podcast, and runs the popular devops.fan Discord chat server.Links Referenced: Twitter: https://twitter.com/BretFisher YouTube Channel: https://www.youtube.com/@BretFisher Website: https://www.bretfisher.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: In the cloud, ideas turn into innovation at virtually limitless speed and scale. To secure innovation in the cloud, you need Runtime Insights to prioritize critical risks and stay ahead of unknown threats. What's Runtime Insights, you ask? Visit sysdig.com/screaming to learn more. That's S-Y-S-D-I-G.com/screaming.My thanks as well to Sysdig for sponsoring this ridiculous podcast.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, a little bit off the beaten path today, in that I'm talking to someone who, I suppose like me, if that's not considered to be an insult, has found themselves eminently unemployable in a quote-unquote, “Real job.” My guest today is Bret Fisher, DevOps dude and cloud-native trainer. Bret, great to talk to you. What do you do?Bret: [laugh]. I'm glad to be here, Corey. I help people for a living like a lot of us end up doing in tech. But nowadays, it's courses, it's live trainings, webinars, all that stuff. And then of course, the fun side of it is the YouTube podcast, hanging out with friends, chatting on the internet. And then a little bit of running a Discord community, which is one of the best places to have a little text chat community, if you don't know Discord.Corey: I've been trying to get the Discord and it isn't quite resonating with me, just because by default, it alerts on everything that happens in any server you're in. It, at least historically, was very challenging to get that tuned in, so I just stopped having anything alert me on my phone, which means now I miss things constantly. And that's been fun and challenging. I still have the slack.lastweekinaws.com community with a couple of thousand people in it.Bret: Nice. Yeah, I mean, some people love Slack. I still have a Slack community for my courses. Discord, I feel like is way more community friendly. 
By the way, a good server admin knows how to change those settings, which there are a thousand settings in Discord, so server admins, I don't blame you for not seeing that setting.But there is one where you can say new members, don't bug them on every message; only bug them on a mentions or, you know, channel mentions and stuff like that. And then of course, you turn off all those channel mentions and abilities for people to abuse it. But yeah, I had the same problem at first. I did not know what I was doing and it took me years to kind of figure out. The community, we now have 15,000 people. We call it Cloud Native DevOps, but it's basically people from all walks of DevOps, you know, recovering IT pros.And the wonderful thing about it is you always start out—like, you'd do the same thing, I'm sure—where you start a podcast or YouTube channel or a chat community or Telegram, or a subreddit, or whatever your thing is, and you try to build a community and you don't know if it's going to work and you invite your friends and then they show up for a day and then go away. And I've been very lucky and surprised that the Discord server has, to this point, taken on sort of a, its own nature. We've got, I don't know, close to a dozen moderators now and people are just volunteering their time to help others. It's wonderful. I actually—I consider it, like, one of the safe places, unlike maybe Stack Overflow where you might get hated for the wrong question. And we try to guide you to a better question so [laugh] that we can answer you or help you. So, every day I go in there, and there's a dozen conversations I missed that I wasn't able to keep up with. So, it's kind of fun if you're into that thing.Corey: I remember the olden days when I was one of the volunteer staff members on the freenode IRC network before its untimely and awful demise, and I really have come to appreciate the idea of, past a certain point, you can either own the forum that you're working within or you can participate in it, but being a moderator, on some level, sets apart how people treat you in some strange ways. And none of these things are easy once you get into the nuances of codes of conduct, of people participating in good faith, but also are not doing so constructively. And people are hard. And one of these years I should really focus on addressing aspects of that with what I'm focusing on.Bret: [laugh]. Yeah, the machines—I mean, as frustrating as the machines are, they at least are a little more reliable. I don't have anonymous machines showing up yet in Discord, although we do get almost daily spammers and stuff like that. So, you know, I guess I'm blessed to have attracted some of the spam and stuff like that. But a friend of mine who runs a solid community for podcasters—you know, for podcasts hosters—he warned me, he's like, you know, if you really want to make it the community that you have the vision for, it requires daily work.Like, it's a part-time job, and you have to put the time in, or it will just not be that and be okay with that. Like, be okay with it being a small, you know, small group of people that stick around and it doesn't really grow. And that's what's happened on the Slack side of things is I didn't care and feed it, so it has gotten pretty quiet over there as we've grown the Discord server. Because I kind of had to choose, you know? Because we—like you, I started with Slack long, long ago. It was the only thing out there. 
Discord was just for gamers. And in the last four or five years, I think Discord—I think during the pandemic, they officially said, “We are now more than gamers,” which I was kind of waiting for to really want to invest my company's—I mean, my company of three—you know, my company [laugh] time into a platform that I thought was maybe just for gamers; couldn't quite figure it out. And once they kind of officially said, “Yeah, we're for all communities,” we were more in, you know. And the thing I really appreciate—like we had in IRC, but it was mostly human-driven—is that Discord, unlike Slack, has actual community controls that make it a safer place, a more inclusive place. And you can actually contact Discord when you have a spammer or someone doing bad things, or you have a server raid where there's a whole bunch of accounts and bot accounts trying to take you down, you can actually reach out to Discord, where Slack doesn't have any of that, they don't have a way for you to reach out. You can't block people or ban them or any of that stuff with Slack. And we—the luckily—the lucky thing of Dis—I kind of look at Discord as, like, the best new equivalent of IRC, even though for a lot of people IRC is still the thing, right? We have new clients now, you can actually have off—you could have sort of synced IRC, right, where you can have a web client that remembers you so you didn't lose the chat after you left, which was always the problem back in the day.Corey: Oh, yeah. I just parked it on, originally, a hardware box, now EC2. And it ran Irssi as my client—because I'm old school—inside of tmux and called it a life. But yeah, I still use that from time to time, but the conversation has moved on. One challenge I've had is that a lot of the people I talk to about billing nuances skew sometimes, obviously in the engineering direction, but also in the business user perspective, and it always felt, on some level, like it was easier to get business users onto Slack from a community perspective—Bret: Mmm. Absolutely. Yeah.Corey: —than it was for Discord. I mean, this thing started as well. This was years ago, before Discord had a lot of those controls. Might be time to take another bite at that apple.Bret: Yeah. Yeah, I definitely—and that, I think that's why I still keep the Slack open is there are some people, they will only go there, right? Like, they just don't want another thing. That totally makes sense. In fact, that's kind of what's happening to the internet now, right? We see the demise of Twitter or X, we see all these other new clients showing up, and what I've just seen in the dev community is we had this wonderful dev community on Twitter. For a moment. For a few years. It wasn't perfect by far, there were a lot of people that still didn't want to use Twitter, but I felt like there was—if you wanted to be in the cloud-native community, that was very strong and you didn't always have to jump into Slack. And then you know, this billionaire came along and kind of ruined it, so people have fractured over to Mastodon, and we've got some people over on Threads and some people on Bluesky, and now—and then some people like me that have stuck with Twitter. And I feel like I've lost a chunk of my friends because I don't want to spend my life on six different platforms.
So, I am—I have found myself actually kind of sort of regressing to our Discord because it's the people I know, we're all talking about the same things, we all have a common interest, and rather than spending my time trying to find those people on the socials as much as I used to. So, I don't know, we'll see.Corey: Something that I have found, I'm curious to get your take on this, you've been doing this for roughly twice as long as I have, but what I've been having to teach myself is that I am not necessarily representative of the totality of the audience. And, aside from the obvious demographic areas, I learned best by reading or by building something myself—I don't generally listen to podcasts, which is a weird confession in this forum for me to wind up admitting to—and I don't basically watch videos at all. And it took me a while to realize that not everyone is like me; those are wildly popular forms of absorbing information. What I have noticed that the audience engages differently in different areas, whereas for this podcast, for the first six months, I didn't think that I'd remember to turn the microphone on. And that was okay; it was an experiment, and I enjoyed doing it. But then I went to a conference and wound up getting a whole bunch of feedback.Whereas for the newsletter, I had immediate responses to basically every issue when I sent it out. And I think the reason is, is because people are not sitting in front of a computer when they're listening to something and they're not going to be able to say, “Well, let me give you a piece of my mind,” in quite the same way. And by the time they remember later, it feels weird, like, calling into a radio show. But when you actually meet someone, “Yeah, I love your stuff.” And they'll talk about the episodes I've had out. But you can be forgiven for in some cases in the social media side of it for thinking that I'd forgotten to publish this thing.Bret: Yeah. I think that's actually a pretty common trait. There was a time where I was sort of into the science of learning and whatnot, and one of the things that came out of that was that the way we communicate or the way we learn and then the way—the input and the outputs are different per human. It's actually almost, like, comparable maybe to love languages, if you've read that book, where the way we give love and the way we receive love from others is—we prefer it in different ways and it's often not the same thing. And I think the same is true of learning and teaching, where my teaching style has always been visual.I think have almost always been in all my videos. My first course seven years ago, I was in it phy—like, I had my headshot in there and I just thought that that was a part of the best way you could make that content. And doesn't mean that I'm instantly better; it just means I wanted to communicate with my hands, maybe I got a little bit of Italian or French in me or something [laugh] where I'm moving my hands around a lot. So, I think that the medium is very specific to the person. And I meet people all the time that I find out, they didn't learn from me—they didn't learn about me, rather, from my course; they learned about me from a conference talk because they prefer to watch those or someone else learned about me from the podcast I run because they stumbled onto that.And it always surprises me because I always figure that since my biggest audience in my Udemy courses—over 300,000 people there—that that's how most of the people find me. 
And it turns out nowadays that when I meet people, a lot of times it's not. It's some other, you know, other venue. And now we have people showing up in the Discord server from the Discord Discovery. It's kind of a little feature in Discord that allows you to find servers that are on the topics you're interested in and were listed in there and people will find me that way and jump in not knowing that I have created courses, I have a weekly YouTube Live show, I have all the other things.And yeah, it's just it's kind of great, but also as a content creator, it's kind of exhausting because you—if you're interested in all these things, you can't possibly focus on all of them at the [laugh] same time. So, what is it the great Will Smith says? “Do two things and two things suffer.” [laugh]. And that's exactly what my life is like. It's like, I can't focus on one thing, so they all aren't as amazing as they could be, maybe, if I had only dedicated to one thing.Corey: No, I'm with you on that it's a saying yes to something means inherently saying no to something else. But for those of us whose interests are wide and varied, I find that there are always more things to do than I will ever be able to address. You have to pick and choose, on some level. I dabble with a lot of the stuff that I work on. I have given thought in the past towards putting out video courses or whatnot, but you've done that for ages and it just seems like it is so much front-loaded work, in many cases with things I'm not terrific at.And then, at least in my side of the world, oh, then AWS does another console refresh, as they tend to sporadically, and great, now I have to go back and redo all of the video shoots showing how to do it because now it's changed just enough to confuse people. And it feels like a treadmill you climb on top of and never get off.Bret: It can definitely feel like that. And I think it's also harder to edit existing courses like I'm doing now than it is to just make up something brand new and fresh. And there's something about… we love to teach, I think what we're learning in the moment. I think a lot of us, you get something exciting and you want to talk about it. And so, I think that's how a lot of people's conference talk ideas come up if you think about it.Like you're not usually talking about the thing that you were interested in a decade ago. You're talking about the thing you just learned, and you thought it was great, and you want everyone to know about it, which means you're going to make a YouTube video or a blog post or something about it, you'll share somewhere on social media about it. I think it's harder to make this—any of these content creation things, especially courses, a career if you come back to that course like I'm doing seven years after publication and you're continuing every year to update those videos. And you're thinking I—not that my interests have moved on, but my passion is in the new things. And I'm not making videos right now on new things.I'm fixing—like you're saying, like, I'm fixing the Docker Hub video because it has completely changed in seven years and it doesn't even look the same and all that. So, there's definitely—that's the work side of this business where you really have to put the time in and it may not always be fun. 
So, one of the things I'm learning from my business coach is like how to find ways to make some of this stuff fun again, and how to inject some joy into it without it feeling like it's just the churn of video after video after video, which, you know, you can fall into that trap with any of that stuff. So, yeah. That's what I'm doing this year is learning a little bit more about myself and what I like doing versus what I have to do and try to make some of it a little funner.Corey: This question might come across as passive-aggressive or back-handedly insulting and I swear to you it is not intended to, but how do you avoid what has been a persistent fear of mine and that is becoming a talking head? Whereas you've been doing this as a trainer for long enough that you haven't had a quote-unquote, “Real job,” in roughly, what, 15 years at this point?Bret: Yeah. Yeah.Corey: And so, you've never run Kubernetes in anger, which is, of course, was what we call production environment. That's right, I call it ‘Anger.' My staging environment is called ‘Theory' because it works in theory, but not in production. And there you have it. So, without being hands-on and running these things at scale, it feels like on some level, if I were to, for example, give up the consulting side of my business and just talk about the pure math that I see and what AWS is putting out there, I feel like I'd pretty quickly lose sight of what actual customer pain looks like.Bret: Yeah. That's a real fear, for sure. And that's why I'm kind of—I think I kind of do what you do and maybe wasn't… didn't try to mislead you, but I do consult on a fairly consistent basis and I took a break this year. I've only—you know, then what I'll do is I'll do some advisory work, I usually won't put hands on a cluster, I'm usually advising people on how to put the hands on that cluster kind of thing, or how to build accepting their PRs, doing stuff like that. That's what I've done in the last maybe three or four years.Because you're right. There's two things that are, right? Like, it's hard to stay relevant if you don't actually get your hands dirty, your content ends up I think this naturally becoming very… I don't know, one dimensional, maybe, or two dimensional, where it doesn't, you don't really talk about the best practices because you don't actually have the scars to prove it. And so, I'm always nervous about going long lengths, like, three or four years of time, with zero production work. So, I think I try to fill that with a little bit of advisory, maybe trying to find friends and actually trying to talk with them about their experiences, just so I can make sure I'm understanding what they're dealing with.I also think that that kind of work is what creates my stories. So like, my latest course, it's on GitHub Actions and Argo CD for using automation and GitOps for deployments, basically trying to simplify the deployment lifecycle so that you can just get back to worrying about your app and not about how it's deployed and how it's tested and all that. And that all came out of consulting I did for a couple of firms in 2019 and 2020, and I think right into 2021, that's kind of where I started winding them down. And that created the stories that caused me, you know, sort of the scars of going into production. We were migrating a COTS app into a SaaS app, so we were learning lots of things about their design and having to change infrastructure. And I had so many learnings from that.And one of them was I really liked GitHub Actions. 
And it worked well for them. And it was very flexible. And it wasn't as friendly and as GUI beautiful as some of the other CI solutions out there, but it was flexible enough and direct—close enough to the developer that it felt powerful in the developers' hands, whereas previous systems that we've all had, like Jenkins always felt like this black box that maybe one or two people knew.And those stories came out of the real advisory or consultancy that I did for those few years. And then I was like, “Okay, I've got stuff. I've learned it. I've done it in the field. I've got the scars. Let me go teach people about it.” And I'm probably going to have to do that again in a few years when I feel like I'm losing touch like you're saying there. That's a—yeah, so I agree. Same problem [laugh].Corey: Crap, I was hoping you had some magic silver bullet—Bret: No. [laugh].Corey: —other than, “No, it still gnaws at you forever and there's no real way to get away for”—great. But, uhh, it keeps things… interesting.Bret: I would love to say that I have that skill, that ability to, like, just talk with you about your customers and, like, transfer all that knowledge so that I can then talk about it, but I don't know. I don't know. It's tough.Corey: Yeah. The dangerous part there is suddenly you stop having lived experience and start just trusting whoever sounds the most confident, which of course, brings us to generative AI.Bret: Ohhh.Corey: Which apparently needs to be brought into every conversation as per, you know, analysts and Amazon leadership, apparently. What's your take on it?Bret: Yeah. Yeah. Well, I was earl—I mean, well maybe not early, early. Like, these people that are talking about being early were seven years ago, so definitely wasn't that early.Corey: Yeah. Back when the Hello World was a PhD from Stanford.Bret: Yeah [laugh], yeah. So, I was maybe—my first step in was on the tech side of things with Copilot when it was in beta a little over two years ago. We're talking about GitHub Copilot. That was I think my first one. I was not an OpenAI user for any of their solutions, and was not into the visual—you know, the image AI stuff as we all are now dabbling with.But when it comes to code and YAML and TOML and, you know, the stuff that I deal with every day, I didn't start into it until about two years ago. I think I actually live-streamed my first experiences with it with a friend of mine. And I was just using it for DevOps tasks at the time. It was an early beta, so I was like, kind of invited. And it was filling out YAML for me. It was creating Kubernetes YAML for me.And like we're all learning, you know, it hallucinates, as we say, which is lying. It made stuff up for 50% of the time. And it was—it is way better now. So, I think I actually wrote in my newsletter a couple weeks ago a recent story—or a recent experience because I wanted to take a project in a language that I had not previously written from scratch in but maybe I was just slightly familiar with. 
So, I picked Go because everything in cloud-native is written in Go and so I've been reading it for years and years and years and maybe making small PRs to various things, but never taken on myself to write it from scratch and just create something, start to finish, for myself.And so, I wanted a real project, not something that was contrived, and it came up that I wanted to create—in my specific scenario, I wanted to take a CSV of all of my students and then take a template certificate, you know, like these certificates of completion or certifications, you know, that you get, and it's a nice little—looks like the digital equivalent of a paper certificate that you would get from maybe a university. And I wanted to create that. So, I wanted to do it in bulk. I wanted to give it a stock image and then give it a list of names and then it would figure out the right place to put all those names and then generate a whole bunch of images that I could send out. And then I can maybe turn this into a web service someday.But I wanted to do this, and I knew, if I just wrote it myself, I'd be horrible at it, I would suck at Go, I'd probably have to watch some videos to remember some of the syntax. I don't know the standard libraries, so I'd have to figure out which libraries I needed and all that stuff. All the dependencies.Corey: You make the same typical newcomer mistakes of not understanding the local idioms and whatnot. Oh, yeah.Bret: Yeah. And so, I'd have to spend some time on Stack Overflow Googling around. I kind of guessed it was going to take me 20 to 40 hours to make. Like, and it was—we're talking really just hundreds of lines of code at the end of the day, but because Go standard library actually is really great, so it was going to be far less code than if I had to do it in NodeJS or something. Anyway, long story short there, it ended up taking three to three-and-a-half hours end to end, including everything I needed, you know, importing a CSV, sucking in a PNG, outputting PNG with all the names on them in the right places in the right font, the right colors, all that stuff.And I did it all through GitHub Copilot Chat, which is their newest Labs beta thing. And it brings the ChatGPT-4 experience into VS Code. I think it's right now only for VS Code, but other editors coming soon. And it was kind of wonderful. It remembered my project as a whole. It wasn't just in the file I was in. There was no copying-pasting back and forth between the web interface of ChatGPT like a lot of people tend to do today where they go into ChatGPT, they ask a question, then they copy out code and they paste it in their editor.Well, there was none of that because since that's built into the editor, it kind of flows naturally into your existing project. You can kind of just click a button and it'll automatically paste in where your cursor is. It does all this convenient stuff. And then it would relook at the code. I would ask it, you know, “What are ten ways to improve this code now that it works?” And you know, “How can I reduce the number of lines in this code?” Or, “How can I make it easier to read?”And I was doing all this stuff while I was creating the project. I haven't had anyone, like, look at it to tell me if it looks good [laugh], which I hear you had that experience. But it works, it solved my problem, and I did it in a half a day with no prep time. And it's all in ChatGPT's history. 
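A minimal sketch of the kind of program Bret describes here, reading names from a CSV and stamping each one onto a template PNG, might look like the following in Go. To be clear, this is not his actual code: the file names, coordinates, and font are placeholder assumptions, and it leans on the golang.org/x/image packages for text drawing.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"image"
	"image/color"
	"image/draw"
	"image/png"
	"log"
	"os"

	"golang.org/x/image/font"
	"golang.org/x/image/font/basicfont"
	"golang.org/x/image/math/fixed"
)

func main() {
	// Load the template certificate (assumed to be template.png in the working directory).
	tf, err := os.Open("template.png")
	if err != nil {
		log.Fatal(err)
	}
	tmpl, err := png.Decode(tf)
	tf.Close()
	if err != nil {
		log.Fatal(err)
	}

	// Read student names: one name per row, first column, no header assumed.
	cf, err := os.Open("students.csv")
	if err != nil {
		log.Fatal(err)
	}
	rows, err := csv.NewReader(cf).ReadAll()
	cf.Close()
	if err != nil {
		log.Fatal(err)
	}

	for i, row := range rows {
		name := row[0]

		// Copy the template into a writable RGBA image.
		out := image.NewRGBA(tmpl.Bounds())
		draw.Draw(out, out.Bounds(), tmpl, image.Point{}, draw.Src)

		// Draw the name at a fixed position with a basic bitmap font.
		d := &font.Drawer{
			Dst:  out,
			Src:  image.NewUniform(color.Black),
			Face: basicfont.Face7x13,
			Dot:  fixed.P(200, 300), // placeholder coordinates
		}
		d.DrawString(name)

		// Write certificate-0.png, certificate-1.png, ...
		of, err := os.Create(fmt.Sprintf("certificate-%d.png", i))
		if err != nil {
			log.Fatal(err)
		}
		if err := png.Encode(of, out); err != nil {
			log.Fatal(err)
		}
		of.Close()
	}
}
```

A real version would measure each string to center it and load a proper TrueType face, but the whole job is essentially CSV in, images out, which is part of why it made such a tractable target for an AI pair.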
So, when I open up VS Code now, I open that project up and, get it, it recognizes that, oh, this is the project that you've asked all these previous questions on, and it reloads all those questions, allowing me to basically start the conversation off again with my AI friend at the same place I left off. And I think that experience basically proved to me what everybody else is telling us, right, that yes, this is definitely the future. I don't see myself ever writing code again without an AI partner. I don't know why I ever would write it without the AI partner, at least to help me, quicken my learning, and solve some of the prompts. I mean, it was spitting out code that wasn't perfect. It would actually—[unintelligible 00:23:53] sometimes fail. And then I would tell it, “Here's the error you just caused. What do I do with that?” And it would help me walk through the solution, it would fix it, it would recommend changes. So, it's definitely not something that will let you avoid knowing how to program or make someone who's not a programmer suddenly write a perfect program, but man, it really—I mean, it took basically what I would consider to be a novice in that language—not a novice at programming, but a novice at that language—and spit out a productive program in less than a day. So, that's huge, I think.
[midroll 00:24:27]
Corey: What I think is a necessary prerequisite is domain expertise in order to figure out what is accurate versus what is completely wrong but sounds competent. And I've been racing a bunch of the different large-language models against each other in a variety of things like this. One of the challenges I'll give them is to query the AWS pricing API—whose motto is, “Not every war crime happens in faraway places”—and then spit out things like the Managed NAT Gateway hourly cost table, sorted from most to least expensive by region. And some things are great at it and other things really struggle with it. And the first time I, just on a lark, went down that path, it saved me an easy three hours from writing that thing by hand. It was effectively an API interface, whereas now the most common programming language I think we're going to see on the rise is English.
I think the EU is definitely going to do the right thing here and we're going to have to follow suit eventually, where we rank where you can use AI and, like, there's these levels, and maybe just helping you with a program is a low-level, there's very few restrictions, in other words, by the government, but if you're talking about in cars or in medical or you know, in anything like that, that's the highest level and the highest restrictions and all that. I could definitely see that's the safety. Obviously, we'll probably do it too slow and too late and there'll be some bad uses in the meantime, but I think we're there. I mean, like, if you're not using it today—if you're listening to this, and you're not using AI yet in your day-to-day as someone related to the IT career, it's going to be everywhere and I don't think it's going to be, like, one tool. The tools on the CLI to me are kind of weird right now. Like, they certainly can help you write command lines, but it just doesn't flow right for me. I don't know if you've tried that.Corey: Yeah. I ha—I've dabbled lightly, but again, I've been a Unix admin for the better part of 20 years and I'm used to a world in which you type exactly what you mean or you suffer the consequences. So, having a robot trying to outguess me of what it thinks I'm trying to do, if it works correctly, it looks like a really smart tab complete. If it guesses wrong, it's incredibly frustrating. The risk/reward is not there in the same way.Bret: Right.Corey: So, for me at least, it's more frustration than anything. I've seen significant use cases across the business world where this would have been invaluable back when I was younger, where it's, “Great, here's a one-line email I'm about to send to someone, and people are going to call me brusque or difficult for it. Great. Turn this into a business email.” And then on the other side, like, “This is a five-paragraph email. What does he actually want?” It'll turn it back into one line. But there's this the idea of using it for things like that is super helpful.Bret: Yeah. Robots talking to robots? Is that what you're saying? Yeah.Corey: Well, partially, yes. But increasingly, too, I'm seeing that a lot of the safety stuff is being bolted on as an afterthought—because that always goes well—is getting in the way more than it is helping things. Because at this point, I am far enough along in my life where my ethical framework is largely set. I am not going to have radical changes in my worldview, no matter how much a robot [unintelligible 00:28:29] me.So, snark and sarcasm are my first languages and that is something that increasingly they're leery about, like, oh, sarcasm can hurt people's feelings. “Well, no kidding, professor, you don't say.” As John Scalzi says, “The failure mode of clever is ‘asshole.'” But I figured out how to walk that line, so don't you worry your pretty little robot head about that. Leave that to me. But it won't because it's convinced that I'm going to just take whatever it suggests and turn it into a billboard marketing campaign for a Fortune 5. There are several more approval steps in there.Bret: There. Yeah, yeah. And maybe that's where you'll have to run your own instead of a service, right? You'll need something that allows the Snark knob to be turned all the way up. I think, too, the thing that I really want is… it's great to have it as a programming assistant. 
It's great and notion to help me, you know, think out, you know, sort of whiteboard some things, right, or sketch stuff out in terms of, “Give me the top ten things to do with this,” and it's great for ideas and stuff like that.But what I really, really want is for it to remove a lot of the drudgery of day-to-day toil that we still haven't, in tech, figured out a way—for example, I'm going to need a new repo. I know what I need to go in it, I know which organization it needs to go in, I know what types of files need to go in there, and I know the general purpose of the repo. Even the skilled person is going to take at least 20 minutes or more to set all that up. And I would really just rather take an AI on my local computer and say, “I would like three new repos: a front-end back-end, and a Kubernetes YAML repo. And I would like this one to be Rust, and I would like this one to be NodeJS or whatever, and I would like this other repo to have all the pieces in Kubernetes. And I would like Docker files in each repo plus GitHub Actions for linting.”Like, I could just spill out, you know, all these things: the editor.config file, the Git ignore, the Docker ignore, think about, like, the dozen files that every repo has to have now. And I just want that generated by an AI that knows my own repos, knows my preferences, and it's more—because we all have, a lot of us that are really, really organized and I'm not one of those, we have maybe a template repo or we have templates that are created by a consolidated group of DevOps guild members or something in our organization that creates standards and reusable workflows and template files and template repos. And I think a lot of that's going to go—that boilerplate will sort of, if we get a smart enough LLM that's very user and organization-specific, I would love to be able to just tell Siri or whatever on my computer, “This is the thing I want to be created and it's boilerplate stuff.” And it then generates all that.And then I jump into my code creator or my notion drafting of words. And that's—like, I hop off from there. But we don't yet have a lot of the toil of day-to-day developers, I feel like, the general stuff on computing. We don't really have—maybe I don't think that's a general AI. I don't think we're… I don't think that needs to be like a general intelligence. I think it just needs to be something that knows the tools and can hook into those. Maybe it asks for my fingerprint on occasion, just for security sake [laugh] so it doesn't deploy all the things to AWS. But.Corey: Yeah. Like, I've been trying to be subversive with a lot of these things. Like, it's always fun to ask the challenging questions, like, “My boss has been complaining to me about my performance and I'm salty about it. Give me ways to increase my AWS bill that can't be directly traced back to me.” And it's like, oh, that's not how to resolve workplace differences.Like, okay. Good on, you found that at least, but cool, give me the dirt. I get asked in isolation of, “Yeah, how can I increase my AWS bill?” And its answer is, “There is no good reason to ever do that.” Mmm, there are exceptions on this and that's not really what I asked. 
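As an aside on the repo bootstrapping Bret sketches above: most of it is deterministic file generation, which is worth keeping in mind when deciding how much of that toil actually needs an LLM versus a plain script that an LLM helps you write. A minimal sketch, with repo names, file lists, and stub contents that are purely hypothetical placeholders rather than anyone's real standards:

```go
package main

import (
	"log"
	"os"
	"path/filepath"
)

// boilerplate maps relative paths to the stub content each new repo gets.
// Everything here is illustrative; swap in your organization's real standards.
var boilerplate = map[string]string{
	".editorconfig":              "root = true\n\n[*]\nindent_style = space\n",
	".gitignore":                 "node_modules/\nbin/\n.env\n",
	".dockerignore":              ".git/\nnode_modules/\n",
	"Dockerfile":                 "# TODO: pick a base image for this service\n",
	".github/workflows/lint.yml": "# TODO: linting workflow goes here\n",
	"README.md":                  "# New repository\n",
}

// scaffold writes the boilerplate files into the given repo directory.
func scaffold(repo string) error {
	for rel, content := range boilerplate {
		path := filepath.Join(repo, rel)
		if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
			return err
		}
		if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// "Three new repos: a front-end, a back-end, and a Kubernetes YAML repo."
	for _, repo := range []string{"frontend", "backend", "k8s-manifests"} {
		if err := scaffold(repo); err != nil {
			log.Fatal(err)
		}
	}
}
```

The part Bret is really asking for is not writing these files; it is something that already knows an organization's preferences well enough to fill in the contents.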
It's, on some level, that tries to out-human you and gets it hilariously wrong.Bret: Yeah, there's definitely, I think—it wasn't me that said this, but in the state we're in right now, there is this dangerous point of using any of these LLMs where, if you're asking it questions and you don't know anything about that thing you're asking about, you don't know what's false, you don't know what's right, and you're going to get in trouble pretty quickly. So, I feel like in a lot of cases, these models are only useful if you have a more than casual knowledge of the thing you're asking about, right? Because, like, you can—like, you've probably tried to experiment. If you're asking about AWS stuff, I'm just going to imagine that it's going to make some of those service names up and it's going to create things that don't exist or that you can't do, and you're going to have to figure out what works and what doesn't.And what do you do, right? Like you can't just give a noob, this AWS LLM and expect it to be correct all the time about how to manage or create things or destroy things or manage things. So, maybe in five years. Maybe that will be the thing. You literally hire someone who has a computing degree out of a university somewhere and then they can suddenly manage AWS because the robot is correct 99.99% of the time. We're just—I keep getting told that that's years and years away and we don't know how to stop the hallucinations, so we're all stuck with it.Corey: That is the failure mode that is disappointing. We're never going to stuff that genie back in the bottle. Like, that is—technology does not work that way. So, now that it's here, we need to find a way to live with it. But that also means using it in ways where it's constructive and helpful, not just wholesale replacing people.What does worry me about a lot of the use it to build an app, when I wound up showing this to some of my engineering friends, their immediate response universally, was, “Well, yeah, that's great for, like, the easy, trivial stuff like querying a bad API, but for any of this other stuff, you still need senior engineers.” So, their defensiveness was the reaction, and I get that. But also, where do you think senior engineers come from? It's solving a bunch of stuff like this. You didn't all spring, fully formed, from the forehead of some God. Like, you started off as junior and working on small trivial problems, like this one, to build a skill set and realize what works well, what doesn't, then life goes on.Bret: Yeah. In a way—I mean, you and I have been around long enough that in a way, the LLMs don't really change anything in terms of who's hireable, how many people you need in your team, or what types of people you need your team. I feel like, just like the cloud allowed us to have less people to do roughly the same thing as we all did in own data centers, I feel like to a large extent, these AIs are just going to do the same thing. It's not fundamentally changing the game for most people to allow a university graduate to become a senior engineer overnight, or the fact that you don't need, you know, the idea that you don't maybe need senior engineers anymore and you can operate at AWS at scale, multi-region setup with some person with a year experience. I don't think any of those things are true in the near term.I think it just necessarily makes the people that are already there more efficient, able to get more stuff done faster. 
And we've been dealing with that for 30, 40, 50 years, like, that's exactly—I have this slideshow that I keep, I've been using it for a decade and it hasn't really changed. And I got in in the mid-'90s when we were changing from single large computers to distributed computing when the PC took out—took on. I mean, like, I was doing miniframes, and, you know, IBMs and HP Unixes. And that's where I jumped in.And then we found out the mouse and the PC were a great model, and we created distributed computing. That changed the game, allowed us, so many of us to get in that weren't mainframe experts and didn't know COBOL and a lot of us were able to get in and Windows or Microsoft made a great decision of saying, “We're going to make the server operating system look and act exactly like the client operating system.” And then suddenly, all of us PC enthusiasts were now server admins. So, there's this big shift in the '90s. We got a huge amount of server admins.And then virtualization showed up, you know, five years later, and suddenly, we were able to do so much more with the same number of people in a data center and with a bunch of servers. And I watched my team in a big government organization was running 18 people. I had three hardware guys in the data center. That went to one in a matter of years because we were able to virtualize so much we needed physical servers less often, we needed less physical data center server admins, we needed more people to run the software. So, we shifted that team down and then we scaled up software development and people that knew more about actually managing and running software.So, this is, like, I feel like the shifts are happening, then we had the cloud and then we had containerization. It doesn't really change it at a vast scale. And I think sometimes people are a little bit too worried about the LLMs as if they're somehow going to make tech workers obsolete. And I just think, no, we're just going to be managing the different things. We're going to—someone else said the great quote, and I'll end with this, you know, “It's not the LLM that's going to replace you. It's the person who knows the LLMs that's going to replace you.”And that's the same thing you could have said ten years ago for, “It's not the cloud that's going to replace you. It's someone who knows how to manage the cloud that's going to replace you.” [laugh]. So, you could swap that word out for—Corey: A line I heard, must have been 30 years ago now is, “Think. It's the only thing keeping a computer from taking your job.”Bret: Yeah [laugh], and these things don't think so. We haven't figured that one out yet.Corey: Yeah. Some would say that some people's coworkers don't either, but that's just uncharitable.Bret: That's me without coffee [laugh].Corey: [laugh]. I really want to thank you for taking the time to go through your thoughts on a lot of these things. If people want to learn more, where's the best place for them to find you?Bret: bretfisher.com, or just search Bret Fisher. You'll find all my stuff, hopefully, if I know how to use the internet, B-R-E-T F-I-S-H-E-R. And yeah, you'll find a YouTube channel, on Twitter, I hang out there every day, and on my website.Corey: And we will, of course, put links to that in the [show notes 00:38:22]. Thank you so much for taking the time to speak with me today. I really appreciate it.Bret: Yeah. Thanks, Corey. See you soon.Corey: Bret Fisher, DevOps dude and cloud-native trainer. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that you have a Chat-Gippity thing write for you, where, just like you, it sounds very confident, but it's also completely wrong.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
The Evolution of OpenTelemetry with Austin Parker

Screaming in the Cloud

Play Episode Listen Later Sep 5, 2023 40:09


Austin Parker, Community Maintainer at OpenTelemetry, joins Corey on Screaming in the Cloud to discuss OpenTelemetry's mission in the world of observability. Austin explains how the OpenTelemetry community was able to scale the OpenTelemetry project to a commercial offering, and the way OpenTelemetry is driving innovation in the data space. Corey and Austin also discuss why Austin decided to write a book on OpenTelemetry, and the book's focus on the evergreen applications of the tool.

About Austin

Austin Parker is the OpenTelemetry Community Maintainer, as well as an event organizer, public speaker, author, and general bon vivant. They've been a part of OpenTelemetry since its inception in 2019.

Links Referenced:

OpenTelemetry: https://opentelemetry.io/
Learning OpenTelemetry early release: https://www.oreilly.com/library/view/learning-opentelemetry/9781098147174/
Page with Austin's social links: https://social.ap2.io

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Look, I get it. Folks are being asked to do more and more. Most companies don't have a dedicated DBA because that person now has a full-time job figuring out which one of AWS's multiple managed database offerings is right for every workload. Instead, developers and engineers are being asked to support, and heck, if time allows, optimize their databases. That's where OtterTune comes in. Their AI is your database co-pilot for MySQL and PostgreSQL on Amazon RDS or Aurora. It helps improve performance by up to 4x or reduce costs by 50 percent—both of those are decent options. Go to ottertune dot com to learn more and start a free trial. That's O-T-T-E-R-T-U-N-E dot com.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. It's been a few hundred episodes since I had Austin Parker on to talk about the things that Austin cares about. But it's time to rectify that. Austin is the community maintainer for OpenTelemetry, which is a CNCF project. If you're unfamiliar with it, we're probably going to fix that in short order. Austin, welcome back, it's been a month of Sundays.

Austin: It has been a month-and-a-half of Sundays. A whole pandemic-and-a-half.

Corey: So much has happened since then. I tried to instrument something with OpenTelemetry about a year-and-a-half ago, and in defense of the project, my use case is always very strange, but it felt like—a lot of things have sharp edges, but it felt like this had so many sharp edges that you just pivot to being a chainsaw, and I would have been at least a little bit more understanding of why it hurts so very much. But I have heard from people that I trust that the experience has gotten significantly better. Before we get into the nitty-gritty of me lobbing the passive-aggressive bug reports I have for you to fix in a scenario in which you can't possibly refuse me, let's start with the beginning. What is OpenTelemetry?

Austin: That's a great question. Thank you for asking it. So, OpenTelemetry is an observability framework.
It is run by the CNCF, you know, home of such wonderful award-winning technologies as Kubernetes, and you know, the second biggest source of YAML in the known universe [clear throat].Corey: On some level, it feels like that is right there with hydrogen as far as unlimited resources in our universe.Austin: It really is. And, you know, as we all know, there are two things that make, sort of, the DevOps and cloud world go around: one of them being, as you would probably know, AWS bills; and the second being YAML. But OpenTelemetry tries to kind of carve a path through this, right, because we're interested in observability. And observability, for those that don't know or have been living under a rock or not reading blogs, it's a lot of things. It's a—but we can generally sort of describe it as, like, this is how you understand what your system is doing.I like to describe it as, it's a way that we can model systems, especially complex, distributed, or decentralized software systems that are pretty commonly found in larg—you know, organizations of every shape and size, quite often running on Kubernetes, quite often running in public or private clouds. And the goal of observability is to help you, you know, model this system and understand what it's doing, which is something that I think we can all agree, a pretty important part of our job as software engineers. Where OpenTelemetry fits into this is as the framework that helps you get the telemetry data you need from those systems, put it into a universal format, and then ship it off to some observability back-end, you know, a Prometheus or a Datadog or whatever, in order to analyze that data and get answers to your questions you have.Corey: From where I sit, the value of OTel—or OpenTelemetry; people in software engineering love abbreviations that are impenetrable from the outside, so of course, we're going to lean into that—but what I found for my own use case is the shining value prop was that I could instrument an application with OTel—in theory—and then send whatever I wanted that was emitted in terms of telemetry, be it events, be it logs, be it metrics, et cetera, and send that to any or all of a curation of vendors on a case-by-case basis, which meant that suddenly it was the first step in, I guess, an observability pipeline, which increasingly is starting to feel like a milit—like an industrial-observability complex, where there's so many different companies out there, it seems like a good approach to use, to start, I guess, racing vendors in different areas to see which performs better. One of the challenges I've had with that when I started down that path is it felt like every vendor who was embracing OTel did it from a perspective of their implementation. Here's how to instrument it to—send it to us because we're the best, obviously. And you're a community maintainer, despite working at observability vendors yourself. You have always been one of those community-first types where you care more about the user experience than you do this quarter for any particular employer that you have, which to be very clear, is intended as a compliment, not a terrifying warning. It's why you have this authentic air to you and why you are one of those very few voices that I trust in a space where normally I need to approach it with significant skepticism. 
How do you see the relationship between vendors and OpenTelemetry?Austin: I think the hard thing is that I know who signs my paychecks at the end of the day, right, and you always have, you know, some level of, you know, let's say bias, right? Because it is a bias to look after, you know, them who brought you to the dance. But I think you can be responsible with balancing, sort of, the needs of your employer, and the needs of the community. You know, the way I've always described this is that if you think about observability as, like, a—you know, as a market, what's the total addressable market there? It's literally everyone that uses software; it's literally every software company.Which means there's plenty of room for people to make their numbers and to buy and sell and trade and do all this sort of stuff. And by taking that approach, by taking sort of the big picture approach and saying, “Well, look, you know, there's going to be—you know, of all these people, there are going to be some of them that are going to use our stuff and there are some of them that are going to use our competitor's stuff.” And that's fine. Let's figure out where we can invest… in an OpenTelemetry, in a way that makes sense for everyone and not just, you know, our people. So, let's build things like documentation, right?You know, one of the things I'm most impressed with, with OpenTelemetry over the past, like, two years is we went from being, as a project, like, if you searched for OpenTelemetry, you would go and you would get five or six or ten different vendor pages coming up trying to tell you, like, “This is how you use it, this is how you use it.” And what we've done as a community is we've said, you know, “If you go looking for documentation, you should find our website. You should find our resources.” And we've managed to get the OpenTelemetry website to basically rank above almost everything else when people are searching for help with OpenTelemetry. And that's been really good because, one, it means that now, rather than vendors or whoever coming in and saying, like, “Well, we can do this better than you,” we can be like, “Well, look, just, you know, put your effort here, right? It's already the top result. It's already where people are coming, and we can prove that.”And two, it means that as people come in, they're going to be put into this process of community feedback, where they can go in, they can look at the docs, and they can say, “Oh, well, I had a bad experience here,” or, “How do I do this?” And we get that feedback and then we can improve the docs for everyone else by acting on that feedback, and the net result of this is that more people are using OpenTelemetry, which means there are more people kind of going into the tippy-tippy top of the funnel, right, that are able to become a customer of one of these myriad observability back ends.Corey: You touched on something very important here, when I first was exploring this—you may have been looking over my shoulder as I went through this process—my impression initially was, oh, this is a ‘CNCF project' in quotes, where—this is not true universally, of course, but there are cases where it clearly—is where this is an, effectively, vendor-captured project, not necessarily by one vendor, but by an almost consortium of them. And that was my takeaway from OpenTelemetry. It was conversations with you, among others, that led me to believe no, no, this is not in that vein. This is clearly something that is a win. 
There are just a whole bunch of vendors more-or-less falling all over themselves, trying to stake out thought leadership and imply ownership, on some level, of where these things go. But I definitely left with a sense that this is bigger than any one vendor.Austin: I would agree. I think, to even step back further, right, there's almost two different ways that I think vendors—or anyone—can approach OpenTelemetry, you know, from a market perspective, and one is to say, like, “Oh, this is socializing, kind of, the maintenance burden of instrumentation.” Which is a huge cost for commercial players, right? Like, if you're a Datadog or a Splunk or whoever, you know, you have these agents that you go in and they rip telemetry out of your web servers, out of your gRPC libraries, whatever, and it costs a lot of money to pay engineers to maintain those instrumentation agents, right? And the cynical take is, oh, look at all these big companies that are kind of like pushing all that labor onto the open-source community, and you know, I'm not casting any aspersions here, like, I do think that there's an element of truth to it though because, yeah, that is a huge fixed cost.And if you look at the actual lived reality of people and you look at back when SignalFx was still a going concern, right, and they had their APM agents open-sourced, you could go into the SignalFx repo and diff, like, their [Node Express 00:10:15] instrumentation against the Datadog Node Express instrumentation, and it's almost a hundred percent the same, right? Because it's truly a commodity. There's no—there's nothing interesting about how you get that telemetry out. The interesting stuff all happens after you have the telemetry and you've sent it to some back-end, and then you can, you know, analyze it and find interesting things. So, yeah, like, it doesn't make sense for there to be five or six or eight different companies all competing to rebuild the same wheels over and over and over and over when they don't have to.I think the second thing that some people are starting to understand is that it's like, okay, let's take this a step beyond instrumentation, right? Because the goal of OpenTelemetry really is to make sure that this instrumentation is native so that you don't need a third-party agent, you don't need some other process or jar or whatever that you drop in and it instruments stuff for you. The JVM should provide this, your web framework should provide this, your RPC library should provide this right? Like, this data should come from the code itself and be in a normalized fashion that can then be sent to any number of vendors or back ends or whatever. And that changes how—sort of, the competitive landscape a lot, I think, for observability vendors because rather than, kind of, what you have now, which is people will competing on, like, well, how quickly can I throw this agent in and get set up and get a dashboard going, it really becomes more about, like, okay, how are you differentiating yourself against every other person that has access to the same data, right? 
And you get more interesting use cases and how much more interesting analysis features, and that results in more innovation in, sort of, the industry than we've seen in a very long time.Corey: For me, just from the customer side of the world, one of the biggest problems I had with observability in my career as an SRE-type for years was you would wind up building your observability pipeline around whatever vendor you had selected and that meant emphasizing the things they were good at and de-emphasizing the things that they weren't. And sometimes it's worked to your benefit; usually not. But then you always had this question when it got things that touched on APM or whatnot—or Application Performance Monitoring—where oh, just embed our library into this. Okay, great. But a year-and-a-half ago, my exposure to this was on an application that I was running in distributed fashion on top of AWS Lambda.So great, you can either use an extension for this or you can build in the library yourself, but then there's always a question of precedence where when you have multiple things that are looking at this from different points of view, which one gets done first? Which one is going to see the others? Which one is going to enmesh the other—enclose the others in its own perspective of the world? And it just got incredibly frustrating. One of the—at least for me—bright lights of OTel was that it got away from that where all of the vendors receiving telemetry got the same view.Austin: Yeah. They all get the same view, they all get the same data, and you know, there's a pretty rich collection of tools that we're starting to develop to help you build those pipelines yourselves and really own everything from the point of generation to intermediate collection to actually outputting it to wherever you want to go. For example, a lot of really interesting work has come out of the OpenTelemetry collector recently; one of them is this feature called Connectors. And Connectors let you take the output of certain pipelines and route them as inputs to another pipeline. And as part of that connection, you can transform stuff.So, for example, let's say you have a bunch of [spans 00:14:05] or traces coming from your API endpoints, and you don't necessarily want to keep all those traces in their raw form because maybe they aren't interesting or maybe there's just too high of a volume. So, with Connectors, you can go and you can actually convert all of those spans into metrics and export them to a metrics database. You could continue to save that span data if you want, but you have options now, right? Like, you can take that span data and put it into cold storage or put it into, like, you know, some sort of slow blob storage thing where it's not actively indexed and it's slow lookups, and then keep a metric representation of it in your alerting pipeline, use metadata exemplars or whatever to kind of connect those things back. And so, when you do suddenly see it's like, “Oh, well, there's some interesting p99 behavior,” or we're hitting an alert or violating an SLO or whatever, then you can go back and say, like, “Okay, well, let's go dig through the slow da—you know, let's look at the cold data to figure out what actually happened.”And those are features that, historically, you would have needed to go to a big, important vendor and say, like, “Hey, here's a bunch of money,” right? 
Like, “Do this for me.” Now, you have the option to kind of do all that more interesting pipeline stuff yourself and then make choices about vendors based on, like, who is making a tool that can help me with the problem that I have? Because most of the time, I don't—I feel like we tend to treat observability tools as—it depends a lot on where you sit in the org—but you've certainly seen this movement towards, like, “Well, we don't want a tool; we want a platform. We want to go to Lowe's and we want to get the 48-in-one kit that has a bunch of things in it. And we're going to pay for the 48-in-one kit, even if we only need, like, two things or three things out of it.” OpenTelemetry lets you kind of step back and say, like, “Well, what if we just got, like, really high-quality tools for the two or three things we need, and then for the rest of the stuff, we can use other cheaper options?” Which is, I think, really attractive, especially in today's macroeconomic conditions, let's say.
Corey: One thing I'm trying to wrap my head around—because we all find, when it comes to observability, in my experience, it's the parable of three blind people trying to describe an elephant by touch; depending on where you are on the elephant, you have a very different perspective. What I'm trying to wrap my head around is, what is the vision for OpenTelemetry? Is it specifically envisioned to be the agent that runs wherever the workload is, whether it's an agent on a host or a layer in a Lambda function, or a sidecar or whatnot in a Kubernetes cluster that winds up gathering and sending data out? Or is the vision something different? Because part of what you're saying aligns with my perspective on it, but other parts of it seem to—that there's a misunderstanding somewhere, and it's almost certainly on my part.
Austin: I think the long-term vision is that you as a developer, you as an SRE, don't even have to think about OpenTelemetry, that when you are using your container orchestrator or you are using your API framework or you're using your Managed API Gateway, or any kind of software that you're building something with, that the telemetry data from that software is emitted in an OpenTelemetry format, right? And when you are writing your code, you know, and you're using gRPC, let's say, you could just natively expect that OpenTelemetry is kind of there in the background and it's integrated into the actual libraries themselves. And so, you can just call the OpenTelemetry API and it's part of the standard library almost, right? You add some additional metadata to a span and say, like, “Oh, this is the customer ID,” or, “This is some interesting attribute that I want to track for later on,” or, “I'm going to create a histogram here or counter,” whatever it is, and then all that data is just kind of there, right, invisible to you unless you need it. And then when you need it, it's there for you to kind of pick up and send off somewhere to any number of back-ends or databases or whatnot that you could then use to discover problems or better model your system.
That's the long-term vision, right, that it's just there, everyone uses it. It is a de facto and du jour standard. I think in the medium term, it does look a little bit more like OpenTelemetry is kind of this Swiss army knife agent that's running in sidecars in Kubernetes or it's running on your EC2 instance.
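A minimal sketch of what that "just call the OpenTelemetry API" experience looks like in Go today: the span name, attribute key, and counter name below are invented for illustration, and the SDK and exporter wiring that would actually ship the data somewhere is assumed to be configured elsewhere. Without that wiring, these calls are no-ops, which is exactly what makes it cheap for libraries to instrument themselves natively.

```go
package main

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

// handleOrder shows the API surface Austin describes: attach metadata to a
// span and record a metric, without caring where the data ends up.
func handleOrder(ctx context.Context, customerID string) {
	// Start a span from the globally registered tracer provider.
	tracer := otel.Tracer("checkout")
	ctx, span := tracer.Start(ctx, "handleOrder")
	defer span.End()

	// "Oh, this is the customer ID" -- arbitrary key/value attributes.
	span.SetAttributes(attribute.String("customer.id", customerID))

	// A counter works the same way; histograms are analogous.
	meter := otel.Meter("checkout")
	orders, _ := meter.Int64Counter("orders.processed") // error ignored for brevity
	orders.Add(ctx, 1, metric.WithAttributes(attribute.String("customer.id", customerID)))
}

func main() {
	handleOrder(context.Background(), "example-customer")
}
```

Swapping where that data goes then becomes a deployment-time decision rather than a code change.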
Until we get to the point of everyone just agrees that we're going to use OpenTelemetry protocol for the data and we're going to use all your stuff and we just natively emit it, then that's going to be how long we're in that midpoint. But that's sort of the medium and long-term vision I think. Does that track?Corey: It does. And I'm trying to equate this to—like the evolution back in the Stone Age was back when I was first getting started, Nagios was the gold standard. It was kind of the original Call of Duty. And it was awful. There were a bunch of problems with it, but it also worked.And I'm not trying to dunk on the people who built that. We all stand on the shoulders of giants. It was an open-source project that was awesome doing exactly what it did, but it was a product built for a very different time. It completely had the wheels fall off as soon as you got to things were even slightly ephemeral because it required this idea of the server needed to know where all of the things that was monitoring lived as an individual host basis, so there was this constant joy of, “Oh, we're going to add things to a cluster.” Its perspective was, “What's a cluster?” Or you'd have these problems with a core switch going down and suddenly everything else would explode as well.And even setting up an on-call rotation for who got paged when was nightmarish. And a bunch of things have evolved since then, which is putting it mildly. Like, you could say that about fire, the invention of the wheel. Yeah, a lot of things have evolved since the invention of the wheel, and here we are tricking sand into thinking. But we find ourselves just—now it seems that the outcome of all of this has been instead of one option that's the de facto standard that's kind of terrible in its own ways, now, we have an entire universe of different products, many of which are best-of-breed at one very specific thing, but nothing's great at everything.It's the multifunction printer conundrum, where you find things that are great at one or two things at most, and then mediocre at best at the rest. I'm excited about the possibility for OpenTelemetry to really get to a point of best-of-breed for everything. But it also feels like the money folks are pushing for consolidation, if you believe a lot of the analyst reports around this of, “We already pay for seven different observability vendors. How about we knock it down to just one that does all of these things?” Because that would be terrible. What do you land on that?Austin: Well, as I intu—or alluded to this earlier, I think the consolidation in the observability space, in general, is very much driven by that force you just pointed out, right? The buyers want to consolidate more and more things into single tools. And I think there's a lot of… there are reasons for that that—you know, there are good reasons for that, but I also feel like a lot of those reasons are driven by fundamentally telemetry-side concerns, right? So like, one example of this is if you were Large Business X, and you see—you are an engineering director and you get a report, that's like, “We have eight different metrics products.” And you're like, “That seems like a lot. Let's just use Brand X.”And Brand X will tell you very, very happily tell you, like, “Oh, you just install our thing everywhere and you can get rid of all these other tools.” And usually, there's two reasons that people pick tools, right? 
One reason is that they are forced to and then they are forced to do a bunch of integration work to get whatever the old stuff was working in the new way, but the other reason is because they tried a bunch of different things and they found the one tool that actually worked for them. And what happens invariably in these sort of consolidation stories is, you know, the new vendor comes in on a shining horse to consolidate, and you wind up instead of eight distinct metrics tools, now you have nine distinct metrics tools because there's never any bandwidth for people to go back and, you know—you're Nagios example, right, Nag—people still use Nagios every day. What's the economic justification to take all those Nagios installs, if they're working, and put them into something else, right?What's the economic justification to go and take a bunch of old software that hasn't been touched for ten years that still runs and still does what needs to do, like, where's the incentive to go and re-instrument that with OpenTelemetry or anything else? It doesn't necessarily exist, right? And that's a pretty, I think, fundamental decision point in everyone's observability journey, which is what do you do about all the old stuff? Because most of the stuff is the old stuff and the worst part is, most of the stuff that you make money off of is the old stuff as well. So, you can't ignore it, and if you're spending, you know, millions of millions of dollars on the new stuff—like, there was a story that went around a while ago, I think, Coinbase spent something like, what, $60 million on Datadog… I hope they asked for it in real money and not Bitcoin. But—Corey: Yeah, something I've noticed about all the vendors, and even Coinbase themselves, very few of them actually transact in cryptocurrency. It's always cash on the barrelhead, so to speak.Austin: Yeah, smart. But still, like, that's an absurd amount of money [laugh] for any product or service, I would argue, right? But that's just my perspective. I do think though, it goes to show you that you know, it's very easy to get into these sort of things where you're just spending over the barrel to, like, the newest vendor that's going to come in and solve all your problems for you. And just, it often doesn't work that way because most places aren't—especially large organizations—just aren't built in is sort of like, “Oh, we can go through and we can just redo stuff,” right? “We can just roll out a new agent through… whatever.”We have mainframes [unintelligible 00:25:09], mainframes to thinking about, you have… in many cases, you have an awful lot of business systems that most, kind of, cloud people don't like, think about, right, like SAP or Salesforce or ServiceNow, or whatever. And those sort of business process systems are actually responsible for quite a few things that are interesting from an observability point of view. But you don't see—I mean, hell, you don't even see OpenTelemetry going out and saying, like, “Oh, well, here's the thing to let you know, observe Apex applications on Salesforce,” right? It's kind of an undiscovered country in a lot of ways and it's something that I think we will have to grapple with as we go forward. In the shorter term, there's a reason that OpenTelemetry mostly focuses on cloud-native applications because that's a little bit easier to actually do what we're trying to do on them and that's where the heat and light is. 
But once we get done with that, then the sky is the limit.[midroll 00:26:11]Corey: It still feels like OpenTelemetry is evolving rapidly. It's certainly not, I don't want to say it's not feature complete, which, again, what—software is never done. But it does seem like even quarter-to-quarter or month-to-month, its capabilities expand massively. Because you apparently enjoy pain, you're in the process of writing a book. I think it's in early release or early access that comes out next year, 2024. Why would you do such a thing?Austin: That's a great question. And if I ever figure out the answer I will tell you.Corey: Remember, no one wants to write a book; they want to have written the book.Austin: And the worst part is, is I have written the book and for some reason, I went back for another round. I—Corey: It's like childbirth. No one remembers exactly how horrible it was.Austin: Yeah, my partner could probably attest to that. Although I was in the room, and I don't think I'd want to do it either. So, I think the real, you know, the real reason that I decided to go and kind of write this book—and it's Learning OpenTelemetry; it's in early release right now on the O'Reilly learning platform and it'll be out in print and digital next year, I believe, we're targeting right now, early next year.But the goal is, as you pointed out so eloquently, OpenTelemetry changes a lot. And it changes month to month sometimes. So, why would someone decide—say, “Hey, I'm going to write the book about learning this?” Well, there's a very good reason for that and it is that I've looked at a lot of the other books out there on OpenTelemetry, on observability in general, and they talk a lot about, like, here's how you use the API. Here's how you use the SDK. Here's how you make a trace or a span or a log statement or whatever. And it's very technical; it's very kind of in the weeds.What I was interested in is saying, like, “Okay, let's put all that stuff aside because you don't necessarily…” I'm not saying any of that stuff's going to change. And I'm not saying that how to make a span is going to change tomorrow; it's not, but learning how to actually use something like OpenTelemetry isn't just knowing how to create a measurement or how to create a trace. It's, how do I actually use this in a production system? To my point earlier, how do I use this to get data about, you know, these quote-unquote, “Legacy systems?” How do I use this to monitor a Kubernetes cluster? What's the important parts of building these observability pipelines? If I'm maintaining a library, how should I integrate OpenTelemetry into that library for my users? And so on, and so on, and so forth.And the answers to those questions actually probably aren't going to change a ton over the next four or five years. Which is good because that makes it the perfect thing to write a book about. So, the goal of Learning OpenTelemetry is to help you learn not just how to use OpenTelemetry at an API or SDK level, but it's how to build an observability pipeline with OpenTelemetry, it's how to roll it out to an organization, it's how to convince your boss that this is what you should use, both for new and maybe picking up some legacy development. It's really meant to give you that sort of 10,000-foot view of what are the benefits of this, how does it bring value and how can you use it to build value for an observability practice in an organization?Corey: I think that's fair. 
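For readers who have never touched the SDK side Austin is describing, the "how to make a span" piece really is small. The sketch below is a minimal Python example using the OpenTelemetry API and SDK; the service name, span name, and console exporter are illustrative choices rather than anything prescribed in the episode or the book, and a production rollout would normally swap in an OTLP exporter pointed at a collector.

```python
# A minimal sketch of the "create a trace/span" part of OpenTelemetry in Python.
# Assumes the opentelemetry-api and opentelemetry-sdk packages are installed;
# the service name and span name are illustrative placeholders.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Wire up a tracer provider once at startup. A real pipeline would typically
# use an OTLP exporter pointed at a collector instead of printing to stdout.
provider = TracerProvider(resource=Resource.create({"service.name": "checkout"}))
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("example.instrumentation")

def handle_request(order_id: str) -> None:
    # Each unit of work becomes a span, with attributes you can slice on later.
    with tracer.start_as_current_span("handle_request") as span:
        span.set_attribute("order.id", order_id)
        span.add_event("order validated")

handle_request("order-123")
```

Everything the snippet leaves out, where the data goes, who runs the collector, and how you convince an organization to adopt it, is the part Austin says the book is actually about.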
Looking at the more quote-unquote, “Evergreen,” style of content as opposed to—like, that's the reason for example, I never wind up doing tutorials on how to use an AWS service because one console change away and suddenly I have to redo the entire thing. That's a treadmill I never had much interest in getting on. One last topic I want to get into before we wind up wrapping the episode—because I almost feel obligated to sprinkle this all over everything because the analysts told me I have to—what's your take on generative AI, specifically with an eye toward observability?Austin: [sigh], gosh, I've been thinking a lot about this. And—hot take alert—as a skeptic of many technological bubbles over the past five or so years, ten years, I'm actually pretty hot on AI—generative AI, large language models, things like that—but not for the reasons that people like to kind of hold them up, right? Not so that we can all make our perfect, funny [sigh], deep dream, meme characters or whatever through Stable Fusion or whatever ChatGPT spits out at us when we ask for a joke. I think the real win here is that this to me is, like, the biggest advance in human-computer interaction since resistive touchscreens. Actually, probably since the mouse.Corey: I would agree with that.Austin: And I don't know if anyone has tried to get someone that is, you know, over the age of 70 to use a computer at any time in their life, but mapping human language to trying to do something on an operating system or do something on a computer on the web is honestly one of the most challenging things that faces interface design, face OS designers, faces anyone. And I think this also applies for dev tools in general, right? Like, if you think about observability, if you think about, like, well, what are the actual tasks involved in observability? It's like, well, you're making—you're asking questions. You're saying, like, “Hey, for this metric named HTTPrequestsByCode,” and there's four or five dimensions, and you say, like, “Okay, well break this down for me.” You know, you have to kind of know the magic words, right? You have to know the magic promQL sequence or whatever else to plug in and to get it to graph that for you.And you as an operator have to have this very, very well developed, like, depth of knowledge and math and statistics to really kind of get a lot of—Corey: You must be at least this smart to ride on this ride.Austin: Yeah. And I think that, like that, to me is the real—the short-term win for certainly generative AI around using, like, large language models, is the ability to create human language interfaces to observability tools, that—Corey: As opposed to learning your own custom SQL dialect, which I see a fair number of times.Austin: Right. And, you know, and it's actually very funny because there was a while for the—like, one of my kind of side projects for the past [sigh] a little bit [unintelligible 00:32:31] idea of, like, well, can we make, like, a universal query language or universal query layer that you could ship your dashboards or ship your alerts or whatever. And then it's like, generative AI kind of just, you know, completely leapfrogs that, right? It just says, like, well, why would you need a query language, if we can just—if you can just ask the computer and it works, right?Corey: The most common programming language is about to become English.Austin: Which I mean, there's an awful lot of externalities there—Corey: Which is great. I want to be clear. I'm not here to gatekeep.Austin: Yeah. 
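To make the "magic words" point concrete: asking "break this metric down by status code" today means knowing a query language and an API. The sketch below is a hedged Python illustration against the standard Prometheus HTTP query endpoint; the server address, metric name, and label are assumptions for the example, not details from anyone's actual system in the episode.

```python
# What the operator has to know today: a PromQL expression plus the Prometheus
# HTTP API. The server URL, metric name (http_requests_total), and label (code)
# are placeholder assumptions for illustration.
import requests

PROMETHEUS_URL = "http://prometheus.example.internal:9090"  # assumed endpoint

# "Break HTTP requests down by status code over the last five minutes"
# spelled out the way the query language expects it:
promql = "sum by (code) (rate(http_requests_total[5m]))"

resp = requests.get(
    f"{PROMETHEUS_URL}/api/v1/query",
    params={"query": promql},
    timeout=10,
)
resp.raise_for_status()

for series in resp.json()["data"]["result"]:
    code = series["metric"].get("code", "unknown")
    _, value = series["value"]  # [unix timestamp, value as a string]
    print(f"status {code}: {float(value):.2f} req/s")
```

The natural-language interfaces Austin is excited about would sit in front of exactly this kind of call, generating the expression from a plain-English question instead of expecting the operator to memorize it.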
I mean, I think there's a lot of externalities there, and there's a lot—and the kind of hype to provable benefit ratio is very skewed right now towards hype. That said, one of the things that is concerning to me as sort of an observability practitioner is the amount of people that are just, like, whole-hog, throwing themselves into, like, oh, we need to integrate generative AI, right? Like, we need to put AI chatbots and we need to have ChatGPT built into our products and da-da-da-da-da. And now you kind of have this perfect storm of people that really don't ha—because they're just using these APIs to integrate gen AI stuff with, they really don't understand what it's doing because a lot you know, it is very complex, and I'll be the first to admit that I really don't understand what a lot of it is doing, you know, on the deep, on the foundational math side.But if we're going to have trust in, kind of, any kind of system, we have to understand what it's doing, right? And so, the only way that we can understand what it's doing is through observability, which means it's incredibly important for organizations and companies that are building products on generative AI to, like, drop what—you know, walk—don't walk, run towards something that is going to give you observability into these language models.Corey: Yeah. “The computer said so,” is strangely dissatisfying.Austin: Yeah. You need to have that base, you know, sort of, performance [goals and signals 00:34:31], obviously, but you also need to really understand what are the questions being asked. As an example, let's say you have something that is tokenizing questions. You really probably do want to have some sort of observability on the hot path there that lets you kind of break down common tokens, especially if you were using, like, custom dialects or, like, vectors or whatever to modify the, you know, neural network model, like, you really want to see, like, well, what's the frequency of the certain tokens that I'm getting they're hitting the vectors versus not right? Like, where can I improve these sorts of things? Where am I getting, like, unexpected results?And maybe even have some sort of continuous feedback mechanism that it could be either analyzing the tone and tenor of end-user responses or you can have the little, like, frowny and happy face, whatever it is, like, something that is giving you that kind of constant feedback about, like, hey, this is how people are actually like interacting with it. Because I think there's way too many stories right now people just kind of like saying, like, “Oh, okay. Here's some AI-powered search,” and people just, like, hating it. Because people are already very primed to distrust AI, I think. And I can't blame anyone.Corey: Well, we've had an entire lifetime of movies telling us that's going to kill us all.Austin: Yeah.Corey: And now you have a bunch of, also, billionaire tech owners who are basically intent on making that reality. But that's neither here nor there.Austin: It isn't, but like I said, it's difficult. It's actually one of the first times I've been like—that I've found myself very conflicted.Corey: Yeah, I'm a booster of this stuff; I love it, but at the same time, you have some of the ridiculous hype around it and the complete lack of attention to safety and humanity aspects of it that it's—I like the technology and I think it has a lot of promise, but I want to get lumped in with that set.Austin: Exactly. Like, the technology is great. 
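As a rough illustration of what observability on that hot path could look like, here is a hedged Python sketch that wraps a model call in an OpenTelemetry span and records token usage and user feedback as attributes. The client object and its fields are placeholders, since the episode doesn't name a specific model API, the attribute names are just one plausible convention, and a tracer provider is assumed to have been configured elsewhere.

```python
# Hedged sketch: spans around an LLM call recording the signals mentioned in
# the conversation (token usage, user feedback). `llm_client` and its
# `complete()` return fields are placeholders, not a real library's API.
from opentelemetry import trace

tracer = trace.get_tracer("genai.observability.example")

def answer_question(prompt: str, llm_client) -> str:
    with tracer.start_as_current_span("llm.completion") as span:
        span.set_attribute("llm.prompt.length_chars", len(prompt))
        response = llm_client.complete(prompt)  # assumed interface
        span.set_attribute("llm.tokens.prompt", response.prompt_tokens)
        span.set_attribute("llm.tokens.completion", response.completion_tokens)
        span.set_attribute("llm.model", response.model_name)
        return response.text

def record_feedback(request_id: str, thumbs_up: bool) -> None:
    # The thumbs-up/thumbs-down loop: emit feedback as its own span so it can
    # be joined back to the original completion during analysis.
    with tracer.start_as_current_span("llm.feedback") as span:
        span.set_attribute("llm.request.id", request_id)
        span.set_attribute("llm.feedback.positive", thumbs_up)
```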
The fan base is… ehh, maybe something a little different. But I do think that, for lack of a better—not to be an inevitable-ist or whatever, but I do think that there is a significant amount of, like, this is a genie you can't put back in the bottle and it is going to have, like, wide-ranging, transformative effects on the discipline of, like, software development, software engineering, and white collar work in general, right? Like, there's a lot of—if your job involves, like, putting numbers into Excel and making pretty spreadsheets, then ooh, that doesn't seem like something that's going to do too hot when I can just have Excel do that for me.And I think we do need to be aware of that, right? Like, we do need to have that sort of conversation about, like… what are we actually comfortable doing here in terms of displacing human labor? When we do displace human labor, are we doing it so that we can actually give people leisure time or so that we can just cram even more work down the throats of the humans that are left?Corey: And unfortunately, I think we might know what that answer is, at least on our current path.Austin: That's true. But you know, I'm an optimist.Corey: I… don't do well with disappointment. Which the show has certainly not been. I really want to thank you for taking the time to speak with me today. If people want to learn more, where's the best place for them to find you?Austin: Welp, I—you can find me on most social media. Many, many social medias. I used to be on Twitter a lot, and we all know what happened there. The best place to figure out what's going on is check out my bio, social.ap2.io will give you all the links to where I am. And yeah, been great talking with you.Corey: Likewise. Thank you so much for taking the time out of your day. Austin Parker, community maintainer for OpenTelemetry. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment pointing out that actually, physicists say the vast majority of the universe's empty space, so that we can later correct you by saying ah, but it's empty whitespace. That's right. YAML wins again.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
How Redpanda Extracts Business Value from Data Events with Alex Gallego

Screaming in the Cloud

Play Episode Listen Later Aug 31, 2023 34:43


Alex Gallego, CEO & Founder of Redpanda, joins Corey on Screaming in the Cloud to discuss his experience founding and scaling a successful data streaming company over the past 4 years. Alex explains how it's been a fun and humbling journey to go from being an engineer to being a founder, and how he's built a team he trusts to hand the production off to. Corey and Alex discuss the benefits and various applications of Redpanda's data streaming services, and Alex reveals why it was so important to him to focus on doing one thing really well when it comes to his product strategy. Alex also shares details on the Hack the Planet scholarship program he founded for individuals in underrepresented communities. About AlexAlex Gallego is the founder and CEO of Redpanda, the streaming data platform for developers. Alex has spent his career immersed in deeply technical environments, and is passionate about finding and building solutions to the challenges of modern data streaming. Prior to Redpanda, Alex was a principal engineer at Akamai, as well as co-founder and CTO of Concord.io, a high-performance stream-processing engine acquired by Akamai in 2016. He has also engineered software at Factset Research Systems, Forex Capital Markets and Yieldmo; and holds a bachelor's degree in computer science and cryptography from NYU. Links Referenced: Redpanda: https://redpanda.com/ Twitter: https://twitter.com/emaxerrno Redpanda community Slack: https://redpandacommunity.slack.com/join/shared_invite/zt-1xq6m0ucj-nI41I7dXWB13aQ2iKBDvDw Hack The Planet Scholarship: https://redpanda.com/scholarship TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Tired of slow database performance and bottlenecks on MySQL or PostgresSQL when using Amazon RDS or Aurora? How'd you like to reduce query response times by ninety percent? Better yet, how would you like to get me to pronounce database names correctly? Join customers like Zscaler, Intel, Booking.com, and others that use OtterTune's artificial intelligence to automatically optimize and keep their databases healthy. Go to ottertune dot com to learn more and start a free trial. That's O-T-T-E-R-T-U-N-E dot com.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn, and this promoted guest episode is brought to us by our friends at Redpanda, which I'm thrilled about because I have a personal affinity for companies that have cartoon mascots in the form of animals and are willing to at least be slightly creative with them. My guest is Alex Gallego, the founder and CEO over at Redpanda. Alex, thanks for joining me.Alex: Corey, thanks for having me.Corey: So, I'm not asking about the animal; I'm talking about the company, which I imagine is a frequent source of disambiguation when you meet people at parties and they don't quite understand what it is that you do. And you folks are big in the data streaming space, but data streaming can mean an awful lot of things to an awful lot of people. What is it for you?Alex: Largely it's about enabling developers to build applications that can extract value of every single event, every click, every mouse movement, every transaction, every event that goes through your network. 
This is what Redpanda is about. It's like how do we help you make more money with every single event? How do we help you be more successful? And you know, happy to give examples in finance, or IoT, or oil and gas, if it's helpful for the audience, but really, to me, it's like, okay, if we can give you the framework in which you can build a new application that allows you to extract value out of data, every single event that's going through your network, to me, that's what a streaming is about. It large, it's you know, data contextualized with a timestamp and largely, a sort of a database of event streaming.Corey: One of the things that I find curious about the space is that usually, companies wind up going one of two directions when you're talking about data streaming. Either there, “Oh, just send it all to us and we'll take care of it for you,” or otherwise, it's a, great they more or less ship something that you've run in your own environment. In the olden days of data centers, that usually resembled a box of some sort. You're one of those interesting split-the-difference companies where you offer both models. Do you find that one of those tends to be seeing more adoption these days or that there's an increasing trend toward one direction or the other?Alex: Yeah. So, right now, I think that to me, the future of all these data-intensive products—whether you're a database or a streaming engine—will, because simply of cost of networks transferred between the hybrid clouds and your accounts, sending a gigabyte a second of data between, let's say, you know, your data center and a vendor, it's just so expensive that at some point, from just a cost perspective, like, running the infrastructure, it's in the millions of dollars. And so, running the data inside your VPC, it's sort of the next logical evolution of how we've used to consume services. And so, I actually think it's just the evolution: people would self-host because of costs and then they would use services because of operational simplicity. “I don't want to spend team skills and time building this. I want to pay a vendor.”And so, BYOC, to be honest—which is what we call this offering—it was about [laugh] sidestepping the costs and of being stuck in the hybrid clouds, whether it's Google or Amazon, where you're paying egress and ingress costs and it's just so expensive, in addition to this whole idea of data residency or data sovereignty and privacy. It's like, yeah, why not both? Like, if I'm an engineer, I want low latency and I don't want to pay you to transfer this thing to the next rack. I mean, my computer's probably, like, you know, a hundred feet away from my customer's computer. Like, why [laugh] way is that so complicated? So, you know, my view is that the future of data-intensive products will be in this form of where it—like, data planes are actually owned by companies, and then you offer that as a Software as a Service.Corey: One of the things that catches an awful lot of companies with telemetry use cases—or data streaming as another example of that—by surprise when they start building their own cloud-hosted offering is that they're suddenly seeing a lot more cross-AZ data charges than they would have potentially expected. And that's because unlike cross-region or the really expensive version of this with egress, it's a penny in and a penny out per gigabyte in most of AWS regions. Which means that that isn't also bound strictly to an AWS organization. 
So, you have customers co-located with you and you're starting to pay ingress charges on customers throwing their data over to you. And, on some level, the most economical solution for you is well, we're just going to put our listeners somewhere else far away so that we can just have them pay the steep egress fee but then we can just reflect it back to ourselves for free.And that's a terrible pattern, but it's a byproduct of the absolutely byzantine cross-AZ data transfer pricing, in fact, all of the data transfer pricing that is at least AWS tends to present. And it shapes the architectural decisions you make as a result.Alex: You know, as a user, it just didn't make sense. When we launched this product, the number of people that says like, “Why wouldn't your charge for, you know, effectively renting [unintelligible 00:05:14], and giving a markup to your customers?” That's we don't add any value on that, you know? I think people should really just pay us for the value that we create for them. And so, you know, for us competing with other companies is relatively easy.Competing with MSK is it's harder because MSK just has this, you know, muscle where they don't charge you for some particular network traffic between you. And so, it forces companies like us that are trying to be innovative in the data space to, like, put our services in that so that we can actually compete in the market. And so, it's a forcing function of the hybrid clouds having this strong muscle of being able to discount their services in a way that companies just simply don't have access to. And then, you know, it becomes—for the others—latency and sovereignty.Corey: This is the way that effectively all of AWS has first-party offerings of other things go. Replication traffic between AZs is not chargeable. And when I asked them about that, they say, “Oh, yeah. We just price that into the cost of the service.” I don't know that I necessarily buy that because if I try and run this sort of thing on top of EC2, it would cost me more than using their crappy implementation of it, just in data transfer alone for an awful lot of use cases.No third party can touch that level of cost-effectiveness and discounting. It really is probably the clearest example I can think of actual anti-competitive behavior in the market. But it's also complex enough to explain, to, you know, regulators that it doesn't make for exciting exposés and the basis for lawsuits. Yet. Hope springs eternal.Alex: [laugh]. You know—okay, so here is how—if someone is listening to this podcast and is, like, “Okay, well, what can I do?” For us, S3 is the answer. S3 is basically you need to be able to lean in into S3 as a way of replication across [AZ 00:06:56], you need to be able to lean into S3 to read data. And so actually, when I wrote, originally, Redpanda, you know, it's just like this C++ thing using [unintelligible 00:07:04], geared towards super low latency.When we moved it into the cloud, what we realized is, this is cost prohibitive to run either on EBS volumes or local disk. I have to tier all the storage into S3, so that I can use S3's cross-AZ network transfer, which is basically free, to be able to then bring a separate cluster on a different AZ, and then read from the bucket at zero cost. And so, you end up really—like, there are fundamental technical things that you have to do to just be able to compete in a way that's cost-effective for you. 
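To put rough numbers on the incentive Alex is describing, here is a back-of-the-envelope sketch using the one-gigabyte-per-second figure from earlier in the conversation and the roughly penny-in, penny-out cross-AZ rate Corey mentioned. The rates are assumptions for illustration only, and S3 request and storage costs are ignored to keep the comparison simple.

```python
# Back-of-the-envelope sketch of why tiering reads through S3 changes the math.
# Assumed rates for illustration: about $0.01/GB in plus $0.01/GB out for
# cross-AZ traffic, and no per-GB transfer charge for same-region S3 reads.
# S3 request and storage costs are deliberately left out.

GB_PER_SECOND = 1.0                       # workload size from the conversation
SECONDS_PER_MONTH = 60 * 60 * 24 * 30
CROSS_AZ_RATE_PER_GB = 0.01 + 0.01        # a penny in and a penny out

monthly_gb = GB_PER_SECOND * SECONDS_PER_MONTH

direct_cross_az_cost = monthly_gb * CROSS_AZ_RATE_PER_GB
s3_read_transfer_cost = monthly_gb * 0.0  # same-region S3 reads: no per-GB transfer fee

print(f"Monthly volume:           {monthly_gb:,.0f} GB")
print(f"Direct cross-AZ transfer: ${direct_cross_az_cost:,.0f} per month")
print(f"Reads via same-region S3: ${s3_read_transfer_cost:,.0f} per month in transfer fees")
```

At the fourteen-and-a-half gigabytes per second Alex mentions later, the same arithmetic lands well into seven figures a year, which is the pressure that pushes replication and cross-AZ reads through S3 rather than directly between zones.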
And so, in addition to just, like, the muscle that they can enforce on the companies is—it—there are deep implications of what it translates to at the technical level. Like, at the code level.Corey: In the cloud, more than almost anywhere else, it really does become apparent that cost and architecture are fundamentally the same thing. And I have a bit of an advantage here in that I've seen what you do deployed at least one customer of mine. It's fun. When you have a bunch of logos on your site, it's, “Hey, I recognize some of those.” And what I found interesting was the way that multiple people, when I spoke to them, described what it is that you do because some of them talked about it purely as a cost play, but other people were just as enthusiastic about it being a means of improving feature velocity and unlocking capabilities that they didn't otherwise have or couldn't have gotten to without a whole lot of custom work on their part. Which is it? How do you view what it is that you're bringing to market? Is it a cost play or is it a capability story?Alex: From our customer base, I would say 40% is—of our customer base—is about Redpanda enabling them to do things that they simply couldn't do before. An example is, we have, you know, a Fortune 100 company that they basically run their hedge trading strategy on top of Redpanda. And the reason for that is because we give them a five-millisecond average latency with predictable flight latencies, right? And so, for them, that predictability of Redpanda, you know, and sort of like the architecture that came about from trying to invent a new storage engine, allows them to throw away a bunch of in-house, you know, custom-built pub/sub messaging that, you know, basically gave them the same or worse latency. And so, for them, there's that.For others, I think in the IoT space, or if you have flying vehicles around the world, we have some logos that, you know, I just can't mention them. But they have this, like, flying computers around the world and they want to measure that. And so, like, the profile of the footprint, like, the mechanical footprint of being able to run on a single Pthread with a few megs of memory allows these new deployment models that, you know, simply, it's just, it's not possible with the alternatives where let's say you have to have, you know, like, a zookeeper on the schema registry and an HTTP proxy and a broker and all of these things. That simply just, it cannot run on a single Pthread with a few megs of memory, if you put any sort of workload into that. And so, it's like, the computational efficiencies simply enable new things that you couldn't do before. And that's probably 40%. And then the other, it's just… money was really cheap last year [laugh] or the year before and I think now it's less cheap [unintelligible 00:10:08] yeah.Corey: Yeah, I couldn't help but notice that in my own business, too. It turns out that not giving a shit about the AWS bill was a zero-interest-rate phenomenon. Who knew?Alex: [laugh]. Yeah, exactly. And now people [unintelligible 00:10:17], you know, the CIOs in particular, it's like, help. And so, that's really 60%, and our business has boomed since.Corey: Yeah, one thing that I find interesting is that you've been around for only four years. I know that's weird to say ‘only,' but time moves differently in tech. And you've started showing up in some very strange places that I would not have expected. 
You recently—somewhat recently; time is, of course, a flat circle—completed $100 million Series C, and I also saw you in places where I didn't expect to see you in the form of, last week, one of your large competitor's earnings calls, where they were asked by an analyst about an unnamed company that had raised $100 million Series C, and the CEO [unintelligible 00:11:00], “Oh, you're probably talking about Redpanda.” And then they gave an answer that was fine.I mean, no one is going to be on an earnings call and not be prepared for questions like that and to not have an answer ready to go. No one's going to say, “Well, we're doomed if it works,” because I think that businesses are more sophisticated than that. But it was an interesting shout-out in a place where you normally don't see competitors validate that you're doing something interesting by name-checking you.Alex: What was fundamentally interesting for me about that, is that I feel that as an investor, if you're putting you know, 2, 3, 4, or $500 million check into a public position of a company, you want to know, is this money simply going to make returns? That's basically what an investor cares about. And so, the reason for that question is, “Hey, there's a Series C startup company that now has a bunch of these Fortune 2000 logos,” and you know, when we talked to them, like, their customer [unintelligible 00:11:51] phenomena, like, why is that the case? And then, you know, our competitor was forced to name, you know, [laugh] a single win. That's as far as I remember it. We don't know of any additional customers that have switched to that.And so, I think when you have, like, you know, your win rate is above, whatever, 95%, 97% ratio, then I think, you know, they're just sort of forced to answer that. And in a way, I just think that they focus on different things. And for me, it was like, “Okay, developer, hands on keyboard, behind the terminal, how do I make you successful?” And that seems to have worked out enough to be mentioned in the earnings call.Corey: On some level, it's a little bit of a dog-and-pony show. I think that as companies had a certain point of scale, they feel that they need to validate what they're doing to investors at various points—which is always, on some level, of concern—and validate themselves to analysts, both financial—which, okay, whatever—and also, industry analysts, where they come with checklists that they believe is what customers want and is often a little bit off of the mark. But the validation that I think that matters, that actually determines whether or not something has legs is what your customers—you know, people paying you money for a thing—have to say and what they take away from what you're doing. And having seen in a couple of cases now myself, that usage of Redpanda has increased after initial proofs of concept and putting things on to it, I already sort of know the answer to this, but it seems that you also have a vibrant community of boosters for people who are thrilled to use the thing you're selling them.Alex: You know, Jumptraders recently posted that there was a use case in the new stack where they, like, put for the most mission-critical. So, for those of you that listening, Jumptraders is financial company, and they're super technical company. One of, like, the hardest things, they'll probably put your [unintelligible 00:13:35] your product through some of the most rigorous testing [unintelligible 00:13:38]. So, when you start doing some of these logos, it gives confidence. 
And actually, the majority of our developers that we get to partner with, it was really a friend telling a friend, for [laugh] the longest time, my marketing department was super, super small.And then what's been fun, some, like, really different use case was the one I mentioned about on this, like, flying vehicles around the world. They fly both in outer space and in airplanes. That was really fun. And then the large one is when you have workloads at, like, 14-and-a-half gigabytes per second, where the alternative of using something like Kinesis in the case of Lacework—which, you know, they wrote a new stack article about—would be so exorbitantly expensive. And so, in a way, I think that, you know, just trying to make the developers successful, really focusing, honestly, on the person who just has to make things work. We don't—by the time we get to the CIO, really the champion was the engineer who had to build an application. “I was just trying to figure it out the whack-a-mole of trying to debug alternative systems.”Corey: One of the, I think, seductive problems with your entire space is that no one decides day one that they're going to implement a data streaming solution for a very scaled-out, high-traffic site. The early adoption is always a small thing that you're in the process of building. And at that scale at that speed, it just doesn't feel like it's that hard of a problem because scale introduces its own unique series of challenges, but it's often one that people only really find out themselves when the simple thing that works in theory but not in production starts to cause problems internally. I used to work with someone who was a deeply passionate believer in Apache Kafka to a point where it almost became a problem, just because their answer to every problem—it almost didn't matter if it was, “How do we get more coffee this morning?”—Kafka would be the answer for all of it.And that's great, but it turned out, they became one of these people that borderline took on a product or a technology as their identity. So, anything that would potentially take a workload away from that, I got a lot of internal resistance. I'm wondering if you find that you're being brought in to replace existing systems or for completely greenfield stuff. And if the former, are you seeing a lot of internal resistance to people who have built a little niche for themselves?Alex: It's true, the people that have built a career, especially at large banks, were a pretty good fit for, you know, they actually get a team, they got a promotion cycle because they brought this technology and the technology sort of helped them make money. I personally tend to love to talk to these people. And there was a ca—to me, like, technically, let's talk about, like, deeply technical. Let me help you. That obviously doesn't scale because I can't have the same conversation with ten people.So, we do tend to see some of that. Actually, from our customers' standpoint, I would say that the large part of our customer base, you know, if I'm trying to put numbers, maybe 65%, I probably rip and replace of, you know, either upstream Apache Software or private companies or hosted services, et cetera. And so, I think you're right in saying, “Hey, that resistance,” they probably handled the [unintelligible 00:16:38], but what changed in the last year is that the CIO now stepped in and says, “I am going to fire all of you or you have to come up with a $10 million savings. Help me.” [laugh]. 
And so, you know, then really, my job is to help them look like a hero.It's like, “Hey, look, try it tested, benchmark it in your with your own workload, and if it saves you money, then use it.” That's been, you know, to sort of super helpful kind of on the macroeconomic environment. And then the last one is sometimes, you know, you do have to go with a greenfield, right? Like, someone has built a career, they want to gain confidence, they want to ask you questions, they want to trust you that you don't lose data, they want to make sure that you do say the things that you want to say. And so, sometimes it's about building trust and building that relationship.And developers are right. Like, there's a bunch of products out there. Like, why should I trust you? And so, a little easier time, probably now, that you know, with the CIOs wanting to cut costs, and now you have an excuse to go back to the executive team and say, “Look, I made you look smart. We get to [unintelligible 00:17:35], you know, our systems can scale to this.” That's easy. Or the second one is we do, you know, we'll start with some side use case or a greenfield. But both exists, and I would say 65% is probably rip-outs.Corey: One question, I love to, I wouldn't call it ambush, but definitely come up with, the catches some folks by surprise is one of the ways I like to sort out zealots from people who are focused on business problems. Do have an example of a data streaming workload for which Redpanda would not be a great fit?Alex: Yeah. Database-style queries are not a fit. And so, think that there was a streaming engine before there was trying to build a database on top of it, and, like—and probably it does work in some low volume of traffic, like, say 5, 10 megabytes per second, but when you get to actual large scale, it just it doesn't work. And it doesn't work because but what Redpanda is, it gives you two properties as a developer. You can add data to the end or you can truncate the head, right?And so, because those are your only two operations on the log, then you have to build this entire caching level to be able to give this database semantics. And so, do you know, I think for that the future isn't for us to build a database, just as an example, it's really to almost invert it. It's like, hey, what if we make our format an open format like Apache Iceberg and then bring in your favorite database? Like, bring in, you know, Snowflake or Athena or Trina or Spark or [unintelligible 00:18:54] or [unintelligible 00:18:55] or whatever the other [unintelligible 00:18:56] of great databases that are better than we are, and doing, you know, just MPP, right, like a massively parallelizable database, do that, and then the job for us, for [unintelligible 00:19:05], let me just structure your log in a way that allows you to query, right? And so, for us, when we announced the $100 million dollar Series C funding, it's like, I'm going to put the data in an iceberg format so you can go and query it with the other ten databases. And there are a better job than we are at that than we are.Corey: It's frankly, refreshing to see a vendor that knows where, okay, this is where we start and this is where we stop because it just seems that there's been an industry-wide push for a while now to oh, you built a component in a larger system that works super well. Now, expand to do everything else in the architectural diagram. 
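Because Redpanda exposes a Kafka-compatible API, the two primitives Alex describes, appending to the end of the log and truncating from the head, show up to developers as ordinary produce and consume calls plus a retention policy on the topic. Below is a hedged sketch using the kafka-python client; the broker address, topic name, and event shape are placeholders invented for the example.

```python
# Sketch of the append-only log model through a Kafka-compatible client
# (kafka-python). Broker address, topic name, and the event payload are
# placeholders. Truncating the head is normally expressed as a retention
# policy on the topic rather than as an explicit call from this client.
import json
from kafka import KafkaProducer, KafkaConsumer

BROKER = "localhost:9092"   # assumed Redpanda/Kafka endpoint
TOPIC = "market-events"     # placeholder topic name

# Append records to the end of the log.
producer = KafkaProducer(
    bootstrap_servers=BROKER,
    value_serializer=lambda event: json.dumps(event).encode("utf-8"),
)
producer.send(TOPIC, {"symbol": "XYZ", "price": 101.25})
producer.flush()

# Read the log back in order; consumers only move forward through offsets.
consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=BROKER,
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,
)
for record in consumer:
    print(record.offset, json.loads(record.value))
```

Anything that looks like a database query over that log, in this framing, belongs to a separate engine reading the same data, which is the role the Apache Iceberg work is meant to play.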
And you suddenly have databases trying to be network transport layers and queues trying to be data warehouses, and it just doesn't work that way. It just it feels like oh, this is a terrible approach to solving this particular problem. And what's worse, from my mind, is that people who hadn't heard of you before look at you through this lens that does not put you in your best light, and, “Oh, this is a terrible database.” Well, it's not supposed to be one.Alex: [laugh].Corey: But it also—it puts them off as a result. Have you faced pressure to expand beyond your core competency from either investors or customers or analysts or, I don't know, the voices late at night that I hear and I assume everyone else does, too?Alex: Exactly. The 3 a.m. voice that I have to take my phone and take a voice note because it's like, I don't want to lose this idea. Totally. For us. I think there's pressures, like, hey, you built this great engine. Why don't you add, like, the latest, you know, soup de jour in systems was like a vector database.I was like, “This doesn't even make any sense.” For me, it's, I want to do one thing really well. And I generally call it internally, ‘the ring zero.' It's, if you think of the internet, right, like, as a computer, especially with this mode to what we talked about earlier in a BYOC, like, we could be the best ring zero, the best sort of like, you know, messaging platform for people to build real-time applications. And then that's the case and there's just so much low-hanging fruit for us.Like, the developer experience wasn't great for other systems, like, why don't we focus on the last mile, like, making that developer, you know, successful at doing this one thing as opposed to be an average and a bunch of other a hundred products? And until we feel, honestly, that we've done a phenomenal job at that—I think we still have some roadmap to get there—I don't want to expand. And, like, if there's pressure, my answer is, like… look, the market is big enough. We don't have to do it. We're still, you know, growing.I think it's obviously not trivial and I'm kind of trivializing a bunch of problems from a business perspective. I'm not trying to degrade anyone else. But for us, it's just being focused. This is what we do well. And bring every other technology that makes you successful. I don't really care. I just want to make this part well.Corey: I think that that is something that's under-appreciated. I feel like I should get over at one point to something that's been nagging at the back of my mind. Some would call it a personal attack and I suppose I'll let them, but what I find interesting is your background. Historically, you were a distributed systems engineer at very large scale. And you apparently wrote the first version of Redpanda yourself in—was it C or C++?Alex: C++.Corey: Yeah. And now you are the CEO of a company that is clearly doing very well. Have you gotten the hell out of production yet? The reason I ask this is I have worked in a number of companies where the founder was also the initial engineer and then they invariably treated main as their feature branch and the rest of us all had to work around them to keep them from, you know, destroying everything we were trying to build around us, due to missing context. In other words, how annoyed with you are your engineers on any given afternoon?Alex: [laugh]. Yeah. I would say that as a company builder now, if I may say that, is the team is probably the thing I'm the most proud of. 
They're just so talented, such good [unintelligible 00:22:47] of humans. And so—group of humans—I stopped coding about two years ago, roughly.So, the company is four-and-a-half years old, really the first two-and-a-half years old, the first one, two years, definitely, I was personally putting in, like, tons and tons of hours working on the code. It was a ton of fun. To me, one of the most rewarding technical projects I've ever had a chance to do. I still read pull requests, though, just so that when I have a conversation with a technical leader, I don't be, like, I have no clue how the transactions work. So, I still have to read the code, but I don't write any more code and my heart was a little broken when my dev prod team removed my write access to the GitHub repo.We got SOC2 compliance, and they're like, “You can't have access to being an admin on Google domains, and you're no longer able to write into main.” And so, I think as a—I don't know, maybe my identity—myself identity is that of a builder, and I think as long as I personally feel like I'm building, today, it's not code, but you know, is the company and [unintelligible 00:23:41] sort of culture, then I feel okay [laugh]. But yeah, I no longer write code. And the last story on that, is this—an engineer of ours, his name is [Stefan 00:23:51], he's like, “Hey, so Alex wrote this semaphore”—this was actually two days ago—and so they posted a video, and I commented, I was like, “Hey, this was the context of semaphore. I'm sorry for this bug I caused.” But yeah, at least I still remember some context for them.Corey: What's fun is watching things continue to outpace and outgrow you. I mean, one of the hard parts of building a company is the realization that every person you hire for a thing that's now getting off of your plate is better at that thing than you are. It's a constant experience of being humbled. And at some point, things wind up outpacing you to the point where, at least in my case, I've been on calls with customers and I explained how we did some things and how it worked and had to be corrected by my team of, “Well. That used to be true, however…” like, “Oh, dear Lord. I'm falling behind.” And that's always been a weird feeling for me.Alex: Totally. You know, it's the feeling of being—before I think I became a CEO, I was a highly comped  engineer and did a competent, to the extent that it allowed me to build this product. And then you start doing all of these things and you're incompetent, obviously, by definition because you haven't done those things and so there's like that discomfort [laugh]. But I have to get it done because no one else wants to do, whatever, like say, like, you know, rev ops or marketing or whatever.And then you find somebody who's great and you're like, oh my God, I was like, I was so poor tactically at doing this thing. And it's definitely humbling every day. And it's almost it's, like, gosh, you're just—this year was kind of this role where you're just, like, mediocre at, like, a whole lot of things as a company, but you're the only person that has to do the job because you have the context and you just have to go and do it. And so, it's definitely humbling. And in some ways, I'm learning, so for me today, it's still a lot of fun to learn.Corey: This is a little more in the weeds, I suppose, but I always love to ask people these questions. Because I used to be naive, which meant that I had hope and I saw a brighter future in technology. I now know that was all a lie. 
But I used to believe that out there was some company whose internal infrastructure for what they'd built was glorious and it would be amazing. And I knew I would never work there, nor what I want to, because when everything's running perfectly, all I can really do is mess that up; there's no way to win and a bunch of ways to lose.But I found that place doesn't exist. Every time I talk to someone about how they built the thing that they built and I ask them, “If you were starting over from scratch, what would you do differently?” The answer often distills down to, “Oh, everything.” Because it's an organically evolving system that oh, yeah, everything's easier the second time. At least you get to find new failure modes go in that way. When you look back at how you designed it originally, are there any missteps that you could have saved yourself a whole lot of grief by not making the first time?Alex: Gosh, so many things. But if I were to give Hollywood highlights on these things, something that [unintelligible 00:26:35] is, does well is exposing these high-level data types of, like, streams, and lists and maps and et cetera. And I was like, “Well, why couldn't streams offer this as a first-class citizen?” And we got some things well which I think would still do, like the whole [thread recorder 00:26:49] could—like, the fundamentals of the engine I will still do the same. But, you know, exposing new programming models earlier in the life of the product, I think would have allowed us to capture even more wildly different use cases.But now we kind of have this production engine, we have to support Fortune 2000, so you know, it's kind of like a very delicate evolution of the product. Definitely would have changed—I would have added, like, custom data types upfront, I would have pushed a little harder on I think WebAssembly than we did originally. Man, I could just go on for—like, [added detail 00:27:21], I would definitely have changed things. Like, I would have pressed on the first—on the version of the cloud that we talked about early on, that as the first deployment mode. If we go back through the stack of all of the products you had, it's funny, like, 11 products that are surfaced to the customers to, like, business lines, I would change fundamental things about just [laugh], you know, everything else. I think that's maybe the curse of the expert. Like, you know, you could always find improvements.Corey: Oh, always. I still look back at my career before starting this place when I was working in a bunch of finance companies, and—I'll never forget this; it was over a decade ago—we were building out our architecture in AWS, and doing a deal with a large finance company. And they said, “Cool, where's your data center?” And I said, “Oh, it's AWS.” And they said, “Ha ha ha ha. Where's your data center?”And that was oh, okay, great. Now, it feels like if that's their reaction, they have not kept pace with the times. It feels it is easier to go to a lot of very serious enterprises with very serious businesses and serious workload concerns attendant to those and not get laughed out of the room because you didn't wind up doing a multi-million dollar data center build out that, with an eye toward making it look as enterprise-y as possible.Alex: Yeah. Okay, so here's, I think, maybe something a little bit controversial. I think that's true. People are moving to the cloud, and I don't think that that idea, especially when we go when we talk to banks, is true. 
They're like, “Hey, I have this contract with one of the hybrid clouds.”—you know, it's usually with two of them, and then you're like—“This is my workload. I want to spend $70 million or $100 million. Who could give me the biggest discount?” And then you kind of shop it around.But what we are seeing is that effectively, the data transfer costs are so expensive and running this for so much this large volume of traffic is still so, so expensive, that there is an inverse [unintelligible 00:29:09] to host from some category of the workload where you don't have dynamism. Actually hosted in your data center is, like, a huge boom in terms of cost efficiencies for the companies, especially where we are and especially in finances—you mentioned that—if you're trying to trade and you have this, like, steady state line from nine to five, whatever, eight to four, whenever the markets open, it's actually relatively cost-efficient because you can measure hey, look, you know, the New York Stock Exchange is 1.5 gigabytes per second at market close. Like, I could provision my hardware to beat this. And like, it'll be that I don't need this dynamism that the cloud gives me.And so yeah, it's kind of fascinating that for us because we offered the self-hosted Redpanda which can adapt to super low latencies with kernel parameter tuning, and the cloud due to the tiered storage, we talked about S3 being [unintelligible 00:29:52] to, so it's been really fun to participate in deployments where we have both. And you couldn't—they couldn't look more different. I mean, it's almost looks like two companies.Corey: One last question before we wind up calling it an episode. I think I saw something fly by on Twitter a while back as I slowly returned to the platform—no, I'm not calling it X—something you're doing involving a scholarship. Can you tell me a bit more about that?Alex: Yeah. So, you know, I'm a Latino CEO, first generation in the States, and some of the things that I felt really frustrated with, growing up that, like, I feel fortunate because I got to [unintelligible 00:30:25] that is that, you know, people were just—that look like me are probably given some bullshit QA jobs, so like, you know, behemoth job, I think, for a bank. And so, I wanted to change that. And so, we give money and mentorship to people and we release all of the intellectual property. And so, we mentor someone—actually, anyone from underrepresented backgrounds—for three months.We give then, like, 1200 bucks a month—or 1500, I can't remember—mentorship from our top principal level engineers that have worked at Amazon and Google and Facebook and basically the world's top companies. And so, they meet with them one hour a week, we give them money, they could sit in the couch if they want to. No one has to [unintelligible 00:31:06]. And all we're trying to do is, like, “Hey, if you are part of this group, go and try to build something super hard.” [laugh].And often their minds, which is great, and they're like, “I want to build an OpenAI competitor in three months, and here's the week-by-week progress.” Or, “I want to build a new storage engine, new database in three months.” And that's the kind of people that we want to help, these like, super ambitious, that just hasn't had a chance to be mentored by some of the world's best engineers. And I just want to help them. Like, we—this is a non-scalable project. I meet with them once a week. 
I don't want to have a team of, like, ten people.Like, to me, I feel like their most valuable thing I could do is to give them my time and to help them mentor. I was like, “Hey, let's think about this problem. Let's decompose this. How do you think about this?” And then bring you the best engineers that I, you know, that work for—with me, and let me help you think about problems differently and give you some money.And we just don't care how you use the time or the money; we just want people to work on hard problems. So, it's active. It runs once a year, and if anyone is listening to this, if you want to send it to your friends, we'd love to have that application. It's for anyone in the world, too, as long as we can send the person a check [laugh]. You know, my head of finance is not going to walk to a Moneygram—which we have done in the past—but other than that, as long as you have a bank account that we can send the check to, you should be able to apply.Corey: That is a compelling offer, particularly in the current macro environment that we find ourselves faced in. We'll definitely put a link to that into the [show notes 00:32:32]. I really want to thank you for taking the time to, I guess, get me up to speed on what it is you're doing. If people want to learn more where's the best place for them to go?Alex: On Twitter, my handle is @emaxerrno, which stands for the largest error in the kernel. I felt like that was apt for my handle. So, that's one. Feel free to find me on the community Slack. There's a Slack button on the website redpanda.com on the top right. I'm always there if you want to DM me. Feel free to stop by. And yeah, thanks for having me. This was a lot of fun.Corey: Likewise. I look forward to the next time. Alex Gallego, CEO and founder at Redpanda. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that I will almost certainly never read because they have not figured out how to get data from one place to another.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Reflecting on a Legendary Tech Career with Kelsey Hightower

Screaming in the Cloud

Play Episode Listen Later Aug 29, 2023 43:01


Kelsey Hightower joins Corey on Screaming in the Cloud to discuss his reflections on how the tech industry is progressing. Kelsey describes what he's been getting out of retirement so far, and reflects on what he learned throughout his high-profile career - including why feature sprawl is such a driving force behind the complexity of the cloud environment and the tactics he used to create demos that are engaging for the audience. Corey and Kelsey also discuss the importance of remaining authentic throughout your career, and what it means to truly have an authentic voice in tech. About KelseyKelsey Hightower is a former Distinguished Engineer at Google Cloud, the co-chair of KubeCon, the world's premier Kubernetes conference, and an open source enthusiast. He's also the co-author of Kubernetes Up & Running: Dive into the Future of Infrastructure. Recently, Kelsey announced his retirement after a 25-year career in tech.Links Referenced:Twitter: https://twitter.com/kelseyhightower TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Do you wish there were cheat codes for database optimization? Well, there are – no seriously. If you're using Postgres or MySQL on Amazon Aurora or RDS, OtterTune uses AI to automatically optimize your knobs and indexes and queries and other bits and bobs in databases. OtterTune applies optimal settings and recommendations in the background or surfaces them to you and allows you to do it. The best part is that there's no cost to try it. Get a free, thirty-day trial to take it for a test drive. Go to ottertune dot com to learn more. That's O-T-T-E-R-T-U-N-E dot com.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. You know, there's a great story from the Bible or Torah—Old Testament, regardless—that I was always a big fan of where you wind up with the Israelites walking the desert for 40 years in order to figure out what comes next. And Moses led them but could never enter into what came next. Honestly, I feel like my entire life is sort of going to be that direction. Not the biblical aspects, but rather always wondering what's on the other side of a door that I can never cross, and that door is retirement. Today I'm having returning guest Kelsey Hightower, who is no longer at Google. In fact, is no longer working and has joined the ranks of the gloriously retired. Welcome back, and what's it like?Kelsey: I'm happy to be here. I think retirement is just like work in some ways: you have to learn how to do it. A lot of people have no practice in their adult life what to do with all of their time. We have small dabs in it, like, you get the weekend off, depending on what your work, but you never have enough time to kind of unwind and get into something else. So, I'm being honest with myself. It's going to be a learning curve, what to do with that much time.You're probably still going to do work, but it's going to be a different type of work than you're used to. And so, that's where I am. 
30 days into this, I'm in that learning mode, I'm on-the-job training.Corey: What's harder than you expected?Kelsey: It's not the hard part because I think mentally I've been preparing for, like, the last ten years, being a minimalist, learning how to kind of live within my means, learn to appreciate things that are just not work-related or status symbols. And so, to me, it felt like a smooth transition because I started to value my time more than anything else, right? Just waking up the next day became valuable to me. Spending time in the moment, right, you go to these conferences, there's, like, 10,000 people, but you learn to value those one-on-one encounters, those one-off, kind of, let's just go grab lunch situations. So, to me, retirement just makes more room for that, right? I no longer have this calendar that is super full, so I think for me, it was a nice transition in terms of getting more of that valuable time back.Corey: It seems to me that you're in a similar position to the one that I find myself in where the job that you were doing and I still am is tied, more or less, to a sense of identity as opposed to a particular task or particular role that you fill. You were Kelsey Hightower. That was a complete sentence. People didn't necessarily need to hear the rest of what you were working on or what you were going to be talking about at a given conference or whatnot. So, it seemed, at least from the outside, that an awful lot of what you did was quite simply who you were. Do you feel that your sense of identity has changed?Kelsey: So, I think when you have that much influence, when you have that much reputation, the words you say travel further, they tend to come with a little bit more respect, and so when you're working with a team on new product, and you say, “Hey, I think we should change some things.” And when they hear those words coming from someone that they trust or has a name that is attached to reputation, you tend to be able to make a lot of impact with very few words. But what you also find is that no matter what you get involved in—configuration management, distributed systems, serverless, working with customers—it all is helped and aided by the reputation that you bring into that line of work. And so yes, who you are matters, but one thing that I think helped me, kind of greatly, people are paying attention maybe to the last eight years of my career: containers, Kubernetes, but my career stretches back to the converting COBOL into Python days; the dawn of DevOps, Puppet, Chef, and Ansible; the Golang appearance and every tool being rewritten from Ruby to Golang; the Docker era.And so, my identity has stayed with me throughout those transitions. And so, it was very easy for me to walk away from that thing because I've done it three or four times before in the past, so I know who I am. I've never had, like, a Twitter bio that said, “Company X. X person from company X.” I've learned long ago to just decouple who I am from my current employer because that is always subject to change.Corey: I was fortunate enough to not find myself in the public eye until I owned my own company. But I definitely remember times in my previous incarnations where I was, “Oh, today I'm working at this company,” and I believed—usually inaccurately—that this was it. This was where I really found my niche. And then surprise I'm not there anymore six months later for, either their decision, my decision, or mutual agreement. 
And I was always hesitant about hanging a shingle out that was tied too tightly to any one employer.Even now, I was little worried about doing it when I went independent, just because well, what if it doesn't work? Well, what if, on some level? I think that there's an authenticity that you can bring with you—and you certainly have—where, for a long time now, whenever you say something, I take it seriously, and a lot of people do. It's not that you're unassailably correct, but I've never known you to say something you did not authentically believe in. And that is an opinion that is very broadly shared in this industry. So, if nothing else, you definitely were a terrific object lesson in speaking the truth, as you saw it.Kelsey: I think what you describe is one way that, whether you're an engineer doing QA, working in the sales department, when you can be honest with the team you're working with, when you can be honest with the customers you're selling into when you can be honest with the community you're part of, that's where the authenticity gets built, right? Companies, sometimes on the surface, you believe that they just want you to walk the party line, you know, they give you the lines and you just read them verbatim and you're doing your part. To be honest, you can do that with the website. You can do that with a well-placed ad in the search queries.What people are actually looking for are real people with real experiences, sharing not just fact, but I think when you mix kind of fact and opinion, you get this level of authenticity that you can't get just by pure strategic marketing. And so, having that leverage, I remember back in the day, people used to say, “I'm going to do the right thing and if it gets me fired, then that's just the way it's going to be. I don't want to go around doing the wrong thing because I'm scared I'm going to lose my job.” You want to find yourself in that situation where doing the right thing, is also the best thing for the company, and that's very rare, so when I've either had that opportunity or I've tried to create that opportunity and move from there.Corey: It resonates and it shows. I have never had a lot of respect for people who effectively are saying one thing today and another thing the next week based upon which way they think that the winds are blowing. But there's also something to be said for being able and willing to publicly recant things you have said previously as technology evolves, as your perspective evolves and, in light of new information, I'm now going to change my perspective on something. I've done that already with multi-cloud, for example. I thought it was ridiculous when I heard about it. But there are also expressions of it that basically every company is using, including my own. And it's a nuanced area. Where I find it challenging is when you see a lot of these perspectives that people are espousing that just so happen to deeply align with where their paycheck comes from any given week. That doesn't ring quite as true to me.Kelsey: Yeah, most companies actually don't know how to deal with it either. And now there has been times at any number of companies where my authentic opinion that I put out there is against party line. And you get those emails from directors and VPs. Like, “Hey, I thought we all agree to think this way or to at least say this.” And that's where you have to kind of have that moment of clarity and say, “Listen, that is undeniably wrong. 
It's so wrong in fact that if you say this in public, whether a small setting or large setting, you are going to instantly lose credibility going forward for yourself. Forget the company for a moment. There's going to be a situation where you will no longer be effective in your job because all of your authenticity is now gone. And so, what I'm trying to do and tell you is don't do that. You're better off saying nothing.”But if you go out there, and you're telling what is obviously misinformation or isn't accurate, people are not dumb. They're going to see through it and you will be classified as a person not to listen to. And so, I think a lot of people struggle with that because they believe that enterprise's consensus should also be theirs.Corey: An argument that I made—we'll call it a prediction—four-and-a-half years ago, was that in five years, nobody would really care about Kubernetes. And people misunderstood that initially, and I've clarified since repeatedly that I'm not suggesting it's going away: “Oh, turns out that was just a ridiculous fever dream and we're all going back to running bare metal with our hands again,” but rather that it would slip below the surface-level of awareness. And I don't know that I got the timing quite right on that, I think it's going to depend on the company and the culture that you find yourself in. But increasingly, when there's an application to run, it's easy to ask someone just, “Oh, great. Where's the Kubernetes cluster live so we can throw this on there and just add it to the rest of the pile?”That is sort of what I was seeing. My intention with that was not purely just to be controversial, as much fun as that might be, but also to act as a bit of a warning, where I've known too many people who let their identities become inextricably tangled with the technology. But technologies rise and fall, and at some point—like, you talk about configuration management days; I learned to speak publicly as a traveling trainer for Puppet. I wrote part of SaltStack once upon a time. But it was clear that that was not the direction the industry was going, so it was time to find something else to focus on. And I fear for people who don't keep an awareness or their feet underneath them and pay attention to broader market trends.Kelsey: Yeah, I think whenever I was personally caught up in linking my identity to technology, like, “I'm a Rubyist,” right?“, I'm a Puppeteer,” and you wear those names proudly. But I remember just thinking to myself, like, “You have to take a step back. What's more important, you or the technology?” And at some point, I realized, like, it's me, that is more important, right? Like, my independent thinking on this, my independent experience with this is far more important than the success of this thing.But also, I think there's a component there. Like when you talked about Kubernetes, you know, maybe being less relevant in five years, there's two things there. One is the success of all infrastructure things equals irrelevancy. When flights don't crash, when bridges just work, you do not think about them. You just use them because they're so stable and they become very boring. That is the success criteria.Corey: Utilities. No one's wondering if the faucet's going to work when they turn it on in the morning.Kelsey: Yeah. So, you know, there's a couple of ways to look at your statement. One is, you believe Kubernetes is on the trajectory that it's going to stabilize itself and hit that success criteria, and then it will be irrelevant. 
Or there's another part of the irrelevancy where something else comes along and replaces that thing, right? I think Cloud Foundry and Mesos are two good examples of Kubernetes coming along and stealing all of the attention from that because those particular products never gained that mass adoption. Maybe they got to the stable part, but they never got to the mass adoption part. So, I think when it comes to infrastructure, it's going to be irrelevant. It's just what side of that [laugh] coin do you land on?Corey: It's similar to folks who used to have to work at a variety of different companies on very specific Linux kernel subsystems because everyone had to care because there were significant performance impacts. Time went on and now there's still a few of those people that very much need to care, but for the rest of us, it is below the level of things that we have to care about. For me, the signs of the unsustainability were, oh, you can run Kubernetes effectively in production? That's a minimum of a quarter-million dollars a year in comp or up in some cases. Not every company is going to be able to field a team of those people and still remain a going concern in business. Nor frankly, should they have to.Kelsey: I'm going to pull on that thread a little bit because it's about—we're hitting that ten-year mark of Kubernetes. So, when Kubernetes comes out, why were people drawn to it, right? Why did it even get the time of day to begin with? And I think Docker kind of opened Pandora's box there. This idea of Chef, Puppet, Ansible, ten thousand package managers, and honestly, that trajectory was going to continue forever and it was helping no one. It was literally people doing duplicate work depending on the operating system you're dealing with and we were wasting time copying bits to servers—literally—in a very glorified way.So, Docker comes along and gives us this nicer, better abstraction, but it has gaps. It has no orchestration. It's literally this thing where now we've unified the packaging situation, we've learned a lot from Red Hat, YUM, Debian, and the various package repo combinations out there and so we made this universal thing. Great. We also learned a little bit about orchestration through brute force, bash scripts, config management, you name it, and so we serialized that all into this thing we call Kubernetes.It's pretty simple on the surface, but it was probably never worthy of such fanfare, right? But I think a lot of people were relieved that now we finally commoditized this expertise that the Googles, the Facebooks of the world had, right, building these systems that can copy bits to other systems very fast. There you go. We've gotten that piece. But I think what the market actually wants is in the mobile space, if you want to ship software to 300 million people that you don't even know, you can do it with the app store.There's this appetite that the boring stuff should be easy. Let's Encrypt has made SSL certificates beyond easy. It's just so easy to do the right thing. And I think for this problem we call deployments—you know, shipping apps around—at some point we have to get to a point where that is just crazy easy. 
And it still isn't.So, I think some of the frustration people express ten years later, they're realizing that they're trying to recreate a Rube Goldberg machine with Kubernetes is the base element and we still haven't understood that this whole thing needs to simplify, not ten thousand new pieces so you can build your own adventure.Corey: It's the idea almost of what I'm seeing AWS go through, and to some extent, its large competitors. But building anything on top of AWS from scratch these days is still reminiscent of going to Home Depot—or any hardware store—and walking up and down the aisles and getting all the different components to piece together what you want. Sometimes just want to buy something from Target that's already assembled and you have to do all of that work. I'm not saying there isn't value to having a Home Depot down the street, but it's also not the panacea that solves for all use cases. An awful lot of customers just want to get the job done and I feel that if we cling too tightly to how things used to be, we lose it.Kelsey: I'm going to tell you, being in the cloud business for almost eight years, it's the customers that create this. Now, I'm not blaming the customer, but when you start dealing with thousands of customers with tons of money, you end up in a very different situation. You can have one customer willing to pay you a billion dollars a year and they will dictate things that apply to no one else. “We want this particular set of features that only we will use.” And for a billion bucks a year times ten years, it's probably worth from a business standpoint to add that feature.Now, do this times 500 customers, each major provider. What you end up with is a cloud console that is unbearable, right? Because they also want these things to be first-class citizens. There's always smaller companies trying to mimic larger peers in their segment that you just end up in that chaos machine of unbound features forever. I don't know how to stop it. Unless you really come out maybe more Apple style and you tell people, “This is the one and only true way to do things and if you don't like it, you have to go find an alternative.” The cloud business, I think, still deals with the, “If you have a large payment, we will build it.”Corey: I think that that is a perspective that is not appreciated until you've been in the position of watching how large enterprises really interact with each other. Because it's, “Well, what customer the world is asking for yet another way to run containers?” “Uh, this specific one and their constraints are valid.” Every time I think I've seen everything there is to see in the world of cloud, I just have to go talk to one more customer and I'm learning something new. It's inevitable.I just wish that there was a better way to explain some of this to newcomers, when they're looking at, “Oh, I'm going to learn how this cloud thing works. Oh, my stars, look at how many services there are.” And then they wind up getting lost with analysis paralysis, and every time they get started and ask someone for help, they're pushed in a completely different direction and you keep spinning your wheels getting told to start over time and time again when any of these things can be made to work. But getting there is often harder than it really should be.Kelsey: Yeah. I mean, I think a lot of people don't realize how far you can get with, like, three VMs, a load balancer, and Postgres. 
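To make that "three VMs, a load balancer, and Postgres" blueprint concrete: it is small enough to sketch. What follows is a rough, hypothetical illustration in boto3; the AMI, subnet, security group, and VPC IDs, the resource names, and the password are all placeholders, and networking, IAM, and error handling are omitted entirely. It is meant to show how little the blueprint actually is, not to be copied into production.

```python
import boto3

ec2 = boto3.client("ec2")
elbv2 = boto3.client("elbv2")
rds = boto3.client("rds")

# The app tier: three identical servers (placeholder AMI/subnet/SG IDs).
app_servers = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",
    InstanceType="t3.small",
    MinCount=3,
    MaxCount=3,
    SubnetId="subnet-11111111",
    SecurityGroupIds=["sg-22222222"],
)["Instances"]

# A load balancer in front of them.
lb = elbv2.create_load_balancer(
    Name="app-lb",
    Subnets=["subnet-11111111", "subnet-33333333"],
    Type="application",
)["LoadBalancers"][0]

tg = elbv2.create_target_group(
    Name="app-tg",
    Protocol="HTTP",
    Port=8080,
    VpcId="vpc-44444444",
    TargetType="instance",
)["TargetGroups"][0]

elbv2.register_targets(
    TargetGroupArn=tg["TargetGroupArn"],
    Targets=[{"Id": i["InstanceId"]} for i in app_servers],
)

elbv2.create_listener(
    LoadBalancerArn=lb["LoadBalancerArn"],
    Protocol="HTTP",
    Port=80,
    DefaultActions=[{"Type": "forward", "TargetGroupArn": tg["TargetGroupArn"]}],
)

# The data tier: a managed Postgres instance.
rds.create_db_instance(
    DBInstanceIdentifier="app-db",
    DBInstanceClass="db.t3.medium",
    Engine="postgres",
    AllocatedStorage=50,
    MasterUsername="app",
    MasterUserPassword="use-a-secrets-manager-instead",  # placeholder only
)
```

Swap in real networking and secrets handling and that is the boring, well-understood baseline being described here.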
My guess is you can probably build pretty much any clone of any service we use today with at least 1 million customers. Most people never reached that level—I don't even want to say the word scale—but that blueprint is there and most people will probably be better served by that level of simplicity than trying to mimic the behaviors of large customers—or large companies—with these elaborate use cases. I don't think they understand the context there. A lot of that stuff is baggage. It's not [laugh] even, like, best-of-breed or great design. It's like happenstance from 20 years of trying to buy everything that's been sold to you. Corey: I agree with that idea wholeheartedly. I was surprising someone the other day when I said that if you were to give me a task of getting some random application up and running by tomorrow, I'd do a traditional three-tier architecture: some virtual machines, a load balancer, and a database service. And is that the way that all the cool kids are doing it today? Well, they're not talking about it, but mostly. But the point is that it's what I know, it's where my background is, and the thing you already know when you're trying to solve a new problem is incredibly helpful, rather than trying to learn everything along that new path that you're forging down. Is that architecture the best approach? No, but it's perfectly sufficient for an awful lot of stuff. Kelsey: Yeah. And so, I mean, look, I've benefited my whole career from people fantasizing about [laugh] infrastructure— Corey: [laugh]. Kelsey: And the truth is that in 2023, this stuff is so powerful that you can do almost anything you want to do with the simplest architecture that's available to us. The three-tier architecture has actually gotten better over the years. I think people have forgotten: CPUs are faster, RAM comes in much bigger quantities, the networks are faster, right, these databases can store more data than ever. It's so good to learn the fundamentals, start there, and worst case, you have a sound architecture people can reason about, and then you can go jump into the deep end, once you learn how to swim. Corey: I think that people would be depressed to understand just how much the common case for the value that Kubernetes brings is, "Oh yeah, now we can lose a drive or a server and the application stays up." It feels like it's a bit overkill for that one somewhat paltry use case, but that problem has been hounding companies for decades. Kelsey: Yeah, I think at some point, the whole 'SSH is my only interface into these kinds of systems,' that's a little low level, that's a little bare bones, and there will probably be a future where we start to have this: not Infrastructure as Code, not cloud where we put infrastructure behind APIs and you pay per use, but I think what Kubernetes hints at is a future where you have APIs that do something. Right now the APIs give you pieces so you can assemble things. In the future, the APIs will just do something: "Run this app. I need it to be available and here's my money budget, my security budget, and reliability budget." And then that thing will say, "Okay, we know how to do that, and here's roughly what it is going to cost." And I think that's what people actually want because that's how requests actually come down from humans, right? We say, "We want this app or this game to be played by millions of people from Australia to New York." And then for a person with experience, that means something.
You kind of know what architecture you need for that, you know what pieces that need to go there. So, we're just moving into a realm where we're going to have APIs that do things all of a sudden.And so, Kubernetes is the warm-up to that era. And that's why I think that transition is a little rough because it leaks the pieces part, so where you can kind of build all the pieces that you want. But we know what's coming. Serverless also hints at this. But that's what people should be looking for: APIs that actually do something.Corey: This episode is sponsored in part by Panoptica.  Panoptica simplifies container deployment, monitoring, and security, protecting the entire application stack from build to runtime. Scalable across clusters and multi-cloud environments, Panoptica secures containers, serverless APIs, and Kubernetes with a unified view, reducing operational complexity and promoting collaboration by integrating with commonly used developer, SRE, and SecOps tools. Panoptica ensures compliance with regulatory mandates and CIS benchmarks for best practice conformity. Privacy teams can monitor API traffic and identify sensitive data, while identifying open-source components vulnerable to attacks that require patching. Proactively addressing security issues with Panoptica allows businesses to focus on mitigating critical risks and protecting their interests. Learn more about Panoptica today at panoptica.app.Corey: You started the show by talking about how your career began with translating COBOL into Python. I firmly believe someone starting their career today listening to this could absolutely find that by the time their career starts drawing to their own close, that Kubernetes is right in there as far as sounding like the deprecated thing that no one really talks about or thinks about anymore. And I hope so. I want the future to be brighter than the past. I want getting a business or getting software together in a way that helps people to not require the amount of, “First, spend six weeks at a boot camp,” or, “Learn how to write just enough code that you can wind up getting funding and then have it torn apart.”What's the drag-and-drop story? What's the describe the application to a robot and it builds it for you? I'm optimistic about the future of infrastructure, just because based upon its power to potentially make reliability and scale available to folks who have no idea of what's involved with that. That's kind of the point. That's the end game of having won this space.Kelsey: Well, you know what? Kubernetes is providing the metadata to make that possible, right? Like in the early days, people were writing one-off scripts or, you know, writing little for loops to get things in the right place. And then we get config management that kind of formalizes that, but it still had no metadata, right? You'd have things like Puppet report information.But in the world of, like, Kubernetes, or any cloud provider, now you get semantic meaning. “This app needs this volume with this much space with this much memory, I need three of these behind this load balancer with these protocols enabled.” There is now so much metadata about applications, their life cycles, and how they work that if you were to design a new system, you can actually use that data to craft a much better API that made a lot of this boilerplate the defaults. Oh, that's a web application. You do not need to specify all of this boilerplate. 
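To ground the "semantic metadata" idea being described here: the sketch below is purely illustrative (the function name, image URL, and defaults are made up), but it shows how a few high-level facts about a web app can expand into a full Kubernetes Deployment manifest, with everything else defaulted away as boilerplate. kubectl accepts JSON as well as YAML, so Python's standard library is enough.

```python
import json

def web_app(name: str, image: str, replicas: int = 3, memory: str = "256Mi") -> dict:
    """Expand a handful of high-level facts into a Kubernetes Deployment manifest.

    Everything below the function arguments is the boilerplate a higher-level
    "API that does something" could own on the caller's behalf.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,  # enough copies to survive losing a node or a disk
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": 8080}],
                            "resources": {"requests": {"memory": memory}},
                        }
                    ]
                },
            },
        },
    }

if __name__ == "__main__":
    # Pipe this into `kubectl apply -f -` to try it against a cluster.
    print(json.dumps(web_app("storefront", "registry.example.com/storefront:1.4.2"), indent=2))
```

The interesting part is how little the caller supplies: a name, an image, a replica count, a memory request. The rest is defaults, which is roughly what "better nouns and verbs" implies.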
Now, we can give you much better nouns and verbs to describe what needs to happen.So, I think this is that transition as all the new people coming up, they're going to be dealing with semantic meaning to infrastructure, where we were dealing with, like, tribal knowledge and intuition, right? “Run this script, pipe it to this thing, and then this should happen. And if it doesn't, run the script again with this flag.” Versus, “Oh, here's the semantic meaning to a working system.” That's a game-changer.Corey: One other topic I wanted to ask you about—I've it's been on my list of things to bring up the next time I ran into you and then you went ahead and retired, making it harder to run into you. But a little while back, I was at a tech conference and someone gave a demo, and it didn't go as well as they had hoped. And a few of us were talking about it afterwards. We've all been speakers, we've all lived that life. Zero shade.But someone brought you up in particular—unprompted; your legend does precede you—and the phrase that they used was that Kelsey's demos were always picture-perfect. He was so lucky with how the demos worked out. And I just have to ask—because you don't strike me as someone who is not careful, particularly when all eyes are upon you—and real experts make things look easy, did you have demos periodically go wrong that the audience just didn't see going wrong along the way? Or did you just actually YOLO all of your demos and got super lucky every single time for the last eight years?Kelsey: There was a musician who said, “Hey, your demos are like jazz. You improvise the whole thing.” There's no script, there's no video. The way I look at the demo is, like, you got this instrument, the command prompt, and the web browser. You can do whatever you want with them.Now, I have working code. I wrote the code, I wrote the deployment scenarios, I delete it all and I put it all back. And so, I know how it's supposed to work from the ground up. And so, what that means is if anything goes wrong, I can improvise. I could go into fixing the code. I can go into doing a redeploy.And I'll give you one good example. The first time Kubernetes came out, there was this small meetup in San Francisco with just the core contributors, right? So, there is no community yet, there's no conference yet, just people hacking on Kubernetes. And so, we decided, we're going to have the first Kubernetes meetup. And everyone got, like, six, seven minutes, max. That's it. You got to move.And so, I was like, “Hey, I noticed that in the lineup, there is no ‘What is Kubernetes?' talk. We're just getting into these nuts and bolts and I don't think that's fair to the people that will be watching this for the first time.” And I said, “All right, Kelsey, you should give maybe an intro to what it is.” I was like, “You know what I'll do? I'm going to build a Kubernetes cluster from the ground up, starting with VMs on my laptop.”And I'm in it and I'm feeling confident. So, confidence is the part that makes it look good, right? Where you're confident in the commands you type. One thing I learned to do is just use your history, just hit the up arrow instead of trying to copy all these things out. So, you hit the up arrow, you find the right command and you talk through it and no one looks at what's happening. You're cycling through the history.Or you have multiple tabs where you know the next up arrow is the right history. So, you give yourself shortcuts. And so, I'm halfway through this demo. 
We got three minutes left, and it doesn't work. Like, VMware is doing something weird on my laptop and there's a guy calling me off stage, like, “Hey, that's it. Cut it now. You're done.”I'm like, “Oh, nope. Thou shalt not go out like this.” It's time to improvise. And so, I said, “Hey, who wants to see me finish this?” And now everyone is locked in. It's dead silent. And I blow the whole thing away. I bring up the VMs, I [pixie 00:28:20] boot, I installed the kubelet, I install Docker. And everyone's clapping. And it's up, it's going, and I say, “Now, if all of this works, we run this command and it should start running the app.” And I do kubectl apply-f and it comes up and the place goes crazy.And I had more to the demo. But you stop. You've gotten the point across, right? This is what Kubernetes is, here's how it works, and look how you do it from scratch. And I remember saying, “And that's the end of my presentation.” You need to know when to stop, you need to know when to pivot, and you need to have confidence that it's supposed to work, and if you've seen it work a couple of times, your confidence is unshaken.And when I walked off that stage, I remember someone from Red Hat was like—Clayton Coleman; that's his name—Clayton Coleman walked up to me and said, “You planned that. You planned it to fail just like that, so you can show people how to go from scratch all the way up. That was brilliant.” And I was like, “Sure. That's exactly what I did.”Corey: “Yeah, I meant to do that.” I like that approach. I found there's always things I have to plan for in demos. For example, I can never count on having solid WiFi from a conference hall. The show has to go on. It's, okay, the WiFi doesn't work. I've at one point had to give a talk where the projector just wasn't working to a bunch of students. So okay, close the laptop. We're turning this into a bunch of question-and-answer sessions, and it was one of the better talks I've ever given.But the alternative is getting stuck in how you think a talk absolutely needs to go. Now, keynotes are a little harder where everything has been scripted and choreographed and at that point, I've had multiple fallbacks for demos that I've had to switch between. And people never noticed I was doing it for that exact reason. But it takes work to look polished.Kelsey: I will tell you that the last Next keynote I gave was completely irresponsible. No dry runs, no rehearsals, no table reads, no speaker notes. And I think there were 30,000 people at that particular Next. And Diane Greene was still CEO, and I remember when marketing was like, “Yo, at least a backup recording.” I was like, “Nah, I don't have anything.”And that demo was extensive. I mean, I was building an app from scratch, starting with Postgres, adding the schema, building an app, deploying the app. And something went wrong halfway. And there's this joke that I came up with just to pass over the time, they gave me a new Chromebook to do the demo. And so, it's not mine, so none of the default settings were there, I was getting pop-ups all over the place.And I came up with this joke on the way to the conference. I was like, “You know what'd be cool? When I show off the serverless stuff, I would just copy the code from Stack Overflow. 
That'd be like a really cool joke to say this is what senior engineers do.” And I go to Stack Overflow and it's getting all of these pop-ups and my mouse couldn't highlight the text.So, I'm sitting there like a deer in headlights in front of all of these people and I'm looking down, and marketing is, like, “This is what… this is what we're talking about.” And so, I'm like, “Man do I have to end this thing here?” And I remember I kept trying, I kept trying, and came to me. Once the mouse finally got in there and I cleared up all the popups, I just came up with this joke. I said, “Good developers copy.” And I switched over to my terminal and I took the text from Stack Overflow and I said, “Great developers paste,” and the whole room start laughing.And I had them back. And we kept going and continued. And at the end, there was like this Google Assistant, and when it was finished, I said, “Thank you,” to the Google Assistant and it was talking back through the live system. And it said, “I got to admit, that was kind of dope.” So, I go to the back and Diane Greene walks back there—the CEO of Google Cloud—and she pats me on the shoulder. “Kelsey, that was dope.”But it was the thrill because I had as much thrill as the people watching it. So, in real-time, I was going through all these emotions. But I think people forget, the demo is supposed to convey something. The demo is supposed to tell some story. And I've seen people overdo their demos with way too much code, way too many commands, almost if they're trying to show off their expertise versus telling a story. And so, when I think about the demo, it has to complement the entire narrative. And so, sometimes you don't need as many commands, you don't need as much code. You can keep things simple and that gives you a lot more ins and outs in case something does go crazy.Corey: And I think the key takeaway here that so many people lose sight of is you have to know the material well enough that whatever happens, well, things don't always go the way I planned during the day, either, and talking through that is something that I think serves as a good example. It feels like a bit more of a challenge when you're trying to demo something that a company is trying to sell someone, “Oh, yeah, it didn't work. But that's okay.” But I'm still reminded by probably one of the best conference demo fails I've ever seen on video. One day, someone was attempting to do a talk that hit Amazon S3 and it didn't work.And the audience started shouting at him that yeah, S3 is down right now. Because that was the big day that S3 took a nap for four hours. It was one of those foundational things you'd should never stop to consider. Like, well, what if the internet doesn't work tomorrow when I'm doing my demo? That's a tough one to work around. But rough timing.Kelsey: [breathy sound]Corey: He nailed the rest of the talk, though. You keep going. That's the thing that people miss. They get stuck in the demo that isn't working, they expect the audience knows as much as they do about what's supposed to happen next. You're the one up there telling a story. People forget it's storytelling.Kelsey: Now, I will be remiss to say, I know that the demo gods have been on my side for, like, ten, maybe fifteen years solid. So, I retired from doing live demos. This is why I just don't do them anymore. I know I'm overdue as an understatement. 
But the thing I've learned though, is that what I found more impressive than the live demo is to be able to convey the same narratives through story alone. No slides. No demo. Nothing. But you can still make people feel where you would try to go with that live demo.And it's insanely hard, especially for technologies people have never seen before. But that's that new challenge that I kind of set up for myself. So, if you see me at a keynote and you've noticed why I've been choosing these fireside chats, it's mainly because I'm also trying to increase my ability to share narrative, technical concepts, but now in a new form. So, this new storytelling format through the fireside chat has been my substitute for the live demo, normally because I think sometimes, unless there's something really to show that people haven't seen before, the live demo isn't as powerful to me. Once the thing is kind of known… the live demo is kind of more of the same. So, I think they really work well when people literally have never seen the thing before, but outside of that, I think you can kind of move on to, like, real-life scenarios and narratives that help people understand the fundamentals and the philosophy behind the tech.Corey: An awful lot of tools and tech that we use on a day-to-day basis as well are thankfully optimized for the people using them and the ergonomics of going about your day. That is orthogonal, in my experience, to looking very impressive on stage. It's the rare company that can have a product that not only works well but also presents well. And that is something I don't tend to index on when I'm selecting a tool to do something with. So, it's always a question of how can I make this more visually entertaining? For while I got out of doing demos entirely, just because talking about things that have more staying power than a screenshot that is going to wind up being irrelevant the next week when they decide to redo the console for some service yet again.Kelsey: But you know what? That was my secret to doing software products and projects. When I was at CoreOS, we used to have these meetups we would used to do every two weeks or so. So, when we were building things like etcd, Fleet was a container management platform that came before Kubernetes, we would always run through them as a user, start install them, use them, and ask how does it feel? These command line flags, they don't feel right. This isn't a narrative you can present with the software alone.But once we could, then the meetups were that much more engaging. Like hey, have you ever tried to distribute configuration to, like, a thousand servers? It's insanely hard. Here's how you do with Puppet. But now I'm going to show you how you do with etcd. And then the narrative will kind of take care of itself because the tool was positioned behind what people would actually do with it versus what the tool could do by itself.Corey: I think that's the missing piece that most marketing doesn't seem to quite grasp is, they talk about the tool and how awesome it is, but that's why I love customer demos so much. They're showing us how they use a tool to solve a real-world problem. 
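As an aside, the etcd configuration-distribution demo described from the CoreOS meetup days is easy to gesture at. A minimal sketch, assuming the python-etcd3 client and an etcd endpoint on localhost:2379 (the key names and values here are invented): publish a value once, and every server watching that key sees the change without waiting for the next config-management run.

```python
import etcd3

client = etcd3.client(host="localhost", port=2379)

# Operator side: publish a config value once.
client.put("/config/storefront/feature_flags", '{"dark_mode": true}')

# On each server: read the current value at startup...
value, _metadata = client.get("/config/storefront/feature_flags")
print("current config:", value.decode())

# ...and watch for changes instead of polling or re-running Puppet.
events, cancel = client.watch("/config/storefront/feature_flags")
for event in events:
    print("config changed:", event.value.decode())
    break  # handle a single change, then stop, for this example

cancel()
```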
And honestly, from my snarky side of the world and the attendant perspective there, I can make an awful lot of fun about basically anything a company decides to show me, but put a customer on stage talking about how whatever they've built is solving a real-world problem for them, that's the point where I generally shut up and listen because I'm going to learn something about a real-world story. Because you don't generally get to tell customers to go on stage and just make up a story that makes us sound good, and have it come off with any sense of reality whatsoever. I haven't seen that one happen yet, but I'm sure it's out there somewhere. Kelsey: I don't know how many founders or people building companies listen in to your podcast, but this is, right now, I think the number one problem that especially venture-backed startups have. They tend to have great technology—maybe it's based off some open-source project—with tons of users who just know how that tool works; it's just an ingredient into what they're already trying to do. But that isn't going to ever be your entire customer base. Soon, you'll deal with customers who don't understand the thing you have and they need more than technology, right? They need a product. And most of these companies struggle painting that picture. Here's what you can do with it. Or here's what you can't do now, but you will be able to do if you were to use this. And since they are missing that, a lot of these companies, they produce a lot of code, they ship a lot of open-source stuff, they raise a lot of capital, and then it just goes away, it fades out over time because they can bring on no newcomers. The people who need help the most, they don't have a narrative for them, and so therefore, they're just hoping for the people who have all the skills in the world, the early adopters, but unfortunately, those people tend to be the ones that don't actually pay. They just kind of do it themselves. It's the people who need the most help.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that you're going to type on stage as part of a conference talk, and then accidentally typo all over yourself while you're doing it.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
Cloud Compliance and the Ethics of AI with Levi McCormick

Screaming in the Cloud

Play Episode Listen Later Aug 24, 2023 32:34


Levi McCormick, Cloud Architect at Jamf, joins Corey on Screaming in the Cloud to discuss his work modernizing baseline cloud infrastructure and his experience being on the compliance side of cloud engineering. Levi explains how he works to ensure the different departments he collaborates with are all on the same page so that different definitions don't end up in miscommunications, and why he feels a sandbox environment is an important tool that leads to a successful production environment. Levi and Corey also explore the ethics behind the latest generative AI craze. About Levi: Levi is an automation engineer, with a focus on scalable infrastructure and rapid development. He leverages deep understanding of DevOps culture and cloud technologies to build platforms that scale to millions of users. His passion lies in helping others learn to cloud better. Links Referenced: Jamf: https://www.jamf.com/ Twitter: https://twitter.com/levi_mccormick LinkedIn: https://www.linkedin.com/in/levimccormick/ Transcript Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. A longtime friend, and it has been a while since he's been on the show: Levi McCormick has been promoted or punished for his sins, depending upon how you want to slice that, and he is now the Director of Cloud Engineering at Jamf. Levi, welcome back. Levi: Thanks for having me, Corey. Corey: I have to imagine internally, you put that very pronounced F everywhere, and sometimes where it doesn't belong, like your IAMf policies and whatnot. Levi: It is fun to see how people like to interpret how to pronounce our name. Corey: So, it's been a while. What were you doing before? And how did you wind up stumbling your way into your current role? Levi: [laugh]. When we last spoke, I was a cloud architect here, diving into just our general practices and trying to shore up some of them. In between, I did a short stint as director of FedRAMP. We are pursuing some certifications in that area and I led, kind of, the engineering side of the compliance journey. Corey: That sounds fairly close to hell on earth from my particular point of view, just because I've dealt in the compliance side of cloud engineering before, and it sounds super interesting from a technical level until you realize just how much of it revolves around checking the boxes, and—at least in the era I did it—explaining things to auditors that I kind of didn't feel I should have to explain to an auditor, but there you have it. Has the state of that world improved since roughly 2015? Levi: I wouldn't say it has improved. While doing this, I did feel like I drove a time machine to work, you know, we're certifying VMs, rather than container-based architectures. There was a lot of education that had to happen from us to auditors, but once they understood what we were trying to do, I think they were kind of on board. But yeah, it was a [laugh] it was a journey. Corey: So, one of the things you do—in fact, the first line in your bio talks about it—is you modernize baseline cloud infrastructure provisioning. That means an awful lot of things depending upon who it is that's answering the question.
What does that look like for you?Levi: For what we're doing right now, we're trying to take what was a cobbled-together part-time project for one engineer, we're trying to modernize that, turn it into as much self-service as we can. There's a lot of steps that happen along the way, like a new workload needs to be spun up, they decide if they need a new AWS account or not, we pivot around, like, what does the access profile look like, who needs to have access to it, which things does it need to connect to, and then you look at the billing side, compliance side, and you just say, you know, “Who needs to be informed about these things?” We apply tags to the accounts, we start looking at lower-level tagging, depending on if it's a shared workload account or if it's a completely dedicated account, and we're trying to wrap all of that in automation so that it can be as click-button as possible.Corey: Historically, I found that when companies try to do this, the first few attempts at it don't often go super well. We'll be polite and say their first attempts resemble something artisanal and handcrafted, which might not be ideal for this. And then in many cases, the overreaction becomes something that is very top-down, dictatorial almost, is the way I would frame that. And the problem people learn then is that, “Oh, everyone is going to route around us because they don't want to deal with us at all.” That doesn't quite seem like your jam from what I know of you and your approach to things. How do you wind up keeping the guardrails up without driving people to shadow IT their way around you?Levi: I always want to keep it in mind that even if it's not an option, I want to at least pretend like a given team could not use our service, right? I try to bring a service mentality to it, so we're talking Accounts as a Service. And then I just think about all of the things that they would have to solve if they didn't go through us, right? Like, are they managing their finances w—imagine they had to go in and negotiate some kind of pricing deal on their own, right, all of these things that come with being part of our organization, being part of our service offering. And then just making sure, like, those things are always easier than doing it on their own.Corey: How diverse would you say that the workloads are that are in your organization? I found that in many cases, you'll have a SaaS-style company where there's one primary workload that is usually bearing the name of the company, and that's the thing that they provide to everyone. And then you have the enterprise side of the world where they have 1500 or 2000 distinct application teams working on different things, and the only thing they really have in common is, well, that all gets billed to the same company, eventually.Levi: They are fairly diverse in how… they're currently created. We've gone through a few acquisitions, we've pulled a bunch of those into our ecosystem, if you will. So, not everything has been completely modernized or brought over to, you know, standards, if you will, if such a thing even exists in companies. You know [laugh], you may pretend that they do, but you're probably lying to yourself, right? But you know, there are varying platforms, we've got a whole laundry list of languages that are being used, we've got some containerized, some VM-based, some serverless workloads, so it's all over the place. But you nailed it. Like, you know, the majority of our footprint lives in maybe a handful of, you know, SaaS offerings.Corey: Right. 
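Circling back to the account-vending flow Levi describes: the core API call is small even if the surrounding process is not. A hedged sketch using boto3 and AWS Organizations; the email alias scheme, tag keys, and naming convention here are hypothetical, and moving the new account into the right OU and wiring up billing and compliance notifications would still come afterward.

```python
import time

import boto3

org = boto3.client("organizations")

def vend_account(workload: str, owner_team: str, cost_center: str) -> str:
    """Kick off creation of a tagged member account for a new workload."""
    response = org.create_account(
        Email=f"aws+{workload}@example.com",   # hypothetical alias scheme
        AccountName=f"{workload}-prod",
        IamUserAccessToBilling="DENY",
        Tags=[
            {"Key": "workload", "Value": workload},
            {"Key": "owner", "Value": owner_team},
            {"Key": "cost-center", "Value": cost_center},
        ],
    )
    return response["CreateAccountStatus"]["Id"]

def wait_for_account(request_id: str) -> str:
    """Poll the async creation request until the account exists."""
    while True:
        status = org.describe_create_account_status(
            CreateAccountRequestId=request_id
        )["CreateAccountStatus"]
        if status["State"] == "SUCCEEDED":
            return status["AccountId"]
        if status["State"] == "FAILED":
            raise RuntimeError(status.get("FailureReason", "unknown failure"))
        time.sleep(10)
```

The automation Levi describes is mostly everything around this call: deciding whether a new account is warranted at all, who gets access, and who gets told about it.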
It's sort of a fun challenge when you start taking a looser approach to these things because someone gets back from re:Invent, like, “Well, I went to the keynote and now I have my new shopping list of things I'm going to wind up deploying,” and ehh, that never goes well, having been that person in a previous life.Levi: Yeah. And you don't want to apply too strict of governance over these things, right? You want people to be able to play, you want them to be inspired and start looking at, like, what would be—what's something that's going to move the needle in terms of our cloud architecture or product offerings or whatever we have. So, we have sandbox accounts that are pretty much wide open, we've got some light governance over those, [laugh] moreso for billing than anything. And all of our internal tooling is available, you know, like if you're using containers or whatever, like, all of that stuff is in those sandbox accounts.And that's where our kind of service offering comes into play, right? Sandbox is still an account that we tried to vend, if you will, out of our service. So, people should be building in your sandbox environments just like they are in your production as much as possible. You know, it's a place where tools can get the tires kicked and smooth out bugs before you actually get into, you know, roadmap-impacting problems.Corey: One of the fun challenges you have is, as you said, the financial aspect of this. When you've got a couple of workloads that drive most things, you can reason about them fairly intelligently, but trying to predict the future—especially when you're dealing with multi-year contract agreements with large cloud providers—becomes a little bit of a guessing game, like, “Okay. Well, how much are we going to spend on generative AI over the next three years?” The problem with that is that if you listen to an awful lot of talking heads or executive types, like, “Oh, yeah, if we're spending $100 million a year, we're going to add another 50 on top of that, just in terms of generative AI.” And it's like, press X to doubt, just because it's… I appreciate that you're excited about these things and want to play with them, but let's make sure that there's some ‘there' there before signing contracts that are painful to alter.Levi: Yeah, it's a real struggle. And we have all of these new initiatives, things people are excited for. Meanwhile, we're bringing old architecture into a new platform, if you will, or a new footprint, so we have to constantly measure those against each other. We have a very active conversation with finance and with leadership every month, or even weekly, depending on the type of project and where that spend is coming from.Corey: One of the hard parts has always been, I think, trying to get people on the finance side of the world, the engineering side of the world, and the folks who are trying to predict what the business was going to do next, all speaking the same language. It just feels like it's too easy to wind up talking past each other if you're not careful.Levi: Yeah, it's really hard. Recently taken over the FinOps practice. It's been really important for me, for us to align on what our words mean, right? What are these definitions mean? How do we come to common consensus so that eventually the communication gets faster? But we can't talk past each other. We have to know what our words mean, we have to know what each person cares about in this conversation, or what does their end goal look like? 
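For the "light governance, mostly for billing" that Levi mentions on sandbox accounts, the simplest guardrail is a budget with an alert. A minimal sketch with boto3; the account ID, dollar amount, threshold, and email address are placeholders.

```python
import boto3

budgets = boto3.client("budgets")

budgets.create_budget(
    AccountId="123456789012",  # the payer or sandbox account (placeholder)
    Budget={
        "BudgetName": "sandbox-monthly-guardrail",
        "BudgetLimit": {"Amount": "500", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,          # percent of the budget limit
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "cloud-team@example.com"}
            ],
        }
    ],
)
```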
What do they want out of the conversation? So, that's been—that's taken a significant amount of time.Corey: One of the problems I have is with the term FinOps as a whole, ignoring the fact entirely that it was an existing term of art within finance for decades; great, we're just going to sidestep past that whole mess—the problem you'll see is that it just seems like that it means something different to almost everyone who hears it. And it's sort of become a marketing term more so that it has an actual description of what people are doing. Just because some companies will have a quote-unquote, “FinOps team,” that is primarily going to be run by financial analysts. And others, “Well, we have one of those lying around, but it's mostly an engineering effort on our part.”And I've seen three or four different expressions as far as team composition goes and I'm not convinced any of them are right. But again, it's easy for me to sit here and say, “Oh, that's wrong,” without having an environment of my own to run. I just tend to look at what my clients do. And, “Well, I've seen a lot of things, and they all work poorly in different ways,” is not uplifting and helpful.Levi: Yeah. I try not to get too hung up on what it's called. This is the name that a lot of people inside the company have rallied around and as long as people are interested in saving money, cool, we'll call it FinOps, you know? I mean, DevOps is the same thing, right? In some companies, you're just a sysadmin with a higher pay, and in some companies, you're building extensive cloud architecture and pipelines.Corey: Honestly, for the whole DevOps side of the world, I maintain we're all systems administrators. The tools have changed, the methodologies have changed, the processes have changed, but the responsibility of ‘keep the site up' generally has not. But if you call yourself a sysadmin, you're just asking him to, “Please pay me less money in my next job.” No, thanks.Levi: Yeah. “Where's the Exchange Server for me to click on?” Right? That's the [laugh]—if you call yourself a sysadmin [crosstalk 00:11:34]—Corey: God. You're sending me back into twitching catatonia from my early days.Levi: Exactly [laugh].Corey: So, you've been paying attention to this whole generative AI hype monster. And I want to be clear, I say this as someone who finds the technology super neat and I'm optimistic about it, but holy God, it feels like people have just lost all sense. If that's you, my apologies in advance, but I'm still going to maintain the point.Levi: I've played with all the various toys out there. I'm very curious, you know? I think it's really fun to play with them, but to, like, make your entire business pivot on a dime and pursue it just seems ridiculous to me. I hate that the cryptocurrency space has pivoted so hard into it, you know? All the people that used to be shilling coins are now out there trying to cobble together a couple API calls and turn it into an AI, right?Corey: It feels like it's just a hype cycle that people are more okay with being a part of. Like, Andy Jassy, in the earnings call a couple of weeks ago saying that every Amazon team is working with generative AI. That's not great. That's terrifying. I've been playing with the toys as well and I've asked it things like, “Oh, spit out an IAM policy for me,” or, “Oh, great, what can I do to optimize my AWS bill?” And it winds up spitting out things that sound highly plausible, but they're also just flat-out wrong. 
And that, it feels like a lot of these spaces, it's not coming up with a plausible answer—that's the hard part—is coming up with the one that is correct. And that's what our jobs are built around.Levi: I've been trying to explain to a lot of people how, if you only have surface knowledge of the thing that it's telling you, it probably seems really accurate, but when you have deep knowledge on the topic that you're interacting with this thing, you're going to see all of the errors. I've been using GitHub's Copilot since the launch. You know, I was in one of the previews. And I love it. Like, it speeds up my development significantly.But there have been moments where I—you know, IAM policies are a great example. You know, I had it crank out a Lambda functions policy, and it was just frankly, wrong in a lot of places [laugh]. It didn't quite imagine new AWS services, but it was really [laugh] close. The API actions were—didn't exist. It just flat-out didn't exist.Corey: I love that. I've had some magic happen early on where it could intelligently query things against the AWS pricing API, but then I asked it the same thing a month later and it gave me something completely ridiculous. It's not deterministic, which is part of the entire problem with it, too. But it's also… it can help incredibly in some weird ways I didn't see coming. But it can also cause you to spend more time chasing that thing than just doing it yourself the first time.I found a great way to help it—you know, it helped me write blog posts with it. I tell it to write a blog post about a topic and give it some bullet points and say, “Write in my voice,” and everything it says I take issue with, so then I just copy that into a text editor and then mansplain-correct the robot for 20 minutes and, oh, now I've got a serviceable first draft.Levi: And how much time did you save [laugh] right? It is fun, you know?Corey: It does help because that's better for me at least and staring at an empty page of what am I going to write? It gets me past the writer's block problem.Levi: Oh, that's a great point, yeah. Just to get the ball rolling, right, once you—it's easier to correct something that's wrong, and you're almost are spite-driven at that point, right? Like, “Let me show this AI how wrong it was and I'll write the perfect blog post.” [laugh].Corey: It feels like the companies jumping on this, if you really dig into what we're talking about, it seems like they're all very excited about the possibility of we don't have to talk to customers anymore because the robots will all do that. And I don't think that's going to go the way you want to. We just have this minor hallucination problem. Yeah, that means that lies and tries to book customers to hotel destinations that don't exist. Think about this a little more. The failure mode here is just massive.Levi: It's scary, yeah. Like, without some kind of review process, I wouldn't ship that straight to my customers, right? I wouldn't put that in front of my customer and say, like, “This is”—I'm going to take this generative output and put it right in front of them. That scares me. I think as we get deeper into it, you know, maybe we'll see… I don't know, maybe we'll put some filters or review process, or maybe it'll get better. 
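Picking up Levi's point about hallucinated IAM actions: one boring defense is keeping a known-good reference to diff generated policies against. For a Lambda function that only needs to write logs, that reference is roughly the standard basic-execution policy, shown here as a Python dict; scope the Resource ARN down for real use.

```python
import json

# These are real CloudWatch Logs actions; anything a code assistant adds
# beyond what the function actually calls deserves a second look.
lambda_logging_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "arn:aws:logs:*:*:*",  # tighten to the function's log group in practice
        }
    ],
}

print(json.dumps(lambda_logging_policy, indent=2))
```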
I mean, who was it that said, you know, “This is the worst it's ever going to be?” Right, it will only get better.Corey: Well, the counterargument to that is, it will get far worse when we start putting this in charge [unintelligible 00:16:08] safety-critical systems, which I'm sure it's just a matter of time because some of these boosters are just very, very convincing. It's just thinking, how could this possibly go the worst? Ehhh. It's not good.Levi: Yeah, well, I mean, we're talking impact versus quality, right? The quality will only ever get better. But you know, if we run before we walk, the impact can definitely get wider.Corey: From where I sit, I want to see this really excel within bounded problem spaces. The one I keep waiting for is the AWS bill because it's a vast space, yes, and it's complicated as all hell, but it is bounded. There are a finite—though large—number of things you can see in an AWS bill, and there are recommendations you can make based on top of that. But everything I've seen that plays in this space gets way overconfident far too quickly, misses a bunch of very obvious lines of inquiry. Ah, I'm skeptical.Then you pass that off to unbounded problem spaces like human creativity and that just turns into an absolute disaster. So, much of what I've been doing lately has been hamstrung by people rushing to put in safeguards to make sure it doesn't accidentally say something horrible that it's stripped out a lot of the fun and the whimsy and the sarcasm in the approach, of I—at one point, I could bully a number of these things into ranking US presidents by absorbency. That's getting harder to do now because, “Nope, that's not respectful and I'm not going to do it,” is basically where it draws the line.Levi: The one thing that I always struggle with is, like, how much of the models are trained on intellectual property or, when you distill it down, pure like human suffering, right? Like, this is somebody's art, they've worked hard, they've suffered for it, they put it out there in the world, and now it's just been pulled in and adopted by this tool that—you know, how many of the examples of, “Give me art in the style of,” right, and you just see hundreds and hundreds of pieces that I mean, frankly, are eerily identical to the style.Corey: Even down to the signature, in some cases. Yeah.Levi: Yeah, exactly. You know, and I think that we can't lose sight of that, right? Like, these tools are fun and you know, they're fun to play with, it's really interesting to explore what's possible, but we can't lose sight of the fact that there are ultimately people behind these things.Corey: This episode is sponsored in part by Panoptica.  Panoptica simplifies container deployment, monitoring, and security, protecting the entire application stack from build to runtime. Scalable across clusters and multi-cloud environments, Panoptica secures containers, serverless APIs, and Kubernetes with a unified view, reducing operational complexity and promoting collaboration by integrating with commonly used developer, SRE, and SecOps tools. Panoptica ensures compliance with regulatory mandates and CIS benchmarks for best practice conformity. Privacy teams can monitor API traffic and identify sensitive data, while identifying open-source components vulnerable to attacks that require patching. Proactively addressing security issues with Panoptica allows businesses to focus on mitigating critical risks and protecting their interests. 
Learn more about Panoptica today at panoptica.app.

Corey: I think it matters, on some level, what the medium is. When I'm writing, I will still use turns of phrase from time to time that I first encountered when I was reading things in the 1990s. And that phrase stuck with me and became part of my lexicon. And I don't remember where I originally encountered some of these things; I just know I use those phrases an awful lot. And that has become part and parcel of who and what I am.

Which is also why I have no problem telling it to write a blog post in the style of Corey Quinn and then ripping a part of that out, but anything that's left in there, cool. I'm plagiarizing the thing that plagiarized from me, and I find that to be one of those ethically just moments there. But the written word is one thing, depending on what exactly it's taking from you; visual style for art, that's something else entirely.

Levi: There's a real ethical issue here. These things can absorb far more information than you ever could in your entire lifetime, right? You can only, quote-unquote, you know, "copy, borrow, steal," from a handful of other people in your entire life, right? Whereas this thing could do hundreds or thousands of people per minute. I think that's where the calculus needs to be, right? How many people can we impact with this thing?

Corey: This is also nothing new, where originally, in the olden times, great, copyright wasn't really a thing because writing a book was a massive, massive undertaking. That was something that you'd have to do by hand, and then, oh, you want a copy of the book? You'd have to have a scribe go and copy the thing. Well then, suddenly the printing press came along, and okay, that changes things a bit.

And then we continue to evolve there to digital distribution, where suddenly it's just bits on a disk that I can wind up throwing halfway around the internet. And when the marginal cost of copying something becomes effectively zero, what does that change? And now we're seeing, I think, another iteration in that ongoing question. It's a weird world and I don't know that we have the framework in place even now to think about that properly. Because every time we start to get a handle on it, off we go again. It feels like if they were being invented today, libraries would absolutely not be considered legal. And yet, here we are.

Levi: Yeah, it's a great point. Humans just do not have the ethical framework in place for a lot of these things. You know, we saw it even with the days of Napster, right? It's just—like you said, it's another iteration on the same core problem. I [laugh] don't know how to solve it. I'm not a philosopher, right?

Corey: Oh, yeah. Back in the Napster days, I was on that a fair bit in high school and college because I was broke, and oh, I wanted to listen to this song. Well, it came on an album with no other good songs on it because one-hit wonders were kind of my jam, and that album cost 15, 20 bucks, or I could grab the thing for free. There was no reasonable way to consume it. Then they started selling individual tracks for 99 cents and I gorged myself for years on that stuff.

And now it feels like streaming has taken over the world to the point where the only people who really lose on this are the artists themselves, and I don't love that outcome. How do we have a better tomorrow for all of this?
I know we're a bit off-topic from you know, cloud management, but still, this is the sort of thing I think about when everything's running smoothly in a cloud environment.Levi: It's hard to get people to make good decisions when they're so close to the edge. And I think about when I was, you know, college-age scraping by on minimum wage or barely above minimum wage, you know, it was hard to convince me that, oh yeah, you shouldn't download an MP3 of that song; you should go buy the disc, or whatever. It was really hard to make that argument when my decision was buy an album or figure out where I'm going to, you know, get my lunch. So, I think, now that I'm in a much different place in my life, you know, these decisions are a lot easier to make in an ethical way because that doesn't impact my livelihood nearly as much. And I think that is where solutions will probably come out of. The more people doing better, the easier it is for them to make good decisions.Corey: I sure hope you're right, but something I found is that okay we made it easy for people to make good decisions. Like, “Nope, you've just made it easier for me to scale a bunch of terrible ones. I can make 300,000 more terrible decisions before breakfast time now. Thanks.” And, “No, that's not what I did that for.” Yet here we are. Have you been tracking lately what's been going on with the HashiCorp license change?Levi: Um, a little bit, we use—obviously use Terraform in the company and a couple other Hashi products, and it was kind of a wildfire of, you know, how does this impact us? We dove in and we realized that it doesn't, but it is concerning.Corey: You're not effectively wrapping Terraform and then using that as the basis for how you do MDM across your customer fleets.Levi: Yeah. You know, we're not deploying customers' written Terraform into their environments or something kind of wild like that. Yeah, it doesn't impact us. But it is… it is concerning to watch a company pivot from an open-source, community-based project to, “Oh, you can't do that anymore.” It doesn't impact a lot of people who use it day-to-day, but I'm really worried about just the goodwill that they've lit on fire.Corey: One of the problems, too, is that their entire write-up on this was so vague that it was—there is no way to get an actual… piece of is it aimed at us or is it not without very deep analysis, and hope that when it comes to court, you're going to have the same analysis as—that is sympathetic. It's, what is considered to be a competitor? At least historically, it was pretty obvious. Some of these databases, “Okay great. Am I wrapping their database technology and then selling it as a service? No? I'm pretty good.”But with HashiCorp, what they do is so vast in a few key areas that no one has the level of certainty. I was pretty freaking certain that I'm not shipping MongoDB with my own wrapper around it, but am I shipping something that looks like Terraform if I'm managing someone's environment for them? I don't know. Everything's thrown into question. And you're right. It's the goodwill that currently is being set on fire.Levi: Yeah, I think people had an impression of Hashi that they were one of the good guys. You know, the quote-unquote, “Good guys,” in the space, right? 
Mitchell Hashimoto is out there as a very prominent coder, he's an engineer at heart, he's in the community, pretty influential on Twitter, and I think people saw them as not one of the big, faceless corporations, so to see moves like this happen, it… I think it shook a lot of people's opinions of them and scared them.Corey: Oh, yeah. They've always been the good guys in this context. Mitch and Armon were fantastic folks. I'm sure they still are. I don't know if this is necessarily even coming from them. It's market forces, what are investors demanding? They see everyone is using Terraform. How does that compare to HashiCorp's market value?This is one of the inherent problems if I'm being direct, of the end-stages of capitalism, where it's, “Okay, we're delivering on a lot of value. How do we capture ever more of it and growing massively?” And I don't know. I don't know what the answer is, but I don't think anyone's thrilled with this outcome. Because, let's be clear, it is not going to meaningfully juice their numbers at all. They're going to be setting up a lot of ill will against them in the industry, but I don't see the upside for them. I really don't.Levi: I haven't really done any of the analysis or looked for it, I should say. Have you seen anything about what this might actually impact any providers or anything? Because you're right, like, what kind of numbers are we actually talking about here?Corey: Right. Well, there are a few folks that have done things around this that people have named for me: Spacelift being one example, Pulumi being another, and both of them are saying, “Nope, this doesn't impact us because of X, Y, and Z.” Yeah, whether it does or doesn't, they're not going to sit there and say, “Well, I guess we don't have a company anymore. Oh, well.” And shut the whole thing down and just give their customers over to HashiCorp.Their own customers would be incensed if that happened and would not go to HashiCorp if that were to be the outcome. I think, on some level, they're setting the stage for the next evolution in what it takes to manage large-scale cloud environments effectively. I think basically, every customer I've ever dealt with on my side has been a Terraform shop. I finally decided to start learning the ins and outs of it myself a few weeks ago, and well, it feels like I should have just waited a couple more weeks and then it would have become irrelevant. Awesome. Which is a bit histrionic, but still, this is going to plant seeds for people to start meaningfully competing. I hope.Levi: Yeah, I hope so too. I have always awaited releases of Terraform Cloud with great anticipation. I generally don't like managing my Terraform back-ends, you know, I don't like managing the state files, so every time Terraform Cloud has some kind of release or something, I'm looking at it because I'm excited, oh finally, maybe this is the time I get to hand it off, right? Maybe I start to get to use their product. And it has never been a really compelling answer to the problems that I have.And I've always said, like, the [laugh] cloud journey would be Google's if they just released a managed Terraform [laugh] service. And this would be one way for them to prevent that from happening. Because Google doesn't even have an Infrastructure as Code competitor. Not really. I mean, I know they have their, what, Plans or their Projects or whatever they… their Infrastructure as Code language was, but—Corey: Isn't that what Stackdriver was supposed to be? What happened with that? 
It's been so long.

Levi: No, that's a logging solution [laugh].

Corey: That's the thing. It all runs together. No, it was their operations suite that was—

Levi: There we go.

Corey: —formerly Stackdriver. Yeah. Now, that does include some aspects—yeah. You're right, it's still hanging out in the observability space. This is the problem: all this stuff conflates, and companies are terrible at naming, and Google likes to deprecate things constantly. And yeah, but there is no real competitor. CloudFormation? Please. Get serious.

Levi: Hey, you're talking to a member of the CloudFormation support group here. So, I'm still a huge fan [laugh].

Corey: Emotional support group, more like it, it seems these days.

Levi: It is.

Corey: Oh, good. It got for loops recently. We've been asking for basically that—to make them a lot less wordy—for only, what, ten years?

Levi: Yeah. I mean, my argument is that I'm operating at the account level, right? I need to deploy to 250, 300, 500 accounts. Show me how to do that with Terraform in a way that isn't, you know, stab your eyes out with a fork.

Corey: It can be done, but it requires an awful lot of setting things up first.

Levi: Exactly.

Corey: That's sort of a problem. Like, yeah, once you have the first 500 going, the rest are just like butter. But that's the thing: step one is massive, and then step two becomes easy. Yeah… no, thank you.

Levi: [laugh]. I'm going to stick with my StackSets, thank you.

Corey: [laugh]. I really want to thank you for taking the time to come back on and honestly kibitz about the state of the industry with me. If people want to learn more, where's the best place for them to find you?

Levi: Well, I'm still active on the space normally known as—formerly known as Twitter. You can reach out to me there. DMs are open. I'm always willing to help people learn how to cloud better. I'm hoping to make my presence known a little bit more on LinkedIn, so if you happen to be over there, reach out.

Corey: And we will, of course, put links to that in the [show notes 00:30:16]. Thank you so much for taking the time to speak with me again. It's always a pleasure.

Levi: Thanks, Corey. I always appreciate it.

Corey: Levi McCormick, Director of Cloud Engineering at Jamf. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that tells us that we completely missed the forest for the trees and that your programming is going to be far superior based upon generative AI.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
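For anyone curious what the StackSets approach Levi mentions looks like in practice, here is a minimal sketch of pushing one template to every account under an organizational unit with boto3. The template file, OU ID, regions, and rollout percentages are placeholder assumptions, not details from this episode.

import boto3

cfn = boto3.client("cloudformation")

# Read the template that should exist in every account.
with open("baseline.yaml") as f:
    template_body = f.read()

# Create the stack set once, letting AWS Organizations handle the
# cross-account roles via the service-managed permission model.
cfn.create_stack_set(
    StackSetName="org-baseline",
    TemplateBody=template_body,
    Capabilities=["CAPABILITY_NAMED_IAM"],
    PermissionModel="SERVICE_MANAGED",
    AutoDeployment={"Enabled": True, "RetainStacksOnAccountRemoval": False},
)

# Fan the stacks out to every account under a target OU, a slice at a time.
cfn.create_stack_instances(
    StackSetName="org-baseline",
    DeploymentTargets={"OrganizationalUnitIds": ["ou-abcd-11111111"]},
    Regions=["us-east-1", "us-west-2"],
    OperationPreferences={
        "MaxConcurrentPercentage": 10,
        "FailureTolerancePercentage": 5,
    },
)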

Metacast: Behind the scenes
32. Creating a niche of your own with Corey Quinn

Metacast: Behind the scenes

Play Episode Listen Later Aug 23, 2023 73:36


This week we sat down with Corey Quinn, the Chief Cloud Economist at The Duckbill Group and a big celebrity in AWS circles. Corey is well known for his sense of humor and unrelenting focus on poking good fun at the cloud providers. In our interview, we learn a bit about Corey's background, how The Duckbill Group got started, and how he runs the media side of his business. As usual, we also talk about bootstrapping and running consulting services while building a product. Full show notes with links: https://www.metacastpodcast.com/p/032-corey-quinn

Screaming in the Cloud
The Value of Good Editing in Content Creation with Alysha Love

Screaming in the Cloud

Play Episode Listen Later Aug 22, 2023 36:03


Alysha Love, Executive Editor and Co-Founder of Payette Media House, joins Corey on Screaming in the Cloud to discuss her career journey going from journalism to editing and how she works with Corey on his content. Alysha describes why she feels it's so important to capture the voice of the person you're editing, and why editing your content makes a difference to those reading it. Corey and Alysha also explore the differences in editing for something that will be read silently versus something that will be read out loud, as well as the different styles of editing. About AlyshaAlysha Love is executive editor and co-founder of Payette Media House, an editorial agency serving startups and tech companies. Alysha is the treasurer of ACES: The Society for Editing, the nation's largest editing organization, and trains editors and writers in digital best practices.She was an editor at CNN and POLITICO during the Obama and Trump administrations. Alysha has a bachelor's in journalism from the University of Missouri and a master's in leadership and organizational development from the University of Texas. She's a big fan of the humble ampersand.Links Referenced:Company website: https://payettemediahouse.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Human-scale teams use Tailscale to build trusted networks. Tailscale Funnel is a great way to share a local service with your team for collaboration, testing, and experimentation.  Funnel securely exposes your dev environment at a stable URL, complete with auto-provisioned TLS certificates. Use it from the command line or the new VS Code extensions. In a few keystrokes, you can securely expose a local port to the internet, right from the IDE.I did this in a talk I gave at Tailscale Up, their first inaugural developer conference. I used it to present my slides and only revealed that that's what I was doing at the end of it. It's awesome, it works! Check it out!Their free plan now includes 3 users & 100 devices. Try it at snark.cloud/tailscalescream Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. And one of the, I guess, illusions about what I do is that I sit down at a keyboard periodically, and I just start typing and then, you know, brilliance emerges, and then my work is done. It turns out that this is rarely true, not to deflate my own image overly much. And a big part of how that works comes down to my guest today. Alysha Love is the executive editor at Payette Media House and has been my editor for just about three years now. Alysha, thank you for tolerating me.Alysha: Anytime.Corey: So, I want to start by dispensing with a few illusions that I'm not saying other people have, I'm saying that I have, where I was fortunate enough—or unfortunate as the case may be—to grow up with an English teacher for a mother and understanding how to put together a grammatically correct sentence was not exactly optional in my house, so what possible value could an editor present to me? And one of the things I learned along the way is that there are multiple kinds of editors, as it turns out. 
What are they and which are you?Alysha: So yeah, not only is editing a thing, we can look at your sentences, your story, and make it all better and clearer so that it really shines. But there are different types of editors who can do different specific functions. So, at the maybe most nitpicky level, you have proofreaders who are looking at what would be a final page, usually in something like a book, where it needs to look exactly right the way that it's going up, it needs to be sure that every last little detail is in place. At the next level up, you have copy editors. They're looking for things like spelling, grammar, punctuation, style, factual accuracy. That's sort of what you think of usually when you think of somebody who might peer-review a piece for you, or who you might ask to edit something. And then at the next level, you have people who are able to do the copy editing, but in addition to that, they look at the overarching arc of the story or the blog piece, and they're able to help look for some of those gaps and organize it into something that is clearer and easier to understand.Corey: Something I've always been curious about is that you, previously in another life, were an editor at CNN and then Politico during the Obama and then Trump administrations. Is editing what I do significantly different than editing, you know, journalists?Alysha: Yes, in a few key ways. One is that when we're writing news, we always come out and say the most important thing first. It's what we call the inverted pyramid style, so if you turn a pyramid on its nose and it's standing on the tip, you have the biggest part of the triangle or the pyramid at the top, and that's the most important thing that could never get cut, and you say it right out of the gate. I tease my husband a lot because he tends to bury the lede, and that's what we're talking about when it's not the first thing you say.Corey: Absolutely. And I do that, meanwhile, stylistically as a choice because, you know, don't put the punch line in the title.Alysha: Totally. So, that's a big difference between editing for news and editing you. You also use significantly more voice than we would use in a CNN or Politico article. That's also a choice. And it's actually something I have a ton of fun with is emulating your voice as I make edits.Corey: I found that as we've worked together, our comfort with one another has grown significantly over the past few years. At this point, just for folks who are wondering, anytime you have an edit that's just a reordering or something that clarifies something slightly or is basically low-level stylistic, you don't track those changes; you just go ahead and apply them because otherwise, I'll wind up, “Oh, here are 600 changes to make.” It's like, the article is 2000 words. Exactly how much was done? And so, much of it is white spaces and comma placement and the rest and just strange little things that frankly, are not that important to me past a certain point.The exception, of course, was always great. First, if you're making a change, tell me why. I have political opinions about the use of the Oxford comma, for example. I find it lends clarity to things. Fortunately, you and I aligned on that, so it's a non-issue. But I am curious as far as what do you see that I tend to do the most that I guess either annoys you or you disagree with stylistically or, honestly, is flat-out wrong.Alysha: So, there's not much you do that's flat-out wrong. 
I will say, like, the instances that I do see something… you've told me before that your mom was an English teacher and that these are things that you really pride yourself on being able to do well, so I don't know how you feel about it, but I'll usually leave a comment telling you about the change and linking you [laugh] to something that can explain it a little better. Maybe that annoys the shit out of you, or—sorry, can we cuss?Corey: No, no—oh, you absolutely can, and—Alysha: Great.Corey: Because it's—I was taught by a teacher, I want to say in third grade, that you leave two spaces at the end of every sentence before you begin the next sentence, and I only found out about six or seven years ago that's not really a thing. It took me a year to break myself of that habit, but I would rather go through that effort than, “Well, I've been wrong this long. I may as well double down on it now.” Just seems like that's not helping anyone.Alysha: [laugh]. Right. And we all have those things that there was some English teacher somewhere along the way who taught us things that were just wrong. So, my favorite thing about my magazine editing class, when I was in school at the University of Missouri, was that she started from the very, very basics because she said everyone has learned things that are wrong about how we write and how we edit and so we're just going to learn it all from scratch. And it was really brilliant. It was the best way to learn it all the right way.Corey: I'm usually gratified when I am trying to figure out what is the proper tense of this particular verb in this particular phrase. And my wife and I will wind up in debates on this constantly because she's an attorney and also writes a lot for a living. Who knew? And invariably whenever we finally get to an impasse and look it up—because, you know, we do have the sum total of human knowledge in the supercomputer that lives in our pockets—the answer is more often than not, it's a matter of choice. And both are considered accepted because English is, of course, a language defined by its usage, or one way is British and one is American, or some other aspect where it's not about wrong; it's about which is preferred in certain contexts. So, I'll take it.Alysha: That's—yeah, that's totally accurate. And those are the kinds of choices that I feel like, if I were to change all of those things in your writing, you would not appreciate it because they're preferences. So, those are the things that even if there's a style that's a little different unless what you've done is wrong in some sort of, like, widely accepted way, then I'm going to leave it the way you've got it.Corey: One way that I have found that I am both strong and weak, I think, as a writer—and I'm thrilled to be criticized on any of this; please don't spare my feelings any—is that I write like I speak. When—this is most noticeable on Twitter when I meet people who've never met me in person before, a very common refrain is, “Oh, you're just like your Twitter feed.”Alysha: [laugh].Corey: And partially that's because I'm sarcastic and irreverent and a class clown who never grew up, but another part of it is because I write like I would put together the sentence. In long-form writing, I feel like that can be something of a setback for me. When I'm making a sentence right now, for example, and talking to you, if I were to write this out as a literal transcript, it will be a long run-on sentence in a bunch of different ways. 
And it works conversationally, but it does not work that way in long-form writing. So, I feel like I have a bunch of clauses that continue to go on forever when I let myself. [transcriber note: yup]Alysha: You do. And the thing that you also love to do stick a bunch of semicolons between all of them, which is technically correct, but I do have a whole thing about distracting punctuation, so I will take out many of your semicolons.Corey: I would like credit though because before you were involved in this, Mike would periodically look at some of my blog posts before they went out and—because I wanted his perspective on, “Am I onto something here or am I a fool,” but then he'll go back and edit some of the things he sees—which I get. If I see a misspelling in something, I itch until I can fix it, or a grammatical mistake. But at one point, he was constantly onto me about overusing commas. And in one case, he took a bunch out. And then I looked at the tracked changes on this and it's forever one of my favorite things. You went through next and wound up returning all of the commas that he had removed. It's, “That's right.” But you got me on the semicolon thing. I'm trying to reduce usage and have shorter sentences.Alysha: Yes. And that's something that's really good for digital best practices and having a wide and varied audience. You know, with a diverse audience, with audiences that don't speak English as a first language, it's helpful to have much shorter sentences. For folks who are consuming content on the internet, in general, it's easier to skim and get the meaning out of a shorter sentence. However, when we think about your voice, it's important to leave some of those really long sentences in because we want people to keep thinking and, like, “Oh, yeah. This is a Corey Quinn piece,” when they read your article, whether your name is at the top of it or not.Corey: What I found is that varying the sentence structure and length also keeps reading from being fatiguing in some cases. And there are times I'll do things that are, quote-unquote, “Incorrect” to make a point stylistically. Like, normally you wouldn't put that word in italics and bold, but yeah, for this case, it is so egregious—probably Managed NAT Gateway or something—that I absolutely feel the need to wind up emphasizing the egregiousness of whatever it is I'm opining on that week.Alysha: Yes. And I think that is also part of what makes editing for you really fun is that there's a great balance of let's keep to the rules as much as we can when it makes sense, but let's be super strategic about how we break them to have better emphasis and to make it clear that this is a Corey Quinn piece.Corey: One problem that I've had, too, is understanding the difference in medium. I mean, most of my engagement with writing, when I was growing up, was books I read enthusiastically. And then I started writing a lot of newsletters and mailing lists and various written fora. I spent entirely too many years on IRC over the course of my life. And there are different rules and all of those circumstances, but never having written a book myself, how differently do you approach the editing process when you're writing something that is long-form or writing something that is essay length, or writing something that is a book?Alysha: So, I'm actually working on my first book now as the editor. So, that's a thing that I'm learning about, learning more about what that process looks like and how it's different. 
I think there's a lot more note-taking as you go along to track, you know, this is the story arc, these are the characters, what's a first reference and a second reference?Corey: When you overuse a phrase, it's easy to figure out if it's in a 2000-word essay. When you use it more than once, oh, great. Easy to spot. But okay, you write books—generally not in one sitting, I would assume—and you say, all right, that is the eighth time you have used that very particular turn of phrase. Stop it here's a thesaurus.Alysha: Totally. I don't know, maybe this is just a me thing or an editor thing, but do you notice when you hear, like, a very unique word, that's the thing, if—by the way [laugh], speaking of different, you know, if this weren't a spoken word podcast, then I would never say very unique; I would edit out ‘very' ahead of ‘unique' because unique is unique.Corey: Exactly. It's a pet peeve.Alysha: Totally. But I have very different rules for the way that we speak versus the way that we write. How fleeting are things? So, that gets back to your original question. Something that, you know, if I'm editing something quick, that's a quick hit, it's not going to live for very long. If you needed me to edit a tweet, I wouldn't spend a lot of time on that. I'm going to spend more time on things that have longer legs or that are going to a bigger platform. Books, you spend way more time editing than you would a 2000-word essay.Corey: I find that I don't have people edit tweets very often because first, it's moving too quickly for me to really take something out for opinion. The reason I'd have to do that is, “Is this too close to the edge?” Well, it turns out at this stage, if I have to ask that question, I already know the answer.Alysha: Mm-hm.Corey: Everything else is going to be more stylistic, like, “Is dogshit one word or two?” And you're, “Ah, it's a [unintelligible 00:13:24]. There we go. Excellent.” It's not the typical kind of problem or question that you would run into.Alysha: The BuzzFeed Style Guide has been a great resource for questions like dogshit [laugh].Corey: I didn't realize they had such a thing and that is absolutely amazing.Alysha: It is fantastic. Most of the internet things you need to know are there. CNN is where—well, CNN and Politico both—that's where we were always taking second eyes to look at a tweet before it goes out and you're doing that in about ten seconds. But we're looking at factual accuracy. Is there something that is about to be very wrong that we don't want to embarrass the publication with?Corey: I'm a prolific writer because I have to be. I have a content schedule that you could charitably call punishing. And that works super well with the way that I view the world, but the counterargument is that getting me to go back and review edits or go back and edit after I've written something is sort of like pulling teeth. So, something I found that works for me as a way around this is I record these essays as podcast episodes on the AWS Morning Brief. 
What that forces me to do is once the edits are in, I get one last read-through as I read it out loud in a normal speaking voice and don't power my way through it, and I'm forced to pay attention to every word at that point.And, “Oh, that doesn't quite make the point that I thought it did.” And you've edited them by this point, so it's not ever going to be something that is, “Oh, that's a run-on sentence,” or, “Huh, punctuation is probably a good idea.” It's something that is more abstract than that and often very tied to a domain-specific aspect of what I've written about. But I found that to be one of the best last filters for a lot of the stuff.Alysha: Yeah. That's a great tactic for catching errors, and… and not even errors, right? But it also comes back to, like, what's a difference between the way that we're going to write things and the way that you're going to read it out loud? I try to edit, keeping in mind that you're going to be reading these out loud, but then there are always going to be things that are going to sound better a particular way, and the way that we write them is going to be slightly different.Corey: One thing that I find as well, given that I read a lot more than I write, is that when I'm looking at articles for inclusion in a whole bunch of different places because I'm looking for creative content from the community, it is very hard for me to go ahead and greenlight including something that is poorly written. If I can't get through the first three sentences without seeing six mistakes in how the sentence is built, I judge the writing for it. It's you're talking about a technical topic, but if you can't even get to a point where the sentence is coherent, then how do I know you don't have typos littered throughout the code samples you're about to put up, or whatnot? And I don't think that that is an entirely fair assessment of mine, but it still feels like nails on a chalkboard, every time I encounter some of it.Alysha: It's actually something that's backed by research. I'm on the board for ACES, the Society for Editing, and we commissioned research about 13 years ago now, so it's getting updated. But what it showed was that readers can distinguish between edited and unedited content in significant ways. So, it may not be, like, “Oh, I know exactly how to fix this,” or, “I know exactly what's wrong with this,” but they get the sense that that content is not as reliable if it hasn't been edited. So, there's true value in exactly what you just said, in having content that's edited and the way that it makes people feel about the quality of the content that they're reading.Corey: It's one of those important things—which I'm not trying to shame people, particularly those for whom English is not a first language; you speak more languages than I do. Good for you—but I also will judge corporate blog posts far more harshly for this because it's no longer just one person. You should—in theory—have the ability to proofread and copy-edit the thing that is going out underneath your masthead. People are expensive. Writers are expensive unless you're ripping people off, which I don't advise. At least take the extra few steps to make sure that it doesn't drive people away for reasons other than the content.Alysha: Yep, I'm totally with you on that [laugh].Corey: I find myself having that same negative reaction to typos on your landing page when you describe what you do. 
There have been security vendors that I won't touch with a ten-foot pole because they talk about the standards that they follow, but they misspell the word ‘standards' on the webpage. And in a lot of these areas, details very much matter. One area that I want to get into as well that I think you and I have always been aligned on. Because I've worked with a number of editors—all of them great in different ways, I want to be very clear; I'm not trying to shame anyone—but challenges I've had from time to time have been editors who come from the marketing world who like to embody what I can only refer to as the bullshit marketing voice.And I don't know what exactly the elements of it are, but I know it when I hear it or see it. You can see this on almost every billboard out there, every press release that goes out. If I were to talk to a human in a way that the press release talks about the product and company, it would not go well for me, just because I would come across as incredibly condescending, entirely too self-promotional, and there's just something about the way that it's written that feels off-putting so much of my online persona and approaches have come from simply calling out the subtext in an awful lot of unfortunate marketing communications. You've never had a problem with that. I have never once looked at something you've edited for me and put something in where it's, “Ooh. That sounds a little bit too market-y.” And again, I consider myself something of a marketer. This is not me disliking marketing; it's disliking bad marketing. I don't believe that that's the sort of thing that just emerges out of nowhere. What's your history of marketing?Alysha: So, I did start working in marketing at Intuit QuickBooks a few years ago, back in 2018, when I moved out of journalism. So, I think the way that I approach marketing, and content marketing in particular, is always very journalistic. My bullshit meter really goes off, too, when I read something that's like, “You have a claim to back that,” or, “Oh, the evidence that you're using to back that is really thin.” And it just… it's just icky, right? Like, none of us like to feel like we're being marketed to in that way.And you're right, you would like, you would turn tail and run if somebody started talking to you that way in real life. So yeah, so it's just sort of a combination of journalistic instinct and like, you know, a lot of times, if you just say something straight, if you just say the truth, it comes out with even more impact than if you tried to fluff it up with marketing speak.Corey: The thing that I wish companies would figure out is that when you go out and talk about your product and mention the things it's bad at, it really engenders an awful lot of trust. Because it's not like you're going to hide that from the first people to use it, so call it out upfront that this is an area it's weak in. And that is anathema to some folks where they believe that you can say something is good, something is great, or you can stop talking. But it is unhelpful to the people you're trying to reach. I'm sure there are reasons for this. I don't believe for a second that I know better than the entire field of marketing, let's be clear here. But I know what I want to read what I'm trying to get when I'm presented with new information about a product or a service.Alysha: I think it just scares the bejesus out of people to think that they are going to publicly admit to things that aren't great. Yeah, and I don't know what the idea is after that. 
Like, that we're just going to sweep it under the rug and hope nobody notices, or try to work on it in the background, and until then, we'll just talk about, you know, our one huge talking point and tell you that it's the best, most amazing in the world. It comes off as disingenuous to the rest of us. And that is something that you are not. You are… you're definitely the antithesis of that. You're very trusted because you call out all of the things that aren't quite right in a very honest way.

Corey: And people love it until it's their turn in the hot seat, I think. That tends to be, "Well, hang on, my product is perfect." I assure you it's not, but that's okay.

Alysha: To be fair, you're also very good at calling out what others do great, and maybe in a way that you don't always get credit for, but—

Corey: Well, no, I've done experiments on this. When I am unflinchingly positive about some aspect of what a cloud provider or other vendor has done, or a feature that I really like, it gets almost no notice. But when I say, "Oh, and this part is crap," that's the part that blows up and goes around the internet a bunch of times. And I think that's human nature. I don't know if, as an editor, you have a way to fix human nature, but if you do, I'm very interested to learn it.

Alysha: [laugh]. No, we just all love to bitch and to talk about our pain points, and when somebody says it and all you can say is, "Yes, plus one million," then it's going to get a lot of play.

Corey: One aspect of what you do did scare me initially when we first started working together, and that was—you do a few things: you write as well, which, that's not scary. I would expect someone who can edit would also know how to write. That does make sense. But you also do some SEO-facing work. And that in many ways feels like it is modern-day witch doctor-y because my approach to SEO has always been naive but also effective.

I write compelling, original content that people like and, as a result, link against or refer to. And I find the rest of it really takes care of itself. I haven't spent deep effort or large amounts of brain sweat figuring out how to appear at the top of Google search results for various terms.
Which again, we do have a guest author program, but that's one of those, yeah, if that happens, we're paying you and then throwing an editor—read as you—to whoever it is that's contributing that so it comes out something that we're thrilled to have up there. But money flows one direction in that and it's from us to the guest author. Instead there, “Oh, we're going to provide high-quality content,” or they'll link to something on the site, usually a newsletter back issue, and say, “Hey, include our link to this because it's relevant,” and it's clearly for SEO juice. And first, I'm sure Google and the other search engines would just love if I suddenly have a bunch of crappy links to low-quality sites. But further, it doesn't serve the audience in any meaningful way, and… it just irks me.Alysha: And when you start playing that game, you get into the middle of all of that the link-swapping, and trying to up their SEO juice, and it is wild the amount of money that people will offer to pay for a link on a reputable site. It's super valuable. So, the way that I approach linking in your pieces is exactly the same way that I did it in news, which is, where do we need to show our sources, where will people want to verify information? Let's just go ahead and give them that link. And that's about it. Like, what do people need to know?Corey: I always worry, on some level, that I'm thinking about this all wrong. But if I'm being snarky and sarcastic with all of the SEO people emailing me who then try to offer me SEO services, it's frankly, if that's what I'm looking for, shouldn't I just Google ‘SEO' and pick whoever's at the top of the list? Because they clearly get it in one.Alysha: [laugh].Corey: It turns out, for some reason, they don't really have a good rejoinder to that when you ask them directly.Alysha: [laugh]. I love that. And might I mention, when I do search for topics that I know you guys have articles on, I won't necessarily include The Duckbill Group, but you do show up because you are a reputable and authoritative source who does not play the SEO game.Corey: I do have one more question that lies down this path that I'm actually deeply curious about, and I've always found it to be something that is incredibly helpful for my purposes. But your background is in journalism and writing and editing. It is not—for some unknown reason—the world of cloud. Almost like you want to be happy or something.Alysha: [laugh].Corey: How approachable or unapproachable is my writing to someone who does not live in the space the way I do?Alysha: Oh, that's a really interesting question. So, I'm married to somebody who spent ten years, just about, working specifically around AWS, in the—Corey: It took that long for them to stop the billing. I get it.Alysha: [laugh]. And my brother-in-law is also a software engineer. So, I have witnessed enough conversations between the two of them that I had a decent idea of what I was getting into. And those two are my resources when I have stupid questions that I don't want to ask you [laugh] in a Google Doc comment. So, I go to them, I get the lowdown, I do a little research sometimes, but by and large, we're talking about bigger concepts, and I think sometimes it might even help that I'm not in the weeds on some of the details of things that you talk about because it helps me see patterns that I'm a little—I can make some connections that maybe you're not making in the middle of the weeds.Corey: It's always tricky to figure out where to level-set what I'm talking about. 
I don't want to turn every article to have the first 18 pages of it be a primer on what Cloud computing is. I have to assume, at least on some level, people have a baseline level of understanding. But there are times I go too far in the other direction where I assume that, “Oh, well, I used to be a software engineer, so I'm going to write as if everyone reading is.”In fact, the audience is not overwhelmingly populated by purely software engineers. There's a lot of systems folk, there's a lot of managers at a variety of different levels, ranging from line management to executive, and it really takes all kinds. I'm always surprised when people reach out and mention they've been reading for a while, and then they describe what they do for a day job and it's nothing I would have ever considered. It would not have occurred to me early on that people who spent their entire life in the finance department would find most of what I talk about that isn't cost related to be interesting. But they assure me they do. Okay.Alysha: That makes sense because it gives them insight into what the other half of the business is doing.Corey: On some level, what I've found is you have to pick—and it can vary; it can honestly vary even within a piece, but at every given point, I feel like you have to have someone in mind that you're writing for because otherwise you're trying to write for everyone and in so doing, you write something that's valuable to no one.Alysha: Do you remember how much I hammered on you about who your audience was at the beginning of every single article when I first started editing?Corey: Oh, yes. That's what shaped the ideas. I mean, honestly, if you were telling me the same thing, now that you were two-and-a-half years ago, I'd wonder if—in your case—if I was even reading the notes that you put into these things. Editors make your writing better, but they also longer-term make you a better writer, is my firm belief.Alysha: Oh, that's lovely.Corey: I'm assuming that the mistakes I make are at least more interesting now as opposed to some of the ones that we had long conversations about. I hope.Alysha: Totally. It is interesting every time.Corey: So, I have to ask, given as someone who is a big believer in writing, and because it's a way of expressing myself and giving myself a platform that doesn't require me to be in the same room as a bunch of other people or them to be willing to fire up a podcast and listen to me or watch a video, they can access it anywhere they are at any given point in time, I love the writing process, but the editing process is challenging for me. You have—seem to be on the other side of that where you are much happier editing than you are writing. At least that's my perception of you and your background. If that is accurate, how do you think about this stuff because it's foreign to me?Alysha: That's really funny. So, I actually started out thinking that I wanted—well, I wanted to work with words, and I thought that the way that you could do that was by writing, and specifically reporting. So, that was the track I went down. And it actually wasn't until my first full-time job out of college—I was working as a copy editor at Politico—that I realized that I could wake up and edit every single day. That was what I had the energy for.And when I say wake up and edit, I mean, it was 4 a.m. and we were editing the newsletters that had to go out by 6 a.m. so quite literally, it was the thing I could wake up and do. 
And I think what I really love about it is taking something that's already good, that's already great in a lot of instances, and making it better, so it's just that little bit more clear, more understandable, that your message is getting across in a way that still feels authentic to you. Because I can tell you one of my least favorite things as a writer was having someone come through and edit and I could tell you every single spot that that editor had touched. And it sort of… it burned. It just didn't feel quite right.Corey: Suddenly the voice switches, like effectively, you have someone whose voice sounds like you, for example, and then for half a sentence, it suddenly sounds like James Earl Jones is delivering it, and then it goes back to your voice. It's hmm.Alysha: Totally. So, with my experience of editing as a writer, my goal is to make that as seamless as possible. So, I want to show you the changes that you're going to be most interested in and that I think you might want to learn from. And the changes that I do make, I want them to sound just like you.Corey: Honestly, because there's usually a week or two in time that happens between me writing a draft or something, and then going back, when you've just automatically made some of the quick rewrites on the fly, unless I go looking, I never realize which parts you've touched or not. And I'm the one that wrote it. So, I guess, honestly, you're in a terrific position to put words in my mouth if you want to. Have fun. But that is, to me, the mark of an editor who gets it.I just find it scary, on some level, to the idea, from my perspective, of fading into the background. I always lived in fear of not having my name front and center and being in the spotlight, for good or bad, just because it's that's who I am. That's what I bias for.Alysha: That's really interesting. Totally makes sense because you are very front and center. When I was working at Reuters in Brussels, one of the things I think is really cool that they do is they put their editors' byline at the bottom of articles, so the editors do get a hat-tip of recognition. But I think as somebody who's a little bit of a helper, I just get a lot of enjoyment out of making other people's stuff better.Corey: I can certainly say that you've been a smashing success from my perspective, although I'm sure now you're going to be inundated with people who are urging you to, “Okay, now make what he says less bullshit or at least something that I can agree with.” Unfortunately, it doesn't quite work that way most of the time.Alysha: No.Corey: Though you are getting very proficient at sanding off some of the more colorful metaphors. Thank you for that.Alysha: [laugh]. Anytime. I got to keep my [unintelligible 00:33:28], too.Corey: If people want to learn more, where's the best place for them to find you, and—take this as a personal recommendation—hire you to edit their stuff, so I don't have to claw my eyes out as much when I read their things?Alysha: You can find me at payettemediahouse.com. P-A-Y-E-T-T-E Media House.Corey: And we will, of course, put a link to that in the [show notes 00:33:50]. Thank you so much for taking the time to go through something different with me in a stranger way than we normally wind up communicating, which is via tracking changes.Alysha: [laugh]. I love it. It's nice to see your face.Corey: It really is. I have a face for radio though, so it's only for so long. Alysha Love, Executive Editor at Payette Media House. 
I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that I will absolutely not be reading because you've [BLEEP]-ed the subject-verb agreement in your first sentence.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
The Evolving Role of a Software Engineer with Forrest Brazeal

Screaming in the Cloud

Play Episode Listen Later Aug 17, 2023 37:04


Forrest Brazeal, Head of Developer Media at Google Cloud, joins Corey on Screaming in the Cloud to discuss how AI, current job markets, and more are impacting software engineers. Forrest and Corey explore whether AI helps or hurts developers, and what impact it has on the role of a junior developer and the rest of the development team. Forrest also shares his viewpoints on how he feels AI affects people in creative roles. Corey and Forrest discuss the pitfalls of a long career as a software developer, and how people can break into a career in cloud as well as the necessary pivots you may need to make along the way. Forrest then describes why he feels workers are currently staying put where they work, and how he predicts a major shift will happen when the markets shift.About ForrestForrest is a cloud educator, cartoonist, author, and Pwnie Award-winning songwriter. He currently leads the content marketing team at Google Cloud. You can buy his book, The Read Aloud Cloud, from Wiley Publishing or attend his talks at public and private events around the world.Links Referenced: Personal Website: https://goodtechthings.com Newsletter signup: https://cloud.google.com/innovators TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn and I am thrilled to have a returning guest on, who has been some would almost say suspiciously quiet over the past year or so. Forrest Brazeal is the Head of Developer Media over at Google Cloud, and everyone sort of sits there and cocks their head, like, “What does that mean?” And then he says, “Oh, I'm the cloud bard.” And everyone's, “Oh, right. Get it: the song guy.” Forrest, welcome back.Forrest: Thanks, Corey. As always, it's great to be here.Corey: So, what have you been up to over the past, oh let's call it, I don't know, a year, since I think, is probably the last time you're on the show.Forrest: Well, gosh, I mean, for one thing, it seems like I can't call myself the cloud bard anymore because Google rolled out this thing called Bard and I've started to get some DMs from people asking for, you know, tech support on Bard. So, I need to make that a little bit clearer that I do not work on Bard. I am a lowercase bard, but I was here first, so if anything, you know, Google has deprecated me.Corey: Honestly, this feels on some level like it's more cloudy if we define cloudy as what, you know, Amazon does because they launched a quantum computing service about six months after they launched some unrelated nonsense that they called [QuantumDB 00:01:44], which you'd think if you're launching quantum stuff, you'd reserve the word quantum for that. But no, they're going to launch things that stomp all over other service names as well internally, so customers just wind up remarkably confused. So, if you find a good name, just we're going to slap it on everything, seems to be the way of cloud.Forrest: Yeah, naming things has proven to be harder than either quantum computing or generative AI at this point, I think.Corey: And in fairness, I will point out that naming things is super hard; making fun of names is not. So, that is—everyone's like, “Wow, you're so good at making fun of names. 
Can you name something well?” [laugh]. Absolutely not.Forrest: Yeah, well, one of the things you know, that I have been up to over the past year or so is just, you know, getting to learn more about what it's like to have an impact in a very, very large organizational context, right? I mean, I've worked in large companies before, but Google is a different size and scale of things and it takes some time honestly, to, you know, figure out how you can do the best for the community in an environment like that. And sometimes that comes down to the level of, like, what are things called? How do we express things in a way that makes sense to everyone and takes into account people's different communication styles and different preferences, different geographies, regions? And that's something that I'm still learning.But you know, hopefully, we're getting to a point where you're going to start hearing some things come out of Google Cloud that answer your questions and makes sense to you. That's supposed to be part of my job, anyway.Corey: So, I want to talk a bit about the idea of generative AI because there has been an awful lot of hype in the space, but you have never given me a bum steer. You have always been a level-headed, reasonable voice. You are not—to my understanding—a VC trying desperately to prop up an industry that you may or may not believe in, but you are financially invested into. What is your take on the last, let's call it, year of generative AI enhancements?Forrest: So, to be clear, while I do have a master's degree in interactive intelligence, which is kind of AI adjacent, this is not something that I build with day-to-day professionally. But I have spent a lot of time over the last year working with the people who do that and trying to understand what is the value that gen AI can bring to the domains that I do care about and have a lot of interest in, which of course, are cloud developers and folks trying to build meaningful enterprise applications, take established workloads and make them better, and as well work with folks who are new to their careers and trying to figure out, you know, what's the most appropriate technology for me to bet on? What's going to help me versus what's going to hurt me?And I think one of the things that I have been telling people most frequently—because I talk to a lot of, like, new cloud learners, and they're saying, “Should I just drop what I'm doing? Should I stop building the projects I'm working on and should I instead just go and get really good at generating code through something like a Bard or a ChatGPT or what have you?” And I went down a rabbit hole with this, Corey, for a long time and spent time building with these tools. And I see the value there. I don't think there's any question.But what has come very, very clearly to the forefront is, the better you already are at writing code, the more help a generative AI coding assistant is going to give you, like a Bard or a ChatGPT, what have you. So, that means the way to get better at using these tools is to get better at not using these tools, right? 
The more time you spend learning to code without AI input, the better you'll be at coding with AI input.Corey: I'm not sure I entirely agree because for me, the wake-up call that I had was a singular moment using I want to say it was either Chat-Gippity—yes, that's how it's pronounced—or else it was Gif-Ub Copilot—yes, also how it's pronounced—and the problem that I was having was, I wanted to query probably the worst API in the known universe—which is, of course, the AWS pricing API: it returns JSON, that kind of isn't, it returns really weird structures where you have to correlate between a bunch of different random strings to get actual data out of it, and it was nightmarish and of course, it's not consistent. So, I asked it to write me a Python script that would contrast the hourly cost of a Managed NAT gateway in all AWS regions and return a table sorted by the most to least expensive. And it worked.Now, this is something that I could have done myself in probably half a day because my two programming languages of choice remain brute force and enthusiasm, but it wound up taking away so much of the iterative stuff that doesn't work of oh, that's not quite how you'd handle that data structure. Oh, you think it's a dict, but no, it just looks like one. It's a string first; now you have to convert it, or all kinds of other weird stuff like that. Like, this is not senior engineering work, but it really wound up as a massive accelerator to get the answer I was after. It was almost an interface to a bad API. Or rather, an interface to a program—to a small script that became an interface itself to a bad API.Forrest: Well, that's right. But think for a minute, Corey, about what's implicit in that statement though. Think about all the things you had to know to get that value out of ChatGPT, right? You had to know, A, what you were looking for: how these prices worked, what the right price [style 00:06:52] was to look for, right, why NAT gateway is something you needed to be caring about in the first place. There's a pretty deep stack of things—actually, it's what we call a context window, right, that you needed to know to make this query take a half-day of work away from you.And all that stuff that you've built up through years and years of being very hands-on with this technology, you put that same sentence-level task in the hands of someone who doesn't have that background and they're not going to have the same results. So, I think there's still tremendous value in expanding your personal mental context window. The more of that you have, the better and faster results you're going to get.Corey: Oh, absolutely. I do want to steer away from this idea that there needs to be this massive level of subject matter expertise because I don't disagree with it, but you're right, the question I asked was highly contextual to the area of expertise that I have. But everyone tends to have something like that. If you're a marketer for example, and you wind up with an enormous pile of entrants on a feedback form, great. Can you just dump it all in and say, can you give me a sentiment analysis on this?I don't know how to run a sentiment analysis myself, but I'm betting that a lot of these generative AI models do, or being able to direct me in the right area on this. The question I have is—it can even be distilled down into simple language of, “Here's a bunch of comments. 
Do people love the thing or hate the thing?” There are ways to get there that apply, even if you don't have familiarity with the computer science aspects of it; you definitely have insight into the problem you are trying to solve.Forrest: Oh, yeah, I don't think we're disagreeing at all. Domain expertise seems to produce great results when you apply it to something that's tangential to your domain expertise. But you know, I was at an event a month or two ago, and I was talking to a bunch of IT executives about ChatGPT and these other services, and it was interesting. I heard two responses when we were talking about this. The first thing that was very common was I did not hear any one of these extremely, let's say, a little bit skeptical—I don't want to say jaded—technical leaders—like, they've been around a long time; they've seen a lot of technologies come and go—I didn't hear a single person say, “This is something that's not useful to me.”Every single one of them immediately was grasping the value of having a service that can connect some of those dots, can in-between a little bit, if you will. But the second thing that all of them said was, “I can't use this inside my company right now because I don't have legal approval.” Right? And then that's the second round of challenges is, what does it look like to actually take these services and make them safe and effective to use in a business context where they're load-bearing?Corey: Depending upon what is being done with them, I am either sympathetic or dismissive of that concern. For example, yesterday, I wound up having fun with it, and—because I saw a query, a prompt that someone had put in of, “Create a table of the US presidents ranked by years that they were in office.” And it's like, “Okay, that's great.” Like, I understand the value here. But if you have a magic robot voice from the future in a box that you can ask it any question and as basically a person, why not have more fun with it?So, I put to it the question of, “Rank the US presidents by absorbency.” And it's like, “Well, that's not a valid way of rating presidential performance.” I said, “It is if I have a spill and I'm attempting to select the US president with which to mop up the spill.” Like, “Oh, in that case, here you go.” And it spat out a bunch of stuff.That was fun and exciting. But one example it gave was that it ranked Theodore Roosevelt very highly. Teddy Roosevelt was famous for having a mustache. That might be useful to mop up a spill. Now, I never would have come up in isolation with the idea of using a president's mustache to mop something up explicitly, but that's a perfect writer's room style Yes, And approach that I could then springboard off of to continue iterating on if I'm using that as part of something larger. That is a far cry from copying and pasting whatever it has to say into an email, whacking send before realizing it makes no sense.Forrest: Yeah, that's right. And of course, you can play with what we call the temperatures on these models, right, to get those very creative, off-the-wall kind of answers, or to make them very, kind of, dry and factual on the other end. And Google Cloud has been doing some interesting things there with Generative AI Studio and some of the new features that have come to Vertex AI. But it's just—it's going to be a delicate dance, honestly, to figure out how you tune those things to work in the enterprise.Corey: Oh, absolutely. I feel like the temperature dial should instead be relabeled as ‘corporate voice.'
Like, do you want a lot of it or a little of it? And of course, they have to invert it. But yeah, the idea is that, for some things, yeah, you definitely just want a just-the-facts style of approach.Another demo that I saw, for example, that I thought showed a lack of imagination was, “Here's a transcript of a meeting. Extract all the to-do items.” Okay. Yeah, I suppose that works, but what about, here's a transcript of the meeting. Identify who the most unpleasant, passive-aggressive person in this meeting is to work with.And to its credit—because of course this came from something corporate, none of the systems that I wound up running that particular query through could identify anyone because of course the transcript was very bland and dry and not actually how human beings talk, other than in imagined corporate training videos.Forrest: Yes, well again, I think that gets us into the realm of just because you can doesn't mean you should use it for this.Corey: Oh, I honestly, most of what I use this stuff for—or use anything for—should be considered a cautionary tale as opposed to guidance for the future. You write parody songs a fair bit. So do I, and I've had an attempt to write versions of, like, write parody lyrics for some random song about this theme. And it's not bad, but for a lot of that stuff, it's not great, either. It is a starting point.Forrest: Now, hang on, Corey. You know, as well as I do that I don't write parody songs. We've had this conversation before. A parody is using existing music and adding new lyrics to it. I write my own music and my own lyrics and I'll have you know, that's an important distinction. But—Corey: True.Forrest: I think you're right on that, you know, having these services give you creative output. What you're getting is an average of a lot of other creative output, right, which is—could give you a perfectly average result, but it's difficult to get a first pass that gives you something that really stands out. I do also find, as a creative, that starting with something that's very average oftentimes locks me into a place where I don't really want to be. In other words, I'm not going to potentially come up with something as interesting if I'm starting with a baseline like that. It's almost a little bit polluting to the creative process.I know there's a lot of other creatives that feel that way as well, but you've also got people that have found ways to use generative AI to stimulate some really interesting creative things. And I think maybe the example you gave of the president's rank by absorbency is a great way to do that. Now, in that case, the initial creativity, a lot of it resided in the prompt, Corey. I mean, you're giving it a fantastically creative, unusual, off-the-wall place to start from. And just about any average of five presidents that come out of that is going to be pretty funny and weird because of just how funny and weird the idea was to begin with. That's where I think AI can give you that great writer's room feel.Corey: It really does. It's a Yes, And approach where there's a significant way that it can build on top of stuff. I've been looking for a, I guess, a writer's room style of approach for a while, but it's hard to find the right people who don't already have their own platform and voice to do this. And again, it's not a matter of payment. 
I'm thrilled to basically pay any reasonable amount of money to build a writer's room here of people who get the cloud industry to work with me and workshop some of the bigger jokes.The challenge is that those people are very hard to find and/or are conflicted out. Having just a robot with infinite patience for tomfoolery—because the writing process can look kind of dull and crappy until you find the right thing—has been awesome. There's also a sense of psychological safety in not poisoning people. Like, “I thought you were supposed to be funny, but this stuff is all terrible. What's the deal here?” I've already poisoned that well with my business partner, for example.Forrest: Yeah, there's only so many chances you get to make that first impression, so why not go with AI that never remembers you or any of your past mistakes?Corey: Exactly. Although the weird thing is that I found out that when they first launched Chat-Gippity, it already knew who I was. So, it is in fact familiar, so at least my early work of my entire—I guess my entire life. So that's—Forrest: Yes.Corey: —kind of worrisome.Forrest: Well, I know it credited to me books I hadn't written and universities I hadn't attended and all kinds of good stuff, so it made me look better than I was.Corey: So, what have you been up to lately in the context of, well I said generative AI is a good way to start, but I guess we can also call it at Google Cloud. Because I have it on good authority that, marketing to the contrary, all of the cloud providers do other things in addition to AI and ML work. It's just that's what's getting in the headline these days. But I have noticed a disturbing number of virtual machines living in a bunch of customer environments relative to the amount of AI workloads that are actually running. So, there might be one or two other things afoot.Forrest: That's right. And when you go and talk to folks that are actively building on cloud services right now, and you ask them, “Hey, what is the business telling you right now? What is the thing that you have to fix? What's the thing that you have to improve?” AI isn't always in the conversation.Sometimes it is, but very often, those modernization conversations are about, “Hey, we've got to port some of these services to a language that the people that work here now actually know how to write code in. We've got to find a way to make this thing a little faster. Or maybe more specifically, we've got to figure out how to make it run at the same speed while using less or less expensive resources.” Which is a big conversation right now. And those are conversations as old as time. They're not going away, and so it's up to the cloud providers to continue to provide services and features that help make that possible.And so, you're seeing that, like, with Cloud Run, where they've just announced this CPU Boost feature, right, that gives you kind of an additional—it's like a boost going downhill or a push on the swing as you're getting started to help you get over that cold-start penalty. Where you're seeing the session affinity features for Cloud Run now where you have the sticky session ability that might allow you to use something like, you know, a container-backed service like that, instead of a more traditional load balancer service that you'd be using in the past.
So, you know, just, you take your eye off the ball for a minute, as you know, and 10 or 20, more of these feature releases come out, but they're all kind of in service of making that experience better, broadening the surface area of applications and workloads that are able to be moved to cloud and able to be run more effectively on cloud than anywhere else.Corey: There's been a lot of talk lately about how the idea of generative AI might wind up causing problems for people, taking jobs away, et cetera, et cetera. You almost certainly have a borderline unique perspective on this because of your work with, honestly, one of the most laudable things I've ever seen come out of anywhere which is The Cloud Resume Challenge, which is a build a portfolio site, then go ahead and roll that out into how you interview. And it teaches people how to use cloud, step-by-step, you have multi-cloud versions, you have them for specific clouds. It's nothing short of astonishing. So, you find yourself talking to an awful lot of very early career folks, folks who are transitioning into tech from other places, and you're seeing an awful lot of these different perspectives and AI plays come to the forefront. How do you wind up, I guess, making sense of all this? What guidance are you giving people who are worried about that?Forrest: Yeah, I mean, I, you know—look, for years now, when I get questions from these, let's call them career changers, non-traditional learners who tend to be a large percentage, if not a plurality, of the people that are working on The Cloud Resume Challenge, for years now, the questions that they've come to me with are always, like, you know, “What is the one thing I need to know that will be the magic technology, the magic thing that will unlock the doors and give me the inside track to a junior position?” And what I've always told them—and it continues to be true—is, there is no magic thing to know other than magically going and getting two years of experience, right? The way we hire juniors in this industry is broken, it's been broken for a long time, it's broken not because of any one person's choice, but because of this sort of tragedy of the commons situation where everybody's competing over a dwindling pool of senior staff level talent and hopes that the next person will, you know, train the next generation for them so they don't have to expend their energy and interview cycles and everything else on it. And as long as that remains true, it's just going to be a challenge to stand out.Now, you'll hear a lot of people saying that, “Well, I mean, if I have generative AI, I'm not going to need to hire a junior developer.” But if you're saying that as a hiring manager, as a team member, then I think you always had the wrong expectation for what a junior developer should be doing. A junior developer is not your mini me who sits there and takes the little challenges, you know, the little scripts and things like that are beneath you to write. And if that's how you treat your junior engineers, then you're not creating an environment for them to thrive, right? 
A junior engineer is someone who comes in who, in a perfect world, is someone who should be able to come in almost in more of an apprentice context, and somebody should be able to sit alongside you learning what you know, right, and having education integrated into their actual job experience so that at the end of that time, they're able to step back and actually be a full-fledged member of your team rather than just someone that you kind of throw tasks over the wall to, and they don't have any career advancement potential out of that.So, if anything, I think the advancement of generative AI, in a just world, ought to give people a wake-up call that, hey, training the next generation of engineers is something that we're actually going to have to actively create programs around, now. It's not something that we can just, you know, give them the scraps that fall off of our desks. Unfortunately, I do think that in some cases, the gen AI narrative more than the reality is being used to help people put off the idea of trying to do that. And I don't believe that that's going to be true long-term. I think that if anything, generative AI is going to open up more need for developers.I mean, it's generating a lot of code, right, and as we know, Jevons paradox says that when you make it easier to use something and there's elastic demand for that thing, the amount of creation of that thing goes up. And that's going to be true for code just like it was for electricity and for code and for GPUs and who knows what all else. So, you're going to have all this code that has a much lower barrier of entry to creating it, right, and you're going to need people to harden that stuff and operate it in production, be on call for it at three in the morning, debug it. Someone's going to have to do all that, you know? And what I tell these junior developers is, “It could be you, and probably the best thing for you to do right now is to, like I said before, get good at coding on your own. Build as much of that personal strength around development as you can so that when you do have the opportunity to use generative AI tools on the job, that you have the maximum amount of mental context to put around them to be successful.”Corey: I want to further point out that there are a number of folks whose initial reaction to a lot of this is defensiveness. I showed that script that wound up spitting out the Managed NAT gateway ranked-by-region table to one of our contract engineers, who's very senior. And the initial response I got from them was almost defensive, were, “Okay, yeah. That'll wind up taking over, like, a $20 an hour Upwork coder, but it's not going to replace a senior engineer.” And I felt like that was an interesting response psychologically because it felt defensive for one, and two, not for nothing, but senior developers don't generally spring fully formed from the forehead of some ancient God. They start off as—dare I say it—junior developers who learn and improve as they go.So, I wonder what this means. If we want to get into a point where generative AI takes care of all the quote-unquote, “Easy programming problems,” and getting the easy scripts out, what does that mean for the evolution and development of future developers?Forrest: Well, keep in mind—Corey: And that might be a far future question.Forrest: Right. That's an argument as old as time, right, or a concern is old as time and we hear it anew with each new level of automation. 
So, people were saying this a few years ago about the cloud or about virtual machines, right? Well, how are people going to, you know, learn how to do the things that sit on top of that if they haven't taken the time to configure what's below the surface? And I'm sympathetic to that argument to some extent, but at the same time, I think it's more important to deal with the reality we have now than try to create an artificial version of realities past.So, here's the reality right now: a lot of these simple programming tasks can be done by AI. Okay, that's not likely to change anytime soon. That's the new reality. So now, what does it look like to bring on juniors in that context? And again, I think that comes down to don't look at them as someone who's there just to, you know, be a pair of hands on a keyboard, spitting out tiny bits of low-level code.You need to look at them as someone who needs to be, you know, an effective user of generative AI services, but also someone who is being trained and given access to the things they'll need to do on top of that, so the architectural decisions, the operational decisions that they'll need to make in order to be effective as a senior. And again, that takes buy-in from a team, right, to make that happen. That is not going to happen automatically. So, we'll see. That's one of those things that's very hard to automate: the interactions between people and the growth of people. It takes people that are willing to be mentors.Corey: I'm also curious as to how you see the guidance shifting as computers get better. Because right now, one of my biggest problems that I see is that if I have an idea for a company I want to start or a product I want to build that involves software, step one is, learn to write a bunch of code. And I feel like there's a massive opportunity for skipping aspects of that, where I effectively have the robot build me the MVP that I describe. Think drag-and-drop to build a web app style of approach.And the obvious response to that is, well, that's not going to go to hyperscale. That's going to break in a bunch of different ways. Well, sure, but I can get an MVP out the door to show someone without having to spend a year building it myself by learning the programming languages first, just to throw away as soon as I hire someone who can actually write code. It cuts down that cycle time massively, and I can't shake the feeling that needs to happen.Forrest: I think it does. And I think, you know, you were talking about your senior engineer that had this kind of default defensive reaction to the idea that something like that could meaningfully intrude on their responsibilities. And I think if you're listening to this and you are that senior engineer, you're five or more years into the industry and you've built your employability on the fact that you're the only person who can rough out these stacks, I would take a very, very hard look at yourself and the value that you're providing. And you say, you know—let's say that I joined a startup and the POC was built out by this technical—or possibly the not-that-technical co-founder, right—they made it work and that thing went from, you know, not existing to having users in the span of a week, which we're seeing more now and we're going to see more and more of. Okay, what does my job look like in that world?
What am I actually coming on to help with?Am I—I'm coming on probably to figure out how to scale that thing and make it maintainable, right, operate it in a way that is not going to cause significant legal and financial problems for the company down the road. So, your role becomes less about being the person that comes in and does this totally greenfield thing from scratch and becomes more about being the person who comes in as the adult in the room, technically speaking. And I think that role is not going away. Like I said, there's going to be more of those opportunities rather than less. But it might change your conception of yourself a little bit, how you think about yourself, the value that you provide, now's the time to get ahead of that.Corey: I think that it is myopic and dangerous to view what you do as an engineer purely through the lens of writing code because it is a near certainty that if you are learning to write code today and build systems involving technology today, that you will have multiple careers between now and retirement. And in fact, if you're entering the workforce now, the job that you have today will not exist in anything remotely approaching the same way by the time you leave the field. And the job you have then looks borderline unrecognizable, if it even exists at all today. That is the overwhelming theme that I've got on this ar—the tech industry moves quickly and is not solidified like a number of other industries have. Like, accountants: they existed a generation ago and will exist in largely the same form a generation from now.But software engineering in particular—and cloud, of course, as well, tied to that—have been iterating so rapidly, with such sweepingly vast changes, that that is something that I think we're going to have a lot of challenge with, just wrestling with. If you want a job that doesn't involve change, this is the wrong field.Forrest: Is it the wrong field. And honestly, software engineering is, has been, and will continue to be a difficult business to make a 40-year career in. And this came home to me really strongly. I was talking to somebody a couple of months ago who, if I were to say the name—which I won't—you and I would both know it, and a lot of people listening to this would know as well. This is someone who's very senior, very well respected is, by name, identified in large part with the creation of a significant movement in technology. So, someone who you would never think of would be having a problem getting a job.Corey: Is it me? And is it Route 53 as a database, as the movement?Forrest: No, but good guess.Corey: Excellent.Forrest: This is someone I was talking to because I had just given a talk where I was pleading with IT leaders to take more responsibility for building on-ramps for non-traditional learners, career changers, people that are doing something a little different with their career. And I was mainly thinking of it as people that had come from a completely non-technical background or maybe people that were you know, like, I don't know, IT service managers with skills 20 years out of date, something like that. But this is a person who you and I would think of as someone at the forefront, the cutting edge, an incredibly employable person. And this person was a little bit farther on in their career and they came up to me and said, “Thank you so much for giving that talk because this is the problem I have. 
Every interview that I go into, I get told, ‘Oh, we probably can't afford you,' or, ‘Oh well, you say you want to do AI stuff now, but we see that all your experience is doing this other thing, and we're just not interested in taking a chance on someone like that at the salary you need to be at.'” and this person's, like, “What am I going to do? I don't see the roadmap in front of me anymore like I did 10, 15, or 20 years ago.”And I was so sobered to hear that coming from, again, someone who you and I would consider to be a luminary, a leading light at the top of the, let's just broadly say IT field. And I had to go back and sit with that. And all I could come up with was, if you're looking ahead and you say I want to be in this industry for 30 years, you may reach a point where you have to take a tremendous amount of personal control over where you end up. You may reach a point where there is not going to be a job out there for you, right, that has the salary and the options that you need. You may need to look at building your own path at some point. It's just it gets really rough out there unless you want to continue to stagnate and stay in the same place. And I don't have a good piece of advice for that other than just you're going to have to find a path that's unique to you. There is not a blueprint once you get beyond that stage.Corey: I get asked questions around this periodically. The problem that I have with it is that I can't take my own advice anymore. I wish I could. But what I used to love doing was, every quarter or so, I'd make it a point to go on at least one job interview somewhere else. This wound up having a few great features.One, interviewing is a skill that atrophies if you don't use it. Two, it gives me a finger on the pulse of what the market is doing, what the industry cares about. I dismissed Docker the first time I heard about it, but after the fourth interview where people were asking about Docker, okay, this is clearly a thing. And it forced me to keep my resume current because I've known too many people who spend seven years at a company and then wind up forgetting what they did years three, four, and five, where okay, then what was the value of being there? It also forces you to keep an eye on how you're evolving and growing or whether you're getting stagnant.I don't ever want to find myself in the position of the person who's been at a company for 20 years and gets laid off and discovers to their chagrin that they don't have 20 years of experience; they have one year of experience repeated 20 times. Because that is a horrifying and scary position to be in.Forrest: It is horrifying and scary. And I think people broadly understand that that's not a position they want to be in, hence why we do see people that are seeking out this continuing education, they're trying to find—you know, trying to reinvent themselves. I see a lot of great initiative from people that are doing that. But it tends to be more on the company side where, you know, they get pigeonholed into a position and the company that they're at says, “Yeah, no. We're not going to give you this opportunity to do something else.”So, we say, “Okay. Well, I'm going to go and interview other places.” And then other companies say, “No, I'm not going to take a chance on someone that's mid-career to learn something brand new. 
I'm going to go get someone that's fresh out of school.” And so again, that comes back to, you know, where are we as an industry on making space for non-traditional learners and career changers to take the maturity that they have, right, even if it's not specific familiarity with this technology right now, and let them do their thing, let them get untracked.You know, there's tremendous potential being untapped there and wasted, I would say. So, if you're listening to this and you have the opportunity to hire people, I would just strongly encourage you to think outside the box and consider people that are farther on in their careers, even if their technical skill set doesn't exactly line up with the five pieces of technology that are on your job req, look for people that have demonstrated success and ability to learn at whatever [laugh] the things are that they've done in the past, people that are tremendously highly motivated to succeed, and let them go win on your behalf. There's—you have no idea the amount of talent that you're leaving on the table if you don't do that.Corey: I'd also encourage people to remember that job descriptions are inherently aspirational. If you take a job where you know how to do every single item on the list because you've done it before, how is that not going to be boring? I love being given problems. And maybe I'm weird like this, but I love being given a problem where people say, “Okay, so how are you going to solve this?” And the answer is, “I have no idea yet, but I can't wait to find out.” Because at some level, being able to figure out what the right answer is and pick up the skill sets I don't have is the best way to learn something that I've ever found, at least for me.Forrest: Oh, I hear that. And what I found, you know, working with a lot of new learners that I've given that advice to is, typically the ones that advice works best for, unfortunately, are the ones who have a little bit of baked-in privilege, people that tend to skate by more on the benefit of the doubt. That is a tough piece of advice to fulfill if you're, you know, someone who's historically underrepresented or doesn't often get the chance to prove that you can do things that you don't already have a testament to doing successfully. So again, that takes it back to the hiring side. Be willing to bet on people, right, and not just to kind of look at their resume and go from there.Corey: So, I'm curious to see what you've noticed in the community because I have a certain perspective on these things, and a year ago, everyone was constantly grousing about dissatisfaction with their employers in a bunch of ways. And that seems to have largely vanished. I know there have been a bunch of layoffs and those are tragic on both sides, let's be very clear. No one is happy when a layoff hits. But I'm also seeing a lot more of people keeping their concerns to either private channels or to themselves, and I'm seeing what seems to be less mobility between companies than I saw previously. Is that because people are just now grateful to have a job and don't want to rock the boat, or is it still happening and I'm just not seeing it in the same way?Forrest: No, I think the vibe has shifted, for sure. You've got, you know, fewer opportunities available, and you know that if you do lose your job you're potentially going to have fewer places to go to. I liken it to like if you bought a house with a sub-3% mortgage in 2021, let's say, and now you want to move.
Even though the housing market may have gone down a little bit, those interest rates are so high that you're going to be paying more, so you kind of are stuck where you are until the market stabilizes a little bit. And I think there's a lot of people in that situation with their jobs, too.They locked in salaries at '21, '22 prices and now here we are in 2023 and those [laugh] those opportunities are just not open. So, I think you're seeing a lot of people staying put—rationally, I would say—and waiting for the market to shift. But I think that at the point that you do see that shift, then yes, you're going to see an exodus; you're going to see a wave and there will be a whole bunch of new think pieces about the great resignation or something, but all it is just that pent up demand as people that are unhappy in their roles finally feel like they have the mobility to shift.Corey: I really want to thank you for taking the time to speak with me. If people want to learn more, where's the best place for them to find you?Forrest: You can always find me at goodtechthings.com. I have a newsletter there, and I like to post cartoons and videos and other fun things there as well. If you want to hear my weekly take on Google Cloud, go to cloud.google.com/innovators and sign up there. You will get my weekly newsletter The Overwhelmed Person's Guide to Google Cloud where I try to share just the Google Cloud news and community links that are most interesting and relevant in a given week. So, I would love to connect with you there.Corey: I have known you for years, Forrest, and both of those links are new to me. So, this is the problem with being active in a bunch of different places. It's always difficult to—“Where should I find you?” “Here's a list of 15 places,” and some slipped through the cracks. I'll be signing up for both of those, so thank you.Forrest: Yeah. I used to say just follow my Twitter, but now there's, like, five Twitters, so I don't even know what to tell you.Corey: Yes. The balkanization of this is becoming very interesting. Thanks [laugh] again for taking the time to chat with me and I look forward to the next time.Forrest: All right. As always, Corey, thanks.Corey: Forrest Brazeal, Head of Developer Media at Google Cloud, and of course the Cloud Bard. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that you undoubtedly had a generative AI model write for you and then failed to proofread it.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
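The NAT gateway comparison Corey describes in this episode is straightforward to reproduce against the AWS Price List API. What follows is a rough sketch of that idea rather than the script he actually generated: it assumes boto3 with configured credentials, and the service code ("AmazonEC2"), the "NAT Gateway" product family filter, and the JSON paths into each price list entry reflect the common shape of that API and may need adjusting.

import json
import boto3

# Compare the on-demand hourly price of a Managed NAT Gateway across AWS regions.
# The Price List API is only served from a couple of regions; us-east-1 works.
pricing = boto3.client("pricing", region_name="us-east-1")

rows = []
paginator = pricing.get_paginator("get_products")
pages = paginator.paginate(
    ServiceCode="AmazonEC2",
    Filters=[{"Type": "TERM_MATCH", "Field": "productFamily", "Value": "NAT Gateway"}],
)

for page in pages:
    for raw in page["PriceList"]:
        # Each price list entry comes back as a JSON string, not a dict.
        item = json.loads(raw) if isinstance(raw, str) else raw
        location = item["product"]["attributes"].get("location", "unknown")
        for term in item.get("terms", {}).get("OnDemand", {}).values():
            for dim in term["priceDimensions"].values():
                # NAT gateway SKUs carry both per-hour and per-GB dimensions;
                # keep only the hourly rate.
                if dim["unit"] == "Hrs":
                    rows.append((location, float(dim["pricePerUnit"]["USD"])))

# Most to least expensive, as described.
for location, usd in sorted(rows, key=lambda r: r[1], reverse=True):
    print(f"{location:<35} ${usd:.4f}/hr")

Sorting on the hourly rate in descending order gives the most-to-least-expensive table mentioned in the conversation; the per-GB data processing dimension is deliberately skipped.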

Screaming in the Cloud
The Importance of Positivity in Negotiations with Josh Doody

Screaming in the Cloud

Play Episode Listen Later Aug 15, 2023 36:05


Josh Doody, Owner of Fearless Salary Negotiation, joins Corey on Screaming in the Cloud to discuss how important tonality and communication is, both in salary negotiations and everyday life. Josh describes how important it is to have a positive padding to your communications in order to make the person on the other end of the negotiation feel like a collaborator rather than a combatant. Corey and Josh also describe scenarios where tonality made a huge difference in the outcome, and Josh gives some examples of where and when to be mindful of how you're coming across in modern communication methods. Josh also reveals how negotiating with companies multiple times allows him to understand their recruiters more than a person who is encountering their negotiation process for the first time.About JoshJosh is a salary negotiation coach who works with senior software engineers and engineering managers to negotiate job offers with big tech companies. He also wrote Fearless Salary Negotiation: A Step-by-Step Guide to Getting Paid What You're Worth, and recently launched Salary Negotiation Mastery to help folks who aren't able to work with Josh 1-on-1.Links Referenced: Fearless Salary Negotiation website: https://fearlesssalarynegotiation.com Fearless Salary Negotiation: https://www.amazon.com/Fearless-Salary-Negotiation-step-step/dp/0692568689/ Twitter: https://twitter.com/joshdoody LinkedIn: https://www.linkedin.com/in/joshdoody/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Human-scale teams use Tailscale to build trusted networks. Tailscale Funnel is a great way to share a local service with your team for collaboration, testing, and experimentation.  Funnel securely exposes your dev environment at a stable URL, complete with auto-provisioned TLS certificates. Use it from the command line or the new VS Code extensions. In a few keystrokes, you can securely expose a local port to the internet, right from the IDE.I did this in a talk I gave at Tailscale Up, their first inaugural developer conference. I used it to present my slides and only revealed that that's what I was doing at the end of it. It's awesome, it works! Check it out!Their free plan now includes 3 users & 100 devices. Try it at snark.cloud/tailscalescream Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined by recurring guest and friend Josh Doody, who among oh, so many things, is the owner of fearlesssalarynegotiation.com, and basically does exactly what it says on the tin. Josh, great to talk to you again.Josh: Hey, Corey. Thanks for having me back. I appreciate it and I'm glad to be here.Corey: So, you are, for those who have not heard me evangelize what you do—which is fine. No one listens to all of the backlog of episodes and whatnot—you are a salary negotiation coach, and you emphasize working with high earners who are negotiating new job offers, which is basically awesome. How did you stumble into this?Josh: Yeah, a good question. Really, it started as what I would say is a series of interesting career choices that I made, where I started as an engineer. 
I was pretty quickly bored in engineering and I switched to—I wanted to be customer-facing and do stuff that had impact on the business, so I did that and ended up working for a software company that made HR software that happened to do among other things, compensation planning. And so, I kind of started learning how it worked behind the scenes.And then over time, I started wising up and negotiating my own job offers. And noticed that wow that kind of worked pretty well, and I decided to write a book about it, a hundred percent just because I like to write stuff. I've been writing for 20 years on the internet, and I decided, why not just write a book about this? You know, five or six people will buy it, my mom will love it, I'll get it out there and it'll feel really good.And then people started reading the book and asking me if they could hire me to do the methodology in the book for them. And I said, “Sure.”Corey: When people try to give you money, say yes.Josh: Yeah. Okay, you know, whatever, you know? My first person that ever hired me asked me what my rate was, and I didn't have a rate because I had never considered doing that before. But she was a freelance writer and I said, “Well, whatever your rate is, that's my rate.” [laugh]. So, that was my first rate that I charged someone.And yeah, from there just, it took off as more people started hiring me. A number of friends were chirping in my ear that hey, you know, this seems like a really valuable thing that you're doing and people are coming out of the woodwork to ask you to do it for them. Maybe you should do that thing instead of the other things you're doing and trying to sell copies of the book and stuff like that. Like, why don't you just be a salary negotiation coach? That was, I don't know, like, seven years ago now, and here I am.Corey: I don't know if I ever told you this, but back when we met in the fall of 2016, I was trying to figure out what windmill I was going to tilt at before I stumbled upon the idea of AWS billing as being one of them. I thought that writing a book and being a sort of a coach of sorts on how to do job interviews with an emphasis, of course, on salary negotiation, would be a great topic for me because I've done it an awful lot. This is a byproduct of getting fired all the time because of my mouth. And then I started talking to you and my reaction was, “Oh, Josh is way better at this than I am. No, I'm going to go find something else instead.”And now the world is what it is, and honestly, at this point, all the cloud providers really wish you hadn't been there at that point in time because then they wouldn't have to deal with the nonsense that I present to them now. But I always had a high opinion of what you do, just because it is in such a sweet spot where if I were to shut this place down and get a quote-unquote, “Real job” somewhere, I would hire you. And it's not that I intellectually don't know how to negotiate. Half my consulting now is negotiating large AWS contracts on behalf of AWS customers with AWS. A lot of these things tend to apply and go very hand-in-glove.But there's something to be said for having someone who sees this all the time in a consistent ongoing basis, who is able to be dispassionate. Because when you're coaching someone, it's not you in the same boat. For you, it's okay, you want to have a happy customer, obviously, but for your client, it's suddenly, wow, this is the next stage of my career. This matters. 
The stakes are infinitely higher for them than they are for you.And that means you have the luxury of taking a step back and recognizing a bad deal when you see one. There is such value to that I can't imagine not engaging you or someone like you the next time that I would go about changing jobs. Although these days, it's probably an acquisition or I finally succumb to a cease and desist. I don't really know that I'm employable anymore.Josh: [laugh]. Yeah, I mean, you said a lot of really interesting things there. I think a common theme—you know, to work with me, there's a short application that people fill out, and very frequently in the application, there are a couple of open-ended questions about you know, how can I help you? What's your number one concern? That kind of stuff.And frequently, they'll say, “Yeah, I've negotiated before and I actually did okay, but I want to work with a professional this time,” is the gist of it, for I think reasons that you mentioned. And one of them is, there's just a difference between negotiating for yourself and feeling all of that pressure and having somebody who can just objectively look at it and say, “No, I think you should ask for this instead.” Or, “No, I don't think that you should give that information to the recruiter.” And the person instead of feeling, you know, personal subjective pressure can just say, “Well, the objective person that I hired and paid money to help me with this says, ‘don't do that,' or ‘do this instead,' and it's easier for me to just trust what they're doing as a professional and let me be a professional at the other things that I'm a professional at.”And so yeah, I think that's a lot of—you know, for some people, it's, “I have no idea how to negotiate. I don't want to screw this up. Please help me, Josh.” And for some people, it's, “Yeah, I've done this before. I did, okay, but I want you here to help me do this.”And that includes people who come back and work with me two or three times. They know the methodology. They've been through it literally with me, and I'm very open about what we're doing and why I'm collaborative with my clients. We're talking about the decisions we make. I will bounce things off of them.I'll say, “Here's what I think we should do. What does your intuition tell you about that? How do you feel about it?” Because it's important to me because they're in the game and I need to know what they think. And they'll come back to me and we'll do it again. They already know the playbook. And I think that's because it's easier to just have somebody who's a professional there to objectively tell you, “You're not asking for enough.” Or, “Did you think about asking for this instead?” Or, “Do you really care about that thing?” Stuff like that.Corey: There is so much value to that, just because it's a what's normal in this? Because I'm sure you've seen before where—I'm probably—I should put this in more of a question, but I already know the answer because I've seen it just from people randomly sending me things out on the internet—of their times for companies say or ask for things that are just absolute clown shoes. It's, I would barely consider it professional at that. It always feels like there's value in being able to talk to someone who sees this all the time who can say, “Hold on. That is absolutely not normal. That is not a reasonable question. 
That is not an expectation that any sensible person is going to have.” Because the failure mode otherwise is you think it's you.Josh: Yeah, part of my value prop is, you know, I know how to negotiate with companies. I'm not afraid of them. I've negotiated with Fortune 5 companies, come out way ahead—just as you do frequently—and I know the playbook that they're running. But part of it also is, you know, I have a compendium of recruiter responses. I know what they say, I know what their words mean, and so I can say things like, “Oh, here's what they actually mean when they ask you for that.”Or I can say, “That's weird.” Which, you know, if I've done 20 negotiations with this company and all of a sudden a recruiter says something that's weird, that makes my ears perk up and makes me wonder why. And so, I can dig in on my side and try and figure out what's going on, see if we tripped some wire that I didn't see or, you know, something like that. So, that's part of the value too, is just all the reps that I've had, even like you said, I'm sure that you would do a wonderful job negotiating; I've talked to you about negotiating online and off, and I know that you know the game, you know how to do it, for your day job but also for compensation. But I probably have more reps negotiating with those companies than you do and therefore my compendium is a little bit deeper, so there might be things that I could recognize that you would not recognize that I could see, right, in the similar way that in your negotiation world that there are things that I certainly would not recognize that you would catch on to.And I think that can be a very valuable thing. There could be something a recruiter says where I recognize, “Aha. That's a technical term or that's a key phrase that we can grab onto. And that is an opportunity to get more.”Corey: Or, “What are you making now?” It's like, yes, that's the industry accepted one free pass that's screwing the candidate. Yeah—Josh: Right.Corey: —let's not do that.Josh: Right. And we're—here's how to sidestep it and here's what happens when they ask for it for the fourth time, and here's what happens when they say the magic words and, you know, all that stuff. So yeah, a lot of it is just getting reps. It started with let me just run my playbook and then as I run the playbook, I get more data every time I do it, and I get to learn what the edge cases look like, and how to spot, you know, weird funky stuff coming from recruiters and that sort of thing.Corey: One aspect of this that has been, I guess, capturing my imagination since you first talked to me about it, and I am certain I'm going to butcher this into something that sounds insulting and demeaning, which sort of cuts against the entire point. Specifically, the idea of a positive language, or, the term you used was ‘Positively Persuasive.' What is that? Because it sounds like it's just someone who's setting me up, like, waving raw steak in front of a tiger, like, “Please maul me on this.” But there is more to it than that.Josh: [laugh]. Yeah, so this is something that, to be honest with you, I have done almost intuitively throughout my career, but certainly as a salary negotiation coach. And what it is, is a tendency to use positive, meaning, you know, not negative words. So like, essentially, if you're familiar at all with improv, which I would say probably half of the people listening probably have some idea what I'm talking about, you take improv classes, and they teach you an exercise called Yes, And. 
And the reason you do Yes, And is, you know, Corey says something wacky and I could shut it down.I could say, “That's not true.” You know, “My hair isn't red.” And then we're done improvising. But if Corey says, “Josh, your hair is red,” even if my hair is not red and I say, “Yes, and… it's on fire right now,” then we have something going, right? And so, using those positive words—yes, and is a positive way of responding to that—opens up a further dialog and also makes it easier for you to engage with me in that improvisation. In a way, a negotiation is an improvisation; they're all going to be different.A business conversation is going to be an improvisation. It's rare that you're going to have a conversation where you could write the script completely before the conversation starts. Often there will be an opportunity to improv, to do something different. And so, positively persuasive is essentially my way of thinking about how to use those positive words to accomplish an objective while building rapport with the person that you're talking to, and leaving the door open for that kind of positive collaboration and improvisation where you can work together with your co-party, with the person that you're talking to in the negotiation. And so, that's super abstract, and a concrete example of this would be, for example, in a counteroffer email.Frequently people will, kind of unsolicited, just send me their counteroffer emails. “I'm writing up this email. What do you think?” Somebody on my newsletter or my email list or something. And sometimes they're okay, and sometimes it's like, they're giving an ultimatum and they're saying, “You promised this when we first talked on the phone and you're not giving me that. You offered me this and I want what you offered to start with.”And they're using all these negative words: “You promised this and didn't give it to me.” “That's not what I expected.” Whereas in the counteroffers that I'm writing, it says, “Hey, thanks for the offer.” Starts right away with something that looks like a throwaway line, a platitude, but really what it is is saying, “Hey, we're on the same team here. We're collaborating. Thanks for the offer. I appreciate it and I hope you're having a good week so far.”And then as it goes on, it says, “Here are the reasons that I'm super valuable to your team. I can't wait to join this team and, you know, express that value.” And then, “You offered $100,000. I would be more comfortable if we could settle on $115,000.” And so, that's a counteroffer. In some cases, the counter will be more than 15%. That's kind of a middle-of-the-road one, but the way I say it is, “I would be more comfortable if,” and so there's no sort of in-your-face, there's no ultimatum, there's no fist pounding on the desk—Corey: There's no, “No.” There's no, “This is not acceptable.” There's no, “I won't accept this.” It's a very soft approach that generally doesn't put people on edge.Josh: Puts it—it not only doesn't put them on edge, but you're sort of putting your arm around them saying, “Hey, you know, I'd be more comfortable if we could do this.” And they're like, “Okay, you know, let me see what I can do for you.” So, you're not making—you're not turning them into, you know, an enemy combatant; you're turning them into a collaborator. And now it's you and them working together to try to make you comfortable so that you can join their team.
So, that's a subtle thing that happens in a counteroffer email and numerous other places.But that's the idea is that when you can, you're choosing positive language so that your requests will be received better, so you build rapport with the person that you're negotiating with, and so that they perceive you to be a collaborator and not an opponent.Corey: It sounds hokey, but I've also watched it work. It's weird in that we hear about things like this, we think, “Oh, that wouldn't work on me at all,” except it the evidence very clearly shows that it does. There's a reason that some people are considered charismatic and I think this is a large part of it. And I also wonder, I mean, you focus on salary negotiation for high earners, and that, historically at least, as included, you know, a fair few number of software developers and whatnot. And these days, let's be very clear that communicating what you want, clearly, concisely, and in an understandable way that something or someone can action is such a lost foreign skill for some of these people that they call the entire field ‘prompt engineering' because just communicate clearly is apparently a microaggression when you ask an engineer to do it without giving it a fancy name. Improved communication really feels like it has been part of a dawning awareness lately that, wait, this is actually important, not just one of those box-checking items that you say so that people don't spit in your food.Josh: I think you're a hundred percent right about that. I mean, it's interesting is you think about, you know, forms of communication that we have kind of experienced over the past, you know, however many years. But you know, at first, there was no writing, over, you know, thousands of years ago, or whatever, it was just all kind of oral tradition. And then we had writing and it was, like, long-form writing. And then, you know, fast forward to today and it's like you're sending a text with two letters and that means something right, or I'm about to head to my friend's house, and I text him three letters: OMW, right?It's like, extremely terse, direct, and to the point. And there is a place for that, I think. I think that efficiency probably has some benefits. I mean, there's not a lot of reason for me to spend six minutes, you know, writing a text to tell somebody that I'm heading to their house. But on the other hand, I think that sort of concision, that terse writing can also lose a lot in translation, and as we're using more media that look like Slack, or Discord, or these other chat-based ways of communicating—including email, by the way; I mean, email can be a place where you can be as terse, or I guess, as pleonastic as you'd like—and you get more and more words in there.And so, I think it's important to be intentional with those words in contexts where tone and meaning and intent can matter. And a lot of that is in interpersonal communication. And again, it's about how messages are received and what you're conveying. I use a lot of—this is [laugh] not directly related—I use a lot of emoji and emoticons and stuff like that and I do that because I'm trying to convey tone in a medium that doesn't really facilitate it, right?If I'm talking to you, you and I can see each other's faces right now, so you know if I'm being sarcastic, or telling a joke, or being very serious. And so, in emails, I'll put a smiley face. And that's me saying, “Hey, I'm not laying this on real thick. I'm just letting you know.” Right? 
So anyway, there are so many media that are available to us now that make it hard to convey tone, and I think a lot of it is you've got to be intentional with your tone.

Corey: I have worked with more people over the course of my career that have what I've taken to calling the asshole-in-email problem, where I have—I think these people are just these absolute jerks. They are completely onerous to deal with and I despise dealing with these people, but then I'll sit down with them and they are the nicest people and they are incredibly competent and effective. They just have a challenge where whenever they write an email, it sounds like there's an implicit, “Listen up here, dickhead…” that they're starting the email with.

Josh: [laugh]. Yeah.

Corey: And, “You know what your problem is…” may as well be how they open these things. And it feels like effectively communicating tone is becoming something of a lost art. I've talked to multiple people now who will wind up using Chat-Gippity to construct the bones of a work email and then they'll just change a sentence or two in the center that actually is the substantive thing that they want to send, so it winds up handling all the window-dressing there. Now, I'm wondering what the other side is going to look like when you have someone using Chat-Gippity to paste a work email into it. It's like, “Okay, strip out the flattery. What are they actually asking from me here?” So, you effectively have, like, an API layer of padding provided by computers, where you could just, like, say the direct thing, but it comes with all the flowery accouterments that have become expected in business correspondence.

Josh: Yeah. I mean, I love everything that you said there. It's true. I mean, I've worked with people in the past where they would send me an email, or I would email with them frequently, and then when we were talking in person, I realized that, oh, I totally misread what they were saying. Like, I misread what they meant to say, I misread what their outcome, their preferred outcome was, and it's because the tone is just lost in email.

And I don't think it was necessarily due to any sort of deficiency on their side. It was on—they have a way of communicating, I have a way of perceiving communications, and they were different, and so the message that I got was different. So, I think a lot of what I'm talking about with positively persuasive is how do I communicate in a way where it is not ambiguous, where it is very clear what I'm saying, what my intent is, what my tone is. And sometimes, like you said, [laugh] use ChatGPT to, like, strip out the flattery. I put the flattery in because I want them to know, like, “Look, I know that you're a person. You and I are on the same team here. We're working together.”

So, a lot of my emails will open with, “Hey, hope you're having a good day.” And it's like, do I care if they're having a good day? Yeah, but I don't need to say that out loud. The reason I'm saying it out loud is I want them—the opposite of everything you just described—where I want them to read that email and think, “Okay, Josh isn't coming at me. Even if he does have critiques of something that I'm doing, or he has a suggestion to improve something, he's coming at it from the place of, ‘Hey, I hope you're having a good day so far.'” Whatever I say at the beginning of the email.

And so, that's filler, a hundred percent, but it's filler with a purpose that is meant to convey the tone of the email, that is, I'm not coming down on you too hard.
I'm trying to convey a message or ask a question and sincerely curious, and can we come together on this to figure out what the solution is or to move forward or to find the next steps or whatever the thing is that we're trying to do?Corey: It feels like this is an area that has massive application beyond the obvious negotiation piece of it, which is fundamentally where we sit down and try and convince people to do a thing that we want them to do that is in our interest. But it's like, okay, well, that's not just negotiation. That is, on some level, a disturbing number of human interactions that we tend to have. Where do you see this being applied? Is it something that just—that you're looking at just through a lens of communicating effectively in a salary negotiation, or does it extend beyond that to your worldview?Josh: I think it can get pretty broad. I mean, as you were describing, I was thinking kind of, as you were talking, like, when else do I use this? And the answer is a lot. But one place that I use this kind of thing a lot is when I'm emailing people who I don't know, and trying to get them to either just give me something or to allow me more leeway than they otherwise necessarily have to allow. And so for—here, I'll give you an example, which is, I recently switched homeowners insurance providers because I live in Florida and homeowner's insurance in Florida is a nightmare.And so, I changed providers. I thought I had crossed all my t's and dotted all my I's, but there was something that fell between the cracks, and that is that the mortgage holder—the bank that holds my mortgage—hadn't sent the premium check to my new insurance provider. They didn't get that memo. And it was essentially my responsibility, but I kind of goofed. So, the bank writes me an email and they say, “Hey, we see you changed providers but we don't have an address for them. We can't send them a check. Can you give it to me?”And so, now I'm—there are two parties that I have to kind of keep on my side. One of them is obviously the bank, but also the insurance provider, who might be mad at me because I'm ten days late on this premium or whatever. So, my emails to them are places where I use this where it's like, I'm basically going to make it so that the person who could get mad at me and cause me some kind of detriment is going to have to do it through a really thick cloud of, “Josh is a nice guy who isn't trying to be a jerk to anybody here. He's not trying to pull one over on anybody. There was an honest mistake that was made, he's just trying to make everything right, and he's hoping that I can help them.”And they're going to have to look at the way that I communicate with them and they're going to have to push through it and say, “Nope. I'm going to be a jerk. I'm going to follow the letter of the law or I'm going to be as punitive as I can be.” That's really hard to do when somebody like me is emailing, say, “Hey, listen, I know that we were supposed to get a check out to you last week. I'm working on it right now. I've already got everything to the bank. It's going to be overnighted to you tonight. Is there anything else I could do to make this easy for you on your side?”And then they're going to be like, “No, just, you know, as soon as we get it, we'll let you know.” Whereas if I'm, like, you know, mad at them or I'm mad at somebody or I'm being a jerk in email, then they don't really have any reason to not be as punitive as they can be to me. 
And so, that's just—it's a little manipulative, I guess, but it's also the way that I see life, right? Like, I'm like that with everyone, including people who are on the other side of that equation. I'm going to give them grace when I can.

And so, it's a way of me saying, “Hey, can you extend some grace to me? I think you're a human being who's on the other side of this and you have a job to do and I understand that, and if you could be a little bit kind to me, that would be great.” And it works almost every time.

Corey: This episode is sponsored in part by Panoptica. Panoptica simplifies container deployment, monitoring, and security, protecting the entire application stack from build to runtime. Scalable across clusters and multi-cloud environments, Panoptica secures containers, serverless APIs, and Kubernetes with a unified view, reducing operational complexity and promoting collaboration by integrating with commonly used developer, SRE, and SecOps tools. Panoptica ensures compliance with regulatory mandates and CIS benchmarks for best practice conformity. Privacy teams can monitor API traffic and identify sensitive data, while identifying open-source components vulnerable to attacks that require patching. Proactively addressing security issues with Panoptica allows businesses to focus on mitigating critical risks and protecting their interests. Learn more about Panoptica today at panoptica.app.

Corey: There's value as well in everyday customer service interactions. If I have a bad customer experience buying something off of Amazon—I know, imagine that. Could that ever happen? Of course not. But in a magical world in which it hypothetically did, I can call up and they answer the phone, and I'm probably going to be pretty steamed going into that conversation because this is effort I didn't want to have to deal with. But stop and think about it for a second. Usually, when I call Amazon for a variety of things, it's not Andy Jassy who's answering the phone. Those are atypical moments for me.

Instead, it is generally some poor customer service schmo who is basically given zero autonomy to speak of in the course of their job and, surprisingly, does not set Amazon's strategic priorities for them. And if I unload on this person, maybe I make myself feel better, but I've made someone else's day actively worse. And even if you want to set aside the story of being a good person—which I don't suggest people do—and view it in a purely Machiavellian, self-serving way, you're still going to have a better outcome if you inspire people to like you by making yourself likable. Because when you're a jerk—and I used to work helpdesk; I remember how this works—

Josh: Me, too.

Corey: Suddenly, I will fall back on every policy that I can have: “Oh, we're not allowed to sit through a reboot. Bye.” As opposed to, “Eh, [unintelligible 00:22:31] say ever not to, but I'm enjoying this and I want to help you out and make sure you get there, so hang out. Why not?” There are ways people can bend the rules in your favor, but if you give them an excuse to fall back on that, they're not going to go out of their way to help you at all. They're going to make you go through every bit of procedural red tape they can possibly come up with. And again, you've made their day worse and that should not be lost on you. The outcomes are better for everyone when you're a nice person.

Josh: As you were talking, it's funny because I remembered maybe the most frustrated I've ever been talking to customer service.
This is several years—many years ago, but I had some student loan stuff going on. I don't even remember specifically what it was, but it had to do with, you know, who was servicing the loan and I'm trying to pay off a loan and I can't get the right person on the phone and they say, you know, “It's this other place that owns that holds the loan.” Or, “You need to call this person,” and I'm getting the runaround and I'm not able to do the thing I want to do.And after I think I've been hung up on, like, three times, and I was really steamed. Like you said, I'm legitimately, like, very frustrated. My voice had been cracking a little bit, which is how I know I'm, like, really getting heated is my [laugh] voice will start to crack a little bit. But I said to the person—and I became conscious in that moment of like, okay, I'm very frustrated. I could say something I regret I could really, like, hurt this person that I'm talking to.As you said, they're just somebody who's a customer service representative for this bank or loan servicer, whoever they were. So, I said something like, “Listen, you can't hear it in the tone of my voice right now, but I need you to know that I'm extremely frustrated and I'm going to [laugh] I'm going to get really upset, and so I'm asking you to help me before I do that before I escalate. I don't want to talk to your manager, but I'm going to ask you to do that if you don't help me right now. And you should know that I'm super frustrated. My voice is not betraying that right now, but understand that I am.”And they snapped in and they were like, “Okay, I get it, I get it,” you know? And right there even as a place where I could have just started shouting at them or whatever it takes, you know, “I want to talk to your manager,” and, “I'm going to escalate,” and all this stuff. And instead, I was like, well, I'm going to give them one last chance, which is, let me just tell them how frustrated I am without using colorful language or mean words. And it worked. It was a subtle thing that actually, I think it got their attention more than anything else. They said, “Oh, this person is really angry. I should actually listen to them.”Corey: Now, there is a dark side to this as well and that is human nature. I have done experiments on this over the years, most notably on Twitter, back when that was the central place people went to, and when I would say something nice about an AWS service, it got in most cases two likes and maybe a bot would retweet it. Whereas if I say, “This AWS service is a piece of garbage,” and I come up with some reason for it, it went around the internet three times and it was misconstrued, with me saying, “The entirety of AWS is terrible.” Not usually, no. There are some frustrating elements, but yeah, there's context. It doesn't fit into a single tweet.The snarky negativity blows up and responds to—and resonates with something in human nature that the people love spreading that around and engaging with it, whereas the happy positivity does not work that way. On Twitter. I've noticed what seems to be the opposite effect on LinkedIn. Snark doesn't do well over there, but almost saccharine-sweet sincerity does. And I don't know what this says about various social media channels or human nature or what. All I know is that I'm confused.Josh: I think you're right. You know, I mean, as you were talking, I was thinking about clickbait, right? 
Like, there's a reason that clickbait is called what it is, and it's because you read it and you get annoyed or frustrated or angry, and I'm going to hate-read this article right now and I'm going to send it to six friends. There is something in human nature. I mean, you know, we talked—for decades, I've heard about how the local news is our news, “If it bleeds, it leads,” in news, right?We're not talking about how great the planet is or how things—like, this bad thing happened in New Orleans yesterday and you should be really upset about it, or wherever that place happens to be on that particular day. I do think there is something innate in us that allows us to gravitate towards those kinds of things and I have no idea what it is. But it is interesting, as you said, that there are places where either that's frowned upon or there's just a different mode of communication, which tells me that there's something sort of in the cultural water there that causes people to perceive stuff differently in different kinds of social media environments, right? Twitter definitely is a place where things can go pretty negative. And there are other places that are significantly more negative, right, on the internet, if you want to go, they get really bad, and then there's places that are really positive.And it's interesting how it's like a maybe people self-select into those places, but also, I think, you know, I think there's a big difference if you think about, like, who's using Twitter and why and who's using LinkedIn and why. I think that people correctly perceive on LinkedIn that for the most part, you're probably not going to be somebody that's at the top of a bunch of lists to be hired if your whole thing on LinkedIn is just being negative all the time and doom and gloom and snark and that kind of thing. It'll be entertaining to some people, but you're probably not going to get many job offers based on that because people are going to ask, “Do I want to work with this person 24 hours a day?” And they'll read your posts and say, “No,” whereas at least a saccharine sweet person, everybody knows those people who are like that in real life, and they can be I don't know, a little bit much, but also can generally be very good people to work with and it's not difficult to sort of like manage that.Corey: There's a lot that can be done just by having people want to help you. It's weird. Like, I take a look at some of the people that I identify publicly as the nicest in tech—Mark [unintelligible 00:27:48] is a good example. Kelsey Hightower is sort of the canonical example of all of this. These are just genuinely nice people. Ashley Willis, another good example.There are so many different folks out there who are just beacons of positivity. And I look at that, and it's like, first, that is admirable. Second, holy hell that is absolutely not me. No one is ever going to say, “That's what I love about Corey. He's so uplifting and positive all the time.” You know, I do strive to be a better person and inspire others to be better people, but I'm also willing to spare no quarter for corporate tomfoolery either. Which apparently means a lot of people think you're a jerk as a result. I'll take it.Josh: Yeah, I think it's, you know, everybody—that's the nice thing about humans, right, is we're all different. And there are lots of different types of person—if everybody had the same personality, what a boring place that we would live. And that's true for, more or less, any human characteristic. 
If we were all the same and vanilla, I think it would be pretty boring. So, I think that having really positive people out there is great, and having some people who are snarky is great, and having people who have, you know, an ability to just point out absurdity is great. If everyone is pointing out absurdity all the time, then we're not left with too much.So, I do think it's good that those people are out there and they're very positive. And I think that, you know, even for myself, like, I try to be positive and helpful. Like, we were talking about customer service. I'm like, overly nice to customer service people. I tip more than I should most of the time. And a lot of that is just, you know, that's a human; they have needs and feelings and this is a way for me to be kind to them.And I know most people don't think that way that I do. And I like that. And I think that some people don't think that way and I think that's totally fine, too. I think the variety is the spice of life and I think that makes it interesting and useful. I also think that being intentional with those different modes, having them all available to you, and exercising them in different environments can be, like, a level-up, right? It can be a superpower.You can either be a person with a personality who exercises that same personality all the time, or you can choose to exercise, sort of, different personalities or different ways of communicating or different levels of positivity or negativity in different environments. And I think that makes it even more interesting where you're able to essentially be a chameleon and find the right mode of communication for the environment or the situation that you're in, which can enhance that situation for you or for other people that are around you.Corey: I have to ask, do you find that this is something you do all the time or do you put on your negotiating phrasing the same way that I do when my children accuse me of putting on ‘podcast voice.'Josh: All the time, definitely not. I am aware of it as a way of communicating that's available to me and I do consciously use it a lot of the time. But you know, if I'm just sitting around with my buddies on, you know, Wednesday night watching the game, probably not. And a lot of that is because, you know, part of this is, it's a default to positive because you don't know sort of who's on the other end of the line, whereas if you're communicating with somebody that you've communicated with for hundreds of hours, you don't need all that stuff, you don't need all the tonal indicators and the padding and all that stuff because you know that person. So, a lot of what I'm describing, even like in a salary negotiation, I'm basically working from the default of I don't know the counterparty, I don't know the recruiter, and therefore we're going to default to positive, and that's going to essentially, you know, make things smoother.It's going to remove friction because there are things that I don't know, whereas, you know, if I'm communicating with somebody I know really well for 20 years, we don't need all that stuff. We can—that's where the shorthand can come in handy. It can be really useful because we already know all of the background there. One place that I'm very conscious of this is, you know, every now and then somebody, with a personal friend or somebody that I know, well, I'll have, like, a difficult conversation where they'll say, “Hey, you know, this is something that happened to me recently. 
Can you help me out?” Or, “This is a difficult thing that I'm going through.”And that's a place where I am very conscious of this and it comes in different ways. One of them is using positive words, but one of them is also just, like, exercising extreme sympathy or empathy if it's appropriate. Which is, again, it's a conscious decision to say, this isn't a time to point out, you know, for example, errors, or like, this person just needs someone that they want to talk to and I'm going to listen to them carefully, I'm going to try to give them reassurance that the situation will be resolved eventually, and that kind of thing, but it's not a time for you know, critique or, you know, negative words or pointing out flaws and that kind of thing. And so, I think that's also kind of a conscious place that I will exercise it. But to answer your question, no, I don't do this all the time.I would say without having ever thought about this before, the less familiar I am with the person or the situation, the more I will default to this, and the more familiar I am with the person or the situation, the less I will default to it. And I will just use more plain, kind of, direct language because that familiarity is there, and it assumes a lot that isn't there when I don't know the person well.Corey: I really want to thank you for taking the time to speak with me about this. Where can people go to learn more?Josh: Maybe follow me on Twitter [laugh], @joshdoody on Twitter.Corey: It's a harder problem these days than it once was.Josh: Yeah. I really paused there. I am pretty active on LinkedIn these days. And fearlesssalarynegotiation.com isn't explicitly about positive language or being positively persuasive, but you'll see even just reading the articles that I write there that underlying most of what I write is this sort of implicit understanding that positivity is the way to make progress and to get closer to what your goals are. So, @joshdoody on Twitter; joshdoody on LinkedIn, of course, and then fearlesssalarynegotiation.com.Corey: And we will, of course, put links to all of this in the show notes. Thank you so much for taking the time to speak with me. I appreciate it.Josh: Thanks for having me on, Corey. This was a lot of fun. I always like talking to you.Corey: I do, too. Josh [laugh] Doody, owner of Fearless Salary Negotiation. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that rants itself sick, but also only uses positive language.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
How Cloudflare is Working to Fix the Internet with Matthew Prince

Screaming in the Cloud

Play Episode Listen Later Aug 10, 2023 42:30


Matthew Prince, Co-founder & CEO at Cloudflare, joins Corey on Screaming in the Cloud to discuss how and why Cloudflare is working to solve some of the Internet's biggest problems. Matthew reveals some of his biggest issues with cloud providers, including the tendency to charge more for egress than ingress and the fact that the various clouds don't compete on a feature vs. feature basis. Corey and Matthew also discuss how Cloudflare is working to change those issues so the Internet is a better and more secure place. Matthew also discusses how transparency has been key to winning trust in the community and among Cloudflare's customers, and how he hopes the Internet and cloud providers will evolve over time.

About Matthew

Matthew Prince is co-founder and CEO of Cloudflare. Cloudflare's mission is to help build a better Internet. Today the company runs one of the world's largest networks, which spans more than 200 cities in over 100 countries. Matthew is a World Economic Forum Technology Pioneer, a member of the Council on Foreign Relations, winner of the 2011 Tech Fellow Award, and serves on the Board of Advisors for the Center for Information Technology and Privacy Law. Matthew holds an MBA from Harvard Business School, where he was a George F. Baker Scholar and awarded the Dubilier Prize for Entrepreneurship. He is a member of the Illinois Bar, and earned his J.D. from the University of Chicago and B.A. in English Literature and Computer Science from Trinity College. He's also the co-creator of Project Honey Pot, the largest community of webmasters tracking online fraud and abuse.

Links Referenced:

Cloudflare: https://www.cloudflare.com/

Twitter: https://twitter.com/eastdakota

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. One of the things we talk about here an awful lot is cloud providers. There sure are a lot of them, there are the usual suspects that you would tend to expect to come up, and there are companies that work within their ecosystem. And then there are the enigmas.

Today, I'm talking to returning guest Matthew Prince, Cloudflare CEO and co-founder, who… well, first, welcome back, Matthew. I appreciate your taking the time to come and suffer the slings and arrows a second time.

Matthew: Corey, thanks for having me.

Corey: What I'm trying to do at the moment is figure out where Cloudflare lives in the context of the broad ecosystem, because you folks have released an awful lot. You had this vaporware-style announcement of R2, which was an S3 competitor, that then turned out to be real. And oh, it's always interesting when vapor congeals into something that actually exists. Cloudflare Workers have been around for a while and I find that they become more capable every time I turn around. You have Cloudflare Tunnel which, to my understanding, is effectively a VPN without the VPN overhead. And it feels like you are coming at building a cloud provider almost from the other side than the traditional cloud provider path. Is that accurate? Am I missing something obvious?
How do you see yourselves?Matthew: Hey, you know, I think that, you know, you can often tell a lot about a company by what they measure and what they measure themselves by. And so, if you're at a traditional, you know, hyperscale public cloud, an AWS or a Microsoft Azure or a Google Cloud, the key KPI that they focus on is how much of a customer's data are they hoarding, effectively? They're all hoarding clouds, fundamentally. Whereas at Cloudflare, we focus on something of it's very different, which is, how effectively are we moving a customer's data from one place to another? And so, while the traditional hyperscale public clouds are all focused on keeping your data and making sure that they have as much of it, what we're really focused on is how do we make sure your data is wherever you need it to be and how do we connect all of the various things together?So, I think it's exactly right, where we start with a network and are kind of building more functions on top of that network, whereas other companies start really with a database—the traditional hyperscale public clouds—and the network is sort of an afterthought on top of it, just you know, a cost center on what they're delivering. And I think that describes a lot of the difference between us and everyone else. And so oftentimes, we work very much in conjunction with. A lot of our customers use hyperscale public clouds and Cloudflare, but increasingly, there are certain applications, there's certain data that just makes sense to live inside the network itself, and in those cases, customers are using things like R2, they're using our Workers platform in order to be able to build applications that will be available everywhere around the world and incredibly performant. And I think that is fundamentally the difference. We're all about moving data between places, making sure it's available everywhere, whereas the traditional hyperscale public clouds are all about hoarding that data in one place.Corey: I want to clarify that when you say hoard, I think of this, from my position as a cloud economist, as effectively in an economic story where hoarding the data, they get to charge you for hosting it, they get to charge you serious prices for egress. I've had people mishear that before in a variety of ways, usually distilled down to, “Oh, and their data mining all of their customers' data.” And I want to make sure that that's not the direction that you intend the term to be used. If it is, then great, we can talk about that, too. I just want to make sure that I don't get letters because God forbid we get letters for things that we say in the public.Matthew: No, I mean, I had an aunt who was a hoarder and she collected every piece of everything and stored it somewhere in her tiny little apartment in the panhandle of Florida. I don't think she looked at any of it and for the most part, I don't think that AWS or Google or Microsoft are really using your data in any way that's nefarious, but they're definitely not going to make it easy for you to get it out of those places; they're going to make it very, very expensive. And again, what they're measuring is how much of a customer's data are they holding onto whereas at Cloudflare we're measuring how much can we enable you to move your data around and connected wherever you need it. And again, I think that that kind of gets to the fundamental difference between how we think of the world and how I think the hyperscale public clouds thing of the world. 
And it also gets to where are the places where it makes sense to use Cloudflare, and where are the places that it makes sense to use an AWS or Google Cloud or Microsoft Azure.Corey: So, I have to ask, and this gets into the origin story trope a bit, but what radicalized you? For me, it was the realization one day that I could download two terabytes of data from S3 once, and it would cost significantly more than having Amazon.com ship me a two-terabyte hard drive from their store.Matthew: I think that—so Cloudflare started with the basic idea that the internet's not as good as it should be. If we all knew what the internet was going to be used for and what we're all going to depend on it for, we would have made very different decisions in how it was designed. And we would have made sure that security was built in from day one, we would have—you know, the internet is very reliable and available, but there are now airplanes that can't land if the internet goes offline, they are shopping transactions shut down if the internet goes offline. And so, I don't think we understood—we made it available to some extent, but not nearly to the level that we all now depend on it. And it wasn't as fast or as efficient as it possibly could be. It's still very dependent on the geography of where data is located.And so, Cloudflare started out by saying, “Can we fix that? Can we go back and effectively patch the internet and make it what it should have been when we set down the original protocols in the '60s, '70s, and '80s?” But can we go back and say, can we build a new, sort of, overlay on the internet that solves those problems: make it more secure, make it more reliable, make it faster and more efficient? And so, I think that that's where we started, and as a result of, again, starting from that place, it just made fundamental sense that our job was, how do you move data from one place to another and do it in all of those ways? And so, where I think that, again, the hyperscale public clouds measure themselves by how much of a customer's data are they hoarding; we measure ourselves by how easy are we making it to securely, reliably, and efficiently move any piece of data from one place to another.And so, I guess, that is radical compared to some of the business models of the traditional cloud providers, but it just seems like what the internet should be. And that's our North Star and that's what just continues to drive us and I think is a big reason why more and more customers continue to rely on Cloudflare.Corey: The thing that irks me potentially the most in the entire broad strokes of cloud is how the actions of the existing hyperscalers have reflected mostly what's going on in the larger world. Moore's law has been going on for something like 100 years now. And compute continues to get faster all the time. Storage continues to cost less year over year in a variety of ways. But they have, on some level, tricked an entire generation of businesses into believing that network bandwidth is this precious, very finite thing, and of course, it's going to be ridiculously expensive. You know, unless you're taking it inbound, in which case, oh, by all means back the truck around. It'll be great.So, I've talked to founders—or prospective founders—who had ideas but were firmly convinced that there was no economical way to build it. Because oh, if I were to start doing real-time video stuff, well, great, let's do the numbers on this. 
And hey, that'll be $50,000 a minute, if I read the pricing page correctly, it's like, well, you could get some discounts if you ask nicely, but it doesn't occur to them that they could wind up asking for a 98% discount on these things. Everything is measured in a per gigabyte dimension and that just becomes one of those things where people are starting to think about and meter something that—from my days in data centers where you care about the size of the pipe and not what's passing through it—to be the wrong way of thinking about things.Matthew: A little of this is that everybody is colored by their experience of dealing with their ISP at home. And in the United States, in a lot of the world, ISPs are built on the old cable infrastructure. And if you think about the cable infrastructure, when it was originally laid down, it was all one-directional. So, you know, if you were turning on cable in your house in a pre-internet world, data fl—Corey: Oh, you'd watch a show and your feedback was yelling at the TV, and that's okay. They would drop those packets.Matthew: And there was a tiny, tiny, tiny bit of data that would go back the other direction, but cable was one-directional. And so, it actually took an enormous amount of engineering to make cable bi-directional. And that's the reason why if you're using a traditional cable company as your ISP, typically you will have a large amount of download capacity, you'll have, you know, a 100 megabits of down capacity, but you might only have a 10th of that—so maybe ten megabits—of upload capacity. That is an artifact of the cable system. That is not just the natural way that the internet works.And the way that it is different, that wholesale bandwidth works, is that when you sign up for wholesale bandwidth—again, as you phrase it, you're not buying this many bytes that flows over the line; you're buying, effectively, a pipe. You know, the late Senator Ted Stevens said that the internet is just a series of tubes and got mocked mercilessly, but the internet is just a series of tubes. And when Cloudflare or AWS or Google or Microsoft buys one of those tubes, what they pay for is the diameter of the tube, the amount that can fit through it. And the nature of this is you don't just get one tube, you get two. One that is down and one that is up. And they're the same size.And so, if you've got a terabit of traffic coming down and zero going up, that costs exactly the same as a terabit going up and zero going down, which costs exactly the same as a terabit going down and a terabit going up. It is different than your home, you know, cable internet connection. And that's the thing that I think a lot of people don't understand. And so, as you pointed out, but the great tragedy of the cloud is that for nothing other than business reasons, these hyperscale public cloud companies don't charge you anything to accept data—even though that is actually the more expensive of the two operations for that because writes are more expensive than reads—but the inherent fact that they were able to suck the data in means that they have the capacity, at no additional cost, to be able to send that data back out. 
And so, I think that, you know, the good news is that you're starting to see some providers—so Cloudflare, we've never charged for egress because, again, we think that over time, bandwidth prices go to zero because it just makes sense; it makes sense for ISPs, it makes sense for connectiv—to be connected to us.And that's something that we can do, but even in the cases of the cloud providers where maybe they're all in one place and somebody has to pay to backhaul the traffic around the world, maybe there's some cost, but you're starting to see some pressure from some of the more forward-leaning providers. So Oracle, I think has done a good job of leaning in and showing how egress fees are just out of control. But it's crazy that in some cases, you have a 4,000x markup on AWS bandwidth fees. And that's assuming that they're paying the same rates as what we would get at Cloudflare, you know, even though we are a much smaller company than they are, and they should be able to get even better prices.Corey: Yes, if there's one thing Amazon is known for, it as being bad at negotiating. Yeah, sure it is. I'm sure that they're just a terrific joy to be a vendor to.Matthew: Yeah, and I think that fundamentally what the price of bandwidth is, is tied very closely to what the cost of a port on a router costs. And what we've seen over the course of the last ten years is that cost has just gone enormously down where the capacity of that port has gone way up and the just physical cost, the depreciated cost that port has gone down. And yet, when you look at Amazon, you just haven't seen a decrease in the cost of bandwidth that they're passing on to customers. And so, again, I think that this is one of the places where you're starting to see regulators pay attention, we've seen efforts in the EU to say whatever you charge to take data out is the same as what you should charge it to put data in. We're seeing the FTC start to look at this, and we're seeing customers that are saying that this is a purely anti-competitive action.And, you know, I think what would be the best and healthiest thing for the cloud by far is if we made it easy to move between various cloud providers. Because right now the choice is, do I use AWS or Google or Microsoft, whereas what I think any company out there really wants to be able to do is they want to be able to say, “I want to use this feature at AWS because they're really good at that and I want to use this other feature at Google because they're really good at that, and I want to us this other feature at Microsoft, and I want to mix and match between those various things.” And I think that if you actually got cloud providers to start competing on features as opposed to competing on their overall platform, we'd actually have a much richer and more robust cloud environment, where you'd see a significantly improved amount of what's going on, as opposed to what we have now, which is AWS being mediocre at everything.Corey: I think that there's also a story where for me, the egress is annoying, but so is the cross-region and so is the cross-AZ, which in many cases costs exactly the same. And that frustrates me from the perspective of, yes, if you have two data centers ten miles apart, there is some startup costs to you in running fiber between them, however you want to wind up with that working, but it's a sunk cost. 
But at the end of that, though, when you wind up continuing to charge on a per gigabyte basis to customers on that, you're making them decide on a very explicit trade-off of, do I care more about cost or do I care more about reliability? And it's always going to be an investment decision between those two things, but when you make the reasonable approach of well, okay, an availability zone rarely goes down, and then it does, you get castigated by everyone for, “Oh it even says in their best practice documents to go ahead and build it this way.” It's funny how a lot of the best practice documents wind up suggesting things that accrue primarily to a cloud provider's benefit. But that's the way of the world I suppose.I just know, there's a lot of customer frustration on it and in my client environments, it doesn't seem to be very acute until we tear apart a bill and look at where they're spending money, and on what, at which point, the dawning realization, you can watch it happen, where they suddenly realize exactly where their money is going—because it's relatively impenetrable without that—and then they get angry. And I feel like if people don't know what they're being charged for, on some level, you've messed up.Matthew: Yeah. So, there's cost to running a network, but there's no reason other than limiting competition why you would charge more to take data out than you would put data in. And that's a puzzle. The cross-region thing, you know, I think where we're seeing a lot of that is actually oftentimes, when you've got new technologies that come out and they need to take advantage of some scarce resource. And so, AI—and all the AI companies are a classic example of this—right now, if you're trying to build a model, an AI model, you are hunting the world for available GPUs at a reasonable price because there's an enormous scarcity of them.And so, you need to move from AWS East to AWS West, to AWS, you know, Singapore, to AWS in Luxembourg and bounce around to find wherever there's GPU availability. And then that is crossed against the fact that these training datasets are huge. You know, I mean, they're just massive, massive, massive amounts of data. And so, what that is doing is you're having these AI companies that are really seeing this get hit in the face, where they literally can't get the capacity they need because of the fact that whatever cloud provider in whatever region they've selected to store their data isn't able to have that capacity. And so, they're getting hit not only by sort of a double whammy of, “I need to move my data to wherever there's capacity. And if I don't do that, then I have to pay some premium, an ever-escalating price for the underlying GPUs.” And God forbid, you have to move from AWS to Google to chase that.And so, we're seeing a lot of companies that are saying, “This doesn't make any sense. We have this enormous training set. If we just put it with Cloudflare, this is data that makes sense to live in the network, fundamentally.” And not everything does. Like, we're not the right place to store your long-term transaction logs that you're only going to look at if you get sued. There are much better places, much more effective places do it.But in those cases where you've got to read data frequently, you've got to read it from different places around the world, and you will need to decrease what those costs of each one of those reads are, what we're seeing is just an enormous amount of demand for that. 
And I think these AI startups are really just a very clear example of what company after company after company needs, and why R2 has had—which is our zero egress cost S3 competitor—why that is just seeing such explosive growth from a broad set of customers.Corey: Because I enjoy pushing the bounds of how ridiculous I can be on the internet, I wound up grabbing a copy of the model, the Llama 2 model that Meta just released earlier this week as we're recording this. And it was great. It took a little while to download here. I have gigabit internet, so okay, it took some time. But then I wound up with something like 330 gigs of models. Great, awesome.Except for the fact that I do the math on that and just for me as one person to download that, had they been paying the listed price on the AWS website, they would have spent a bit over $30, just for me as one random user to download the model, once. If you can express that into the idea of this is a model that is absolutely perfect for whatever use case, but we want to have it run with some great GPUs available at another cloud provider. Let's move the model over there, ignoring the data it's operating on as well, it becomes completely untenable. It really strikes me as an anti-competitiveness issue.Matthew: Yeah. I think that's it. That's right. And that's just the model. To build that model, you would have literally millions of times more data that was feeding it. And so, the training sets for that model would be many, many, many, many, many, many orders of magnitude larger in terms of what's there. And so, I think the AI space is really illustrating where you have this scarce resource that you need to chase around the world, you have these enormous datasets, it's illustrating how these egress fees are actually holding back the ability for innovation to happen.And again, they are absolutely—there is no valid reason why you would charge more for egress than you do for ingress other than limiting competition. And I think the good news, again, is that's something that's gotten regulators' attention, that's something that's gotten customers' attention, and over time, I think we all benefit. And I think actually, AWS and Google and Microsoft actually become better if we start to have more competition on a feature-by-feature basis as opposed to on an overall platform. The choice shouldn't be, “I use AWS.” And any big company, like, nobody is all-in only on one cloud provider. Everyone is multi-cloud, whether they want to be or not because people end up buying another company or some skunkworks team goes off and uses some other function.So, you are across multiple different clouds, whether you want to be or not. But the ideal, and when I talk to customers, they want is, they want to say, “Well, you know that stuff that they're doing over at Microsoft with AI, that sounds really interesting. I want to use that, but I really like the maturity and robustness of some of the EC2 API, so I want to use that at AWS. And Google is still, you know, the best in the world at doing search and indexing and everything, so I want to use that as well, in order to build my application.” And the applications of the future will inherently stitch together different features from different cloud providers, different startups.And at Cloudflare, what we see is our, sort of, purpose for being is how do we make that stitching as easy as possible, as cost-effective as possible, and make it just make sense so that you have one consistent security layer? 
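For a rough sanity check on the download cost quoted above, here is a minimal back-of-the-envelope sketch. The ~$0.09/GB figure is an assumption based on the commonly listed first tier of AWS internet egress pricing; actual rates vary by region, volume tier, and any negotiated discounts.

```python
# Back-of-the-envelope estimate of AWS internet egress cost for downloading the
# Llama 2 weights mentioned above. The 0.09 USD/GB rate is an assumed first-tier
# list price, not a quote; actual pricing varies by region, tier, and discounts.

MODEL_SIZE_GB = 330            # approximate combined size of the downloaded models
EGRESS_RATE_USD_PER_GB = 0.09  # assumed first-tier internet egress rate

single_download = MODEL_SIZE_GB * EGRESS_RATE_USD_PER_GB
print(f"One download: ~${single_download:,.2f}")  # roughly $30, in line with the figure above

# Scaling it up shows why per-gigabyte egress makes wide redistribution untenable.
downloads = 10_000  # hypothetical number of users fetching the model once each
print(f"{downloads:,} downloads: ~${single_download * downloads:,.2f}")  # ~$297,000
```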
And again, we're not about hording the data; we're about connecting all of those things together. And again, you know, from the last time we talked to now, I'm actually much more optimistic that you're going to see, kind of, this revolution where egress prices go down, you get competition on feature-by-features, and that's just going to make every cloud provider better over the long-term.Corey: This episode is sponsored in part by Panoptica.  Panoptica simplifies container deployment, monitoring, and security, protecting the entire application stack from build to runtime. Scalable across clusters and multi-cloud environments, Panoptica secures containers, serverless APIs, and Kubernetes with a unified view, reducing operational complexity and promoting collaboration by integrating with commonly used developer, SRE, and SecOps tools. Panoptica ensures compliance with regulatory mandates and CIS benchmarks for best practice conformity. Privacy teams can monitor API traffic and identify sensitive data, while identifying open-source components vulnerable to attacks that require patching. Proactively addressing security issues with Panoptica allows businesses to focus on mitigating critical risks and protecting their interests. Learn more about Panoptica today at panoptica.app.Corey: I don't know that I would trust you folks to the long-term storage of critical data or the store of record on that. You don't have the track record on that as a company the way that you do for being the network interchange that makes everything just work together. There are areas where I'm thrilled to explore and see how it works, but it takes time, at least from the sensible infrastructure perspective of trusting people with track records on these things. And you clearly have the network track record on these things to make this stick. It almost—it seems unfair to you folks, but I view you as Cloudflare is a CDN, that also dabbles in a few other things here in there, though, increasingly, it seems it's CDN and security company are becoming synonymous.Matthew: It's interesting. I remember—and this really is going back to the origin story, but when we were starting Cloudflare, you know, what we saw was that, you know, we watched as software—starting with companies like Salesforce—transition from something that you bought in the box to something that you bought as a service [into 00:23:25] the cloud. We watched as, sort of, storage and compute transition from something that you bought from Dell or HP to something that you rented as a service. And so the fundamental problem that Cloudflare started out with was if the software and the storage and compute are going to move, inherently the security and the networking is going to move as well because it has to be as a service as well, there's no way you can buy a you know, Cisco firewall and stick it in front of your cloud service. You have to be in the cloud as well.So, we actually started very much as a security company. And the objection that everybody had to us as we would sort of go out and describe what we were planning on doing was, “You know, that sounds great, but you're going to slow everything down.” And so, we became just obsessed with latency. And Michelle, my co-founder, and I were business students and we had an advisor, a guy named Tom [Eisenmann 00:24:26] in business school. 
And I remember going in and that was his objection as well and so we did all this work to figure it out.And obviously, you know, I'd say computer science, and anytime that you have a problem around latency or speed caching is an obvious part of the solution to that. And so, we went in and we said, “Here's how we're going to do it: [unintelligible 00:24:47] all this protocol optimization stuff, and here's how we're going to distribute it around the world and get close to where users are. And we're going to use caching in the places where we can do caching.” And Tom said, “Oh, you're building a CDN.” And I remember looking at him and then I'm looking at Michelle. And Michelle is Canadian, and so I was like, “I don't know that I'm building a Canadian, but I guess. I don't know.”And then, you know, we walked out in the hall and Michelle looked at me and she's like, “We have to go figure out what the CDN thing is.” And we had no idea what a CDN was. And even when we learned about it, we were like, that business doesn't make any sense. Like because again, the CDNs were the first ones to really charge for bandwidth. And so today, we have effectively built, you know, a giant CDN and are the fastest in the world and do all those things.But we've always given it away basically for free because fundamentally, what we're trying to do is all that other stuff. And so, we actually started with security. Security is—you know, my—I've been working in security now for over 25 years and that's where my background comes from, and if you go back and look at what the original plan was, it was how do we provide that security as a service? And yeah, you need to have caching because caching makes sense. What I think is the difference is that in order to do that, in order to be able to build that, we had to build a set of developer tools for our own team to allow them to build things as quickly as possible.And, you know, if you look at Cloudflare, I think one of the things we're known for is just the rapid, rapid, rapid pace of innovation. And so, over time, customers would ask us, “How do you innovate so fast? How do you build things fast?” And part of the answer to that, there are lots of ways that we've been able to do that, but part of the answer to that is we built a developer platform for our own team, which was just incredibly flexible, allowed you to scale to almost any level, took care of a lot of that traditional SRE functions just behind the scenes without you having to think about it, and it allowed our team to be really fast. And our customers are like, “Wow, I want that too.”And so, customer after customer after customer after customer was asking and saying, you know, “We have those same problems. You know, if we're a big e-commerce player, we need to be able to build something that can scale up incredibly quickly, and we don't have to think about spinning up VMs or containers or whatever, we don't have to think about that. You know, our customers are around the world. We don't want to have to pick a region for where we're going to deploy code.” And so, where we built Cloudflare Workers for ourself first, customers really pushed us to make it available to them as well.And that's the way that almost any good developer platform starts out. That's how AWS started. 
That's how, you know, the Microsoft developer platform, and so the Apple developer platform, the Salesforce developer platform, they all start out as internal tools, and then someone says, “Can you expose this to us as well?” And that's where, you know, I think that we have built this. And again, it's very opinionated, it is right for certain applications, it's never going to be the right place to run SAP HANA, but the company that builds the tool [crosstalk 00:27:58]—Corey: I'm not convinced there is a right place to run SAP HANA, but that's probably unfair of me.Matthew: Yeah, but there is a startup out there, I guarantee you, that's building whatever the replacement for SAP HANA is. And I think it's a better than even bet that Cloudflare Workers is part of their stack because it solves a lot of those fundamental challenges. And that's been great because it is now allowing customer after customer after customer, big and large startups and multinationals, to do things that you just can't do with traditional legacy hyperscale public cloud. And so, I think we're sort of the next generation of building that. And again, I don't think we set out to build a developer platform for third parties, but we needed to build it for ourselves and that's how we built such an effective tool that now so many companies are relying on.Corey: As a Cloudflare customer myself, I think that one of the things that makes you folks standalone—it's why I included security as well as CDN is one of the things I trust you folks with—has been—Matthew: I still think CDN is Canadian. You will never see us use that term. It's like, Gartner was like, “You have to submit something for the CDN-like ser—” and we ended up, like, being absolute top-right in it. But it's a space that is inherently going to zero because again, if bandwidth is free, I'm not sure what—this is what the internet—how the internet should work. So yeah, anyway.Corey: I agree wholeheartedly. But what I've always enjoyed, and this is probably going to make me sound meaner than I intend it to, it has been your outages. Because when computers inherently at some point break, which is what they do, you personally and you as a company have both taken a tone that I don't want to say gleeful, but it's sort of the next closest thing to it regarding the postmortem that winds up getting published, the explanation of what caused it, the transparency is unheard of at companies that are your scale, where usually they want to talk about these things as little as possible. Whereas you've turned these into things that are educational to those of us who don't have the same scale to worry about but can take things from that are helpful. And that transparency just counts for so much when we're talking about things as critical as security.Matthew: I would definitely not describe it as gleeful. It is incredibly painful. And we, you know, we know we let customers down anytime we have an issue. But we tend not to make the same mistake twice. And the only way that we really can reliably do that is by being just as transparent as possible about exactly what happened.And we hope that others can learn from the mistakes that we made. And so, we own the mistakes we made and we talk about them and we're transparent, both internally but also externally when there's a problem. And it's really amazing to just see how much, you know, we've improved over time. 
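For readers who have never touched the developer platform Matthew described a little earlier, a minimal Cloudflare Worker looks roughly like the sketch below. It assumes the module-style Worker syntax and the type definitions from `@cloudflare/workers-types`; the route and payload are invented for illustration, not taken from the conversation.

```typescript
// A minimal Cloudflare Worker (module syntax), shown only as an illustration
// of the "opinionated" platform discussed above: no VMs, no region picking,
// just a fetch handler that runs at every edge location.
export interface Env {
  // Bindings (KV namespaces, queues, etc.) would be declared here;
  // none are needed for this sketch.
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const url = new URL(request.url);

    // Serve a small, cacheable response; scaling, placement, and TLS are
    // handled by the platform rather than by the author.
    if (url.pathname === "/hello") {
      return new Response(JSON.stringify({ greeting: "hello from the edge" }), {
        headers: {
          "content-type": "application/json",
          // Ask the edge cache (and browsers) to hold this for a minute.
          "cache-control": "public, max-age=60",
        },
      });
    }

    return new Response("Not found", { status: 404 });
  },
};
```

Deploying it is a matter of pointing Wrangler at the file; the point of the sketch is how little operational surface the author has to think about, which is the property Matthew says customers kept asking for.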
So, it's actually interesting that, you know, if you look across—and we measure, we test and measure all the big hyperscale public clouds, what their availability and reliability is and measure ourselves against it, and across the board, second half of 2021 and into the first half of 2022 was the worst for every cloud provider in terms of reliability. And the question is why?And the answer is, Covid. I mean, the answer to most things over the last three years is in one way, directly or indirectly, Covid. But what happened over that period of time was that in April of 2020, internet traffic and traffic to our service and everyone who's like us doubled over the course of a two-week period. And there are not many utilities that you can imagine that if their usage doubles, that you wouldn't have a problem. Imagine the sewer system all of a sudden has twice as much sewage, or the electrical grid as twice as much demand, or the freeways have twice as many cars. Like, things break down.And especially the European internet came incredibly close to just completely failing at that time. And we all saw where our bottlenecks were. And what's interesting is actually the availability wasn't so bad in 2020 because people were—they understood the absolute critical importance that while we're in the middle of a pandemic, we had to make sure the internet worked. And so, we—there were a lot of sleepless nights, there's a—and not just at with us, but with every provider that's out there. We were all doing Herculean tasks in order to make sure that things came online.By the time we got to the sort of the second half of 2021, what everybody did, Cloudflare included, was we looked at it, and we said, “Okay, here were where the bottlenecks were. Here were the problems. What can we do to rearchitect our systems to do that?” And one of the things that we saw was that we effectively treated large data centers as one big block, and if you had certain pieces of equipment that failed in a way, that you would take that entire data center down and then that could have cascading effects across traffic as it shifted around across our network. And so, we did the work to say, “Let's take that one big data center and divide it effectively into multiple independent units, where you make sure that they're all on different power suppliers, you make sure they're all in different [crosstalk 00:32:52]”—Corey: [crosstalk 00:32:51] harder than it sounds. When you have redundant things, very often, the thing that takes you down the most is the heartbeat that determines whether something next to it is up or not. It gets a false reading and suddenly, they're basically trying to clobber each other to death. So, this is a lot harder than it sounds like.Matthew: Yeah, and it was—but what's interesting is, like, we took it all that into account, but the act of fixing things, you break things. And that was not just true at Cloudflare. If you look across Google and Microsoft and Amazon, everybody, their worst availability was second half of 2021 or into 2022. But it both internally and externally, we talked about the mistakes we made, we talked about the challenges we had, we talked about—and today, we're significantly more resilient and more reliable because of that. 
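The failure mode Corey alludes to, a single heartbeat getting a false reading and redundant nodes "clobbering each other," is commonly mitigated by requiring several independent observers to agree before declaring a peer dead. The sketch below is a generic illustration of that idea, not Cloudflare's actual failover logic; the probe names and quorum size are made up.

```typescript
// Illustrative quorum-based failure detection: a target is only treated as
// down when enough independent observers agree, which guards against acting
// on a single false heartbeat reading.
type Observation = { observer: string; targetHealthy: boolean };

function shouldFailOver(observations: Observation[], quorum: number): boolean {
  const unhealthyVotes = observations.filter((o) => !o.targetHealthy).length;
  // Only fail over when the number of "down" votes reaches the quorum.
  return unhealthyVotes >= quorum;
}

// Example: three probes, majority quorum of two.
const votes: Observation[] = [
  { observer: "probe-a", targetHealthy: true },
  { observer: "probe-b", targetHealthy: false }, // one bad reading
  { observer: "probe-c", targetHealthy: true },
];

console.log(shouldFailOver(votes, 2)); // false: one bad probe is not enough
```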
And so, transparency is built into Cloudflare from the beginning. The earliest story, I remember, there was a 15-year-old kid living in Long Beach, California, who bought my social security number off of a Russian website that had hacked a bank that I'd once used to get a mortgage. He then used that to redirect my cell phone voicemail to a voicemail box he controlled. He then used that to get into my personal email. He then used that to find a zero-day vulnerability in Google's corporate email where he could privilege-escalate from my personal email into Google's corporate email, which is the provider that we use for our email service. And then he used that as an administrator on our email at the time—this is back in the early days of Cloudflare—to get into another administration account that he then used to redirect one of Cloudflare's customers to a website that he controlled.

And thankfully, it wasn't, you know, the FBI or the Central Bank of Brazil, which were all Cloudflare customers. Instead, it was 4chan, because he was a 15-year-old hacker kid. And we fixed it pretty quickly and nobody knew who Cloudflare was at the time. And so potential—

Corey: The potential damage that could have been caused at that point with that level of access to things, like, that is such a ridiculous way to use it.

Matthew: And—yeah [laugh]—my temptation—because it was embarrassing. He took a bunch of stuff from my personal email and he put it up on a website, which, just to add insult to injury, was actually using Cloudflare as well. And I wanted to sweep it under the rug. And our team was like, "That's not the right thing to do. We're fundamentally a security company and we need to talk about when we make mistakes on security." And so, we wrote a huge postmortem on, "Here's all the stupid things that we did that caused this hack to happen." And by the way, it wasn't just us. It was AT&T, it was Google. I mean, there are a lot of people that ended up being involved.

Corey: It builds trust with that stuff. It's painful in the short term, but I believe with the benefit of hindsight, it was clearly the right call.

Matthew: And it was—and I remember, you know, pushing 'publish' on the blog post and thinking, "This is going to be the end of the company." And quite the opposite happened, which was, all of a sudden, we saw just an incredible number of people who signed up the next day saying, "If you're going to be that transparent about something that was incredibly embarrassing when you didn't have to be, then that's the sort of thing that actually makes me trust that you're going to be transparent in the future." And I think learning that lesson early on has been just an incredibly valuable lesson for us and made us the company that we are today.

Corey: A question that I have for you is about the idea of there being no reason to charge in one direction but not the other. There's something that I'm not sure that I understand on this. If I run a website, to use your numbers, with a terabit out—because it's a web server—and effectively nothing in—other than the requests, nothing really is going to come in—that ingress bandwidth becomes effectively unused and also free. So, if I have another use case where I'm paying for it anyway, if I'm primarily caring about an outward direction, sure, you can send things in for free. Now, there's a lot of nuance that goes into that. But I'm curious: is there a fundamental misunderstanding in that analysis of the bandwidth market?

Matthew: No.
And I think that's exactly, exactly right. And it's actually interesting. At Cloudflare, our infrastructure team—which is the one that manages our connections to the outside world, manages the hardware we have—meets on a quarterly basis with our product team. It's called the Hot and Cold Meeting.And what they do is they go over our infrastructure, and they say, “Okay, where are we hot? Where do we have not enough capacity?” If you think of any given server, an easy way to think of a server is that it has, sort of, four resources that are available to it. This is, kind of, vast simplification, but one is the connectivity to the outside world, both transit in and out. The second is the—Corey: Otherwise it's just a complicated space heater.Matthew: Yeah [laugh]. The other is the CPU. The other is the longer-term storage. We use only SSDs, but sort of, you know, hard drives or SSD storage. And then the fourth is the short-term storage, or RAM that's in that server.And so, at any given moment, there are going to be places where we are running hot, where we have a sort of capacity level that we're targeting and we're over that capacity level, but we're also going to be running cold in some of those areas. And so, the infrastructure team and the product team get together and the product team has requests on, you know, “Here's some more places we would be great to have more infrastructure.” And we're really good at deploying that when we need to, but the infrastructure team then also says, “Here are the places where we're cold, where we have excess capacity.” And that turns into products at Cloudflare. So, for instance, you know, the reason that we got into the zero-trust space was very much because we had all this excess capacity.We have 100 times the capacity of something like Zscaler across our network, and we can add that—that is primar—where most of our older products are all about outward traffic, the zero-trust products are all about inward traffic. And the reason that we can do everything that Zscaler does, but for, you know, a much, much, much more affordable prices, we going to basically just layer that on the network that already exists. The reason we don't charge for the bandwidth behind DDoS attacks is DDoS attacks are always about inbound traffic and we have just a ton of excess capacity around that inbound traffic. And so, that unused capacity is a resource that we can then turn into products, and very much that conversation between our product team and our infrastructure team drives how we think about building new products. And we're always trying to say how can we get as much utilization out of every single piece of equipment that we run everywhere in the world.The way we build our network, we don't have custom machines or different networks for every products. We build all of our machines—they come in generations. So, we're on, I think, generation 14 of servers where we spec a server and it has, again, a certain amount of each of those four [bits 00:39:22] of capacity. But we can then deploy that server all around the world, and we're buying many, many, many of them at any given time so we can get the best cost on that. 
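Both ideas Matthew covers here, the inbound/outbound asymmetry Corey asked about and the hot-and-cold bookkeeping across the four per-server resources, can be sketched in a few lines. Everything below is an assumption for illustration: the request and response sizes, the utilization target, and the classification logic are not Cloudflare's actual figures or tooling.

```typescript
// Illustrative only: rough numbers for the ingress/egress asymmetry, plus a
// toy version of hot/cold capacity classification. All figures are assumed.

// 1. Why a web server's inbound capacity is mostly idle.
const avgRequestBytes = 1_000;     // a typical HTTP request with headers (assumed)
const avgResponseBytes = 100_000;  // a typical page or API payload (assumed)
const egressGbps = 1_000;          // "a terabit out", per Corey's example
const ingressGbps = egressGbps * (avgRequestBytes / avgResponseBytes);
console.log(`~${ingressGbps} Gbps in for ${egressGbps} Gbps out`); // ~10 Gbps
// On symmetric ports, roughly 99% of the inbound side is spare capacity,
// which is the surplus that inbound-heavy products like DDoS mitigation
// and zero trust can ride on.

// 2. Hot/cold classification across the four per-server resources listed:
//    network, CPU, SSD storage, and RAM.
type Resource = "network_in" | "cpu" | "ssd" | "ram";
type Utilization = Record<Resource, number>; // 0.0 to 1.0
const TARGET = 0.7; // assumed target utilization ceiling

function classify(fleetAverage: Utilization): Record<Resource, "hot" | "cold"> {
  const out = {} as Record<Resource, "hot" | "cold">;
  for (const resource of Object.keys(fleetAverage) as Resource[]) {
    // "Hot" means over target and in need of more capacity;
    // "cold" means surplus that could back a new product.
    out[resource] = fleetAverage[resource] > TARGET ? "hot" : "cold";
  }
  return out;
}

console.log(classify({ network_in: 0.15, cpu: 0.82, ssd: 0.4, ram: 0.65 }));
// => { network_in: "cold", cpu: "hot", ssd: "cold", ram: "cold" }
```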
But our product team is very much in constant communication with our infrastructure team and saying, “What more can we do with the capacity that we have?” And then we pass that on to our customers by adding additional features that work across our network and then doing it in a way that's incredibly cost-effective.Corey: I really want to thank you for taking the time to, basically once again, suffer slings and arrows about networking, security, cloud, economics, and so much more. If people want to learn more, where's the best place for them to find you?Matthew: You know, used to be an easy question to answer because it was just, you know, go on Twitter and find me but now we have all these new mediums. So, I'm @eastdakota on Twitter. I'm eastdakota.com on Bluesky. I'm @real_eastdakota on Threads. And so, you know, one way or another, if you search for eastdakota, you'll come across me somewhere out there in the ether.Corey: And we will, of course, put links to that in the show notes. Thank you so much for your time. I appreciate it.Matthew: It's great to talk to you, Corey.Corey: Matthew Prince, CEO and co-founder of Cloudflare. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that I will of course not charge you inbound data rates on.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
The Role of DevRel at Google with Richard Seroter

Screaming in the Cloud

Play Episode Listen Later Aug 8, 2023 34:07


Richard Seroter, Director of Outbound Product Management at Google, joins Corey on Screaming in the Cloud to discuss what's new at Google. Corey and Richard discuss how AI can move from a novelty to truly providing value, as well as the importance of people maintaining their skills and abilities rather than using AI as a black box solution. Richard also discusses how he views the DevRel function, and why he feels it's so critical to communicate expectations for product launches with customers. About RichardRichard Seroter is Director of Outbound Product Management at Google Cloud. He's also an instructor at Pluralsight, a frequent public speaker, and the author of multiple books on software design and development. Richard maintains a regularly updated blog (seroter.com) on topics of architecture and solution design and can be found on Twitter as @rseroter. Links Referenced: Google Cloud: https://cloud.google.com Personal website: https://seroter.com Twitter: https://twitter.com/rseroter LinkedIn: https://www.linkedin.com/in/seroter/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Human-scale teams use Tailscale to build trusted networks. Tailscale Funnel is a great way to share a local service with your team for collaboration, testing, and experimentation.  Funnel securely exposes your dev environment at a stable URL, complete with auto-provisioned TLS certificates. Use it from the command line or the new VS Code extensions. In a few keystrokes, you can securely expose a local port to the internet, right from the IDE.I did this in a talk I gave at Tailscale Up, their first inaugural developer conference. I used it to present my slides and only revealed that that's what I was doing at the end of it. It's awesome, it works! Check it out!Their free plan now includes 3 users & 100 devices. Try it at snark.cloud/tailscalescream Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. We have returning guest Richard Seroter here who has apparently been collecting words to add to his job title over the years that we've been talking to him. Richard, you are now the Director of Product Management and Developer Relations at Google Cloud. Do I have all those words in the correct order and I haven't forgotten any along the way?Richard: I think that's all right. I think my first job was at Anderson Consulting as an analyst, so my goal is to really just add more words to whatever these titles—Corey: It's an adjective collection, really. That's what a career turns into. It's really the length of a career and success is measured not by accomplishments but by word count on your resume.Richard: If your business card requires a comma, success.Corey: So, it's been about a year or so since we last chatted here. What have you been up to?Richard: Yeah, plenty of things here, still, at Google Cloud as we took on developer relations. And, but you know, Google Cloud proper, I think AI has—I don't know if you've noticed, AI has kind of taken off with some folks who's spending a lot the last year… juicing up services and getting things ready there. And you know, myself and the team kind of remaking DevRel for a 2023 sort of worldview. 
So, yeah we spent the last year just scaling and growing and in covering some new areas like AI, which has been fun.Corey: You became profitable, which is awesome. I imagined at some point, someone wound up, like, basically realizing that you need to, like, patch the hole in the pipe and suddenly the water bill is no longer $8 billion a quarter. And hey, that works super well. Like, wow, that explains our utility bill and a few other things as well. I imagine the actual cause is slightly more complex than that, but I am a simple creature.Richard: Yeah. I think we made more than YouTube last quarter, which was a good milestone when you think of—I don't think anybody who says Google Cloud is a fun side project of Google is talking seriously anymore.Corey: I misunderstood you at first. I thought you said that you're pretty sure you made more than I did last year. It's like, well, yes, if a multi-billion dollar company's hyperscale cloud doesn't make more than I personally do, then I have many questions. And if I make more than that, I have a bunch of different questions, all of which could be terrifying to someone.Richard: You're killing it. Yeah.Corey: I'm working on it. So, over the last year, another trend that's emerged has been a pivot away—thankfully—from all of the Web3 nonsense and instead embracing the sprinkle some AI on it. And I'm not—people are about to listen to this and think, wait a minute, is he subtweeting my company? No, I'm subtweeting everyone's company because it seems to be a universal phenomenon. What's your take on it?Richard: I mean, it's countercultural now to not start every conversation with let me tell you about our AI story. And hopefully, we're going to get past this cycle. I think the AI stuff is here to stay. This does not feel like a hype trend to me overall. Like, this is legit tech with real user interest. I think that's awesome.I don't think a year from now, we're going to be competing over who has the biggest model anymore. Nobody cares. I don't know if we're going to hopefully lead with AI the same way as much as, what is it doing for me? What is my experience? Is it better? Can I do this job better? Did you eliminate this complex piece of toil from my day two stuff? That's what we should be talking about. But right now it's new and it's interesting. So, we all have to rub some AI on it.Corey: I think that there is also a bit of a passing of the buck going on when it comes to AI where I've talked to companies that are super excited about how they have this new AI story that's going to be great. And, “Well, what does it do?” “It lets you query our interface to get an answer.” Okay, is this just cover for being bad UX?Richard: [laugh]. That can be true in some cases. In other cases, this will fix UXes that will always be hard. Like, do we need to keep changing… I don't know, I'm sure if you and I go to our favorite cloud providers and go through their documentation, it's hard to have docs for 200 services and millions of pages. Maybe AI will fix some of that and make it easier to discover stuff.So in some cases, UIs are just hard at scale. But yes, I think in some cases, this papers over other things not happening by just rubbing some AI on it. Hopefully, for most everybody else, it's actually interesting, new value. But yeah, that's a… every week it's a new press release from somebody saying they're about to launch some AI stuff. I don't know how any normal human is keeping up with it.Corey: I certainly don't know. 
I'm curious to see what happens but it's kind of wild, too, because there you're right. There is something real there where you ask it to draw you a picture of a pony or something and it does, or give me a bunch of random analysis of this. I asked one recently to go ahead and rank the US presidents by absorbency and with a straight face, it did it, which is kind of amazing. I feel like there's a lack of imagination in the way that people talk about these things and a certain lack of awareness that you can make this a lot of fun, and in some ways, make that a better showcase of the business value than trying to do the straight-laced thing of having it explain Microsoft Excel to you.Richard: I think that's fair. I don't know how much sometimes whimsy and enterprise mix. Sometimes that can be a tricky part of the value prop. But I'm with you this some of this is hopefully returns to some more creativity of things. I mean, I personally use things like Bard or what have you that, “Hey, I'm trying to think of this idea. Can you give me some suggestions?” Or—I just did a couple weeks ago—“I need sample data for my app.”I could spend the next ten minutes coming up with Seinfeld and Bob's Burgers characters, or just give me the list in two seconds in JSON. Like that's great. So, I'm hoping we get to use this for more fun stuff. I'll be fascinated to see if when I write the keynote for—I'm working on the keynote for Next, if I can really inject something completely off the wall. I guess you're challenging me and I respect that.Corey: Oh, I absolutely am. And one of the things that I believe firmly is that we lose sight of the fact that people are inherently multifaceted. Just because you are a C-level executive at an enterprise does not mean that you're not also a human being with a sense of creativity and a bit of whimsy as well. Everyone is going to compete to wind up boring you to death with PowerPoint. Find something that sparks the imagination and sparks joy.Because yes, you're going to find the boring business case on your own without too much in the way of prodding for that, but isn't it great to imagine what if? What if we could have fun with some of these things? At least to me, that's always been the goal is to get people's attention. Humor has been my path, but there are others.Richard: I'm with you. I think there's a lot to that. And the question will be… yeah, I mean, again, to me, you and I talked about this before we started recording, this is the first trend for me in a while that feels purely organic where our customers, now—and I'll tell our internal folks—our customers have much better ideas than we do. And it's because they're doing all kinds of wild things. They're trying new scenarios, they're building apps purely based on prompts, and they're trying to, you know, do this.And it's better than what we just come up with, which is awesome. That's how it should be, versus just some vendor-led hype initiative where it is just boring corporate stuff. So, I like the fact that this isn't just us talking; it's the whole industry talking. It's people talking to my non-technical family members, giving me ideas for what they're using this stuff for. I think that's awesome. So yeah, but I'm with you, I think companies can also look for more creative angles than just what's another way to left-align something in a cell.Corey: I mean, some of the expressions on this are wild to me. The Photoshop beta with its generative AI play has just been phenomenal. 
Because it's weird stuff, like, things that, yeah, I'm never going to be a great artist, let's be clear, but being able to say remove this person from the background, and it does it, as best I can tell, seamlessly is stuff where yeah, that would have taken me ages to find someone who knows what the hell they're doing on the internet somewhere and then pay them to do it. Or basically stumble my way through it for two hours and it somehow looks worse afterwards than before I started. It's the baseline stuff of, I'm never going to be able to have it—to my understanding—go ahead just build me a whole banner ad that does this and hit these tones and the rest, but it is going to help me refine something in that direction, until I can then, you know, hand it to a professional who can take it from my chicken scratching into something real.Richard: If it will. I think that's my only concern personally with some of this is I don't want this to erase expertise or us to think we can just get lazy. I think that I get nervous, like, can I just tell it to do stuff and I don't even check the output, or I don't do whatever. So, I think that's when you go back to, again, enterprise use cases. If this is generating code or instructions or documentation or what have you, I need to trust that output in some way.Or more importantly, I still need to retain the skills necessary to check it. So, I'm hoping people like you and me and all our —every—all the users out there of this stuff, don't just offload responsibility to the machine. Like, just always treat it like a kind of slightly drunk friend sitting next to you with good advice and always check it out.Corey: It's critical. I think that there's a lot of concern—and I'm not saying that people are wrong on this—but that people are now going to let it take over their jobs, it's going to wind up destroying industries. No, I think it's going to continue to automate things that previously required human intervention. But this has been true since the Industrial Revolution, where opportunities arise and old jobs that used to be critical are no longer centered in quite the same way. The one aspect that does concern me is not that kids are going to be used to cheat on essays like, okay, great, whatever. That seems to be floated mostly by academics who are concerned about the appropriate structure of academia.For me, the problem is, is there's a reason that we have people go through 12 years of English class in the United States and that is, it's not to dissect of the work of long-dead authors. It's to understand how to write and how to tell us a story and how to frame ideas cohesively. And, “The computer will do that for me,” I feel like that potentially might not serve people particularly well. But as a counterpoint, I was told when I was going to school my entire life that you're never going to have a calculator in your pocket all the time that you need one. No, but I can also speak now to the open air, ask it any math problem I can imagine, and get a correct answer spoken back to me. That also wasn't really in the bingo card that I had back then either, so I am a hesitant to try and predict the future.Richard: Yeah, that's fair. I think it's still important for a kid that I know how to make change or do certain things. I don't want to just offload to calculators or—I want to be able to understand, as you say, literature or things, not just ever print me out a book report. But that happens with us professionals, too, right? 
Like, I don't want to just atrophy all of my programming skills because all I'm doing is accepting suggestions from the machine, or that it's writing my emails for me. Like, that still weirds me out a little bit. I like to write an email or send a tweet or do a summary. To me, I enjoy those things still. I don't want to—that's not toil to me. So, I'm hoping that we just use this to make ourselves better and we don't just use it to make ourselves lazier.Corey: You mentioned a few minutes ago that you are currently working on writing your keynote for Next, so I'm going to pretend, through a vicious character attack here, that this is—you know, it's 11 o'clock at night, the day before the Next keynote and you found new and exciting ways to procrastinate, like recording a podcast episode with me. My question for you is, how is this Next going to be different than previous Nexts?Richard: Hmm. Yeah, I mean, for the first time in a while it's in person, which is wonderful. So, we'll have a bunch of folks at Moscone in San Francisco, which is tremendous. And I [unintelligible 00:11:56] it, too, I definitely have online events fatigue. So—because absolutely no one has ever just watched the screen entirely for a 15 or 30 or 60-minute keynote. We're all tabbing over to something else and multitasking. And at least when I'm in the room, I can at least pretend I'll be paying attention the whole time. The medium is different. So, first off, I'm just excited—Corey: Right. It feels a lot ruder to get up and walk out of the front row in the middle of someone's talk. Now, don't get me wrong, I'll still do it because I'm a jerk, but I'll feel bad about it as I do. I kid, I kid. But yeah, a tab away is always a thing. And we seem to have taken the same structure that works in those events and tried to force it into more or less a non-interactive Zoom call, and I feel like that is just very hard to distinguish.I will say that Google did a phenomenal job of online events, given the constraints it was operating under. Production value is great, the fact that you took advantage of being in different facilities was awesome. But yeah, it'll be good to be back in person again. I will be there with bells on in Moscone myself, mostly yelling at people, but you know, that's what I do.Richard: It's what you do. But we missed that hallway track. You missed this sort of bump into people. Do hands-on labs, purposely have nothing to do where you just walk around the show floor. Like we have been missing, I think, society-wise, a little bit of just that intentional boredom. And so, sometimes you need at conference events, too, where you're like, “I'm going to skip that next talk and just see what's going on around here.” That's awesome. You should do that more often.So, we're going to have a lot of spaces for just, like, go—like, 6000 square feet of even just going and looking at demos or doing hands-on stuff or talking with other people. Like that's just the fun, awesome part. And yeah, you're going to hear a lot about AI, but plenty about other stuff, too. Tons of announcements. But the key is that to me, community stuff, learn from each other stuff, that energy in person, you can't replicate that online.Corey: So, an area that you have expanded into has been DevRel, where you've always been involved with it, let's be clear, but it's becoming a bit more pronounced. And as an outsider, I look at Google Cloud's DevRel presence and I don't see as much of it as your staffing levels would indicate, to the naive approach. 
And let's be clear, that means from my perspective, all public-facing humorous, probably performative content in different ways, where you have zany music videos that, you know, maybe, I don't know, parody popular songs do celebrate some exec's birthday they didn't know was coming—[fake coughing]. Or creative nonsense on social media. And the the lack of seeing a lot of that could in part be explained by the fact that social media is wildly fracturing into a bunch of different islands which, on balance, is probably a good thing for the internet, but I also suspect it comes down to a common misunderstanding of what DevRel actually is.It turns out that, contrary to what many people wanted to believe in the before times, it is not getting paid as much as an engineer, spending three times that amount of money on travel expenses every year to travel to exotic places, get on stage, party with your friends, and then give a 45-minute talk that spends two minutes mentioning where you work and 45 minutes talking about, I don't know, how to pick the right standing desk. That has, in many cases, been the perception of DevRel and I don't think that's particularly defensible in our current macroeconomic climate. So, what are all those DevRel people doing?Richard: [laugh]. That's such a good loaded question.Corey: It's always good to be given a question where the answers are very clear there are right answers and wrong answers, and oh, wow. It's a fun minefield. Have fun. Go catch.Richard: Yeah. No, that's terrific. Yeah, and your first part, we do have a pretty well-distributed team globally, who does a lot of things. Our YouTube channel has, you know, we just crossed a million subscribers who are getting this stuff regularly. It's more than Amazon and Azure combined on YouTube. So, in terms of like that, audience—Corey: Counterpoint, you definitionally are YouTube. But that's neither here nor there, either. I don't believe you're juicing the stats, but it's also somehow… not as awesome if, say, I were to do it, which I'm working on it, but I have a face for radio and it shows.Richard: [laugh]. Yeah, but a lot of this has been… the quality and quantity. Like, you look at the quantity of video, it overwhelms everyone else because we spend a lot of time, we have a specific media team within my DevRel team that does the studio work, that does the production, that does all that stuff. And it's a concerted effort. That team's amazing. They do really awesome work.But, you know, a lot of DevRel as you say, [sigh] I don't know about you, I don't think I've ever truly believed in the sort of halo effect of if super smart person works at X company, even if they don't even talk about that company, that somehow presents good vibes and business benefits to that company. I don't think we've ever proven that's really true. Maybe you've seen counterpoints, where [crosstalk 00:16:34]—Corey: I can think of anecdata examples of it. Often though, on some level, for me at least, it's been okay someone I tremendously respect to the industry has gone to work at a company that I've never heard of. I will be paying attention to what that company does as a direct result. Conversely, when someone who is super well known, and has been working at a company for a while leaves and then either trashes the company on the way out or doesn't talk about it, it's a question of, what's going on? Did something horrible happen there? Should we no longer like that company? Are we not friends anymore? 
It's—and I don't know if that's necessarily constructive, either, but it also, on some level, feels like it can shorthand to oh, to be working DevRel, you have to be an influencer, which frankly, I find terrifying.Richard: Yeah. Yeah. I just—the modern DevRel, hopefully, is doing a little more of product-led growth style work. They're focusing specifically on how are we helping developers discover, engage, scale, become advocates themselves in the platform, increasing that flywheel through usage, but that has very discreet metrics, it has very specific ownership. Again, personally, I don't even think DevRel should do as much with sales teams because sales teams have hundreds and sometimes thousands of sales engineers and sales reps. It's amazing. They have exactly what they need.I don't think DevRel is a drop in the bucket to that team. I'd rather talk directly to developers, focus on people who are self-service signups, people who are developers in those big accounts. So, I think the modern DevRel team is doing more in that respect. But when I look at—I just look, Corey, this morning at what my team did last week—so the average DevRel team, I look at what advocacy does, teams writing code labs, they're building tutorials. Yes, they're doing some in person events. They wrote some blog posts, published some videos, shipped a couple open-source projects that they contribute to in, like gaming sector, we ship—we have a couple projects there.They're actually usually customer zero in the product. They use the product before it ships, provides bugs and feedback to the team, we run DORA workshops—because again, we're the DevOps Research and Assessment gang—we actually run the tutorial and Docs platform for Google Cloud. We have people who write code samples and reference apps. So, sometimes you see things publicly, but you don't see the 20,000 code samples in the docs, many written by our team. So, a lot of the times, DevRel is doing work to just enable on some of these different properties, whether that's blogs or docs, whether that's guest articles or event series, but all of this should be in service of having that credible relationship to help devs use the platform easier. And I love watching this team do that.But I think there's more to it now than years ago, where maybe it was just, let's do some amazing work and try to have some second, third-order effect. I think DevRel teams that can have very discrete metrics around leading indicators of long-term cloud consumption. And if you can't measure that successfully, you've probably got to rethink the team.[midroll 00:19:20]Corey: That's probably fair. I think that there's a tremendous series of… I want to call it thankless work. Like having done some of those ridiculous parody videos myself, people look at it and they chuckle and they wind up, that was clever and funny, and they move on to the next one. And they don't see the fact that, you know, behind the scenes for that three-minute video, there was a five-figure budget to pull all that together with a lot of people doing a bunch of disparate work. Done right, a lot of this stuff looks like it was easy or that there was no work at all.I mean, at some level, I'm as guilty of that as anyone. We're recording a podcast now that is going to be handed over to the folks at HumblePod. They are going to produce this into something that sounds coherent, they're going to fix audio issues, all kinds of other stuff across the board, a full transcript, and the rest. And all of that is invisible to me. 
It's like AI; it's the magic box I drop a file into and get podcast out the other side.And that does a disservice to those people who are actively working in that space to make things better. Because the good stuff that they do never gets attention, but then the company makes an interesting blunder in some way or another and suddenly, everyone's out there screaming and wondering why these people aren't responding on Twitter in 20 seconds when they're finding out about this stuff for the first time.Richard: Mm-hm. Yeah, that's fair. You know, different internal, external expectations of even DevRel. We've recently launched—I don't know if you caught it—something called Jump Start Solutions, which were executable reference architectures. You can come into the Google Cloud Console or hit one of our pages and go, “Hey, I want to do a multi-tier web app.” “Hey, I want to do a data processing pipeline.” Like, use cases.One click, we blow out the entire thing in the platform, use it, mess around with it, turn it off with one click. Most of those are built by DevRel. Like, my engineers have gone and built that. Tons of work behind the scenes. Really, like, production-grade quality type architectures, really, really great work. There's going to be—there's a dozen of these. We'll GA them at Next—but really, really cool work. That's DevRel. Now, that's behind-the-scenes work, but as engineering work.That can be some of the thankless work of setting up projects, deployment architectures, Terraform, all of them also dropped into GitHub, ton of work documenting those. But yeah, that looks like behind-the-scenes work. But that's what—I mean, most of DevRel is engineers. These are folks often just building the things that then devs can use to learn the platforms. Is it the flashy work? No. Is it the most important work? Probably.Corey: I do have a question I'd be remiss not to ask. Since the last time we spoke, relatively recently from this recording, Google—well, I'd say ‘Google announced,' but they kind of didn't—Squarespace announced that they'd be taking over Google domains. And there was a lot of silence, which I interpret, to be clear, as people at Google being caught by surprise, by large companies, communication is challenging. And that's fine, but I don't think it was anything necessarily nefarious.And then it came out further in time with an FAQ that Google published on their site, that Google Cloud domains was a part of this as well. And that took a lot of people aback, in the sense—not that it's hard to migrate a domain from one provider to another, but it brought up the old question of, if you're building something in cloud, how do you pick what to trust? And I want to be clear before you answer that, I know you work there. I know that there are constraints on what you can or cannot say.And for people who are wondering why I'm not hitting you harder on this, I want to be very explicit, I can ask you a whole bunch of questions that I already know the answer to, and that answer is that you can't comment. That's not constructive or creative. So, I don't want people to think that I'm not intentionally asking the hard questions, but I also know that I'm not going to get an answer and all I'll do is make you uncomfortable. But I think it's fair to ask, how do you evaluate what services or providers or other resources you're using when you're building in cloud that are going to be around, that you can trust building on top of?Richard: It's a fair question. 
Not everyone's on… let's update our software on a weekly basis and I can just swap things in left. You know, there's a reason that even Red Hat is so popular with Linux because as a government employee, I can use that Linux and know it's backwards compatible for 15 years. And they sell that. Like, that's the value, that this thing works forever.And Microsoft does the same with a lot of their server products. Like, you know, for better or for worse, [laugh] they will always kind of work with a component you wrote 15 years ago in SharePoint and somehow it runs today. I don't even know how that's possible. Love it. That's impressive.Now, there's a cost to that. There's a giant tax in the vendor space to make that work. But yeah, there's certain times where even with us, look, we are trying to get better and better at things like comms. And last year we announced—I checked them recently—you know, we have 185 Cloud products in our enterprise APIs. Meaning they have a very, very tight way we would deprecate with very, very long notice, they've got certain expectations on guarantees of how long you can use them, quality of service, all the SLAs.And so, for me, like, I would bank on, first off, for every cloud provider, whether they're anchor services. Build on those right? You know, S3 is not going anywhere from Amazon. Rock solid service. BigQuery Goodness gracious, it's the center of Google Cloud.And you look at a lot of services: what can you bet on that are the anchors? And then you can take bets on things that sit around it. There's times to be edgy and say, “Hey, I'll use Service Weaver,” which we open-sourced earlier this year. It's kind of a cool framework for building apps and we'll deconstruct it into microservices at deploy time. That's cool.Would I literally build my whole business on it? No, I don't think so. It's early stuff. Now, would I maybe use it also with some really boring VMs and boring API Gateway and boring storage? Totally. Those are going to be around forever.I think for me, personally, I try to think of how do I isolate things that have some variability to them. Now, to your point, sometimes you don't know there's variability. You would have just thought that service might be around forever. So, how are you supposed to know that that thing could go away at some point? And that's totally fair. I get that.Which is why we have to keep being better at comms, making sure more things are in our enterprise APIs, which is almost everything. So, you have some assurances, when I build this thing, I've got a multi-year runway if anything ever changes. Nothing's going to stay the same forever, but nothing should change tomorrow on a dime. We need more trust than that.Corey: Absolutely. And I agree. And the problem, too, is hidden dependencies. Let's say what is something very simple. I want to log in to [unintelligible 00:25:34] brand new AWS account and spin of a single EC2 instance. The end. Well, I can trust that EC2 is going to be there. Great. That's not one service you need to go through that critical path. It is a bare minimum six, possibly as many as twelve, depending upon what it is exactly you're doing.And it's the, you find out after the fact that oh, there was that hidden dependency in there that I wasn't fully aware of. That is a tricky and delicate balance to strike. 
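Corey's point about hidden dependencies shows up even in the most minimal launch you can write. The sketch below uses the AWS SDK for JavaScript v3; the region, the AMI ID, and the dependency notes in the comments are placeholders and illustrative assumptions, since the exact count varies with how an account is set up.

```typescript
// A minimal "just one EC2 instance" launch, annotated with the services it
// quietly depends on. The SDK calls are from @aws-sdk/client-ec2 (v3); the
// AMI ID and region are placeholders, and the notes are illustrative.
import { EC2Client, RunInstancesCommand } from "@aws-sdk/client-ec2";

async function launchOneInstance(): Promise<string | undefined> {
  // Credential resolution already leans on IAM/STS before any API call.
  const client = new EC2Client({ region: "us-east-1" });

  const result = await client.send(
    new RunInstancesCommand({
      ImageId: "ami-PLACEHOLDER", // looking up an AMI is its own dependency
      InstanceType: "t3.micro",
      MinCount: 1,
      MaxCount: 1,
      // Even with every other field defaulted, the launch still relies on
      // the default VPC, a subnet, a security group, an EBS root volume,
      // instance metadata, and DNS before the box is reachable.
    }),
  );

  return result.Instances?.[0]?.InstanceId;
}

launchOneInstance().then((id) => console.log(`Launched ${id ?? "nothing"}`));
```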
And, again, no one is going to ever congratulate you—at all—on the decision to maintain a service that is internally painful and engineering-ly expensive to keep going, but as soon as you kill something, even it's for this thing doesn't have any customers, the narrative becomes, “They're screwing over their customers.” It's—they just said that it didn't have any. What's the concern here?It's a messaging problem; it is a reputation problem. Conversely, everyone knows that Amazon does not kill AWS services. Full stop. Yeah, that turns out everyone's wrong. By my count, they've killed ten, full-on AWS services and counting at the moment. But that is not the reputation that they have.Conversely, I think that the reputation that Google is going to kill everything that it touches is probably not accurate, though I don't know that I'd want to have them over to babysit either. So, I don't know. But it is something that it feels like you're swimming uphill on in many respects, just due to not even deprecation decisions, historically, so much as poor communication around them.Richard: Mm-hm. I mean, communication can always get better, you know. And that's, it's not our customers' problem to make sure that they can track every weird thing we feel like doing. It's not their challenge. If our business model changes or our strategy changes, that's not technically the customer's problem. So, it's always our job to make this as easy as possible. Anytime we don't, we have made a mistake.So, you know, even DevRel, hey, look, it puts teams in a tough spot. We want our customers to trust us. We have to earn that; you will never just give it to us. At the same time, as you say, “Hey, we're profitable. It's great. We're growing like weeds,” it's amazing to see how many people are using this platform. I mean, even services, you don't talk about having—I mean, doing really, really well. But I got to earn that. And you got to earn, more importantly, the scale. I don't want you to just kick the tires on Google Cloud; I want you to bet on it. But we're only going to earn that with really good support, really good price, stability, really good feeling like these services are rock solid. Have we totally earned that? We're getting there, but not as mature as we'd like to get yet, but I like where we're going.Corey: I agree. And reputations are tricky. I mean, recently InfluxDB deprecated two regions and wound up turning them off and deleting data. And they wound up getting massive blowback for this, which, to their credit, their co-founder and CTO, Paul Dix—who has been on the show before—wound up talking about and saying, “Yeah, that was us. We're taking ownership of this.”But the public announcement said that they had—that data in AWS was not recoverable and they're reaching out to see if the data in GCP was still available. At which point, I took the wrong impression from this. Like, whoa, whoa, whoa. Hang on. Hold the phone here. Does that mean that data that I delete from a Google Cloud account isn't really deleted?Because I have a whole bunch of regulators that would like a word if so. And Paul jumped onto that with, “No, no, no, no, no. I want to be clear, we have a backup system internally that we were using that has that set up. And we deleted the backups on the AWS side; we don't believe we did on the Google Cloud side. 
It's purely us, not a cloud provider problem.” It's like, “Okay, first, sorry for causing a fire drill.” Secondly, “Okay, that's great.” But the reason I jumped in that direction was just because it becomes so easy when a narrative gets out there to believe the worst about companies that you don't even realize you're doing it.Richard: No, I understand. It's reflexive. And I get it. And look, B2B is not B2C, you know? In B2B, it's not, “Build it and they will come.” I think we have the best cloud infrastructure, the best security posture, and the most sophisticated managed services. I believe that I use all the clouds. I think that's true. But it doesn't matter unless you also do the things around it, around support, security, you know, usability, trust, you have to go sell these things and bring them to people. You can't just sit back and say, “It's amazing. Everyone's going to use it.” You've got to earn that. And so, that's something that we're still on the journey of, but our foundation is terrific. We just got to do a better job on some of these intangibles around it.Corey: I agree with you, when you s—I think there's a spirited debate you could have on any of those things you said that you believe that Google Cloud is the best at, with the exception of security, where I think that is unquestionably. I think that is a lot less variable than the others. The others are more or less, “Who has the best cloud infrastructure?” Well, depends on who had what for breakfast today. But the simplicity and the approach you take to security is head and shoulders above the competition.And I want to make sure I give credit where due: it is because of that simplicity and default posturing that customers wind up better for it as a result. Otherwise, you wind up in this hell of, “You must have at least this much security training to responsibly secure your environment.” And that is never going to happen. People read far less than we wish they would. I want to make very clear that Google deserves the credit for that security posture.Richard: Yeah, and the other thing, look, I'll say that, from my observation, where we do something that feels a little special and different is we do think in platforms, we think in both how we build and how we operate and how the console is built by a platform team, you—singularly. How—[is 00:30:51] we're doing Duet AI that we've pre-announced at I/O and are shipping. That is a full platform experience covering a dozen services. That is really hard to do if you have a lot of isolation. So, we've done a really cool job thinking in platforms and giving that simplicity at that platform level. Hard to do, but again, we have to bring people to it. You're not going to discover it by accident.Corey: Richard, I will let you get back to your tear-filled late-night writing of tomorrow's Next keynote, but if people want to learn more—once the dust settles—where's the best place for them to find you?Richard: Yeah, hopefully, they continue to hang out at cloud.google.com and using all the free stuff, which is great. You can always find me at seroter.com. I read a bunch every day and then I've read a blog post every day about what I read, so if you ever want to tune in on that, just see what wacky things I'm checking out in tech, that is good. And I still hang out on different social networks, Twitter at @rseroter and LinkedIn and things like that. But yeah, join in and yell at me about anything I said.Corey: I did not realize you had a daily reading list of what you put up there. 
That is news to me and I will definitely track in, and then of course, yell at you from the cheap seats when I disagree with anything that you've chosen to include. Thank you so much for taking the time to speak with me and suffer the uncomfortable questions.

Richard: Hey, I love it. If people aren't talking about us, then we don't matter, so I would much rather we'd be yelling about us than the opposite there.

Corey: [laugh]. As always, it's been a pleasure. Richard Seroter, Director of Product Management and Developer Relations at Google Cloud. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that you had an AI system write for you because you never learned how to structure a sentence.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.