Podcasts about cloudfront

  • 76 PODCASTS
  • 169 EPISODES
  • 39m AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • Feb 18, 2025 LATEST


Latest podcast episodes about cloudfront

Thinking Elixir Podcast
241: A LiveView Debugger and Gigalixir

Feb 18, 2025 · 44:59


News includes the release of LiveDebugger, an exciting new browser-based debugging tool for Phoenix LiveView applications, and the announcement of Artifix for creating private Hex registries on S3 and CloudFront. We are also joined by Tim Knight, the CTO at Gigalixir, to get a peek inside the machine that is Gigalixir and learn more about how the platform specializes in providing an excellent Elixir deployment experience, and more!

Show Notes online - http://podcast.thinkingelixir.com/241

Elixir Community News
  • https://github.com/software-mansion-labs/live-debugger – New Phoenix LiveView debugging tool released providing browser-based debugging capabilities similar to React DevTools.
  • https://bsky.app/profile/bcardarella.bsky.social/post/3lhn3y7vw4k2v – Confirmation that LiveDebugger works with LiveView Native.
  • https://github.com/probably-not/artifix – New project Artifix announced, allowing creation of private Hex Registry on S3 and CloudFront with customizable deployment patterns.
  • https://gleam.run/news/gleam-gets-rename-variable/ – Gleam v1.8.0 released with significant Language Server enhancements and compiler improvements.
  • https://github.com/Wilfred/difftastic – Difftastic, a structural diff tool, now supports HEEx syntax highlighting.
  • https://bsky.app/profile/crbelaus.com/post/3lhtpkkn4vc2l – Additional announcement about Difftastic's HEEx support.
  • https://github.com/Wilfred/difftastic/pull/785 – Pull request adding HEEx support to Difftastic.
  • https://x.com/chris_mccord/status/1887957394149310502 – Chris McCord shares a preview of integrated AI work at Fly.io, demonstrating web search capabilities.

Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com

Discussion Resources
  • https://www.gigalixir.com/thinking – Thinking Elixir Podcast listeners get 20% off the standard tier for the first YEAR with the promo code "Thinking"
  • https://www.gigalixir.com
  • https://www.gigalixir.com/docs/
  • https://www.gigalixir.com/pricing/
  • https://journey.gigalixir.com/

Guest Information
  • https://twitter.com/gigalixir – on Twitter
  • https://github.com/gigalixir/ – on GitHub
  • https://bsky.app/profile/gigalixir.com – on Bluesky
  • https://elixir-lang.slack.com/archives/C5AJLMATG – gigalixir on Elixir Slack
  • https://gigalixir.com/ – Site

Find us online
  • Message the show - Bluesky (https://bsky.app/profile/thinkingelixir.com)
  • Message the show - X (https://x.com/ThinkingElixir)
  • Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir)
  • Email the show - show@thinkingelixir.com
  • Mark Ericksen on X - @brainlid (https://x.com/brainlid)
  • Mark Ericksen on Bluesky - @brainlid.bsky.social (https://bsky.app/profile/brainlid.bsky.social)
  • Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid)
  • David Bernheisel on Bluesky - @david.bernheisel.com (https://bsky.app/profile/david.bernheisel.com)
  • David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)

Screaming in the Cloud
Replay - Serverless Hero, Got Servers in His Eyes with Ant Stanley

Dec 3, 2024 · 33:37


On this Screaming in the Cloud Replay, we're revisiting our conversation with Ant Stanley, Co-Founder of Senzo. Ant sits down with Corey and traces the history that led from his time as a “Serverless Hero” to landing on the line that “serverless sucks.” Lend us your ears to see how that transition happened! Ant goes into detail on JeffConf (not of the Bezos nomen), working with servers, and what to put where and why. Ant and Corey talk over the plague of AWS services, and Ant offers his perspective on how to trim the fat and keep things simple to make long-term objectives more attainable. They discuss the importance of training, the role of certifications for better and worse, and more. Tune in for his take!

Show Highlights
  • (0:00) Intro
  • (0:51) Duckbill Group sponsor read
  • (1:24) What does it mean to be an AWS Serverless Hero?
  • (3:13) Why Ant and Corey are critical of the state of serverless
  • (7:53) Woes with Lambda and CloudFront
  • (10:12) The never-ending stream of new AWS services
  • (13:36) Hurdles ahead of going serverless
  • (17:33) Struggles of getting customers to understand a newly built service
  • (21:31) Duckbill Group sponsor read
  • (22:14) Pros and cons of certifications
  • (32:17) Where you can find more from Ant

About Ant Stanley
Ant Stanley is a community focused technologist with a passion for enabling better outcomes for society through technology. He is an AWS Serverless Hero, runs the Serverless London User Group, co-runs ServerlessDays London and is part of the ServerlessDays Global team.

Links
  • A Cloud Guru: https://acloudguru.com
  • homeschool.dev: https://homeschool.dev
  • aws.training: https://aws.training
  • learn.microsoft.com: https://learn.microsoft.com
  • Twitter: https://twitter.com/iamstan

Original Episode
https://www.lastweekinaws.com/podcast/screaming-in-the-cloud/serverless-hero-got-servers-in-his-eyes-with-ant-stanley/

Sponsor
The Duckbill Group: duckbillgroup.com

Le Podcast AWS en Français
Les nouveautés AWS au 29 novembre

Nov 29, 2024 · 18:48


I counted more than 200 new releases over the past two weeks, a typical situation just before re:Invent. I tried to group the main ones by category. We cover CloudFront, S3, DynamoDB, and a bunch of other services. Fasten your seatbelts, here we go.

Cloud Masters
Amazon CloudFront Deep Dive: Optimizing performance, cost, and security

Sep 25, 2024 · 37:47


In this episode, we dive into Amazon CloudFront, exploring its benefits, use cases, and cost optimization strategies. Specifically, we go into the importance of Average Object Size (AOS) when wanting to sign a CloudFront PPA. We also discuss how using CloudFront saves you on data transfer costs compared to alternative solutions, its versatility in handling both static and dynamic content, and the importance of page-loading time for user experience. Finally, we conclude with an examination of security considerations when using CloudFront, including strategies for mitigating DDoS attacks while keeping costs in check.
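
To make the Average Object Size point concrete, here is a minimal TypeScript sketch. It assumes AOS is simply total bytes transferred divided by the number of requests, which is the usual reading of the term but is not spelled out in the episode description. The `CdnUsage` type, the function name, and the figures are all hypothetical.

```typescript
// Hypothetical usage figures for one billing period; not taken from the episode.
interface CdnUsage {
  bytesTransferred: number; // total bytes served to viewers
  requests: number; // total HTTP(S) requests served
}

// Average object size in KB, assuming AOS = bytes transferred / request count.
function averageObjectSizeKB(usage: CdnUsage): number {
  if (usage.requests <= 0) {
    throw new Error("requests must be positive");
  }
  return usage.bytesTransferred / usage.requests / 1024;
}

const sample: CdnUsage = { bytesTransferred: 2048000, requests: 1000 };
console.log(averageObjectSizeKB(sample)); // 2
```

Under this assumption, a workload made of many small objects has a low AOS, so request charges weigh more heavily relative to data transfer, which is presumably why the figure comes up in pricing discussions.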

SEO Is Not That Hard
Cloudflare is so much more than just a CDN

Sep 11, 2024 · 12:19


Could you be missing out on the hidden potential of Cloudflare? Join me, Edd Dawson, as I share my latest insights from Jono Alderson's eye-opening presentation at the Hive MCR conference. Discover how Cloudflare can do more than just speed up your website: learn about its advanced features that allow you to add programmatic functionality without touching your original site. From caching to altering page content and intercepting requests, this episode is packed with advanced technical SEO techniques that can give your website a significant edge.

I'll also reflect on my personal journey with CloudFront, Amazon's CDN equivalent, and why Cloudflare might just be the more user-friendly option for many website owners, especially those using WordPress. Whether you're a seasoned Cloudflare user or just getting started, this episode is loaded with practical tips and expert advice. If you're eager to make your website faster, smarter, and more efficient, you won't want to miss this fascinating discussion. Tune in to unlock the untapped potential of your website today!

SEO Is Not That Hard is hosted by Edd Dawson and brought to you by KeywordsPeopleUse.com

  • You can get your free copy of my 101 Quick SEO Tips at: https://seotips.edddawson.com/101-quick-seo-tips
  • To get a personal no-obligation demo of how KeywordsPeopleUse could help you boost your SEO, book an appointment with me
  • See Edd's personal site at edddawson.com
  • Ask me a question and get on the show
  • Find Edd on Twitter @channel5
  • Find KeywordsPeopleUse on Twitter @kwds_ppl_use

"Werq" Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
http://creativecommons.org/licenses/by/4.0/

The Cloud Pod
268: Long Time Show Host is CloudPod's first Casualty to AI (For This Week, at Least)

Jul 21, 2024 · 49:12


Welcome to episode 268 of the Cloud Pod Podcast – where the forecast is always cloudy! Justin says he's in India, but we know he's really been replaced by Skynet. Jonathan, Matthew, and Ryan are here in his stead to bring all the latest cloud news, including PGO for optimization, a Linux vulnerability, CloudFront's new managed policies, and even a frank discussion about whether or not the AI Hype train has officially left the station. Sit back and enjoy!

Titles we almost went with this week:
  • OpenSSH sings “Oops I did it again”
  • All aboard, the AI hype train is leaving the station
  • Caching In on CloudFront’s New Managed Policies
  • Get your Go Apps a personal trainer this summer with PGO
  • Was Japan actually using floppy disks or were they 3.5
  • Azure is on summer break
  • Singapore will soon just be datacenters

A big thanks to this week's sponsor: We're sponsorless! Want to reach a dedicated audience of cloud engineers? Send us an email or hit us up on our Slack Channel and let's chat!

General News

00:56 Japan declares victory in effort to end government use of floppy disks
Here’s a bit of tech nostalgia meets modernization for you! Japan’s government has finally phased out the use of floppy disks in all its systems. The Digital Agency has scrapped over 1,000 regulations related to their use, marking a significant step in their efforts to update government technology. Digital Minister Taro Kono, who’s been on a mission to modernize Japan’s government tech, announced this victory last week. It’s part of a larger push to digitize Japan’s notoriously paper-heavy bureaucracy, which became glaringly apparent during the COVID-19 pandemic. Japan’s digitization efforts have hit some bumps along the way, including issues with a contact-tracing app and slow adoption of their digital ID system. It’s a reminder that modernizing legacy systems isn’t just about replacing old hardware – it’s a complex process that involves changing long-standing processes and especially mindsets.

02:36 Jonathan – “Yeah, I remember a couple of years ago they started talking about this modernization they were doing and people started to panic because Japan’s the largest purchaser of floppy disks anymore, or three and a half inch disks anyway. And so I ended up buying some because I’ve still got a USB floppy drive and some machines that have floppy disks. And I wanted just to stock up on some for the future, just in case the price went through the roof if Japan finally cut them and they have.”

05:16 regreSSHion: Remote Unauthenticated Code Execution Vulnerability in OpenSSH server
The Qualys Threat Research Unit just dropped a bombshell – they’ve discovered a remote code execution vulnerability in OpenSSH that affects millions of Linux systems. The vulnerability, dubbed “regreSSHion,” allows unauthenticated attackers to execute code as root on vulnerable system

UBC News World
Streamline Hosting With WP Toolkit Video Magic For Blog Audience Conversion

Jun 27, 2024 · 2:05


With WP Toolkit Video Magic by IM Wealth Builders, it's never been easier to host your videos via Amazon S3 and CloudFront and turn your blog into a sales machine! Find out more at: https://muncheye.com/im-wealth-builders-wp-toolkit-video-magic-v3

MunchEye
City: London
Address: London Office, 15 Harwood Road, London, England, United Kingdom
Website: https://muncheye.com/
Phone: +1-302-261-5332
Email: support@ampifire.com

AWS Developers Podcast
CloudFront hosting toolkit

Jun 21, 2024 · 36:18


This week's episode of the AWS Developers podcast dives into the CloudFront Hosting Toolkit, a command-line tool designed to streamline web application deployment on AWS. The podcast explores how the toolkit simplifies the process by enabling deployment to Amazon S3 with exposure through CloudFront. Additionally, it delves into the creation of an automated deployment pipeline linked to your Git repository. Listeners will gain insights into configuring advanced features like dynamic routing for the latest application version, eliminating the need for cache invalidation. The episode offers a comprehensive overview of the CloudFront Hosting Toolkit and guidance on getting started.

With Achraf Souk, Edge Specialist SA, AWS (https://www.linkedin.com/in/achrafsouk/) and Corneliu Croitoru (https://www.linkedin.com/in/corneliucroitoru/)

Links
Here are the links to the tools, technologies, or articles we mentioned in this episode.
  • Amazon CloudFront hosting toolkit: https://aws.amazon.com/blogs/networking-and-content-delivery/introducing-cloudfront-hosting-toolkit/
  • AWS CodePipeline: https://docs.aws.amazon.com/codepipeline/latest/userguide/welcome.html
  • Amazon CloudFront functions: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/cloudfront-functions.html
  • Amazon CloudFront Key Value Store: https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/kvs-with-functions.html
  • AWS CodeBuild: https://docs.aws.amazon.com/codebuild/latest/userguide/welcome.html
  • AWS Step Functions: https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html
  • Build on AWS Edge: https://aws.amazon.com/developer/application-security-performance/?whats-new-cards.sort-by=item.additionalFields.postDateTime&whats-new-cards.sort-order=desc&public-talk-id.sort-by=item.additionalFields.DisplayDate&public-talk-id.sort-order=desc&blogs-id.sort-by=item.additionalFields.createdDate&blogs-id.sort-order=desc
  • A/B Testing on AWS: https://aws.amazon.com/developer/application-security-performance/articles/a-b-testing/
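
The "dynamic routing for the latest application version" idea mentioned in this episode can be sketched in a few lines of TypeScript. This is not the toolkit's code: it assumes each build is uploaded under its own folder and that a key-value store records which folder is live. A plain `Map` stands in for the store, and the `current` key, the `v42` folder name, and `routeToCurrentVersion` are hypothetical.

```typescript
// Stand-in for a key-value store lookup; a Map keeps the sketch self-contained.
const versionStore = new Map<string, string>([["current", "v42"]]);

// Rewrite an incoming path so it resolves inside the folder of the live build.
function routeToCurrentVersion(uri: string, store: Map<string, string>): string {
  const version = store.get("current");
  if (version === undefined) {
    return uri; // no version recorded: leave the request untouched
  }
  // Treat directory-style requests as requests for index.html.
  const path = uri.endsWith("/") ? `${uri}index.html` : uri;
  return `/${version}${path}`;
}

console.log(routeToCurrentVersion("/", versionStore)); // /v42/index.html
console.log(routeToCurrentVersion("/assets/app.js", versionStore)); // /v42/assets/app.js
```

Under these assumptions, shipping a new build amounts to uploading it under a new folder and updating the single `current` entry. Objects cached for the previous build sit under the old prefix, so nothing has to be invalidated.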

AWS Bites
125. A first look at CloudFront Hosting Toolkit

Jun 13, 2024 · 33:36


In this episode, we discuss the newly announced CloudFront Hosting Toolkit from AWS. We provide an overview of the tool, which aims to simplify deploying modern front-end applications to AWS while retaining infrastructure control. We discuss the current capabilities and limitations and share our hands-on experiences trying out the tool. We also talk about alternatives like Vercel and Amplify, and the tradeoffs between convenience and control. Overall, the toolkit shows promise but is still early-stage. We are excited to see it evolve to support more frameworks and use cases.

52 Weeks of Cloud
AWS Virtual Private Cloud (VPC) and CloudFront

May 15, 2024 · 10:17


AWS Virtual Private Cloud (VPC) enables the creation of logically isolated virtual networks on the AWS Cloud, offering security, flexibility, and integration with various AWS services. CloudFront, a global content delivery network (CDN), ensures low latency, high data transfer speeds, and cost-effectiveness for content delivery.

Check out a Master's degree worth of courses on Coursera on topics ranging from Cloud Computing to Rust to LLMs and Generative AI: https://www.coursera.org/instructor/noahgift. You can also find many courses and programs on edX. I build courses: Pragmatic AI Labs on edX

Le Podcast AWS en Français
Les nouveautés AWS au 19 avril

Apr 19, 2024 · 16:02


I counted 63 new releases over the last two weeks, but I'm recording this episode late Thursday evening; the day isn't over yet in the US, so the counter can still go up. I've picked out changes around Lambda with CloudFront, encryption key rotation in KMS, and Bedrock.

Le Podcast AWS en Français
Les nouveautés AWS au 8 mars

Mar 8, 2024 · 14:46


This week is a heavy one. I counted 61 new releases since February 23; we'll talk about CloudFront, Location Service, Bedrock, Lambda for .NET, and an important pricing change. But I'm saving that one for the end of this episode.

Cloud Champions
53. Giacomo Piccinini, Founder di FantaSanremo

Feb 27, 2024 · 67:28


Over 4 million teams, with peaks of 7 million users handled during the live TV broadcasts: huge and constantly growing numbers (10x YoY), but with costs reduced by 30% thanks to the migration to the cloud. Amazon S3 and CloudFront are among the main ingredients; we'll talk about the recipe (and more) in this episode with Giacomo Piccinini, founder of FantaSanremo.

Kudos
Emanuele Garofalo for the episode's post-production

Contacts
  • All Improove podcasts: https://www.improove.tech/podcast
  • Improove Telegram channel: https://t.me/improove_tech
  • Cloud Champions Telegram channel: https://t.me/CloudChampions

Screaming in the Cloud
Championing CDK While Accepting the Limits of AWS with Matthew Bonig

Jan 11, 2024 · 43:32


Matthew Bonig, Chief Cloud Architect at Defiance Digital, joins Corey on Screaming in the Cloud to discuss his experiences in CDK, why developers can't be solely reliant on AI or coding tools to fill in the blanks, and his biggest grievances with AWS. Matthew gives an in-depth look at how and why CDK has been so influential for him, as well as the positive work that Defiance Digital is doing as a managed service provider. Corey and Matthew debate the need for AWS to focus on innovating instead of simply surviving off its existing customer base.About MatthewChief Cloud Architect at Defiance Digital. AWS DevTools Hero, co-author of The CDK Book, author of the Advanced CDK Course. All things CDK and Star Trek.Links Referenced:CDK Book: https://www.thecdkbook.com/cdk.dev: https://cdk.devTwitter: https://twitter.com/mattbonigLinkedIn: https://www.linkedin.com/in/matthewbonig/Personal website: https://matthewbonig.comduckbillgroup.com: https://duckbillgroup.comTranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. And I'm back with my first recording that was conducted post-re:Invent and all of its attendant glory and nonsense; we might talk a little bit about what happened at the show. But my guest today is the Chief Cloud Architect at Defiance Digital, Matthew Bonig. Matthew, thank you for joining me.Matthew: Thank you, Corey. Thanks for having me today.Corey: So, you are deep into the CDK. You're one of the AWS Dev Tools Heros, and you're the co-author of the CDK Book, you've done a lot, really. You have a course now for Advanced CDK work. 
Honestly, at this point, it starts to feel like when I say the CDK is a cult, you're one of the cult leaders, or at least very high up in the cult.Matthew: [laugh] Yes, it was something that I discovered—Corey: Your robe has a fringe on it.Matthew: Yeah, yeah. I discovered this at re:Invent, and it kind of hit me a little surprised that I got called out by a couple people by being the CDK guy. And I didn't realize that I'd hit that status yet, so I got to get myself a hat, and a cloak, and maybe some fun stuff to wear.Corey: For me, what I saw on the—it was in the run-up to re:Invent, but the big CDK sized announcement was the fact that the new version of Amplify now is much closer tied to the CDK than it was in previous incarnations, which is great. It sort of solves the problem, how do I build a thing through a variety of different tools? Great, and how do I manage that thing programmatically? It seems if, according to what it says on the tin, that it narrows that gap. Of course, here in reality, I haven't had time to pick anything like that up, and I won't for months, just because so much comes out all at the same time. What happened in the CDK world? What did I miss? What's exciting?Matthew: Well, you know, the CDK world has been, I've said, fairly mature for a while now. You know, fundamentally the way the CDK works and the functionality within it hasn't changed drastically. Even when 2.0 came out a couple of years ago, there wasn't a drastic fundamental change in the way that the API worked. 
Really, the efforts that we've been seeing for the last year or so, and especially the last few months, is trying to button up some functionality, hit some of those edge cases have been rough for some users, and ultimately just continue to fill out things like L2 constructs and maybe try to build out some L3s.I think what they're doing with Amplify is a good sign that they are trying to, sort of, reach across the aisle and work with other frameworks and work with other systems within AWS to make the experience better, shows their commitment to the CDK of making it really the first class citizen for doing IaC work in AWS.Corey: I think that that is a—that's a long road, and it's also a lot of work under the hood that's not easily appreciated. You've remarked at one point that my talk at the CDK Community Day was illuminating, if nothing else, if for no other reason than I dressed up as a legitimate actual cultist and a robe to give the talk—Matthew: Yeah. Loved it.Corey: Because I have deep-seated emotional problems. But it was fun. It talked a bit about my journey with it, where originally I viewed it as, more or less, this thing that was not for me. And a large part of that because I come from a world of sysadmin ops types, where, “I don't really know how to code,” was sort of my approach to this. Because I was reaff—I had that reaffirmed every time I talked to a developer. Like, “You call this a bash script? It's terrible.” And sure, but it worked, and it tied into a different knowledge set.Then, when I encountered the CDK for the first time, I tried to use it in Python, which at the time was not really well-supported and led to unfortunate outcomes—I do not know if that's still the case—what got me into it, in seriousness, was when I tried it a few months later with TypeScript and that started to work a little bit more clearly, with the caveat that I did not know JavaScript, I did not know TypeScript, I had to learn it as I went in service to the CDK. 
And it works really well insofar as it scratched an itch that I had. There's a whole class of problems that I don't have to deal with, which include getting someone who isn't me involved in some of that codebase, or working in environments where you have either a monorepo or a crap ton of tiny repos scattered everywhere and collaborating with other people. I cannot speak authoritatively to any of that. I will say it's incredibly annoying when I'm trying to update something written in the CDK, and then I have touched it in a year-and-a-half, and the first thing I have to do is upgrade a whole a bunch of dependencies, clear half a day just to get the warnings to clear before I can go ahead and deploy the things, let alone implement the tiny change I'm logging into the thing to fix.Matthew: Oh, yeah, yes. Yeah, the dependency updates are probably one of the most infuriating things about any Node.js system, and I don't think that I've ever run across any application project framework, anything in which doing dependency upgrades wasn't a nightmare. And I think it's because the Node.js community, more so than I've seen any other, doesn't care about semantic versioning. And unfortunately, the CDK doesn't technically care about semantic versioning, either, which makes it very tricky to do upgrades properly.Corey: There also seems to be the additional problem layered on top, which is all of the various documentation sources that I stumble upon, the official documentation, not terrific at giving real-world use case. It feels like it's trying to read the dictionary to learn how English works, not really its purpose. So, I find a bunch of blog posts, and all of them tend to approach this ecosystem slightly differently. One talks about using NPM. Another talks about Yarn.If you're doing anything that involves a web app, as seems to be increasingly common, some will say, “Oh, use WEBrick,” others will recommend using Vite. 
There's the whole JavaScript framework wars, and the only unifying best practice seems to be, “Oh, there's another way to do it that you should be using instead of the way you currently are on.” And if you listen to that, you wind up in hell.Matthew: Oh, horribly so. Yeah, the split in the ecosystem between NPM and Yarn, I think, has been incredibly detrimental to the overall comfort level in Node.js development. You know, I was an NPM guy for many, many years, and then actually, the CDK got me more using Yarn, simply because Yarn handles cross-library dependency resolution a bit different from NPM. And I just ran into fewer errors and fewer problems if I use Yarn along the way.But NPM then came a long way since then. Now, there's also a PNPM, which is good if you're using monorepos. But then if you're going to be using monorepos, there's another 15 tools out there that you can use for those sorts of things. And ultimately, I think it's going to be what is the thing that causes you the least amount of problems when dealing with them. And every single dependency issue that I've ever run into when upgrading any project, whether it be a web application, a back-end API, or the CDK, it's always unique enough that there isn't a one-size-fits-all answer to solving those problems.Corey: The most recent experience I had with the CDK—since you know, you're basically Mr. CDK at this point, whether you want to be or not, and this is what I do, instead of filing issues anywhere or asking for help, I drag people onto this show, and then basically assault them with my weird use cases—I'm in the process of building something out in the service of shitposting, because that is my nature, and I decided, oh, there's a new thing called the Dynamo table v2—Matthew: Yes.Corey: Which is great. I looked into it. The big difference is that it addresses it from the beginning as a global table, so you have optionality. Cool. 
Trying to migrate something that is existing from a Dynamo table to a Dynamo v2 table started throwing CloudFormation issues, so my answer was—this was pre-production—just tear down the stack and rebuild it. That feels like that would be a problem if this had been something that was actually full of data at this point.Matthew: There's a couple of ways that you could maybe go about it. Now, this is a very special case that you mentioned because you're talking about fundamentally changing the CloudFormation resource that you are creating, so of course, the CDK being an abstraction layer over top of CloudFormation and the Dynamo table v2 using the global table resource rather than just the table resource. If you had a case where you have to do that migration—and I've actually got a client right now who's very much looking to do that—the process would probably be to orphan the existing table so that you can retain the data and then using an import routine with CloudFormation to bring that in under the new resource. I haven't tried it yet—Corey: In this case, the table was empty, so it was easy enough to just destroy and then recreate, but it meant that I also had to tear down and recreate everything else in the stack as well, including CloudFront distributions, ACM certificates, so it took 20 minutes.Matthew: Yes. And that is one of the reasons why I often will stick any sort of stateful resource into their own stack so that if I have to go through an operation like this, I'm know that I'm not going to be modifying things that are very painful to drop and recreate, like, CloudFront distributions, which can take a half an hour or more to re-initialize.Corey: Yeah. So, that was fun. The problem got sorted out, but it was still a bit challenging. I feel like at some level, the CDK is hobbled by the fact that under the hood, it really just is just CloudFormation once all is said and done, and CloudFormation has never been the speediest thing. 
I didn't understand that until I started playing with Terraform and I saw how much more quickly it could provision things just by calling the service APIs directly. It sort of raises the question of what the hell the CloudFormation service is doing when it takes five times longer to do effectively the same thing.Matthew: Yeah, and the big thing that I appreciate about Terraform versus CloudFormation—speed being kind of the big win—is the fact that Terraform doesn't obfuscate or hide state from you. If you absolutely need to, you can go in and change that state that relates your Terraform definitions to the back-end resources. You can't do that with CloudFormation. So CloudFormation, did release few years ago, that import routine, and that was pretty good—not great, but pretty good; it's getting better all the time—whereas this was a complete and unneeded feature with Terraform because if it came down to the point where you already had a resource, and you just want to tie it to your IaC, you just edit a state file. And they've got their import routines and tie-in routines as well, but having that underlying state exposed was a big advantage, in my mind, to Terraform that I missed going to CloudFormation, and still to this day frustrates me that I can't do that underlying state change.Corey: It becomes painful and challenging, for better or worse.Matthew: Yep.Corey: But yeah, that was what I ran into. Things have improved, though. When I google various topics, I find that the v2 documentation comes up instead of the v1. That was maddening for a little while. 
I find that there are still things that annoy me, but they become fewer all the time, partially because I feel like I'm getting better at knowing how to search for them, and also because I think I'm becoming broken in the right ways that the CDK tends to expect.

Matthew: Oh, like how?

Corey: Oh, easy example here: I was recently trying to get something set up and running, and I don't know why this is the case, I don't know if it holds true in other programming languages, but I'm getting more used to the fact that there are two files in TypeScript-land that run a project. One is generally small and in a side directory that no one cares about, I think it's in the lib or the bin subdirectory. I don't remember which because I don't care. And then there are things you have to do within the other equivalent that basically reference each other. And I've gotten better at understanding that those aren't one file, for example. Though they sure seem to be in a lot of the demos, but it's not how the init process, when you're starting something new, spins up.

Matthew: Yeah, this is the hell of TypeScript, the fact that Node.js, as a runtime, cannot process TypeScript files, so you always have to pass them through a compiler. This is actually one of the things that I like about using Projen for all of my projects instead of using CDK init to start them: those baseline configurations handle the TypeScript nature of the runtime—or I should say, the anti-TypeScript nature of the runtime—a little bit better, and you run into fewer problems. You never have to worry about necessarily doing build routines or other things because they actually use the ts-node runtime to handle your CDK files instead of the node runtime. And I think that's a big benefit in terms of the developer experience. It just makes it so I generally never have to care about those JavaScript files that get compiled from TypeScript.
In the, you know, two years or so I've been using Projen, I've never had to worry about a build routine to turn that into JavaScript. And that makes the developer experience significantly better.

Corey: Yeah, I still miss an awful lot of things that I feel like I should be understanding. I've never touched Projen, for example. It's on my backlog of things to look into.

Matthew: Highly recommend it.

Corey: Yeah, I also am still in that area of… my TypeScript knowledge has not yet gotten to a point where I see the value of it. It feels like I've spent far more time fighting with the arbitrary restrictions that are TypeScript than it has saved me from typing errors in anything that I've built. I believe it has to come back around at some point of familiarity with the language, but I'm not there yet.

Matthew: Got you. So, Python developer before this?

Corey: Ish. Mostly brute force and enthusiasm, but yeah, Python.

Matthew: Python, and I think you said bash scripting and other things that have no inherent typing built into them.

Corey: Right.

Matthew: Yeah, that is a problem, I think… that I thankfully avoided. I was an application developer for many years. My background and my experience have always been around strongly typed languages, so when it came to adopting the CDK, everything felt very natural to me. But as I've worked with people over the years, both internally at Defiance as well as people in the community that don't have a background in that, I've been exposed to how problematic TypeScript as a language truly can be for someone who has never had this experience of, I've got this thing and it has a well-defined shape to it, and if I don't respect that, then I'm going to bang my head against these weird errors that are hard to comprehend and hard to grok way more than it feels like I'm getting value from it.

Corey: There's also a lack of understanding around how to structure projects, in my case, where all right, I have a front-end and I have a back-end.
Is this all within the context of the CDK project? And this, of course, also presupposes that everything I'm doing is effectively greenfield, in which case, great, do I use the front-end wizard tutorial thing that I'm following, and how does that integrate when I'm using the CDK to deploy it somewhere, and so on and so forth. It's stuff that makes sense once you have angry and loud enough opinions, but I don't yet.

Matthew: Yeah, so the key thing that I tell people about project structure—because it does come up a lot—is that ultimately, the CDK itself doesn't really care how you structure things. So, how you structure, where you put certain files, how you organize them, is your personal preference. Now, there are some exceptions to that. When it comes to things like Lambda functions that you're building or Dockerfiles, there are probably some better practices you can go through, but it's actually more dependent on those systems rather than the CDK directly itself. So I go through, in the Advanced CDK course, you know, my basic starting directory structure for everything, which is stacks, constructs, apps, and stages all go into their own specific directories.

But then once those directories start growing—because I've added more stacks, more constructs, and things—once I get to around five to maybe seven files in a directory, then I look at them and go, “Okay, how can I group these together?” I create subdirectories, I move those files around. My development tool of choice, which is WebStorm—JetBrains's long-running tool—handles the moving of those files for me, so all of my imports, all of my references automatically get updated accordingly, which is really nice, and I can refactor things as much as I want to without too much of a problem.
So, as a project grows over time, my directory structure can change to make sure that it is readable, well organized, and understandable, and it's never been too much of a problem.

Corey: Yeah, it's one of those things that does take some getting used to. It helps, I think, having a mentor of sorts to take you under their wing and explain these things to you, but that's a hard thing to scale as well. So, in the absence of that we wind up defaulting to oh, whatever the most recent blog post we read is.

Matthew: Yeah. Yeah, and I think one of the truest complaints I've heard about the CDK and why it can be fundamentally very difficult is that it has no guardrails. It is a general-purpose language, and general-purpose languages don't have guardrails. They don't want to be in the way of you building whatever you need to build.

But when it comes to an Infrastructure as Code project, which is inherently very different from an API or a website or other, sort of, more typical programming projects, not having guardrails is a bad thing, and it can really lead you down some bad paths. I remember working with a client this last year who had leveraged context instead of properties on classes to hand configuration values down through code, down through stacks and constructs and things like that. And it worked. It functionally got them what they needed, up until a point, and then all of a sudden, they were like, “Well, now we want to do X with the CDK, and we simply cannot because we've now painted ourselves into a corner.” And that's the downside of not having these good guardrails.

And I think they needed to do this early on. When the CDK was initially released, and it got popular back around the 0.4, 0.5 timeframe—I think I picked it up right around 0.4, too—when it officially hit a 1.0 release, there should have been a better set of guidelines and best practices published.
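[Editor's sketch] The context-versus-properties trade-off Matthew describes can be shown in a few lines. The `TreeNode` class is a hypothetical stand-in for a construct tree, not the real CDK classes, and the `logRetentionDays` setting is an invented example. The point it illustrates, under those assumptions, is that string-keyed context lookups fail silently while typed properties are checked by the compiler.

```typescript
// Hypothetical stand-in for a construct tree; not the real CDK classes.
class TreeNode {
  private readonly context: { [key: string]: unknown } = {};
  constructor(readonly id: string, readonly parent?: TreeNode) {}

  setContext(key: string, value: unknown): void {
    this.context[key] = value;
  }

  // Walks up the tree; returns undefined when no ancestor set the key.
  tryGetContext(key: string): unknown {
    if (Object.prototype.hasOwnProperty.call(this.context, key)) {
      return this.context[key];
    }
    return this.parent ? this.parent.tryGetContext(key) : undefined;
  }
}

// Style 1: configuration pulled from ambient context by string key.
function retentionFromContext(scope: TreeNode): number {
  // The misspelled key compiles fine and silently falls back to the default.
  const value = scope.tryGetContext("logRetentionDay");
  return typeof value === "number" ? value : 7;
}

// Style 2: configuration handed down as typed properties.
interface ServiceProps {
  readonly logRetentionDays: number;
}

function retentionFromProps(props: ServiceProps): number {
  // A misspelled property name here would be a compile-time error.
  return props.logRetentionDays;
}

const app = new TreeNode("App");
app.setContext("logRetentionDays", 30);
const stack = new TreeNode("ServiceStack", app);

const viaContext = retentionFromContext(stack);
const viaProps = retentionFromProps({ logRetentionDays: 30 });
console.log(viaContext, viaProps);
```

With properties, every consumer states what it needs in its signature, so a later restructuring shows up as type errors instead of as silently changed behavior.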
You can go to the documentation and see them, and they have been published, but it really didn't go far enough to really explain how and why you had to take the steps to make sure you didn't screw yourself six months later.

Corey: It's sort of those one-way doors you don't realize you're passing through when you first start building something. And I find, especially when you follow my development approach, which more or less used to be copying and pasting from various places, now it's copying and pasting from one place, which is Chat-Gippity-4, then—although I've seen increasingly GitHub's Copilot has been great at this, and CodeWhisperer, in my experience, has not yet been worth the energy it takes to really go diving into it. Your mileage may of course vary on that. But I found it was not making materially better suggestions on CDK stuff than Copilot was.

Matthew: Yeah, I haven't tried CodeWhisperer outside of the shell. I've been using Copilot for the last year and absolutely adore it. I think it has completely changed the way that I felt about coding. I saw writing code for the last couple of years as being very tedious and very boring in terms of there weren't interesting problems to solve, and Copilot, as I've seen it, is autocomplete on steroids. So, it doesn't keep me from having to solve the interesting problems; it just keeps me from having to type out the boring solutions, and it's the thing that I love about it.

Now, hopefully, CodeWhisperer continues to get better over time. I'm hoping all of Amazon's GenAI products continue to get better over time and I can maybe ditch a subscription to Copilot, but for now, Copilot is still my thing. And it's producing good enough results for me. Thankfully, because I've been working with it for four years now, I don't rely on it to answer my questions about how to use constructs. I go back to the docs for those.
If I need to.

Corey: It occurs to me that I can talk about this now because this episode will not air until after this has become generally available, but what's really spanked it from my perspective has been Google's Duet. And the key defining difference is, as I'm in one of these files—in many cases, I'm doing something with React these days due to an escalating series of weird choices—and—

Matthew: My apologies, by the way. My condolences, I should say.

Corey: Well, yeah. Well, things like Copilot Chat are great when they say, “Oh yeah, assuming that you're handling the state this way in your component, now…” What I love about Duet is it goes, and it actually checks, which is awesome. And it has contextual awareness of the entire project, not just the three lines that I'm talking about, or the file that I'm looking at this moment. It goes ahead and does the intelligent thing of looking at some of these things. It still has some problems where it's confidently wrong about things that it really shouldn't be, but okay, early days.

Matthew: Sure. Yeah, I'll need to check that out a little bit more because I still, to this day, despise working with React. It is still my framework of choice because the ecosystem around it is so good, and so established, that I know that whatever problem I have, I'll find 14 blogs, and maybe one of them is the answer that I want, versus any other framework where it still feels so very new and so very immature that I will probably beat my head more than I want to. Web development now is a hobby, not a job, so I don't want to bang my head against a hobby project.

Corey: I tend to view, on some level, that these AI coding assistants are good enough to get me almost anywhere I need to go, to the point where a beginner or enthusiastic amateur will be able to get sorted out. And for a lot of what I'm building, that's all I really need. I don't need this to be something that will withstand the rigors of production at a bank, for example.
One challenge I have seen with all these things is there's a delay between something being released and their training data growing to understand those things. Very often it'll wind up giving me recommendations for—I forget the name of it, but there was a state manager in React where the first thing you saw when you installed it was, “This has been deprecated. This is the new replacement.” And if you explicitly ask about the replacement, it does the right thing, but it just cheerfully goes ahead and tells you to use ancient stuff or apply poor security practices or the rest.

Matthew: Yeah, that's very scary to me, to be honest, because I think these AI development tools—for me, it's revitalized my interest in doing development, but where I get really, really scared is where they become a dependency in writing the right code. And every time I ever use Copilot to fill out stuff, I'm always double-checking, and I'm always making sure that this is right or that is right. And what I worry about is those developers who are maybe still learning some things, or are having to write inline SQL in their back-end and let Copilot, or CodeWhisperer, or whatever tool they pick fill this stuff out, and that answer is based on a solution that works for a 10,000 record database, but fails horribly on a 100 million record database.
And now, all of a sudden, you've got this problem that is just festering in through a dev environment, in through a QA environment, and even maybe into a prod environment, and you don't find out about that failure until six months later, when some database table runs past its magical limit and now all of a sudden, you've got these queries that are failing, they're crashing databases, they're running into problems, and this developer that didn't really know what they built in the first place is now being asked, “Why doesn't your code work,” and they just sort of have to go, “Maybe ChatGPT can tell me why my code doesn't work.” And that's the scariest part to me of these things: they're a little bit too good at answering difficult questions with a simple answer. There is no “It depends” with these answers, and there needs to be for a lot of what we do in complex systems. For example, in the AWS world, we're expected to build complex systems, and ChatGPT and these other tools are bad at that.

Corey: We're required to build complex systems, and, on some level, I would put that onus on Amazon in many respects. I mean, the challenge I keep smacking into is that they're building—they're giving you a bunch of components and expecting you to assemble them all yourself to achieve even relatively simple things. It increasingly feels like this is the direction that they want customers to go in because they're bad at moving up the stack and delivering integrated solutions themselves.

Matthew: Well, so I would wonder, what would you consider a relatively simple system, then?

Corey: Okay, one of the things I like to do is go out in the evenings, and sometimes with a friend, I'll have a few too many beers. And then I'll come up with an idea: I want to redirect this random domain that I want to buy to someone else's website. The end.
Now, if you go with Namecheap, or GoDaddy, or one of these various things, you can set that up in their mobile app with a couple of clicks and a payment, and you're done. With AWS, you have a minimum of six different services you need to work with, many of which do not support anything on a mobile basis and don't talk to one another relatively well. I built a state machine out of Step Functions that will do a lot of it for me, but it's an example of having to touch so many different things just for a relatively straightforward solution space that is a common problem. And that's a small example, but you see it across the board.

Matthew: Yeah, yeah. I was expecting you to come up with a little bit of a different answer for what a simple system is, for example, a website. Everyone likes to say, “Oh, a static website with just raw HTML. That's a simple”—

Corey: No, that's hard as hell because the devil is in the details, and it slices you to ribbons whenever you go down that path.

Matthew: Exactly.

Corey: No, I'm talking things that a human being would do without needing to be an expert in getting that many different AWS services to talk to one another.

Matthew: Yeah, and I agree that AWS traditionally is very bad at moving up that stack and getting those things to work. You had mentioned at the very top of this about Amplify. Amplify is a system that I have tried once or twice, and I generally think that, for the right use case, it is an excellent system and I really like a lot of what it does.

Corey: It is. I agree. Having gone down that path, building up my scavenger hunt app that I'll be open-sourcing at some point next year.

Matthew: Yeah.
And it's fantastic, but it has a very steep cliff where you hit that point where all of a sudden, you go, “Okay, I added this, and I added this, and I added this, and now I want to add this one other thing, but to do it, now all of a sudden, I have to go through a tremendous amount of work.” It wasn't just the simple push button that the previous four steps were. Now, I have this one other thing that I need to do, and now it's a very difficult thing to incorporate into my system. And I'm having to learn all new stuff that I never had to care about before because Amplify made it way too easy.

And I don't think this is necessarily an AWS problem. I think this is just a fundamentally difficult software problem to solve. Microsoft, I spent years and years in the Microsoft world, and this was my biggest complaint about Microsoft: they made extremely difficult things far too simple to solve. And then once those systems became either buggy, problematic, misconfigured, whatever you want to call it, once they stopped working for some reason, the people who were responsible for figuring those answers out didn't have the preceding knowledge because they didn't need it. And then all of a sudden, they go, “Well, I don't know how to solve this problem because I was told it was just this push-button thing.”

So, Amplify is great, and I think it's fantastic, but it is a very, very difficult problem to solve. Amazon has proven to be very, very good at building the fundamentals, and I think that they function very well as a platform service, as building blocks. But they give you the Lego pieces, and they expect you to build the very complex Batmobile. And they can maybe give you some custom pieces here and there, like the fenders, and the tires, and stuff like that, but that's not their bread and butter.

Corey: Well, even starting with the CDK is a perfect example. Like, you can use the CDK init to create a new project from scratch, which is awesome.
I love the fact that that exists, but it doesn't go far enough. It doesn't automatically create a repo you store the thing in that in turn hooks up to a CI/CD process that will wind up doing the build and deploy. Instead, it expects you to do that all locally, which is a counter pattern. That's an anti-pattern. It'll lead you down the wrong path. And you always have to build these things from scratch yourself as you keep going. At least that's what it feels like.

Matthew: Yeah, it is. And I think that here at Defiance Digital, our job as an MSP is to talk to the customer and figure out, what are those very specific things you need? So, we do build new CDK repos all the time for our customers. But some of our customers want a trunk-based system. Some of them want a branching or a development-branch-based system. Some of them have a very complex SDLC process within a PR stage of code changes versus a slightly less complex one after things have been merged into trunk.

So, we fundamentally look at it like we're that bridge between the two, and in that case, AWS works great. In fact, all SaaS solutions are really nice because they give us those building blocks and then we provide value by figuring out which one of those we need to incorporate in for our clients. But every single one of our clients is very different. And we've only got, you know, less than a dozen right now. But you know, I've got project managers and directors always coming back to me and saying, “Well, how do we cookie-cutter this process?” And you can't do it. It's just very, very difficult.

Not at a small scale. Maybe when you're really big, and you're a company like AWS who has thousands, if not potentially millions of customers, you can find those patterns, but it is a very fundamentally difficult problem to solve, and we've seen multiple companies over the last two decades try to do these things and ultimately fail.
So, I don't necessarily blame AWS for not having these things or not doing them well.

Corey: Yes and no. I mean, GitHub delivers an excellent experience for the user, start to finish. There's—Vercel does something very similar over in the front-end universe, too, where it is clearly possible, but it seems that designing user interfaces and integrating disparate things together is not in Amazon's DNA, which makes sense when you view the two-pizza teams assembling to build larger things. But man, is that a frustration.

Matthew: Yeah. I really wonder if this two-pizza team mentality can ever work well for products that are bigger than just the fundamental concepts. I think Amplify is pretty good, but if you really want something that is this service that works for 80% of customers, you can't do it with five people. You can't do it with six. You need to have teams like what GitHub and Vercel and others have, where teams are potentially dozens of people that really coordinate things and have a good project manager and product owner and understand the problem very well. And it's just very difficult with these very, very small teams to get that going.

I don't know what the future of AWS looks like. It feels very much like Microsoft in the mid-2000s, which is, they're running off of their existing customers, they don't really have a need to innovate significantly because they have a lot of people locked in, they would be just fine for years on end with the products they have. So, there isn't a huge driver for doing it, not like, maybe, GCP or Azure, which really need to continue to innovate more strongly in this space to pick up more customers. AWS doesn't have a problem getting customers.

And if there isn't a significant change in the mentality, like what Microsoft saw at the end of the 2000s with getting rid of Ballmer, bringing in Satya and really changing the mentality inside the company, I don't see AWS breaking out from this anytime soon.
But I think that's actually a good thing. I think AWS should stick to just building the fundamentals, and I think that they should rely on their partners and their third parties to bridge that gap. I think Jeremy Daly at Ampt and what they're building over there is a fantastic product.

Corey: Yeah. The problem is that Amazon seems to be in denial about a lot of this, at least with what they're saying publicly.

Matthew: Yeah, but what they say publicly and how they feel internally could be very, very different. I would say that, you know, we don't know what they're thinking internally. And that's fine. I don't necessarily need to. I think more specifically, we need to understand what their roadmap looks like and we need to understand, you know, what they are going to change in the future to maybe fill in some of these gaps.

I would say that the problem you described earlier about being able to do a simple website redirect, I don't think that's Amazon's desire to build those things. I think there should be a third party that's built on top of AWS, and maybe even works directly within your AWS account as a marketplace product for doing that, but I don't think that's necessarily in the benefit of AWS to build that directly.

Corey: We'll see. I'm very curious to see how this unfolds because a lot of customers want answers that require things that have to be assembled for them. I mean, honestly, a lot of the GenAI stuff is squarely in that category.

Matthew: Agreed, but is this something where AWS needs to build it internally, and then we've got a product like App Composer, or Copilot, or things where they try, and then because they don't get enough traction, it just feels like they stall out and get stagnant? I mean, App Composer was a keynote product announcement during last year's re:Invent, and this year, we saw them introduce Step Functions editing within it, and introduce the functionality into your IDE, VS Code directly.
Both good things, but a year's worth of development effort to release those two features feels slow to me. The integration to VS Code should have been simple.

Corey: Yeah. They are not the innovative company that would turn around and deliver something incredible three months after something had launched, “And here's a great new series of features around it.” It feels like the pace of innovation and pace of delivery has massively slowed.

Matthew: Yeah. And that's the scariest thing for me. And, you know, we saw this a little bit with a discussion recently in the cdk.dev server because if you take a look at what's been happening with the CDK application for the last six months and even almost a year now, it feels like the pace of changes within the codebase has slowed.

There have been multiple releases over the course of the last year where the release at the end of the week—and they hit a pretty regular cadence of a release every week—that release at the end of the week fixes one bug or adds one small feature change to one construct in some library that maybe 10% of users are going to use. And that's troublesome. One of the main reasons why I ditched Terraform and went hard on the CDK was that I looked at how many issues were open on the Terraform AWS provider, and how many missing features there were, and how slow they were to incorporate those in, and said, “I can't invest another two years into this product if there isn't going to be that innovation.” And I wasn't in a place to do the development work myself—despite the fact that you can because it's open-source and providers are forkable—and the CDK is getting real close to that same spot right now.
So, this weekend—and I know this is going to come out, you know, weeks later—but you know, the weekend of December 10th, they announced a change to the way that they were going to take contributions from the CDK community.

And the long and short of it right now—and there's still some debate over exactly what they said—is, we're not going to accept brand-new L2 constructs from the community. Those have to be built internally by AWS only. That's a dr—step in the wrong direction. I understand why they're taking that approach. Contributions in the CDK have been very rough for the last four or five months because of the previous policies they put into place, but this is an open-source product. It's supposed to be an open-source product. It's also a very complex set of code because of all of the various AWS services that are being hit by it. This isn't just Amplify, which is hitting a couple of things here and there. This is potentially—

Corey: It touches everything.

Matthew: It touches everything.

Corey: Yeah, I can see their perspective, but they've got to get way better at supporting things rapidly if they want to play that game.

Matthew: And they can't do that internally with AWS, not with a two-pizza team.

Corey: No. And there's an increasing philosophy I'm hearing from teams of, “Well, my service supports it. Other stuff, that's not my area of responsibility.” The wisdom that I've seen that really encapsulates this is written on Colm MacCárthaigh's old laptop in 2019: “AWS is the product.” That's the truth. It's not about the individual components; it's about the whole, collectively.

Matthew: Right. And so, if we're not getting these L2 constructs and these things being built out for all of the services that CloudFormation hits, then the product feels stalled, there isn't a good incentive for users to continue trying to adopt it because over time, users are just going to hit more and more services in AWS, not fewer as they use the products. That's what AWS wants.
They want people to be using VPC Lattice and all the GenAI stuff, and Glue, and SageMaker, and all these things, but if you don't have those L2 constructs, then there's no advantage of the CDK over top of just raw CloudFormation. So, the step in the right direction, in my opinion, would have been to make it easier and better for outside contributions to get into CDK, and they went the opposite way, and that's scary.

Now, they basically said, go build these on your own, go publish them on the Construct Hub, and if they're good, we'll incorporate them in. But they also didn't define what good was, and what makes a good API. API development is very difficult. How do you build a construct that's going to hit 80% of use cases and still give you an out for those other 20 you missed? That's fundamentally hard.

Corey: It is. And I don't know if there are good answers, yet. Maybe they're going in the right direction, maybe they're not.

Matthew: Time will tell. My hope is that I can try to do some videos here after the new year to try to maybe make this a better experience for people. What does good API design look like? What is it like to implement these things well so they can be incorporated in? There has been a lot of pushback already, just after the first couple of days, from some very vocal users within the CDK community saying, “This is bad. This is fundamentally bad stuff.”

Even big fanboys like myself, who have supported the CDK, who co-authored the CDK Book, have said, “This is not good.” So, we'll see what happens. Maybe they change direction after a couple of days. Maybe this turns out to be a great way to do it. Only time will really tell at this point.

Corey: Awesome. And where can people go to find out more as you continue your exploration in this space and find out what you're up to in general?

Matthew: So, I do have a Twitter account at @mattbonig on Twitter, however, I am probably going to be doing less and less over there.
Engagement and the community as a whole over there has been problematic for a while, and I'll probably be doing more on LinkedIn, so you can find me there. Just search for Matthew Bonig. It's a very unique name.

I've also got a website, matthewbonig.com, and from there, you can see blog articles, and a link to my Advanced CDK course, which I'm going to continue adding sessions to over the course of the next few months. I've got one coming out shortly about the deadly embrace and how you can work through that problem and hopefully not be so scared about multi-stack applications.

Corey: I look forward to that because Lord knows, I'm running into that one myself increasingly frequently.

Matthew: Well, good. I will hopefully be able to get this video out and solve all of your problems very easily.

Corey: Awesome. Thank you so much for taking the time to speak with me. I appreciate it.

Matthew: Thank you for having me. I really appreciate it.

Corey: Matthew Bonig, Chief Cloud Architect at Defiance Digital, AWS Dev Tools Hero, and oh so much more. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that you will then have to wind up building the implementation for the constructs that power that comment yourself because apparently we're not allowed to build them globally anymore.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

On part en prod
#18 - Simon Parisot, Blank - The strengths and weaknesses of 100% serverless

On part en prod

Play Episode Listen Later Aug 25, 2023 91:56


In this episode I talk with Simon Parisot, CEO and co-founder of the neobank Blank, which offers business accounts for freelancers. We discuss the serverless approach adopted from the very start on this banking platform, which was built in 9 months, and why one would choose to depend heavily on a provider like AWS. What are its advantages and drawbacks? - Speed, cost, and recruiting the best (developers able to do serverless development, scalability…) - But also performance and carbon emissions, - The major strengths and benefits, - And the difficulty of working in serverless.

Le Podcast AWS en Français
Amazon Cloudfront

Le Podcast AWS en Français

Play Episode Listen Later Jul 28, 2023 51:10


A conversation about why and how to use a CDN. We start gently and, as the conversation progresses, get into the details: caching strategies, cache keys, and using a CDN to protect against DDoS attacks or to reduce your costs. We then talk about Lambda@Edge and CloudFront Functions for running code at the edge of your infrastructure. Whether you are a beginner or a CDN expert, you will learn something from this episode.

Screaming in the Cloud
OpsLevel and The Need for a Developer Portal with Kenneth Rose

Screaming in the Cloud

Play Episode Listen Later Jun 15, 2023 36:34


Kenneth Rose, CTO at OpsLevel, joins Corey on Screaming in the Cloud to discuss how OpsLevel is helping developer teams to scale effectively. Kenneth reveals what a developer portal is, how he thinks about the functionality of a developer portal, and the problems a developer portal solves for large developer teams. Corey and Kenneth discuss how to drive adoption of a developer portal, and Kenneth explains why it's so necessary to have executive buy-in throughout that process. Kenneth also discusses how using their own portal internally along with seeking out customer feedback has allowed OpsLevel to make impactful innovations.
About Ken
Kenneth (Ken) Rose is the CTO and Co-Founder of OpsLevel. Ken has spent over 15 years scaling engineering teams as an early engineer at PagerDuty and Shopify. Having in-the-trenches experience has allowed Ken a unique perspective on how some of the best teams are built and scaled and lends this viewpoint to building products for OpsLevel, a service ownership platform built to turn chaos into consistency for engineering leaders.
Links Referenced:
OpsLevel: https://www.opslevel.com/
LinkedIn: https://www.linkedin.com/company/opslevel/
Twitter: https://twitter.com/OpsLevelHQ
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn, about, oh I don't know, two years ago and change, I wound up writing a blog post titled, “Developer Portals are An Anti Pattern,” and I haven't really spent a lot of time thinking about them since. 
This promoted guest episode is brought to us by our friends at OpsLevel, and they have sent their CTO and co-founder Ken Rose, presumably in an attempt to change my perspective on these things. Let's find out. Ken, thank you for agreeing to, well, run the gauntlet, for lack of a better term.
Ken: Hey, Corey. Thanks again for having me. And I've heard, you know, heard and listened to your show a bunch, and really excited to be here today.
Corey: Let's begin with defining our terms. I'm curious to know what a developer portal is. ‘What would you say a developer portal means to you?' Like it's a college entrance essay.
Ken: Right? Definitely. You know, so really, a developer portal is this consolidated place for developers to come to, especially in large organizations to be able to get their jobs done more easily, right? A large challenge that developers have in large organizations, there's just a lot to do and a lot to take care of. So, a developer portal is a place for developers to be able to better own, manage, and run the services they're responsible for that run in production, and they can do that through access, easy access to self-service tooling.
Corey: I guess, on some level, this turns into one of those alignment charts of, like, what is a database and, like, how prescriptive you want to be. It's like, well is a senior engineer a database because you can query them and they have information? Would you consider, for example, Kubernetes to be a developer platform, and/or would the AWS console?
Ken: Yeah, that's actually an interesting question, right? So, I think there's actually two—we're going to get really niggly here—there's developer platform and developer portal, right? And the word portal for me is something that sits above a developer platform. 
I don't know if you remember, like, the late-90s, early-2000s, like, portals were all the rage.
Like, Yahoo and AltaVistas were like search portals, they were trying to, at the time, consolidate all this information on a much smaller internet to make it easy to access. A developer portal is sort of the same thing, but custom-built for developers and trying to consolidate a lot of the tooling that exists. Now, in terms of the AWS console? Yeah, maybe. Like, it has a suite of tools and suite of offerings. It doesn't do a lot on the well, how do I quickly find out what's running in production and who is responsible for it? I don't know, unless AWS shipped, like, their, you know, three-hundredth new offering in the last week that I haven't, you know, kept on top of.
But you know, there's definitely some spectrum in terms of what goes into a developer portal. For me, there's kind of three main things you need. You do need some kind of a catalog, like, what's out there, who owns it; you need some kind of a way to measure, like, how good are those services, like, how well built are they; and then you need some access to self-service tooling. And that last part is where, like, the Kubernetes or AWS could be, you know, sort of a dev portal as well.
Corey: My experience with developer portals—there was a time when I loved it. RightScale was what I used—at some depth—back in I want to say 2010, 2011 because the EC2 console was clearly not built or designed by anyone who had not built EC2 themselves with their bare hands and sweat of their brow. And in time, the EC2 console got better where it wasn't written in hieroglyphics, as best we could tell, and it became ‘click button to launch instance.' And RightScale really didn't have a second act and they wound up getting acquired by our friends over at Flexera years later. 
And I haven't seen their developer portal in at least eight years as a direct result of this.So, the problem, at least when I was viewing it purely in the context of AWS services, it feels like you are competing against AWS iterating forward on developer experience, which they iterate slowly, sometimes, and unevenly across their breadth of services, but it does feel like at some level by building an internal portal, you are, first, trying to out-innovate AWS, in some ways, and two, you are inherently making the trade-off of not using recent features and enhancements that have not themselves been incorporated into the portal. That's where the, I guess the start, the genesis of my opposition to the developer portal approach comes from. Is that philosophy valid these days? Not as much. Because I can see an argument for it shifting.Ken: Yeah, I think it's slightly different. I think of a developer portal as again, it's something that sort of sits on top of AWS or Google Cloud or whatever cloud provider use, right? You give an example for example with RightScale and EC2. So, provisioning instances is one part of the activity you have to do as a developer. Now, in most modern organizations, you have, like, your product developers that ship features. They don't actually care about provisioning instance themselves. There are another group called the platform engineers or platform group that are responsible for building automation and tooling to help spin up instances and create CI/CD pipelines and get everything you need set up.And they might use AWS under the covers to do that, but the automation built on top and making that accessible to developers, that's really what a developer portal can provide. In addition, it also provides links to operational tooling that you need, technical documentation, it's everything you need as a developer to do your job, in one place. 
And though AWS bills itself as that, I think of them as more, they have a lot of platform offerings, right, they have a lot of infra-offerings, but they still haven't been able to, I think, customize that, unless you're an organization that builds—that has kind of gone all-in on AWS and doesn't build any of your own tooling, that's where a developer portal helps. It really helps by consolidating all that information in one place, by making that information discoverable for every developer so they have less… less cognitive load, right? We've asked developers to kind of do too much that we don't… we've asked to shift left and well, how do we make that information more accessible?
Regarding the point of, you know, AWS adds new features or new capabilities all the time and, like, well you have this dev portal, that's sort of your interface for how to get things done. Like, how will you use those? Dev portal doesn't stop you from doing that, right? So, my mental model is, if I'm a developer, and I want to spin up a new service, I can just press a button inside of my dev portal in my company and do that. And I have a service that is built according to the latest standards, it has a CI/CD pipeline, it already has a—you know, it's registered in PagerDuty, it's registered in Datadog, it has all the various bits.
And then there's something else that I want to do that isn't really on the golden path because maybe this is some new service or some experiment, nothing stops us from doing that. Like, you still can use all those tools from AWS, you know, kind of raw. And if those prove to be valuable for the rest of the organization, great. They can make their way into the dev portal; they can actually become a source of leverage. But if they're not, then they can also just sit there on the vine. 
Like, not everything that AWS ever produces will be used by every company.
Corey: Many years ago, I got a Cisco pair of certifications because recession was hitting and I needed to be better at networking. And taking those certifications, in those days before Cisco became the sad corporate dragon with no friends we all know today, they were highly germane and relevant. But I distinctly remember, even now, 15 years later, that there was this entire philosophy of pretend that the entire world is Cisco only, which in networking is absolutely never true. It feels like a lot of the AWS designs and patterns tend to assume, oh yeah, you're going to use AWS services for everything. I have never yet found that to be true, other than when I'm just trying to be obstinate.
And hell is interoperability between a bunch of different things. Yes, I may want to spin up an EC2 instance and an AWS load balancer and some S3 storage or whatnot, but I'm also going to want to monitor it with PagerDuty, I'm going to want to have a CDN that isn't CloudFront because most CDNs these days don't hate you in quite the same economic ways and are simpler to work with, et cetera, et cetera, et cetera. So, there's definitely a story wherein I've found that there's an—the interoperability of tying these things together is helpful. How do you avoid falling down the trap of oh, everyone should be multi-cloud, single pane of glass et cetera, et cetera? In practice that always seems to turn to custard.
Ken: Yeah, I think multi-cloud and single pane of glass are actually two different things. So multi-cloud, like, I agree with you to some sense. Like, pick a cloud and go with it, like, unless you have really good business reasons to go for multi-cloud. 
And sometimes you do, like, years ago, I worked at PagerDuty, they were multi-cloud for a reliability reason, that hey, if one cloud provider goes down, you don't want [crosstalk 00:08:40]—
Corey: They were an example I used all the time for that story—
Ken: Right.
Corey: —specifically the thing that woke you up was homed in a bunch of different places, whereas the marketing site, the onboarding flow, the periphery stuff around it was not because it didn't need to be.
Ken: Exactly.
Corey: Like, the core business need of wake you up was very much multi-cloud because once upon a time, it wasn't and it went down with the rest of us-east-1 and people weren't woken up to be told their site was on fire.
Ken: A hundred percent. And on the kind of like application side where, even then, pick a cloud and go with it, unless there's a really compelling business reason for your business to go multi-cloud. Maybe there's something credits or compliance or availability, right? There might be reasons, but you have to be articulate about whether they're right for you.
Now, single pane of glass, I think that's different, right? I do think that's something that, ultimately, is a net boon for developers. In any large organization, there is a myriad of internal tools that have been built. And it's like, well, how do I provision a new topic in the Kafka cluster? How do I actually get access to the AWS console? How do I spin up a new service, right? How do I kind of do these things?
And if I'm a developer, I just want to ship features. Like, that's what I'm incented to do, that's what I'm optimizing for. And all this other stuff I have to do as part of my job, but I don't want to have to become, like, a Kubernetes guru to be able to do it, right? So, what a developer portal is trying to do is be that single pane of glass, bringing all these common set of tools and responsibilities that you have as a developer in one place. 
They're easy to search for, they're easy to find, they're easy to query, they're easy to use.
Corey: I should probably have asked this earlier on, but let's disambiguate for a little bit here. Because when I'm setting up to use a new service or product and kick the tires on it, no two explorations really look the same. Whereas at most responsible mature companies that are building products that are—services that are going to production use, they've standardized around a number of different approaches. What does your target customer look like? Is there a certain point of scale, a certain level of complexity, a certain maturity of process?
Ken: Absolutely. So, a tool like OpsLevel or a developer portal really only makes sense when you hit some critical mass in terms of the number of services you have running in production, or the number of developers that you have. So, when you hit 20, 30, 50 developers or 20, 30, 50 services, an important part of a developer portal is this catalog of what's out there. Once you kind of hit the Dunbar number of services, like, when you have more than you can keep in your head, that's when you start to need tooling like this. If you look at our customer base, they're all you know, kind of medium to large-sized companies. If you're a startup with, like, ten people, OpsLevel is probably not right for you. We use OpsLevel internally at OpsLevel, and you know, like, we're still a small company. It's like, we make it work for us because we know how to get the most out of it, but like, it's not the perfect fit because it's not really meant for, you know, smaller companies.
Corey: Oh, I hear you. I think I'm probably… I have a better AWS bill analytic system running internally here at The Duckbill Group than some banks do. So, I hear you on that front.
Ken: I believe it.
Corey: But also implies to me that there's no OpsLevel prospect or customer deployment that has ever been greenfield. 
It's always you're building existing things, there's already infrastructure in place, vendors have been selected across the board. You aren't going to be starting a company on day one and going, “All right, time to spin up our AWS account, and we're also going to wind up signing up for OpsLevel,” from the sound of it.
Ken: Correct—
Corey: Accurate? Inaccurate?
Ken: I think that's actually accurate. Like, a lot of the problems we solve are problems that come as you start to scale both your product and your engineering team. And it's the problems of complexity.
Corey: What do those painful problems look like? In other words, what is someone sitting at home right now listening to this, or driving to work debating whether they want to ram a bridge abutment or go into the office depending on their mental state today, what painful problem do they have that OpsLevel is designed to fix?
Ken: Yeah, for sure. So, let's help people self-select. So, here's my mental model for any [unintelligible 00:12:25]. There are product developers, platform developers, and engineering leaders. Product developers, if you're asking questions like, “I just got paged for the service. I don't know what this does.” Or, “It's upstream from here. Where do I find the technical documentation?” Or, “I think I have to do something with the payment service. Where do I find the API for that?”
You know, when you get to that scale, a developer portal can help you. If you're a platform engineer and you have questions like, “Okay, we got to migrate. We're migrating, I don't know, from Datadog to Honeycomb, right? We got to get these fifty or a hundred or thousands of services and all these different owners to, like, switch to some new tool.” Or, “Hey, we've done all this work to ship the golden path. 
Like, how do we actually measure the adoption of all this work that we're doing and if it's actually valuable?” Right?
Like, we want everybody to be on a certain set of CI tooling or a certain minimum version of some library or framework. How do we do that? How do we measure that? OpsLevel is for you, right? We have a whole bunch of stuff around maturity.
And if you're an engineering leader, ultimately, questions you care about, like, “How fast are my developers working? I have this massive team, we've made this massive investment in hiring all these humans to write software and bring value for our customers. How can we be more efficient as a business in terms of that value delivery?” And that's where OpsLevel can help as well.
Corey: Guardrails, whether they be economic, regulatory, or otherwise, have to make it easier than doing things incorrectly because one of the miracle aspects of cloud also turns into a bit of a problem, which is shadow IT is only ever a corporate credit card away. Make it too difficult to comply with corporate policies and people won't. And they're good actors; they're trying to get work done. They're not trying to make people's lives harder, but they don't want to spend six weeks provisioning an EC2 cluster. So, there's always that weird trade-off.
Now, it feels—and please correct me if I'm wrong—once someone has rolled out OpsLevel at their organization, where it really shines is spinning up a new service where okay, great, you're going to spin up the automatic observability portion of it, you're going to spin up the underlying infrastructure in certain ways that comply with our policies, it's going to build the CI/CD pipelines around it, you're going to wind up having the various cost instrumentation rolled out to it. But for services that are already extant within the environment, is there an OpsLevel story for them?
Ken: Oh, absolutely. 
So, I look at it as, like, the first problem OpsLevel helps solve is the catalog and what's out there and who owns it. So, not even getting developers to spin up new services that are kind of on the golden path, but just understanding the taxonomy of what are the services we have? How do those services compose into higher-level things like systems or domains? What's the whole set of infrastructure we have?
Like, I have 50 AWS accounts, maybe a handful of GCP ones, also, some Azure. I have all this infrastructure that, like, how do I start to get a handle on, like, what's out there in prod and who's responsible for it. And that helps you get in front of compliance risks, security risks. That's really the starting point for OpsLevel building that catalog. And we have a bunch of integrations that kind of slurp all this data to automatically assemble that catalog, or YAML as well if that's your thing. But that's the starting point is building that catalog and figuring out this assignment of, like, okay, this service and this human, or this—sorry—team, like, they're paired together.
Corey: A number of offerings in this space, which honestly, my exposure to it is bounded simultaneously to things that are ten years old and no one uses anymore, or a bunch of things I found on GitHub. And the challenge that both of those products tend to have is that they assume certain things to be true about a given environment: that they're using Terraform to manage everything, or they're always going to be using CloudFormation, or everyone there knows Python or something else like that. What are the prerequisites to get started with OpsLevel?
Ken: Yeah, so we worked pretty hard to build just a ton of integrations. I would say integrations are just a continuing thing we have going on in the background. Like, when we started, like, we only supported GitHub. Now, we support all the gits, you know, like GitHub, GitLab, Bitbucket, Azure DevOps, like, we're building [unintelligible 00:16:19]. 
There's just a whole, like, long tail of integrations.
The same with APM tooling. The same with vulnerability management tooling, right? And the reason we do that is because there's just this huge vendor footprint, and people, you know, want OpsLevel to work for them. Now, the other thing we try to do is we also build APIs. So, anything we have as, like, a core integration, we also have kind of like an underlying API for, so that there's, no matter what, you have an escape hatch. If like, you're using some tool that we don't support or you have some homegrown thing, there's always a way to try to be able to integrate that into OpsLevel.
Corey: When people think about developer portals, the most common one that pops to mind is Backstage, which Spotify wound up building, internally, championing, open-sourcing, and I believe, on some level, turned into a product because if there's one thing people want, it's to have their podcast music company become a SaaS vendor, which is weird to me. But the criticisms that I've seen about and across the board have rung relatively true, including from people internal at Spotify who have used the thing, which is, well first is underestimating the amount of effort that is necessary to maintain Backstage itself, that the build versus buy discussion is always harder to bu—engineers love to build, but they shouldn't be building things outside of their core competency half the time, and the other is driving adoption within the org where you can have the most amazing developer portal in the known universe, but if people don't use it, it may as well not exist and doing the carrot and stick approach often doesn't work. I think you have a pretty good answer that I need not even ask you to elaborate on, “Well, how do we avoid having to maintain this ourselves,” since you have a company that does this, but how do you find companies are driving adoption successfully once they have deployed OpsLevel?
Ken: Yeah, that's a great question. 
So, absolutely. Like, I think the biggest thing you need first, is kind of cultural buy-in and that this is a tool that we want to invest in, right? I think one of the reasons Spotify was successful with Backstage and I think it was System Z before that was that they had this kind of flywheel of, like, they saw that their developers were getting, you know, better, faster, working happier, by using this type of tooling, by reducing the cognitive load. The way that we approach it is sort of similar, right?
We want to make sure that there is executive buy-in that, like, everybody agrees this is, like, a problem that's worth solving. The first step we do is trying to build out that catalog again and helping assign ownership. And that helps people understand, like, hey, these are the services I'm responsible for. Oh, look, and now here's this other context that I didn't have before. And then helping organizations, you know, what—it depends on the problem we're trying to solve, but whether it's rolling out self-serve automation to help developers, like, reduce what was before a ton of cognitive load or if it's helping platform teams define what good looks like so they can start to level up the overall health of what's running in production, we kind of work on different problems, but it's picking one problem and then you know, kind of working with the customers and driving it forward.
Corey: On some level, I think that this is going to be looked down upon inherently just by automatic reflex of folks with infrastructure engineering backgrounds. It's taken me some time to learn to overcome my own negative reaction to it. Because it's, I'm here to build things and I want to build things out in such a way that it's portable and reusable without having to be tied to a particular vendor and move on. And it took me a long time to realize that what that instinct was whispering in my ear was in fact, no, you should be your own cloud provider. 
If that's really what I want to do, I probably should just brush up on, you know, computer science trivia from 20 years ago and then go see if I can pass Google's SRE interview.
I'm not here to build the things that just provision infrastructure from scratch at every company I wind up landing at. It feels like there's more important, impactful work that I can do. And let's be clear, people are never going to follow guardrails themselves when they have to do a bunch of manual steps. It has to be something that is done for them. And I don't know how you necessarily get there without having some form of blueprint or something like that, provided for them with something that is self-service because otherwise, it's not going to work.
Ken: I a hundred percent agree, by the way, Corey. Like, the take that, like, automation is the only way to drive a lot of this forward is true, right? If for every single thing you're trying—like, we have a concept called a rubric and it's basically how you measure the service health. And you can—it's very customizable, you have different dimensions. But if, for any check that's on your rubric, it requires manual effort from all your developers, that is going to be harder than something you can just automate away.
So, vulnerability management is a great example. If you tell developers, “Hey, you have to go upgrade this library,” okay, some percentage [unintelligible 00:20:47], if you give developers, “Here's a pull request that's already been done and has a test passing and now you just need to merge it,” you're going to have a much better adoption rate with that. 
Similarly with, like, applying templates being able to [up-level 00:20:57], you know, kind of apply the latest version of a template to an existing service, those types of capabilities, anything where you can automate what the fixes are, absolutely you're going to get better adoption.
Corey: As you take a look at your existing reference customers—which is something I always look for on vendor websites because, like, oh, we have many customers who will absolutely not admit to being customers, it's like, that sounds like something that's easy to say—you have actual names tied to these things. Not just companies, but also individuals. If you were to sit down and ask your existing customer base, “So, why did you wind up implementing OpsLevel and what has the value that's delivered to you been since that implementation?” What do they say?
Ken: Definitely. I actually had to check our website because we, you know, land new customers and put new logos on it. I was like, “Oh, I wonder what the current set is out right now?”
Corey: I have the exact same challenge. Like oh, we have some mutual customers. And it's okay. I don't know if I can mention them by name because I haven't checked our own list of testimonials [unintelligible 00:21:51] lately because say the wrong thing and that's how you wind up being sued and not having a company anymore.
Ken: Yeah. So, I don't—I definitely, you know, want to stay [on side 00:22:00] on that part, but in terms of, like, kind of sample reference customer, a lot of the folks that we initially worked with are the platform teams, right? They're the teams that care about what's out there, and they need to know who's responsible for it because they're trying to drive some kind of cross-cutting change across the entire, you know, production footprint. And so, the first thing that generally people will say is—and I love this quote. 
This came—I won't name them, but like, it's in one of our case studies.
It was like, “I had, like, 50 different attempts at making a spreadsheet and they're all, like, in the graveyard, like, to be able to capture what's out there and who's responsible for it.” And just OpsLevel helping automate that has been one of the biggest values that they've gotten. The second point, then is now be able to drive maturity and be able to measure how well those services are being built. And again, it's sort of this interesting thing where we start with the platform teams. And then sometime later security teams find out about OpsLevel, and they're like, “Oh, this is a tool I can use to, like, get developers to do stuff? Like, I've been trying to get developers to do stuff for the longest time.”
And they—I file Jira tickets and they just sit there and nothing gets done. But when it becomes part of this, like, overall health score that you're trying to increase across the board, yeah, it's just a way to kind of drive action.
Corey: I think that there's a dichotomy of companies that emerge. And I tend to see the world through a lens of AWS bills, so let's go down that path. I feel like there are some companies presumably like OpsLevel, whereas if I—assuming you're running on top of AWS—if I were to pull your AWS bill, I would see upwards of 80% of your spend is going to be on this application called OpsLevel, the service that you provide to people. As opposed to the other side of the world, which is large enterprises, where they're spending hundreds of millions of dollars a year, but the largest application they have is a million-and-a-half a year in spend because just, they have thousands of these things scattered everywhere. That latter case is where I tend to see more platform teams, where I start to see a lot of managing a whole bunch of relatively small workloads. 
And developer platforms really seem to be where a lot of solutions lead, whereas 80% of our workload is one application, we don't feel the need for that as much. Is that accurate? Am I misunderstanding some aspect of it?
Ken: No, a hundred percent you've hit the nail on the head. Like, okay, think about the typical, like, microservices adoption journey. Like, you started with, you know, some small company—like us—you started with a monolith. Ah, maybe you built out a second app—
Corey: Then you read on Hacker News and realize, “Oh, if we want to hire people, we've got to be doing what all the cool kids are up to.”
Ken: Right. We've got to microservice all the things—but that's actually you know, microservices should come later, right, as a response to you needing to scale your org and scale your—
Corey: As someone who started building some application with microservices, I could not agree more.
Ken: A hundred percent. So, it's as you're starting to take steps to having just more moving parts in your production infrastructure, right? If you have one moving part, unless it's like a really large moving part that you can internally break down, like, kind of this majestic monolith where you do have kind of like individual domains that are owned by different teams, but really the problem we're trying to solve, it's more about, like, who owns what. Now, if that's a single atomic unit, great, but can you decompose that? But if you just have, like, one small application, kind of like the whole team is owning everything, again, a developer portal is probably not the right tool for you. It really is a tool that you need as you start to scale your engineering org and as you start to scale the number of moving parts in your production infrastructure.
Corey: I tended to use to think of that in terms of boring companies versus innovative ones and I don't think that's accurate. I think it is the question of maturity and where companies lead to. 
On some level, if OpsLevel starts growing and becomes larger and larger in different ways and starts doing acquisitions and launching into other areas, at some point, you don't have just one product offering, you have a multitude of them. At which point having something like that is going to be critical. But I have to ask, given that you are sort of not exactly your target customer profile, what have the sharp edges been on using it for your use case?

Ken: Yeah. So, we actually have an internal Slack channel, we call OpsLevel on OpsLevel. And finding those sharp edges actually has been really useful for us. You know, all the good stuff, dogfooding and it makes your own product better. Okay, so we have our main app, we also do have a bunch of smaller things and it's like, oh yeah, you know, we have, like, I don't know, various Hackaday things that go on, it's important we kind of wind those down for, you know, compliance, we have our marketing site, we have, like, our Terraform. Like, so there's, like, stuff. It's not, like, hundreds or thousands of things, but there's more than just the main app. The second, though, is it's really on the maturity piece that we really try to get a lot of value out of our own product, right? Helping—we have our own platform team. They're also trying to drive certain initiatives with our product developers. There is that usual tension of our, like, our own product developers are like, “I want to ship features. What's this security thing I have to go take care of right now?” But OpsLevel itself, like, helps reflect that. We had an operational review today and it was like, “Oh, this one service is actually now”—we have platinum as a level. It's in gold instead of platinum. It's like, “Why?” “Oh, there's this thing that came up. We got to go fix that.” “Great. 
Let's go actually go fix that so we're back into platinum.”

Corey: Do you find that there's often a choice you have to make internally, where you could make the product more effective for your specific use case, but that also diverges from where your typical customer needs or wants the product to go?

Ken: No, I think a lot of the things we find for our use case are, like, they're more small paper cuts, right? They're just as we're using it, it's like, “Hey, like, as I'm using this, I want to see the report for this particular check. Why do I have to click six times to get?” You know, like, “Wouldn't it be great if we had a button?” Right? And so, it's those type of, like, small innovations that kind of come up. And those ultimately lead to, you know, a better product for our customers. We also work really closely with our customers and developers are not shy about telling you what they don't like about your product. And I say this with love, like, a lot of our customers give us phenomenal feedback just on how our product can be better and we try to internalize that and you know, roll that feedback into the product.

Corey: You have a number of integrations of different SaaS providers, infrastructure providers, et cetera, that you wind up working with. I imagine that given your scale and scope and whatnot, those offerings are dictated by what customers say, “Hey, we're using this thing. Are you going to support that or are you not going to maintain our business?” Which is a great way to wind up financing a lot of product development and figuring out what matters to people. My question for you is, if you look across the totality of your user base, what are the most popularly used integrations, if you can say?

Ken: Yeah, for sure. I think right now—I could actually dive in to pull the numbers—GitHub and GitLab—or… I think GitHub, like, has slightly more adoption across our customer base. At least with our customers, almost nobody uses Bitbucket. 
I mean, we have, like, a small number, but, like, it's… I think, single-digit percentage. A lot of people use PagerDuty, which you know, hey, I'm an ex-PagerDuty person [crosstalk 00:28:24] and I'm glad to see that.

Corey: I have a free tier PagerDuty account that will automatically page me for my home automation stuff. Specifically, if you know, the fire alarm goes off. Like, yeah, okay, there are certain things I want to be woken up for, but it's a very short list.

Ken: Yeah, it's funny, the running default message when we use a test PagerDuty was, “The server is on fire.” [unintelligible 00:28:44] be like, “The house is on fire.” Like you know, go get that taken care of. There's one other tool that's used a lot. Datadog actually is used a ton, just across our entire customer base, despite its… we're also Data—we're a Datadog partner, we're a Datadog customer, you know? It's not cheap, but it's a good product for, you know, monitoring and logs and there are [crosstalk 00:29:01]—

Corey: No, other than cloud infrastructure providers, the number one most common source of inquiries I get is Datadog optimization. It has now risen to a board-level concern in many cases because observability is expensive. That's a sign of success, on some level. Meanwhile, I'm sitting here, like, Date-a-dog? Oh, my God, that's disgusting. It's like Tinder for Pets. Which it turns out is not at all what they do.

Ken: Nice.

Corey: Yeah. [audio break 00:29:23]—optimizing their Slack integrations, their GitHub integration, et cetera. Or are they starting with the spinning up the servers piece of it?

Ken: A lot of the time—and again, that first problem they're trying to solve is just get me a handle on everything we have running in production. You know, if you have multiple AWS accounts, multiple Kubernetes clusters, dozens or even hundreds of teams, God help you if you're going to try to, like, build a list manually to consolidate all that information. 
That's really the first part is, like, integrate Kubernetes, integrate your CI/CD pipelines, integrate Git, integrate your Cloud account, like, we'll integrate with everything and we'll try to build that map of, like, here's everything that's out there, and start to try to assign it to, like, and here's people that we think might be responsible in terms of owning the software. That's generally the starting point.

Corey: Which makes an awesome amount of sense. I think going at it from the infrastructure first perspective is where I've seen most developer platforms founder. And to be fair, the job is easier now than it was years ago because it used to be that you were being out-innovated by AWS constantly. Innovation has slowed down there. And you know that because of how much they say the pace of innovation has only sped up. And whenever AWS says something in a marketing context, they're insecure about it. I've learned this through the fullness of time observing that company. And these days, most customers do not use the majority of features available for any given service. They have solidified to a point where you can responsibly build on top of these things. Now, it seems that the problem is all the ‘yes, and' stuff that gets built on top of it.

Ken: Yeah. Do you have an example, actually, like, one of the kinds of, like, ‘yes, and' tools that you're thinking about?

Corey: Oh, absolutely. We have a bunch of AWS environment stuff so we should configure CloudWatch to look at all these things from an observability perspective. No, you should not. You should set up Datadog. And the first time someone does that by hand, they enable all of the observability and the rest and suddenly get charged approximately the GDP of Guam. And okay, maybe we shouldn't do that because then you have the downstream impact of that on your CloudWatch bill. So okay, how do we optimize this for the observability piece directly tied to that? 
How do we make sure that we get woken up when the site is down or preferably before that, but not every time, basically, an EBS volume starts to get a little bit toasty? You have to start dialing this stuff in. And once you've found a lot of those aspects, being able to templatize that and roll that out on an ongoing basis and having the integrations all work together feels like it's the right problem to be solving.

Ken: Yeah, absolutely. And the group that I think is responsible for that kind of—because it's a set of problems you described—is really, like, platform teams. Sometimes service owners for like, how should we get paged, but really, what you're describing are these kind of cross-cutting engineering concerns that platform teams are uniquely poised to help solve in an [unintelligible 00:32:03] organization, right? I was thinking what you said earlier. Like, nobody just wants to rebuild the same info over and over, but it's sort of like, it's not just building an [unintelligible 00:32:09]; it's kind of like solving this, like, how do we ship? Can we actually run stuff in prod? And not just run it but get observability and ensure that we're woken up for it and, like, what's that total end-to-end look like from, like, developers writing code to running software in production that's serving traffic? And solving all the problems [unintelligible 00:32:24], that's what I think of as platform engineering.

Corey: So, my last question before we wind up wrapping this episode comes down to, I am very adept at two different programming languages, and those are brute force and enthusiasm. What implementation language is most of what you find yourself working with? And why is it invariably going to be YAML?

Ken: Yeah, that's a great question. So, I think there's, in terms of implementing OpsLevel and implementing a service catalog, we support YAML. 
Like, you know, there's this very common workflow, you just drop a YAML spec, basically, in your repo, if you're a service owner. And that, we can support that. I don't think that's a great take, though. Like, we have other integrations. Again, if the problem you're trying to solve is I want to build a catalog of everything that's out there, asking each of your developers hey, can you please all write YAML files that, like, describe the services you own and drop them into this repo? You've inverted this, like, database that essentially you're trying to build, like, what's out there and stored it in Git, potentially across several hundreds or thousands of repos. You put a lot of toil now on individual product developers to go write and maintain these files. And if you ever had to, like, make a blanket update to these files, there's no atomic way to kind of do that, right? So, I look at YAML as, like, I get it, you know? Like, we use YAML for all the things in DevOps, so why not their service catalog as well, but I think it's toil. Like, there are easier ways to build a catalog. By, kind of, just integrating. Like, hook up AWS, hook up GitHub, hook up Kubernetes, hook up your CI/CD pipeline, hook up all these different sources that have information about what's running in prod, and let the software, let the tool, automatically infer what's actually running as opposed to requiring humans to manually enter data.

Corey: I find that there are remarkably few technical holy wars that I cannot unify both sides on by nominating something far worse. Like, the VI versus Emacs stuff, the tabs versus spaces, and of course, the JSON versus YAML folks. My JSON versus YAML answer is XML: God's language. I find that as soon as you suggest that, people care a hell of a lot less about the differences between JSON and YAML because their job is to now kill the apostate, which is me.

Ken: Right. Yeah. I remember XML, like, oh, man, 2002. SOAP. I remember SOAP as a protocol. 
That was a thing.

Corey: Some of the earliest S3 API calls were done in SOAP, and I think they finally just used it to wash their mouths out when all was said and done.

Ken: Nice. Yeah.

Corey: I really want to thank you for taking the time to do your level best to attempt to convert me, and I would argue in many respects, you have succeeded. I'm thinking about this differently than I did half an hour ago. If people want to learn more, where's the best place for them to find you?

Ken: Absolutely. So, you can always check out our website, opslevel.com. We're also fairly active on LinkedIn. If Twitter hasn't imploded by the time this episode becomes launched, then they can also check us out at twitter.com/OpsLevelHQ. We're always posting, just different content on, like, how to be successful with service maturity, DevOps, developer productivity, so that you know, ultimately, that you can ship out to customers faster.

Corey: And we will, of course, put links to that in the [show notes 00:35:23]. Thank you so much for taking the time, not just to speak with me, but also for sponsoring this episode. It is appreciated.

Ken: Cheers.

Corey: Ken Rose, CTO and co-founder at OpsLevel. I'm Cloud Economist Corey Quinn and this has been a promoted guest episode of Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment which, upon further reflection, you could have posted to all of the podcast platforms if only you had the right developer platform to pull it off.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. 
Visit duckbillgroup.com to get started.

Le Podcast AWS en Français

In this episode we discuss the architecture behind the digital sales channels at Zadig&Voltaire, a brand well known to fans of fashion and "effortless luxury". We cover their migration from a Kube cluster to AWS and their experience with autoscaling under load peaks of up to 20x normal. We also cover their use of Graviton and EC2 Spot instances. Finally, we touch on their use of CloudFront and Lambda and their move toward serverless.

Screaming in the Cloud
Centralizing Cloud Security Breach Information with Chris Farris

Jun 8, 2023 35:06


Chris Farris, Cloud Security Nerd at PrimeHarbor Technologies, LLC, joins Corey on Screaming in the Cloud to discuss his new project, breaches.cloud, and why he feels having a centralized location for cloud security breach information is so important. Corey and Chris also discuss what it means to dive into entrepreneurship, including both the benefits of not having to work within a corporate structure and the challenges that come with running your own business. Chris also reveals what led him to start breaches.cloud, and what he's learned about some of the biggest cloud security breaches so far.

About Chris

Chris Farris is a highly experienced IT professional with a career spanning over 25 years. During this time, he has focused on various areas, including Linux, networking, and security. For the past eight years, he has been deeply involved in public-cloud and public-cloud security in media and entertainment, leveraging his expertise to build and evolve multiple cloud security programs.

Chris is passionate about enabling the broader security team's objectives of secure design, incident response, and vulnerability management. He has developed cloud security standards and baselines to provide risk-based guidance to development and operations teams. As a practitioner, he has architected and implemented numerous serverless and traditional cloud applications, focusing on deployment, security, operations, and financial modeling.

He is one of the organizers of the fwd:cloudsec conference and has presented at various AWS conferences and BSides events. 
Chris shares his insights on security and technology on social media platforms like Twitter, Mastodon and his website https://www.chrisfarris.com.

Links Referenced:
fwd:cloudsec: https://fwdcloudsec.org/
breaches.cloud: https://breaches.cloud
Twitter: https://twitter.com/jcfarris
Company Site: https://www.primeharbor.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. My returning guest today is Chris Farris, now at PrimeHarbor, which is his own consultancy. Chris, welcome back. Last time we spoke, you were at Turbot, and now you've decided to go independent because you don't like sleep anymore.

Chris: Yeah, I don't like sleep.

Corey: [laugh]. It's one of those things where when I went independent, at least in my case, everyone thought that it was, oh, I have this grand vision of what the world could be and how I could look at these things, and that's going to just be great and awesome and everyone's going to just be a better world for it. In my case, it was, no, just there was quite literally nothing else for me to do that didn't feel like an exact reframing of what I'd already been doing for years. I'm a terrible employee and setting out on my own was important. It was the only way I found that I could wind up getting to a place of not worrying about getting fired all the time because that was my particular skill set. And I look back at it now, almost seven years in, and it's one of those things where if I had known then what I know now, I never would have started.

Chris: Well, that was encouraging. Thank you [laugh].

Corey: Oh, of course. 
And in sincerity, it's not one of those things where there's any one thing that stops you, but it's the, a lot of people get into the independent consulting dance because they want to do a thing and they're very good at that thing and they love that thing. The problem is, when you're independent, and at least starting out, I was spending over 70% of my time on things that were not billable, which included things like go and find new clients, go and talk to existing clients, the freaking accounting. One of the first hires I made was a fractional CFO, which changed my life. Up until that, my business partner and I were more or less dead reckoning of looking at the bank account and how much money is in there to determine if we could afford things. That's a very unsophisticated way of navigating. It's like driving by braille.

Chris: Yeah, I think I went into it mostly as a way to define my professional identity outside of my W-2 employer. I had built cloud security programs for two major media companies and felt like that was my identity: I was the cloud security person for these companies. And so, I was like, ehh, why don't I just define myself as myself, rather than define myself as being part of a company that, in the media space, they are getting overwhelmed by change, and job security, job satisfaction, wasn't really something that I could count on.

Corey: One of the weird things that I found—it's counterintuitive—is that when you're independent, you have gotten to a point where you have hit a point of sustainability, where you're not doing the oh, I'm just going to go work for 40 billable hours a week for a client. It's just like being an employee without a bunch of protections and extra steps. That doesn't work super well. 
But now, at the point where I'm at where the largest client we have is a single-digit percentage of revenue, I can't get fired anymore, without having a whole bunch of people suddenly turn on me because I've done something monstrous, in which case, I probably deserve not to have business anymore, or there's something systemic in the macro environment, which given that I do the media side and I do the cost-cutting side, I work on the way up, I work on the way down, I'm questioning what that looks like in a scenario that doesn't involve me hunting for food. But it's counterintuitive to people who have been employees their whole life, like I was, where, oh, it's risky and dangerous to go out on your own.

Chris: It's risky and dangerous to be, you know, tied to a single, yeah, W-2 paycheck. So.

Corey: Yeah. The question I'd like to ask is, how many people need to be really pissed off before you have one of those conversations with HR that doesn't involve giving you a cup of coffee? That's the tell: when you don't get coffee, it's a bad conversation.

Chris: Actually, that you haven't seen [unintelligible 00:04:25] coffee these days. You don't want the cup of coffee, you know. That's—

Corey: Even when they don't give you the crappy percolator navy coffee, like, midnight hobo diner style, it's still going to be a bad meeting because [unintelligible 00:04:37] pretend the coffee's palatable.

Chris: Perhaps, yes. I like not having to deal with my own HR department. And I do agree that yeah, getting out of the W-2 space allows me to work on side projects that interest me or, you know, volunteer to do things like continuing the fwd:cloudsec, developing breaches.cloud, et cetera.

Corey: I'll never forget, one of my last jobs I had a boss who walked past and saw me looking at Reddit and asked me if that was really the best use of my time. 
At first—it was in, I think, the sysadmin forum at the time, so yes, it was very much the best use of my time for the problem I was focusing on, but also, even if it wasn't, I spent an inordinate amount of time on social media, just telling stories and building audiences, on some level. That's the weird thing is that what counts as work versus what doesn't count as work gets very squishy when you're doing your own marketing.

Chris: True. And even when I was a W-2 employee, I spent a lot of time on Twitter because Twitter was an intel source for us. It was like, “Hey, who's talking about the latest cloud security misconfigurations? Who's talking about the latest data breach? What is Mandiant tweeting about?” It was, you know—I consider it part of my job to be on Twitter and watching things.

Corey: Oh, people ask me that. “So, you're on Twitter an awful lot. Don't you have a newsletter to write?” Like, yeah, where do you think that content comes from, buddy?

Chris: Exactly. Twitter and Mastodon. And Reddit now.

Corey: There's a whole argument to be had about where to find various things. For me at least, because I'm only security adjacent, I was always trying to report the news that other people had, not make the news myself.

Chris: You don't want to be the one making the news in security.

Corey: Speaking of, I'd like to talk a bit about what you just alluded to: breaches.cloud. I don't think I've seen that come across my desk yet, which tells me that it has not been making a big splash just yet.

Chris: I haven't been really announcing it; it got published the other night and so basically, yeah, this is sort of an inaugural marketing push for breaches.cloud. So, what we're looking to do is document all the public cloud security breaches, what happened, why, and more importantly, what the companies did or didn't do that led to the security incident or the security breach.

Corey: How are you slicing the difference between broad versus deep? 
And what I mean by that is, there are some companies where there are indictments and massive deep dives into everything that happens with timelines and blows-by-blows, and other times you wind up with the email that shows up one day of, “Security is very important to us. Now, listen to how we completely dropped the ball on it.” And it just makes the biggest description that they can get away with of what happened. Occasionally, you find out oh, it was an open S3 bucket, or they'll allude to something that sounds like it. Does that count for inclusion? Does it not? How do you make those editorial decisions?

Chris: So, we haven't yet built a page around just all of the recipients of the Bucket Negligence Award. We're looking at the specific ones where there's been something that's happened that's usually involving IAM credentials—oftentimes involving IAM credentials found in GitHub—and what led to that. So, in a lot of cases, if there's a detailed company postmortem that they send their customers that said, “Hey, we goofed up, but complete transparency—” and then they hit all the bullet points of how they goofed up. Or in the case of certain others, like Uber, “Hey, we have court transcripts that we can go to,” or, “We have federal indictments,” or, “We have court transcripts, and federal indictments and FTC civil actions.” And so, we go through those trying to suss out what the company did or did not do that led to the breach. And really, the goal here is to be able to articulate as security practitioners, hey, don't attach S3 full access to this role on EC2. 
That's what got Capital One in trouble.

Corey: I have a lot of sympathy for the Capital One breach and I wish they would talk about it more than they do, for obvious reasons, just because it was not, someone showed up and made a very obvious dumb decision, like, “Oh, that was what that giant red screaming thing in the S3 console means.” It was a series of small misconfigurations that led to another one, to another one, to another one, and eventually gets to a point where a sophisticated attacker was able to chain them all together. And yes, it's bad, yes, they're a bank and the rest, but I look at that and it's—that's the sort of exploit that you look at and it's okay, I see it. I absolutely see it. Someone was very clever, and a bunch of small things that didn't rise to the obvious. But they got dragged and castigated as if they basically had a four-character password that they'd left on the back of the laptop on a Post-It note in an airport lounge when their CEO was traveling. Which is not the case.

Chris: Or all of the highlighting the fact that Paige Thompson was a former Amazon employee, making it seem like it was her insider abilities that led to the incident, rather than she just knew that, hey, there's a metadata service and it gives me creds if I ask it.

Corey: Right. That drove me nuts. There was no maleficence as an employee. And to be very direct, from what I understand of internal AWS controls, had there been, it would have been audited, flagged, caught, interdicted. I have talked to enough Amazonians that either a lot of them are lying to me very consistently despite not knowing each other, or they're being honest when they say that you can't get access to customer data using secret inside hacks.

Chris: Yeah. I have reasonably good faith in AWS and their ability to not touch customer data in most scenarios. 
And I've had cases that I'm not allowed to talk about where Amazon has gone and accessed customer data, and the amount of rigmarole and questions and drilling that I got as a customer to have them do that was pretty intense and somewhat, actually, annoying.

Corey: Oh, absolutely. And, on some level, it gets frustrating when it's a, look, this is a test account. I have nothing of sensitive value in here. I want the thing that isn't working to start working. Can I just give you a whole, like, admin-powered user account and we can move on past all of this? And their answer is always absolutely not.

Chris: Yes. Or, “Hey, can you put this in our bucket?” “No, we can't even write to a public bucket or a bucket that, you know, they can share too.” So.

Corey: An Amazonian had to mail me a hard drive because they could not send anything out of S3 to me.

Chris: There you go.

Corey: So, then I wound up uploading it back to S3 with, you know, a Snowball Edge because there's no overkill like massive overkill.

Chris: No, the [snowmobile 00:11:29] would have been the massive overkill. But depending on where you live, you know, you might not have been able to get a permit to park the snowmobile there.

Corey: They apparently require a loading dock. Same as with the outposts. I can't fake having one of those on my front porch yet.

Chris: Ah. Well, there you go. I mean, you know it's the right height though, and you don't mind them ruining your lawn.

Corey: So, help me understand. It makes sense to me at least, on some level, why having a central repository of all the various cloud security breaches in one place that's easy to reference is valuable. But what caused you to decide, you know, rather than saying it'd be nice to have, I'm going to go build that thing?

Chris: Yeah, so it was actually right before the last time we spoke, Nicholas Sharp was indicted. And there was like, hey, this person was indicted for, you know, this cloud security case. 
And I'm like, that name rings a bell, but I don't remember who this person was. And so, I kind of realized that there's so many of these things happening now that I forget who is who. And so, when a new piece of news comes along, I'm like, where did this come from and how does this fit into what my knowledge of cloud security is and cloud security cases? So, I kind of realized that these are all running together in my mind. The Department of Justice only referenced ‘Company One,' so it wasn't clear to me if this even was a new cloud incident or one I already knew about. And so basically, I decided, okay, let's build this. Breaches.cloud was available; I think I kind of got the idea from hackingthe.cloud. And I had been working with some college students through the Collegiate Cyber Defense Competition, and I was like, “Hey, anybody want a spring research project that I will pay you for?” And so yeah, PrimeHarbor funded two college students to do quite a bit of the background research for me, I mentored them through, “Hey, so here's what this means,” and, “Hey, have we noticed that all of these seem to relate to credentials found in GitHub? You know, maybe there's a pattern here.” So, if you're not yet scanning for secrets in GitHub, I recommend you start scanning for secrets in your GitHub, private and public repos.

Corey: Also, it makes sense to look at the history. Because, oh, I committed a secret. I'm going to go ahead and revert that commit and push that. That solves the problem, right?

Chris: No, no, it doesn't. 
Yes, apparently, you can force push and delete an entire commit, but you really want to use a tool that's going to go back through the commit history and dig through it because as we saw in the Uber incident, when—the second Uber incident, the one that led to the CSO's conviction—yeah, the two attackers, [unintelligible 00:14:09] stuffed an Uber employee's personal GitHub account that they were also using for Uber work, and yeah, then they dug through all the source code and dug through the commit histories until they found a set of keys, and that's what they used for the second Uber breach.

Corey: Awful when that hits. It's one of those things where it's just… [sigh], one thing leads to another leads to another. And on some level, I'm kind of amazed by the forensics that happen around all of these things. With the counterpoint, it is so… freakishly difficult, I think, for lack of a better term, just to be able to say what happened with any degree of certainty, so I can't help but wonder in those dark nights when the creeping dread starts sinking in, how many things like this happen that we just never hear about because they don't know?

Chris: Because they don't turn on CloudTrail. Probably a number of them. Once the data gets out and shows up on the dark web, then people start knocking on doors. You know, Troy Hunt's got a large collection of data breach stuff, and you know, when there's a data breach, people will send him, “Hey, I found these passwords on the dark web,” and he loads them into Have I Been Pwned, and you know, [laugh] then the CSO finds out. So yeah, there's probably a lot of this that happens in the quiet of night, but once it hits the dark web, I think that data starts becoming available and the victimized company finds out.

Corey: I am profoundly cynical, in case that was unclear. 
So, I'm wondering, on some level, what is the likelihood or commonality, I suppose, of people who are fundamentally just viewing security breach response from a perspective of step one, make sure my resume is always up to date. Because we talk about these business continuity plans and these DR approaches, but very often it feels like step one, secure your own mask before assisting others, as they always say on the flight. Where does personal preservation come in? And how does that compare with company preservation?

Chris: I think down at the [IC 00:16:17] level, I don't know of anybody who has not gotten a job because they had Equifax on their resume back in, what, 2017, 2018, right? Yes, the CSO, the CEO, the CIO probably all lost their jobs. And you know, now they're scraping by on book deals and speaking engagements.

Corey: And these things are always, to be clear, nuanced. It's rare that this is always one person's fault. If you're a one-person company, okay, yeah, it's kind of your fault, let's be clear here, but there are controls and cost controls and audit trails, presumably, for all of these things, so it feels like that's a relatively easy thing to talk around, that it was a process failure, not that one person sucked. “Well, didn't you design and implement the process?” “Yes. But it turned out there were some holes in it and my team reported that those weren't there and it turned out that they were and, well, live and learn.” It feels like that's something that could be talked around.

Chris: It's an investment failure. And again, you know, if we go back to Harry Truman, “The buck stops here,” you know, it's the CEO who decides that, hey, we're going to buy a corporate jet rather than buy a [SIEM 00:17:22].
And those are the choices that happen at the top level that define, do you have a capable security team, and more importantly, do you have a capable security culture such that your security team isn't the only ones who are actually thinking about security?

Corey: That's, I guess, a fair question. I saw a take on Twitter (which is always a weird thing), or maybe it was Bluesky or somewhere else recently, that if you don't have a C-level executive responsible for security with security in their title, your company does not take security seriously. And I can see that past a certain point of scale, but as a one-person company, do you have a designated CSO?

Chris: As a one-person company and as a security company, I sort of do have a designated CSO. I also have, you know, the person who's like, oh, I'm going to not put MFA on the root of this one thing because, well, it's an experiment and it's a sandbox and whatever else, but I also know that that's not where I'm going to be putting any customer data, so I can measure and evaluate the risk from both a security perspective and a business existential investment perspective. The larger the organization, the more detached the CEO gets from the risk and what the company is building and what the company is doing, and that is where you get into trouble. And lots of companies have a C-level somebody who's responsible for security. It's called the CSO, but oftentimes, they report four levels down, or even more, from the chief executive who is actually the one making the investment decisions.

Corey: On some level, there's the “oh yeah, that's my responsibility, too,” but it feels like it's a trap that people fall into. Like, well, the CTO is responsible for security at a publicly traded company. Like, well… that tends to not work anymore, past certain points of scale. Like when I started out independently, yes, I was the CSO. I was also the accountant. I was also the head of marketing. I was also the janitor.
There's a bunch of different roles; we all wear different hats at different times.

I'm also not a big fan of shaming along the lines of, “Oh, yeah, this is a universal truth that applies to every company in existence.” That's also where I think Twitter started to go wrong where you would get called out whenever making an observation or witticism or whatnot because there was some vertex case to which it did not necessarily apply and then people would ‘well, actually,' you to death.

Chris: Yeah. Well, and I think there's a lot of us in the security community who are in the security one-percenters. We're, “Hey, yes, I'm a cloud security person on a 15-person cloud security team, and here's this awesome thing we're doing.” And then you've got most of the other companies in this country that are probably below the security poverty line. They may or may not have a dedicated security person, they certainly don't have a SIEM, they certainly don't have anybody who's monitoring their endpoints for malware attacks or anything else, and those are the companies that are getting hit all the time with, you know, a lot of this ransomware stuff. Healthcare is particularly vulnerable to that.

Corey: When you take a look across the industry, what is it that you're doing now at PrimeHarbor that you feel has been an unmet need in the space? And let me be clear, as of this recording earlier today, we signed a contract with you for a project. There's more to come on that in the future. So, this is me asking you to tell a story, not challenging, like, what do you actually do? This is not a refund request, let's be very clear here. But what's the unmet need that you saw?

Chris: I think the unmet need that I see is we don't talk to our builder community. And when I say builder, I mean developers, DevOps, sysadmins, whatever. AWS likes the term builder and I think it works. We don't talk to our builder community about risk in a way that makes sense to them.
So, we can say, “Hey, well, you know, we have this security policy and section 24601 says that all data classifications must be signed off by the data custodian,” and a developer is going to look at you with their head tilted, and be like, “Huh? What? I just need to get the sprint done.”

Whereas if we can articulate the risk (and one of the reasons I wanted to do breaches.cloud was to have that corpus of articulated risk around specific things), I can articulate the risk and say, “Hey, look, you know how easy it is for somebody to go in and enumerate an S3 bucket? And then once they've enumerated and guessed that S3 bucket exists, they list it, and oh, hey, look, now that they've listed it, they know all of the objects and all of the juicy PII that you just made public.” If you demonstrate that to them, then they're going to be like, “Oh, I'm going to add the extra story point to this story to go figure out how to do CloudFront origin access identity.” And now you've solved, you know, one more security thing. And you've done it in a way that's not just giving a man a fish or closing the bucket for them; now they know, hey, I should always use origin access identity. This is why I need to do this particular thing.

Corey: One of the challenges that I've seen in a variety of different sites that have tried to start cataloging different breaches and other collections of things happening in public is the discoverability or the library management problem. The most obvious example of this is, of course, the AWS console itself, where when it paginates things like, oh, there are 3000 things here, ten at a time, through various pages for it. Like, the marketplace is just a joke of discoverability.
How do you wind up separating the stuff that is interesting and notable, rather than, well, this has about three sentences to it because that's all the company would say?

Chris: So, I think even the ones where there's three sentences, we may actually go ahead and add it to the repo, or we may just hold it as a draft, so that we know later on when, “Hey, look, here's a federal indictment for Company Three. Oh, hey, look. Company Three was actually this breach announcement that we heard about three months ago,” or even three years ago. So like, you know, Chegg is a great example of, you know, one of those where, hey, you know, there was an incident, and they disclosed something, and then, years later, the FTC comes along and starts banging them over the head. And in the FTC documentation, or in the FTC civil complaint, we got all sorts of useful data.

Like, not only were they using root API keys, every contractor and employee there was sharing the root API keys, so when they had a contractor who left, it was too hard to change the keys and share it with everybody, so they just didn't do that. The contractor still had the keys, and that was one of the findings from the FTC against Chegg. Similar to that, Cisco didn't turn off contractors' access, and I think (this is pure speculation) the poor contractor one day logged into his Google Cloud Shell, cd'ed into a Terraform directory, ran ‘terraform destroy', and rather than destroying what he thought he was destroying, it had the access keys back to Cisco WebEx and took down 400 EC2 instances that made up all of WebEx.
These are the kinds of things that I think are worth capturing because the stories are going to come out over time.

Corey: What have you seen in your, I guess, so far limited history of curating this? I guess, first, what is it you've learned that you've started seeing as far as patterns go, as far as what warrants inclusion, what doesn't? And of course, once you started launching and going a bit more public with it, I'm curious to hear what the response from companies is going to be.

Chris: So, I want to be very careful and clear that if I'm going to name somebody, that we're sourcing something from the criminal justice system, that we're not going to say, “Hey, everybody knows that it was Paige Thompson who was behind it.” No, no, here's the indictment that said it was Paige Thompson that was, you know, indicted for this Capital One sort of thing. All the data that I'm using, it all comes from public sources, it's all cited, so it's not like, hey, some insider said, “Hey, this is what actually happened.” You know? I very much learned from the Ubiquiti case that I don't want to be in the position of Brian Krebs, where it's the attacker themselves who's updating the site and telling us everything that went wrong, when in fact, it's not, because they're in fact the perpetrator.

Corey: Yeah, there's a lot of lessons to be learned. And fortunately, at least it seems… mostly, that we've moved past the bad old days of security researchers getting sued on a whim by large companies for saying embarrassing things about them. Of course, watch me be tempting fate and by the time this publishes, I'll get sued by some company, probably Azure or whatnot, telling me that, “Okay, we've had enough of you saying bad things about our security.” It's like, well, cool, but I also read the complaint before you file because your security is bad. Buh-dum-tss. I'm kidding. I'm kidding.
Please don't sue me.

Chris: So, you know, whether it's slander or libel, depending on whether you're reading this or hearing it, you know, truth is an actual defense, so I think Microsoft doesn't have a case against you. I think for what we're doing in breaches, you know, that's one of the reasons that I'm going to be very clear with anybody who contributes. And just for the record, anybody is welcome to contribute. The GitHub repo that runs breaches.cloud is public and anybody can submit me a pull request and I will take their write-ups of incidents. But whatever it is, it has to be sourced.

One of the things that I'm looking to do shortly is start soliciting sponsorships for breaches so that we can afford to go pull down the PACER documents. Because apparently in this country, while we have a right to a speedy trial, we don't have a right to actually get the court transcripts for less than ten cents a page. And so, part of what we need to do next is download those (and once we've purchased them, we can make them public), make them public, and let everybody see exactly what the transcript was from the Capital One incident, or the Joe Sullivan trial.

Corey: You're absolutely right. It drives me nuts that I have to wind up budgeting money for PACER to pull up court records. And at ten cents a page, it hasn't changed in decades, where it's oh, this is the cost of providing that data. It's, I'm not asking someone to walk to the back room and fax it to me. I want to be very clear here. It just feels like it's one of those areas where the technology in government has not caught up, and part of the problem is, of course, having no competition.

Chris: There is that. And I think I read somewhere that if you wanted to download the entire PACER, it would be, like, $100 million. Not that you would do that, but you know, it is the moneymaker for the judicial system, and you know, they do need to keep the lights on. Although I guess that's what my taxes are for.
But again, yes, they're a monopoly; they can do that.

Corey: Wildly frustrating, isn't it?

Chris: Yeah [sigh]… yeah, yeah, yeah. Yeah, I think there's a lot of value in the court transcripts. I've held off on publishing the Capital One case because one, well, already there's been a lot of ink spilled on it, and two, I think all the good detail is going to be in the trial transcripts from Paige Thompson's trial.

Corey: So, I am curious what your take is on… well, let's call it the ‘FTX thing.' I don't even know how to describe it at this point. Is it a breach? Is it just malfeasance? Is it 15,000 other things? But I noticed that it's something that breaches.cloud does talk about a bit.

Chris: Yeah. So, that one was a fascinating one that came out because as I was starting this project, I heard you know, somebody who was tweeting was like, “Hey, they were storing all of the crypto private keys in AWS Secrets Manager.” And I was like, “Errr?” And so, I went back and I read John J. Ray III's interim report to the creditors.

Now, John Ray is the man who was behind the cleaning up of Enron, and his comment was, “Never in my career have I seen such a complete failure of corporate controls and such a complete absence of trustworthy information as occurred here.” And as part of his general, broad write-up, he went, in depth, into a lot of the FTX AWS practices. Like, we talk about, hey, you know, your company should be multi-account. FTX was worse. They had three or four different companies all operating in the same AWS account.

They had their main company, FTX US, Alameda, all of them had crypto keys in Secrets Manager and there was no access control between any of those. And what ended up happening on the day that SBF left and Ray came in as CEO, $400 million worth of crypto somehow disappeared out of FTX's wallets.

Corey: I want to call this out because otherwise, I will get letters from the AWS PR spin doctors.
Because on the surface of it, I don't know that there's necessarily a lot wrong with using Secrets Manager as the backing store for private keys. I do that with other things myself. The question is, what other controls are there? You can't just slap it into Secrets Manager and, “Well, my job is done. Let's go to lunch early today.”

There are challenges [laugh] around the access levels, around who has access, who can audit these things, and what happens. Because most of the secrets I have in Secrets Manager are not the sort of thing where it is now a viable strategy to take that thing and abscond to a country without an extradition treaty for the rest of my life, but with private keys and crypto, there kind of is.

Chris: That's it. It's like, you know, hey, okay, the RDS database password is one thing, but $400 million in crypto is potentially another thing. Putting it in Secrets Manager might have been the right answer, too. You get KMS customer-managed keys, you get full auditability with CloudTrail, everything else, but we didn't hear any of that coming out of Ray's report to the creditors. So again, the question is, did they even have CloudTrail turned on? He did explicitly say that FTX had not enabled GuardDuty.

Corey: On some level, even if GuardDuty doesn't do anything for you, which in my case, it doesn't, but I want to be clear, you should still enable it anyway because you're going to get dragged when there's the inevitable breach because there's always a breach somewhere, and then you get yelled at for not having turned on something that was called GuardDuty. You already sound negligent, just with that sentence alone. Same with Security Hub. Good name on AWS's part if you're trying to drive service adoption.
Just by calling it the thing that responsible people would use, you will see adoption, even if people never configure or understand it.

Chris: Yeah, and then of course, hey, you had Security Hub turned on, but you ignore the 80,000 findings in it. Why did you ignore those 80,000 findings? I find Security Hub to probably be a little bit too much noise. And it's not Security Hub, it's ‘Compliance Hub.' Everything (and I'm going to have a blog post coming out shortly on this), everything that Security Hub looks at, it looks at it from a compliance perspective.

If you look at all of its scoring, it's not how many things are wrong; it's how many rules you are a hundred percent compliant with. It is not useful for anybody below that AWS security poverty line to really master or to really operationalize.

Corey: I really want to thank you for taking the time to catch up with me once again. Although now that I'm the client, I expect I can do this on demand, which is just going to be delightful. If people want to learn more, where can they find you?

Chris: So, they can find breaches.cloud at, well, https://breaches.cloud. If you're looking for me, I am either on Twitter, still, at @jcfarris, or you can find me and my consulting company, which is www.primeharbor.com.

Corey: And we will, of course, put links to all of that in the [show notes 00:33:57]. Thank you so much for taking the time to speak with me. As always, I appreciate it.

Chris: Oh, thank you for having me again.

Corey: Chris Farris, cloud security nerd at PrimeHarbor. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that you're also going to use as the storage back-end for your private keys.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
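
A rough illustration of Chris's point earlier in this episode, that reverting a commit does not remove a secret from the repository's history. This is only a minimal sketch and not a substitute for a dedicated secret scanner: the function names are hypothetical, and it only looks for strings shaped like long-lived AWS access key IDs (the `AKIA` prefix followed by 16 uppercase letters or digits). It would not see commits that were force-pushed away and are no longer reachable from any ref.

```python
import re
import subprocess

# Assumption: we only look for long-lived IAM access key IDs, which start
# with "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_key_ids(patch_text):
    """Return the sorted, de-duplicated key-ID-shaped strings in some text."""
    return sorted(set(AWS_KEY_ID.findall(patch_text)))


def scan_repo_history(repo_path):
    """Scan the full patch history of every ref, not just the current tree.

    A key that was committed and later reverted still appears in the
    patch output as a removed ("-") line, so it is still found here.
    """
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "-p"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # tolerate binary or oddly encoded blobs
        check=True,
    ).stdout
    return find_key_ids(log)
```

Calling the hypothetical `scan_repo_history(".")` inside a clone would list candidate key IDs from any commit still reachable from a branch or tag; anything it finds should be treated as compromised and rotated rather than merely deleted.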

The Cloud Pod
212: The Cloud Pod Wades into Microservices vs. Monoliths

The Cloud Pod

Play Episode Listen Later May 18, 2023 41:27


Welcome to the newest episode of The Cloud Pod podcast! Justin, Ryan, Jonathan, Matthew and Peter are your hosts this week as we discuss all things cloud and AI.

Titles we almost went with this week:
  • The Cloud Pod is better than Bob's Used Books
  • The Cloud Pod sets up AWS notifications for all
  • The Cloud Pod is non-differential about privacy in BigQuery
  • The Cloud Pod finds Windows Bob
  • The Cloud Pod starts preparing for its Azure Emergency today

A big thanks to this week's sponsor: Foghorn Consulting provides top-notch cloud and DevOps engineers to the world's most innovative companies. Initiatives stalled because you have trouble hiring? Foghorn can be burning down your DevOps and Cloud backlogs as soon as next week.

AWS Bites
80. Can you do private static websites on AWS?

AWS Bites

Play Episode Listen Later May 11, 2023 19:12


In this episode of the AWS Bites podcast, we discuss the challenges of hosting private static websites on AWS. We explore why it's important to host internal corporate applications and line of business applications only for internal consumption, and the requirements for doing so. We also evaluate different options for hosting private static websites, including S3 with CloudFront, containers on ECS/Fargate with ALB, API Gateway, and AppRunner. Finally, we summarize the pros and cons of each option and provide a rating for each. If you're looking to host a private static website on AWS, this episode is a must-listen!

Screaming in the Cloud
The Benefits of Mocking Clouds Locally with Waldemar Hummer

Screaming in the Cloud

Play Episode Listen Later Mar 30, 2023 32:24


Waldemar Hummer, Co-Founder & CTO of LocalStack, joins Corey on Screaming in the Cloud to discuss how LocalStack changed Corey's mind on the futility of mocking clouds locally. Waldemar reveals why LocalStack appeals to both enterprise companies and digital nomads, and explains how both see improvements in their cost predictability as a result. Waldemar also discusses how LocalStack is an open-source company first and foremost, and how they're working with their community to evolve their licensing model. Corey and Waldemar chat about the rising demand for esoteric services, and Waldemar explains how accommodating that has led to an increase in adoption from the big data space.

About Waldemar

Waldemar is Co-Founder and CTO of LocalStack, where he and his team are building the world-leading platform for local cloud development, based on the hugely popular open source framework with 45k+ stars on GitHub. Prior to founding LocalStack, Waldemar held several engineering and management roles at startups as well as large international companies, including Atlassian (Sydney), IBM (New York), and Zurich Insurance. He holds a PhD in Computer Science from TU Vienna.

Links Referenced:
  • LocalStack website: https://localstack.cloud/
  • LocalStack Slack channel: https://slack.localstack.cloud
  • LocalStack Discourse forum: https://discuss.localstack.cloud
  • LocalStack GitHub repository: https://github.com/localstack/localstack

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn.
Until a bit over a year ago or so, I had a loud and some would say fairly obnoxious opinion around the futility of mocking cloud services locally. This is not to be confused with mocking cloud services on the internet, which is what I do in lieu of having a real personality. And then one day I stopped espousing that opinion, or frankly, any opinion at all. And I'm glad to be able to talk at long last about why that is. My guest today is Waldemar Hummer, CTO and co-founder at LocalStack. Waldemar, it is great to talk to you.

Waldemar: Hey, Corey. It's so great to be on the show. Thank you so much for having me. We're big fans of what you do at The Duckbill Group and Last Week in AWS. So really, you know, glad to be here with you today and have this conversation.

Corey: It is not uncommon for me to have strong opinions that I espouse (politely, to be clear; I'll make fun of companies and not people as a general rule), but sometimes I find that I've not seen the full picture and I no longer stand by an opinion I once held. And you're one of my favorite examples of this because, over the course of a 45-minute call with you and one of your business partners, I went from, “What you're doing is a hilarious misstep and will never work,” to, “Okay, and do you have room for another investor?” And in the interest of full disclosure, the answer to that was yes, and I became one of your angel investors. It's not exactly common for me to do that kind of a hard pivot. And I kind of suspect I'm not the only person who currently holds the opinion that I used to hold, so let's talk a little bit about that. At the very beginning, what is LocalStack and what is it you would say that you folks do?

Waldemar: So LocalStack, in a nutshell, is a cloud emulator that runs on your local machine. It's basically like a sandbox environment where you can develop your applications locally.
We currently have a range of around 60 or 70 services that we provide, things like Lambda Functions, DynamoDB, SQS, like, all the major AWS services. And to your point, it is indeed a pretty large undertaking to actually implement the cloud and run it locally, but with the right approach, it actually turns out that it is feasible and possible, and we've demonstrated this with LocalStack. And I'm glad that we've convinced you to think of it that way as well.

Corey: A couple of points that you made during that early conversation really stuck with me. The first is, “Yeah, AWS has two, no, three, no, four hundred different service offerings. But look at your customer base. How many of those services are customers using in any real depth? And of those services, yeah, the APIs are vast, and very much a sprawling pile of nonsense, but how many of those esoteric features are those folks actually using?” That was half of the argument that won me over.

The other half was, “Imagine that you're an enormous company that's an insurance company or a bank. And this year, you're hiring 5000 brand new developers, fresh out of school. Two to 3000 of those developers will still be working here in about a year as they wind up either progressing in other directions, not winding up completing internships, or going back to school after internships, or for a variety of reasons. So, you have that many people that you need to teach how to use cloud in the context that we use cloud, combined with the question of how do you make sure that one of them doesn't make a fun mistake that winds up bankrupting the entire company with a surprise AWS bill?” And those two things combined turned me from, “What you're doing is ridiculous,” to, “Oh, my God. You're absolutely right.”

And since then, I've encountered you in a number of my client environments. You were absolutely right.
This is something that resonates deeply and profoundly with larger enterprise customers in particular, but also folks who just don't want to wind up being beholden to, every time they do a deploy to anything to test something out, “Yay, I get to spend more money on AWS services.”

Waldemar: Yeah, totally. That's spot on. So, to your first point, so definitely we have a core set of services that most people are using. So, things like Lambda, DynamoDB, SQS, like, the core serverless, kind of, APIs. And then there's kind of a long tail of more exotic services that we support these days, things like, even like QLDB, the Quantum Ledger Database, or, you know, Managed Streaming for Kafka.

But like, certainly, like, the core 15, 20 services are the ones that are really most used by the majority of people. And then we also, you know, in our pro offering, have some very, sort of, advanced services for different use cases. So, that's to your first point.

And second point is, yeah, totally spot on. So LocalStack, like, really enables you to experiment in the sandbox. So, we see it as both an experimentation and a development environment, where you don't need to think about cloud costs. And this, I guess, will be very close to your heart in the work that you're doing, the costs are becoming really predictable as well, right? Because in the cloud, you know, I worked at different companies before doing LocalStack where we were using AWS resources, and you can end up in a situation where overnight, you accumulate, you know, hundreds of thousands of dollars of AWS bill because you've turned on a certain feature, or some, you know, connectivity into some VPC or networking configuration that just turns out to be costly.

Also, one more thing that is worth mentioning, like, we want to encourage, like, frequent testing, and a lot of the cloud's billing and cost structure is focused around, for example, hourly billing of resources, right?
And if you have a test that just spins up resources that run for a couple of minutes, you still end up paying the entire hour. And with LocalStack, really, that brings down the cloud bills significantly because you can really test frequently, the cycles become much faster, and it's also again, more efficient, more cost-effective.

Corey: There's something useful to be said for, “Well, how do I make sure that I turn off resources when I'm done?” In cloud, it's a bit of a game of guess-and-check. And you turn off things you think are there and you wait a few days and you check the bill again, and you go and turn more things off, and the cycle repeats. Or alternately, wait for the end of the month and wonder in perpetuity why you're being billed 48 cents a month, and not be clear on why. Restarting the laptop is a lot more straightforward.

I also want to call out some of my own bias on this where I used to be a big believer in being able to build and deploy and iterate on things locally because well, what happens when I'm in a plane with terrible WiFi? Well, in the before times, I flew an awful lot and was writing a fair bit of, well, cloudy nonsense and I still never found that to be a particular blocker on most of what I was doing. So, it always felt a little bit precious to me when people were talking about, well, what if I can't access the internet to wind up building and deploying these things? It's now 2023. How often does that really happen? But is that a use case that you see a lot of?

Waldemar: It's definitely a fair point. And probably, like, 95% of cloud development these days is done in a high internet bandwidth environment, maybe some corporate network where you have really fast internet access. But that's only a subset, I guess, of the world out there, right? So, there might be situations where, you know, you may have bad connectivity. Also, maybe you live in a region, or maybe you're traveling even, right?
So, there's a lot more and more people who are just, “Digital nomads,” quote-unquote, right, who just like to work in remote places.

Corey: You're absolutely right. My bias is that I live in San Francisco. I have symmetric gigabit internet at home. There's not a lot of scenarios in my day-to-day life (except when I'm, you know, on the train or the bus traveling through the city, because thank you, Verizon) where I have impeded connectivity.

Waldemar: Right. Yeah, totally. And I think the other aspect of this is kind of the developers just like to have things locally, right, because it gives them the feeling of you know, better control over the code, like, being able to integrate into their IDEs, setting breakpoints, having these quick cycles of iterations. And again, this is something that there's more and more tooling coming up in the cloud ecosystem, but it's still inherently a remote execution that just, you know, takes the round trip of uploading your code, deploying, and so on, and that's just basically the pain point that we're addressing with LocalStack.

Corey: One thing that did surprise me as well was discovering that there was a lot more appetite for this sort of thing in enterprise-scale environments. I mean, some of the reference customers that you have on your website include divisions of the UK Government and 3M (you know, the Post-It note people) as well as a number of other very large environments. And at first, that didn't make a whole lot of sense to me, but then it suddenly made an awful lot of sense because it seems, and please correct me if I'm wrong, that in order to use something like this at scale and use it in a way that isn't, more or less getting it into a point where the administration of it is more trouble than it's worth, you need to progress past a certain point of scale.
An individual developer on their side project is likely just going to iterate against AWS itself, whereas a team of thousands of developers might not want to be doing that because they almost certainly have their own workflows that make that process high friction.

Waldemar: Yeah, totally. So, what we see a lot is, especially in larger enterprises, dedicated teams, like, developer experience teams, whose main job is to really set up a workflow and environment where developers can be productive, most productive, and this can be, you know, on one side, like, setting up automated pipelines, provisioning maybe AWS sandbox and test accounts. And like some of these teams, when we introduce LocalStack, it's really a game-changer because it becomes much more decoupled and like, you know, distributed. You can basically configure your CI pipeline, just, you know, spin up the container, run your tests, tear down again afterwards. So, you know, it's fewer dependencies.

And also, one aspect to consider is the aspect of cloud approvals. A lot of companies that we work with have, you know, very stringent processes around even getting access to the cloud. Some SRE team needs to enable their IAM permissions and so on. With LocalStack, you can just get started from day one and just get productive and start testing from the local machine. So, I think those are patterns that we see a lot, in especially larger enterprise environments as well, where, you know, there might be some regulatory barriers and just, you know, process-wise steps as well.

Corey: When I started playing with LocalStack myself, one of the things that I found disturbingly irritating is, there's a lot that AWS gets largely right with its AWS command-line utility. You can stuff a whole bunch of different options into the config for different profiles, and all the other tools that I use mostly wind up respecting that config.
The few that extend it add custom lines to it, but everything else is mostly well-behaved and ignores the things it doesn't understand. But there is no facility that lets you say, “For this particular profile, use this endpoint for AWS service calls instead of the normal ones in public regions.” In fact, to do that, you effectively have to pass specific endpoint URLs to arguments, and I believe the syntax on that is not globally consistent between different services.It just feels like a living nightmare. At first, I was annoyed that you folks wound up having to ship your own command-line utility to wind up interfacing with this. Like, why don't you just add a profile? And then I tried it myself and, oh, I'm not the only person who knows how this stuff works that has ever looked at this and had that idea. No, it's because AWS is just unfortunate in that respect.Waldemar: That is a very good point. And you're touching upon one of the major pain points that we have, frankly, with the ecosystem. So, there are some pull requests against the AWS open-source repositories for the SDKs and various other tools, where folks—not only LocalStack, but other folks in the community have asked for introducing, for example, an AWS endpoint URL environment variable. These [protocols 00:12:32], unfortunately, were never merged. So, it would definitely make our lives a whole lot easier, but so far, we basically have to maintain these, you know, these wrapper scripts, basically, AWS local, CDK local, which basically just, you know, points the client to local endpoints. It's a good workaround for now, but I would assume and hope that the world's going to change in the upcoming years.Corey: I really hope so because everything else I can think of is just bad. The idea of building a custom wrapper around the AWS command-line utility that winds up checking the profile section, and oh, if this profile is that one, call out to this tool, otherwise it just becomes a pass-through. 
That has security implications that aren't necessarily terrific, you know, in large enterprise companies that care a lot about security. Yeah, pretend to be a binary you're not is usually the kind of thing that makes people sad when security politely kicks their door in.Waldemar: Yeah, we actually have pretty, like, big hopes for the v3 wave of the SDKs, AWS, because there is some restructuring happening with the endpoint resolution. And also, you can, in your profile, by now have, you know, special resolvers for endpoints. But still the case of just pointing all the SDKs and CLI to a custom endpoint is just not yet resolved. And this is, frankly, quite disappointing, actually.Corey: While we're complaining about the CLI, I'll throw one of my recurring issues with it in. I would love for it to adopt the Linux slash Unix paradigm of having a config.d directory that you can reference from within the primary config file, and then any file within that directory in the proper syntax winds up getting adopted into what becomes a giant composable config file, generated dynamically. The reason being is, I can have entire lists of profiles in separate files that I could then wind up dropping in and out on a client-by-client basis. So, I don't inadvertently expose who some of my clients are, in the event that winds up being part of the way that they have named their AWS accounts.That is one of those things I would love but it feels like it's not a common enough use case for there to be a whole lot of traction around it. And I guess some people would make a fair point if they were to say that the AWS CLI is the most widely deployed AWS open-source project, even though all it does is give money to AWS more efficiently.Waldemar: Yeah. Great point. Yeah, I think, like, how and some way to customize and, like, mingle or mangle your configurations in a more easy fashion would be super useful. 
I guess it might be a slippery slope to getting, you know, into something like I don't know, Helm for EKS and, like, really, you know, having to maintain a whole templating language for these configs. But certainly agree with you, to just you know, at least having [plug 00:15:18] points for being able to customize the behavior of the SDKs and CLIs would be extremely helpful and valuable.Corey: This is not—unfortunately—my first outing with the idea of trying to have AWS APIs done locally. In fact, almost a decade ago now, I did a build-out at a very large company of a… well, I would say that the build-out was not itself very large—it was about 300 nodes—that were all running Eucalyptus, which before it died on the vine, was imagined as a way of just emulating AWS APIs locally—done in Java, as I recall—and exposing local resources in ways that comported with how AWS did things. So, the idea being that you could write configuration to deploy any infrastructure you wanted in AWS, but also treat your local data center the same way. That idea unfortunately did not survive in the marketplace, which is kind of a shame, on some level. What was it that inspired you folks to wind up building this with an eye towards local development rather than run this as a private cloud in your data center instead?Waldemar: Yeah, very interesting. And I do also have some experience [unintelligible 00:16:29] from my past university days with Eucalyptus and OpenStack also, you know, running some workloads in an on-prem cluster. I think the main difference, first of all, these systems were extremely hard, notoriously hard to set up and maintain, right? So, lots of moving parts: you had your image server, your compute system, and then your messaging subsystems. 
Lots of moving parts, and wanting to have everything basically much more monolithic and in a single container.

And Docker really sort of provides a great platform for us, which is create everything in a single container, spin up locally, make it very lightweight and easy to use. But I think really the first days of LocalStack, the idea was really, was actually with the use case of somebody from our team. Back then, I was working at Atlassian in the data engineering team and we had folks in the team who were commuting to work on the train. And it was literally this use case that you mentioned before about being able to work basically offline on your commute. And this is kind of where the first lines of code were written and then kind of the idea evolved from there.

We put it into the open-source, and then, kind of, it was growing over the years. But it really started as not having it as an on-prem, like, heavyweight server, but really as a lightweight system that you can easily—that is easily portable across different systems as well.

Corey: That is a good question. Very often, when I'm using various tools that are aimed at development use cases, it is very clear that one particular operating system is invariably going to be the first-class citizen and everything else is a best effort. Ehh, it might work; it might not. Does LocalStack feel that way? And if so, what's the operating system that you want to be on?

Waldemar: I would say we definitely work best on Mac OS and Linux. It also works really well on Windows, but I think given that some of our tooling in the ecosystem is also pretty much geared towards Unix systems, I think those are the platforms it will work well with. Again, on the other hand, Docker is really a platform that helps us a lot being compatible across operating systems and also CPU architectures. We have a multi-arch build now for AMD and ARM64. 
So, I think in that sense, we're pretty broad in terms of the compatibility spectrum.

Corey: I do not have any insight into how the experience goes on Windows, given that I haven't used that operating system in anger for, wow, 15 years now, but I will say that it's been top-flight on Mac OS, which is what I spend most of my time depressed that I'm using, but for desktop experiences, it seems to work out fairly well. That said, having a focus on Windows seems like it would absolutely be a hard requirement, given that so many developer workstations in very large enterprises tend to skew very Windows-heavy. My hat is off to people who work with Linux and Linux-like systems in environments like that where even line endings become psychotically challenging. I don't envy them their problems. And I have nothing but respect for people who can power through it. I never had the patience.

Waldemar: Yeah. Same here and definitely, I think everybody has their favorite operating system. For me, it's also been mostly Linux and Mac in the last couple of years. But certainly, we definitely want to be broad in terms of the adoption, and working with large enterprises often you have—you know, we want to fit into the existing landscape and environment that people work in. And we solve this by platform abstractions like Docker, for example, as I mentioned, and also, for example, Python, which is some more toolings within Python is also pretty nicely supported across platforms. But I do feel the same way as you, like, having been working with Windows for quite some time, especially for development purposes.

Corey: What have you noticed that your customer usage patterns slash requests have been saying about AWS service adoption? I have to imagine that everyone cares whether you can mock S3 effectively. EC2, DynamoDB, probably. SQS, of course. 
But beyond the very small baseline level of offering, what have you seen surprising demand for, as I guess, customer implementation of more esoteric services continues to climb?Waldemar: Mm-hm. Yeah, so these days it's actually pretty [laugh] pretty insane the level of coverage we already have for different services, including some very exotic ones, like QLDB as I mentioned, Kafka. We even have Managed Airflow, for example. I mean, a lot of these services are essentially mostly, like, wrappers around the API. This is essentially also what AWS is doing, right? So, they're providing an API that basically provisions some underlying resources, some infrastructure.Some of the more interesting parts, I guess, we've seen is the data or big data ecosystem. So, things like Athena, Glue, we've invested quite a lot of time in, you know, making that available also in LocalStack so you can have your maybe CSV files or JSON files in an S3 bucket and you can query them from Athena with a SQL language, basically, right? And that makes it very—especially these big data-heavy jobs that are very heavyweight on AWS, you can iterate very quickly in LocalStack. So, this is where we're seeing a lot of adoption recently. And then also, obviously, things like, you know, Lambda and ECS, like, all the serverless and containerized applications, but I guess those are the more mainstream ones.Corey: I imagine you probably get your fair share of requests for things like CloudFormation or CloudFront, where, this is great, but can you go ahead and add a very lengthy sleep right here, just because it returns way too fast and we don't want people to get their hopes up when they use the real thing. On some level, it feels like exact replication of the AWS customer experience isn't quite in line with what makes sense from a developer productivity point of view.Waldemar: Yeah, that's a great point. 
And I'm sure that, like, a lot of code out there is probably littered with sleep statements that are just tailored to the specific timing in AWS. In fact, we recently opened an issue in the AWS Terraform provider repository to add a configuration option to configure the timings that Terraform is using for the resource deployment. So, just as an example, an S3 bucket creation takes 60 seconds, like, more than a minute against [unintelligible 00:22:37] AWS. I guess LocalStack, it's a second basically, right?

And AWS Terraform provider has these, like, relatively slow cycles of checking whether the bucket has already been created. And we want to get that configurable to actually reduce the time it takes for local development, right? So, we have an open, sort of, feature request, and we're probably going to contribute to a Terraform repository. But definitely, I share the sentiment that a lot of the tooling ecosystem is built and tailored and optimized towards the experience against the cloud, which often is just slow and, you know, that's what it is, right?

Corey: One thing that I didn't expect, though, in hindsight, is blindingly obvious, is your support for a variety of different frameworks and deployment methodologies. I've found that it's relatively straightforward to get up and running with the CDK deploying to LocalStack, for instance. And in hindsight, of course; that's obvious. When you start out down that path, though, it's, well, you tend to think—at least I don't tend to think in that particular way. It's, “Well, yeah, it's just going to be a console-like experience, or I wind up doing CloudFormation or Terraform.” But yeah, that the world is advancing relatively quickly and it's nice to see that you are very comfortably keeping pace with that advancement.

Waldemar: Yeah, true. 
And I guess for us, it's really, like, the level of abstraction is sort of increasing, so you know, once you have a solid foundation, with, you know, CloudFormation implementation, you can leverage a lot of tools that are sitting on top of it, CDK, serverless frameworks. So, CloudFormation is almost becoming, like, the assembly language of the AWS cloud, right, and if you have very solid support for that, a lot of, sort of, tools in the ecosystem will natively be supported on LocalStack. And then, you know, you have things like Terraform, and in the Terraform CDK, you know, some of these derived versions of Terraform which also are very straightforward because you just need to point, you know, the target endpoint to localhost and then the rest of the deployment loop just works out of the box, essentially.So, I guess for us, it's really mostly being able to focus on, like, the core emulation, making sure that we have very high parity with the real services. We spend a lot of time and effort into what we call parity testing and snapshot testing. We make sure that our API responses are identical and really the same as they are in AWS. And this really gives us, you know, a very strong confidence that a lot of tools in the ecosystem are working out-of-the-box against LocalStack as well.Corey: I would also like to point out that I'm also a proud LocalStack contributor at this point because at the start of this year, I noticed, ah, in one of the pages, the copyright year was still saying 2022 and not 2023. So, a single-character pull request? Oh, yes, I am on the board now because that is how you ingratiate yourself with an open-source project.Waldemar: Yeah. Eternal fame to you and kudos for your contribution. But, [laugh] you know, in all seriousness, we do have a quite an active community of contributors. We are an open-source first project; like, we were born in the open-source. 
We actually—maybe just touching upon this for a second, we use GitHub for our repository, we use a lot of automation around, you know, doing pull requests, and you know, service owners.We also participate in things like the Hacktoberfest, which we participated in last year to really encourage contributions from the community, and also host regular meetups with folks in the community to really make sure that there's an active ecosystem where people can contribute and make contributions like the one that you did with documentation and all that, but also, like, actual features, testing and you know, contributions of different levels. So really, kudos and shout out to the entire community out there.Corey: Do you feel that there's an inherent tension between being an open-source product as well as being a commercial product that is available for sale? I find that a lot of companies feel vaguely uncomfortable with the various trade-offs that they make going down that particular path, but I haven't seen anyone in the community upset with you folks, and it certainly hasn't seemed to act as a brake on your enterprise adoption, either.Waldemar: That is a very good point. So, we certainly are—so we're following an open-source-first model that we—you know, the core of the codebase is available in the community version. And then we have pro extensions, which are commercial and you basically, you know, setup—you sign up for a license. We are certainly having a lot of discussions on how to evolve this licensing model going forward, you know, which part to feed back into the community version of LocalStack. And it's certainly an ongoing evolving model as well, but certainly, so far, the support from the community has been great.And we definitely focus to, kind of, get a lot of the innovation that we're doing back into our open-source repo and make sure that it's, like, really not only open-source but also open contribution for folks to contribute their contributions. 
We also integrate with other third-party libraries. We're built on the shoulders of giants, if I may say so, other open-source projects that are doing great work with emulators. To name just a few, it's like, [unintelligible 00:27:33] which is a great project that we sort of use and depend upon. We have certain mocks and emulations, for Kinesis, for example, Kinesis mock and a bunch of other tools that we've been leveraging over the years, which are really great community efforts out there. And it's great to see such an active community that's really making this vision possible: to have a truly local emulated cloud that gives the best experience to developers out there.

Corey: So, as of, well, now, when people are listening to this and the episode gets released, v2 of LocalStack is coming out. What are the big differences between LocalStack and now LocalStack 2: Electric Boogaloo, or whatever it is you're calling the release?

Waldemar: Right. So, we're super excited to release our v2 version of LocalStack. Planned release date is end of March 2023, so hopefully, we will make that timeline. We did release our first version of LocalStack in July 2022, so it's been roughly seven months since then and we try to have a cadence of roughly six to nine months for the major releases. And what you can expect is we've invested a lot of time and effort in the last couple of months and in the last year to really make it a very rock-solid experience with enhancements in the current services, a lot of performance optimizations, we've invested a lot in parity testing.

So, as I mentioned before, parity is really important for us to make sure that we have a high coverage of the different services and how they behave the same way as AWS. And we're also putting out an enhanced version and a completely polished version of our Cloud Pods experience. So, Cloud Pods is a state management mechanism in LocalStack. 
So, by default, the state in LocalStack is ephemeral, so when you restart the instance, you basically have a fresh state. But with Cloud Pods, we enable our users to take persistent snapshot of the states, save it to disk or to a server and easily share it with team members.And we have very polished experience with Community Cloud Pods that makes it very easy to share the state among team members and with the community. So, those are just some of the highlights of things that we're going to be putting out in the tool. And we're super excited to have it done by, you know, end of March. So, stay tuned for the v2 release.Corey: I am looking forward to seeing how the experience shifts and evolves. I really want to thank you for taking time out of your day to wind up basically humoring me and effectively re-covering ground that you and I covered about a year and a half ago now. If people want to learn more, where should they go?Waldemar: Yeah. So definitely, our Slack channel is a great way to get in touch with the community, also with the LocalStack team, if you have any technical questions. So, you can find it on our website, I think it's slack.localstack.cloud.We also host a Discourse forum. It's discuss.localstack.cloud, where you can just, you know, make feature requests and participate in the general conversation.And we do host monthly community meetups. Those are also available on our website. If you sign up, for example, for a newsletter, you will be notified where we have, you know, these webinars. Take about an hour or so where we often have guest speakers from different companies, people who are using, you know, cloud development, local cloud development, and just sharing the experiences of how the space is evolving. And we're always super happy to accept contributions from the community in these meetups as well. 
And last but not least, our GitHub repository is a great way to file any issues you may have, feature requests, and just getting involved with the project itself.Corey: And we will, of course, put links to that in the [show notes 00:31:09]. Thank you so much for taking the time to speak with me today. I appreciate it.Waldemar: Thank you so much, Corey. It's been a pleasure. Thanks for having me.Corey: Waldemar Hummer, CTO and co-founder at LocalStack. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment, presumably because your compensation structure requires people to spend ever-increasing amounts of money on AWS services.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
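
A sketch of the workaround discussed in this episode: a thin wrapper that adds an explicit endpoint to AWS CLI calls when a designated local profile is selected, and passes everything else through untouched. This is a minimal illustration, not how LocalStack's own wrapper scripts are implemented. The profile name `localstack`, the helper `build_aws_command`, and the address `http://localhost:4566` are assumptions made for the example; `--profile` and `--endpoint-url` are existing global options of the AWS CLI.

```python
"""Illustrative endpoint-redirecting wrapper for the AWS CLI (hypothetical helper)."""
import shlex

# Assumption: a LocalStack container is reachable at this address on the local machine.
LOCAL_ENDPOINT = "http://localhost:4566"

# Assumption: a profile name picked for this example; it is not an AWS or LocalStack convention.
LOCAL_PROFILE = "localstack"


def build_aws_command(args, profile=None, endpoint=LOCAL_ENDPOINT):
    """Return the argv list for an `aws` call.

    The endpoint override is added only when the local profile is selected;
    any other profile (or no profile) results in a plain pass-through.
    """
    command = ["aws"]
    if profile:
        command += ["--profile", profile]
    if profile == LOCAL_PROFILE:
        command += ["--endpoint-url", endpoint]
    return command + list(args)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    print(shlex.join(build_aws_command(["s3", "ls"], profile=LOCAL_PROFILE)))
    print(shlex.join(build_aws_command(["s3", "ls"], profile="prod")))
```

The function only builds the argument list, which keeps the redirect logic easy to inspect. Actually executing the result (for example with `subprocess.run`) would also require the chosen profile to exist in the CLI configuration, and it carries the security trade-offs Corey raises about wrappers that stand in for the real binary.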

Screaming in the Cloud
Combining Community and Company Employees with Matty Stratton

Screaming in the Cloud

Mar 16, 2023 40:08


Matty Stratton, Director of Developer Relations at Aiven, joins Corey on Screaming in the Cloud for a friendly debate on whether or not company employees can still be considered community members. Corey says no, but opens up his position to the slings and arrows of Matty in an entertaining change of pace. Matty explains why he feels company employees can still be considered community members, and also explores how that should be done in a way that is transparent and helpful to everyone in the community. Matty and Corey also explore the benefits and drawbacks of talented community members becoming employees.

About Matty

Matty Stratton is the Director of Developer Relations at Aiven, a well-known member of the DevOps community, founder and co-host of the popular Arrested DevOps podcast, and a global organizer of the DevOpsDays set of conferences.

Matty has over 20 years of experience in IT operations and is a sought-after speaker internationally, presenting at Agile, DevOps, and cloud engineering focused events worldwide. Demonstrating his keen insight into the changing landscape of technology, he recently changed his license plate from DEVOPS to KUBECTL.

He lives in Chicago and has three awesome kids, whom he loves just a little bit more than he loves Diet Coke.

Links Referenced:
Aiven: https://aiven.io/
Twitter: https://twitter.com/mattstratton
Mastodon: hackyderm.io/@mattstratton
LinkedIn: https://www.linkedin.com/in/mattstratton/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. 
This is Screaming in the Cloud.

Corey: This episode is brought to us in part by our friends at Min.io.

With more than 1.1 billion Docker pulls - most of which were not due to an unfortunate loop mistake, like the kind I like to make - and more than 37 thousand GitHub stars (which are admittedly harder to get wrong), MinIO has become the industry standard alternative to S3. It runs everywhere - public clouds, private clouds, Kubernetes distributions, bare metal, Raspberry Pis, colocations - even in AWS Local Zones. The reason people like it comes down to its simplicity, scalability, enterprise features and best in class throughput. Software-defined and capable of running on almost any hardware you can imagine and some you probably can't, MinIO can handle everything you can throw at it - and AWS has imagined a lot of things - from data lakes to databases.

Don't take their word for it though - check it out at www.min.io and see for yourself. That's www.min.io

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined today by returning guest, my friend and yours, Matty Stratton, Director of Developer Relations at Aiven. Matty, it's been a hot second. How are you?

Matty: It has been a while, but been pretty good. We have to come back to something that just occurred to me when we think about the different things we've talked about. There was a point of contention about prior art of the Corey Quinn face and photos. I don't know if you saw that discourse; we may have to have a conversation. There may be some absent—

Corey: I did not see—

Matty: Okay.

Corey: —discourse, but I also would accept freely that I am not the first person to ever come up with the idea of opening my mouth and looking ridiculous for a photograph either.

Matty: That's fair, but the thing that I think was funny—and if you don't mind, I'll just go ahead and throw this out here—is that I didn't put this two and two together. 
So, I posted a picture on Twitter a week or so ago that was primarily to show off the fact—it was a picture of me in 1993, and the point was that my jeans were French-rolled and were pegged. But in the photo, I am doing kind of the Corey Quinn face and so people said, “Oh, is this prior art?” And I said—you know what? I actually just remembered and I've never thought about this before, but one of my friends in high school, for his senior year ID he took a picture—his picture looks like, you know, that kind of, you know, three-quarters turn with the mouth opening going, “Ah,” you know?And he loved that picture—number one, he loved that picture so much that this guy carried his senior year high school ID in his wallet until we were like 25 because it was his favorite picture of himself. But every photo—and I saw this from looking through my yearbook of my friend Jay when we are seniors, he's doing the Corey Quinn face. And he is anecdotally part of the DevOps community, now a little bit too, and I haven't pointed this out to him. But people were saying that, you know, mine was prior art on yours, I said, “Actually, I was emulating yet someone else.”Corey: I will tell you the actual story of how it started. It was at re:Invent, I want to say 2018 or so, and what happened was is someone, they were a big fan of the newsletter—sort of the start of re:Invent—they said, “Hey, can I get a selfie with you?” And I figured, sure, why not. And the problem I had is I've always looked bad in photographs. And okay, great, so if I'm going to have a photo taken of me, that's going to be ridiculous, why not as a lark, go ahead and do this for fun during the course of re:Invent this year?So, whenever I did that I just slapped—if someone asked for a selfie—I'd slap the big happy open mouth smile on my face. And people thought, “Oh, my God, this is amazing.” And I don't know that it was necessarily worth that level of enthusiasm, but okay. I'll take it. 
I'm not here to tell people they're wrong when they enjoy a joke that I'm putting out there.And it just sort of stuck. And I think the peak of it that I don't think I'm ever going to be able to beat is I actually managed to pull that expression on my driver's license.Matty: Wow.Corey: Yeah.Matty: That's—Corey: They don't have a sense of humor that they are aware of at the DMV.Matty: No, they really don't. And having been to the San Francisco DMV and knowing how long it takes to get in there, like, that was a bit of a risk on your part because if they decided to change their mind, you wouldn't be able to come back for another four months [laugh].Corey: It amused me to do it, so why not? What else was I going to do? I brought my iPad with me, it has cellular on it, so I just can work remotely from there. It was either that or working in my home office again, and frankly, at the height of the pandemic, I could use the break.Matty: Yes [laugh]. That's saying something when the break you can use is going to the DMV.Corey: Right.Matty: That's a little bit where we were, where we at. I think just real quick thinking about that because there's a lot to be said with that kind of idea of making a—whether it's silly or not, but having a common, especially if you do a lot of photos, do a lot of things, you don't have to think about, like, how do I look? I mean, you have to think about—you know, you can just say I just know what I do. Because if you think about it, it's about cultivating your smile, cultivating your look for your photos, and just sort of having a way so you don't—you just know what to do every time. I guess that's a, you know, maybe a model tip or something. I don't know. But you might be onto something.Corey: I joke that my entire family motto is never be the most uncomfortable person in the room. And there's something to be said for it where if you're going to present a certain way, make it your own. Find a way to at least stand out. 
If nothing else, it's a bit different. Most people don't do that.Remember, we've all got made fun of, generally women—for some reason—back about 15 years ago or so for duck face, where in all the pictures you're making duck face. And well, there are reasons why that is a flattering way to present your face. But if there's one thing we love as a society, it's telling women they're doing something wrong.Matty: Yeah.Corey: So yeah, there's a whole bunch of ways you're supposed to take selfies or whatnot. Honestly, I'm in no way shape or form pretty enough or young enough to care about any of them. At this point, it's what I do when someone busts out a camera and that's the end of it. Now, am I the only person to do this? Absolutely not. Do I take ownership of it? No. Someone else wants to do it, they need give no credit. The idea probably didn't come from me.Matty: And to be fair, if I'm little bit taking the mickey there or whatever about prior art, it was more than I thought it was funny because I had not even—it was this thing where it was like, this is a good friend of mine, probably some of that I've been friends with longer than anyone in my whole life, and it was a core part [laugh] of his personality when we were 18 and 19, and it just d—I just never direct—like, made that connection. And then it happened to me and went “Oh, my God. Jason and Corey did the same thing.” [laugh]. It was—Corey: No, it feels like parallel evolution.Matty: Yeah, yeah. It was more of me never having connected those dots. And again, you're making that face for your DMV photo amused you, me talking about this for the last three minutes on a podcast amused me. So.Corey: And let's also be realistic here. How many ways are there to hold your face during a selfie that is distinguishable and worthy of comment? Usually, it's like okay, well, he has this weird sardonic half-smile with an eyebrow ar—no. His mouth was wide open. 
We're gonna go with that.

Matty: You know, there's a little—I want to kind of—because I think there's actually quite a bit to the lesson from any of this because I think about—follow me here; maybe I'll get to the right place—like me and karaoke. No one would ever accuse me of being a talented singer, right? I'm not going to sing well in a way where people are going to be moved by my talent. So instead, I have to go a different direction. I have to go funny.

But what it boils down to is I can only do—I do karaoke well when it's a song where I can feel like I'm doing an impression of the singer. So, for example, the B-52s. I do a very good impression of Fred Schneider. So, I can sing a B-52s song all day long. I actually could do better with Pearl Jam than I should be able to with my terrible voice because I'm doing an Eddie Vedder impression.

So, what I'm getting at is you're sort of taking this thing where you're saying, okay, to your point, you said, “Hey,”—and your words, not mine—[where 00:07:09] somebody say, “The picture is not going to be of me looking like blue steel runway model, so I might as well look goofy.” You know? And take it that way and be funny with it. And also, every time, it's the same way, so I think it's a matter of kind of owning the conversation, you know, and saying, how do you accentuate the thing that you can do. I don't know. There's something about DevOps, somehow in there.

Corey: So, I am in that uncomfortable place right now between having finalized a blog post slash podcast that's going out in two days from this recording. So, it will go out before you and I have this discussion publicly, but it's also too late for me to change any of it, so I figured I will open myself up to the slings and arrows of you, more or less. And you haven't read this thing yet, which is even better, so you're now going to be angry about an imperfect representation of what I said in writing. 
But the short version is this: if you work for a company as their employee, then you are no longer a part of that company's community, as it were. And yes, that's nuanced and it's an overbroad statement and there are a bunch of ways that you could poke holes in it, but I'm curious to get your take on the overall positioning of it.

Matty: So, at face value, I would vehemently disagree with that statement. And by that is, that I have spent years of my life tilting at the opposite windmill, which is just because you work at this company, doesn't mean you do not participate in the community and should not consider yourself a part of the community, first and foremost. That will, again, like everything else, it depends. It depends on a lot of things and I hope we can kind of explore that a little bit because just as much as I would take umbrage if you will, or whatnot, with the statement that if you work at the company, you stop being part of the community, I would also have an issue with, you're just automatically part of the community, right? Because these things take effort.

And I feel like I've been as a devreloper, or whatever, Corey—how do you say it?

Corey: Yep. No, you're right on. Devreloper.

Matty: As a—or I would say, as a DevRel, although people on Twitter are angry about using the word DevRel to discuss—like saying, “I'm a DevRel.” “DevRel is a department.” It's a DevOps engineer thing again, except actually—it's, like, actually wrong. But anyway, you kind of run into this, like for example—I'm going to not name names here—but, like, to say, you know, Twitter for Pets, the—what do you—by the way, Corey, what are you going to do now for your made-up company when Twitter is not fun for this anymore? You can't have Twitter for Pets anymore.

Corey: I know, I'm going to have to come up with a new joke. I don't quite know what to do with myself.

Matty: This is really hard.
Well, we will pretend Twitter for Pets is still around a little bit, even though its API is getting shut down.

Corey: Exactly.

Matty: So okay, so we're over here at Twitter for Pets, Inc. And we've got our—

Corey: Twitter for Bees, because you know it'll at least have an APIary.

Matty: Yeah. Ha. We have our team of devrelopers and community managers and stuff and community engineers that work at Twitter for Pets, and we have all of our software engineers and different people. And a lot of times the assumption—and now we're going to have Twitter for Pets community something, right? We have our community, we have our area, our place that we interact, whether it's in person, it's virtual, whether it's an event, whether it's our Discord or Discourse or Slack or whatever [doodlee 00:10:33] thing we're doing these days, and a lot of times, all those engineers and people whose title does not have the word ‘community' on it are like, “Oh, good. Well, we have people that do that.”

So, number one, no because now we have people whose priority is it; like, we have more intentionality. So, if I work on the community team, if I'm a dev advocate or something like that, my priority is communicating and advocating to and for that community. But it's like a little bit of the, you know, the Office Space, I take the requirements from the [unintelligible 00:11:07] to people, you know, I give them to the engineers. I've got people—so like, you shouldn't have to have a go-between, right?
And there's actually quite a bit of place.

So, I think, this sort of assumption that you're not part of it and you have no responsibility towards that community, first of all, you're missing a lot as a person because that's just how you end up with people building a thing they don't understand.

Corey: Oh, I think you have tremendous responsibility to the community, but whether you're a part of it and having responsibility to it are not aligned in my mind.

Matty: So… maybe let's take a second and what do you mean by being a part of it?

Corey: Right. Where very often I'll see a certain, I don't know, very large cloud provider will have an open-source project. Great, so you go and look at the open-source project and the only people with commit access are people who work at that company. That is an easy-to-make-fun-of example of this. Another is when the people who are in a community and talking about how they perceive things and putting out content about how they've interacted with various aspects of it start to work there, you see areas where it starts to call its authenticity into question.

AWS is another great example of this. As someone in the community, I can talk about how I would build something on top of AWS, but then move this thing on to Fastly instead of CloudFront because CloudFront is terrible. If you work there, you're not going to be able to say the same thing. So, even if you're not being effusive with praise, there are certain guardrails and constraints that keep you from saying what you might otherwise, just based upon the sheer self-interest that comes from the company whose product or service you're talking about is also signing your paycheck and choosing to continue to do so.

Matty: And I think even less about it because that's where your paycheck is coming. It's also just a—there's a gravitational pull towards those solutions because that's just what you're spending your day with, right? You know—

Corey: Yeah.
And you also don't want to start and admit even to yourself, in some cases, that okay, this aspect of what our company does is terrible, so companies—people shouldn't use it. You want to sort of ignore that, on some level, psychologically because that dissonance becomes harmful.

Matty: Yeah. And I think there's—so again, this is where things get nuanced and get to levels. Because if you have the right amount of psychological safety in your organization, the organization understands what it's about to that. Because even people whose job is to be a community person should be able to say, “Hey, this is my actual opinion on this. And it might be contrary to the go-to-market where that comes in.”

But it's hard, especially when it gets filtered through multiple layers and now you've got a CEO who doesn't understand that nuance who goes, “Wait, why was Corey on some podcast saying that the Twitter for Pets API is not everything it could possibly be?” So, I do think—I will say this—I do think that organizations and leadership are understanding this more than they might have in the past, so we are maybe putting on ourselves this belief that we can't be as fully honest, but even if it's not about hiding the warts, even if it's just a matter of also, you're just like, hey, chances are—plus also to be quite frank, if I work at the company, I probably have access to way more shit than I would have to pay for or do whatever and I know the right way. But here's the trick, and I won't even say it's a dogfooding thing, but if you are not learning and thinking about things the way that your users do—and I will even say that that's where—it is the users, which are the community, that community or the people that use your product or are connected to it, they don't use it; they may be anecdotal—or not anecdotally, maybe tangentially connected. I will give an example.
And there was a place I was working where it was very clear, like, we had a way to you know, do open-source contributions back of a type of a provider plug-in, whatever you want to call it and I worked at the company and I could barely figure out how to follow the instructions.

Because it made a lot of sense to someone who built that software all day long and knew the build patterns, knew all that stuff. So, if you were an engineer at this company, “Well, yeah, of course. You just do this.” And anybody who puts the—connects the dots, this has gotten better—and this was understood relatively quickly as, “Oh, this is the problem. Let's fix it.” So, the thing is, the reason why I bring this up is because it's not something anybody does intentionally because you don't know what you don't know. And—

Corey: Oh, I'm not accusing anyone of being a nefarious actor in any of this. I also wonder if part of this comes from your background as being heavily involved in the Chef community as a Chef employee and as part of the community around that, which is inherently focused on an open-source product that a company has been built around, whereas my primary interaction with community these days is the AWS community, where it doesn't matter whether you're large or small, you are not getting much, if anything, for free from AWS; you're all their customers and you don't really have input into how something gets built, beyond begging nicely.

Matty: That's definitely true. And I think we saw that and there was things, when we look at, like, how community, kind of, evolved or just sort of happened at Chef and why we can't recreate it the same way is there was a certain inflection point of the industry and the burgeoning DevOps movement, and there wasn't—you know, so a lot of that was there.
But one of the big problems, too, is, as Corey said, everybody—I shouldn't say every, but I've from the A—all the way up to AWS to your smaller startups will have this problem of where you end up hiring in—whether you want to or not—all of your champions and advocates and your really strong community members, and then that ends up happening. So, number one, that's going to happen. So frankly, if you don't push towards this idea, you're actually going to have people not want to come work because you should be able to be still the member that you were before.

And the other thing is that at certain size, like, at the size of a hyperscaler, or, you know, a Microsoft—well, anybody—well Microsoft's not a hyperscaler, but you know what I'm saying. Like, very, very large organization, your community folks are not necessarily the ones doing that hiring away. And as much as they might—you know, and again, I may be running the community champion program at Microsoft and see that you want—you know, but that Joe Schmo is getting hired over into engineering. Like, I'm not going to hire Joe because it hurts me, but I can't say you can't, you know? It's so this is a problem at the large size.

And at the smaller size, when you're growing that community, it happens, too, because it's really exciting. When there's a place that you're part of that community, especially when there's a strong feel, like going to work for the mothership, so to speak is, like, awesome. So again, to give an example, I was a member of the Chef community, I was a user, a community person well, before, you know, I went and, you know, had a paycheck coming out of that Seattle office. And it was, like, the coolest thing in the world to get a job offer from Ch—like, I was like, “Oh, my God.
I get to actually go work there now.” Right?

And when I was at Pulumi, there were quite a few people I could think of who I knew through the community who then got jobs at Pulumi and were so excited, and I imagine still excited, you know? I mean, that was awesome to do. So, it's hard because when you get really excited about a technology, then being able to say, “Wait, I can work on this all the time?” That sounds awesome, right? So like, you're going to have that happen.

So, I think what you have to do is rather than prevent it from happening because number one, like, you don't want to actually prevent that from happening because those people will actually be really great additions to your organization in lots of ways. Also, you're not going to stop it from happening, right? I mean, it's also just a silly way to do it. All you're going to do is piss people off, and say, like, “Hey, you're not allowed to work here because we need you in the community.” Then they're going to be like, “Great. Well, guess what I'm not a part of anymore now, jerk?” Right? You know [laugh] I mean so—

Corey: Exactly.

Matty: Your [unintelligible 00:18:50] stops me. So, that doesn't work. But I think to your point, you talked about, like, okay, if you have a, ostensibly this a community project, but all the maintainers are from one—are from your company, you know? Or so I'm going to point to an example of, we had—you know, this was at Pulumi, we had a Champions program called Puluminaries, and then there's something similar to like Vox Populi, but it was kind of the community that was not run by Pulumi Inc. in that case.

Now, we helped fund it and helped get it started, but there were rules about the, you know, the membership of the leadership, steering committee or board or whatever it was called, there was a hard limit on the number of people that could be Pulumi employees who were on that board.
And it actually, as I recall when I was leaving—I imagine this is not—[unintelligible 00:19:41] does sometimes have to adjust a couple of things because maybe those board members become employees and now you have to say, you can't do that anymore or we have to take someone down. But the goal was to actually, you know, basically have—you know, Pulumi Corp wanted to have a voice on that board because if for no other reason, they were funding it, but it was just one voice. It wasn't even a majority voice. And that's a hard sell in a lot of places too because you lose control over that.

There's things I know with, uh—when I think about, like, running meetup communities, like, we might be—well I mean, this is not a big secret, I mean because it's been announced, but we're—you know, Aiven is helping bootstrap a bunch of data infrastructure meetups around the world. But they're not Aiven meetups. Now, we're starting them because they have to start, but pretty much our approach is, as soon as this is running and there's people, whether they work here, work with us or not, they can take it, right? Like, if that's go—you know? And being able to do that can be really hard because you have to relinquish the control of your community.

And I think you don't have to relinquish a hundred percent of that control because you're helping facilitate it because if it doesn't already have its own thing—to make sure that things like code of conduct and funding of it, and there's things that come along with the okay, we as an organization, as a company that has dollars and euros is going to do stuff for this, but it's not ours. And that's the thing to remember is that your community does not belong to you, the company. You are there to facilitate it, you are there to empower it, you're there to force-multiply it, to help protect it.
And yeah, you will probably slurp a whole bunch of value out of it, so this is not magnanimous, but if you want it to actually be a place it's going to work, it kind of has to be what it wants to be. But by the same token, you can't just sort of sit there and be like, “I'm going to wait for this community to grow up around me without anything”—you know.

So, that's why you do have to start one if there is quote-unquote—maybe if there's no shape to one. But yeah, I think that's… it is different when it's something that feels a little—I don't even want to say that it's about being open-source. It's a little bit about it less of it being a SaaS or a service, or if it's something that you—I don't know.

Corey: This episode is sponsored in part by Honeycomb. I'm not going to dance around the problem. Your. Engineers. Are. Burned. Out. They're tired from pagers waking them up at 2 am for something that could have waited until after their morning coffee. Ring Ring, Who's There? It's Nagios, the original call of duty! They're fed up with relying on two or three different “monitoring tools” that still require them to manually trudge through logs to decipher what might be wrong. Simply put, there's a better way. Observability tools like Honeycomb (and very little else because they do admittedly set the bar) show you the patterns and outliers of how users experience your code in complex and unpredictable environments so you can spend less time firefighting and more time innovating. It's great for your business, great for your engineers, and, most importantly, great for your customers. Try FREE today at honeycomb.io/screaminginthecloud. That's honeycomb.io/screaminginthecloud.

Corey: Yeah, I think you're onto something here. I think another aspect where I found it to be annoying is when companies view their community as, let's hire them all. And I don't think it ever starts that way.
I think that it starts as, well these are people who are super-passionate about this, and they have great ideas and they were great to work with. Could we hire them?

And the answer is, “Oh, wait. You can give me money for this thing I've been doing basically for free? Yeah, sure, why not?” And that's great in the individual cases. The problem is, at some point, you start to see scenarios where it feels like, if not everyone, then a significant vocal majority of the community starts to work there.

Matty: I think less often than you might think is it done strategically or on purpose. There have been exceptions to that. There's one really clear one where it feels like a certain company a few years ago, hired up all the usual suspects of the DevOps community. All of a sudden, you're like, oh, a dozen people all went to go work at this place all at once. And the fun thing is, I remember feeling a little bit—got my nose a little out of joint because I was not the hiring mana—like, I knew the people.

I was like, “Well, why didn't you ask me?” And they said, “Actually, you are more important to us not working here.” Now, that might have just been a way to sell my dude-in-tech ego or not, but whether or not that was actually true for me or not, that is a thing where you say you know, your folks—but I do think that particular example of, like, okay, I'm this, that company, and I'm going to go hire up all the usual suspects, I think that's less. I think a lot of times when you see communities hire up those people, it's not done on purpose and in fact, it's probably not something they actually wanted to do en masse that way.
But it happens because people who are passionate about your product, it's like I said before, it actually seems pretty cool to go work on it as your main thing.

But I can think of places I've been where we had, you know—again, same thing, we had a Pulumi—we had someone who was probably our strongest, loudest, most vocal community member, and you know, I really wanted to get this person to come join us and that was sort of one of the conversations. Nobody ever said, “We won't offer this person a job if they're great.” Like, that's the thing. I think that's actually kind of would be shitty to be like, “You're a very qualified individual, but you're more important to me out in the community so I'm not going to make you a job offer.” But it was like, Ooh, that's the, you know—it'd be super cool to have this person but also, not that that should be part of our calculus of decision, but then you just say, what do you do to mitigate that?

Because what I'm concerned about is people hearing this the wrong way and saying, “There's this very qualified individual who wants to come work on my team at my company, but they're also really important to our community and it will hurt our community if they come work here, so sorry, person, we're not going to give you an opportunity to have an awesome job.” Like, that's also thinking about the people involved, too. But I know having talked to folks that lots of these different large organizations that have this problem, generally, those community folks, especially at those places, they don't want this [laugh] happening. They get frustrated by it. So, I mean, I'll tell you, it's you know, the—AWS is one of them, right?

They're very excited about a lot of the programs and cool people coming from community builders and stuff and Heroes, you know.
On one hand, it's incredibly awesome to have a Hero come work at AWS, but it hurts, right, because now they're not external anymore.

Corey: And you stop being a Hero in that case, as well.

Matty: Yeah. You do, yeah.

Corey: Of course, they also lose the status if they go to one of their major competitors. So like, let me get this straight. You can't be a Hero if you work for AWS or one of its competitors. And okay, how are there any Heroes left at all at some point? And the answer is, they bound it via size and a relatively small list of companies. But okay.

Matty: So, thinking back to your point about saying, okay, so if you work at the company, you lose some authenticity, some impartiality, some, you know… I think, rather than just saying, “Well, you're not part”—because that also, honestly, my concern is that your blog post is now going to be ammunition for all the people who don't want to act as members of the community for the company they work for now. They're going to say, well, Corey told me I don't have to. So, like I said, I've been spending the last few years tilting at the opposite windmill, which is getting people that are not on the community team to take part in community summits and discourse and things like that, like, you know, for that's—so I think the thing is, rather than saying, “Well, you can't,” or, “You aren't,” it's like, “Well, what do you do to mitigate those things?”

Corey: Yeah, it's a weird thing because taking AWS as the example that I've been beating up on a lot, the vast majority of their employees don't know the community exists in any meaningful sense. Which, no fault to them. The company has so many different things, no one keeps up with it all. But it's kind of nuts to realize that there are huge communities of people out there using a thing you have built and you do not know that those users exist and talk to each other in a particular watering hole. And you of course, as a result, have no presence there.
I think that's the wrong direction, too. But—

Matty: Mm-hm.

Corey: Observing the community and being part of the community, I think there's a difference. Are you a biologist or are you a gorilla?

Matty: Okay, but [sigh] I guess that's sort of the difference, too which—and it's hard, it's very hard to not just observe. Because I think that actually even taking the mentality of, “I am here to be Jane Goodall, Dr. Jane Goodall, and observe you while I live amongst you, but I'm not going to actually”—although maybe I'm probably doing disservice—I'm remembering my Goodall is… she was actually more involved. May be a bad example.

Corey: Yeah. So, that analogy does fall apart a little bit.

Matty: It does fall apart a little bit—

Corey: Yeah.

Matty: But it's you kind of am I sitting there taking field notes or am I actually engaging with you? Because there is a difference. Even if your main reason for being there is just purely to—I mean, this is not the Prime Directive. It's not Star Trek, right? You're not going to like, hold—you don't need to hold—I mean, do you have to hold yourself aloof and say, “I don't participate in this conversation; I'm just here to take notes?”

I think that's very non-genuine at that point. That's over-rotating the other way. But I think it's a matter of in those spaces—I think there's two things. I think you have to have a way to be identified as you are an employee because that's just disclosure.

Corey: Oh, I'm not suggesting by any stretch of the imagination, people work somewhere but not admit that they work somewhere when talking about the company. That's called fraud.

Matty: Right. No, no, and I don't think it's even—but I'm saying beyond just, if it's not, if you're a cop, you have to tell me, right?

Corey: [laugh].

Matty: It's like, it's not—if asked, I will tell you I work at AWS. It's like in that place, it should say, “I am an AWS em—” like, I should be badged that way, just so it's clear. I think that's actually helpful in two ways.
It's also helpful because it says like, okay, maybe you have a connection you can get for me somehow. Like, you might actually have some different insight or a way to chase something that, you know, it's not necessarily just about disclosure; it's also helpful to know.

But I think within those spaces, that disclosure—or not disclosure, but being an employee does not offer you any more authority. And part of that is just having to be very clear about how you're constructing that community, right? And that's sort of the way that I think about it is, like, when we did the Pulumi Community Summit about a year ago, right? It was an online, you know, thing we did, and the timing was such that we didn't have a whole lot of Pulumi engineers were able to join, but when we—and it's hard to say we're going to sit in an open space together and everybody is the same here because people also—here's the difference. You say you want this authority? People will want that authority from the people that work at the company and they will always go to them and say, like, “Well, you should have this answer. Can you tell me about this? Can you do this?”

So, it's actually hard on both cases to have that two-way conversation unless you set the rules of that space such as, “Okay, I work at Aiven, but when I'm in this space, short of code of conduct or whatever, if I have to be doing that thing, I have no more authority on this than anyone else.” I'm in this space in the same way everyone else is. You can't let that be assumed.

Corey: Oh, and big companies do. It's always someone else's… there's someone else's department. Like, at some level, it feels like when you work in one of those enormous orgs, it's your remit is six inches wide.

Matty: Well, right. Right. So, I think it's like your authority exists only so far as it's helpful to somebody. If I'm in a space as an Aivener, I'm there just as Matty the person.
But I will say I work at Aiven, so if you're like, “God, I wish that I knew who was the person to ask about this replication issue,” and then I can be like, “Aha, I actually have backchannel. Let me help you with that.” But if I can say, “You know what? This is what I think about Kafka and I think why this is whatever,” like, you can—my opinion carries just as much weight as anybody else's, so to speak. Or—

Corey: Yeah. You know, it's also weird. Again, community is such a broad and diverse term, I find myself in scenarios where I will observe and talk to people inside AWS about things, but I never want to come across as gloating somehow, that oh, I know internal people that talk to you about this and you don't. Like, that's never how I want to come across. And I also, I never see the full picture; it's impossible for me to, so I never make commitments on behalf of other people. That's a good way to get in trouble.

Matty: It is. And I think in the case of, like, someone like you who's, you know, got the connections you have or whatever, it's less likely for that to be something that you would advertise for a couple of reasons. Like, nobody should be advertising to gloat, but also, part of my remit as a member of a community team is to actually help people. Like, you're doing it because you want to or because it serves you in a different way. Like, that is literally my job.

So like, it shouldn't be, like—like, because same thing, if you offer up your connections, now you are taking on some work to do that. Someone who works at the company, like, yes, you should be taking on that work because this is what we do. We're already getting paid for it, you know, so to speak, so I think that's the—

Corey: Yeah.

Matty: —maybe a nuance, but—

Corey: Every once in a while, I'll check my Twitter spam graveyard, [unintelligible 00:32:01] people asking me technical questions months ago about various things regarding AWS and whatnot.
And that's all well and good; the problem I have with it is that I'm not a support vector. I don't represent the company or work for them. Now, if I worked there, I'd feel obligated to make sure this gets handed to the right person. And that's important.

The other part of it, though, is okay, now that that's been done and handed off, like do I shepherd it through the process? Eh. I don't want people to get used to asking people in DMs because again, I consider myself to be a nice guy, but if I'm some nefarious jerk, then I could lead them down a very dark path where I suddenly have access to their accounts. And oh, yeah, go ahead and sign up for this thing and I'll take over their computer or convince them to pay me in iTunes gift cards or something like that. No, no, no. Have those conversations in public or through official channels, just because I don't, I don't think you want to wind up in that scenario.

Matty: So, my concern as well, with sort of taking the tack of you are just an observer of the community, not a part of it is, that actually can reinforce some pretty bad behavior from an organization towards how they treat the community. One of the things that bothers me—if we're going to go on a different rant about devrelopers like myself—is I like to say that, you know, we pride ourselves as DevRels as being very empathetic and all this stuff, but very happy to shit all over people that work in sales or marketing, based on their job title, right? And I'm like, “Wow, that's great,” right? We're painting with this broad brush. Whereas in reality, we're not separate from.

And so, the thing is, when you treat your community as something separate from you, you are treating it as something separate from you. And then it becomes a lot easier also, to not treat them like people and treat them as just a bunch of numbers and treat them as something to have value extracted from rather than it—this is actually a bunch of humans, right?
And if I'm part of that, then I'm in the same Dunbar number a little bit, right? I'm in the same monkey sphere as those people because me, I'm—whoever; I'm the CTO or whatever, but I'm part of this community, just like Joe Smith over there in Paducah, you know, who's just building things for the first time. We're all humans together, and it helps to not treat it as the sort of amorphous blob of value to be extracted.

So, I think that's… I think all of the examples you've been giving and those are all valid concerns and things to watch out for, the broad brush if you're not part of the community if you work there, my concern is that that leads towards exacerbating already existing bad behavior. You don't have to convince most of the people that the community is separate from them. That's what I'm sort of getting at. I feel like in this work, we've been spending so much time to try to get people to realize they should be acting like part of their larger community—and also, Corey, I know you well enough to know that, you know, sensationalism to make a point [laugh] works to get somebody to join—

Corey: I have my moments.

Matty: Yeah, yeah, yeah. I mean, there's I think… I'll put it this way. I'm very interested to see the reaction, the response that comes out in, well now, for us a couple of days, for you the listener, a while ago [laugh] when that hits because I think it is a, I don't want to say it's controversial, but I think it's something that has a lot of, um… put it this way, anything that's simple and black and white is not good for discussion.

Corey: It's nuanced. And I know that whatever I wrote in 1200 words is not going to be as nuanced as the conversation we just had, either, so I'm sure people will have opinions on it. That'd be fun. It'd be a good excuse for me to listen.

Matty: Exactly [laugh].
And then we'll have to remember to go back and find—I'll have to do a little Twitter search for the dates.

Corey: We'll have to do another discussion on this, if anything interesting comes out of it.

Matty: Actually, that would be funny. That would be—we could do a little recap.

Corey: It would. I want to thank you so much for being so generous with your time. Where can people find you if they want to learn more?

Matty: Well, [sigh] for the moment, [sigh] who knows what will be the case when this comes out, but you can still find me on Twitter at @mattstratton. I'm also at hackie-derm dot io—sorry, hackyderm.io. I keep wanting to say hackie-derm, but hackyderm actually works better anyway and it's funnier. But [hackyderm.io/@mattstratton](https://hackyderm.io/@mattstratton) is my Mastodon. LinkedIn; I'm around there. I need to play more at that. You will—also again, I don't know when this is coming out, so you won't tell you—you don't find me out traveling as much as you might have before, but DevOpsDays Chicago is coming up August 9th and 10th in Chicago, so at the time of listening to this, I'm sure our program will have been posted. But please come and join us. It will be our ninth time of hosting a DevOpsDay Chicago. And I have decided I'm sticking around for ten, so next year will be my last DevOpsDay that I'm running. So, this is the penultimate. And we always know that the penultimate is the best.

Corey: Absolutely. Thanks again for your time. It's appreciated. Matty Stratton, Director of Developer Relations at Aiven. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment talking about how I completely missed the whole point of this community and failing to disclose that you are in fact one of the producers of the show.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Changelog Master Feed
Rust efficiencies at AWS scale (Ship It! #89)

Feb 16, 2023 63:31


Tim McNamara is known as New Zealand's Rust guy. He is the author of Rust in Action, and also a Senior Software Engineer at AWS, where he helps other builders with all things Rust. The main reason why Gerhard is intrigued by Rust is the incredible resource frugality. Fewer CPUs means less energy used, which is good for the planet, and good for the monthly bill. This becomes most noticeable at Amazon's scale, when S3, Lambda, CloudFront and other services start adding Rust components.

Ship It! DevOps, Infra, Cloud Native
Rust efficiencies at AWS scale

Feb 16, 2023 63:31


Tim McNamara is known as New Zealand's Rust guy. He is the author of Rust in Action, and also a Senior Software Engineer at AWS, where he helps other builders with all things Rust. The main reason why Gerhard is intrigued by Rust is the incredible resource frugality. Fewer CPUs means less energy used, which is good for the planet, and good for the monthly bill. This becomes most noticeable at Amazon's scale, when S3, Lambda, CloudFront and other services start adding Rust components.

The Dan Rayburn Podcast
Episode 49: Key Takeaways From Earnings Recap of Google, Amazon, Apple, Comcast, Verizon

Feb 6, 2023 44:57


This week we highlight the key numbers you need to know from Q4 2022 earnings results from Google (YouTube revenue growth declined), Amazon (AWS revenue growth of 20%), Comcast (now has 20M Peacock subs but lost $978M), Apple (revenue down 5% y/o/y), Verizon (lost 80,000 pay TV subs), Charter (lost 145,000 pay TV subs) and additional news around Netflix, Paramount+, World Cup, Liliac Cloud and Harmonic. Companies and services mentioned: Netflix, Comcast, Apple, YouTube, Amazon, Peacock, Paramount+, Showtime, World Cup, F5, Struum, Liliac Cloud, Verizon, Charter, Harmonic, AWS, Prime Video, CloudFront.

Questions or feedback? Contact: dan@danrayburn.com

cloudonaut
#60 [Hot off the Cloud] AppSync JavaScript Resolvers + IAM MFA + CloudFront CD

Nov 23, 2022 30:48


Two brothers discussing all things AWS every week. Hosted by Andreas and Michael Wittig presented by cloudonaut.

Screaming in the Cloud
The Quest to Make Edge Computing a Reality with Andy Champagne

Nov 10, 2022 46:56


About Andy

Andy is on a lifelong journey to understand, invent, apply, and leverage technology in our world. Both personally and professionally technology is at the root of his interests and passions.

Andy has always had an interest in understanding how things work at their fundamental level. In addition to figuring out how something works, the recursive journey of learning about enabling technologies and underlying principles is a fascinating experience which he greatly enjoys.

The early Internet afforded tremendous opportunities for learning and discovery. Andy's early work focused on network engineering and architecture for regional Internet service providers in the late 1990s – a time of fantastic expansion on the Internet.

Since joining Akamai in 2000, Akamai has afforded countless opportunities for learning and curiosity through its practically limitless globally distributed compute platform. Throughout his time at Akamai, Andy has held a variety of engineering and product leadership roles, resulting in the creation of many external and internal products, features, and intellectual property.

Andy's role today at Akamai – Senior Vice President within the CTO Team - offers broad access and input to the full spectrum of Akamai's applied operations – from detailed patent filings to strategic company direction. Working to grow and scale Akamai's technology and business from a few hundred people to roughly 10,000 with a world-class team is an amazing environment for learning and creating connections.

Personally Andy is an avid adventurer, observer, and photographer of nature, marine, and astronomical subjects. Hiking, typically in the varied terrain of New England, with his family is a common endeavor.
He enjoys compact/embedded systems development and networking with a view towards their applications in drone technology.

Links Referenced:
Macrometa: https://www.macrometa.com/
Akamai: https://www.akamai.com/
LinkedIn: https://www.linkedin.com/in/andychampagne/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH.

Basically you're SSHing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive?

Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscale.

Corey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's a spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy.
No matter your cloud provider, get going for free at gomomento.co/screaming. That's GO M-O-M-E-N-T-O dot co slash screaming.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I like doing promoted guest episodes like this one. Not that I don't enjoy all of my promoted guest episodes. But every once in a while, I generally have the ability to wind up winning an argument with one of my customers. Namely, it's great to talk to you folks, but why don't you send me someone who doesn't work at your company? Maybe a partner, maybe an investor, maybe a customer. And Macrometa, who's sponsoring this episode, said okay. My guest today is Andy Champagne, SVP at the CTO office at Akamai. Andy, thanks for joining me.

Andy: Thanks, Corey. Appreciate you having me. And appreciate Macrometa letting me come.

Corey: Let's start with talking about you, and then we'll get around to the Macrometa discussion in the fullness of time. You've been at Akamai for 22 years, which in tech company terms, it's like staying at a normal job for 75 years. What's it been like being in the same place for over two decades?

Andy: Yeah, I've got several gold watches. I've been retired twice. Nobody—you know, Akamai—so in the late-90s, I was in the ISP universe, right? So, I was in network engineering at regional ISPs, you know, kind of cutting teeth on, you know, trying to scale networks and deal with the flux of user traffic coming in from the growth of the web. And, you know, frankly, it wasn't working, right?

Companies were trying to scale up at the time by adding bigger and bigger servers, and buying literally, you know, servers, the size of refrigerators. And all of a sudden, there was this company that was coming together out in Cambridge, I'm from Massachusetts, and Akamai started in Cambridge, Massachusetts, still headquartered there. And Akamai was forming up and they had a totally different solution to how to solve this, which was amazing.
And it was compelling and it drew me there, and I am still there, 22-odd years in, trying to solve challenging problems.

Corey: Akamai is one of those companies that I often will describe to people who aren't quite as inclined in the network direction as I've been previously, as one of the biggest companies of the internet that you've never heard of. You are—the way that I think of you historically, I know this is not how you folks frame yourself these days, but I always thought of you as the CDN that you use when it really mattered, especially in the earlier days of the internet where there were not a whole lot of good options to choose from, and the failure mode that Akamai had when I was looking at it many years ago, is that, well, it feels enterprise-y. Well, what does that mean exactly because that's usually used as a disparaging term by any developer in San Francisco. What does that actually unpack to? And to my mind, it was, well, it was one of the more expensive options, which yes, that's generally not a terrible thing, and also that it felt relatively stodgy, for lack of a better term, where it felt like updating things through an API was more of a JSON API—namely a guy named Jason—who would take a ticket, possibly from Jira if they were that modern or not, and then implement it by hand. I don't believe that it is quite that bad these days because, again, this was circa 2012 that we're talking here. But how do you view what Akamai is and does in 2022?

Andy: Yeah. Awesome question. There's a lot to unpack in there, including a few clever jabs you threw in. But all good.

Corey: [laugh].

Andy: [laugh]. I think Akamai has been through a tremendous, tremendous series of evolutions on the internet. And really the one that, you know, we're most excited about today is, you know, earlier this year, we kind of concluded our acquisition of Linode.
And if we think about Linode, which brings compute into our platform, you know, ultimately Akamai today is a compute company that has a security offering and has a delivery offering as well. We do more security than delivery, so you know, delivery is kind of something that was really important during our first ten or twelve years, and security during the last ten, and we think compute during the next ten.

The great news there is that if you look at Linode, you can't really find a more developer-focused company than Linode. You essentially fall into a virtual machine, you may accidentally set up a virtual machine inadvertently it's so easy. And that is how we see the interface evolving. We see a compute-centric interface becoming standard for people as time moves on.

Corey: I'm reminded of one of those ancient advertisements, I forget, I think would have been Sun that put it out where the network is the computer or the computer is the network. The idea that a computer sitting by itself unplugged was basically just this side of useless, whereas a bunch of interconnected computers was incredibly powerful. That today in 2022 sounds like an extraordinarily obvious statement, but it feels like this is sort of a natural outgrowth of that, where, okay, you've wound up solving the CDN piece of it pretty effectively. Now, you're expanding out into, as you say, compute through the Linode acquisition and others, and the question I have is, is that because there's a larger picture that's currently unfolding, or is this a scenario where well, we nailed the CDN side of the world, well, on that side of the universe, there's no new worlds left to conquer. Let's see what else we can do. Next, maybe we'll start making toasters.

Andy: Bunch of bored guys in Cambridge, and we're just like, “Hey, let's go after compute. We don't know what we're doing.” No. There's a little bit more—

Corey: Exactly. “We have money and time.
Let's combine the two and see what we can come up with.”

Andy: [laugh]. Hey, folks, compute: it's the new thing. No, it's more than that. And you know, Akamai has a very long history with the edge, right? And Akamai started—and again, arrogantly saying, we invented the concept of the edge, right, out there in '99, 2000, deploying hundreds and then to thousands of different locations, which is what our CDN ran on top of.

And that was a really new, novel concept at the time. We extended that. We've always been flirting with what is called edge computing, which is how do we take pieces of application logic and move them from a centralized point and move them out to the edge. And I mean, cripes, if you go back and Google, like, ‘Akamai edge computing,' we were working on that in 2003, which is a bit like ancient history, right? And we are still on a quest.

And literally, we think about it in the company this way: we are on a quest to make edge computing a reality, which is how do you take applications that have centralized chokepoints? And how do you move as much of those applications as possible out to the edge of the network to unblock user performance and experience, and then see what folks developers can enable with that kind of platform?

Corey: For me, it seems that the rise of AWS—which is, by extension, the rise of cloud—has been, okay, you wind up building whatever you want for the internet and you stuff it into an AWS region, and oh, that's far away from your customers and/or your entire architecture is terrible so it has to make 20 different calls to the data center in series rather than in parallel. Great, how do we reduce the latency as much as possible? And their answer has largely seemed to be, ah, we'll build more regions, ever closer to you. One of these days, I expect to wake up and find that there's an announcement that they're launching a new region in my spare room here. It just seems to get closer and closer and closer.
You look around, and there's a cloud construction crew stalking you to the mall and whatnot. I don't believe that is the direction that the future necessarily wants to be going in.

Andy: Yeah, I think there's a lot there. And I would say it this way, which is, you know, having two-ish dozen uber-large data centers is probably not the peak technology of the internet, right? There's more we need to do to be able to get applications truly distributed. And, you know, just to be clear, I mean, Amazon AWS's done amazing stuff, they've projected phenomenal scale and they continue to do so. You know, but at Akamai, the problem we're trying to solve is really different than how do we put a bunch of stuff in a small number of data centers?

It's, you know, obviously, there's going to be a centralized aspect, but there also needs to be incredibly integrated and seamless, moves through a gradient of compute, where hey, maybe you're in a very large data center for your AI/ML, kind of, you know, offline data lake type stuff. And then maybe you're in hundreds of locations for mid-tier application processing, and, you know, reconciliation of databases, et cetera. And then all the way out at the edge, you know, in thousands of locations, you should be there for user interactivity. And when I say user interactivity, I don't just mean, you know, read-only, but you've got to be able to do a read-write operation in synchronous fashion with the edge. And that's what we're after is building ultimately a platform for that and looking at tools, technology, and people along the way to help us with it.

Corey: I've built something out, my lasttweetinaws.com threading Twitter client, and that's… it's fine. It's stateless, but it's a little too intricate to effectively run in the Lambda@Edge approach, so using their CloudFront offering is simply a non-starter.
So, in order to get low latency for people using it around the world, I now have to deploy it simultaneously to 20 different AWS regions.

And that is, to be direct, a colossal pain in the ass. No one is really doing stuff like that, that I can see. I had to build a whole lot of custom tooling just to get a CI/CD system up and working. Their strong regional isolation is great for containing blast radii, but obnoxious when you're trying to get something deployed globally. It's not the only way.

Combine that with the reality that ingress data transfer to any of their regions is free—generally—but sending data to the internet is a jewel beyond price because all my stars, that is egress bandwidth; there is nothing more valuable on this planet or any other. And that doesn't quite seem right. Because if that were actively true, a whole swath of industries and apps would not be able to exist.

Andy: Yeah, you know, Akamai, a huge part of our business is effectively distributing egress bandwidth to the world, right? And that is a big focus of ours. So, when we look at customers that are well positioned to do compute with Akamai, candidly, the filtering question that I typically ask with customers is, “Hey, do you have a highly distributed audience that you want to engage with, you know, a lot of interactivity or you're pushing a lot of content, video, updates, whatever it is, to them?” And that notion of highly distributed applications that have high egress requirements is exactly the sweet spot that we think Akamai has, you know, just a great advantage with, between our edge platform that we've been working on for the last 20-odd years and obviously, the platform that Linode brings into the conversation.

Corey: Let's talk a little bit about Macrometa.

Andy: Sure.

Corey: What is the nature of your involvement with those folks?
Because it seems like you sort of crossed into a whole bunch of different areas simultaneously, which is fascinating and great to see, but to my understanding, you do not own them.

Andy: No, we don't. No, they're an independent company doing their thing. So, one of the fun hats that I get to wear at Akamai is, I'm responsible for our Akamai Ventures Program. So, we do our corporate investing and all this kind of thing. And we work with a wide array of companies that we think are contributing to the progression of the internet.

So, there's a bunch of other folks out there that we work with as well. And Macrometa is on that list, which is we've done an investment in Macrometa, we're board observers there, so we get to sit in and give them input on, kind of, how they're doing things, but they don't have to listen to us since we're only observers. And we've also struck a preferred partnership with them. And what that means is that as our customers are building solutions, or as we're building solutions for our customers, utilizing the edge, you know, we're really excited and we've got Macrometa at the table to help with that. And Macrometa is—you know, just kind of as a refresher—is trying to solve the problem of distributed data access at the edge in a high-performance and almost non-blocking, developer-friendly way. And that is very, very exciting to us, so that's the context in which they're interesting to our continuing evolution of how the edge works.

Corey: One of the questions I always like to ask, and it's usually not considered a personal attack when I ask the question—

Andy: Oh, good.

Corey: But it's, “Describe what the company does.” Now, at some places like the latter days of Yahoo, for example, it's very much a personal attack. But what is it that Macrometa does?

Andy: So, Macrometa provides a worldwide, high-speed distributed database that is resident on what today, you could call the edge of the network.
And the advantage here is, instead of having one SQL server sitting somewhere, or what you would call a distributed SQL Server, which is two SQL Servers sitting next to one another, Macrometa has a high-speed data store that allows you to, instead of having that centralized SQL Server, have it run natively at the edge of the network. And when you're building applications that run on the edge or anywhere, you need to try to think about how do you have the data as close to the user or to the access point as possible. And that's the problem Macrometa is after and that's what their products today solve. It's an incredibly bright team over there, a fantastic founder-CEO team, and we're really excited to be working with them.

Corey: It wasn't intentionally designed this way as a setup when I mentioned a few minutes ago, but yeah, my Twitter client works across the 20-some-odd AWS regions, specifically because it's stateless. All of the state, other than a couple of API keys at provision time, wind up living in the user's browser. If this was something that needed to retain state in any way, like, you know, basically every real application under the sun, this strategy would absolutely not work unless I wound up with some heinous form of circular replication, and then you wind up with a single region going down and everything explodes. Having a cohesive, coherent data layer that spans all of that is key.

Andy: Yeah, and you're on to the classical, you know, CompSci issue here around edge, which is if you have 100 edge regions, how do you have consistent state storage between applications running on N of those? And that is the problem Macrometa is after, and, you know, Akamai has been working on this and other variants of the edge problem for some time. We're very excited to be working with the folks at Macrometa. It's a cool group of folks. And it's an interesting approach to the technology.
And from what we've seen so far, it's been working great.

Corey: The idea of how do I wind up having persistent, scalable state across a bunch of different edge locations is not just a hard computer science problem; it's also a hard cloud economics problem, given the cost of data transit in a bunch of different directions between different providers. It turns “How much does it cost?” in most cases into a question that can only be answered by, “Well, let's run it for a few days and find out.” Which is not usually the best way to answer some questions. Like, “Is that power socket live?” “Let's touch it and find out.” Yeah, there are ways you learn that are extraordinarily painful.

Andy: Yeah no, nobody should be doing that with power sockets. I think this is one of these interesting areas, which is this is really right in Akamai's backyard but it's not realized by a lot of folks. So, you know, Akamai has, for the last 20-odd-years, been all about how do we egress as much as possible to the entire internet. The weird areas, the big areas, the small areas, the up-and-coming areas, we serve them all. And in doing that, we've built a very large global fabric network, which allows us to get between those locations at a very low cost because we have to move our own content around.

And hooking those together, having an essentially private network fabric that hooks the vast majority of our big locations together and then having very high-speed egress out of all of the locations to the internet, you know, that's been how we operate our business at scale effectively and economically for years, and utilizing that for compute data replication, data synchronization tasks is what we're doing.

Corey: There are a lot of different solutions that could be used to solve a lot of the persistent data layer question. For example, when you had to solve a similar problem with compute, you had a few options in front of you.
Well, we could buy a whole bunch of computers and stuff them in a rack somewhere because, eh, cloud; how hard could it be? Saner heads prevailed, and no, no, no, we're going to buy Linode, which was honestly a genius approach on about three different levels, and I'm still unconvinced the industry sees that for the savvy move that it was. I'm confident that'll change in time.

Why not build it yourself? Or alternately, acquire another company that was working on something similar? Instead, you're an investor in a company that's doing this effectively, but not buying them outright?

Andy: Yeah, you know, and I think that's—Akamai is beyond at this point in thinking that it's just about ownership, right? I think that this—we don't have to own everything in order to have a successful ecosystem. You know, certainly, we're going to want to own key parts of it and that's where you saw the Linode acquisition, where we felt that was kind of core. But ultimately, we believe in promoting customer choice here. And there's a pretty big role that we have that we think we can help with companies, such as folks like Macrometa where they have, you know, really interesting technology, but they can use leverage, they can use some of our go-to-market, they can use, you know, some of our, you know, kind of guidance and expertise on running a startup—which, by the way, it's not an easy job for these folks—and that's what we're there to do.

So, with things like Linode, you know, we want to bring it in, and we want to own it because we think it's just so compelling, and it fits so well with where we want to go. With folks like Macrometa, you know, that's still a really young area. I mean, you know, Linode was in business for many, many, many years and was a good-sized business, you know, before we bought them.

Corey: Yeah, there's something to be said, for letting the market shake something out rather than having to do it all yourself as trailblazers.
I'm a big believer in letting other companies do things. I mean, one of the more annoying things, from my position, is this idea where AWS takes a product strategy of, “Yes.” That becomes a bit of a challenge when they're trying to wind up building compete decks, and how do we defeat the competition? And it's like, “Wh—oh, you're talking about the other hyperscalers?” “No, we're talking with the service team one floor away.”

That just seems a little on the strange side to—some companies get too big and too expensive on some level. I think that there's a very real risk of Akamai trying to do everything on the internet if you continue to expand and start listing out things that are not currently in your portfolio. And, oh, we should do that, too, and we should do that, too, and we should do that, too. And suddenly, it feels pretty closely aligned with you're trying to do everything.

Andy: Yeah. I think we've been a company who has been really disciplined and not doing everything. You know, we started with CDN. And you know, we're talking '98 to 2010, you know, CDN was really our thing, and we feel we executed really well on that. We probably executed quite quietly and well, but feel we executed pretty well on that.

Really from 2010, 2012 to 2020, it was all about security, right? And, you know, we built, you know, pretty amazing security business, hundred percent of SaaS business, on top of our CDN platform with security. And now we're thinking about—we did that route relatively quietly, as well, and now we're thinking about the next ten years and how do we have that same kind of impact on cloud. And that is exciting because it's not just centralized cloud; it's about a distributed cloud vision.
And that is really compelling and that's why you know, we've got great folks that are still here and working on it.

Corey: I'm a big believer in the idea that you can start getting distilled truth out of folks, particularly companies, the more you compress the space they have to wind up saying. Something that's why Twitter very often lets people tip their hands. But a common place that I look for is the title field on a company's website. So, when I go over to akamai.com, you position yourself as something that fits in a small portion of a tweet, which is good. Whenever you have a Tolstoy-length paragraph in the tooltip title for the browser tab, that's a problem.

But you say simply, “Security, cloud delivery, performance. Akamai.” Which is beautifully well done, but security comes first. I have a mental model of Akamai as being a CDN and some other stuff that I don't fully understand. But again, I first encountered you folks in the early-2000s.

It turns out that it's hard to change existing opinions. Are you a CDN Company or are you a security company?

Andy: Oh, super—

Corey: In other words, if someone winds up mis-alphabetizing that and they're about to get censured after this show because, “No, we're a CDN, first; why did you put security first?”

Andy: You know, so all those things feed off each other, right? And this has been a question where it's like, you know, our security layer and our distributed WAF and other security offerings run on top of the CDN layer. So, it's all about building a common compute edge and then leveraging that for new applications. CDN was the first application. The next and second application was security.

And we think the third application, but probably not the final one, is compute. So, I think I don't think anyone in marketing will be fired by the ordering that they did on that. I think that ultimately now, you know, for—just if we look at it from a monetary perspective, right, we do more security than we do CDN.
So, there's a lot that we have in the security business. And you know, compute's got a long way to go, especially because it's not just one big data center of compute; it is a different flavor than I think folks have seen before.

Corey: When I was at RSA, you folks were one of the exhibitors there. And I like to make the common observation that there are basically six companies that exhibit at RSA. Yeah, there are hundreds of booths, but it's the same six products, all marketed under different logos with different words. And they all seem to approach it from a few relatively expectable personas and positions. I've always found myself agreeing with the things that you folks say, and maybe it's because of my own network-centric background, but it doesn't seem like you take the same approach that a number of other companies do or it's, “Oh, it has to start with the way that developers write their first line of code.” Instead, it seems to take a holistic view that comes from the starting position of everything talks to each other on a network basis, and from here, let's move forward. Is that accurate to how you view the security space?

Andy: Yeah, you know, our view of the security space is—again, it's a network-centric one, right? And our work in the security space initially came from really big DDoS attacks, right? And how do we stop Distributed Denial of Service attacks from impacting folks? And that was the initial benefit that we brought. And from there, we evolved our story around, you know, how do we have a more sophisticated WAF? How do we have predictive capabilities at the edge?

So ultimately, we're not about ingraining into your process of how your thing was written or telling you how to write it.
We're about, you know, essentially being that perimeter edge that is watching and monitoring everything that comes into you to make sure that, you know, hey, we're not seeing Log4j-type exploits coming at you, and we'll let you know if we do, or to block malicious activity. So, we fit on anything, which is why our security business has been so successful. If you have an application on the edge, you can put Akamai Security in front of it and it's going to make your application better. That's been super compelling for the last, you know, again, last decade or so that we've really been focused on security.

Corey: I think that it is a mistake to take a security model that starts with a view of what people have in front of them day-to-day—like, I look at my laptop and say, “Oh, this is what I spend my time on. This is where all security must start and stop.” Because yeah, okay, great. If you get physical access to my laptop, it's pretty much game over on some level. But yeah, if you're at a point where you're going to bust into my house and threaten me in order to get access to my laptop, here you go.

There are no secrets that I am in possession of that are worth dying for. It's just money and that's okay. But looking at it through a lens of the internet has gone from science experiment to thing that the nerds love to use to a cornerstone of the fabric of modern society. And that's not because of the magic supercomputer that we all have in our pockets, but rather because those magic supercomputers can talk to the sum total of human knowledge and any other human anywhere on the planet, basically, ever. And I don't know that that evolution has been really appreciated by society at large as far as just how empowering that can be. But it completely changes the entire security paradigm from back in the '80s when I got started, don't put untrusted floppy disks into your computer or it might literally explode on your desk.

Andy: [laugh]. So, we're talking about floppy disks now?
Yes. So, first of all, the scope of impact of the internet has increased, meaning what you can do with it has increased. And directly proportional to that increase, the threat vectors have increased, right? And the more systems are connected, the more vulnerabilities there are.

So listen, it's easy to scare anybody about security on the internet. It is a topic that is an infinite well of scariness. At the same time, you know, and not just Akamai, but there's a lot of companies out there that can, whether it's making your development more secure, making your pipeline, your digital supply chain more secure, or then, you know, where Akamai is, we're at the end, which is, you know, helping to wrap around your entire web presence to make it more secure, there's a variety of companies that are out there really making the internet work from a security perspective. And honestly, there's also been tremendous progress on the operating system front in the last several years, which previously was not as good—probably is the way to characterize it—as it is today. So, and you know, at the end of the day, the nerds are still out there working, right?

We are out here still working on making the internet, you know, scale better, making it more secure, making it more robust because we're probably not done, right? You know, phones are awesome, and tablet devices, et cetera, are awesome, but we've probably got more coming. We don't quite know what that is yet, but we want to have the capacity, safety, and compute to power it.

Corey: How does Macrometa as a persistent data layer tie into your future vision of security first as what Akamai does? I can see a few directions, but I'm going to go out on a limb and guess that before you folks decided to make an investment in such a thing, you probably gave it more than the 30 seconds or so of thought that I've had to wind up putting these pieces together.

Andy: So, a few things there.
First of all, Macrometa, ultimately, we see them coming in the front door with our compute solution, right? Because as folks are building capabilities on the edge, “Hey, I want to run compute on the edge. How do I interoperate with data?” The worst answer possible is, “Well, call back to the centralized data store.”So, we want to ensure that customers have choice and performance options for distributed data access. Macrometa fits great there. However, now pause that; let's transition back to the security point you raised, which is, you know, coordinating an edge data security platform is a really complicated thing. Because you want to make sure that threats that are coming in on one side of the network, or you know, in one given country, you know, are also understood throughout the network. And there's a definite role for a data platform in doing that.We obviously, you know, for the last ten years have built several that help accomplish that at scale for our network, but we also recognize that, you know, innovation in data platforms is probably not done. And you know, Macrometa's got some pretty interesting approaches. So, we're very interested in working with them and talking jointly with customers, which we've done a bunch of, to see how that progresses. But there's tie-ins, I would say, mostly on compute, but secondarily, there's a lot of interesting areas with real-time security intel, they can be very useful as well.Corey: Since I have you here, I would love to ask you something that's a little orthogonal to the rest of this conversation, but I don't even care about that because that's why it's my show; I can ask what I want.Andy: Oh, no.Corey: Talk to me a little bit about the Linode acquisition. Because when it first came out, I thought, “Oh, Linode must not be doing well, so it's an acqui-hire scenario.” Followed by, “Wait a minute, that doesn't seem quite right.” And I dug deeper, and suddenly, I started to see a bunch of things that made sense. 
But that's just my outside perspective. I prefer to see you justify what it is that you've done.

Andy: Justify what we've done. Well, with that positive framing—

Corey: Exactly. “Explain yourself. How dare you, sir?”

Andy: [laugh]. “What are you doing?” So, to take that, which is first of all, Linode was doing great when we bought them and they're continuing to do great now. You know, backstory here is actually a fun one. So, I personally have been a customer of Linode for about 13 years, and you know, super familiar with their offerings, as were a bunch of other folks at Akamai.

And what ultimately attracted us to Linode was, first of all, from a strategic perspective, is we talked about how Akamai thinks about Compute being a gradient of compute: you've got the edge, you've got kind of a middle tier, and you've got more centralized locations. Akamai has the edge, we've got the middle, we didn't have the central. Linode has got the central. And obviously, you know, we're going to see some significant expansion of capacity and scale there, but they've got the central location. And, you know, ultimately, we feel that there's a lot of passion in Linode.

You know, they're a Linux open-source-centric company, and believe it or not, Akamai is, too. I mean, you know, that's kind of how it works. And there was a great connection between the sorts of folks that they had and how they think about customers. Linode was a really customer-driven company. I mean, they were fanatical.

I mean, I as a, you know, customer of $30 a month personally, could open a ticket and I'd get an answer in five minutes. And that's very similar to kind of how Akamai is driven, which is we're very customer-centric, and when a customer has a problem or needs something different, you know, we're on it. So, there's literally nothing bad there and it's a super exciting beginning of a new chapter for Akamai, which is really how do we tackle compute? We're super excited to have the Linode team.
You know, they're still mostly down in Philadelphia doing their thing.And, you know, we've hired substantially and we're continuing to do so, so if you want to work there, drop a note over. And it's been fantastic. And it's one of our, you know, really large acquisitions that we've done, and I think we were really lucky to find a great company in such a good position and be able to make it work.Corey: From my perspective, one of the areas that has me excited about the acquisition stems from what I would consider to be something of a customer-base culture misalignment between the two companies. One of the things that I have always enjoyed about Linode—and in the interest of full transparency, they have been a periodic sponsor over the last five or six years of my ridiculous nonsense. I believe that they are not at the moment which I expect you to immediately rectify after this conversation, of course.Andy: I'll give you my credit card. Yeah.Corey: Excellent. Excellent. We do not get in the way of people trying to give you money. But it was great because that's exactly it. I could take a credit card in the middle of the night and spin up things on Linode.And it was one of those companies that aligned very closely to how I tended to view cloud infrastructure from the perspective of, I need a Linux box, or I need a bunch of Linux boxes right there, right now, and I don't have 12 weeks to go to cloud school to learn the intricacies of a given provider. It more or less just worked in a whole bunch of easy ways. Whereas if I wanted to roll out at Akamai, it was always I would pull up the website, and it's, “Click here to talk to our enterprise sales team.” And that tells me two things. 
One, it is probably going to be outside of my signing authority because no one trusts me with money for obvious reasons, when I was an employee, and two, you will not be going to space today because those conversations always take time.And it's going to be—if I'm in a hurry and trying to get something out the door, that is going to act as a significant drag on capability. Now, most of your customers do not launch things by the seat of their pants, three hours after the idea first occurs to them, but on Linode, that often seems to be the case. The idea of addressing developers early on in the ‘it's just an idea' phase. I can't shake the feeling that there's a definite future in which Linode winds up being able to speak much more effectively to enterprise, while Akamai also learns to speak to, honestly, half-awake shitposters at 2 a.m. when we're building something heinous.Andy: I feel like you've been sitting in on our strategy presentations. Maybe not the shitposters, but the rest of it. And I think the way that I would couch it, my corporate-speak of that, would be that there's a distinct yin and yang, there a complementary nature between the customer bases of Akamai, which has, you know, an incredible list of enterprise customers—I mean, the who's-who of enterprise customers, Akamai works with them—but then, you know, Linode, who has really tremendous representation of developers—that's what we'll use for the name posts—like, folks like myself included, right, who want to throw something together, want to spin up a VM, and then maybe tear it down and never do it again, or maybe set up 100 of them. And, to your point, the crossover opportunities there, which is, you know, Linode has done a really good job of having small customers that grow over time. 
And by having Akamai, you know, you can now grow, and never have to leave because we're going to be able to bring enough scale and throughput and, you know, professional help services as you need it to help you stay in the ecosystem.And similarly, Akamai has a tremendous—you know, the benefit of a tremendous set of enterprise customers who are out there, you know, frankly, looking to solve their compute challenges, saying, “Hey, I have a highly distributed application. Akamai, how can you help me with this?” Or, “Hey, I need presence in x or y.” And now we have, you know, with Linode, the right tools to support that. And yes, we can make all kinds of jokes about, you know, Akamai and Linode and different, you know, people and archetypes we appeal to, but ultimately, there's an alignment between Akamai and Linode on how we approach things, which is about Linux, open-source, it's about technical honesty and simplicity. So, great group of folks. And secondly, like, I think the customer crossover, you're right on it. And we're very excited for how that goes.Corey: I also want to call out that Macrometa seems to have split this difference perfectly. One of the first things I visit on any given company's page when I'm trying to understand them is the pricing page. It's one of those areas where people spend the least time, early on, but it's also where they tend to be the most honest. Maybe that's why. And I look for two things, and Macrometa has both of them.The first is a ‘try it for free, right now, get started.' It's a free-tier approach. Because even if you charge $10 or whatnot, there are many developers working on things in odd hours where they don't necessarily either have the ability to make that purchase decision, know that they have the ability to make that purchase decision, or are willing to do that by the seat of their pants. So, ‘get started for free' is important; it means you can develop right now. 
Conversely, there are a bunch of enterprise procurement departments out there who will want a whole bunch of custom things.Custom SLAs, custom support responses, custom everything, and they also don't know how to sign a check that doesn't have two commas in it. So, you don't probably want to avoid those customers, but what they're looking for is an enterprise offering that is no price. There should not be a price tag on that because you will never get it right for everyone, but what they want to see is ‘click here to contact sales.' That is coded language for, “We are serious professionals and know who you are and how you like to operate.” They've got both and I think that is absolutely the right decision.Andy: It do—Corey: And whatever you have in between those two is almost irrelevant.Andy: No, I think you're on it. And Macrometa, their pricing philosophy allows you to get in and try it with zero friction, which is super important. Like, I don't even have to use a credit card. I can experiment for free, I can try it for free, but then as I grow their pricing tier kind of scales along with that. And it's a—you know, that is the way that folks try applications.I always try to think about, hey, you know, if I'm on a team and we're tasked with putting together a proof of concept for something in two days, and I've got, you know, a couple folks working with me, how do I do that? And you don't have time for procurement, you might need to use the free thing to experiment. So, there is a lot that they can do. And you know, their pricing—this transparency of pricing that they have is fantastic. Now, Linode, also very transparent, we don't have a free tier, but you know, you can get in for very low friction and try that as well.Corey: Yeah, companies tend to go through a maturity curve evolution on these things. I've talked to companies that purely view it is how much money a given customer is spending determines how much attention they get. 
And it's like, “Yeah, maybe take a look through some of your smaller users or new signups there.” Yeah, they're spending $10 a month or whatnot, but their email address is @cocacola.com. Just spitballing here; maybe you might want to white-glove a few of those folks, just because not everyone comes in the door via an RFP.

Andy: Yep. We look at customers for what your potential is, right? Like, you know, how much could you end up spending with us, right? You know, so if you're building your application on Linode, and you're going to spend $20 for the first couple months, that's totally fine. Get in there, experiment, and then you know, in the next several years, let's see where it goes. So, you're exactly right, which is, you know, that username@enterprisedomain.com is often much more indicative than what the actual bill is on a monthly basis.

Corey: I always find it a little strange when I have a vendor that I'm doing business with, and then suddenly, an account person reaches out, like, hey, let's just have a call for half an hour to talk about what you're doing and how you're doing it. My immediate response to that these days, just from too many years doing this, is, “I really need to look at that bill. How much are we spending, again?” And honestly, it's usually not that much because believe it or not, when you focus on cloud economics for a living, you pay attention to your credit card bills, but it is always interesting to see who reaches out and who doesn't. That's been a strange approach, and there is no one right answer for all of this.

If every free tier account user of any given cloud provider wound up getting constant emails from their account managers, it's how desperate are you to grow revenue, and what are you about to do to pricing? At some level it becomes… unhelpful.

Andy: I can see that.
I've had, personally, situations where I'm a trial user of something, and all of a sudden I get emails—you know, using personal email addresses, no Akamai involvement—all of a sudden, I'm getting emails. And I'm like, “Really? Did I make the priority list for you to call me and leave me a voicemail, and then email me?” I don't know how that's possible.So, from a personal perspective, totally see that. You know, from an account development perspective, you know, kind of with the Akamai hat on, it's challenging, right? You know, folks are out there trying to figure out where business is going to come from. And I think if you're able to get an indicator that somebody, you know, maybe you're going to call that person at enterprisedomain.com to try to figure out, you know, hey, is this real and is this you with a side project or is this you with a proof of concept for something that could be more fruitful? And, you know, Corey, they're probably just calling you because you're you.Corey: One of the things that I was surprised by where I saw the exact same thing. I started getting a series of emails from my account manager for Google Workspaces. Okay, and then I really did a spit-take when I realized this was on my personal address. Okay… so I read this carefully because what the hell is happening? Oh, they're raising prices and it's a campaign. Great.Now, my one-user vanity domain is going to go from $6 a month to $8 a month or whatever. Cool, I don't care. This is not someone actively trying to reach out as a human being. It's an outreach campaign. Cool, fair. But that's the problem, on some level, for super-tiny customers. It's a, what is it, is it a shakedown? What are they about to yell at me for?Andy: No, I got the same thing. My Google Workspace personal account, which is, like, two people, right? Like, and I got an email and then I think, like, a voicemail. 
And I'm like, I read the email and I'm like—you know, it's going—again, it's like, it was like six something and now it's, like, eight something a month. So, it's like, “Okay. You're all right.”Corey: Just go—that's what you have a credit card for. Go ahead and charge it. It's fine. Now, yeah, counterpoint if you're a large company, and yeah, we're just going to be raising prices by 20% across the board for everyone, and you look at this and like, that's a phone number. Yeah, I kind of want some special outreach and conversations there. But it's odd.Andy: It's interesting. Yeah. They're great.Corey: Last question before we call this an episode. In 22 years, how have you seen the market change from your perspective? Most people do not work in the industry from one company's perspective for as long as you have. That gives you a somewhat privileged position to see, from a point of relative stability, what the industry has done.Andy: So—Corey: What have you noticed?Andy: —and I'm going to give you an answer, which is about, like, the sales cycle, which is it used to be about meetings and about everybody coming together and used to have to occasionally wear a suit. And there would be, you know, meetings where you would need to get a CEO or CFO to personally see a presentation and decide something and say, “Okay, we're going with X or Y. We're going to make a decision.” And today, those decisions are, pretty far and wide, made much, much further down in the organization. They're made by developers, team leads, project managers, program managers.So, the way people engage with customers today is so different. First of all, like, most meetings are still virtual. I mean, like, yeah, we have physical meetings and we get together for things, but like, so much more is done virtually, which is cool because we built the internet so we wouldn't have to go anywhere, so it's nice that we got that landed. 
It's unfortunate that we had to do with Covid to get there, but ultimately, I think that purchasing decisions and technology decisions are distributed so much more deeply into the organization than they were. It used to be a, like, C-level thing. We're now seeing that stuff happened much further down in the organization.We see that inside Akamai and we see it with our customers as well. It's been, honestly, refreshing because you tend to be able to engage with technical folks when you're talking about technical products. And you know, the business folks are still there and they're helping to guide the discussions and all that, but it's a much better time, I think, to be a technical person now than it probably was 20 years ago.Corey: I would say that being a technical person has gotten easier in a bunch of ways; it's gotten harder in a bunch of ways. I would say that it has transformed. I was very opposed to the idea that oh, as a sysadmin, why should I learn to write code? And in retrospect, it was because I wasn't sure I could do it and it felt like the rising tide was going to drown me. And in hindsight, yeah, it was the right direction for the industry to go in.But I'm also sensitive to folks who don't want to, midway through their career, pick up an entirely new skill set in order to remain relevant. I think that it is a lot easier to do some things. Back when Akamai started, it took an intimate knowledge of GCC compiler flags, in most cases, to host a website. Now, it is checking a box on a web page and you're done. Things have gotten easier.The abstractions continue to slip below the waterline, so the things we have to care about getting more and more meaningful to the business. We're nowhere near our final form yet, but I'm very excited about how accessible this industry is to folks that previously would not have been, while also disheartened by just how much there is to know. 
Otherwise, “Oh yeah, that entire aspect of the way that this core thing that runs my business, yeah, that's basically magic and we just hope the magic doesn't stop working, or we make a sacrifice to the proper God, which is usually a giant trillion-dollar company.” And the sacrifice is, of course, engineering time combined with money.Andy: You know, technology is all about abstraction layers, right? And I think—that's my view, right—and we've been spending the last several decades, not, ‘we' Akamai; ‘we' the technology industry—on, you know, coming up with some pretty solid abstraction layers. And you're right, like, the, you know, GCC j6—you know, -j6—you know, kind of compiler tags not that important anymore, we could go back in time and talk about inetd, the first serverless. But other than that, you know, as we get to the present day, I think what's really interesting is you can contribute technically without being a super coding nerd. There's all kinds of different technical approaches today and technical disciplines that aren't just about development.Development is super important, but you know, frankly, the sysadmin skill set is more valuable today if you look at what SREs have become and how important they are to the industry. I mean, you know, those are some of the most critical folks in the entire piping here. So, don't feel bad for starting out as a sysadmin. I think that's my closing comment back to you.Corey: I think that's probably a good place to leave it. I really want to thank you for being so generous with your time.Andy: Anytime.Corey: If people want to learn more about how you see the world, where can they find you?Andy: Yeah, I mean, I guess you could check me out on LinkedIn. Happy to shoot me something there and happy to catch up. 
I'm pretty much read-only on social, so I don't pontificate a lot on Twitter, but—

Corey: Such a good decision.

Andy: Feel free to shoot me something on LinkedIn if you want to get in touch or chat about Akamai.

Corey: Excellent. And of course, our thanks go, as well, to the fine folks at Macrometa who have promoted this episode. It is always appreciated when people wind up supporting this ridiculous nonsense that I do. My guest has been Andy Champagne, SVP at the CTO office over at Akamai. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment that will not post successfully because your podcast provider of choice wound up skimping out on a provider who did not care enough about a persistent global data layer.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Dynamic Configuration Through AWS AppConfig with Steve Rice

Oct 11, 2022 35:54


About Steve:

Steve Rice is Principal Product Manager for AWS AppConfig. He is surprisingly passionate about feature flags and continuous configuration. He lives in the Washington DC area with his wife, 3 kids, and 2 incontinent dogs.

Links Referenced:

AWS AppConfig: https://go.aws/awsappconfig

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.

Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH. Basically, you're SSHing the same way you manage access to your app.
What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive?

Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscale

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This is a promoted guest episode. What does that mean? Well, it means that some people don't just want me to sit here and throw slings and arrows their way, they would prefer to send me a guest specifically, and they do pay for that privilege, which I appreciate. Paying me is absolutely a behavior I wish to endorse.

Today's victim who has decided to contribute to slash sponsor my ongoing ridiculous nonsense is, of all companies, AWS. And today I'm talking to Steve Rice, who's the principal product manager on AWS AppConfig. Steve, thank you for joining me.

Steve: Hey, Corey, great to see you. Thanks for having me. Looking forward to a conversation.

Corey: As am I. Now, AppConfig does something super interesting, which I'm not aware of any other service or sub-service doing. You are under the umbrella of AWS Systems Manager, but you're not going to market with Systems Manager AppConfig. You're just AWS AppConfig. Why?

Steve: So, AppConfig is part of AWS Systems Manager. Systems Manager has, I think, 17 different features associated with it. Some of them have an individual name that is associated with Systems Manager, some of them don't. We just happen to be one that doesn't. AppConfig is a service that's been around for a while internally before it was launched externally a couple years ago, so I'd say that's probably the origin of the name and the service. I can tell you more about the origin of the service if you're curious.

Corey: Oh, I absolutely am.
But I just want to take a bit of a detour here and point out that I make fun of the sub-service names in Systems Manager an awful lot, like Systems Manager Session Manager and Systems Manager Change Manager. And part of the reason I do that is not just because it's funny, but because almost everything I found so far within the Systems Manager umbrella is pretty awesome. It aligns with how I tend to think about the world in a bunch of different ways. I have yet to see anything lurking within the Systems Manager umbrella that has led to a tee-hee-hee bill surprise level that rivals, you know, the GDP of Guam. So, I'm a big fan of the entire suite of services. But yes, how did AppConfig get its name?Steve: [laugh]. So, AppConfig started about six years ago, now, internally. So, we actually were part of the region services department inside of Amazon, which is in charge of launching new services around the world. We found that a centralized tool for configuration associated with each service launching was really helpful. So, a service might be launching in a new region and have to enable and disable things as it moved along.And so, the tool was sort of built for that, turning on and off things as the region developed and was ready to launch publicly; then the regions launch publicly. It turned out that our internal customers, which are a lot of AWS services and then some Amazon services as well, started to use us beyond launching new regions, and started to use us for feature flagging. Again, turning on and off capabilities, launching things safely. And so, it became massively popular; we were actually a top 30 service internally in terms of usage. And two years ago, we thought we really should launch this externally and let our customers benefit from some of the goodness that we put in there, and some of—those all come from the mistakes we've made internally. And so, it became AppConfig. 
In terms of the name itself, we specialize in application configuration, so that's kind of a mouthful, so we just changed it to AppConfig.

Corey: Earlier this year, there was a vulnerability reported around, I believe it was, AWS Glue, but please don't quote me on that. And as part of the excellent response that AWS put out, they said that from the time that it was disclosed to them, they had patched the service and rolled it out to every AWS region in which Glue existed in a little under 29 hours, which at scale is absolutely magic fast. That is superhero speed and then some because you generally don't just throw something over the wall, regardless of how small it is when we're talking about something at the scale of AWS. I mean, look at who your customers are; mistakes will show. This also got me thinking that when you have Adam, or previously Andy, on stage giving a keynote announcement and then they mention something on stage, like, “Congratulations. It's now a very complicated service with 14 adjectives in its name because someone's paid by the syllable. Great.”

Suddenly, the marketing pages are up, the APIs are working, it's showing up in the console, and it occurred to me only somewhat recently to think about all of the moving parts that go on behind this. That is far faster than even the improved speed of CloudFront distribution updates. There's very clearly something going on there. So, I've got to ask, is that you?

Steve: Yes, a lot of that is us. I can't take credit for a hundred percent of what you're talking about, but that's how we are used. We're essentially used as a feature-flagging service. And I can talk generically about feature flagging. Feature flagging allows you to push code out to production, but it's hidden behind a configuration switch: a feature toggle or a feature flag. And that code can be sitting out there, nobody can access it until somebody flips that toggle. Now, the smart way to do it is to flip that toggle on for a small set of users.
Maybe it's just internal users, maybe it's 1% of your users. And so, the features available, you can—Corey: It's your best slash worst customers [laugh] in that 1%, in some cases.Steve: Yeah, you want to stress test the system with them and you want to be able to look and see what's going to break before it breaks for everybody. So, you release us to a small cohort, you measure your operations, you measure your application health, you measure your reputational concerns, and then if everything goes well, then you maybe bump it up to 2%, and then 10%, and then 20%. So, feature flags allow you to slowly release features, and you know what you're releasing by the time it's at a hundred percent. It's tempting for teams to want to, like, have everybody access it at the same time; you've been working hard on this feature for a long time. But again, that's kind of an anti-pattern. You want to make sure that on production, it behaves the way you expect it to behave.Corey: I have to ask what is the fundamental difference between feature flags and/or dynamic configuration. Because to my mind, one of them is a means of achieving the other, but I could also see very easily using the terms interchangeably. Given that in some of our conversations, you have corrected me which, first, how dare you? Secondly, okay, there's probably a reason here. What is that point of distinction?Steve: Yeah. Typically for those that are not eat, sleep, and breathing dynamic configuration—which I do—and most people are not obsessed with this kind of thing, feature flags is kind of a shorthand for dynamic configuration. It allows you to turn on and off things without pushing out any new code. 
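The gradual release Steve describes here (1% of users, then 2%, 10%, 20%, and eventually everyone) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `in_rollout`, the flag name `new-checkout`, and the hash-bucketing approach are hypothetical and are not taken from AppConfig or from the episode. The idea is that each user hashes to a stable bucket, so raising the percentage only ever adds users to the cohort.

```python
import hashlib


def in_rollout(flag_name: str, user_id: str, percent: float) -> bool:
    """Return True if this user falls inside the first `percent` of users.

    Hypothetical helper: the flag name and user id are hashed together so
    each flag gets its own stable, evenly spread assignment of users.
    """
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # stable bucket 0..9999
    return bucket < percent * 100  # e.g. 1% covers buckets 0..99


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(10_000)]
    # Step the rollout up the way the interview describes.
    for percent in (1, 2, 10, 20, 100):
        enabled = sum(in_rollout("new-checkout", u, percent) for u in users)
        print(f"{percent:>3}% target -> {enabled} of {len(users)} users see the feature")
```

Because the bucket depends only on the flag and the user, a user who saw the feature at 1% still sees it at 10%, which keeps the measured cohort consistent between steps.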
So, your application code's running, it's pulling its configuration data, say every five seconds, every ten seconds, something like that, and when that configuration data changes, then that app changes its behavior, again, without a code push or without restarting the app.So, dynamic configuration is maybe a superset of feature flags. Typically, when people think feature flags, they're thinking of, “Oh, I'm going to release a new feature, so it's almost like an on-off switch.” But we see customers using feature flags—and we use this internally—for things like throttling limits. Let's say you want to be able to throttle TPS transactions per second. Or let's say you want to throttle the number of simultaneous background tasks, and say, you know, I just really don't want this creeping above 50; bad things can start to happen.But in a period of stress, you might want to actually bring that number down. Well, you can push out these changes with dynamic configuration—which is, again, any type of configuration, not just an on-off switch—you can push this out and adjust the behavior and see what happens. Again, I'd recommend pushing it out to 1% of your users, and then 10%. But it allows you to have these dials and switches to do that. And, again, generically, that's dynamic configuration. It's not as fun to term as feature flags; feature flags is sort of a good mental picture, so I do use them interchangeably, but if you're really into the whole world of this dynamic configuration, then you probably will care about the difference.Corey: Which makes a fair bit of sense. It's the question of what are you talking about high level versus what are you talking about implementation detail-wise.Steve: Yep. Yep.Corey: And on some level, I used to get… well, we'll call it angsty—because I can't think of a better adjective right now—about how AWS was reluctant to disclose implementation details behind what it did. 
And in the fullness of time, it's made a lot more sense to me, specifically through a lens of, you want to be able to have the freedom to change how something works under the hood. And if you've made no particular guarantee about the implementation detail, you can do that without potentially worrying about breaking a whole bunch of customer expectations that you've inadvertently set. And that makes an awful lot of sense.The idea of rolling out changes to your infrastructure has evolved over the last decade. Once upon a time you'd have EC2 instances, and great, you want to go ahead and make a change there—or this actually predates EC2 instances. Virtual machines in a data center or heaven forbid, bare metal servers, you're not going to deploy a whole new server because there's a new version of the code out, so you separate out your infrastructure from the code that it runs. And that worked out well. And increasingly, we started to see ways of okay, if we want to change the behavior of the application, we'll just push out new environment variables to that thing and restart the service so it winds up consuming those.And that's great. You've rolled it out throughout your fleet. With containers, which is sort of the next logical step, well, okay, this stuff gets baked in, we'll just restart containers with a new version of code because that takes less than a second each and you're fine. And then Lambda functions, it's okay, we'll just change the deployment option and the next invocation will wind up taking the brand new environment variables passed out to it. How do feature flags feature into those, I guess, three evolving methods of running applications in anger, by which I mean, of course, production?Steve: [laugh]. Good question. And I think you really articulated that well.Corey: Well, thank you. I should hope so. I'm a storyteller. At least I fancy myself one.Steve: [laugh]. Yes, you are. 
Really what you talked about is the evolution of you know, at the beginning, people were—well, first of all, people probably were embedding their variables deep in their code and then they realized, “Oh, I want to change this,” and now you have to find where in my code that is. And so, it became a pattern. Why don't we separate everything that's a configuration data into its own file? But it'll get compiled at build time and sent out all at once.There was kind of this breakthrough that was, why don't we actually separate out the deployment of this? We can separate the deployment from code from the deployment of configuration data, and have the code be reading that configuration data on a regular interval, as I already said. So now, as the environments have changed—like you said, containers and Lambda—that ability to make tweaks at microsecond intervals is more important and more powerful. So, there certainly is still value in having things like environment variables that get read at startup. We call that static configuration as opposed to dynamic configuration.And that's a very important element in the world of containers that you talked about. Containers are a bit ephemeral, and so they kind of come and go, and you can restart things, or you might spin up new containers that are slightly different config and have them operate in a certain way. And again, Lambda takes that to the next level. 
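The difference between static configuration read at startup and dynamic configuration read "on a regular interval" can be sketched as follows. This is a minimal illustration, not AppConfig's client: the class `DynamicConfig`, its `fetch` callable, and the `max_background_tasks` key are assumptions made for the example, and the in-memory `store` stands in for wherever the configuration document really lives.

```python
import json
import time
from typing import Any, Callable, Optional


class DynamicConfig:
    """Re-reads a JSON configuration document at most once per interval."""

    def __init__(self, fetch: Callable[[], str], interval_seconds: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch              # hypothetical: returns the JSON document as text
        self._interval = interval_seconds
        self._clock = clock
        self._values: dict = {}
        self._next_poll: Optional[float] = None

    def get(self, key: str, default: Any = None) -> Any:
        now = self._clock()
        if self._next_poll is None or now >= self._next_poll:
            try:
                self._values = json.loads(self._fetch())
            except (OSError, ValueError):
                pass  # keep serving the last known good configuration
            self._next_poll = now + self._interval
        return self._values.get(key, default)


if __name__ == "__main__":
    store = {"doc": '{"max_background_tasks": 50}'}
    now = [0.0]  # fake clock so the demo does not need to sleep
    config = DynamicConfig(lambda: store["doc"], interval_seconds=10, clock=lambda: now[0])

    print(config.get("max_background_tasks"))  # 50
    store["doc"] = '{"max_background_tasks": 20}'  # operator lowers the limit
    print(config.get("max_background_tasks"))  # still 50: inside the poll interval
    now[0] = 11.0
    print(config.get("max_background_tasks"))  # 20: picked up with no restart or code push
```

The running process never restarts; it simply sees the new value on the first read after the interval elapses, which is the behavior the conversation contrasts with environment variables read at startup.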
I'm really excited where people are going to take feature flags to the next level because already today we have people just fine-tuning to very targeted small subsets, different configuration data, different feature flag data, and allows them to do this like at we've never seen before scale of turning this on, seeing how it reacts, seeing how the application behaves, and then being able to roll that out to all of your audience.Now, you got to be careful, you really don't want to have completely different configurations out there and have 10 different, or you know, 100 different configurations out there. That makes it really tough to debug. So, you want to think of this as I want to roll this out gradually over time, but eventually, you want to have this sort of state where everything is somewhat consistent.Corey: That, on some level, speaks to a level of operational maturity that my current deployment adventures generally don't have. A common reference I make is to my lasttweetinaws.com Twitter threading app. And anyone can visit it, use it however they want.And it uses a Route 53 latency record to figure out, ah, which is the closest region to you because I've deployed it to 20 different regions. Now, if this were a paid service, or I had people using this in large volume and I had to worry about that sort of thing, I would probably approach something that is very close to what you describe. In practice, I pick a devoted region that I deploy something to, and cool, that's sort of my canary where I get things working the way I would expect. And when that works the way I want it to I then just push it to everything else automatically. 
Given that I've put significant effort into getting deployments down to approximately two minutes to deploy to everything, it feels like that's a reasonable amount of time to push something out.Whereas if I were, I don't know, running a bank, for example, I would probably have an incredibly heavy process around things that make changes to things like payment or whatnot. Because despite the lies, we all like to tell both to ourselves and in public, anything that touches payments does go through waterfall, not agile iterative development because that mistake tends to show up on your customer's credit card bills, and then they're also angry. I think that there's a certain point of maturity you need to be at as either an organization or possibly as a software technology stack before something like feature flags even becomes available to you. Would you agree with that, or is this something everyone should use?Steve: I would agree with that. Definitely, a small team that has communication flowing between the two probably won't get as much value out of a gradual release process because everybody kind of knows what's going on inside of the team. Once your team scales, or maybe your audience scales, that's when it matters more. You really don't want to have something blow up with your users. You really don't want to have people getting paged in the middle of the night because of a change that was made. And so, feature flags do help with that.So typically, the journey we see is people start off in a maybe very small startup. They're releasing features at a very fast pace. They grow and they start to build their own feature flagging solution—again, at companies I've been at previously have done that—and you start using feature flags and you see the power of it. Oh, my gosh, this is great. I can release something when I want without doing a big code push. I can just do a small little change, and if something goes wrong, I can roll it back instantly. 
That's really handy.And so, the basics of feature flagging might be a homegrown solution that you all have built. If you really lean into that and start to use it more, then you probably want to look at a third-party solution because there's so many features out there that you might want. A lot of them are around safeguards that makes sure that releasing a new feature is safe. You know, again, pushing out a new feature to everybody could be similar to pushing out untested code to production. You don't want to do that, so you need to have, you know, some checks and balances in your release process of your feature flags, and that's what a lot of third parties do.It really depends—to get back to your question about who needs feature flags—it depends on your audience size. You know, if you have enough audience out there to want to do a small rollout to a small set first and then have everybody hit it, that's great. Also, if you just have, you know, one or two developers, then feature flags are probably something that you're just kind of, you're doing yourself, you're pushing out this thing anyway on your own, but you don't need it coordinated across your team.Corey: I think that there's also a bit of—how to frame this—misunderstanding on someone's part about where AppConfig starts and where it stops. When it was first announced, feature flags were one of the things that it did. And that was talked about on stage, I believe in re:Invent, but please don't quote me on that, when it wound up getting announced. And then in the fullness of time, there was another announcement of AppConfig now supports feature flags, which I'm sitting there and I had to go back to my old notes. Like, did I hallucinate this? Which again, would not be the first time I'd imagine such a thing. 
But no, it was originally how the service was described, but now it's extra feature flags, almost like someone would, I don't know, flip on a feature-flag toggle for the service and now it does a different thing. What changed? What was it that was misunderstood about the service initially versus what it became?Steve: Yeah, I wouldn't say it was a misunderstanding. I think what happened was we launched it, guessing what our customers were going to use it as. We had done plenty of research on that, and as I mentioned before we had—Corey: Please tell me someone used it as a database. Or am I the only nutter that does stuff like that?Steve: We have seen that before. We have seen something like that before.Corey: Excellent. Excellent, excellent. I approve.Steve: And so, we had done our due diligence ahead of time about how we thought people were going to use it. We were right about a lot of it. I mentioned before that we have a lot of usage internally, so you know, that was kind of maybe cheating even for us to be able to sort of see how this is going to evolve. What we did announce, I guess it was last November, was an opinionated version of feature flags. So, we had people using us for feature flags, but they were building their own structure, their own JSON, and there was not a dedicated console experience for feature flags.What we announced last November was an opinionated version that structured the JSON in a way that we think is the right way, and that afforded us the ability to have a smooth console experience. So, if we know what the structure of the JSON is, we can have things like toggles and validations in there that really specifically look at some of the data points. So, that's really what happened. We're just making it easier for our customers to use us for feature flags. 
We still have some customers that are kind of building their own solution, but we're seeing a lot of them move over to our opinionated version.Corey: This episode is brought to us in part by our friends at Datadog. Datadog's SaaS monitoring and security platform that enables full stack observability for developers, IT operations, security, and business teams in the cloud age. Datadog's platform, along with 500 plus vendor integrations, allows you to correlate metrics, traces, logs, and security signals across your applications, infrastructure, and third party services in a single pane of glass.Combine these with drag and drop dashboards and machine learning based alerts to help teams troubleshoot and collaborate more effectively, prevent downtime, and enhance performance and reliability. Try Datadog in your environment today with a free 14 day trial and get a complimentary T-shirt when you install the agent.To learn more, visit datadoghq/screaminginthecloud to get. That's www.datadoghq/screaminginthecloudCorey: Part of the problem I have when I look at what it is you folks do, and your use cases, and how you structure it is, it's similar in some respects to how folks perceive things like FIS, the fault injection service, or chaos engineering, as is commonly known, which is, “We can't even get the service to stay up on its own for any [unintelligible 00:18:35] period of time. What do you mean, now let's intentionally degrade it and make it work?” There needs to be a certain level of operational stability or operational maturity. When you're still building a service before it's up and running, feature flags seem awfully premature because there's no one depending on it. You can change configuration however your little heart desires. In most cases. I'm sure at certain points of scale of development teams, you have a communications problem internally, but it's not aimed at me trying to get something working at 2 a.m. 
in the middle of the night.Whereas by the time folks are ready for what you're doing, they clearly have that level of operational maturity established. So, I have to guess on some level, that your typical adopter of AppConfig feature flags isn't in fact, someone who is, “Well, we're ready for feature flags; let's go,” but rather someone who's come up with something else as a stopgap as they've been iterating forward. Usually something homebuilt. And it might very well be you have the exact same biggest competitor that I do in my consulting work, which is of course, Microsoft Excel as people try to build their own thing that works in their own way.Steve: Yeah, so definitely a very common customer of ours is somebody that is using a homegrown solution for turning on and off things. And they really feel like I'm using the heck out of these feature flags. I'm using them on a daily or weekly basis. I would like to have some enhancements to how my feature flags work, but I have limited resources and I'm not sure that my resources should be building enhancements to a feature-flagging service, but instead, I'd rather have them focusing on something, you know, directly for our customers, some of the core features of whatever your company does. And so, that's when people sort of look around externally and say, “Oh, let me see if there's some other third-party service or something built into AWS like AWS AppConfig that can meet those needs.”And so absolutely, the workflows get more sophisticated, the ability to move forward faster becomes more important, and do so in a safe way. I used to work at a cybersecurity company and we would kind of joke that the security budget of the company is relatively low until something bad happens, and then it's, you know, whatever you need to spend on it. 
It's not quite the same with feature flags, but you do see when somebody has a problem on production, and they want to be able to turn something off right away or make an adjustment right away, then the ability to do that in a measured way becomes incredibly important. And so, that's when, again, you'll see customers starting to feel like they're outgrowing their homegrown solution and moving to something that's a third-party solution.Corey: Honestly, I feel like so many tools exist in this space, where, “Oh, yeah, you should definitely use this tool.” And most people will use that tool. The second time. Because the first time, it's one of those, “How hard could that be out? I can build something like that in a weekend.” Which is sort of the rallying cry of doomed engineers who are bad at scoping.And by the time that they figure out why, they have to backtrack significantly. There's a whole bunch of stuff that I have built that people look at and say, “Wow, that's a really great design. What inspired you to do that?” And the absolute honest answer to all of it is simply, “Yeah, I worked in roles for the first time I did it the way you would think I would do it and it didn't go well.” Experience is what you get when you didn't get what you wanted, and this is one of those areas where it tends to manifest in reasonable ways.Steve: Absolutely, absolutely.Corey: So, give me an example here, if you don't mind, about how feature flags can improve the day-to-day experience of an engineering team or an engineer themselves. Because we've been down this path enough, in some cases, to know the failure modes, but for folks who haven't been there that's trying to shave a little bit off of their journey of, “I'm going to learn from my own mistakes.” Eh, learn from someone else's. What are the benefits that accrue and are felt immediately?Steve: Yeah. So, we kind of have a policy that the very first commit of any new feature ought to be the feature flag. 
That's that sort of on-off switch that you want to put there so that you can start to deploy your code and not have a long-lived branch in your source code. But you can have your code there, it reads whether that configuration is on or off. You start with it off.And so, it really helps just while developing these things about keeping your branches short. And you can push the mainline, as long as the feature flag is off and the feature is hidden to production, which is great. So, that helps with the mess of doing big code merges. The other part is around the launch of a feature.So, you talked about Andy Jassy being on stage to launch a new feature. Sort of the old way of doing this, Corey, was that you would need to look at your pipelines and see how long it might take for you to push out your code with any sort of code change in it. And let's say that was an hour-and-a-half process and let's say your CEO is on stage at eight o'clock on a Friday. And as much as you like to say it, “Oh, I'm never pushing out code on a Friday,” sometimes you have to. The old way—Corey: Yeah, that week, yes you are, whether you want to or not.Steve: [laugh]. Exactly, exactly. The old way was this idea that I'm going to time my release, and it takes an hour-and-a-half; I'm going to push it out, and I'll do my best, but hopefully, when the CEO raises her arm or his arm up and points to a screen that everything's lit up. Well, let's say you're doing that and something goes wrong and you have to start over again. Well, oh, my goodness, we're 15 minutes behind, can you accelerate things? 
And then you start to pull away some of these blockers to accelerate your pipeline or you start editing it right in the console of your application, which is generally not a good idea right before a really big launch.So, the new way is, I'm going to have that code already out there on a Wednesday [laugh] before this big thing on a Friday, but it's hidden behind this feature flag, I've already turned it on and off for internals, and it's just waiting there. And so, then when the CEO points to the big screen, you can just flip that one small little configuration change—and that can be almost instantaneous—and people can access it. So, that just reduces the amount of stress, reduces the amount of risk in pushing out your code.Another thing is—we've heard this from customers—customers are increasing the number of deploys that they can do per week by a very large percentage because they're deploying with confidence. They know that I can push out this code and it's off by default, then I can turn it on whenever I feel like it, and then I can turn it off if something goes wrong. So, if you're into CI/CD, you can actually just move a lot faster with a number of pushes to production each week, which again, I think really helps engineers on their day-to-day lives. The final thing I'm going to talk about is that let's say you did push out something, and for whatever reason, that following weekend, something's going wrong. The old way was oop, you're going to get a page, I'm going to have to get on my computer and go and debug things and fix things, and then push out a new code change.And this could be late on a Saturday evening when you're out with friends. If there's a feature flag there that can turn it off and if this feature is not critical to the operation of your product, you can actually just go in and flip that feature flag off until the next morning or maybe even Monday morning. 
So, in theory, you kind of get your free time back when you are implementing feature flags. So, I think those are the big benefits for engineers in using feature flags.Corey: And the best way to figure out whether someone is speaking from a position of experience or is simply a raving zealot when they're in a position where they are incentivized to advocate for a particular way of doing things or a particular product, as—let's be clear—you are in that position, is to ask a form of the following question. Let's turn it around for a second. In what scenarios would you absolutely not want to use feature flags? What problems arise? When do you take a look at a situation and say, “Oh, yeah, feature flags will make things worse, instead of better. Don't do it.”Steve: I'm not sure I wouldn't necessarily don't do it—maybe I am that zealot—but you got to do it carefully.Corey: [laugh].Steve: You really got to do things carefully because as I said before, flipping on a feature flag for everybody is similar to pushing out untested code to production. So, you want to do that in a measured way. So, you need to make sure that you do a couple of things. One, there should be some way to measure what the system behavior is for a small set of users with that feature flag flipped to on first. And it could be some canaries that you're using for that.You can also—there's other mechanisms you can do that to: set up cohorts and beta testers and those kinds of things. But I would say the gradual rollout and the targeted rollout of a feature flag is critical. You know, again, it sounds easy, “I'll just turn it on later,” but you ideally don't want to do that. The second thing you want to do is, if you can, is there some sort of validation that the feature flag is what you expect? 
So, I was talking about on-off feature flags; there are things, as when I was talking about dynamic configuration, that are things like throttling limits, that you actually want to make sure that you put in some other safeguards that say, “I never want my TPS to go above 1200 and never want to set it below 800,” for whatever reason, for example. Well, you want to have some sort of validation of that data before the feature flag gets pushed out. Inside Amazon, we actually have the policy that every single flag needs to have some sort of validation around it so that we don't accidentally fat-finger something out before it goes out there. And we have fat-fingered things.Corey: Typing the wrong thing into a command structure into a tool? “Who would ever do something like that?” He says, remembering times he's taken production down himself, exactly that way.Steve: Exactly, exactly, yeah. And we've done it at Amazon and AWS, for sure. And so yeah, if you have some sort of structure or process to validate that—because oftentimes, what you're doing is you're trying to remediate something in production. Stress levels are high, it is especially easy to fat-finger there. So, that check-and-balance of a validation is important.And then ideally, you have something to automatically roll back whatever change that you made, very quickly. So AppConfig, for example, hooks up to CloudWatch alarms. If an alarm goes off, we're actually going to roll back instantly whatever that feature flag was to its previous state so that you don't even need to really worry about validating against your CloudWatch. It'll just automatically do that against whatever alarms you have.Corey: One of the interesting parts about working at Amazon and seeing things in Amazonian scale is that one in a million events happen thousands of times every second for you folks. What lessons have you learned by deploying feature flags at that kind of scale? 
Because one of my problems and challenges with deploying feature flags myself is that in some cases, we're talking about three to five users a day for some of these things. That's not really enough usage to get insights into various cohort analyses or A/B tests.Steve: Yeah. As I mentioned before, we build these things as features into our product. So, I just talked about the CloudWatch alarms. That wasn't there originally. Originally, you know, if something went wrong, you would observe a CloudWatch alarm and then you decide what to do, and one of those things might be that I'm going to roll back my configuration.So, a lot of the mistakes that we made that caused alarms to go off necessitated us building some automatic mechanisms. And you know, a human being can only react so fast, but an automated system there is going to be able to roll things back very, very quickly. So, that came from some specific mistakes that we had made inside of AWS. The validation that I was talking about as well. We have a couple of ways of validating things.You might want to do a syntactic validation, which really you're validating—as I was saying—the range between 100 and 1000, but you also might want to have sort of a functional validation, or we call it a semantic validation so that you can make sure that, for example, if you're switching to a new database, that you're going to flip over to your new database, you can have a validation there that says, “This database is ready, I can write to this table, it's truly ready for me to switch.” Instead of just updating some config data, you're actually going to be validating that the new target is ready for you. So, those are a couple of things that we've learned from some of the mistakes we made. 
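The safeguards discussed above (a syntactic check on a value's range, a semantic check that a new target is actually ready, and an automatic rollback when an alarm fires) can be sketched together. Everything named here is hypothetical: `tps_in_range`, `database_ready`, `deploy`, and the `alarm_firing` callable are illustrations under stated assumptions, not the AppConfig API, and the 800 to 1200 range simply reuses the example numbers from the conversation.

```python
from typing import Callable, Iterable

Validator = Callable[[dict], None]


def tps_in_range(config: dict) -> None:
    """Syntactic validation: the value must be an integer in an allowed range."""
    tps = config.get("tps_limit")
    if isinstance(tps, bool) or not isinstance(tps, int) or not 800 <= tps <= 1200:
        raise ValueError(f"tps_limit must be an integer from 800 to 1200, got {tps!r}")


def database_ready(can_write: Callable[[str], bool]) -> Validator:
    """Semantic validation: confirm the new target really accepts writes."""
    def check(config: dict) -> None:
        target = config.get("database", "")
        if not can_write(target):
            raise ValueError(f"database {target!r} is not ready for writes")
    return check


def deploy(current: dict, proposed: dict, validators: Iterable[Validator],
           alarm_firing: Callable[[], bool]) -> dict:
    """Return the configuration that should be live after the attempt."""
    for validate in validators:
        validate(proposed)       # a bad value is rejected before it goes out
    if alarm_firing():           # hypothetical health signal checked after the change
        return current           # automatic rollback to the previous state
    return proposed


if __name__ == "__main__":
    current = {"tps_limit": 1000, "database": "orders-v1"}
    proposed = {"tps_limit": 900, "database": "orders-v2"}
    checks = [tps_in_range, database_ready(lambda name: name == "orders-v2")]

    print(deploy(current, proposed, checks, alarm_firing=lambda: False))  # proposed goes live
    print(deploy(current, proposed, checks, alarm_firing=lambda: True))   # rolled back
    try:
        deploy(current, {"tps_limit": 12000, "database": "orders-v2"}, checks, lambda: False)
    except ValueError as err:
        print("rejected:", err)  # the fat-fingered extra zero never ships
```

Keeping the validators separate from the rollback path mirrors the two lessons in the interview: validation stops a bad value before it ships, while the alarm-driven rollback covers changes that pass validation but still misbehave.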
And again, not saying we aren't making mistakes still, but we always look at these things inside of AWS and figure out how we can benefit from them and how our customers, more importantly, can benefit from these mistakes.Corey: I would say that I agree. I think that you have threaded the needle of not talking smack about your own product, while also presenting it as not the global panacea that everyone should roll out, willy-nilly. That's a good balance to strike. And frankly, I'd also say it's probably a good point to park the episode. If people want to learn more about AppConfig, how you view these challenges, or even potentially want to get started using it themselves, what should they do?Steve: We have an informational page at go.aws/awsappconfig. That will tell you the high-level overview. You can search for our documentation and we have a lot of blog posts to help you get started there.Corey: And links to that will, of course, go into the [show notes 00:31:21]. Thank you so much for suffering my slings, arrows, and other assorted nonsense on this. I really appreciate your taking the time.Steve: Corey thank you for the time. It's always a pleasure to talk to you. Really appreciate your insights.Corey: You're too kind. Steve Rice, principal product manager for AWS AppConfig. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment. But before you do, just try clearing your cookies and downloading the episode again. You might be in the 3% cohort for an A/B test, and you [want to 00:32:01] listen to the good one instead.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. 
The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.

AWS Morning Brief
Naming Things Accurately

Sep 15, 2022 5:02


Links: Nick Frichette wrote an incredibly handy guide on the ordered steps to take to avoid CloudFront or DNS domain takeovers on AWS. This handy walkthrough talks about how to configure something that shrieks its head off whenever someone logs into AWS via the root account. The Center for Internet Security just released an update to the AWS version of their security benchmarks, and this approachable post goes through what's new. Introducing message data protection for Amazon SNS - This is a bit hard to wrap my head around--then Scott Piper nailed it with "it's Macie for SNS" and now I'm wondering what the point of me even is. I've talked about Parliament before--it's an AWS IAM linting library. Version 1.6.0 just dropped. I'll be in the DC area next week; come by Highline at 7PM and let me buy you a drink / swap stories if you're around.

AWS Morning Brief
Mobile Authentication to AWS is Hard

Sep 8, 2022 5:42


Links: 1Password frankly got it wrong with their assertion that you shouldn't bother with MFA for 1Password itself. Nick Frichette has a handy guide on the ordered steps to take to avoid CloudFront or DNS domain takeovers on AWS. Over 1,000 iOS apps found exposing hardcoded AWS credentials. Chris Farris has a great post covering how to handle Incident Response in AWS. Announcing new AWS IAM Identity Center APIs to manage users and groups at scale. How to subscribe to the new Security Hub Announcements topic for Amazon SNS. This week's tool is an open source dingus that lets you use TouchID on supported Macs to authenticate sudo on macOS.

The Swyx Mixtape
[Weekend Drop] AWS, Cloudflare, and Techbro Therapy on AWS.fm

Aug 27, 2022 51:25


Listen to AWS.fm: https://aws.fm/episodes/episode-25-shawn-swyx-wang

Shawn joins Adam to discuss Amplify and its place in the developer ecosystem, whether we should care about Cloudflare yet, and how to cope with the anxiety that can come with being extremely online. Also, it sounds like Adam is a tech bro and he's NOT happy about it.

Transcript

Adam Elmore: Hey, everyone. Welcome to AWS FM, a podcast with guests from around the AWS community. I'm your host, Adam Elmore. And today, I'm joined by Shawn Swyx Wang. Hi, Shawn.Shawn Wang: Hey, Adam. How's it going?Adam Elmore: It's going well. I've been extremely excited. I've said this on a ton of podcasts, that I'm excited to get on with a guest, but this has been a long time because before I took my break, I was going to get on with you. Took a big, long break, and I've finally got you on. You're somebody, and I'm going to say a lot of things, I'm very dramatic, but you're somebody that I really admire in the online space. You have this ability to think about things, and distill them, and put them out there in a way that I admire greatly. I'm so excited to have you on here. It's going to be hard for me to stay on any one topic because I have just a list of questions I want to ask you, basically.Shawn Wang: [inaudible 00:00:52].Adam Elmore: First, could you tell everyone on this show who you are, just the short version of Shawn?Shawn Wang: Yeah. So I'm Shawn, born and raised in Singapore, went to the States for college and then spent my first career in finance where I did investment banking and hedge funds. Loved the coding part because every junior finance person starts to learn to code, and didn't like the stress of the finance part, so I pivoted to tech where I was a software engineer at Two Sigma and then I was in developer relations at Netlify, AWS, Temporal, and I've just joined Airbyte as head of developer experience.Adam Elmore: Oh, I did not know you weren't still at Temporal. 
So Airbyte, what is Airbyte?Shawn Wang: Airbyte is a data integration company, it basically has the largest community of open-source connectors for connecting to any SaaS API source into your data warehouse. So for anyone doing data engineering, the first task that you have to do is to get data from all the different silos of data in your business. Let's say you have a Salesforce being the source of truth for customers, Stripe being the source of truth for transactions, get all of them into a single data warehouse for you to do operations on. So the goal is to have the largest community of open-source developers for connecting all the data and liberating your data from all the silos that you have in your business.Adam Elmore: And how long ago did you start? How did I miss this?Shawn Wang: A couple weeks ago. I actually have not announced it on Twitter, which is why.Adam Elmore: Oh, there you go.Shawn Wang: I like to slow play it. So when I joined Temporal, I actually waited for six months to really understand Temporal and to practice my pitch before announcing it on Twitter. And that's how I like to do things because, well, partially I want to be fully up to speed before I represent something publicly.Adam Elmore: Yeah. So I want to talk about that. You get very up to speed in a way that I don't see a lot of people on Twitter. I don't see them understand things in the way that you do. So you obviously write, your blog is a huge source of information for me, and I've enjoyed it quite a lot, but it's not just that you write, it's the way you think about things. Does that come from your finance, your analytical background in finance, or were you like that before, your ability to see the whole forest, take in the way things are trending and the way things are moving, put it all together and distill it into these wonderful articles? Where does that come from?Shawn Wang: Oh, so first of all, thanks for the very kind words. 
I don't hear back from my readers that often, so it's really nice when I get to talk to someone like this. So yeah, I would say a lot of this stuff is actually from my finance days. This is the kind of analysis that you would have to do when you do an investment report or investment research on any stock or any industry. You want to get a perspective of what's going on, what the trends are, who the major players are, and form an opinion on where things are going. And I think taking that finance mindset into the bets I have, in terms of technologies, whether or not it's for using them personally in my personal stack or for joining them as a startup employee, I think is extremely underrated. And it's something I'm trying to model and hopefully teach people someday.Shawn Wang: Although I'm not sure about the teaching part, because if I say like, "Get rich by doing investment analysis stock on early stage startups," I would feel like a hustler. So maybe not that, but I just do like engaging in that. And probably it's an exercise for me to think things through clearly by writing it down. And I also get a lot of feedback from that, so I actually improve and learn a lot by learning in public. And that's the other thing that I am pretty well known for, so this is the application of the general purpose learning in public principle.Adam Elmore: Yeah. No, and I love your learning in public article. I hope more people see how you break down systems and the world around us and distill it. I hope more people do that because I'd love to have more sources of that kind of information. It's really fascinating and that's a lot of what I want to talk about today is your opinions on the future and where certain things are headed. First, I want to talk, you did work at AWS. How long were you at AWS?Shawn Wang: A year. AWS Amplify.Adam Elmore: Yeah. 
So I'd love to know, I guess what it was like working at AWS, what you took from that, but also more broadly, I want to get into Amplify and where it fits. You sort of live in that intersection. I feel like web, and cloud, and infrastructure, where things are trending, and I want to talk Amplify's place in that, but first, what was your role there like at AWS, at Amplify?Shawn Wang: Yeah, I was a senior dev advocate at Amplify, basically doing demos and talks for Amplify. And the fun thing about working at Amplify is that you are essentially also a developer advocate for all the underlying services. So Amplify is essentially a roll-up of DynamoDB, API Gateway, AWS AppSync, even file storage like S3. You could do some demos with that. And I did, I made like a DIY Dropbox clone. But its focus is on front-end engineers. And I think that was the first time that AWS had ever made a dedicated arm or products for front-end engineers. And it turned out to be a really good bet because AWS Amplify was one of the fastest growing AWS services, at least during the time that I was there. So I thought it was just really compelling to try it out and obviously everyone has very high regard for AWS. There's a bunch of services that I only experienced on the inside and I only learned about once I got on the inside, and I thought that was really interesting as well.Shawn Wang: A few things I'll point out. I really loved the AWS interview process, actually. I felt like it was very rigorous and I definitely haven't had as rigorous a process anywhere else. And they really got a good look at every single part of me before they made the decision. And fortunately for me, it was a unanimous, good decision, but I felt challenged. I felt like there was a lot of growth that I took away from that process as well. 
So I highly recommend going through it, even if you don't necessarily take the job.Shawn Wang: And once you're in, I think the other practice I really like was the weekly business reviews. Not everyone gets to be a part of, but I was, and essentially you have a P&L from the central AWS finance team that week to week tells you how well you're doing or not. And the PMs in particular, they'll put up highlights, they bring up topics of discussion, and the general manager would be grilling people on. And I thought that was just a fun way to run a business. It was a little bit stressful, sometimes a little bit dramatic, but hey, it forced you to take on the issues head on instead of ignoring them for three months to a year, which I've also seen happen.Shawn Wang: So I just really appreciated that directness, and everything that you've heard about on the outside about AWS culture applies, like they'll send out the memo and the first 10 minutes of the meeting will be spent in complete silence where you just read the memo.Adam Elmore: Just read the memo. Yeah, that's real. Well, what about the leadership principles? You talked about interviewing there. Did you feel like you started to embody those? Did those really become something you valued or was it sort of like, you're just doing it because that's what Amazon cares about?Shawn Wang: There are a few things here. So I think one, people are drawn to Amazon because of leadership principles, like literally is what the interview is for. So you can't really join without already having them ingrained in you. And then second, yes, it gets brought up a lot when decisions are being made or just behaviors being modeled or discussed, especially in the performance review stuff. So I think that is useful, that is helpful, but at the same time I have problems with some of the LPs myself. "Be right a lot." What the hell is that?Adam Elmore: So what is right?Shawn Wang: Yes, exactly. What is right, what is a lot? 
So I think that, for example, what is underdiscussed or just not on the table, just because it comes from so much up high and has so much baggage and history with it, is that sometimes you have to try to be wrong, to take more risks. And being right a lot means that you might be more conservative than you otherwise should be. It leads to very incrementalist thinking, which is like, "All right, what is the most obvious next step? What is the low-hanging fruit? What is the short thing?" You just pick that over something that is more risky, but potentially has higher impact.Adam Elmore: Yeah. No, that makes sense. I want to shift gears a little bit and talk about Amplify. Now that you're outside of AWS, you mentioned it was sort of the first example of AWS trying to go to the front-end developer and bundle up more of a developer experience. How do you feel? And you may have information from being there about traction and things like that. How do you feel about Amplify's return on investment and is Amazon doing a good job, I guess, with Amplify in terms of trying to package up their own experience? Do you see that resonating with developers?Shawn Wang: So I think Amazon is doing a good enough job at addressing the needs of AWS customers. And that's something that is Prime first and foremost, like excels at that. Amplify could be doing a lot better at competing with the other standalone front-end developer focused startups that are out there that don't have the AWS infrastructure, which should help, but actually sometimes hurts it a little bit. So my favorite example of this is, so there's another company Begin, begin.com with Brian LeRoux. It's a four-person company, and they also do very similar things. They deploy on top of Amazon, they are entirely serverless, they have a smaller set of offerings that they have, but their deploy speeds are an order of magnitude faster than Amplify's. 
They can deploy faster to AWS than Amplify can.Shawn Wang: And that's because Amplify doesn't do some of the trickery that they do, like having a cold pool ready or anything like that. When people are not married to the AWS stack, just because that's the solution, that's the technology provider or cloud that their company has picked. When you have free choice, then you come with no baggage and just being from AWS doesn't give you any home ground advantage anymore. Therefore, you have to really, really, really compete on developer experience. And that's something that Amplify still needed to work on at the time that I left.Adam Elmore: Yeah. I'm glad you brought up Begin too. I'm curious how it fits into the landscape. I've seen you mention Begin within some of your articles, like the cloud distros article I think about, I want to talk about that, but how is Begin doing? I interact with Brian on Twitter, I generally like him a lot, I like what they're building, but it is sort of a thing you have to buy into. It's like a whole different way of building applications. Do you have any sense for how they fit as a player in all of this?Shawn Wang: They're tiny. I mean, they're not a rocket ship by any means, but they absolutely solve the problem for the serverless full stack minimalist aesthetic that they're going for.Adam Elmore: Those are all things I like, so.Shawn Wang: Right down to the API calls, having an inbuilt authentication solution that when you write the serverless function, you just have the user ID and it's all done for you with cookies in the background. That's just beautiful, that's [inaudible 00:12:58] mess with Cognito or anything like that. Because it's very straightforward, that is the way that I would want to build serverless applications. 
If I didn't have some kind of big enterprise thing requirement, which maybe it's a premature optimization to try to glom that on in the first place, which is what you're required to do with AWS Amplify.Shawn Wang: So I don't think I have enough experience to really judge, are they the right technical choice in all aspects? But I think there's just a certain aesthetic that you try to optimize for. And if you have full stack needs, if you like serverless, if you like one of everything, essentially one storage solution, one queuing solution, one database solution, then Begin is the right curation for you. And then Amplify is sort of the more fully loaded solution if you want an easy way to access, let's say API Gateway, even like the... Actually just before I left, they actually launched support for serverless containers with AWS Fargate, which is also super interesting.Adam Elmore: Oh, I didn't even know Amplify supported that.Shawn Wang: Yeah, exactly. They're just different trade-offs in the spectrum, like Begin is way more opinionated than Amplify. Amplify is way more opinionated than the full set of AWS services that are possibly out there. I think they serve front-end developers well in all different respects. Yeah. I think Amplify is definitely hitting its goals and probably exceeding its goals for adoption internally. Begin could do a better job at marketing and something that I should probably try to help them on just because I'm a friend of the company and so, I mean, I just really like the philosophy, but at the same time, there are other competitors out there, like Cloudflare Workers is essentially trying to become a Jamstack or a backend-as-a-service platform, because they have Workers KV and Durable Objects. And that's a very compelling solution for a particular type of audience.Shawn Wang: And it's weird because you have to be much more specific now. 
Like that's the thing, you have to figure out which part of the population you are in, in order to figure out which provider is best for you. There's no such thing as one provider fits all. It's really about like, "Okay, do you like the minimalist approach? Go with Begin. Do you like the edge-first approach? Maybe go with Cloudflare. Do you like the little bit more full stack, scalable, cloudy service? Maybe go with Amplify." There's a lot there. Like, "Do you like to self-host containers? Maybe go with Fly.io or Render.com." There's just a lot of options out there, but all of them happened to be built on top of AWS, which is why we had the cloud distros thesis.Adam Elmore: Yeah. And I've consumed a lot of your content on that front, like hosted back ends. I do wonder where it's all headed. Maybe the answer is that there's just going to be a lot of options, and because there's a lot of different use cases, I guess maybe narrowing it down. Like if I really don't care about enterprise stuff or big teams, if I just care about building stuff with small teams, startups, that's where I live. Do you have any predictions, I guess, for where ideal product building is headed? Is it hosted back ends to go with your hosted front ends on Vercel or whatever else? Is it learning AWS primitives and just getting good at building stuff? How do you see that forecasting into the future?Shawn Wang: What's the alternative to hosted back ends?Adam Elmore: I guess what I do right now is build... Like I kind of use all the Amplify services, I just don't use Amplify. So I build a lot of bespoke APIs with AppSync, and Dynamo, and whatever.Shawn Wang: So because you have that knowledge, that's the best thing for you, because you already have that knowledge. Like it's not a big deal for you to spin up another service, but for others it would be, because they would be new to that and sometimes a more friendly layer that abstracts it away for them would be helpful. 
So it's really hard to say which is going to win just because they're all going to win in some way, but some will be more winning than others. That's kind of how I view it.Adam Elmore: Yeah. Yeah.Shawn Wang: Because at the end of the day, like cloud is such a big deal, it's such a multi decade thing. It's going to take the rest of our lives to play out. That means that the vast majority of users of cloud haven't adopted it yet, still. This late into the game, they still haven't adopted it yet.Adam Elmore: It's so hard for me to wrap my brain around. It seems like it's been so long. And when you say the rest of our lives, I don't put it in that kind of perspective. I need to calm down trying to figure out what's going to happen in the next three years. Like it doesn't matter.Shawn Wang: Yeah. Yeah. Lambda is like seven years old. This is so early. The way that this looks 40, 50 years from now is going to be so different. AWS has like a million-something customers, imagine it having 10 million. When you have order of magnitude, when we start to think in terms of orders of magnitude, you start to really sweat the small details a lot less because you're like, "Whatever. Everyone's going to win."Adam Elmore: We all win. Yeah, I guess it's true. I don't know if you've talked about this, I'm sure you've thought about it, and maybe you have written about this, but it's the idea of scarcity versus abundance mentality, I guess. It's weird because all at the same time, I agree with the sentiment that if you're on Twitter or you're very online or whatever, you should have this mentality that we can all lift each other up and we can all succeed. But then on the other hand, you've got the climate and how much can the earth sustain in terms of everything can only grow so much. I just had that thought, that sort of raw stream of consciousness. So I don't know if you've got any refined response to that. 
Is that sort of totally different concepts that I shouldn't conflate?Shawn Wang: What, the limits to growth thesis?Adam Elmore: Oh, yeah. I guess that's what it's called. See, I knew you'd have a name for it or something. Like the idea that we can all succeed, but at the same time, we all need to do a lot less because the planet can't succeed if we all...Shawn Wang: I mean, this is about the offline-online shift. So we can still do a lot less and cloud can still grow because the mix of what we do in-cloud versus off-cloud is still very much imbalanced. So when you do things like pay attention to an Andy Jassy keynote, and he'll talk about like, "Oh, cloud penetration is whatever, 20%, 30%." That is how low it is and it still takes a long time for people to adopt for whatever reason, institutional or just generational, or maybe our technology's not there yet. There's still a lot that needs to be developed to serve all kinds of markets that it hasn't penetrated. My favorite stat was that online shopping went from 10% to 20% in COVID.Adam Elmore: I can't believe it's only 20%. That's actually...Shawn Wang: Exactly, right?Adam Elmore: That's bonkers.Shawn Wang: So there's some version of the future where that is 70%, which means that you still have a long, long, long, long, long way to grow for every part of e-commerce and the planet can still win by maybe more efficient sorting or less retail outlets. I don't know. I don't know about that. I think I'm on much shakier ground there, but yeah, the offline-to-online transition, I think it is a very positive thing for the planet, especially because a lot of the major clouds are committing to net zero carbon footprints. I'm not sure if AWS has actually done that yet, but definitely Microsoft and Google have done it, which means AWS will eventually do it.Adam Elmore: And I know AWS, they've launched sustainability insights and stuff recently, where you can start to see the emissions impact of the services you're spinning up. 
I know Google's done that for some time, but AWS is now doing that, I think.Shawn Wang: Right. But we're actually measuring it now versus not measuring it before, so whatever. This is peanuts compared to like, "All right, are we moving to electric vehicles or something?" That is way more of an interesting concern than this stuff. Like invent a better battery and that will drastically accelerate the move to solar, and that will be much more meaningful than choosing paper straws. Sweating over the carbon footprint of your EC2 instance is the developer equivalent of choosing a paper straw. Really, look, I appreciate the effort, the spirit's, the heart's in the right place, but really if you want to make an impact, go work in the big things.Adam Elmore: I'm glad you said that because this is not on my notes, this is not something I planned to talk about, but this is the thing that I feel like to make an impact, I've really struggled, I'm 15 years into my career, I've been like a software engineer mostly early in my career, then I did a startup, and then I've mostly just been doing consulting. I feel like there are more possible things I could do with my time than ever. And it's so hard for me to decide what is worth spending time on.Adam Elmore: And I guess, do you have any thoughts on senior engineers, when you get to a point in your career where you have more flexibility and more opportunities, what is the most impactful thing? I've thought about making courses, I've thought about building products and just continuing with consulting. Is there a way to split your time that you're ever going to feel good about?Shawn Wang: Probably not.Adam Elmore: Okay. It's good to know. I can stop trying to find it.Shawn Wang: Yeah. The menu options is so high. I think just figure out what gives you energy and then try to spend more of your time and day on that than stuff that takes away energy from you, so it was just a very hippie thing for me to say.Adam Elmore: Yeah. 
No, that seems much simpler than I'm making it.Shawn Wang: There's a concept here that I do like to share about leverage. There's an inherent tension between productivity and leverage. I think we are trained from basically our days in school, that high productivity is the goal, which is you want to have a packed calendar, you want to be doing eight different things at once. You should feel bad if your efficiency went down 10% compared to last week or whatever, and you're not meeting your OKRs or whatever. And the exact opposite to that is leverage where you want to have one thing, you want to do one thing and just have a lot of impacts come out of that.Shawn Wang: And I think there's a movement, at least in VC circles, but also in sort of tech bro circles of waking up to the idea of slack in your life, and having peace and not having so much going on, and just doing high leverage activities that help you extend your reach without you necessarily putting more hours in or being super productive. Like being unproductive is fantastic. It's actually people who cannot figure out leverage who have to try to be productive. If you can figure out leverage, then productivity doesn't matter at all.Adam Elmore: Yeah. No, that's good stuff. I think I intuitively knew that. I just have a really hard time. I feel like I'm much more seeing the tree versus the forest, so I really appreciate talking with people like you that see the broader picture. I think I have a lot of thoughts and then I read an article of yours and it helps me put words to those thoughts that I couldn't really formalize in my head.Shawn Wang: I should really write about this more, but I feel like I haven't got it yet. You see me out there, you see me doing all sorts of random crap. So I haven't internalized it fully. I haven't let go of the sort of productivity mantra. Part of that is me being very risk-averse, part of that is me doubting myself. 
Definitely, the stuff that you see from me has extremely high leverage. I think, okay... The other thing is I also have second thoughts or doubts about this whole leverage thing, that's why I have a very divisive tone about VCs and tech bros, because everyone wants to be high leverage, everyone wants to do the 80-20. Nobody wants to ship stuff, they just want to tweet thoughts, and then they think they're done. Right?Adam Elmore: Yeah.Shawn Wang: That's what they think high leverage is. But really the people who get shit done, sweat the fine details and take things to the finish line. And guess what? Doing that last 10% is super low leverage. Like, "Oh man, I got to fix this stupid SEO description or the OG image isn't right, let me go fix that." That kind of small little details matter for the quality of the products and for shipping things, but all the high-leverage people feel like they're above that because it's not a good use of time.Adam Elmore: So are they the high-leverage people or you're saying the people that want to be high leverage, is that the VCs and the tech bros?Shawn Wang: Yeah, exactly.Adam Elmore: What is tech bro? I feel like I probably am a tech bro, and I don't want to be a tech bro, but I feel like I'm a white male that has a podcast, so I can't escape it.Shawn Wang: Yeah. Yeah. I'm a tech bro guy. I'm sort of reluctantly in that demographic. Yeah, the tech bro is a bro that's in tech.Adam Elmore: Okay. Yeah. Well.Shawn Wang: That is fully aware. Okay. I do like to have this mis-metric. If you're fully up to speed on the latest news, the gossip, you know all the new launches and new products, you're definitely a tech bro.Adam Elmore: Okay. Okay.Shawn Wang: If nothing surprises you, you're a tech bro. If you know what AUM is, if you know what ARR is, if you know all these acronyms without even blinking, you're a tech bro. Well, the real people who get shit done out there are wonderfully blissfully ignorant. 
They'll be like, "What is this whole Twitter kerfuffle, what's going on? I don't know. I just completely stayed out of the loop." But you being a tech bro, you would know the blow by blow of like Elon did this, Twitter did that, Elon did other thing, Twitter did other thing. It doesn't matter, the stuff doesn't matter to some extent and tech bros are so involved in their own filter bubble that they don't see their own forest for the trees, so.Adam Elmore: You said Twitter. I think I've been on Twitter actively for a year or so and I don't know that I'm better for it. I don't know that like... I know that I'm very influenced by that sphere and sort of feeling like, I think that's why it's so surprising to me when I hear about cloud adoption or I hear about online shopping. It just seems like everyone lives in this little community and it's very easy to just not really remember the people that are actually around me in my local community and what life is actually like. Is there a way to balance it? Is there a way to balance being very online, being a member of this Twitter community and still keep a good grasp on the real world?Shawn Wang: I don't think I personally have figured that out a lot, but I think it's basically the developer equivalent of go touch grass, which is go outside.Adam Elmore: Yeah, yeah, yeah.Shawn Wang: Have hobbies, have kids.Adam Elmore: That I was going to say, I've got two boys and they make me be outside a whole lot, so that probably helps, I guess, somewhat.Shawn Wang: Yeah, yeah, yeah.Adam Elmore: I think the biggest thing for me just career and in terms of the always online, the tech broness, I think giving my wife the opportunity to set some boundaries around the time that I am working, I think this stage of my career, I've been able to say I'm going to work less and just seeing her role and what her life looks like and realizing how it shouldn't be this different. 
Like we shouldn't have such a, I don't know, huge chasm in terms of our daily life. Like I get to go enjoy what I do all day. Yeah, that's helped. We've carved out a lot of time that's like, "This is time for family." I think yeah, but my online, my work life feels very homogenous, I guess. And it could be better.Shawn Wang: For me, it's like, "All right, figure out what is probably going to make your money and focus all your attention on that. Ignore everything else. Try to stick to, okay, what can you reasonably explain to your non-technical relatives? If you can't really justify it to them, then maybe have a second thought about like, 'All right, what am I really doing here?' Am I really making the world a better place by inventing a better form of infrastructure as code? Probably not." Unless you become a billionaire by creating HashiCorp, right?Adam Elmore: Yeah, I guess it happens in that very rare instance. Yeah.Shawn Wang: Right. But it can happen. You just have to be super clear on what you're trying to do here. And just like, yeah, be super intellectually honest about like, "Look, you're in this for the money, whatever you work on is probably going to be irrelevant in 10 years anyway. It doesn't matter, but you're at least going to have fun, you're going to build some relationships, you're going to make some people happy, create some jobs, whatever, and then spend the rest of your time with family and friends."Adam Elmore: That was a very succinct way of wrapping up a lot of the things I needed answered. So I don't know if anyone that listens to this podcast cares about any of this. I really appreciate the conversation we just had.Shawn Wang: No, no. I think yeah, this is very real and I really appreciate you bringing it up, because I don't get a lot of chance to talk about this.Adam Elmore: Yeah. No, I live in the Ozarks, so tech literacy here is super low. 
I think that's where getting into the Twitter community, it was like, "I have friends now that I can talk to about technology and things I care about." But yeah, finding that balance. I think it's really very practical of you, very wise of you to point out that ultimately this stuff doesn't necessarily matter in a decade, that whatever I think I'm working on that's so important is probably more about the people, more about what I'm kind of enjoying the process along the way and that it's making a living and that we're moving a little bit forward whatever parts we touch and what other people we can be involved with. That was very nice for me to hear.Shawn Wang: I will point out one thing. So humanity is kind of moving onto this metaverse. If there's anything that's actually real about the metaverse is that you have your community online that is dissociated from your physical community. You're so into AWS, or cloud, or anything like that, and no one else around you physically is, and it's fine. And this is something that actually the crypto bros, they probably got right. So I think Balaji Srinivasan, who is one of the crypto investors at Andreessen Horowitz, he released this book recently about building a digital nation, which is really compelling, which is like, essentially there's the world of physical nations, like the countries that have boundaries, but then there's the digital nations, which are formed online, and you're a member of the digital nation of probably tech Twitter, whatever.Adam Elmore: Yeah, yeah.Shawn Wang: Or AWS Twitter. 
And I kind of liken it to the difference between friends being the family that you choose versus the family that you have is the one that you're born with.Adam Elmore: Yeah, yeah, yeah, yeah.Shawn Wang: So where you're physically located is just the nation that you're born with or the nation that you have to live in for your family reasons, but the one that you do online, that's the nation that you choose, so you're member of a different nation online. And that nation is global, it's ephemeral, it's virtual, whatever that is. But it's something that you prefer to spend your time in as compared to your physical nation.Adam Elmore: Yeah. So I feel like since getting really active in Twitter and being involved with the AWS community, even outside of Twitter, it is so global. It's helped me see the perspective of America, where I live, so differently. Just getting all those other points of view and just knowing that when I interact with someone, it's not this base assumption that they understand the world through the lens of America like I do. I very much appreciate that. I feel like I'm, if anything, becoming more and more dissociated with the country I physically live in, because I just don't interact much with people outside of these walls. I don't know if it was COVID and being in all the time. I always have been kind of an at-home person.Shawn Wang: So that is dangerous. Right? That is dangerous.Adam Elmore: Yeah. It feels dangerous. Yeah, tell me why.Shawn Wang: Well, because if you don't care about the physical environment that you're in, then it's going to degrade, it's going to diverge away from your preference.Adam Elmore: Yeah.Shawn Wang: I don't know if that's inherently bad to me. Like there's definitely a physical element to humanity that we should keep around. We are not just brains plugged into the matrix. Essentially this leads to the matrix, that we might also just be plugged into something virtual online and spend zero time on a physical environment. 
Most people would not like to live that way, and that means we should care about what's going on around us. And we should try to have some physical presence that we're actually proud of and enjoy. And I think that there's a tension there that I think is sort of the modern humanistic existentialism, which is like, "How much of my life should I spend online versus how much should I spend in person?" And the fact that you have to choose is just nuts.Adam Elmore: Yeah. And I think my problem, like if I'm just being honest with myself and just thinking through this, I spend about as much time, I think, in the real world, but it's just with my family, at home, it's with my neighbor, I got a neighbor that I go for walks every week with. It's like my very, very hyper local community. But what's going on in the City of Nixa? It's like 10,000 people where I live. What's the local government doing? I don't know. I have no idea. What's the State of Missouri doing? Probably stuff I don't like.Shawn Wang: Exactly. And look, this has a very real impact on us because these people are making the laws that we have to follow. And we don't have a voice because we choose not to have a voice because we choose to not care. But hey, is it really our fault when the Supreme Court or the Congress makes a law that we don't like? Well, yeah. I mean, what did you expect? You didn't spend any time investing in that part of the world. It's like, "When are we going to have a software engineer in Congress?" That's really the big question.Adam Elmore: Yeah. There's not a lot of tech representation, is there? In government in the United States.Shawn Wang: No, because everyone hates politics, they love to dunk on it, they don't want to do a thing about it, but that's kind of the problem. I don't care which side of the bench you're on, like just the politicalness because you feel like you're not a member of the physical nation, you're a member of the digital nation. 
That is a problem for the physical nation, because at the end of the day, that's basically a reality.

Adam Elmore: Yeah. Oh, I think of that, there was that Netflix documentary. I don't even know if it was just on Netflix, but there was that social. Well, I don't even remember what it was called, it was about social media and had all these people from Facebook and other places, or ex-Facebook, talking about just this impact that the very online nature of our generation, what it's doing to our brains and all that. This all sort of ties in my mind. Like I definitely need to do some more things that are yeah, going to impact my life, my kids' lives, sort of being more involved, I guess, outside of... Like I divide my time into I'm at work and I'm on a computer all day or I'm with my family and we're out in the yard playing. It's those two things. And I make no time for anything else, but that's probably not good. Not a good, long-term solution.

Adam Elmore: Okay. Now I'm getting way off the rails. AWS FM, people literally listen to this for some good AWS bits. They've tuned out long ago. I do have a couple more questions here, getting back to like I'm a developer, I like building full-stack web applications and I happen to like leveraging AWS. I'm going to ask you a few things. When should I care about CloudFlare? They announce all this cool stuff and it really is genuinely cool sounding, but there's so much of a barrier to adoption, like for me to change my day to day and start using a new thing. When should I care about CloudFlare?

Shawn Wang: I have an article on this, about how CloudFlare is playing Go while AWS plays chess, so I highly recommend reading up on that. Essentially, CloudFlare is a really good CDN. AWS has its own.
I would think you can do up comparisons of CloudFront and CloudFlare all day long, but I would say that CloudFlare probably has much more of a security focus than CloudFront has, and that by default wins you the majority of the business and it happens to be very easily adoptable because you just need to configure some DNS. "Just" is carrying a lot of weight there when it comes to DNS.

Adam Elmore: If you're asking someone in the Ozarks around me, then what's DNS, first of all?

Shawn Wang: So I think it basically starts from the outside in. You want to think about CloudFlare, you think about where your user's traffic is coming in. Maybe you want to protect those with CloudFlare and then you want to come in a little bit. CloudFlare has this S3 wrapper called R2, that basically reduces a lot of your outgoing bandwidth costs. And that seems like basically a Pareto optimal win. Pareto being you're no worse off in any dimension and you're better off in one dimension, which is cost. And that's just a function of CloudFlare.

Shawn Wang: Like how many points of presence does AWS have? I think in the hundreds, maybe 100, 150, something like that. CloudFlare has tens of thousands, right?

Adam Elmore: Oh, okay.

Shawn Wang: It's just a much better edge network than AWS has. And so they just have a fundamentally different business model. And I think once you understand that from a fundamental physics and points of presence perspective, then you're understanding, "Okay, this is what I'm getting that AWS doesn't do." It's not a straight up one-to-one competitor, it's trying to tackle the cloud problem from a different way.

Shawn Wang: So you do the cloud traffic protection, then you do the sort of egress charges, which are sort of the main sticking point of AWS. Then you get into the extra stuff that CloudFlare offers for application builders. And I focus on this because I'm an application builder.
CloudFlare has other offerings for security and networking that I have no idea about, particularly if you need to wire a building or an office; they have a box that's pretty sweet, from everything I've heard. CloudFlare One is the name of it if you want to Google it.

Adam Elmore: Okay. Yeah, I do.

Shawn Wang: But for application developers, CloudFlare Workers, that team is the sort of primary team that's working on that. And that is their edge function service, which would be a big leap to adopt because they don't run Node.js, they run V8 isolates, which are taken out of the Chrome V8 engine.

Adam Elmore: Is it similar to like Lambda@Edge? Like the same kind of...?

Shawn Wang: No, it is not.

Adam Elmore: Oh, is Lambda@Edge Node?

Shawn Wang: Yes.

Adam Elmore: Oh, it is.

Shawn Wang: Yes.

Adam Elmore: It is. Now, what is it similar to? It's similar to, I guess like Middleware and Next.js, that's that same kind of a limited runtime environment?

Shawn Wang: I think so. Yeah, exactly, exactly. I would say it's more limited than Lambda@Edge and it's got different costs and criteria. Basically, there's just more of the open source ecosystem that will be incompatible with CloudFlare Workers than with Lambda@Edge. And that's the thing that you need to know because you're going to use...

Adam Elmore: CloudFront Functions.

Shawn Wang: Ah, okay. Yeah, that's the one I keep forgetting.

Adam Elmore: I don't know who's using it, but that's what I was thinking of.

Shawn Wang: Right. So I used to use this only for smart redirects, like looking at the headers of a request and saying, "If you're coming in with a header indicating you're from a certain region, certain IPs, certain language, then I'm going to route you to a different location than I would normally." Only for routing, but now Edge Functions are becoming so capable that you might be able to do rendering on demand instead of just routing.
And that actually is unlocking a few new things because on top of that, CloudFlare also has persistence solutions with Workers KV, which is their eventually consistent store, and Workers, and Durable Objects, which is their strongly consistent store. So either one of those combined with the ability to render, means that you can actually just host a site full stack with Front on the Edge. There's no origin server, there's no region, you just have everything everywhere all at once, which is a favorite phrase that I try to sneak in.Adam Elmore: Yeah. That's super compelling.Shawn Wang: So yeah, your latencies go down from like 300 milliseconds to nine, just because you're just pinging near a cell tower or something.Adam Elmore: Yeah, that's incredible. And they've just announced, I don't remember D1 or whatever. I don't know, I can't keep track of their product names, but they have like a distributed SQL offering as well that's coming or...Shawn Wang: SQLite. Yeah.Adam Elmore: Yeah. SQLite at the edge.Shawn Wang: I mean, everything's just built on top, it's just clearly built on top of the original persistence primitive that they have. And so once they got strongly consistent and eventually consistent, those are the two dimensions that you really care about. You can build any sort of solution on that, so the SQLite offering is just built on top of that.Adam Elmore: Yeah. Okay. So I don't know if I'm going to like jump on this stuff yet, but it does sound like there is a world where I could build side projects just on CloudFlare, like stuff runs all at the edge and I don't have to build up, I guess, is the interop, like if I want to still stand up a GraphQL API in AWS, like AppSync or something, is there interoping between the two services? You said their durable storage sits on top of S3, so it's actually, you're using an S3 bucket, you're just wrapping it with a CloudFlare thing?Shawn Wang: It's a proxy.Adam Elmore: Okay. 
Are people building hybrid CloudFlare, oh, I know they are, hybrid CloudFlare and AWS back ends today? I think I know of a couple at least. Is that a thing you recommend?Shawn Wang: I would say yeah, there are. I'd say this is definitely on the cutting edge. You do it because you feel like [inaudible 00:42:35].Adam Elmore: It's like Twitter, where you do it and you talk about it on Twitter and then everyone thinks...Shawn Wang: It's theoretically possible, it's just like probably not in any size.Adam Elmore: Doesn't make sense yet. Okay. So I'm going to say, I don't need to care about CloudFlare yet, that's what I'm going to say based on this conversation. I mean, I'm going to keep reading the articles, but.Shawn Wang: The only thing I'll point out is don't stop there because this is what they've achieved in the past three, four years, they clearly have a roadmap, they clearly are going to keep going, and just eating the cloud from outside in, which is the name of the article. What else of the functionality can be replicated in an-edge-first way? CloudFlare is probably going to do that. And so there's a whole roadmap that just consists of looking at the AWS console and just going, "That first, that first, that first comes [inaudible 00:43:17]."Adam Elmore: Yep. Yep. Yep.Shawn Wang: And then there's a question of just what kind of application are you building and do you really need the full set of AWS services, or can you just start from the edge first? That's how disruption happens. Disruption happens by taking a section on the market that nobody cared about and making that your entire thing, and then making it so capable over time that people see no use to use the old thing, but it takes a course of what, 10, 20 years to do that because AWS has just spent the past 20 years doing that in the first place.Adam Elmore: I just don't keep those time frames in mind. Like Twitter has warped my sense of when things are coming. 
And when you say 10, 20 years, it's like, I don't think about anything that's coming 10, 20 years from now. I think I'm thinking what's coming in the next 18 months.

Shawn Wang: Right. But that's a problem for us, because that short-term mentality stops us from betting on big trends early. And I think to build anything of significance, you have to do it for 10 years.

Adam Elmore: Yeah. I got to get off Twitter, that's what I'm coming to here.

Shawn Wang: I think so. I think I'm going to do it in healthy amounts. So I actually, one of my longstanding wishlist projects is to actually build a Twitter client that has a time limit.

Adam Elmore: Oh, nice. Yes.

Shawn Wang: [inaudible 00:44:25] Client with a time limit. If you're going to have more time, you're going to have to pay to donate to your favorite charity or something.

Adam Elmore: Oh, I love it.

Shawn Wang: And that's in my wishlist.

Adam Elmore: Yeah. I will use it. You've got your first user if you build it.

Shawn Wang: I'll just say the only reason I don't do it is because nobody trusts the Twitter API.

Adam Elmore: So one more, should I care about it yet or not? Because I see Brian LeRoux talk about this quite a bit. Deno. Should I care about Deno yet?

Shawn Wang: I think so. I think it's there. I think it's there. So what is Deno? Deno is sort of the new runtime that the original creator of Node.js is saying, "All right, I'm going to do this over. Node.js has been around for 10 years. I see all the flaws of it, now I'm going to start over from scratch." I was very skeptical of Deno when it first came out, but it's been two years and it's really shown a lot of progress. And I think the governance is right, the funding model was right, and the adoption is growing. What is really compelling to me about Deno, not just from a technical perspective but from a business perspective, which feeds into the technical, is the business side.
There are companies, so Supabase and Netlify, that both launched edge functions powered by Deno, which means that their biggest product-shipping capability announcement of the year of 2022 was someone else's product. It was a startup that's way younger than them, but they just have the right abstraction and the right cloud service that is already functional that they're launching. So it's weird.

Shawn Wang: Deno's go-to-market strategy is just waiting for other people to wake up and go, "I need this. Deno's the only supplier in the market for this. And yeah, let's just bring it on and ship it as our thing." Where it really is Deno's thing, but they're just letting other people white label them. That's fantastic. So I mean, from that perspective alone in the past six months, I've really changed from, like, "Okay, Node and Deno will coexist for the foreseeable future because there's such a huge install base of Node," to, "Every incremental app will probably be built in Deno."

Adam Elmore: Well, that's... Yeah. No, that's what I needed to hear. I think there's a lot of excitement. I see it all, but it's all Twitter, so I needed to hear it face to face that it's worth digging into.

Adam Elmore: One last question. We do have a couple more minutes here. Do you have thoughts on the whole macro venture capital situation and how that might impact the next 5, 10 years? And I don't know if we're entering into some tightening cycle that we've never seen anything like the last 10 years, 13, whatever years, of government injecting so much capital into the system. And if that starts going away, do you have opinions or thoughts on all these startups that are making our lives better? Like I think of DevX startups where I don't know how financially sound they are yet, they've been living off the VC. Do you have thoughts on all that?

Shawn Wang: Not fully formed ones, but I can give you a quick hit.

Adam Elmore: Yeah. Yeah.

Shawn Wang: So how bad did it get?
It got to the point, so the average price-to-sales ratio of a publicly traded company would be in the range of 10 to 50. That's a very wide range, meaning your market capitalization, the total value of a company, is 50 times your sales. In private markets, the price-to-sales ratios of funding rounds, series A and B, and all that, got up to 1,000 times.

Adam Elmore: Oh my God.

Shawn Wang: We had 1,500 at one of the startups that I was at and I heard of one startup that was 2,500.

Adam Elmore: Wow.

Shawn Wang: So that was the peak in November of last year. Those days are gone, people are now asking for 100X, which is like a 10X fall, like very, very big. That's why almost nobody's raising money. So that VC market has dried up, I'll say it has different impact on different stages. And this is all to do with like, "Okay, would you invest in Stripe at 95 billion when Shopify used to be 100 billion and now it's worth 20 billion?" You probably want to buy the more quality asset that's already publicly listed than the very stable asset that is at a high valuation.

Shawn Wang: So this is the deal making has just gone off. Like I think at the seed stage, people are completely unaffected. I think people are cognizant of the fact that economic cycles repeat or like, this is not going to... This is a recession. We are probably already in a recession right now, we are in a tightening cycle right now, but this is probably not one of those that's just going to drag out super long. And startups take 10 years to build anyway, so why should your early stage investing be affected at all by what the current level of the S&P is? It shouldn't.

Adam Elmore: Yeah. No, it's true. I mean, so much of this conversation just echoes your bias towards long term versus short term, and I should have known that coming in.
I'm asking all these questions that are very much like, there's a clear answer if you just think outside of the next year.

Shawn Wang: Oh, I love training people to do that.

Adam Elmore: Yeah. No, it's really nice.

Shawn Wang: Take a long-term perspective in the history and then project it out to the future as well, and try to make decisions on that, so.

Adam Elmore: Yeah, it's sort of refreshing, especially in this sort of anxiety-ridden digital space. I feel like when you zoom out things feel a lot less pressing or anxiety-laden, I guess. I don't know. Yeah, I appreciate that.

Shawn Wang: It's weird because I think that's true, but at the same time, you're only here on this earth for so long. When you zoom out, that actually reduces the available number of decisions that you can possibly make, which means that each decision goes from being a two-way door into a one-way door because you want to make more substantial decisions. Therefore, for example, when I changed jobs, it took me like two months of agonizing to finally land on something, because I could have done any number of things and I think you have to really examine your beliefs as to what the long-term trends are going to be and trade that off versus being happy in the short run.

Adam Elmore: Yeah. I'm going to be trying to do that. I think I'm in the middle of the agonizing stage right now, trying to figure out what's next, but I'm going to try and think a little more long term.

Shawn Wang: The thing I'll point you to, you're talking about courses and stuff like that and leverage, I'll say definitely check out Eric Jorgenson, who is the book writer for Naval Ravikant. He wrote The Almanack of Naval Ravikant, and he's trying to build up a thesis or a body of knowledge around what leverage is and what leverage means.
And then the other thing I'll point you to is Nathan Barry, who's the founder of ConvertKit, who talked about the ladders of wealth creation and how some things are more high leverage than others, so.

Adam Elmore: Thank you so much for that. Again, this podcast may just be for me, but that's okay because I got a lot out of it. Thank you so much for taking the time, Shawn.

Shawn Wang: [inaudible 00:50:58].

Adam Elmore: I didn't know how much I'd get in on my... I think we covered half the things I thought about talking to you about. You're just a wealth of knowledge, you're sort of a wise sage in this community and it's been so great to pick your brain. Thanks for coming on.

Shawn Wang: I think we're the same age.

Adam Elmore: Oh, yeah. Well yeah, you've been using your time better, I guess. You've been doing more high-leverage things or something.

Shawn Wang: Yeah. Thanks for having me around, but we can talk anytime. I really enjoyed this conversation.

Adam Elmore: That sounds good. Thanks, Shawn.
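
Editor's sketch: the header-based "smart redirect" that Shawn describes earlier in this episode (look at a request's headers, then route by region or language) can be illustrated roughly as below. This is a minimal sketch in plain Python, not CloudFlare Workers or Lambda@Edge code. The `x-region` header, the origin table, and the supported-language set are all hypothetical names chosen for illustration; real edge platforms expose geo hints under their own names.

```python
from typing import Mapping

# Hypothetical routing table (region code -> origin); not from the episode.
REGION_ORIGINS = {
    "EU": "https://eu.example.com",
    "AP": "https://ap.example.com",
}
DEFAULT_ORIGIN = "https://www.example.com"

# Hypothetical set of languages that have a localized path prefix.
SUPPORTED_LANGS = {"fr", "de"}


def pick_redirect(headers: Mapping[str, str], path: str) -> str:
    """Return the URL a request should be sent to, based only on its headers."""
    # HTTP header names are case-insensitive, so normalise them before lookup.
    normalised = {name.lower(): value for name, value in headers.items()}

    # "x-region" is a made-up header standing in for a platform's geo hint.
    region = normalised.get("x-region", "").upper()
    origin = REGION_ORIGINS.get(region, DEFAULT_ORIGIN)

    # Take the first tag in Accept-Language, dropping any ";q=" weight
    # and any region subtag ("fr-FR" -> "fr").
    accept = normalised.get("accept-language", "")
    first_tag = accept.split(",")[0].split(";")[0].strip().lower()
    lang = first_tag.split("-")[0]
    prefix = f"/{lang}" if lang in SUPPORTED_LANGS else ""

    return f"{origin}{prefix}{path}"
```

Keeping the decision in a pure function of headers and path means the same logic could be ported to whichever edge runtime is in use, which is the portability concern raised in the conversation.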

SubscribeMe Online Courses, Membership Sites, Content Marketing and Digital Marketing
Part 2: 17 Marketing & Business Secrets You Can Learn From Hollywood - 116

Aug 5, 2022 19:11


In part 2 of this series, I talk about creating a network, building a team, and to practice or not to practice. There is a lot we can learn from Hollywood. Especially about what NOT to do. This episode is about learning Business secrets, Marketing secrets, and even Content Creation secrets, from Hollywood. So let's see the best things we can take away from Hollywood and apply them to Business, Marketing, and Content Creation. Listen to the show for the rest at https://SubscribeMe.fm Or search for "SubscribeMe.fm" in your favorite podcast app. Episode brought to you by https://S3MediaVault.com - Audio & Video Player for Amazon S3 & File Protector for Amazon S3 and CloudFront

Screaming in the Cloud
Google Cloud Run, Satisfaction, and Scalability with Steren Giannini

Jun 23, 2022 37:01


Full Description / Show Notes

  • Steren and Corey talk about how Google Cloud Run got its name (00:49)
  • Corey talks about his experiences using Google Cloud (2:42)
  • Corey and Steren discuss Cloud Run custom domains (10:01)
  • Steren talks about Cloud Run's high developer satisfaction and scalability (15:54)
  • Corey and Steren talk about Cloud Run releases at Google I/O (23:21)
  • Steren discusses the majority of developer and customer interest in Google's cloud product (25:33)
  • Steren talks about his 20% projects around sustainability (29:00)

About Steren

Steren is a Senior Product Manager at Google Cloud. He is part of the serverless team, leading Cloud Run. He is also working on sustainability, leading the Google Cloud Carbon Footprint product. Steren is an engineer from École Centrale (France). Prior to joining Google, he was CTO of a startup building connected objects and multi device solutions.

Links Referenced:

  • Google Cloud Run: https://cloud.run
  • sheets-url-shortener: https://github.com/ahmetb/sheets-url-shortener
  • snark.cloud/run: https://snark.cloud/run
  • Twitter: https://twitter.com/steren

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined today by Steren Giannini, who is a senior product manager at Google Cloud, specifically on something called Google Cloud Run. Steren, thank you for joining me today.

Steren: Thanks for inviting me, Corey.

Corey: So, I want to start at the very beginning of, “Oh, a cloud service. What are we going to call it?” “Well, let's put the word cloud in it.” “Okay, great.
Now, it is cloud, so we have to give it a vague and unassuming name. What does it do?” “It runs things.” “Genius. Let's break and go for work.” Now, it's easy to imagine that you spent all of 30 seconds on a name, but it never works that way. How easy was it to get to Cloud Run as a name for the service?Steren: [laugh]. Such a good question because originally it was not named Cloud Run at all. The original name was Google Serverless Engine. But a few people know that because they've been helping us since the beginning, but originally it was Google Serverless Engine. Nobody liked the name internally, and I think at one point, we wondered, “Hey, can we drop the engine structure and let's just think about the name. And what does this thing do?” “It runs things.”We already have Cloud Build. Well, wouldn't it be great to have Cloud Run to pair with Cloud Build so that after you've built your containers, you can run them? And that's how we ended up with this very simple Cloud Run, which today seems so obvious, but it took us a long time to get to that name, and we actually had a lot of renaming to do because we were about to ship with Google Serverless Engine.Corey: That seems like a very interesting last-minute change because it's not just a find and replace at that point, it's—Steren: No.Corey: —“Well, okay, if we call it Cloud Run, which can also be a verb or a noun, depending, is that going to change the meaning of some sentences?” And just doing a find and replace without a proofread pass as well, well, that's how you wind up with funny things on Twitter.Steren: API endpoints needed to be changed, adding weeks of delays to the launch. That is why we—you know, [laugh] announced in 2018 and publicly launched in 2019.Corey: I've been doing a fair bit of work in cloud for a while, and I wound up going down a very interesting path. 
So, the first native Google Cloud service—not things like WP Engine that ride on top of GCP—but my first native Google Cloud Service was done in service of this podcast, and it is built on Google Cloud Run. I don't think I've told you part of this story yet, but it's one of the reasons I reached out to invite you onto the show. Let me set the stage here with a little bit of backstory that might explain what the hell I'm talking about.As listeners of this show are probably aware, we have sponsors whom we love and adore. In the early days of this show, they would say, “Great, we want to tell people about our product”—which is the point of a sponsorship—“And then send them to a URL.” “Great. What's the URL?” And they would give me something that was three layers deep, then with a bunch of UTM tracking parameters at the end.And it's, “You do realize that no one is going to be sitting there typing all of that into a web browser?” At best, you're going to get three words or so. So, I built myself a URL redirector, snark.cloud. I can wind up redirecting things in there anywhere it needs to go.And for a long time, I did this on top of S3 and then put CloudFront in front of it. And this was all well and good until, you know, things happened in the fullness of time. And now holy crap, I have an operations team involved in things, and maybe I shouldn't be the only person that knows how to work on all of these bits and bobs. So, it was time to come up with something that had a business user-friendly interface that had some level of security, so I don't wind up automatically building out a spam redirect service for anything that wants to, and it needs to be something that's easy to work with. 
So, I went on an exploration.So, at first it showed that there were—like, I have an article out that I've spoken about before that there are, “17 Ways to Run Containers on AWS,” and then I wrote the sequel, “17 More Ways to Run Containers on AWS.” And I'm keeping a list, I'm almost to the third installation of that series, which is awful. So, great. There's got to be some ways to build some URL redirect stuff with an interface that has an admin panel. And I spent three days on this trying a bunch of different things, and some were running on deprecated versions of Node that wouldn't build properly and others were just such complex nonsense things that had got really bad. I was starting to consider something like just paying for Bitly or whatnot and making it someone else's problem.And then I stumbled upon something on GitHub that really was probably one of the formative things that changed my opinion of Google Cloud for the better. And within half an hour of discovering this thing, it was up and running. I did the entire thing, start to finish, from my iPad in a web browser, and it just worked. It was written by—let me make sure I get his name correct; you know, messing up someone's name is a great way to say that we don't care about them—Ahmet Balkan used to work at Google Cloud; now he's over at Twitter. And he has something up on GitHub that is just absolutely phenomenal about this, called sheets-url-shortener.And this is going to sound wild, but stick with me. The interface is simply a Google Sheet, where you have one column that has the shorthand slug—for example, run; if you go to snark.cloud/run, it will redirect to Google Cloud Run's website. And the second column is where you want it to go. The end.And whenever that gets updated, there's of course some caching issues, which means it can take up to five seconds from finishing that before it will actually work across the entire internet. And as best I can tell, that is fundamentally magic. 
But what made it particularly useful and magic, from my perspective, was how easy it was to get up and running. There was none of this oh, but then you have to integrate it with Google Sheets and that's a whole ‘nother team so there's no way you're going to be able to figure that out from our Docs. Go talk to them and then come back in the day.They were the get started, click here to proceed. It just worked. And it really brought back some of the magic of cloud for me in a way that I hadn't seen in quite a while. So, all which is to say, amazing service, I continue to use it for all of these sponsored links, and I am still waiting for you folks to bill me, but it fits comfortably in the free tier because it turns out that I don't have hundreds of thousands of people typing it in every week.Steren: I'm glad it went well. And you know, we measure tasks success for Cloud Run. And we do know that most new users are able to deploy their apps very quickly. And that was the case for you. Just so you know, we've put a lot of effort to make sure it was true, and I'll be glad to tell you more about all that.But for that particular service, yes, I suppose Ahmet—who I really enjoyed working with on Cloud Run, he was really helpful designing Cloud Run with us—has open-sourced this side project. And basically, you might even have clicked on a deploy to Cloud Run button on GitHub, right, to deploy it?Corey: That is exactly what I did and it somehow just worked and—Steren: Exactly.Corey: And it knew, even logging into the Google Cloud Console because it understands who I am because I use Google Docs and things, I'm already logged in. None of this, “Oh, which one of these 85 credential sets is it going to be?” Like certain other clouds. It was, “Oh, wow. Wait, cloud can be easy and fun? 
When did that happen?”
Steren: So, what has happened when you click that deploy to Google Cloud button, basically, the GitHub repository was built into a container with Cloud Build and then was deployed to Cloud Run. And once on Cloud Run, well, hopefully, you have forgotten about it because that's what we do, right? We—give us your code, in a container if you know containers if you don't just—we support, you know, many popular languages, and we know how to build them, so don't worry about that. And then we run it. And as you said, when there is low traffic or no traffic, it scales to zero. When there is low traffic, you're likely going to stay under the generous free tier. And if you have more traffic for, you know, Screaming in the Cloud suddenly becoming a high destination URL redirects, well, Cloud Run will scale the number of instances of this container to be able to handle the load. Cloud Run scales automatically and very well, but only—as always—charging you when you are processing some requests.
Corey: I had to fork and make a couple of changes myself after I wound up doing some testing. The first was to make the entire thing case insensitive, which is—you know, makes obvious sense. And the other was to change the permanent redirect to a temporary redirect because believe it or not, in the fullness of time, sometimes sponsors want to change the landing page in different ways for different campaigns and that's fine by me. I just wanted to make sure people's browser cache didn't remember it into perpetuity. But it was easy enough to run—that was back in the early days of my exploring Go, which I've been doing this quarter—and in the couple of months this thing has been running it has been effectively flawless. It's set it; it's forget it. The only challenges I had with it are it was a little opaque getting a custom domain set up that—which is still in beta, to be clear—and I've heard some horror stories of people saying it got wedged. 
In my case, no, I deployed it and I started refreshing it and suddenly, it started throwing an SSL error. And it's like, “Oh, that's not good, but I'm going to break my own lifestyle here and be patient for ten minutes.” And sure enough, it cleared itself and everything started working. And that was the last time I had to think about any of this. And it just worked.
Steren: So first, Cloud Run is HTTPS only. Why? Because it's 2020, right? It's 2022, but—
Corey: [laugh].
Steren: —it's launched in 2020. And so basically, we have made a decision that let's just not accept HTTP traffic; it's only HTTPS. As a consequence, we need to provision a cert for your custom domain. That is something that can take some time. And as you said, we keep it in beta or in preview because we are not yet satisfied with the experience or even the performance of Cloud Run custom domains, so we are actively working on fixing that with a different approach. So, expect some changes, hopefully, this year.
Corey: I will say it does take a few seconds when people go to a snark.cloud URL for it to finish resolving, and it feels on some level like it's almost like a cold start problem. But subsequent visits, the same thing also feel a little on the slow and pokey side. And I don't know if that's just me being wildly impatient, if there's an optimization opportunity, or if that's just inherent to the platform that is not under current significant load.
Steren: So, it depends. If the Cloud Run service has scaled down to zero, well of course, your service will need to be started. But what we do know, if it's a small Go binary, like something that you mentioned, it should really take less than, let's say, 500 milliseconds to go from zero to one of your container instances. Latency can also be due to the way the code is running. 
If the code is fetching things from Google Sheets at every startup, that is something that could add to the startup latency. So, I would need to take a look, but in general, we are not spinning up a virtual machine anytime we need to scale horizontally. Like, our infrastructure is a multi-tenant, rapidly scalable infrastructure that can materialize a container in literally 300 milliseconds. The rest of the latency comes from what does the container do at startup time?
Corey: Yeah, I just ran a quick test of putting time in front of a curl command. It looks like it took 4.83 seconds. So, enough to be perceptive. But again, for just a quick redirect, it's generally not the end of the world and there's probably something I'm doing that is interesting and odd. Again, I did not invite you on the show to file a—
Steren: [laugh].
Corey: Bug report. Let's be very clear here.
Steren: Seems on the very high end of startup latencies. I mean, I would definitely expect under a second. We should deep-dive into the code to take a look. And by the way, building stuff on top of spreadsheets. I've done that a ton in my previous lives as a CTO of a startup because well, that's the best administration interface, right? You just have a CRUD UI—
Corey: [unintelligible 00:12:29] world and all business users understand it. If people in Microsoft decided they were going to change Microsoft Excel interface, even a bit, they would revert the change before noon of the same day after an army of business users grabbed pitchforks and torches and marched on their headquarters. It's one of those things that is how the world runs; it is the world's most common IDE. And it's great, but I still think of databases through the lens of thinking about it as a spreadsheet as my default approach to things. I also think of databases as DNS, but that's neither here nor there.
Steren: You know, if you have maybe 100 redirects, that's totally fine. 
And by the way, the beauty of Cloud Run in a spreadsheet, as you mentioned, is that Cloud Run services run with a certain identity. And this identity, you can grant it permissions. And in that case, what I would recommend if you haven't done so yet, is to give an identity to your Cloud Run service that has the permission to read that particular spreadsheet. And how you do that: you invite the email of the service account as a reader of your spreadsheet, and that's probably what you did.
Corey: The click button to the workflow on Google Cloud automatically did that—
Steren: Oh, wow.
Corey: —and taught me how to do it. “Here's the thing to look at. The end.” It was a flawless user-onboarding experience.
Steren: Very nicely done. But indeed, you know, there is this built-in security which is the principle of minimal permission, like each of your Cloud Run services should basically only be able to read and write to the backing resources that they should. And by default, we give you a service account which has a lot of permissions, but our recommendation is to narrow those permissions to basically only look at the cloud storage buckets that the service is supposed to look at. And the same for a spreadsheet.
Corey: Yes, on some level, I feel like I'm going to write an analysis of my own security approach. It would be titled, “My God, It's Full Of Stars” as I look at the IAM policies of everything that I've configured. The idea of least privilege is great. What I like about this approach is that it made it easy to do it so I don't have to worry about it. At one point, I want to go back and wind up instrumenting it a bit further, just so I can wind up getting aggregate numbers of all right, how many times has someone visited this particular link? It'll be good to know. And I don't know… if I have to change permissions to do that yet, but that's okay. It's the best kind of problem: future Corey. So, we'll deal with that when the time comes. 
But across the board, this has just been a phenomenal experience and it's clear that when you were building Google Cloud Run, you understood the assignment. Because I was looking for people saying negative things about it and by and large, all of it seems to come from a perspective of, “Well, this isn't going to be the most cost-effective or best way to run something that is hyperscale, globe-spanning.” It's yes, that's the thing that Kubernetes was originally built to run and for some godforsaken reason people run their blog on it instead now. Okay. For something that is small, scales to zero, and has long periods where no one is visiting it, great, this is a terrific answer and there's absolutely nothing wrong with that. It's clear that you understood who you were aiming at, and the migration strategy to something that is a bit more, I want to say robust, but let's be clear what I mean when I'm saying that if you want something that's a little bit more impressive on your SRE resume as you're trying a multi-year project to get hired by Google or pretend you got hired by Google, yeah, you can migrate to something else in a relatively straightforward way. But that this is up, running, and works without having to think about it, and that is no small thing.
Steren: So, there are two things to say here. The first is yes, indeed, we know we have high developer satisfaction. You know, we measure this—in Google Cloud, you might have seen those small satisfaction surveys popping up sometimes on the user interface, and you know, we are above 90% satisfaction score. 
We hire third parties to help us understand how usable and what satisfaction score would users get out of Cloud Run, and we are constantly getting very, very good results, in absolute but also compared to the competition. Now, the other thing that you said is that, you know, Cloud Run is for small things, and here while it is definitely something that allows you to be productive, something that strives for simplicity, but it also scales a lot. And contrary to other systems, you do not have any pre-provisioning to make. So, we have done demos where we go from zero to 10,000 container instances in ten seconds because of the infrastructure on which Cloud Run runs, which is fully managed and multi-tenant, we can offer you this scale on demand. And many of our biggest customers have actually not switched to something like Kubernetes after starting with Cloud Run because they value the low maintenance, the no infrastructure management that Cloud Run brings them. So, we have like Ikea, ecobee… for example ecobee, you know, the smart thermostats are using Cloud Run to ingest events from the thermostat. I think Ikea is using Cloud Run more and more for more of their websites. You know, those companies scale, right? This is not, like, scale to zero hobby project. This is actually production e-commerce and connected smart objects production systems that have made the choice of being on a fully-managed platform in order to reduce their operational overhead.
[midroll 00:17:54]
Corey: Let me be clear. When I say scale—I think we might be talking past each other on a small point here. When I say scale, I'm talking less about oh tens or hundreds of thousands of containers running concurrently. I'm talking in a more complicated way of, okay, now we have a whole bunch of different microservices talking to one another and affinity as far as location to each other for data transfer reasons. 
And as you start beginning to service discovery style areas of things, where we build really complicated applications because we hired engineers and failed to properly supervise them, and that type of convoluted complex architecture. That's where it feels like Cloud Run increasingly, as you move in that direction, starts to look a little bit less like the tool of choice. Which is fine, I want to be clear on that point. The sense that I've gotten of it is a great way to get started, it's a great way to continue running a thing you don't have to think about because you have a day job that isn't infrastructure management. And it is clear to—as your needs change—to either remain with the service or pivot to a very close service without a whole lot of retooling, which is key. There's not much of a lock-in story to this, which I love.
Steren: That was one of the key principles when we started to design Cloud Run was, you know, we realized the industry had agreed that the container image was the standard for the deployment artifact of software. And so, we just made the early choice of focusing on deploying containers. Of course, we are helping users build those containers, you know, we have things called build packs, we can continuously deploy from GitHub, but at the end of the day, the thing that gets auto-scaled on Cloud Run is a container. And that enables portability. As you said, you can literally run the same container, nothing proprietary in it, I want to be clear. Like, you're just listening on a port for some incoming requests. Those requests can be HTTP requests, events, you know, we have products that can push events to Cloud Run like Eventarc or Pub/Sub. And this same container, you can run it on your local machine, you can run it on Kubernetes, you can run it on another cloud. You're not locked in, in terms of API of the compute. We even went even above and beyond by having the Cloud Run API look like a Kubernetes API. 
I think that was an extra effort that we made. I'm not sure people care that much, but if you look at the Cloud Run API, it is actually exactly looking like Kubernetes, even if there is no Kubernetes at all under the hood; we just made it for portability. Because we wanted to address this concern of serverless which was lock-in. Like, when you use a Function as a Service product, you are worried that the architecture that you are going to develop around this product is going to be only working in this particular cloud provider, and you're not in control of the language, the version that this provider has decided to offer you, you're not in control of more of the complexity that can come as you want to scan this code, as you want to move this code between staging and production or test this code. So, containers are really helping with that. So, I think we made the right choice of this new artifact, to build Cloud Run around the container artifact. And you know, at the time when we launched, it was a little bit controversial because back in the day, you know, 2018, 2019, serverless really meant Functions as a Service. So, when we launched, we a little bit redefined serverless. And we basically said serverless containers. Which at the time were two worlds that in the same sentence were incompatible. Like, many people, including internally, had concerns around—
Corey: Oh, the serverless versus container war was a big thing for a while. Everyone was on a different side of that divide. It's… containers are effectively increasingly—and I know, I'll get email for this, and I don't even slightly care, they're a packaging format—
Steren: Exactly.
Corey: —where it solves the problem of how do I build this thing to deploy on Debian instances? And Ubuntu instances, and other instances, God forbid, Windows somewhere, you throw a container over the wall. The end. It's DevOps is about breaking down the walls between Dev and Ops. 
That's why containers are here to make them silos that don't have to talk to each other.
Steren: A container image is a glorified zip file. Literally. You have a set of layers with files in them, and basically, we decided to adopt that artifact standard, but not the perceived complexity that existed at the time around containers. And so, we basically merged containers with serverless to make something as easy to use as a Function as a Service product but with the power of bringing your own container. And today, we are seeing—you mentioned, what kind of architecture would you use Cloud Run for? So, I would say now there are three big buckets. The obvious one is anything that is a website or an API, serving public internet traffic, like your URL redirect service, right? This is, you have an API, takes a request and returns a response. It can be a REST API, GraphQL API. We recently added support for WebSockets, which is pretty unique for a service offering to support natively WebSockets. So, what I mean natively is, my client can open a socket connection—a bi-directional socket connection—with a given instance, for up to one hour. This is pretty unique for something that is as fully managed as Cloud Run.
Corey: Right. As we're recording this, we are just coming off of Google I/O, and there were a number of announcements around Cloud Run that were touching it because of, you know, strange marketing issues. I only found out that Google I/O was a thing and featured cloud stuff via Twitter at the time it was happening. What did you folks release around Cloud Run?
Steren: Good question, actually. Part of the Google I/O Developer keynote, I pitched a story around how Cloud Run helps developers, and the I/O team liked the story, so we decided to include that story as part of the live developer keynote. So, on stage, we announced Cloud Run jobs. 
So now, I talked to you about Cloud Run services, which can be used to expose an API, but also to do, like, private microservice-to-microservice communication—because cloud services don't have to be public—and in that case, we support gRPC and, you know, a very strong security mechanism where only Service A can invoke Service B, for example, but Cloud Run jobs are about non-request-driven containers. So, today—I mean, before Google I/O a few days ago, the only requirement that we imposed on your container image was that it started to listen for requests, or events, or gRPC—
Corey: Web requests—
Steren: Exactly—
Corey: It speaks [unintelligible 00:24:35] you want as long as it's HTTP. Yes.
Steren: That was the only requirement we asked you to have on your container image. And now we've changed that. Now, if you have a container that basically starts and executes to completion, you can deploy it on a Cloud Run job. So, you will use Cloud Run jobs for, like, daily batch jobs. And you have the same infrastructure, so on-demand, you can go from zero to, I think for now, the maximum is a hundred tasks in parallel, for—of course, you can run many tasks in sequence, but in parallel, you can go from zero to a hundred, right away to run your daily batch job, daily admin job, data processing. But this is more in the batch mode than in streaming mode. If you would like to use a more, like, streaming data processing, then a Cloud Run service would still be the best fit because you can literally push events to it, and it will auto-scale to handle any number of events that it receives.
Corey: Do you find that the majority of customers are using Cloud Run for one-off jobs that barely will get more than a single container, like my thing, or do you find that they're doing massively parallel jobs? Where's the lion's share of developer and customer interest?
Steren: It's both actually. 
We have both individual developers, small startups—which really value the scale to zero and pay per use model of Cloud Run. Your URL redirect service probably is staying below the free tier, and there are many, many, many users in your case. But at the same time, we have big, big, big customers who value the on-demand scalability of Cloud Run. And for these customers, of course, they will probably very likely not scale to zero, but they value the fact that—you know, we have a media company who uses Cloud Run for TV streaming, and when there is a soccer game somewhere in the world, they have a big spike of usage of requests coming in to their Cloud Run service, and here they can trust the rapid scaling of Cloud Run so they don't have to pre-provision things in advance to be able to serve that sudden traffic spike. But for those customers, Cloud Run is priced in a way so that if you know that you're going to consume a lot of Cloud Run CPU and memory, you can purchase Committed Use Discounts, which will lower your bill overall because you know you are going to spend one dollar per hour on Cloud Run, well purchase a Committed Use Discount because you will only spend 83 cents instead of one dollar. And also, Cloud Run comes with two pricing models, one which is the default, which is the request-based pricing model, which is basically you only have CPU allocated to your container instances if you are processing at least one request. But as a consequence of that, you are not paying outside of the processing of those requests. Those containers might stay up for you, one, ready to receive new requests, but you're not paying for them. And so, that is—you know, your URL redirect service is probably in that mode where yes when you haven't used it for a while, it will scale down to zero, but if you send one request to it, it will serve that request and then it will stay up for a while until it decides to scale down. 
But you the user only pay when you are processing these specific requests, a little bit like a Function as a Service product.
Corey: Scales to zero is one of the fundamental tenets of serverless that I think that companies calling something serverless, but it always charges you per hour anyway. Yeah, that doesn't work. Storage, let's be clear, is a separate matter entirely. I'm talking about compute. Even if your workflow doesn't scale down to zero ever as a workload, that's fine, but if the workload does, you don't get to keep charging me for it.
Steren: Exactly. And so, in that other mode where you decide to always have CPU allocated to your Cloud Run container instances, then you pay for the entire lifecycle of these container instances. You still benefit from the auto-scaling of Cloud Run, but you will pay for the lifecycle and in that case, the price points are lower because you pay for a longer period of time. But that's more the price model that those bigger customers will take because at their scale, they basically always receive requests, so they already have to pay always, basically.
Corey: I really want to thank you for taking the time to chat with me. Before you go, one last question that we'll be using as a teaser for the next episode that we record together. It seems like this is a full-time job being the product manager on Cloud Run, but no, Google, contrary to popular opinion, does in fact, still support 20% projects. What's yours?
Steren: So, I've been looking to work on Cloud Run since it was a prototype, and you know, for a long time, we've been iterating privately on Cloud Run, launching it, seeing it grow, seeing it adopted, it's great. It's my full-time job. But on Fridays, I still find the time to have a 20% project, which also had quite a bit of impact. And I work on some sustainability efforts for Google Cloud. And notably, we've released two things last year. The first one is that we are sharing some carbon characteristics of Google Cloud regions. 
So, if you have seen those small leaves in the Cloud Console next to the regions that are emitting the least carbon, that's something that I helped bring to life. And the second one, which is something quite big, is we are helping customers report and reduce their gross carbon emissions of their Google Cloud usage by providing an out of the box reporting tool called Google Cloud Carbon Footprint. So, that's something that I was able to bootstrap with a team a little bit on the side of my Cloud Run project, but I was very glad to see it launched by our CEO at the last Cloud Next Conference. And now it is a fully-funded project, so we are very glad that we are able to help our customers better meet their sustainability goals themselves.
Corey: And we will be talking about it significantly on the next episode. We're giving a teaser, not telling the whole story.
Steren: [laugh].
Corey: I really want to thank you for being as generous with your time as you are. If people want to learn more, where can they find you?
Steren: Well, if they want to learn more about Cloud Run, we talked about how simple was that name. It was obviously not simple to find this simple name, but the domain is https://cloud.run.
Corey: We will also accept snark.cloud/run, I will take credit for that service, too.
Steren: [laugh]. Exactly.
Corey: There we are.
Steren: And then, people can find me on Twitter at @steren, S-T-E-R-E-N. I'll be happy—I'm always happy to help developers get started or answer questions about Cloud Run. And, yeah, thank you for having me. As I said, you successfully deployed something in just a few minutes to Cloud Run. I would encourage the audience to—
Corey: In spite of myself. I know, I'm as surprised as anyone.
Steren: [laugh].
Corey: The only snag I really hit was the fact that I was riding shotgun when we picked up my daughter from school and went through a dead zone. It's like, why is this thing not loading in the Google Cloud Console? 
Yeah, fix the cell network in my area, please.
Steren: I'm impressed that you did all of that from an iPad. But yeah, to the audience give Cloud Run a try. You can really get started connecting your GitHub repository or deploy your favorite container image. And we've worked very hard to ensure that usability was here, and we know we have pretty strong usability scores. Because that was a lot of work to simplicity, and product excellence and developer experience is a lot of work to get right, and we are very proud of what we've achieved with Cloud Run and proud to see that the developer community has been very supportive and likes this product.
Corey: I'm a big fan of what you've built. And well, of course, it links to all of that in the show notes. I just want to thank you again for being so generous with your time. And thanks again for building something that I think in many ways showcases the best of what Google Cloud has to offer.
Steren: Thanks for the invite.
Corey: We'll talk again soon. Steren Giannini is a senior product manager at Google Cloud, on Cloud Run. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice. If it's on YouTube, put the thumbs up and the subscribe buttons as well, but in the event that you hated it also include an angry comment explaining why your 20% project is being a shithead on the internet.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.
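For readers who want to see the shape of the redirect service discussed in this episode, here is a minimal sketch in Python. It is not the sheets-url-shortener code (that project is written in Go and reads its table from a Google Sheet); the in-memory `REDIRECTS` table, the `resolve` helper, and the `main` function are hypothetical names used only for illustration. It shows the two changes Corey describes (case-insensitive slugs and a temporary 302 redirect instead of a permanent one) plus the container contract Steren mentions, which is simply listening on the port given in the `PORT` environment variable.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical slug table with lowercase keys. The real project loads
# slug -> target rows from a Google Sheet instead of hard-coding them.
REDIRECTS = {"run": "https://cloud.run"}


def resolve(path, table):
    """Map a request path to (status, location).

    The slug is lowercased so lookups are case insensitive, and a hit
    returns 302 (temporary) so browsers do not cache the target forever.
    """
    slug = path.split("?", 1)[0].strip("/").lower()
    target = table.get(slug)
    if target is None:
        return 404, None
    return 302, target


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, location = resolve(self.path, REDIRECTS)
        self.send_response(status)
        if location is not None:
            self.send_header("Location", location)
        self.end_headers()


def main():
    # Cloud Run passes the port to listen on in the PORT environment
    # variable; 8080 is used here as a local fallback.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("", port), RedirectHandler).serve_forever()
```

Calling `main()` starts the server locally; a request for `/RUN` or `/run` should then be answered with a 302 pointing at the configured target, and an unknown slug with a 404.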

Screaming in the Cloud
Serverless Should be Simple with Tomasz Łakomy

Screaming in the Cloud

Play Episode Listen Later May 10, 2022 38:43


About Tomasz
Tomasz is a Frontend Engineer at Stedi, Co-Founder/Head of React at Cloudash, egghead.io instructor with over 200 lessons published, a tech speaker, an AWS Community Hero and a lifelong learner.
Links Referenced:
Cloudash: https://cloudash.dev/
Twitter: https://twitter.com/tlakomy
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: This episode is sponsored in part by Honeycomb. When production is running slow, it's hard to know where problems originate. Is it your application code, users, or the underlying systems? I've got five bucks on DNS, personally. Why scroll through endless dashboards while dealing with alert floods, going from tool to tool to tool that you employ, guessing at which puzzle pieces matter? Context switching and tool sprawl are slowly killing both your team and your business. You should care more about one of those than the other; which one is up to you. Drop the separate pillars and enter a world of getting one unified understanding of the one thing driving your business: production. With Honeycomb, you guess less and know more. Try it for free at honeycomb.io/screaminginthecloud. Observability: it's more than just hipster monitoring.
Corey: This episode is sponsored in part by our friends at ChaosSearch. You could run Elasticsearch or Elastic Cloud—or OpenSearch as they're calling it now—or a self-hosted ELK stack. But why? ChaosSearch gives you the same API you've come to know and tolerate, along with unlimited data retention and no data movement. Just throw your data into S3 and proceed from there as you would expect. 
This is great for IT operations folks, for app performance monitoring, cybersecurity. If you're using Elasticsearch, consider not running Elasticsearch. They're also available now in the AWS marketplace if you'd prefer not to go direct and have half of whatever you pay them count towards your EDP commitment. Discover what companies like Equifax, Armor Security, and Blackboard already have. To learn more, visit chaossearch.io and tell them I sent you just so you can see them facepalm, yet again.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. It's always a pleasure to talk to people who ask the bold questions. One of those great bold questions is, what if CloudWatch's web page didn't suck? It's a good question. It's one I ask myself all the time. And then I stumbled across a product that wound up solving this for me, and I'm a happy customer. To be clear, they're not sponsoring anything that I do, nor should they. It's one of those bootstrapped, exciting software projects called Cloudash. Today, I'm joined by the Head of React at Cloudash, Tomasz Łakomy. Tomasz, thank you for joining me.
Tomasz: It's a pleasure to be here.
Corey: So, where did this entire idea come from? Because I sit and I get upset every time I have to go into the CloudWatch dashboard because first, something's broken. In an ideal scenario, I don't have to care about monitoring or observability or anything like that. But then it's quickly overshadowed by the fact that this interface is terrible. And the reason I know it's terrible is that every time I'm in there, I feel dumb. My belief is—for the longest time, I thought that was a problem with me. But no, invariably, when you wind up working with something and consistently finding it a bad—you don't know enough to solve for it, it's not you. It is, in fact, the signs of a poorly designed experience, start to finish. “You should be smarter to use this tool,” is very rarely correct. 
And there are a bunch of observability tools and monitoring tools for serverless things that have made sense over the years and made this easier, but one of the most—and please don't take this the wrong way—stripped down, bare essentials of just the facts, style of presentation is Cloudash. It's why I continue to pay for it every month with a smile on my face. How did you get here from there?
Tomasz: Yeah, that's a good question. I would say that Cloudash was born out of desire for simple things to be simple. So, as you mentioned, Cloudash is basically the monitoring and troubleshooting tool for serverless applications, made for serverless developers because I am very much into serverless space, as is Maciej Winnicki, who is the other half of Cloudash team. And, you know, the whole premise of serverless was things are going to be simpler, right? So, you know, you have a bunch of code, you're going to dump it into a Lambda function, and that's it. You don't have to care about servers, you don't have to care about, you know, provisioning stuff, you don't have to care about maintenance, and so on. And that is not exactly true because why PagerDuty still continues to be [unintelligible 00:02:56] business even in serverless spaces. So, you will get paged every now and then. The problem is—what we kind of found is once you have an incident—you know, PagerDuty always tends to call it in the middle of the night; it's never, like, 11 a.m. during the workday; it's always the middle of the night.
Corey: And no one's ever happy when it calls them either. It's, “Ah, hell.” Whatever it rings, it's yeah, the original Call of Duty. PagerDuty hooked up to Nagios. I am old enough to remember those days.
Tomasz: [unintelligible 00:03:24] then business, like, imagine paying for something that's going to wake you up in the middle of the night. It doesn't make sense. In any case—
Corey: “So, why do you pay for that product? 
Because it's really going to piss me off.” “Okay, well… does that sound like a good business to you? Well, AWS seems to think so. No one's happy working with that stuff.” “Fair. Fair enough.”
Tomasz: So, in any case, like we've established an [unintelligible 00:03:43]. So you wake up, you go to the AWS console because you saw a notification that this-and-this API has, you know, something was above the threshold. And then you go to the CloudWatch console. And then you see, okay, those are the logs, those are the metrics. I'm going to copy this request ID. I'm going to go over here. I'm going to go to X-Ray.
And again, it's 3 a.m., so you don't exactly remember what you're investigating; you have, like, ten minutes. And this is a problem. Like, we've kind of identified that it's not simple to do these kinds of things; it's not simple to open something and understand, okay, what exactly is happening in my serverless app at this very moment? Like, what's going on?
So, we've built that. So, Cloudash is a desktop app; it lives on your machine, and it's a single-pane-of-glass view into your serverless system. So, if you are using CloudFormation in order to provision something, when you open Cloudash, you're going to see, you know, all of the metrics, all the Lambda functions, all of the API Gateways that you have provisioned. As of yesterday, API Gateway is no longer cool because they did launch the direct integration, so you can call Lambda functions with [crosstalk 00:04:57]
Corey: Yeah, it's the one they released, and then rolled back and somehow never said a word (because that's an AWS messaging story, and then some) right around re:Invent last year. And another quarter goes by and out it goes.
Tomasz: It came out yesterday.
Corey: Yeah, it's terrific. I love that thing. The only downside to it is, ah, you have to use their domain; no custom domain support. Really? 
Well, you can hook up CloudFront to it, but the pricing model that way makes it more expensive than API Gateway.
Okay, so I could use Cloudflare in front of it, and then it becomes free, so I bought a domain just for that purpose. That's right, my serverl… my direct Lambda URLs now live behind the glorious domain of cheapass.cloud because of course they are. It's a day-one product from AWS, so of course, it's not feature-complete.
But one of the things I like about the serverless model, and it's also a challenge when it comes to troubleshooting stuff, is that it's very much set-it-and-forget-it style, because serverless in many cases, at least the way that I tend to use it, is back-office stuff, it's back-end things, it's processing on things that are not necessarily always directly front and center. So, these things can run on their own for years until finally, you find a strange bug in a new use case, or you want to go and change something. And then it's, how the hell did this ever work? And it's still working, kind of, but what fool built this? Of course, it was me; it's always me.
But what happened here? You're basically excavating your own legacy code, trying to understand what's going on. And so, you're already upset then. Cloudash makes it easier to find things, to navigate through a whole bunch of different accounts. And there are a bunch of decisions that you made while building the app that are so clearly correct that I get actively annoyed when others don't make them. Because, oh, it looks at your AWS configuration file in your user home directory. Great, awesome. It's a desktop app, but it still consults that file. Yay, integration between ClickOps and the terminal. Wonderful.
But ah, I use SSO for a lot of stuff, so that's going to fix your little red wagon. I click on that app, and suddenly, bam, a browser opens asking me to log in and authenticate, allow the request. It works, and then suddenly, it goes back to doing exactly what you'd expect it to. It's really nice. 
The affordances behind this are glorious.
Tomasz: Like I said, one of our kind of design goals when building Cloudash was to make simple things simple again. The whole purpose is to make sure that you can get to the root cause of an issue within, like, five minutes, if not less. And this is kind of the app that you're going to tend to open whenever that… as I said, because some of the systems can be around for, like, ages, literally without any incident whatsoever, then the data is going to change because somebody [unintelligible 00:07:30] got that the year is 2020 and off you go, we have an incident.
But what's important about Cloudash is that we don't send logs anywhere. And that's kind of important because you don't pay for [PUT 00:07:42] metric API because we are not sending those logs anywhere. If you install Cloudash on your machine, we are not going to get your logs from the last ten years, put them into a system, charge you for that, just so you are able to, you know, find out what happened in this particular hour, like, two weeks ago. We genuinely don't care about your logs; we have enough of our own logs at work to, you know, analyze, investigate, and so on; we are not storing them anywhere.
In fact, you know, whatever happens on your machine stays on the machine. And that is partially why this is a desktop app. Because we don't want to handle your credentials. We absolutely don't want you to give us any of your credentials or access keys, you know, whatever. We don't want that.
So, that is why, when you install Cloudash, it's going to run on your machine, it's going to use your local credentials. So, it's… effectively, you could say that this is a much more streamlined and much more laser-focused browser or, like, an eye into AWS systems, which live on the serverless side of things.
Corey: I got to deal with it in a bit of an interesting way, recently. I have a detector in my company's production AWS org to detect when ClickOps is afoot. 
Now, I'm a big proponent of ClickOps, but I also want to know what's going on, so I have a whole thing that [runs detects 00:09:04] when people are doing things in the console versus via API. And it alerts on certain subsets of them. I had to build a special case for the user agent string coming out of Cloudash because no, no, this is an app, this is not technically ClickOps. It is also read-only, which is neither here nor there, to my understanding.
But it was, “Oh yeah, this is effectively an Electron app.” It just wraps, effectively, a browser and presents that as an application. And cool. From my perspective, that's an implementation detail. It feels like a native app, because it is, and I can suddenly see the things I care about in a way that is much more straightforward without having to have four different browser tabs open where, okay, here's the CloudTrail log for this thing, here's the metrics next to it. Oh, those are two separate windows already, and so on and so forth. It just makes hunting down the obnoxious problems so much nicer.
It's also one of those rare products where if I don't use it for a month, I don't get the bill at the end of the month and think, “Ooh, that's going to… did I waste the money?” It's, no, nice. I had a whole month where I didn't have to mess with this. It's great.
Tomasz: Exactly. I feel like, you know, it's one of those systems where, as you said, we send you an email at the end of every month that we're going to charge you X dollars for the month (by the way, we have fixed pricing and you can cancel anytime) and it's like one of those things that, you know, I didn't have to open this up for a month. This is awesome because I didn't have any incidents. But I know whenever again, PagerDuty is going to decide, “Hey, dude, wake up. You know, you slept for three hours. 
That is definitely long enough,” then you know that, you know, this app is there and you can use that.
We very much care about, you know, building this stuff not only for our customers, but we also use that on a daily basis. In fact, I… every single time that I want to investigate something in, like, our serverless systems at Stedi, because everything that we do at work, at Stedi, sits in the serverless paradigm. So, I tend to open Cloudash, like, 95% of the time whenever I want to investigate something. And whenever I am not able to do something in Cloudash, this goes, like, straight to the top of our, you know, issue lists or backlog or whatever you want to call it. Because we want to make this product not only awesome, you know, for customers to buy a [unintelligible 00:11:22] or whatever, but we also want to be able to use that on a daily basis.
And so far, I think we've kind of succeeded. But then again, we have quite a long way to go because we have more ideas than we have time, definitely, so we have to kind of prioritize what exactly we're going to build. So, [unintelligible 00:11:39] integrations with alarms. So, for instance, we want to be able to see the alarms directly in the Cloudash UI. Secondly, integration with Logs Insights, and many other ideas. I could probably talk for hours about what we want to build.
Corey: I also want to point out that this is still your side gig. You are by day a front-end engineer over at Stedi, which has a borderline disturbing number of engineers with side gigs, generally in the serverless space, doing interesting things like this. Dynobase is another example, a DynamoDB desktop client; very similar in some respects. I pay for that too. 
Honestly, for a company in Stedi's space, which is designed as basically a giant API for deep, large enterprise business stuff, there's an awful lot of stuff for small-scale coming out of that.
Like, I wind up throwing a disturbing amount of money in the general direction of Stedi for not being their customer. But there's something about the culture that you folks have built over there that's just phenomenal.
Tomasz: Yeah. For the record, you know, having a side gig is not a part of the interview process at Stedi. You don't have to have [laugh] a side project, but yeah, you're absolutely right, you know, the amount of kind of side projects, and you know, some of those are monetized, as you mentioned, you know, Cloudash and Dynobase and others. Some of those… because, for instance, you talked to Aidan, I think a couple of weeks ago, about his shenanigans: whenever, you know, AWS is going to announce something, he gets in and tries to [unintelligible 00:13:06] this in the most amusing ways possible. Yeah, I mean, I could probably talk for ages about why Stedi is by far the best company I've ever worked at, but I'm going to say this: that this is the most talented group of people I've ever met, and myself, honestly.
And, you know, I think we are the second largest, kind of, group of AWS experts outside of AWS, because the density of AWS Heroes, or ex-AWS employees, or people who have been doing cloud stuff for years is, frankly, massive. I tend to learn something new about cloud every single day. And not only because of Last Week in AWS but also from our Slack.
Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of “Hello, World” demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services in infrastructure, networking, databases, observability, management, and security. And, let me be clear here, it's actually free. 
There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small-scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free? This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free. That's snark.cloud/oci-free.
Corey: There's something to be said for having colleagues that you learn from. I have never enjoyed environments where I did not actively feel like the dumbest person in the room. That's why I love what I do now. I inherently am. I have to talk about so many different things that whenever I talk to a subject matter expert, it is a certainty that they know more about the thing than I do, with the admitted and depressing exception, of course, of the AWS bill, because it turns out the reason I had to start becoming the expert in that was because there weren't any. And here we are now.
I want to talk as well about some of your interaction outside of work with AWS. For example, you've been an Egghead instructor for a while with over 200 lessons that you published. You're an AWS Community Hero, which means you have the notable distinction of volunteering for a for-profit company (good work). No, the community is very important. It's helping each other make sense of the nonsense coming out of there. You've been involved within the ecosystem for a very long time. 
What is it about, I guess (the thing I'm wondering about myself sometimes), what is it about the AWS universe that drew you in, and what keeps you here?
Tomasz: So, to give you some context, I started, you know, learning about the cloud and AWS back in early-2019. So, fun fact: Maciej Winnicki, again, the co-founder of Cloudash, was my manager at the time. So, we were… I mean, the company I used to work for at the time, OLX Group, was in the middle of a cloud transformation, so to speak. So, going from, you know, on-premises to AWS. And I was, you know, hired as a senior front-end engineer doing, you know, all kinds of front-end stuff, but I wanted to grow, I wanted to learn more.
So, the idea was, okay, maybe you can get AWS Certified because, you know, it's one of those corporate goals that you have to have something to put that checkbox next to it. So, you know, getting certified, there you go, you have a checkbox. And off you go. So, I started, you know, diving in, and I saw this whole ocean of things that, you know, I was not entirely aware of. To be fair, at the time I knew about S3, I knew that you can put a file in an S3 bucket and then you can access it from the internet. This is, like, the [unintelligible 00:16:02] idea of my AWS experiences.
Corey: Ideally, intentionally, but one wonders sometimes.
Tomasz: Yeah, exactly. That is why you always put stuff as public, right? Because you don't have to worry about who [unintelligible 00:16:12] [laugh] public [unintelligible 00:16:15]. No, I'm kidding, of course. But still, I think what's [unintelligible 00:16:20] to AWS is… because it is this endless ocean of things to learn and things to play with, and, you know, things to teach.
I do enjoy teaching. As you said, I have quite a lot of, you know, content, videos, blog posts, conference talks, and a bunch of other stuff, and I do that for two reasons. 
You know, first of all, I tend to learn the best by teaching, so it helps me very much, kind of like, solidify my own knowledge. Whenever I record… like, I have two courses about CDK, you know, when I was recording those, that definitely kind of solidified my, you know, ideas about CDK. I get to play with all those technologies.
And secondly, you know, it's helpful for others. And, you know, people have opinions about certificates, and so on and so forth, but I think that for somebody who's trying to get into either the tech industry or, you know, cloud stuff in general, being certified helps massively. And I've heard stories about people who basically managed to double or triple their salaries by going into tech, you know, with some of those certificates. That is why I strongly believe, by the way, that those certificates should be free. Like, if you can pass the exam, you shouldn't have to worry about this $150 fee.
Corey: I wrote a blog post a while back, “The Dumbest Dollars a Cloud Provider Can Make,” and it's charging for training and certification, because if someone's going to invest that kind of time in learning your platform, you're going to try and make 150 bucks off them? Which, in some cases, is going to put people off from even beginning that process. “What cloud provider am I going to build a project on?” Obviously, the one I know how to work with and have a familiarity with, in almost every case. And the things you learn in your spare time as an independent learner, when you get a job, you tend to think about your work the same way. It matters. It's an early on-ramp that pays off down the road in the term of years.
I used to be very anti-cert personally because it felt like I was jumping through hoops, and paying, in some cases, for the privilege. I had a CCNA for a while from Cisco. There were a couple of smaller companies, SaltStack, for example, that I got various certifications from at different times. 
And that was sort of cheating because I helped write the software, but that's neither here nor there. And I do have a standing AWS cert (I get a different one every time; mine is about to expire) because it gets me access to lounges at physical events, which is the dumbest of all reasons to get certs, but here you go. I view it as the $150 lounge pass with a really weird entrance questionnaire.
But in my case, certs don't add anything to what I do. I am not the common case. I am not early in my career. Because as you progress through your career… there needs to be a piece of paper that says you know things, and early on, degrees or certifications are great at that. In time, it becomes your own list of experience on your resume or CV or LinkedIn or God knows what. Polywork if you're doing it the right way these days.
And it shows a history of projects that are similar in scope and scale and impact to the kinds of problems that your prospective employer is going to have to solve themselves. Because the best answer to hear, especially in the ops world, when there's a problem is, “Oh, I've seen this before. Here's how you fix it.” As opposed to, “Well, I don't know. Let me do some research.”
There's value to that. And I don't begrudge anyone getting certs… to a point. At least that's where I sit on it. At some point, when you have 25 certs, it's, when do you actually do any work? Because it's taking the tests and learning all of these things, which in many ways does boil down to trivia; it stands in counterbalance to a lot of these things.
Tomasz: Yeah. I mean, I definitely, totally agree. I remember, you know, going from zero to (maybe not Hero; I'm not talking about AWS Hero) being certified; that was the Solutions Architect Associate. I think it took me, like, 200 hours. 
I am not the, you know, the brightest, you know, the sharpest tool in the shed, so it probably took me, kind of, somewhat more.
I think it's doable in, like, 100 hours, but I tend to over-prepare for stuff, so I didn't actually take the actual exam until I was able to pass the sample exams with, like, 90% pass, just to be extra sure that I'm actually going to pass it. But still, I think that, you know, at some point, you probably should focus on, you know, getting into the actual stuff, because I hold two certificates, you know, one of those is going to expire, and I'm not entirely sure if I want to go through the process again. But still, if AWS were to introduce, like, a serverless specialty exam, I would be more than happy to have that. I genuinely enjoy, kind of, serverless, and you know, the fact that I would be able to solidify my knowledge, that I would have this kind of established path of the things that I should learn about in order to get this particular certificate, I think this could be interesting. But I am probably not going to chase all the 12 certificates.
Maybe if AWS IQ was available in Poland, maybe that would change, because I do know that with IQ, those certs do matter. But as of [unintelligible 00:21:26] now, I'm quite happy with the certs that I have right now.
Corey: Part of the problem, too, is the more you work with these things, the harder it becomes to pass the exams, which sounds weird and counterintuitive, but let me use myself as an example. When I got the cloud practitioner cert, which I believe has lapsed since then, and I got one of the new associate-level betas… I'll keep moving up the stack until I start failing exams. But I got a question wrong on the cloud practitioner because it was, “How long does it take to restore an RDS database from a snapshot backup?” And I gave the honest answer of what I've seen rather than what it says in the book, and that honest answer can be measured in days or hours. Yeah.
And no, that's not the correct answer. 
Yeah, but it is the real one. Similarly, a lot of the questions get around to trivia: which of these is the correct syntax for an argument, and which ones did we make up? I can explain, in some level of detail, virtually every one of AWS's 300-some-odd services to you. Ask me about any of them, I could tell you what it is, how it works, how it's supposed to work, and make a dumb joke about it. Fine, whatever.
You'll forgive me if I went down that path instead of memorizing what the actual syntax of this YAML construct inside of a CloudFormation template is. Yeah, I can get the answer to that question in the real world with about ten seconds of Googling and we move on. That's the way most of us learn. It's not cramming trivia into our heads. There's something broken about the way that we do certifications, and tech interviews in many cases as well.
I look back at some of the questions I used to ask people for Linux sysadmin-style jobs, and I don't remember the answer to a lot of these things. I could definitely get back into it, but if I went through one of these interviews now, I wouldn't get the job. One would argue I shouldn't because of my personality, but that's neither here nor there.
Tomasz: [laugh]. I mean, that's why you use CDK, so you don't have to remember random YAML comments. And if you [unintelligible 00:23:26] you don't have YAML anymore. [unintelligible 00:23:27].
Corey: Yes, you're quite the CDK fanboy, apparently.
Tomasz: I do like CDK, yes. I don't like, you know, mental overhead, I don't like context switching, and the way we kind of work at Stedi is everything is written in TypeScript. So, I am a front-end engineer, so I do stuff in the front-end line in TypeScript, all of our Lambda functions are written in TypeScript, and our [unintelligible 00:23:48] is written in TypeScript. So, I can, you know, open up my Visual Studio Code and jump between all of those files, and the language stays the same, the syntax stays the same, the tools stay the same. 
And I think this is one of the benefits of CDK that is kind of hard to replicate otherwise.
And, you know, people have many opinions about the best way to deploy infrastructure in the cloud, you know? The best infrastructure-as-code tool is the one that you use at work or in your private projects, right? Because some people enjoy ClickOps like you do; people…
Corey: Oh yeah.
Tomasz: …enjoy CloudFormation by hand, which I don't; people are very much into Terraform or Serverless Framework. I'm very much into CDK.
Corey: Or the SAM CLI, like, three or four more, and I use…
Tomasz: Oh, yeah. [unintelligible 00:24:33]
Corey: …all of these things in various ways in some of my [monstrous 00:24:35] projects to keep up on all these things. I did an exploration with the CDK. Incidentally, I think you just answered why I don't like it.
Tomasz: Because?
Corey: Because it is very clear that TypeScript is a first-class citizen with the CDK. My language of choice is shitty bash because, grumpy old sysadmin; it happens. And increasingly, that is switching over to terrible Python because I'm very bad at that. And the problem that I ran into as I was experimenting with this is, it feels like the Python support is not fully baked; most people who are using the CDK are using a flavor of JavaScript. And, let's be very clear here, every time I have tried to explore front-end, I have come away more confused than I was when I started. Part of me really thinks I should be learning some JavaScript just because of its versatility and utility to a whole bunch of different problems. But it does not work the way I think, on some level, that it should because of my own biases and experiences. So, if you're not a JavaScript person, I think that you have a much rockier road with the CDK.
Tomasz: I agree. Like I said, I tend to talk about my own experiences and my kind of thoughts about stuff. I'm not going to say that, you know, this tool or that tool is the best tool ever because nothing like that exists. 
Apart from jQuery, which is the best thing that ever happened to the web since, you know, baked bread, honestly. But you are right about CDK: to the best of my knowledge, kind of, all the other languages that are supported by CDK are effectively transpiled down from TypeScript. So it's, like, first of all, it is written in TypeScript, and then kind of the Python, all of the other languages… kind of come second.
You know, and afterwards, I tend to enjoy CDK because, as I said, I use TypeScript on a daily basis. And you know, with regards to front-end, you mentioned that every single time, you end up being more confused. It never goes away. I've been doing front-end stuff for years, and it's, you know, kind of exactly the same. Fun story, I actually joined Cloudash because, well, Maciej started working on Cloudash alone, and after quite some time, he was so frustrated with the modern front-end landscape that he asked me, “Dude, you need to help me. Like, I genuinely need some help. I am tired of React. I am tired of React hooks. This is way too complex. I want to go back to doing back-end stuff. I want to go back, you know, thinking about how we're going to integrate with all those APIs. I don't want to do UI stuff anymore.”
Which was kind of like an interesting shift because I remember at the very beginning of my career, people were talking about front-end, you know, “Front-end is not real programming. Front-end is, you know, it's easy, it's simple. I can learn CSS in an hour.” And the amount of people who say that CSS is easy, and are good at CSS, is exactly zero. Literally, nobody who's actually good at CSS says that, you know, CSS, or front-end, or anything like that is easy because it's not. It's incredibly complex. 
It's getting probably more and more complex because the expectations of our front-end UIs [unintelligible 00:27:44].
Corey: It's challenging, it is difficult, and one of the things I find most admirable about you is not even your technical achievements, it's the fact that you're teaching other people to do this. In fact, this gets to the last point I want to cover in our conversation today. When I was bouncing topic ideas off of you, one of the points you brought up that I'm like, “Oh, we're keeping that and saving that for the end,” is why, to use your words, speaking at tech events gets easier, but never easy. Let's dive into that. Tell me more about it.
Tomasz: Basically, I've accidentally kickstarted my career by speaking at meetups, which later turned into conferences, which later turned into me publishing courses online, which later turned into me becoming an AWS Hero, and here we are, you know, talking to each other. I do enjoy, you know, going out in public and speaking and being on stage. I think, you know, if somebody has, kind of, the heart, the ability to do that, I do strongly recommend, you know, giving it a shot, not only to get, like, an honestly life-changing experience, because the first time you go in front of hundreds of people, this is definitely, you know, something that's going to shake you, while at the same time acknowledging that this is absolutely, definitely not for everyone. But if you are able to do that, I think this is definitely worth your time. But as you said, by quoting me, it gets easier, so every single time you go on stage, talk at a meetup or at a conference or online conferences, which I'm not exactly a fan of, for the record, it's…
Corey: It's too much like work, too much like meetings. There's nothing different about it.
Tomasz: Yeah, exactly. Like, there's no journey. There's no adventure in online conferences. 
I know that, of course, you know, given all of that, you know, we had to kind of switch to online conferences for quite some time, and now I think we are pretending that Covid is not a thing anymore, so we, you know, we're effectively going back. But kind of the point I wanted to make is that I am a somewhat experienced public speaker (I'd like to say that because I've been doing that for years), but I've been, you know, talking to people who actually get paid to speak at conferences, who actually kind of do that for a living, and they all say the same thing. It gets simpler, it gets easier, but it's never freaking easy, you know, to go out there, and you know, to share whatever you've learned.
Corey: I'm one of those people. I am a paid public speaker fairly often, even ignoring the podcast side, and I've spoken on conference stages a couple hundred times at least. And it does get easier but never easy. That's a great way of framing it. You… I get nervous before every talk I give.
There are, I think, two talks I've given where I did not have an adrenaline hit and nervous energy before I went onstage, and both of those were duds. Because I think that it's part of the process, at least for me. And it's like, “Oh, how do you wind up not being scared before you go on stage?” You don't. You really don't.
But if that appeals to you and you enjoy the adrenaline rush of the rest, do it. If you're one of those people who view public speaking as, “I would prefer death over that” (people are more scared of public speaking than death, in some cases), great. There are so many ways to build audiences and to reach people that, fine, if you don't like doing it on stage, don't force yourself to. I'd say try it once; see how it feels. Meetups are great for this.
Tomasz: Yeah. Meetups are basically the best way to get started. I'm yet to meet a meetup, either, you know, offline or online, that is not looking for speakers. It's always quite the opposite, you know? 
I was, you know, co-organizing a meetup in my city here in Poznań, Poland, and the story always goes like this: “Okay, we have a date. We have a venue. Where are the speakers?” And then, you know, the tumbleweed is going to roll across the road and, “Oh, crap, we don't have any speakers.” So, we're going to try to find some, reach out to people. “Hey, I know that you did this fantastic project at your workplace. Come to us, talk about this.” “No, I don't want to. You know, I'm not an expert. I am, you know, I have only 15 years of experience as an engineer. This is not enough.” Like I said, I do strongly recommend it, but as you said, if you're more scared of public speaking than, like, literally dying, maybe this is not for you.
Corey: Yeah. It comes down to stretching your limits, finding yourself interesting. I find that there are lots of great engineers out there. The ones that I find myself drawn to are the ones who aren't just great at building something, but at storytelling around the thing that they've built. Yes, you built something awesome, but you have to convince me to care about it. You have to show me the thing that got you excited about this.
And if you can't inspire that excitement in other people, okay. Are you really excited about it? Or what is the story here? And again, it's a different skill set. It is not for everyone, but it is absolutely a significant career accelerator if it's leveraged right.
Tomasz: [crosstalk 00:32:45].
Corey: [crosstalk 00:32:46] on it.
Tomasz: Yeah, absolutely. I think that we don't talk enough about, kind of, the overlap between engineering and marketing. In the good sense of marketing, not the shady kind of marketing. The kind of marketing that you do for yourself in order to elevate yourself, your projects, your successes to others. 
Because, you know, try as you might, but if you are kind of like sitting in the corner of an office, you know, just jamming on your keyboard 40 hours per week, you're not exactly likely to be promoted because nobody's going to actively reach out to you to find out about your, you know, recent successes and so on.

Which at the same time, I'm not saying that you should go @channel in Slack every single time you push a commit to the main branch, but there's definitely, you know, a way of being, kind of, kind to yourself by letting others know that, “Okay, I'm here. I do exist, I have, you know, those particular skills that you may be interested in. And I'm able to tell a story which is, you know, convincing.” So it's, you know, you can tell a story on stage, but you can also tell your story to your customers by building a feature that they're going to use. [unintelligible 00:33:50].

Corey: I really want to thank you for taking the time to speak with me today. If people want to learn more, where's the best place to find you?

Tomasz: So, the best place to find me is on Twitter. So, my Twitter handle is @tlakomy. So, it's T-L-A-K-O-M-Y. I'm assuming this is going to be in the [show notes 00:34:06] as well.

Corey: Oh, it absolutely is. You beat me to it.

Tomasz: [laugh]. So, you can find Cloudash at cloudash.dev. You can probably also find my email, but don't email me because I'm terrible, absolutely terrible at email, so the best way to kind of reach out to me is via my Twitter DMs. I'm slightly less bad at those.

Corey: Excellent. And we will, of course, put links to that in the [show notes 00:34:29]. Thank you so much for being so generous with your time. I appreciate it.

Tomasz: Thank you. Thank you for having me.

Corey: Tomasz Łakomy, Head of React at Cloudash. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, and if you're on the YouTubes, smash the like and subscribe button, as the kids say. Whereas if you've hated this episode, please do the exact same thing (five-star review, smash the buttons), but this time also leave an insulting and angry comment written in the form of a CloudWatch log entry that no one is ever able to find in the native interface.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Leading the Cloud Security Pack with Yoav Alon

May 3, 2022 34:13


About Yoav

Yoav is a security veteran recognized on Microsoft Security Response Center's Most Valuable Research List (BlackHat 2019). Prior to joining Orca Security, he was a Unit 8200 researcher and team leader, a chief architect at Hyperwise Security, and a security architect at Check Point Software Technologies. Yoav enjoys hunting for Linux and Windows vulnerabilities in his spare time.

Links Referenced:
Orca Security: https://orca.security
Twitter: https://twitter.com/yoavalon

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at Vultr. Optimized cloud compute plans have landed at Vultr to deliver lightning fast processing power, courtesy of third gen AMD EPYC processors without the IO, or hardware limitations, of a traditional multi-tenant cloud server. Starting at just 28 bucks a month, users can deploy general purpose, CPU, memory, or storage optimized cloud instances in more than 20 locations across five continents. Without looking, I know that once again, Antarctica has gotten the short end of the stick. Launch your Vultr optimized compute instance in 60 seconds or less on your choice of included operating systems, or bring your own. It's time to ditch convoluted and unpredictable giant tech company billing practices, and say goodbye to noisy neighbors and egregious egress forever. Vultr delivers the power of the cloud with none of the bloat. "Screaming in the Cloud" listeners can try Vultr for free today with $150 in credit when they visit getvultr.com/screaming. That's G E T V U L T R.com/screaming.
My thanks to them for sponsoring this ridiculous podcast.

Corey: Finding skilled DevOps engineers is a pain in the neck! And if you need to deploy a secure and compliant application to AWS, forgettaboutit! But that's where DuploCloud can help. Their comprehensive no-code/low-code software platform guarantees a secure and compliant infrastructure in as little as two weeks, while automating the full DevSecOps lifestyle. Get started with DevOps-as-a-Service from DuploCloud so that your cloud configurations are done right the first time. Tell them I sent you and your first two months are free. To learn more visit: snark.cloud/duplocloud. That's snark.cloud/D-U-P-L-O-C-L-O-U-D.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Periodically, I would say that I enjoy dealing with cloud platform security issues, except I really don't. It's sort of forced upon me to deal with, much like a dead dog is cast into their neighbor's yard for someone else to have to worry about. Well, invariably, it seems like it's my yard.

And I'm only on the periphery of these things. Someone who's much more in the trenches in the wide world of cloud security is joining me today. Yoav Alon is the CTO at Orca Security. Yoav, thank you for taking the time to join me today and suffer the slings and arrows I'll no doubt be hurling your way.

Yoav: Thank you, Corey, for having me. I've been a longtime listener, and it's an honor to be here.

Corey: I still am periodically surprised that anyone listens to these things. Because it's unlike a newsletter where everyone will hit reply and give me a piece of their mind. People generally don't wind up sending me letters about things that they hear on the podcast, so whenever I talk to somebody who listens to it, it's, “Oh. Oh, right, I did turn the microphone on. Awesome.” So, it's always just a little on the surreal side.

But we're not here to talk necessarily about podcasting, or the modern version of an AM radio show. Let's start at the very beginning.
What is Orca Security, and why would folks potentially care about what it is you do?

Yoav: So, Orca Security is a cloud security company, and our vision is very simple. Given a customer's cloud environment, we want to detect all the risks in it and implement mechanisms to prevent them from occurring. And while it sounds trivial, before Orca, it wasn't really possible. You would have to install multiple tools and aggregate them and do a lot of manual work, and it was messy. And we wanted to change that, so we had, like, three guiding principles.

We call it seamless, so I want to detect all the risks in your environment without friction, which is our speak for fighting with your peers. We also want to detect everything so you don't have to install, like, a tool for each issue: A tool for vulnerabilities, a tool for misconfigurations, and for sensitive data, IAM roles, and such. And we put a very high priority on context, which means telling you what's important, what's not. So, for example, an S3 bucket open to the internet is important if it has sensitive data, not if it's a, I don't know, static website.

Corey: Exactly. I have a few that I'd like to get screamed at in my AWS account, like, “This is an open S3 bucket and it's terrible.” I look at it and the name is assets.lastweekinaws.com. Gee, I wonder if that's something that's designed to be a static hosted website.

Increasingly, I've been slapping CloudFront in front of those things just to make the broken warning light go away. I feel like it's an underhanded way of driving CloudFront adoption some days, but that may not be the most charitable interpretation thereof. Orca has been top-of-mind for a lot of folks in the security community lately because let's be clear here, dealing with security problems in cloud providers from a vendor perspective is an increasingly crowded (and clouded) space.
Just because there's so much… there's investment pouring into it, everyone has a slightly different take on the problem, and it becomes somewhat challenging to stand out from the pack. You didn't really stand out from the pack so much as leaped to the front of it and more or less have become the de facto name in a very short period of time, specifically, at least from my world, when you wound up having some very interesting announcements about vulnerabilities within AWS itself. You will almost certainly do a better job of relating the story, so please, what did you folks find?

Yoav: So, back in September of 2021, two of my researchers, Yanir Tsarimi and Tzah Pahima, each one of them within a relatively short span of time from each other, found a vulnerability in AWS. Tzah found a vulnerability in CloudFormation which we named BreakingFormation and Yanir found a vulnerability in AWS Glue, which we named SuperGlue. We're not the best copywriters, but anyway…

Corey: No, naming things is hard. Ask any Amazonian.

Yoav: Yes. [laugh]. So, I'll start with BreakingFormation which caught the eyes of many. It was an XXE SSRF, which is jargon to say that we were able to read files and execute HTTP requests and read potentially sensitive data from CloudFormation servers. This one was mitigated within 26 hours by AWS, so…

Corey: That was mitigated globally.

Yoav: Yes, globally. I've never seen such a quick turnaround anywhere. It was an amazing security feat to see.

Corey: Particularly in light of the fact that AWS does a lot of things very right when it comes to, you know, designing cloud infrastructure. Imagine that, they've had 15 years of experience and basically built the idea of cloud, in some respects, at the scale that hyperscalers operate at. And one of their core tenets has always been that there's a hard separation between regions. There are remarkably few global services, and those are treated with the utmost of care and delicacy.
To the point where when something like that breaks as an issue that spans more than one region, it is headline-making news in many cases.

So they almost never wind up deploying things to all regions at the same time. That can be irksome when we're talking about things like I want a feature that solves a problem that I have, and I have to wait months for it to hit a region that I have resources living within, but for security stuff like this, I am surprised that going from, “This is the problem,” to, “It has been mitigated,” took place within 26 hours. I know it sounds like a long time to folks who are not deep in the space, but that is superhero speed.

Yoav: A small correction, it's 26 hours for, like, the main regions. And it took three to four days to propagate to all regions. But still, it's the speed of lightning for the security space.

Corey: When this came out, I was speaking to a number of journalists on background about trying to wrap their head around this, and they said that, “Oh yeah, and security is always, like, the top priority for AWS, second only to uptime and reliability.” And… and I understand the perception, but I disagree with it in the sense of the nightmare scenario (every time I mention it to a security person, watching the blood drain from their face is awesome): take IAM, which, as Werner said in his keynote, processes (was it 500 million or was it 500 billion requests a second, some ludicrous number), and imagine it fails open, where everything suddenly becomes permitted. I have to imagine in that scenario, they would physically rip the power cables out of the data centers in order to stop things from going out. And that is the right move. Fortunately, I am extremely optimistic that will remain a hypothetical because that is nightmare fuel right there.

But Amazon says that security is job zero.
And my cynical interpretation is that well, it wasn't, but they forgot security, decided to bolt it on to the end, like everyone else does, and they just didn't want to renumber all their slides, so instead of making it point one, they just put another slide in front of it and called it job zero. I'm sure that isn't how it worked, but for those of us who procrastinate in building slide decks for talks, it has a certain resonance to it. That was one issue. The other seemed a little bit more pernicious, focusing on Glue, which is their ETL-as-a-Service… service. One of them I suppose. Tell me more about it.

Yoav: So, one of the things that we found when we found BreakingFormation and reported the vulnerability, it led us to do a quick Google search, which led us back to the Glue service. It had references to Glue, and we started looking around it. And what we were able to do with the vulnerability is given a specific feature in Glue, which we don't disclose at the moment, we were able to effectively take control over the account which hosts the Glue service in us-east-1. And having this control allowed us to essentially be able to impersonate the Glue service. So, every role in AWS that has a trust to the Glue service, we were able to effectively assume a role into it in any account in AWS. So, this was a more critical vulnerability in its effect.

Corey: I think on some level, the game of security has changed because for a lot of us who basically don't have much in the way of sensitive data living in AWS, and let's be clear, I take confidentiality extremely seriously. Our clients on the consulting side view their AWS bills themselves as extremely confidential information that Amazon stuffs into a PDF and emails every month. But still. If there's going to be a leak, we absolutely do not want it to come from us, and that is something that we take extraordinarily seriously.
But compared to other jobs I've had in the past, no one will die if that information gets out.

It is not the sort of thing that is going to ruin people's lives, which is very often something that can happen in some data breaches. But in my world, one of the bad cases of a breach of someone getting access to my account is they could spin up a bunch of containers on the 17 different services that AWS offers that can run containers and mine cryptocurrency with it. And the damage to me then becomes a surprise bill. Okay, great. I can live with that.

Something that's a lot scarier to a lot of companies with, you know, serious problems is, yep, fine, cost us money, whatever, but our access to our data is the one thing that is going to absolutely be the thing that cannot happen. So, from that perspective alone, something like Glue being able to do that is a lot more terrifying than subverting CloudFormation and being able to spin up additional resources or potentially take resources down. Is that how you folks see it too, or is… I'm sure there's nuance I'm missing.

Yoav: So yeah, the access to data is top-of-mind for everyone. It's a bit scary to think about it. I have to mention, again, the quick turnaround time for AWS, which almost immediately issued a patch. It was a very fast one and they mitigated, again, the issue completely within days. About your comment about data.

Data is king these days, there is nothing like data, and it has all the properties of everything that we care about. It's expensive to store, it's expensive to move, and it's very expensive if it leaks. So, I think a lot of people were more alarmed about the Glue vulnerability than the CloudFormation vulnerability. And they're right in doing so.

Corey: I do want to call out that AWS did a lot of things right in this area. Their security posture is very clearly built around defense-in-depth.
The fact that they were able to disclose, after some prodding, that they checked the CloudTrail logs for the service itself, dating back to the time the service launched, and verified that there had never been an exploit of this, that is phenomenal, as opposed to the usual milquetoast statements that companies have. “We have no evidence of it,” which can mean that we did the same thing and we looked through all the logs and it's great, but it can also mean that, “Oh, yeah, we probably should have logs, shouldn't we? But let's take a backlog item for that.” And that's just terrifying on some level.

It becomes a clear example, a shining beacon for some of us in some cases, of doing things right from that perspective. There are other sides to it, though. As a customer, it was frustrating in the extreme (and I mean no offense by this) to learn about this from you rather than from the provider themselves. They wound up putting up a security notification many hours after your blog post went up, which I would also just like to point out (and we spoke about it at the time and it was a pure coincidence), there was something that was just chef's-kiss perfect about you announcing this on Andy Jassy's birthday. That was just very well done.

Yoav: So, we didn't know about Andy's birthday. And it was…

Corey: Well, I see only one of us has a company calendar with notable executive birthdays splattered all over it.

Yoav: Yes. And it was also published around the time that AWS CISO was announced, which was also a coincidence because the date was chosen a lot of time in advance. So, we genuinely didn't know.

Corey: Communicating around these things is always challenging because on the one hand, I can absolutely understand the cloud providers' position on this. We had a vulnerability disclosed to us. We did our diligence and our research because we do an awful lot of things correctly and everyone is going to have vulnerabilities, let's be serious here.
I'm not sitting here shaking my fist, angry at AWS's security model. It works, and I am very much a fan of what they do.

And I can definitely understand then, going through all of that, there was no customer impact, they've proven it. What value is there to them telling anyone about it? I get that. Conversely, you're a security company attempting to stand out in a very crowded market, and it is very clear that announcing things like this demonstrates a familiarity with cloud that goes beyond the common. I radically changed my position on how I thought about Orca based upon these discoveries. It went from, “Orca who?” (other than the fact that you folks have sponsored various publications in the past; thanks for that), “Okay, a security company. Great,” to, “Oh, that's Orca. We should absolutely talk to them about a thing that we're seeing.” It has been transformative for what I perceive to be your public reputation in the cloud security space.

So, those two things are at odds: The cloud provider doesn't want to talk about anything and the security company absolutely wants to demonstrate a conversational fluency with what is going on in the world of cloud. And that feels like it's got to be a very delicate balancing act to wind up coming up with answers that satisfy all parties.

Yoav: So, I just want to underline something. We don't do what we do in order to make a marketing stand. It's a byproduct of our work, but it's not the goal. For the Orca Security Research Pod, which is the team at Orca that does this kind of research, our mission statement is to make cloud security better for everyone.
Not just Orca customers; for everyone.

And you get to hear about the more shiny things like big headline vulnerabilities, but we also have very sensible blog posts explaining how to do things, how to configure things and give you more in-depth understanding into security features that the cloud providers themselves provide, which are great, and advance the state of the cloud security. I would say that having a cloud vulnerability is sort of one of those things which makes me happy to be a cloud customer. On the one side, we had a very big vulnerability with very big impact, and the ability to access a lot of customers' data is conceptually terrifying. The flip side is that everything was mitigated by the cloud providers in warp speed compared to everything else we've seen in all other elements of security. And you get to sleep better knowing that it happened (so no platform is infallible), but still the cloud providers do the work for you, and you'll get a lot of added value from that.

Corey: You've made a few points when this first came out, and I want to address them. The first is, when I reached out to you with a, “Wow, great work,” you effectively instantly came back with, “Oh, it wasn't me. It was members of my team.” So, let's start there. Who was it that found these things? I'm a huge believer in giving people credit for the things that they do.

The joy of being in a leadership position is if the company screws up, yeah, you take responsibility for that, whereas if the company does something great, yeah, you want to pass praise onto the people who actually (please don't take this the wrong way) did the work. And not that leadership is not work, it absolutely is, but it's a different kind of work.

Yoav: So, I am a security researcher, and I am very mindful of the effort and skill it requires to find vulnerabilities and actually do a full circle on them.
And the first thing I'll mention is Tzah Pahima, who found the BreakingFormation vulnerability, the vulnerability in CloudFormation, and Yanir Tsarimi, who found the AutoWarp vulnerability, which is the Azure vulnerability that we have not mentioned, and the Glue vulnerability, dubbed SuperGlue. Both of them are phenomenal researchers, world-class, and I'm very honored to work with them every day. It's one of my joys.

Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built-in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.

Corey: It's very clear that you have built an extraordinary team of people who are able to focus on vulnerability research. Which, on some level, is very interesting because you are not branded as it were as a vulnerability research company. This is not something that is your core competency; it's not a thing that you wind up selling directly that I'm aware of. You are selling a security platform offering. So, on the one hand, it makes perfect sense that you would have a division internally that works on this, but it's also very noteworthy, I think, that it is not the core description of what it is that you do.

It is a means by which you get to the outcome you deliver for customers, not the thing that you are selling directly to them. I just find that an interesting nuance.

Yoav: Yes, it is. And I would elaborate and say that research informs the product, and the product informs research. And we get to have this fun dance where we learn new things by doing research.
We [unintelligible 00:18:08] the product, and we use the customers to teach us things that we didn't know. So, it's one of those happy synergies.

Corey: I want to also highlight a second thing that you have mentioned and been very, I guess, on message about since news of this stuff first broke. And because it's easy to look at this and sensationalize aspects of it, where, “See? The cloud provider's security model is terrible. You shouldn't use them. Back to data centers we go,” is basically the line taken by an awful lot of folks trying to sell data center things.

That is not particularly helpful for the way that the world is going. And you've said, “Yeah, you should absolutely continue to be in cloud. Do not disrupt your cloud plan as a result.” And let's be clear, none of the rest of us are going to find and mitigate these things with anything near the rigor or rapidity that the cloud providers can and do demonstrate.

Yoav: I totally agree. And I would say that the AWS security folks are doing a phenomenal job. I can name a few, but they're all great. And I think that the cloud is by far a much safer alternative than on-prem. I've never seen issues in my on-prem environment which were critical and fixed at such a high velocity and such a massive scale.

And you always get the incremental improvements of someone really thinking about all the ins and outs of how to do security, how to do security in the cloud, how to make it faster, more reliable, without business interruptions. It's just phenomenal to see and phenomenal to witness how far we've come in such a relatively short time as an industry.

Corey: AWS in particular has a reputation for being very good at security. I would argue that, from my perspective, Google is almost certainly slightly better at their security approach than AWS is, but to be clear, both of them are significantly further along the path than I am going to be. So great, fantastic.
You also have found something interesting over in the world of Azure, and that honestly feels like a different class of vulnerability. To my understanding, the Azure vulnerability that you recently found was you could get credential material for other customers simply by asking for it on a random high port. Which is one of those… I'm almost positive I'm misunderstanding something here. I hope. Please?

Yoav: I'm not sure you're misunderstanding. So, I would just emphasize that the vulnerability again, was found by Yanir Tsarimi. And what he found was, he used a service called Azure Automation which enables you essentially to run a Python script on various events and schedules. And he opened the Python script and he tried different ports. And one of the high ports he found essentially gave him his credentials. And he said, “Oh, wait. That's a really odd port for an HTTP server. Let's try, I don't know, a few ports on either way.” And he started getting credentials from other customers. Which was very surprising to us.

Corey: That is understating it by a couple orders of magnitude. Yes, like, “Huh. That seems sub-optimal,” is sort of like the corporate messaging approved thing. At the time you discover that, I'm certain it was a three-minute-long blistering string of profanity in no fewer than four languages.

Yoav: I said to him that this is, like, a dishonorable bug because he worked very little to find it. So it was, from start to finish, the entire research took less than two hours, which, in my mind, is not enough for this kind of vulnerability. You have to work a lot harder to get it. So.

Corey: Yeah, exactly. My perception is that when there are security issues that I have stumbled over (for example, I gave a talk at re:Invent about it in the before times), one of them was an overly broad permission in a managed IAM policy for SageMaker. Okay, great. That was something that obviously was not good, but it also was more of a privilege escalation style of approach.
It wasn't, “Oh, by the way, here's the keys to everything.”

That is the type of vulnerability I have come to expect, by and large, from cloud providers. “We're just going to give you access credentials for other customers” is one of those areas that… it bugs me on a visceral level, not because I'm necessarily exposed personally, but because it more or less shores up so many of the arguments that I have spent the last eight years having with folks who are like, “Oh, you can't go to cloud. Your data should live on your own stuff. It's more secure that way.” And we were finally, it feels like, starting to turn a cultural corner on these things.

And then something like that happens, and it almost has those naysayers become vindicated for it. And it's… it almost feels, on some level, and I don't mean to be overly unkind on this, but it's like, you are absolutely going to be in a better security position with the cloud providers. Except on Azure. And perhaps that is unfair, but it seems like Azure's level of security rigor is nowhere near that of the other two. Is that generally how you're seeing things?

Yoav: I would say that they have seen more security issues than most other cloud providers. And they also have a very strong culture of report things to us, and we're very streamlined into patching those and giving credit where credit's due. And they give out bounties, which is an incentive for more research to happen on those platforms. So, I wouldn't say this categorically, but I would say that the optics are not very good. Generally, the cloud providers are much safer than on-prem because you only hear very seldom about security issues in the cloud.

You hear literally every other day about issues happening to on-prem environments all over the place. And people just say they expect it to be this way. Most of the time, it's not even a headline.
Like, “Company X affected with cryptocurrency or whatever.” It happens every single day, and multiple times a day, breaches which are massively bigger. And people who don't want to be in the cloud will find every reason not to be in the cloud. Let us have fun.

Corey: One of the interesting parts about this is that so many breaches that are on-prem are just never discovered because no one knows what the heck's running in an environment. And the breaches that we hear about are just the ones that someone had at least enough wherewithal to find out that, “Huh. That shouldn't be the way that it is. Let's dig deeper.” And that's a bad day for everyone. I mean, no one enjoys those conversations and those moments.

And let's be clear, I am surprisingly optimistic about the future of Azure Security. It's like, “All right, you have a magic wand. What would you do to fix it?” It's, “Well, I'd probably, you know, hire Charlie Bell and get out of his way,” is not a bad answer as far as how these things go. But it takes time to reform a culture, to wind up building in security as a foundational principle. It's not something you can slap on after the fact.

And perhaps this is unfair. But Microsoft has 30 years of history now of getting the world accustomed to oh, yeah, just periodically, terrible vulnerabilities are going to be discovered in your desktop software. And once a month on Tuesdays, we're going to roll out a whole bunch of patches, and here you go. Make sure you turn on security updates, yadda, yadda, yadda. That doesn't fly in the cloud. It's like, “Oh, yeah, here's this month's list of security problems on your cloud provider.” That's one of those things that, like, the record-scratch, freeze-frame moment of wait, what are we doing here, exactly?

Yoav: So, I would say that they also have a very long history of making those turnarounds.
Bill Gates famously did his speech where security comes first, and they have done a very, very long journey and turned the company around to doing things a lot quicker and a lot safer. It doesn't mean they're perfect; everyone will have bugs, and Azure will have more people finding bugs in it in the near future, but security is a journey, and they've not started from zero. They're doing a lot of work. I would say it's going to take time.

Corey: The last topic I want to explore a little bit is, and again, please don't take this as in any way being insulting or disparaging to your company, but I am actively annoyed that you exist. By which I mean that if I go into my AWS account, and I want to configure it to be secure. Great. It's not a matter of turning on the security service, it's turning on the dozen or so security services that then round up to something like GuardDuty that then, in turn, rounds up to something like Security Hub. And you look at not only the sheer number of these services and the level of complexity inherent to them, but then the bill comes in and you do some quick math and realize that getting breached would have been less expensive than what you're spending on all of these things.

And somehow… the fact that it's complex, I understand; computers are like that. The fact that there is… [audio break 00:27:03] a great messaging story that's cohesive around this, I come to accept that because it's AWS; talking is not their strong suit. Basically declining to comment is. But the thing that galls me is that they are selling these services and not inexpensively either, so it almost feels, on some level like, shouldn't this, on some level, be built into the offerings that you folks are giving us?

And don't get me wrong, I'm glad that you exist because bringing order to a lot of that chaos is incredibly important. But I can't shake the feeling that this should be a foundational part of any cloud offering.
I'm guessing you might have a slightly different opinion than mine. I don't think you show up at the office every morning, “I hate that we exist.”Yoav: No. And I'll add a bit of context and nuance. So, for every other company than cloud providers, we expect them to be very good at most things, but not exceptional at everything. I'll give the Redshift example. Redshift is a pretty good offering, but Snowflake is a much better offering for a much wider range of—Corey: And there's a reason we're about to become Snowflake customers ourselves.Yoav: So, yeah. And there are a few other examples of that. A security company, a company that is focused solely on your security will be much better suited to help you, in a lot of cases more than the platform. And we work actively with AWS, Azure, and GCP requesting new features, helping us find places where we can shed more light and be more proactive. And we help to advance the conversation and make it a lot more actionable and improve from year to year. It's one of those collaborations. I think the cloud providers can do anything, but they can't do everything. And they do a very good job at security; it doesn't mean they're perfect.Corey: As you folks are doing an excellent job of demonstrating. Again, I'm glad you folks exist; I'm very glad that you are publishing the research that you are. It's doing a lot to bring a lot I guess a lot of the undue credit that I was giving AWS for years of, “No, no, it's not that they don't have vulnerabilities like everyone else does. It just that they don't ever talk about them.” And they're operationalizing of security response is phenomenal to watch.It's one of those things where I think you've succeeded and what you said earlier that you were looking to achieve, which is elevating the state of cloud security for everyone, not just Orca customers.Yoav: Thank you.Corey: Thank you. I really appreciate your taking the time out of your day to speak with me. 
If people want to learn more, where's the best place they can go to do that?

Yoav: So, we have our website at orca.security. And you can reach me on Twitter. My handle is @yoavalon, which is @-Y-O-A-V-A-L-O-N.

Corey: And we will of course put links to that in the show notes. Thanks so much for your time. I appreciate it.

Yoav: Thank you, Corey.

Corey: Yoav Alon, Chief Technology Officer at Orca Security. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, or of course on YouTube, smash the like and subscribe buttons because that's what they do on that platform. Whereas if you've hated this podcast, please do the exact same thing, five-star review, smash the like and subscribe buttons on YouTube, but also leave an angry comment that includes a link that is both suspicious and frightening, and when we click on it, suddenly our phones will all begin mining cryptocurrency.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Creating “Quinntainers” with Casey Lee

Apr 20, 2022, 46:16


About Casey

Casey spends his days leveraging AWS to help organizations improve the speed at which they deliver software. With a background in software development, he has spent the past 20 years architecting, building, and supporting software systems for organizations ranging from startups to Fortune 500 enterprises.

Links Referenced:
“17 Ways to Run Containers in AWS”: https://www.lastweekinaws.com/blog/the-17-ways-to-run-containers-on-aws/
“17 More Ways to Run Containers on AWS”: https://www.lastweekinaws.com/blog/17-more-ways-to-run-containers-on-aws/
kubernetestheeasyway.com: https://kubernetestheeasyway.com
snark.cloud/quinntainers: https://snark.cloud/quinntainers
ECS Chargeback: https://github.com/gaggle-net/ecs-chargeback
twitter.com/nektos: https://twitter.com/nektos

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored by our friends at Revelo. Revelo is the Spanish word of the day, and it's spelled R-E-V-E-L-O. It means “I reveal.” Now, have you tried to hire an engineer lately? I assure you it is significantly harder than it sounds. One of the things that Revelo has recognized is something I've been talking about for a while, specifically that while talent is evenly distributed, opportunity is absolutely not. They're exposing a new talent pool to, basically, those of us without a presence in Latin America via their platform. It's the largest tech talent marketplace in Latin America with over a million engineers in their network, which includes, but isn't limited to, talent in Mexico, Costa Rica, Brazil, and Argentina.
Now, not only do they wind up spreading all of their talent on English ability, as well as you know, their engineering skills, but they go significantly beyond that. Some of the folks on their platform are hands down the most talented engineers that I've ever spoken to. Let's also not forget that Latin America has high time zone overlap with what we have here in the United States, so you can hire full-time remote engineers who share most of the workday as your team. It's an end-to-end talent service, so you can find and hire engineers in Central and South America without having to worry about, frankly, the colossal pain of cross-border payroll and benefits and compliance because Revelo handles all of it. If you're hiring engineers, check out revelo.io/screaming to get 20% off your first three months. That's R-E-V-E-L-O dot I-O slash screaming.Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is someone that I had the pleasure of meeting at re:Invent last year, but we'll get to that story in a minute. Casey Lee is the CTO with a company called Gaggle, which is—as they frame it—saving lives. Now, that seems to be a relatively common position that an awful lot of different tech companies take. “We're saving lives here.” It's, “You show banner ads and some of them are attack platforms for JavaScript malware. 
Let's be serious here.” Casey, thank you for joining me, and what makes the statement that Gaggle saves lives not patently ridiculous?Casey: Sure. Thanks, Corey. Thanks for having me on the show. So Gaggle, we're ed-tech company. We sell software to school districts, and school districts use our software to help protect their students while the students use the school-issued Google or Microsoft accounts.So, we're looking for signs of bullying, harassment, self-harm, and potentially suicide from K-12 students while they're using these platforms. They will take the thoughts, concerns, emotions they're struggling with and write them in their school-issued accounts. We detect that and then we notify the school districts, and they get the students the help they need before they can do any permanent damage to themselves. We protect about 6 million students throughout the US. We ingest a lot of content.Last school year, over 6 billion files, about the equal number of emails ingested. We're looking for concerning content and then we have humans review the stuff that our machine learning algorithms detect and flag. About 40 million items had to go in front of humans last year, resulted in about 20,000 what we call PSSes. These are Possible Student Situations where students are talking about harming themselves or harming others. And that resulted in what we like to track as lives saved. 1400 incidents last school year where a student was dealing with suicide ideation, they were planning to take their own lives. We detect that and get them help within minutes before they can act on that. That's what Gaggle has been doing. We're using tech, solving tech problems, and also saving lives as we do it.Corey: It's easy to lob a criticism at some of the things you're alluding to, the idea of oh, you're using machine learning on student data for young kids, yadda, yadda, yadda. 
Look at the outcome, look at the privacy controls you have in place, and look at the outcomes you're driving to. Now, I don't necessarily trust the number of school administrations not to become heavy-handed and overbearing with it, but let's be clear, that's not the intent. That is not what the success stories you have alluded to. I've got to say I'm a fan, so thanks for doing what you're doing. I don't say that very often to people who work in tech companies.Casey: Cool. Thanks, Corey.Corey: But let's rewind a bit because you and I had passed like ships in the night on Twitter for a while, but last year at re:Invent something odd happened. First, my business partner procrastinated at getting his ticket—that's not the odd part; he does that a lot—but then suddenly ticket sales slammed shut and none were to be had anywhere. You reached out with a, “Hey, I have a spare ticket because someone can't go. Let me get it to you.” And I said, “Terrific. Let me pay you for the ticket and take you to dinner.”You said, “Yes on the dinner, but I'd rather you just look at my AWS bill and don't worry about the cost of the ticket.” “All right,” said I. I know a deal when I see one. We grabbed dinner at the Venetian. I said, “Bust out your laptop.” And you said, “Oh, I was kidding.” And I said, “Great. I wasn't. Bust it out.”And you went from laughing to taking notes in about the usual time that happens when I start looking at these things. But how was your recollection of that? I always tend to romanticize some of these things. Like, “And then everyone's restaurant just turned, stopped, and clapped the entire time.” Maybe that part didn't happen.Casey: Everything was right up until the clapping part. That was a really cool experience. I appreciate you walking through that with me. 
Yeah, we've got lots of opportunity to save on our AWS bill here at Gaggle, and in that little bit of time that we had together, I think I walked away with no more than a dozen ideas for where to shave some costs. The most obvious one, the first thing that you keyed in on, is we had RIs coming due that weren't really well-optimized and you steered me towards savings plans. We put that in place and we're able to apply those savings plans not just to our EC2 instances but also to our serverless spend as well.So, that was a very worthwhile and cost-effective dinner for us. The thing that was most surprising though, Corey, was your approach. Your approach to how to review our bill was not what I thought at all.Corey: Well, what did you expect my approach was going to be? Because this always is of interest to me. Like, do you expect me to, like, whip a portable machine learning rig out of my backpack full of GPUs or something?Casey: I didn't know if you had, like, some secret tool you were going to hit, or if nothing else, I thought you were going to go for the Cost Explorer. I spend a lot of time in Cost Explorer, that's my go-to tool, and you wanted nothing to do with Cost Exp—I think I was actually pulling up Cost Explorer for you and you said, “I'm not interested. Take me to the bills.” So, we went right to the billing dashboard, you started opening up the invoices, and I thought to myself, “I don't remember the last time I looked at an AWS invoice.” I just, it's noise; it's not something that I pay attention to.And I learned something, that you get a real quick view of both the cost and the usage. And that's what you were keyed in on, right? And you were looking at things relative to each other. “Okay, I have no idea about Gaggle or what they do, but normally, for a company that's spending x amount of dollars in EC2, why is your data transfer cost the way it is? 
Is that high or low?” So, you're looking for kind of relative numbers, but it was really cool watching you slice and dice that bill through the dashboard there.Corey: There are a few things I tie together there. Part of it is that this is sort of a surprising thing that people don't think about but start with big numbers first, rather than going alphabetically because I don't really care about your $6 Alexa for Business spend. I care a bit more about the $6 million, or whatever it happens to be at EC2—I'm pulling numbers completely out of the ether, let's be clear; I don't recall what the exact magnitude of your bill is and it's not relevant to the conversation.And then you see that and it's like, “Huh. Okay, you're spending $6 million on EC2. Why are you spending 400 bucks on S3? Seems to me that those two should be a little closer aligned. What's the deal here? Oh, God, you're using eight petabytes of EBS volumes. Oh, dear.”And just, it tends to lead to interesting stuff. Break it down by region, service, and use case—or usage type, rather—is what shows up on those exploded bills, and that's where I tend to start. It also is one of the easiest things to wind up having someone throw into a PDF and email my way if I'm not doing it in a restaurant with, you know, people clapping standing around.Casey: [laugh]. Right.Corey: I also want to highlight that you've been using AWS for a long time. You're a Container Hero; you are not bad at understanding the nuances and depths of AWS, so I take praise from you around this stuff as valuing it very highly. This stuff is not intuitive, it is deeply nuanced, and you have a business outcome you are working towards that invariably is not oriented day in day out around, “How do I get these services for less money than I'm currently paying?” But that is how I see the world and I tend to live in a very different space just based on the nature of what I do. It's sort of a case study and the advantage of specialization. 
But I know remarkably little about containers, which is how we wound up reconnecting about a week or so before we did this recording.Casey: Yeah. I saw your tweet; you were trying to run some workload—container workload—and I could hear the frustration on the other end of Twitter when you were shaking your fist at—Corey: I should not tweet angrily, and I did in this case. And, eh, every time I do I regret it. But it played well with the people, so that does help. I believe my exact comment was, “‘me: I've got this container. Run it, please.' ‘Google Cloud: Run. You got it, boss.' AWS has 17 ways to run containers and they all suck.”And that's painting with an overly broad brush, let's be clear, but that was at the tail end of two or three days of work trying to solve a very specific, very common, business problem, that I was just beating my head off of a wall again and again and again. And it took less than half an hour from start to finish with Google Cloud Run and I didn't have to think about it anymore. And it's one of those moments where you look at this and realize that the future is here, we just don't see it in certain ways. And you took exception to this. So please, let's dive in because 280 characters of text after half a bottle of wine is not the best context to have a nuanced discussion that leaves friendships intact the following morning.Casey: Nice. Well, I just want to make sure I understand the use case first because I was trying to read between the lines on what you needed, but let me take a guess. My guess is you got your source code in GitHub, you have a Docker file, and you want to be able to take that repo from GitHub and just have it continuously deployed somewhere in Run. And you don't want to have headaches with it; you just want to push more changes up to GitHub, Docker Build runs and updates some service somewhere. Am I right so far?Corey: Ish, but think a little further up the stack. It was in service of this show. 
So, this show, as people who are listening to this are probably aware by this point, periodically has sponsors, which we love: We thank them for participating in the ongoing support of this show, which empowers conversations like this. Sometimes a sponsor will come to us with, “Oh, and here's the URL we want to give people.” And it's, “First, you misspelled your company name from the common English word; there are three sublevels within the domain, and then you have a complex UTM tagging tracking co—yeah, you realize people are driving to work when they're listening to this?”So, I've built a while back a link shortener, snark.cloud because is it the shortest thing in the world? Not really, but it's easily understandable when I say that, and people hear it for what it is. And that's been running for a long time as an S3 bucket with full of redirects, behind CloudFront. So, I wind up adding a zero-byte object with a redirect parameter on it, and it just works.Now, the challenge that I have here as a business is that I am increasingly prolific these days. So, anything that I am not directly required to be doing, I probably shouldn't necessarily be the one to do it. And care and feeding of those redirect links is a prime example of this. So, I went hunting, and the things that I was looking for were, obviously, do the redirect. Now, if you pull up GitHub, there are hundreds of solutions here.There are AWS blog posts. One that I really liked and almost got working was Eric Johnson's three-part blog post on how to do it serverlessly, with API Gateway, and DynamoDB, no Lambdas required. I really liked aspects of what that was, but it was complex, I kept smacking into weird challenges as I went, and front end is just baffling to me. 
Because I needed a front end app for people to be able to use here; I need to be able to secure that because it turns out that if you just have a setup where anyone who stumbles across the URL can redirect things to other places, well, you've just empowered a whole bunch of spam email, and you're going to find that service abused, and everyone starts blocking it, and then you have trouble. Nothing survives the first encounter with jerks.

And I was getting more and more frustrated, and then, with a few creative search terms, I found something on GitHub by a Twitter engineer who used to work at Google Cloud. And what it uses as a client is it doesn't build any kind of custom web app. Instead, as a database, it uses not S3 objects, not Route 53 (the ideal database), but a Google Sheet, which sounds ridiculous, but every business user here knows how to use that.

Casey: Sure.

Corey: And it looks for the two columns. The first one is the slug after the snark.cloud, and the second is the long URL. And it has a TTL of five seconds on cache, so make a change to that spreadsheet, five seconds later, it's live. Everyone gets it, I don't have to build anything new, I just put it somewhere the relevant people can access it, I gave them a tutorial and a giant warning on it, and everyone gets that. And it just works well. It was, “Click here to deploy. Follow the steps.”

And the documentation was a little, eh, okay, I had to undo it once and redo it again. Getting the domain ported over took a bit of time, and there were some weird SSL errors as the certificates were set up, but once all of that was done, it just worked. And I tested the heck out of it, and cold starts are relatively low, and the entire thing fits within the free tier. And it is reminiscent of the magic that I first saw when I started working with some of the cloud providers' services, years ago.
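As an editorial aside, the lookup Corey describes (a two-column table of slug and long URL, re-read at most every five seconds) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code of the project he deployed: the class name `SheetBackedShortener`, the `load_rows` callback (a stand-in for whatever actually reads the Google Sheet), and the injectable `clock` are all hypothetical names introduced for illustration.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Rows = Dict[str, str]  # slug -> long URL, i.e. the sheet's two columns


class SheetBackedShortener:
    """Resolve slugs to long URLs from a two-column table with a short cache.

    Hypothetical illustration only; `load_rows` stands in for the spreadsheet read.
    """

    def __init__(
        self,
        load_rows: Callable[[], Rows],
        ttl_seconds: float = 5.0,
        status_code: int = 301,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_rows = load_rows
        self._ttl = ttl_seconds
        self._status_code = status_code
        self._clock = clock
        self._cache: Rows = {}
        self._loaded_at: Optional[float] = None

    def _rows(self) -> Rows:
        # Reload the table only when the cached copy is older than the TTL.
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._cache = dict(self._load_rows())
            self._loaded_at = now
        return self._cache

    def resolve(self, path: str) -> Tuple[int, Optional[str]]:
        # "/quinntainers" -> "quinntainers"; unknown slugs yield a 404.
        slug = path.strip("/")
        target = self._rows().get(slug)
        if target is None:
            return 404, None
        return self._status_code, target
```

With a five-second TTL, an edit to the table becomes visible on the first request that arrives after the cached copy expires, which matches the "five seconds later, it's live" behavior described above.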
It's been a long time since I had that level of delight with something, especially after three days of frustration. It's one of those, “This is a great service. Why are people not shouting about this from the rooftops?” That was my perspective. And I put it out on Twitter and oh, Lord, did I get comments. What was your take on it?

Casey: Well, so my take was, when you're evaluating a platform to use for running your applications, how fast it can get you to Hello World is not necessarily the best way to judge it. I just assumed you're wrong. I assumed of the 17 ways AWS has to run containers, Corey just doesn't understand. And so I went after it. And I said, “Okay, let me see if I can find a way that solves his use case, as I understand it, through a quick tweet.”

And so I tried App Runner; I saw that App Runner does not meet your needs because you have to somehow get your Docker image pushed up to a repo. App Runner can take an image that's already been pushed up and deploy it for you, or it can build from source, but neither of those was the way I understood your use case.

Corey: Having used App Runner before via the Copilot CLI, it is the closest, as best I can tell, to achieving what I want. But also let's be clear that I don't believe there's a free tier; there needs to be a load balancer in front of it, so you're starting with 15 bucks a month for this thing. Which is not the end of the world. Had I known at the beginning that all of this was going to be there, I would have just signed up for a bit.ly account and called it good. But here we are.

Casey: Yeah. I tried Copilot. Copilot is a great developer experience, but it also is just pulling together tons of... I mean, just trying to do a Copilot service deploy, VPCs are being created and tons of IAM roles are being created, code pipelines, there's just so much going on.
I was like 20 minutes into it, and I said, “Yeah, this is not fitting the bill for what Corey was looking for.” Plus, it doesn't solve the way I understood your use case, which is you don't want to worry about builds, you just want to push code and have new Docker images get built for you.

Corey: Well, honestly, let's be clear here, once it's up and running, I don't want to ever have to touch the silly thing again.

Casey: Right.

Corey: And that so far has been the case, after I forked the repo and made a couple of changes to it that I wanted to see. One of them was to render the entire thing case insensitive because I get that one wrong a lot, and the other is I wanted to change the permanent 301 redirect to a temporary 302 redirect because occasionally, sponsors will want to change where it goes in the fullness of time. And that is just fine, but I want to be able to support that and not have to deal with old cached data. So, getting that up and running was a bit of a challenge. But the way that it worked was following the instructions in the GitHub repo.

The developer environment it spun up in Google's Cloud Shell was just spectacular. It prompted me for a few things and it told me step by step what to do. This is the sort of thing I could have given a basically non-technical user, and they would have had success with it.

Casey: So, I tried it as well. I said, “Well, okay, if I'm going to respond to Corey here and challenge him on this, I need to try Cloud Run.” I had no experience with Cloud Run. I had a small example repo that loosely mapped what I understood you were trying to do. Within five minutes, I had Cloud Run working.

And I was surprised: anytime I pushed a new change, within 45 seconds the change was built and deployed. So, here's my conclusion, Corey. Google Cloud Run is great for your use case, and AWS doesn't have the perfect answer. But here's my challenge to you.
I think that you just proved why there's 17 different ways to run containers on AWS, is because there's that many different types of users that have different needs and you just happen to be number 18 that hasn't gotten the right attention yet from AWS.Corey: Well, let's be clear, like, my gag about 17 ways to run containers on AWS was largely a joke, and it went around the internet three times. So, I wrote a list of them on the blog post of “17 Ways to Run Containers in AWS” and people liked it. And then a few months later, I wrote “17 More Ways to Run Containers on AWS” listing 17 additional services that all run containers.And my favorite email that I think I've ever received in feedback was from a salty AWS employee, saying that one of them didn't really count because of some esoteric reason. And it turns out that when I'm trying to make a point of you have a sarcastic number of ways to run containers, pointing out that well, one of them isn't quite valid, doesn't really shatter the argument, let's be very clear here. So, I appreciate the feedback, I always do. And it's partially snark, but there is an element of truth to it in that customers don't want to run containers, by and large. That is what they do in service of a business goal.And they want their application to run which is in turn to serve as the business goal that continues to abstract out into, “Remain a going concern via the current position the company stakes out.” In your case, it is saving lives; in my case, it is fixing horrifying AWS bills and making fun of Amazon at the same time, and in most other places, there are somewhat more prosaic answers to that. But containers are simply an implementation detail, to some extent—to my way of thinking—of getting to that point. An important one [unintelligible 00:18:20], let's be clear, I was very anti-container for a long time. I wrote a talk, “Heresy in the Church of Docker” that then was accepted at ContainerCon. 
It's like, “Oh, boy, I'm not going to leave here alive.”And the honest answer is many years later, that Kubernetes solves almost all the criticisms that I had with the downside of well, first, you have to learn Kubernetes, and that continues to be mind-bogglingly complex from where I sit. There's a reason that I've registered kubernetestheeasyway.com and repointed it to ECS, Amazon's container service that is not requiring you to cosplay as a cloud provider yourself. But even ECS has a number of challenges to it, I want to be very clear here. There are no silver bullets in this.And you're completely correct in that I have a large, complex environment, and the application is nuanced, and I'm willing to invest a few weeks in setting up the baseline underlying infrastructure on AWS with some of these services, ideally not all of them at once because that's something a lunatic would do, but getting them up and running. The other side of it, though, is that if I am trying to evaluate a cloud provider's handling of containers and how this stuff works, the reason that everyone starts with a Hello World-style example is that it delivers ideally, the meantime to dopamine. There's a reason that Hello World doesn't have 18 different dependencies across a bunch of different databases and message queues and all the other complicated parts of running a modern application. Because you just want to see how it works out of the gate. And if getting that baseline empty container that just returns the string ‘Hello World' is that complicated and requires that much work, my takeaway is not that this user experience is going to get better once I'd make the application itself more complicated.So, I find that off-putting. My approach has always been find something that I can get the easy, minimum viable thing up and running on, and then as I expand know that you'll be there to catch me as my needs intensify and become ever more complex. 
But if I can't get the baseline thing up and running, I'm unlikely to be super enthused about continuing to beat my head against the wall like, “Well, I'll just make it more complex. That'll solve the problem.” Because it often does not. That's my position.Casey: Yeah, I agree that dopamine hit is valuable in getting attached to want to invest into whatever tech stack you're using. The challenge is your second part of that. Your second part is will it grow with me and scale with me and support the complex edge cases that I have? And the problem I've seen is a lot of organizations will start with something that's very easy to get started with and then quickly outgrow it, and then come up with all sorts of weird Rube Goldberg-type solutions. Because they jumped all in before seeing—I've got kind of an example of that.I'm happy to announce that there's now 18 ways to run containers on AWS. Because in your use case, in the spirit of AWS customer obsession, I hear your use case, I've created an open-source project that I want to share called Quinntainers—Corey: Oh, no.Casey: —and it solves—yes. Quinntainers is live and is ready for the world. So, now we've got 18 ways to run containers. And if you have Corey's use case of, “Hey, here's my container. Run it for me,” now we've got a one command that you can run to get things going for you. I can share a link for you and you could check it out. This is a [unintelligible 00:21:38]—Corey: Oh, we're putting that in the [show notes 00:21:37], for sure. In fact, if you go to snark.cloud/quinntainers, you'll find it.Casey: You'll find it. There you go. The idea here was this: There is a real use case that you had, and I looked at AWS does not have an out-of-the-box simple solution for you. I agree with that. And Google Cloud Run does.Well, the answer would have been from AWS, “Well, then here, we need to make that solution.” And so that's what this was, was a way to demonstrate that it is a solvable problem. 
AWS has all the right primitives, just that use case hadn't been covered. So, how does Quinntainers work? Real straightforward: It's a command-line... it's an NPM tool.

You just run npx Quinntainer, it sets up a GitHub Actions role in your AWS account, it then creates a GitHub Actions workflow in your repo, and then uses the Quinntainer GitHub action (a reusable action) that creates the image for you every time you push to the branch, pushes it up to ECR, and then automatically pushes up that new version of the image to App Runner for you. So, now it's using App Runner under the covers, but it's providing that nice developer experience that you are getting out of Cloud Run. Look, is Quinntainer really the right way to go with running containers? No, I'm not making that point at all. But the point is it is a...

Corey: It might very well be.

Casey: Well, if you want to show a good Hello World experience, Quinntainer's the best because within 30 seconds, your app is now set up to continuously deliver containers into AWS for your very specific use case. The problem is, it's not going to grow with you. I mean, it was something I did over the weekend just for fun; it's not something that would ever be worthy of hitching up a real production workload to. So, the point there is, you can build frameworks and tools that are very good at getting that initial dopamine hit, but then are not necessarily going to be there for you as you mature and get more complex.

Corey: And yet, I've tilted a couple of times at the windmill of integrating GitHub Actions in anything remotely resembling a programmatic way with AWS services, as far as instance roles go. Are you using permanent credentials for this as stored secrets or are you doing the OIDC handoff?

Casey: OIDC.
So, what happens is the tool creates the IAM role for you with the trust policy on GitHub's OIDC provider, sets all that up for you in your account, locks it down so that just your repo and your main branch are able to assume the role, and the role is set up just to allow deployments to App Runner and the ECR repository. And then that's it. At that point, it's out of your way. And you just git push, and a couple of minutes later, your updates are now running in App Runner for you.

Corey: This episode is sponsored in part by our friends at Vultr. Optimized cloud compute plans have landed at Vultr to deliver lightning fast processing power, courtesy of third gen AMD EPYC processors without the IO, or hardware limitations, of a traditional multi-tenant cloud server. Starting at just 28 bucks a month, users can deploy general purpose, CPU, memory, or storage optimized cloud instances in more than 20 locations across five continents. Without looking, I know that once again, Antarctica has gotten the short end of the stick. Launch your Vultr optimized compute instance in 60 seconds or less on your choice of included operating systems, or bring your own. It's time to ditch convoluted and unpredictable giant tech company billing practices, and say goodbye to noisy neighbors and egregious egress forever.

Vultr delivers the power of the cloud with none of the bloat. "Screaming in the Cloud" listeners can try Vultr for free today with $150 in credit when they visit getvultr.com/screaming. That's G E T V U L T R.com/screaming. My thanks to them for sponsoring this ridiculous podcast.

Corey: Don't undersell what you've just built. This is something that—is this what I would use for a large-scale production deployment? Obviously not. But it has streamlined and made incredibly accessible things that previously have been very complex for folks to get up and running.
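
For readers who want to see what the OIDC handoff described above looks like in practice, here is a minimal sketch of the kind of IAM trust policy involved. It is not taken from Quinntainers itself; the account ID, the repository name, and the helper `github_oidc_trust_policy` are hypothetical placeholders. The provider host, the audience value, and the subject format follow the conventions GitHub and AWS document for GitHub Actions OIDC.

```python
import json


def github_oidc_trust_policy(account_id: str, repo: str, branch: str = "main") -> dict:
    """Build an IAM trust policy that lets one repo and branch assume a role.

    Hypothetical helper for illustration only; not code from Quinntainers.
    """
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The role trusts GitHub's OIDC provider registered in the account.
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Token must be issued for AWS STS ...
                        f"{provider}:aud": "sts.amazonaws.com",
                        # ... and must come from this repo's chosen branch only.
                        f"{provider}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder account ID and repository name.
    policy = github_oidc_trust_policy("123456789012", "example-org/example-app")
    print(json.dumps(policy, indent=2))
```

The `sub` condition is what does the locking down: a workflow running from a fork or from another branch presents a different subject claim and cannot assume the role. No long-lived access keys are stored as repository secrets.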
One of the most disturbing themes behind some of the feedback I got was, at one point someone said, “Well, have you tried running a Docker container on Lambda?” Because now it supports containers as a packaging format. And I said no because I spent a few weeks getting Lambda up and running back when it first came out and I've basically been copying and pasting what I got working ever since, the way most of us do.

And the response is, “Oh, that explains a lot.” With the implication being that I'm just a fool. Maybe, but let's be clear, I am never the only person in the room who doesn't know how to do something; I'm just loud about what I don't know. And the failure mode of a bad user experience is that a customer feels dumb. And that's not okay because this stuff is complicated, and when a user has a bad time, it's a bug.

I learned that in 2012 from Jordan Sissel, the creator of Logstash. He has been an inspiration to me for the last ten years. And that's something I try to live by: that if a user has a bad time, something needs to get fixed. Maybe it's the tool itself, maybe it's the documentation, maybe it's the way that GitHub repo's readme is structured in a way that just makes it accessible.

Because I am not a trailblazer in most things, nor do I intend to be. I'm not the world's best engineer by a landslide. Just look at my code and you'd argue the fact that I'm an engineer at all. But “if it's bad and it works, how bad is it?” is sort of the other side of it.

So, my problem is that there needs to be a couple of things. Ignore for a second the aspect of making it the right answer to get something out of the door. The fact that I want to take this container and just run it, and you and I both reach for App Runner as the default AWS service that does this because I've been swimming in the AWS waters a while and you're a frickin AWS Container Hero, where it is expected that you know what most of these things do.
For someone who shows up on the containers webpage—which by the way lists, I believe, 15 ways to run containers on mobile and 19 ways to run containers on non-mobile, which is just fascinating in its own right—it's overwhelming, it's confusing, and it's not something that makes it abundantly clear what the golden path is. First, get it up and working, get it running, then you can add nuance and flavor and the rest, and I think that's something that's gotten overlooked in our mad rush to pretend that we're all Google engineers, circa 2012.

Casey: Mmm. I think people get stressed out when they try to run containers in AWS because they think, “What is that golden path?” You said golden path. And my advice to people is there is no golden path. And the great thing about AWS is they do continue to invest in the solutions they come up with. I'm still bitter about Google Reader.

Corey: As am I.

Casey: Yeah. I spent so much time getting my perfect set of RSS feeds and then I had to find somewhere else to—with AWS, the different offerings that are available for running containers, those are there intentionally, it's not by accident. They're there to solve specific problems, so the trick is finding what works best for you, and don't feel like one is better than the other or is going to get more attention than the others. And they each have different use cases.

And I approach it this way. I've seen a couple of different people do some great flowcharts—I think Forrest did one, Vlad did one—on ways to make the decision on how to run your containers. And I break it down to three questions. I ask people first of all, where are you going to run these workloads?
If someone says, “It has to be in the data center,” okay, cool, then ECS Anywhere or EKS Anywhere and we'll figure out if Kubernetes is needed.

If they need specific requirements, so if they say, “No, we can run in the cloud, but we need privileged mode for containers,” or, “We need EBS volumes,” or, “We want really small container sizes,” like, less than a quarter-vCPU or less than half a gig of RAM—or if you have custom log requirements, Fargate is not going to work for you, so you're going to run on EC2. Otherwise, run it on Fargate. But that's the first question. Figure out where you are going to run your containers. That leads to the second question: What's your control plane?

But those are different, sort of related but different questions. And I only see six options there. That's App Runner for your control plane, Lightsail for your control plane, ROSA if you're invested in OpenShift already, EKS either if you have momentum in Kubernetes or you have a bunch of engineers that have a bunch of experience with Kubernetes—if you don't have either, don't choose it—or ECS. The last option is Elastic Beanstalk, but let's leave that as a—if you're not currently invested in Elastic Beanstalk, don't start today. But I look at those as okay, so I—first question, where am I going to run my containers? Second question, what do I want to use for my control plane? And there's different pros and cons of each of those.

And then the third question, how do I want to manage them? What tools do I want to use for managing deployment? All those other tools like Copilot or App2Container or Proton, those aren't my control plane; those aren't where I run my containers; that's how I manage, deploy, and orchestrate all the different containers. So, I look at it as those three questions. But I don't know, what do you think of that, Corey?

Corey: I think you're onto something. I think that is a terrific way of exploring that question.
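
The first of Casey's three questions (where the containers run) can be written down as a small decision function. This is only a sketch of the reasoning as stated in the conversation; the `Workload` fields and the `where_to_run` helper are hypothetical names, and the Fargate limitations encoded here are the ones mentioned at the time of recording, which may have changed since.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a container workload's constraints."""
    must_run_on_prem: bool = False
    needs_privileged_mode: bool = False
    needs_ebs_volumes: bool = False
    custom_logging: bool = False
    vcpu: float = 1.0
    memory_gb: float = 2.0


def where_to_run(w: Workload) -> str:
    """Answer question one: where will the containers run?"""
    if w.must_run_on_prem:
        # Data center requirement: the "Anywhere" variants.
        return "ECS Anywhere or EKS Anywhere"
    # Requirements the conversation lists as ruling out Fargate.
    fargate_blockers = (
        w.needs_privileged_mode
        or w.needs_ebs_volumes
        or w.custom_logging
        or w.vcpu < 0.25       # smaller than a quarter vCPU
        or w.memory_gb < 0.5   # smaller than half a gig of RAM
    )
    return "EC2" if fargate_blockers else "Fargate"


if __name__ == "__main__":
    print(where_to_run(Workload()))                            # plain cloud workload
    print(where_to_run(Workload(needs_privileged_mode=True)))  # needs privileged mode
    print(where_to_run(Workload(must_run_on_prem=True)))       # data center only
```

Questions two and three (control plane and deployment tooling) depend more on team experience than on hard constraints, so they are left out of the sketch.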
I would argue that setting up a framework like that—that one or one very similar—is what the AWS containers page should be, just coming from the perspective of what is the neophyte customer experience. On some level, you almost need a slider to choose your level of experience ranging from, “What's a container?” to, “I named my kid Kubernetes because I make terrible life decisions,” and anywhere in between.

Casey: Sure. Yeah, well, and I think that really dictates the control plane level. So, for example, Lightsail, where does Lightsail fit? To me, the value of Lightsail is the simplicity. I'm looking at a monthly pricing: Seven bucks a month for a container.

I don't know how [unintelligible 00:30:23] works, but I can think in terms of monthly pricing. And it's tailored towards a console user, someone who just wants to click in, point to an image. That's a very specific user, there's thousands of customers that are very happy with that experience, and they use it. App Runner presents that scale to zero. That's one of the big selling points I see with App Runner. Likewise, with Google Cloud Run. I've got that scale to zero. I can't do that with ECS, or EKS, or any of the other platforms. So, if you've got something that has a ton of idle time, I'd really be looking at those. I would argue that, I think I did the math, Google Cloud Run is about 30% more expensive than App Runner.

Corey: Yeah, if you disregard the free tier, I think that's—have it running persistently at all times throughout the month, to drop out cold starts, would cost something like 40-some-odd bucks a month or something like that. Don't quote me on it. Again, and to be clear, I wound up doing this very congratulatory and complimentary tweet about them on I think it was Thursday, and then they immediately apparently took one look at this and said, “Holy shit. Corey's saying nice things about us. What do we do? What do we do?” Panic.

And the next morning, they raised prices on a bunch of cloud offerings.
Whew, that'll fix it. Like—

Casey: [laugh].

Corey: Di-, did you miss the direction you're going on here? No, that's the exact opposite of what you should be doing. But here we are. Interestingly enough, to tie our two conversation threads together, when I look at an AWS bill, unless you're using Fargate, I can't tell whether you're using Kubernetes or not because EKS is a small charge, in almost every case, for the control plane, or Fargate under it.

Everything else just manifests as EC2 spend. From the perspective of the cloud provider, if you're running a Kubernetes cluster, it is a single-tenant application that can have some very funky behaviors like cross-AZ chatter back and forth because there's no internal mechanism to say talk to the free thing, rather than the two cents a gigabyte thing. It winds up spinning up and down in a bunch of different ways, and the behavior patterns, because of how placement works, are not necessarily deterministic, depending upon workload. And that becomes something that people find odd when, “Okay, we look at our bill for a week, what can you say?”

“Well, first question. Are you running Kubernetes at all?” And they're like, “Who invited these clowns?” Understand, we're not prying into your workloads for a variety of excellent legal and contractual reasons, here. We are looking at how they behave, and for specific workloads, once we have a conversation with the engineering team, yeah, we're going to dive in, but it is not at all intuitive from the outside to make any determination whether you're running containers, or whether you're running VMs that you just haven't done anything with in 20 years, or what exactly is going on. And that's just an artifact of the billing system.

Casey: We ran into this challenge at Gaggle. We don't use EKS, we use ECS, but we have some shared clusters, lots of EC2 spend, hard to figure out which team is creating the services that's running that up.
We actually ended up creating a tool—we open-sourced it—ECS Chargeback, and what it does is it looks at the CPU and memory reservations for each task definition, and then prorates the overall charge of the ECS cluster, and then creates metrics in Datadog to give us a breakdown of cost per ECS service. And it also measures what we like to refer to as waste, right? Because if you're reserving four gigs of memory, but your utilization never goes over two gigs, we're paying for that reservation, but you're underutilizing.

So, we're able to also show which services have the highest degree of waste, not just utilization, so it helps us go after it. But this is a hard problem. I'd be curious, how do you approach these shared ECS resources and slicing and dicing those bills?

Corey: Everyone has a different approach to this. There is no unifiable, correct answer. A previous show guest, Peter Hamilton, over at Remind had done something very similar, open-sourced a bunch of these things. Understanding what your spend is is important on this, and it comes down to getting at the actual business concern because in some cases, effectively dead reckoning is enough. You take a look at the cluster that is really hard to attribute because it's a shared service. Great. It is 5% of your bill.

First pass, why don't we just agree that it is a third for Service A, two-thirds for Service B, and we'll call it mostly good at that point? That can be enough in a lot of cases. With scale [laugh] you're just sort of hand-waving over many millions of dollars a year there. How about we get into some more depth? And then you start instrumenting and reporting to something, be it CloudWatch, be it Datadog, be it something else, and understanding what the use case is.

In some cases, customers have broken apart shared clusters for that specific reason. I don't think that's necessarily the best approach from an engineering perspective, but again, this is not purely an engineering decision.
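
The proration and waste ideas discussed above can be illustrated with a short sketch. This is not the ECS Chargeback code; the `Service` fields, the `prorate` helper, and the equal weighting of CPU and memory are assumptions made purely for illustration, and the numbers in the example are invented.

```python
from dataclasses import dataclass


@dataclass
class Service:
    """Hypothetical per-service figures pulled from task definitions and metrics."""
    name: str
    cpu_reserved: int       # reserved CPU units
    mem_reserved_mb: int    # reserved memory
    mem_peak_used_mb: int   # highest observed memory utilization


def prorate(cluster_cost: float, services: list[Service]) -> dict[str, dict[str, float]]:
    """Split a shared cluster's cost by reservations and report memory waste.

    Assumes CPU and memory each account for half of the cost, which is an
    arbitrary illustrative choice, and that every reservation is nonzero.
    """
    total_cpu = sum(s.cpu_reserved for s in services)
    total_mem = sum(s.mem_reserved_mb for s in services)
    report = {}
    for s in services:
        share = 0.5 * s.cpu_reserved / total_cpu + 0.5 * s.mem_reserved_mb / total_mem
        # Waste: the fraction of reserved memory that was never used.
        waste = 1 - s.mem_peak_used_mb / s.mem_reserved_mb
        report[s.name] = {
            "cost": round(cluster_cost * share, 2),
            "memory_waste": round(waste, 2),
        }
    return report


if __name__ == "__main__":
    services = [
        # Reserves four gigs but never goes over two, as in the example above.
        Service("api", cpu_reserved=1024, mem_reserved_mb=4096, mem_peak_used_mb=2048),
        Service("worker", cpu_reserved=3072, mem_reserved_mb=4096, mem_peak_used_mb=4096),
    ]
    for name, row in prorate(1000.0, services).items():
        print(name, row)
```

With these made-up inputs the `api` service is charged 375.00 of a 1000.00 cluster bill and shows 50% memory waste, while `worker` is charged 625.00 with no waste. In a real setup the rows would be emitted as metrics to a monitoring system rather than printed.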
It comes down to serving the business need. And if you're taking up partial credits on that cluster, for a tax credit for R&D for example, you want that position to be extraordinarily defensible, and spending a few extra dollars to ensure that is the right business decision. I mean, again, we're pure advisory; we advise customers on what we would do in their position, but people often mistake that to be we're going to go for the lowest possible price—bad idea—or that we're going to wind up doing this from a purely engineering-centric point of view.

It's, be aware that in almost every case, with some very notable weird exceptions, the AWS bill costs significantly less than the payroll expense that you have of people working on the AWS environment in various ways. People are more expensive, so the idea of, well, you can save a whole bunch of engineering effort by spending a bit more on your cloud, yeah, let's go ahead and do that.

Casey: Yeah, good point.

Corey: The real mark of someone who's senior enough is their answer to almost any question is, “It depends.” And I feel I've fallen into that trap as well. Much as I'd love to sit here and say, “Oh, it's really simple. You do X, Y, and Z.” Yeah… honestly, my answer, the simple answer, is I think that we orchestrate a cyber-bullying campaign against AWS through the AWS wishlist hashtag, we get people to harass their account managers with repeated requests for, “Hey, could you go ahead and [dip 00:36:19] that thing in—they give that a plus-one for me, whatever internal system you're using?”

Just because this is a problem we're seeing more and more. Given that it's an unbounded growth problem, we're going to see it more and more for the foreseeable future. So, I wish I had a better answer for you, but yeah, “that stuff's super hard” is honest, but it's also not the most useful answer for most of us.

Casey: I'd love feedback from you or anyone on your team on that tool that we created.
I can share a link after the fact. ECS Chargeback is what we call it.

Corey: Excellent. I will follow up with you separately on that. That is always worth diving into. I'm curious to see new and exciting approaches to this. Just be aware that we have an obnoxious talent sometimes for seeing these things and, “Well, what about”—and asking about some weird corner edge case that either invalidates the entire thing, or you're like, “Who on earth would ever have a problem like that?” And the answer is always, “The next customer.”

Casey: Yeah.

Corey: For a bounded problem space like the AWS bill, every time I think I've seen it all, I just have to talk to one more customer.

Casey: Mmm. Cool.

Corey: In fact, the way that we approached your teardown in the restaurant is how we launched our first pass approach. Because the value in something like that is different than the value of a six to eight-week-long, deep-dive engagement into every nook and cranny. And—

Casey: Yeah, for sure. It was valuable to us.

Corey: Yeah, having someone come in to just spend a day with your team, diving into it up one side and down the other, it seems like a weird thing, like, “How much good could you possibly do in a day?” And the answer in some cases is—we had Honeycomb saying that in a couple of days of something like this, we wound up blowing 10% off their entire operating budget for the company, it led to an increased valuation, and Liz Fong-Jones has said on multiple occasions that the company would not be what it was without our efforts on their bill, which is just incredibly gratifying to hear. It's easy to get lost in the idea of, well, it's the AWS bill. It's just making big companies spend a little bit less to another big company. And that's not exactly, you know, saving the lives of K through 12 students here.

Casey: It's opening up opportunities.

Corey: Yeah. It's about optimizing for the win for everyone.
Because now AWS gets a lot more money from Honeycomb than they would if Honeycomb had not continued on their trajectory. It's, you can charge customers a lot right now, or you can charge them a little bit over time and grow with them in a partnership context. I've always opted for the second model rather than the first.

Casey: Right on.

Corey: But here we are. I want to thank you for taking so much time out of, well, several days now to argue with me on Twitter, which is always appreciated, particularly when it's, you know, constructive—thanks for that—

Casey: Yeah.

Corey: For helping me get my business partner to re:Invent, although then he got me that horrible puzzle of 1000 pieces for the Cloud Native Computing Foundation landscape and now I don't ever want to see him again—so you know, that happens—and of course, spending the time to write Quinntainers, which is going to be at snark.cloud/quinntainers as soon as we're done with this recording. Then I'm going to kick the tires and send some pull requests.

Casey: Right on. Yeah, thanks for having me. I appreciate you starting the conversation. I would just conclude with, I think that yes, there are a lot of ways to run containers in AWS; don't let it stress you out. They're there with intention, they're there by design. Understand them.

I would also encourage people to go a little deeper, especially if you've got a significantly large workload. You got to get your hands dirty. As a matter of fact, there's a hands-on lab that a company called Liatrio does. They call it their Night Lab; it's a one-day free, hands-on, you run legacy monolithic job applications on Kubernetes, gives you first-hand experience on how to—gets all the way up into observability and doing things like canary deployments. It's a great, great lab.

But you got to do something like that to really get your hands dirty and understand how these things work. So, don't sweat it; there's not one right way.
There's a way that will probably work best for each user, and just take the time and understand the ways to make sure you're applying the one that's going to give you the most runway for your workload.

Corey: I will definitely dig into that myself. But I think you're right, I think you have nailed a point that is, again, a nuanced one and challenging to put in a rage tweet. But the services don't exist in a vacuum. They're not there because, despite the joke, someone wants to get promoted. It's because there are customer needs that are going on, and this is another way of meeting those needs.

I think there could be better guidance, but I also understand that there are a lot of nuanced perspectives here and that… hell is someone else's workflow—

Casey: [laugh].

Corey: —and there's always value in broadening your perspective a bit on those things. If people want to learn more about you and how you see the world, where's the best place to find you?

Casey: Probably on Twitter: twitter.com/nektos, N-E-K-T-O-S.

Corey: That might be the first time Twitter has been described as a best place for anything. But—

Casey: [laugh].

Corey: Thank you once again, for your time. It is always appreciated.

Casey: Thanks, Corey.

Corey: Casey Lee, CTO at Gaggle and AWS Container Hero. And apparently writing code in anger to invalidate my points, which is always appreciated. Please do more of that, folks. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, or the YouTube comments, which is always a great place to go reading, whereas if you've hated this podcast, please leave a five-star review in the usual places and an angry comment telling me that I'm completely wrong, and then launching your own open-source tool to point out exactly what I've gotten wrong this time.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

AWS на русском
006. How to Choose a Region in AWS

AWS на русском

Play Episode Listen Later Feb 8, 2022 22:41


In episode 6 we covered: What should you pay attention to when choosing a region? Which 4 key facts will help you make the right choice? What is the difference between a region and an AZ? Why can the zone named eu-west-1b in one account be a different zone in others? When is it worth building multi-region solutions, and in which cases is a multi-AZ solution enough? Timecodes: 00:00:00 - Start 00:02:40 - Definition of a region and an AZ 00:06:05 - How to choose a region (4 key principles for choosing a region) 00:10:00 - Multi-region solutions: when they are needed 00:14:50 - What an AZ ID is and how to find it 00:16:05 - Local Zones and Wavelength Zones 00:19:05 - CloudFront points of presence 00:20:05 - Summary Links: Core concepts, 4 facts for choosing a region. Listen to the "AWS на русском" podcast and share it with your colleagues!
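
As a rough illustration of the AZ ID topic listed at 00:14:50: AZ names such as eu-west-1b are mapped per account, while AZ IDs identify the same physical zone for everyone. The sketch below extracts that mapping from the response shape returned by the EC2 `DescribeAvailabilityZones` API. The helper `az_name_to_id` and the sample response are hypothetical; the particular name-to-ID pairs shown are invented and will differ in a real account.

```python
def az_name_to_id(response: dict) -> dict[str, str]:
    """Map account-specific AZ names to account-independent AZ IDs.

    `response` is expected to look like the output of
    boto3.client("ec2", region_name="eu-west-1").describe_availability_zones()
    """
    return {
        zone["ZoneName"]: zone["ZoneId"]
        for zone in response["AvailabilityZones"]
    }


if __name__ == "__main__":
    # Invented sample data in the shape of a real response.
    sample_response = {
        "AvailabilityZones": [
            {"ZoneName": "eu-west-1a", "ZoneId": "euw1-az3", "State": "available"},
            {"ZoneName": "eu-west-1b", "ZoneId": "euw1-az1", "State": "available"},
            {"ZoneName": "eu-west-1c", "ZoneId": "euw1-az2", "State": "available"},
        ]
    }
    for name, zone_id in az_name_to_id(sample_response).items():
        print(f"{name} -> {zone_id}")
```

Comparing AZ IDs rather than names is the reliable way to check whether resources in two different accounts sit in the same zone.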

AWS Bites
13. What's on your re:Invent 2021 wish list?

AWS Bites

Play Episode Listen Later Nov 28, 2021 30:31


In this special episode, Eoin and Luciano talk about their wishlist for AWS re:invent 2021. Based on our experience and personal AWS pain points, we share some of our wishes for new announcements during the biggest cloud event of the year. We also discuss some of the biggest announcements of last year and a few tips on how to get ready to follow the announcements of the next few days. CORRECTION: The changes in data transfer were not reported accurately in this episode. The monthly data transfer free tier limit has changed from 1 GB/month per region to 100GB/month for all regions. Data transfer out of CloudFront is now free for 1TB/month, up from 50GB/month. See the official announcements linked below. In this episode we mentioned the following resources: - Serverless Airline booking app example: https://github.com/aws-samples/aws-serverless-airline-booking - AWS Wild Rydes example: http://www.wildrydes.com/ - AWS Workshops: https://workshops.aws/ - Data transfer free tier increase: 1) https://aws.amazon.com/blogs/aws/aws-free-tier-data-transfer-expansion-100-gb-from-regions-and-1-tb-from-amazon-cloudfront-per-month/ and 2) https://aws.amazon.com/about-aws/whats-new/2021/11/aws-price-reduction-data-transfers-internet/ - Export Amplify projects to CDK: https://aws.amazon.com/about-aws/whats-new/2021/11/aws-amplify-export-amplify-backends-cdk-stacks-integrate-cdk-based-pipelines/ - CDK hotswap: https://aws.amazon.com/about-aws/whats-new/2021/11/aws-cdk-new-releases-api-apprunner-hotswap-amazon-ecs-step-functions - Partial SQS batch response: https://aws.amazon.com/about-aws/whats-new/2021/11/aws-lambda-partial-batch-response-sqs-event-source/ This episode is also available on YouTube: https://www.youtube.com/AWSBites You can listen to AWS Bites wherever you get your podcasts: - Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-bites/id1585489017 - Spotify: https://open.spotify.com/show/3Lh7PzqBFV6yt5WsTAmO5q - Google: 
https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy82YTMzMTJhMC9wb2RjYXN0L3Jzcw== - Breaker: https://www.breaker.audio/aws-bites - RSS: https://anchor.fm/s/6a3312a0/podcast/rss Do you have any AWS questions you would like us to address? Connect with us on Twitter: - https://twitter.com/eoins - https://twitter.com/loige

Perfectly Boring
Innovating in Hardware, Software, and the Public Cloud with Steve Tuck, CEO/Co-Founder of Oxide Computer

Perfectly Boring

Play Episode Listen Later Sep 27, 2021 53:14


In this episode, we cover:
00:00:00 - Reflections on the Episode/Introduction
00:03:06 - Steve's Bio
00:07:30 - The 5 W's of Servers and their Future
00:14:00 - Hardware and Software
00:21:00 - Oxide Computer
00:30:00 - Investing in Oxide and the Public Cloud
00:36:20 - Oxide's Offerings to Customers
00:43:30 - Continuous Improvement
00:49:00 - Oxide's Future and Outro

Links:
Oxide Computer: https://oxide.computer
Perfectlyboring.com: https://perfectlyboring.com

Transcript

Jason: Welcome to the Perfectly Boring podcast, a show where we talk to the people transforming the world's most boring industries. I'm Jason Black, general partner at RRE Ventures.

Will: And I'm Will Coffield, general partner at Riot Ventures.

Jason: Today's boring topic of the day: servers.

Will: Today, we've got Steve Tuck, the co-founder and CEO of Oxide Computer, on the podcast. Oxide is on a mission to fundamentally transform the private cloud and on-premise data center so that companies that are not Google, or Microsoft, or Amazon can have hyper scalable, ultra performant infrastructure at their beck and call. I've been an investor in the company for the last two or three years at this point, but Jason, this is your first time hearing the story from Steve and really going deep on Oxide's mission and place in the market. Curious what your initial thoughts are.

Jason: At first glance, Oxide feels like a faster horse approach to an industry buying cars left and right. But the shift to the cloud will add $140 billion in new spend every year over the next five years. But one of the big things that was really interesting in the conversation was that it's actually the overarching pie that's expanding, not just demand for cloud but, at the same rate, a demand for on-premise infrastructure that's largely been stagnant over the years.
One of the interesting pivot points was when hardware and software were integrated back in the mainframe era, and then virtual machines kind of divorced hardware and software at the server level, opening up the opportunity for a public cloud that reunified those two things where your software and hardware ran together, but the on-premises side never really recaptured that software layer and has historically struggled to innovate on that domain.

Will: Yeah, it's an interesting inflection point for the enterprise, and for basically any company that is operating digitally at this point, is that you're stuck between a rock and a hard place. You can scale infinitely on the public cloud but you make certain sacrifices from a performance, security, and certainly from an expense standpoint, or you can go to what is available commercially right now and you can cobble together a Frankenstein-esque solution from a bunch of legacy providers like HP, and Dell, and SolarWinds, and VMware into a MacGyvered together on-premise data center that is difficult to operate for companies where infrastructure isn't, and they don't want it to be, their core competency. Oxide is looking to step into that void and provide an infinitely scalable, ultra-high-performance, plug-and-play rack-scale server for everybody to be able to own and operate without needing to rent it from Google, or AWS, or Microsoft.

Jason: Well, it doesn't sound very fun, and it definitely sounds [laugh] very boring. So, before we go too deep, let's jump into the interview with Steve.

Will: Steve Tuck, founder and CEO of Oxide Computer. Thank you for joining us today.

Steve: Yeah, thanks for having me. Looking forward to it.

Will: And I think maybe a great way to kick things off here for listeners would be to give folks a baseline of your background, sort of your bio, leading up to founding Oxide.

Steve: Sure. Born and raised in the Bay Area.
Grew up in a family business that was and has been focused on heating and air conditioning over the last 100-plus years, Atlas. And went to school and then straight out of school, went into the computer space. Joined Dell computer company in 1999, which was a pretty fun and exciting time at Dell.

I think that Dell had just crossed over to being the number one PC manufacturer in the US. I think number two worldwide to Compaq. Really just got to take in and appreciate the direct approach that Dell had taken in a market to stand apart, working directly with customers, not pushing everything to the channel, which was customary for a lot of the PC vendors at the time. And while I was there, you had the emergence of—in the enterprise—a hardware virtualization company called VMware that at the time had a product that allowed one to drive a lot more density on their servers by way of virtualizing the hardware that people were running. And watching that become much more pervasive, and working with companies as they began to shift from single system, single app to virtualized environments.

And then at the tail end, just watching large tech companies emerge and demand a lot different style computers than those that we had been customarily making at Dell. And kind of fascinated with just what these companies like Facebook, and Google, and Amazon, and others were doing to reimagine what systems needed to look like in their hyperscale environments. One of the companies that was in the tech space, Joyent, a cloud computing company, is where I went next. Was really drawn in just by the velocity and the innovation that was taking place with these companies that were providing abstractions on top of hardware to make it much easier for customers to get access to the compute, and the storage, and the networking that they needed to build and deploy software. So, spent—after ten years at Dell, I was at Joyent for ten years.
That is where I met my future co-founders, Bryan Cantrill who was at Joyent, and then also Jess Frazelle who we knew working closely while she was at Docker and other stops.

But spent ten years as a public cloud infrastructure operator, and we built that service out to support workloads that ran the gamut from small game developers up to very large enterprises, and it was really interesting to learn about and appreciate what this infrastructure utility business looked like in public cloud. And that was also kind of where I got my first realization of just how hard it was to run large fleets of the systems that I had been responsible for providing back at Dell for ten years. We were obviously a large customer of Dell, and Supermicro, and a number of switch manufacturers. It was eye-opening just how much was lacking in the remaining software to bind together hundreds or thousands of these machines.

A lot of the operational tooling that I wished had been there, and how much we were living in spreadsheets to manage and organize and deploy this infrastructure. While there, also got to kind of see firsthand what happened as customers got really, really big in the public cloud. And one of those was Samsung, who was a very large AWS customer, got so large that they needed to figure out what their path on-premise would look like. And after going through the landscape of all the legacy enterprise solutions, deemed that they had to go buy a cloud company to complete that journey. And they bought Joyent. Spent three years operating the Samsung cloud, and then that brings us to two years ago, when Jess, Bryan, and I started Oxide Computer.

Will: I think maybe for the benefit of our listeners, it would be interesting to have you define—and what we're talking about today is the server industry—and to maybe take a step back and in your own words, define what a server is.
And then it would be really interesting to jump into a high-level history of the server up until today, and maybe within that, where the emergence of the public cloud came from.

Steve: You know, you'll probably get different definitions of what a server is depending on who you ask, but at the highest level, a server differs from a typical PC that you would have in your home in a couple of ways, and more about what it is being asked to do that drives the requirements of what one would deem a server. But if you think about a basic PC that you're running in your home, a laptop, a desktop, a server has a lot of the same components: they have CPUs, and DRAM memory that is for volatile storage, and disks that are storing things in a persistent way when you shut off your computer that actually store and retain the data, and a network card so that you can connect to either other machines or to the internet. But where servers start to take on a little bit different shape and a little bit different set of responsibilities is the workloads that they're supporting. Servers, the expectations are that they are going to be running 24/7 in a highly reliable and highly available manner. And so there are technologies that have gone into servers, like ECC memory to ensure that you do not have memory faults that lose data, more robust components internally, ways to manage these things remotely, and ways to connect these to other servers, other computers.

Servers, when running well, are things you don't really need to think about, are doing that, are running in a resilient, highly available manner. In terms of the arc of the server industry, if you go back—I mean, there's been servers for many, many, many, many decades. 
Some of the earlier commercially available servers were called mainframes, and these were big monolithic systems that had a lot of hardware resources at the time, and then were combined with a lot of operational and utilization software to be able to run a variety of tasks. These were giant, giant machines; these were extraordinarily expensive; you would typically find them only running in universities or government projects, maybe some very, very large enterprises in the '60s and '70s. As more and more software was being built and developed and run, the market demand and need for smaller, more accessible servers that were going to be running this common software, were driving machines that were coming out—still hardware plus software—from the likes of IBM and DEC and others.

Then you broke into this period in the '80s where, with the advent of x86 and the rise of these PC manufacturers—the Dells and Compaqs and others—this transition to more commodity server systems. A focus, really a focus on hardware only, and building these commodity x86 servers that were less expensive, that were more accessible from an economics perspective, and then ultimately that would be able to run arbitrary software, so one could run any operating system or any body of software that they wanted on these commodity servers. When I got to Dell in 1999, this is several years into Dell's foray into the server market, and you would buy a server from Dell, or from HP, or from Compaq, or IBM, then you would go find your software that you were going to run on top of that to stitch these machines together. That was, kind of, that server virtualization era, in the '90s, 2000s. 
As I mentioned, technology companies were looking at building more scalable systems that were aggregating resources together and making it much easier for their customers to access the storage, the networking that they needed. In that period of time in which the commodity servers and the software industry diverged, and you had a bunch of different companies that were responsible for either hardware or the software that would bring these computers together, these large hyperscalers said, “Well, we're building purpose-built infrastructure services for our constituents at, like, a Facebook. That means we really need to bind this hardware and software together in a single product so that our software teams can go very quickly and they can programmatically access the resources that they need to deploy software.”

So, they began to develop systems that looked more monolithic, kind of, rack-level systems that were driving much better efficiency from a power and density perspective, and hydrating it with software to provide infrastructure services to their businesses. And so you saw, what started out in the computer industry as these monolithic hardware plus software products that were not very accessible because they were so expensive and so large, but real products that were much easier to do real work on, to this period where you had a disaggregation of hardware and software where the end-user bore the responsibility of tying these things together and binding these into those infrastructure products, to today, where the largest hyperscalers in the market have come to the realization that building hardware and software together and designing and developing what modern computers should look like, is commonplace, and we all know that well or can access that as public cloud computing.

Jason: And what was the driving force behind that decoupling? Was it the actual hardware vendors that didn't want to have to deal with the software? 
Or is that more from a customer-facing perspective where the customers themselves felt that they could eke out the best advantage by developing their own software stack on top of a relatively commodity unopinionated hardware stack that they could buy from a Dell or an HP?

Steve: Yeah, I think probably both, but one thing that was a driver is that these were PC companies. So, coming out of the '80s, companies that were considered, quote-unquote, “The IBM clones,” Dell, and Compaq, and HP, and others that were building personal computers and saw an opportunity to build more robust personal computers that could be sold to customers who were running, again, just arbitrary software. There wasn't the desire nor the DNA to go build that full software stack and provide that out as an opinionated appliance or product. And I think then, part of it was also like, hey, if we just focus on the hardware, then we've got this high utility artifact that we can go sell into all sorts of arbitrary software use cases. You know, whether this is going to be a single server or three servers that's going to go run in a closet of a cafe, or it's going to be a thousand servers that are running in one of these large enterprise data centers, we get to build the same box, and that box can run underneath any different type of software. By way of that, what you ultimately get in that scenario is you do have to boil things down to the lowest common denominators to make sure that you've got that compatibility across all the different software types.

Will: Who were the primary software vendors that were helping those companies take commodity servers and specialize into particular areas? 
And what's their role now and how has that transformed in light of the public cloud and the offerings that are once again generalized, but also reintegrated from a hardware and software perspective, just not maybe in your own server room, but in AWS, or Azure, or GCP?

Steve: Yeah, so you have a couple layers of software that are required in the operation of hardware, and then all the way up through what we would think about as running in a rack, a full rack system today. You've first got firmware, and this is the software that runs on the hardware to be able to connect the different hardware components, to boot the system, to make sure that the CPU can talk to its memory, and storage, and the network. That software may be a surprise to some, but that firmware that is essential to the hardware itself is not made by the server manufacturer themselves. That was part of this outsourcing exercise in the '80s where not only the upstack software that runs on server systems but actually some of the lower-level downstack software was outsourced to these third-party firmware shops that would write that software. And at the time, probably made a lot of sense and made things a lot easier for the entire ecosystem.

You know, the fact that's the same model today, and given how proprietary that is and, you know, where that can actually lead to some vulnerabilities and security issues is more problematic. You've got firmware, then you've got the operating system that runs on top of the server. You have a hypervisor, which is the emulation layer that translates that lower-level hardware into a number of virtual machines that applications can run in. You have control plane software that connects multiple systems together so that you can have five or ten or a hundred, or a thousand servers working in a pool, in a fleet. 
And then you've got higher-level software that allows a user to carve up the resources that they need to identify the amount of compute, and memory, and storage that they want to spin up.

And that is exposed to the end-user by way of APIs and/or a user interface. And so you've got many layers of software that are running on top of hardware, and the two in conjunction are all there to provide infrastructure services to the end-user. And so when you're going to the public cloud today, you don't have to worry about any of that, right? Both of you have probably spun up infrastructure on the public cloud, but they call it 16 digits to freedom because you just swipe a credit card and hit an API, and within seconds, certainly within a minute, you've got readily available virtual servers and services that allow you to deploy software quickly and manage a project with team members. And the kinds of things that used to take days, weeks, or even months inside an enterprise can be done now in a matter of minutes, and that's extraordinarily powerful.

But what you don't see is all the integration of these different components running, very well stitched together under the hood. Now, for someone who's deploying their own infrastructure in their own data center today, that sausage-making is very evident. Today, if you're not a cloud hyperscaler, you are having to go pick a hardware vendor and then figure out your operating system and your control plane and your hypervisor, and you have to bind all those things together to create a rack-level system. And it might have three or four different vendors and three or four different products inside of it, and ultimately, you have to bear the responsibility of knitting all that together.

Will: Because those products were developed in silos from each other?

Steve: Yeah.

Will: They were not co-developed. 
You've got hardware that was designed in a silo separate from oftentimes it sounds like the firmware and all of the software for operating those resources.

Steve: Yeah. The hardware has a certain set of market user requirements, and then if you're a Red Hat or you're a VMware, you're talking to your customers about what they need and you're thinking at the software layer. And then you yourself are trying to make it such that it can run across ten or twenty different types of hardware, which means that you cannot do things that bind or provide hooks into that underlying hardware which, unfortunately, is where a ton of value comes from. You can see an analog to this in thinking about the Android ecosystem compared to the Apple ecosystem and what that experience is like when all that hardware and software is integrated together, co-designed together, and you have that iPhone experience. Plenty of other analogs in the automotive industry, with Tesla, and health equipment, and Peloton and others, but when hardware and software—we believe certainly—when hardware and software is co-designed together, you get a better artifact and you get a much, much better user experience. Unfortunately, that is just not the case today in on-prem computing.

Jason: So, this is probably a great time to transition to Oxide. Maybe to keep the analogy going, the public cloud is that iPhone experience, but it's just running in somebody else's data center, whether that's AWS, Azure, or one of the other public clouds. You're developing iOS for on-prem, for the people who want to run their own servers, which seems like kind of a countertrend. 
Maybe you can talk us through the dynamics in that market as it stands today, and how that's growing and evolving, and what role Oxide Computer plays in that, going forward.

Steve: You've got this what my co-founder Jess affectionately refers to as ‘infrastructure privilege' in the hyperscalers, where they have been able to apply the money, and the time, and the resources to develop this, kind of, iPhone stack. Instead of thinking about a server as a single 1U unit, or single machine, they looked at, well, what does a rack—which is the case that servers are slotted into in these large data centers—what does rack-level computing look like and where can we drive better power efficiency? Where can we drive better density? How can we drive much better security at scale than the commodity server market today? And doing things like implementing hardware Roots of Trust and Chain of Trust, so that you can ensure the software that is running on your machines is what is intended to be running there. The blessing is that we all—the market—gets access to that modern infrastructure, but you can only rent it.

The only way you can access it is to rent, and that means that you need to run in one of the three mega cloud providers' data centers in those locations, that you are having to operate in a rental fee model, which at scale can become very, very prohibitively expensive. Our fundamental belief is that the way that these hyperscale data centers have been designed and these products have been designed certainly looks a lot more like what modern computers should look like, but the rest of the market should have access to the same thing. You should be able to buy and own and deploy that same product that runs inside a Facebook data center, or Apple data center, or Amazon, or a Google data center, you should be able to take that product with you wherever your business needs to run. 
A bit intimidating at the top because what we signed up for was building hardware, and taking a clean sheet of paper approach to what a modern server could look like. There's a lot of good hardware innovation that the hyperscalers have helped drive; if you go back to 2010, Facebook pioneered being a lot more open about these modern open hardware systems that they were developing, and the Open Compute Project, OCP, has been a great collection point for these hyperscalers investing in these modern rack-level systems and doing it in the open, thinking about what the software is that is required to operate modern machines, importantly, in a way that does not sink the operations teams of the enterprises that are running them.

Again, I think one of the things that was just so stunning to me, when I was at Joyent—we were running these machines, these commodity machines, and stitching together the software at scale—was how much of the organization's time was tied up in the deployment, and the integration, and the operation of this. And not just the organization's time, but actually our most precious resource, our engineering team, was having to spend so much time figuring out where a performance problem was coming from, for example. Man, those are the times in which you really are pounding your fist on the table because you will try and go downstack to figure out, is this in the control plane? Is this in the firmware? Is this in the hardware?

And commodity systems of today make it extremely, extremely difficult to figure that out. 
But what we set out to do was build the same rack-level system that you might find in a hyperscaler data center, complete with all the software that you need to operate it with the automation required for high availability and low operational overhead, and then with a cloud front end, with a set of services on the front end of that rack-level system that delight developers, that look like the cloud experience that developers have come to love and depend on in the public cloud. And that means everything is programmable, API-driven services, all the hardware resources that you need—compute, memory, and storage—are actually a pool of resources that you can carve up and get access to and use in a very developer-friendly way. And the developer tools that your software teams have come to depend on just work and all the tooling that these developers have invested so much time in over the last several years, to be able to automate things, to be able to deploy software faster are resident in that product. And so it is definitely kind of hardware and software co-designed, much like some of the original servers long, long, long ago, but modernized with the hardware innovation and open software approach that the cloud has ushered in.

Jason: And give us a sense of scale; I think we're so used to seeing the headline numbers of the public cloud, you know, $300-and-some billion today, adding $740-some billion over the next five years in public cloud spend. It's obviously a massive transformation, huge amount of green space up for grabs. What's happening in the on-prem market where Oxide Computer is playing and how do you think about the growth in that market relative to a public cloud?

Steve: It's funny because as Will can attest, as we were going through and fundraising, the prevalent sentiment was, like, everything's going to the public cloud. As we're talking to the folks in the VC community, it was Amazon, Microsoft, and Google are going to own the entirety of compute. 
We fundamentally disagreed because, A, we've lived it, and B, we went out as we were starting out and talked to dozens and dozens of our peers in the enterprise, who said, “Our cloud ambitions are to be able to get 20, 30, 40% of our workloads out there, and then we still have 60, 70% of our infrastructure that is going to continue to run in our own data centers for reasons including regulatory compliance, latency, security, and in a lot of cases, cost.” It's not possible for these enterprises that are spending half a billion, a billion dollars a year to run all of their infrastructure in the public cloud. What you've seen on-premises, and it depends on who you're turning to, what sort of poll and research you're turning to, but the on-prem market, one is growing, which I think surprises a lot of folks; the public cloud market, of course, it's growing like gangbusters, and that does not surprise a lot of folks, but what we see is that the combined market of on-prem and cloud, you can call it—if on-premises is on the order of $100 billion and cloud is on the order of $150 billion, you are going to see enormous growth in both places over the next 10, 15 years.

These markets are going to look very, very small compared to where they will be because one of the biggest drivers of whether it's public cloud or on-prem infrastructure, is everything shifting to digital formats. The digitalization that is just all too commonplace, described everywhere. But we're still very, very early in that journey. I think that if you look at the global GDP, less than 10% of the global GDP is on the internet, is online. Over the coming 10, 20 years, as that shifts to 20%, 30%, you're seeing exponential growth. 
And again, we believe and we have heard from the market that is representative of that $100 billion that investments in the public cloud and on-prem is going to continue to grow much, much more as we look forward.

Will: Steve, I really appreciate you letting listeners know how special a VC I am.

Steve: [laugh].

Will: [laugh]. It was a really important point that I wanted to make sure we hit on.

Steve: Yeah, should we come back to that?

Will: Yeah, yeah yeah—

Steve: Yeah, let's spend another five or ten minutes on that.

Will: —we'll revisit that. We'll revisit that later. But when we're talking about the market here, one of the things that got us so excited about investing in Oxide is looking at the existing ecosystem of on-prem commercial providers. I think if you look at the public cloud, there are fierce competitors there, with unbelievably sophisticated operations and product development. When you look at the on-prem ecosystem and who you would go to if you were going to build your own data center today, it's a lot of legacy companies that have started to optimize more for, I would say, profitability over the last couple of years than they have for really continuing to drive forward from an R&D and product standpoint.

Would love maybe for you to touch on briefly, what does competition for you look like in the on-prem ecosystem? I think it's very clear who you're competing with, from a public cloud perspective, right? It's Microsoft, Google, Amazon, but who are you going up against in the on-prem ecosystem?

Steve: Yeah. And just one note on that. We don't view ourselves as competing with Amazon, Google, and Microsoft. In fact, we are ardent supporters of cloud in the format, namely this kind of programmable API-fronted infrastructure as being the path of the future of compute and storage and networking. 
That is the way that, in the future, most software should be deployed to, and operated on, and run.

We just view the opportunity for, and what customers are really, really excited about is having those same benefits of public cloud, but in a format in which they can own it and being able to have access to that everywhere their business needs to run, so that it's not, you know, do I get all this velocity, and this innovation, and this simplicity when I rent public cloud, or do I own my infrastructure and have to give up a lot of that? But to the first part of your question, I think the first issue is that it isn't one vendor that you are talking about. It's, what is the collection of vendors that I go to to get servers, software to make my servers talk to each other, switches to network together these servers, and additional software to operate, and manage, and monitor, and update. And there's a lot of complexity there. And then when you take apart each one of those different sets of vendors in the ecosystem, they're not designing together, so you've got these kind of data boundaries and these product boundaries that start to become really, really real when you're operating at scale, and when you're running critical applications to your business on these machines. And you find yourself spending an enormous amount of the company's time just knitting this stuff together and operating it, which is all time lost that could be spent adding additional features to your own product and better competing with your competitors.

And so I think that you have a couple of things in play that make it hard for customers running infrastructure on-premises, you've got that dynamic that it's a fractured ecosystem, that these things are not designed together, that you have this kit car that you have to assemble yourself and it doesn't even come with a blueprint of the particular car design that you're building. 
I think that you do have some profit-taking in that it is very monopolized, especially on the software side where you've only got a couple of large players that know that there are few alternatives for companies. And so you are seeing these ELAs balloon, and you are seeing practices that look a lot like Oracle Enterprise software sales that are really making this on-prem experience not very economically attractive. And so our approach is, hardware should come with all the software required to operate it, it should be tightly integrated, the software should be all open-source. Something we haven't talked about.

I think open-source is playing an enormous role in accelerating the cloud landscape and the technology landscapes. We are going to be developing our software in an open manner, and truly believe whether it's from a security view through to the open ecosystem, that it is imperative that software be open. And then we are integrating the switch into that rack-level product so that you've got networking baked in. By doing that, it opens up a whole new vector of value to the customer where, for example, you can see for the first time what is the path of traffic from my virtual machine to a switch port? Or when things are not performing well, being able to look into that path, and the health, and see where things are not performing as well as they should, and being able to mitigate those sorts of issues.

It does turn out if you are able to get rid of a lot of the old, crufty artifacts that have built up inside even these commodity system servers, and you are able to start designing at a rack level where you can drive much better power efficiency and density, and you bake in the software to effectively make this modern rack-level server look like a cloud in a box, and ensure these things can snap together in a grid, where in that larger fleet, operational management is easy because you've got the same automation capabilities that the big cloud hyperscalers have today. 
It can really simplify life. It ends up being an economic win and maybe most importantly, presents the infrastructure in a way that the developers love. And so there's not this view of the public cloud being the fast, innovative path for developers and on-prem being this, submit a trouble ticket and try and get access to a VM in six days, which sadly is the experience that we hear a lot of companies are still struggling with in on-prem computing.

Jason: Practically, when you're going out and talking to customers, you're going to be in a heterogeneous environment where presumably they already have their own on-prem infrastructure and they'll start to plug in—

Steve: Yeah.

Jason: —Oxide Computer alongside of it. And presumably, they're also to some degree in the public cloud. It's a fairly complex environment that you're trying to insert yourself into. How are your customers thinking about building on top of Oxide Computer in that heterogeneous environment? And how do you see Oxide Computer expanding within these enterprises, given that there's a huge amount of existing capital that's gone into building out their data centers that are already operating today, and the public cloud deployments that they have?

Steve: As customers are starting to adopt Oxide rack-level computing, they are certainly going to be going into environments where they've got multiple generations of multiple different types of infrastructure. First, the discussions that we're having are around what are the points of data exfiltration, of data access that one needs to operate their broader environment. 
You can think about handoff points like the network where you want to make sure you've got a consistent protocol, like BGP or other, to be able to speak from your layer 2 networks to your layer 3 networks; you've got operational software that is doing monitoring and alerting and rolling up for service for your SRE teams, your operations teams, and we are making sure that Oxide's endpoint—the front end of the Oxide product—will integrate well, will provide the data required for those systems to run well. Another thorny issue for a lot of companies is identity and access management, controlling the authentication and the access for users of their infrastructure systems, and that's another area where we are making sure that the interface from Oxide to the systems they use today, and also resident in the Oxide product should one want to use it directly, has a clean cloud-like identity and access management construct for one to use. But at the highest level it is, make sure that you can get out of the Oxide infrastructure, the kind of data and tooling required to incorporate into management of your overall fleet.

Over time, I think customers are going to experience a much simpler and much more automated world inside of the Oxide ecosystem; I think they're going to find that there are exponentially fewer hours required to manage that environment and that is going to inevitably just lead to wanting to replace a hundred racks of the extant commodity stack with, you know, sixty racks of Oxide that provide much better density, smaller footprint in the data center, and again, software-driven in the way that these folks are looking for.

Jason: And in that answer, you alluded to a lot of the specialization and features that you guys can offer. I've always loved Alan Kay's quote, “People who are really serious about software make their own hardware.”

Steve: Yeah.

Jason: Obviously, you've got some things in here that only Oxide Computer can do. 
What are some of those features that traditional vendors can't even touch or deliver that you'll be able to, given your hardware-software integration?

Steve: Maybe not the most exciting example, but I think one that is extremely important to a lot of the large enterprise companies that we're working with, and that is attestation, being able to attest to the software that is running on your hardware. And why is that important? Well, as we've talked about, you've got a lot of different vendors that are participating in that system that you're deploying in your data center. And today, a lot of that software is proprietary and opaque and it is very difficult to know what versions of things you are running, or what was qualified inside that package that was delivered in the firmware. We were talking to a large financial institution, and they said their teams are spending two weeks a month just doing, kind of a proof of trust in their infrastructure that their customer's data is running on, and how cumbersome and hard it is because of how murky and opaque that lower-level system software world is.

What do the hyperscalers do? They have incorporated hardware Root of Trust, which ensures from that first boot instruction, from that first instruction on the microprocessor, that you have a trusted and verifiable path, from the system booting all the way up through the control plane software to, say, a provisioned VM. And so what this does is it allows the rest of the market access to a bunch of security innovation that has gone on where these hyperscalers would never run without this. Again, having the hardware Root of Trust anchored attestation process, the way to attest all that software running is going to be really, really impactful for more than just security-conscious customers, but certainly, those that are investing more in that are really, really excited. 
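
[Editor's note: the "trusted and verifiable path" from first boot instruction up to a provisioned VM can be pictured as a hash chain, where each layer is measured before it runs and folded into a running value. The sketch below is a minimal illustration of that general measured-boot idea, not Oxide's or any hyperscaler's implementation. The layer names, the `ROOT_OF_TRUST` starting value, and the functions `extend`, `measure_boot`, and `attest` are all hypothetical.]

```python
import hashlib
import hmac
from typing import Sequence

# Assumed fixed starting value; in real systems this is anchored in hardware.
ROOT_OF_TRUST = bytes(32)


def extend(measurement: bytes, component: bytes) -> bytes:
    """Fold the digest of one boot component into the running measurement."""
    component_digest = hashlib.sha256(component).digest()
    return hashlib.sha256(measurement + component_digest).digest()


def measure_boot(components: Sequence[bytes]) -> bytes:
    """Measure every layer in boot order, starting from the root of trust."""
    measurement = ROOT_OF_TRUST
    for component in components:
        measurement = extend(measurement, component)
    return measurement


def attest(components: Sequence[bytes], expected: bytes) -> bool:
    """Compare the measured chain against a known-good value in constant time."""
    return hmac.compare_digest(measure_boot(components), expected)


# Hypothetical layer images standing in for real firmware and software blobs.
known_good = [
    b"firmware-v1",
    b"operating-system-v1",
    b"hypervisor-v1",
    b"control-plane-v1",
]
expected = measure_boot(known_good)

# Changing any single layer changes the final measurement.
tampered = [b"firmware-v1-modified"] + known_good[1:]

print(attest(known_good, expected))
print(attest(tampered, expected))
```

[Because each step hashes the previous measurement together with the next component, both the contents and the order of the layers affect the final value, which is what lets a single comparison stand in for checking the whole stack.]
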
If you move upstack a little bit, when you co-design the hardware with the control plane, both the server and the switch hardware with the control plane, it opens up a whole bunch of opportunity to improve performance, improve availability because you now have systems that are designed to work together very, very well.

You can now see from the networking of a system through to the resources that are being allocated on a particular machine, and when things are slow, when things are broken, you are able to identify and drive those fixes, in some cases that you could not do before, in much, much, much faster time, which allows you to start driving infrastructure that looks a lot more like the five nines environment that we expect out of the public cloud.

Jason: A lot of what you just mentioned, actually, once again, ties back to that analogy to the iPhone, and having that kind of secure enclave that powers Touch ID and Face ID—

Steve: Yep.

Jason: —kind of a server equivalent, and once again, optimization around particular workflows, the iPhone knows exactly how many photos every [laugh] iOS user takes, and therefore they have a custom chip dedicated specifically to processing images. I think that tight coupling, just relating it back to that iOS and iPhone integration, is really exciting.

Steve: Well, and the feedback loop is so important because, you know, like iPhone, we're going to be able to understand where there are rough edges and where things are—where improvements can continue to be made. And because this is software-driven hardware, you get an opportunity to continuously improve that artifact over time. It now stops looking like the old, your car loses 30% of the value when you drive it off the lot. 
Because there's so much intelligent software baked into the hardware, and there's an opportunity to update and add features, and take the learnings from that hardware-software interaction and feed that back into an improving product over time, you can start to see the actual hardware itself have a much longer useful life. And one of the things we're really excited about is that we don't think servers should be commodities that the vendors are trying to push you to replace every 36 months.

One of the things that is important to keep in mind is that as Moore's Law is starting to slow, or starting to hit some of its limitations, you won't have CPU density and some of these things driving the need to replace hardware as quickly. So, with software that helps you drive better utilization and create a better-combined product in that rack-level system, we think we're going to see customers that can start getting five, six, seven years of useful life out of the product, not the typical two, or three, or maybe four that customers are seeing today in the commodity systems.

Will: Steve, one of the challenges for Oxide is that you're taking on excellence in a bunch of interdisciplinary sciences here, between the hardware, the software, the firmware, the security; this is a monster engineering undertaking. One of the things that I've seen as an investor is how dedicated you have got to be to hiring, to build basically the Avengers team here to go after such a big mission. Maybe you could touch on just how you've thought about architecting a team here. And it's certainly very different than what the legacy providers from an on-prem ecosystem perspective have taken on.

Steve: I think one of the things that has been so important is before we even set out on what we were going to build, the three of us spent time and focused on what kind of company we wanted to build, what kind of company that we wanted to work at for the next long chunk of our careers. 
And it's certainly drawing on experiences that we had in the past. Plenty of positives, but also making sure to keep in mind the negatives and some of the patterns we did not want to repeat in where we were working next. And so we spent a lot of time just first getting the principles and the values of the company down, which was pretty easy because the three of us shared these values. And thinking about all the headwinds, just all the foot faults that hurt startups and even big companies, all the time, whether it be the subjectivity and obscurity of compensation or how folks in some of these large tech companies doing performance management and things, and just thinking about how we could start from a point of building a company that people really want to work for and work with.And I think then layering on top of that, setting out on a mission to go build the next great computer company and build computers for the cloud era, for the cloud generation, that is, as you say, it's a big swing. And it's ambitious, and exhilarating and terrifying, and I think with that foundation of focusing first on the fundamentals of the business regardless of what the business is, and then layering on top of it the mission that we are taking on, that has been appealing, that's been exciting for folks. And it has given us the great opportunity of having terrific technologists from all over the world that have come inbound and have wanted to be a part of this. And we, kind of, will joke internally that we've got eight or nine startups instead of a startup because we're building hardware, and we're taking on developing open-source firmware, and a control plane, and a switch, and hardware Root of Trust, and in all of these elements. And just finding folks that are excited about the mission, that share our values, and that are great technologists, but also have the versatility to work up and down the stack has been really, really key.So far, so great. 
We've been very fortunate to build a terrific, terrific team. Shameless plug: we are definitely still hiring all over the company. So, from hardware engineering, software engineering, operations, support, sales, we're continuing to add to the team, and that is definitely what is going to make this company great.Will: Maybe just kind of a wrap-up question here. One of the things Jason and I always like to ask folks is, if you succeed over the next five years, how have you changed the market that you're operating in, and what does the company look like in five years? And I want you to know as an investor, I'm holding you to this. Um, so—Steve: Yeah, get your pen ready. Yeah.Will: Yeah, yeah. [laugh].Steve: Definitely. Expect to hear about that in the next board meeting. When we get this product in the market and five years from now, as that has expanded and we've done our jobs, then I think one of the most important things is we will see an incredible amount of time given back to these companies, time that is wasted today having to stitch together a fractured ecosystem of products that were not designed to work together, were not designed with each other in mind. And in some cases, this can be 20, 30, 40% of an organization's time. That is something you can't get back.You know, you can get more money, you can—there's a lot that folks can control, but that loss of time, that inefficiency in DIY your own cloud infrastructure on-premises, will be a big boon. Because that means now you've got the ability for these companies to capitalize on digitalizing their businesses, and just the velocity of their ability to go improve their own products, that just will have a flywheel effect. So, that great simplification where you don't even consider having to go through and do these low-level updates, and having to debug and deal with performance issues that are impossible to sort out, this—aggregation just goes away. 
This system comes complete and you wouldn't think anything else, just like an iPhone. I think the other thing that I would hope to see is that we have made a huge dent in the efficiency of computing systems on-premises, that the amount of power required to power your applications today has fallen by a significant amount because of the ability to instrument the system, from a hardware and software perspective, to understand where power is being used, where it is being wasted.And I think that can have some big implications, both to just economics, to the climate, to a number of things, by building and people using smarter systems that are more efficient. I think generally just making it commonplace that you have a programmable infrastructure that is great for developers everywhere, that is no longer restricted to a rental-only model. Is that enough for five years?Will: Yeah, I think I think democratizing access to hyperscale infrastructure for everybody else sounds about right.Steve: All right. I'm glad you wrote that down.Jason: Well, once again, Steve, thanks for coming on. Really exciting, I think, in this conversation, talking about the server market as being a fairly dynamic market still, that has a great growth path, and we're really excited to see Oxide Computer succeed, so thanks for coming on and sharing your story with us.Steve: Yeah, thank you both. It was a lot of fun.Will: Thank you for listening to Perfectly Boring. You can keep up the latest on the podcast at perfectlyboring.com, and follow us on Apple, Spotify, or wherever you listen to podcasts. We'll see you next time.

Meanwhile in Security
The Grid Has Fallen and It Can't Get Up

Meanwhile in Security

Play Episode Listen Later May 13, 2021 9:54


Jesse Trucks is the Minister of Magic at Splunk, where he consults on security and compliance program designs and develops Splunk architectures for security use cases, among other things. He brings more than 20 years of experience in tech to this role, having previously worked as director of security and compliance at Peak Hosting, a staff member at freenode, a cybersecurity engineer at Oak Ridge National Laboratory, and a systems engineer at D.E. Shaw Research, among several other positions. Of course, Jesse is also the host of Meanwhile in Security, the podcast about better cloud security you're about to listen to.Show Notes:Links: Here's the hacking group responsible for the Colonial Pipeline shutdown: https://www.cnbc.com/2021/05/10/hacking-group-darkside-reportedly-responsible-for-colonial-pipeline-shutdown.html Biden says ‘no evidence' Russia involved in US pipeline hack but Putin should act: https://www.theguardian.com/us-news/2021/may/10/colonial-pipeline-shutdown-us-darkside-message Colonial Pipeline CEO warns of possible fuel shortages following cyberattack: https://www.foxbusiness.com/technology/colonial-pipeline-ceo-warns-of-fuel-shortages-following-cyberattack Colonial Pipeline hackers apologize, promise to ransom less controversial targets in future: https://www.theverge.com/2021/5/10/22428996/colonial-pipeline-ransomware-attack-apology-investigation Over 40 Apps With More Than 100 Million Installs Found Leaking AWS Keys: https://thehackernews.com/2021/05/over-40-apps-with-more-than-100-million.html Red Hat bakes cloud security into the heart of Red Hat OpenShift: https://siliconangle.com/2021/04/27/red-hat-bakes-cloud-security-heart-openshift/ Amazon debuts CloudFront Functions for running lightweight code at the edge: https://siliconangle.com/2021/05/03/amazon-debuts-cloudfront-functions-running-lightweight-code-edge Critical Patch Out for Critical Pulse Secure VPN 0-Day Under Attack: 
https://thehackernews.com/2021/05/critical-patch-out-for-month-old-pulse.html New Amazon FinSpace Simplifies Data Management and Analytics for Financial Services: https://aws.amazon.com/blogs/aws/amazon-finspace-simplifies-data-management-and-analytics-for-financial-services/ Spectre Strikes Back: New Hacking Vulnerability Affecting Billions of Computers Worldwide: https://scitechdaily.com/spectre-strikes-back-new-hacking-vulnerability-affecting-billions-of-computers-worldwide America Hacks Itself. Waiting for the Cyber-Apocalypse: https://tomdispatch.com/waiting-for-the-cyber-apocalypse/ Wanted: The (Elusive) Cybersecurity ‘all-Star': https://www.darkreading.com/operations/wanted-the-(elusive)-cybersecurity-all-star/d/d-id/1340929 How to Solve the Cybersecurity Skills Gap: https://securityboulevard.com/2021/05/how-to-solve-the-cybersecurity-skills-gap/ Most Organizations Feel More Vulnerable to Breaches Amid Pandemic: https://www.darkreading.com/risk/most-organizations-feel-more-vulnerable-to-breaches-amid-pandemic/d/d-id/1340954 How the COVID-19 Pandemic is Impacting Cyber Security Worldwide: https://innovationatwork.ieee.org/how-the-covid-19-pandemic-is-impacting-cyber-security-worldwide/ Impact of COVID-19 on Cybersecurity: https://www2.deloitte.com/ch/en/pages/risk/articles/impact-covid-cybersecurity.html Biden on cyber security after 100 days: A good start, but now comes the hard part: https://securityboulevard.com/2021/05/biden-on-cyber-security-after-100-days-a-good-start-but-now-comes-the-hard-part/ Why Software Supply Chain Attacks are Inevitable and what you Must do to Protect Your Applications: https://securityboulevard.com/2021/05/why-software-supply-chain-attacks-are-inevitable-and-what-you-must-do-to-protect-your-applications/ TranscriptJesse: Welcome to Meanwhile in Security where I, your host Jesse Trucks, guides you to better security in the cloud.Announcer: If your mean time to WTF for a security alert is more than a minute, it's time to look at 
Lacework. Lacework will help you get your security act together for everything from compliance service configurations to container app relationships, all without the need for PhDs in AWS to write the rules. If you're building a secure business on AWS with compliance requirements, you don't really have time to choose between antivirus or firewall companies to help you secure your stack. That's why Lacework is built from the ground up for the cloud: low effort, high visibility, and detection. To learn more, visit lacework.com. That's lacework.com.

Jesse: Infrastructure security, including both critical physical systems that make our modern human lives possible, and supply chain on critical software systems, is the theme of the week (maybe month, or year), and we need to sit up and pay attention. Our electrical grids, telco systems, fuel pipelines, water supplies, and more, are delicate flowers ready to be stomped by anything with brute force, or eaten away by a swarm of tiny insects. These systems lurk online in the background where most of us don't see them. However, all these are managed by computerized systems and they aren't as air-gapped as we would hope they are. Internet of Things (IoT), operational technology (OT), and industrial control systems (ICS) aren't new security problems to solve. These have been highly vulnerable forever, but now we're seeing how IoT, OT, and ICS security lags far behind mainstream cybersecurity. 
This is a rapidly changing trend, but we should be worried over the next few months and years, as the security for these things catch up to the rest of the world.Meanwhile, in the news, “Here's the hacking group responsible for the Colonial Pipeline shutdown.” And, “Biden says ‘no evidence' Russia involved in US pipeline hack but Putin should act.” And, “Colonial Pipeline CEO warns of possible fuel shortages following cyberattack,” and, “Colonial Pipeline hackers apologize, promise to ransom less controversial targets in future.” I could list hundreds of more articles on the Colonial Pipeline breach. These are some choice ones you should read to understand the impact of this event. And also hacker groups with sort of a conscience? Hmm.“Over 40 Apps With More Than 100 Million Installs Found Leaking AWS Keys.” Wow, just wow. This is the modern equivalent of hard-coding a password in plain text into an app anyone can read. Please don't be stupid. Don't put keys or passwords into your apps in ways that expose your whole internal structure and customer or user data to the world.“Red Hat bakes cloud security into the heart of Red Hat OpenShift.” DevSecOps is like DevOps, but integrating security into the entire process. If you aren't doing DevSecOps already, you need to start. I like that Red Hat has an offering that makes it easier to adopt for organizations that need a managed service.“Amazon debuts CloudFront Functions for running lightweight code at the edge.” Using a DevSecOps model is critical when you run code that calls someone else's functions. CloudFront functions look useful programmatically to deliver a smooth and fast user experience, but be careful about your inputs and outputs and test your code well.“Critical Patch Out for Critical Pulse Secure VPN 0-Day Under Attack.” Finally, a patch to install if you use pulse secure. You need to know what's happening and you need to install the patch. 
It's still a good read even if you don't use the product.“New Amazon FinSpace Simplifies Data Management and Analytics for Financial Services.” Like many of us, I'm an armchair economist who likes to geeking out over market and economy analysis and trends. AWS FinSpace looks like a combination of a fantastic way to open opportunities for new players in the financial services industry—or FSI—but at the same time, this moves the trust of data integrity and availability into someone else's hands. When I worked with supercomputers used by chemists, the accuracy and availability of computational results were the most important aspect of the work, so outsourcing some of the fundamental maths makes me fret.Announcer: This episode is sponsored by ExtraHop. ExtraHop provides threat detection and response for the Enterprise (not the starship). On-prem security doesn't translate well to cloud or multi-cloud environments, and that's not even counting IoT. ExtraHop automatically discovers everything inside the perimeter, including your cloud workloads and IoT devices, detects these threats up to 35 percent faster, and helps you act immediately. Ask for a free trial of detection and response for AWS today at extrahop.com/trial. That's extrahop.com/trial.Jesse: “Spectre Strikes Back: New Hacking Vulnerability Affecting Billions of Computers Worldwide.” Hardware flaws are both esoteric and terrifying. This shows that anything can be compromised given enough willpower and science. Always assume your systems are flawed and breakable and have multiple checks and balances to ensure the efficacy of operations and the integrity of your data.“America Hacks Itself. Waiting for the Cyber-Apocalypse.” I'm a Cold War spy novel aficionado, and I can't go a week without reading a story or novel about a dystopian nightmare. You know, like today's news. Most of the former teaches us about the origins of the latter, and we are living in one of those nightmares now. 
If you want to understand more about nation-state hacking and cracking, this one is for you.“Wanted: The (Elusive) Cybersecurity ‘all-Star',” and, “How to Solve the Cybersecurity Skills Gap.” The whole point of Meanwhile in Security is to help people who don't do security full time, and this piece expresses my thoughts on the cybersecurity labor market quite well. There are not enough experienced security people on the planet to meet the demands, so everyone has to learn more about security just to get through the day. Repeat this mantra when it gets you down. “I can do it. Security isn't as hard as security people claim. Remember, I can do it. I can do it. I think I can. I think again.”Cloud-native businesses struggle with security, you aren't alone. As more things move to cloud services, security gets more complex and difficult for everyone. These are solvable problems, but it will take an industry shift for it to become easy. It looks worse now than it will be in the near-term future over the next couple of years. We'll catch up to the bad guys' methods and mindsets soon enough.“Most Organizations Feel More Vulnerable to Breaches Amid Pandemic,” and, “How The COVID-19 Pandemic is Impacting Cyber Security Worldwide,” and, “Impact of COVID-19 on Cybersecurity.” There are tons of articles, and surveys, and studies out talking about how cybersecurity has become a larger problem during the global pandemic. It isn't only SARS-CoV-2 rampaging through our human world. I find it important to understand trends in cybersecurity in any sector or vertical because it helps me understand how to gauge my own risk.“Biden on cyber security after 100 days: A good start, but now comes the hard part.” It is important to understand how government policies and politics affects the tech industry, and cybersecurity is not any different. The speed of innovation in attacks and defenses usually leaves governments way behind. 
We should understand how government thinks about these things.

“Why Software Supply Chain Attacks are Inevitable and what you Must do to Protect Your Applications.” I wrote about supply chain attacks recently because it is a scary problem that has shown up in the news with catastrophic results. Everyone managing any type of infrastructure or service needs to understand the nature of the attacks and the associated risks.

And now the tip of the week. Remember the article about exposing AWS access keys? Yeah, don't do those things. Even AWS tells you not to. Any app or service should be protected using the most limited IAM role you can possibly use, and keys allowing access to those roles should not be embedded directly into code.

Build a process to pull the access credentials when an app launches or connects to your service to initiate the access, instead of putting these things directly into the client systems. You should always be thinking of the ‘least privilege paradigm.’ This means you give a service or user the smallest possible set of access rights to do the job needed. For example, AWS allows you to use AWS Config to track what a service touches. So, in testing, use AWS Config to see what your service needs and limit access to only those minimal things it needs.

And that's a wrap for the week, folks. Securely yours, Jesse Trucks.

Jesse: Thanks for listening. Please subscribe and rate us on Apple and Google Podcast, Spotify, or wherever you listen to podcasts.

Announcer: This has been a HumblePod production. Stay humble.
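The tip of the week in the episode above (no keys in code, smallest possible set of access rights) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something taken from the episode: the bucket name `example-app-data` and prefix `reports` are hypothetical, the helper names are made up for this sketch, and reading credentials from environment variables stands in for whatever credential-delivery process a real deployment would use. The policy document follows the standard IAM JSON layout.

```python
import json
import os


def load_credentials(env=os.environ):
    """Read AWS credentials from the environment at start-up, not from source code.

    The two variable names are the standard ones the AWS SDKs and CLI read.
    """
    try:
        return {
            "access_key_id": env["AWS_ACCESS_KEY_ID"],
            "secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        }
    except KeyError as missing:
        # Fail loudly instead of falling back to a key baked into the app.
        raise RuntimeError(f"credential {missing} is not set") from None


def least_privilege_policy(bucket, prefix):
    """Build an IAM policy that allows reading one prefix of one bucket and nothing else."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket and prefix, used only to show the shape of the document.
    print(json.dumps(least_privilege_policy("example-app-data", "reports"), indent=2))
```

The point of the design is that the role attached to the app can do exactly one thing, so a leaked credential exposes one prefix rather than the whole account.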

cloudonaut
#30 Getting started with IPv6 on AWS

cloudonaut

Play Episode Listen Later Oct 28, 2020 45:13


Michael shares his learnings about IPv6 on AWS. Enabling IPv6 is highly recommended for public endpoints like CloudFront and ALB. On top of that, Michael explains how to enable IPv6 for your VPCs.

Mobycast
Hands On AWS - Massively Scalable Image Hosting Using S3 and CloudFront - Part 2

Mobycast

Play Episode Listen Later Jul 8, 2020 41:13


In this episode, we cover the following topics: We discuss the features and limitations of serving files directly from S3. We then talk about how CloudFront can address many of S3's limitations. In particular, CloudFront is performant, inexpensive and allows us to use custom CNAMEs with TLS encryption. How to create a secure CloudFront distribution for files hosted in S3. What is OAI (Origin Access Identity), why we need it and how to set it up. We show how you can configure your CloudFront distribution to use TLS and redirect HTTP to HTTPS. We finish up by discussing "byte-range requests" and how to enable them for our image hosting solution. Detailed Show NotesWant the complete episode outline with detailed notes? Sign up here: https://mobycast.fm/show-notes/End SongBeauty in Rhythm by Roy EnglandMore InfoFor a full transcription of this episode, please visit the episode webpage.We'd love to hear from you! You can reach us at: Web: https://mobycast.fm Voicemail: 844-818-0993 Email: ask@mobycast.fm Twitter: https://twitter.com/hashtag/mobycast Reddit: https://reddit.com/r/mobycast

Mobycast
Hands On AWS - Massively Scalable Image Hosting Using S3 and CloudFront - Part 1

Mobycast

Play Episode Listen Later Jul 1, 2020 43:25


In this episode, we cover the following topics: A common feature for web apps is image upload. And we all know the "best practices" for how to build this feature. But getting it right can be tricky. We start off by discussing the problem space, and what we want to solve. A key goal is to have a solution that is massively scalable while being cost-effective. We outline the general architecture of the solution, with separate techniques for handling image uploading and downloading. We then dive deep into how to handle image uploading, highlighting various techniques for controlling access over who can perform uploads. Two common techniques for securing uploads when using AWS are presigned URLs and presigned POSTs. We discuss how each works and when to use one over the other. We finish up by putting everything together and detailing the steps involved with uploading an image. Detailed Show NotesWant the complete episode outline with detailed notes? Sign up here: https://mobycast.fm/show-notes/Support Mobycasthttps://glow.fm/mobycastEnd SongLazy Sunday by Roy EnglandMore InfoFor a full transcription of this episode, please visit the episode webpage.We'd love to hear from you! You can reach us at: Web: https://mobycast.fm Voicemail: 844-818-0993 Email: ask@mobycast.fm Twitter: https://twitter.com/hashtag/mobycast Reddit: https://reddit.com/r/mobycast

newline
Serverless on AWS Lambda with Stephanie Prime

newline

Play Episode Listen Later Jun 17, 2020 60:46


newline Podcast Sudo Steph

Nate: [00:00:00] Steph, just tell us a little bit about your work and kind of your background with, like, AWS and what you're doing now.

Steph: [00:00:06] Yes, so I work as an engineer for a managed services provider called Second Watch. We basically partner with other big companies that use AWS, or some other clouds, sometimes Azure, for managing their cloud infrastructure, which basically just means that we help big companies whose focus may not be technology, may not be cloud stuff in general, and we're able to just basically optimize the cost of everything, make sure that things are running reliably and smoothly, and we're able to work with AWS directly to kind of keep people ahead of the curve when new stuff is coming out. It changes so much, you know, it's important to be able to adapt.

So, like, personally, my role is I develop automation for our internal operations teams. So we have a bunch of, you know, just really smart people who are always working on customer-specific AWS issues. And we see some of the same issues pop up over and over: of course, you know, security, auditing, cost optimization. And so my team makes optimizations that we can distribute to all of these clients, who have to maintain their own. You know, they have their own AWS account. It's theirs. And we make it so that we're actually able to distribute these automations the same way in all of our customers' accounts.

So the idea is that, and it really wouldn't be doable without serverless, everyone has to own their own infrastructure, right? Your AWS account is yours, those are your resources; you don't, for security reasons, want to put all of your stuff on somebody else's account. But actually managing them all the same way can be really difficult, even with scripts, because permissions in different places have to be granted through AWS permissions, through IAM, Identity and Access Management, right? 
So serverless gave us the real tool that we needed to be able to, at scale, say, hey, we came up with a little script that will run on an hourly basis to check to see how much usage these servers are getting, and if they're not production servers, you know, spin them down if they're not in use to save money.

Little things like that when it comes to operations, and AWS Lambda is just so good for it because it's all about, you know, like I said, doing things reliably, doing things in ways that can be audited and logged, and doing things for, like, a decent price. So, like, background-wise, I used to work at AWS, in AWS support actually, and I kind of supported some of their DevOps products like OpsWorks, which is based on Chef for configuration management, Elastic Beanstalk, and AWS CloudFormation, specifically. After working there for a bit, I really got to see, you know, how it's made and what the underlying system is like. And it was just crazy to see how much work goes into all this, just so you can have a supposedly easier-to-use front end. But serverless just kinda changed all that for the better, luckily.

Amelia: [00:02:57] So it sounds like AWS has a ton of different services. What are the main ones and how many are there?

Steph: [00:03:04] So I don't think I can even count anymore, because they do release new ones all the time. So hundreds at this point, but really main ones, and maybe not hundreds, maybe a little over a hundred would be a better estimate.

I mean, EC2, which is Elastic Compute, is the bread and butter. Historically, AWS is just, they're virtualized servers basically. So EC2, the thing that made AWS really special from the beginning and that made cloud start to take over the world was the concept of auto scaling groups, which are basically definitions you attach to EC2, and it basically allows you to say, hey, if I start getting a lot of traffic on this one type of server, right? 
You know, create a second server that looks exactly the same and load balance the traffic through it. So when they say scaling, that's basically how you scale EC2: use auto scaling groups and elastic load balancers and kind of distribute the traffic out. The other big thing besides the scalability of EC2 with auto scaling groups is redundancy. So there's this idea of regions within AWS, and within each region there's availability zones. So regions are the general, like, you can think of it as the place where the data center is located, to within a small degree. So, like, Virginia is one, right? That's us-east-1.

It's the oldest one. Another one is in California, but they're all over the world now. So the idea is you pick a region to minimize latency, so you pick the region that's closest to you. And then within the region, there's the idea of availability zones, which are basically just discrete physical locations of the servers; you administer them the same way, but they're protected.

So, like, if a tornado runs through and hits one of your data centers, if you happen to have them distributed between two different availability zones, then you'll still be able to, you know, serve traffic. The other one will go down, but then the elastic load balancer will just notice that it's not responding and send the traffic to the other availability zone.

So those are the main concepts of EC2; those are what you need to use it effectively.

Nate: [00:05:12] So with an EC2 instance, that would be like a virtual server. I mean, it's not quite a Docker container, I guess we're getting to nuance there, but it's basically like a server that you would have, like, command line access to. You could log in and you can do more or less whatever you want on an EC2 instance.

Steph: [00:05:29] Right, exactly. And so it used to be AWS used what was called Xen virtualization to do it. 
And that's just like you can run Xen on your own computer, you can get a computer and set up a virtual machine, almost just like they used to do it.

So they are constantly putting out, like, new ways of virtualizing more efficiently. So they do have new technology now, but it's not something that was really, I mean, it was well known, but they really took it to a new kind of scale, which made it really impressive.

Nate: [00:05:56] Okay, so EC2 lets you have full access to the box that's running and you might, like, load balance requests against that. How does that contrast with what you do with AWS Lambda and serverless?

Steph: [00:06:09] So with EC2, you still have to, you know, either secure shell in or, if you're on Windows, use RDP or something to actually get in there. You care about what ports are open. You have security groups for that. You care about all the stuff you would care about normally with a server. You care about: is it patched and up to date? You care about, you know, what's the current memory and CPU usage? All those things don't go away on EC2 just because it's cloud, right? When we start bringing serverless into the mix, suddenly they do go away. I mean, there's still a few limitations. Like, for instance, a Lambda has a limit on how much memory it can process with, just because they have to have a way to kind of keep costs down and define the units of them and define where to put them, right?

But at its core, what a Lambda is, it actually runs on a Docker container. You can think of it like a pre-configured Docker container with some pre-installed dependencies. So for Python, it would have the latest version of Python that it says it has, it would have boto. It would have the stuff that it needs to execute that, and it would also have some basic, it's structured like it was, you know, basic Linux.

So there's, like, a /tmp, so slash temp; you can write files there, right? But really it's like a Docker container. 
That runs underneath it on a fleet of EC2. As far as availability zone distribution goes, that's already built into Lambda, so you don't have to think about it. With EC2, you do have to think about it, because if you only run one EC2 server and it's in one availability zone, it's not really different from just having a physical server somewhere with a traditional provider.

Nate: [00:07:38] So there are these two terms, there's, like, serverless and Lambda. Can you talk a little bit about, like, the difference between those two terms and when to use each appropriately?

Steph: [00:07:48] Yeah, so they are in a way sorta interchangeable, right? Because serverless technology just means the general idea of: I have an application, I have it defined as an artifact, we'll say a zip from our Git repo, right? So that application is my main artifact, and then I pass it to a service somewhere. I don't know, it could be Google App Engine; that's a type of serverless technology, and AWS Lambda is just the specific AWS serverless technology. But the reason AWS Lambda is, in my opinion, so powerful is because it integrates really well with the other features of AWS. So permissions management works with AWS Lambda, API Gateway; there's a lot of really tight integrations you can make with Lambda so that it's not like you have to keep half of your stuff one place and half of your stuff somewhere else. I remember when, like, Heroku was really big. A lot of people, you know, maybe they were maintaining an AWS account and they were also maintaining a bunch of stuff on Heroku, and they're just trying to make it work together.

And even though Heroku does use, you know, AWS on the backend, or at least it did back then, it can just make things more complicated. But the whole serverless idea of the artifact is you make your code general; it's like a little microservice in a way. So I can take my serverless application and ideally, you know, it's just Python.
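To make the "artifact is just Python" idea concrete, here is a minimal sketch of what such a function might look like. It is an illustration, not code from the interview: the `(event, context)` handler signature is the one AWS Lambda uses for Python, the `statusCode`/`body` return shape is the one API Gateway's proxy integration expects, and the `name` field and `greeting.txt` file are hypothetical. The scratch file goes to the system temp directory, which is usually `/tmp` on Linux, the writable location mentioned above.

```python
import json
import os
import tempfile


def lambda_handler(event, context):
    """Entry point in the (event, context) shape AWS Lambda uses for Python."""
    # "name" is a hypothetical input field chosen for this sketch.
    name = event.get("name", "world")

    # Write to scratch space; gettempdir() usually returns /tmp on Linux.
    scratch_path = os.path.join(tempfile.gettempdir(), "greeting.txt")
    with open(scratch_path, "w") as scratch:
        scratch.write(f"hello, {name}")

    # Read it back to show the file is usable within the same invocation.
    with open(scratch_path) as scratch:
        message = scratch.read()

    return {"statusCode": 200, "body": json.dumps({"message": message})}
```

Because the handler is a plain function, it can be called locally with a dictionary and `None` for the context, which is also roughly how one would unit test it before zipping it up as the deployment artifact.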
So I can take my serverless application and ideally, you know, it's just Python. If I write it the right way, getting it to work on a different serverless back end, like, for example, I think Azure has one, and Google App Engine, isn't really too much of a change. There are some changes to permissions and the way that you invoke it, but at the core of it, the real resource is just the application itself. It's not, you know, how many units of compute does it have, how much memory, what are the IP address rules and all that, you know.

Nate: [00:09:35] So what are some good apps to build on serverless?

Steph: [00:09:39] Yes. So you can build almost anything today on serverless. There's actually so much support, especially with AWS Lambda, for integrations with all these other kinds of services that the stuff you can't do is getting more limited. But there is a trade-off with cost, right? To me, the situation where it shines, where I would never choose anything but serverless, is if you have something that's kind of bursty. So let's say you're making a report generation tool that needs to run, but really you only run it three times a week or something. Things like that need to live somewhere. They need to be consistent, they need to be stable, they need to be available, but you don't know how often they're going to be called. And that works even if it's only called a small number of times, because the cool thing about serverless is that you're charged for every 100 milliseconds of time that it's processing. When it comes to EC2, you're charged in units that are, well, it used to be by the hour; I think they finally fixed it, and it's down to smaller increments. But if you can write it efficiently, you can save a ton of money just by doing it this way, depending on what the use case is.
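To make the billing point above concrete, here is a minimal sketch of the arithmetic, assuming the 100 millisecond billing increment Steph describes and a charge proportional to configured memory. The function name, the example workload, and especially the rate are hypothetical placeholders, not quoted AWS prices.

```python
import math


def estimate_monthly_cost(invocations, avg_duration_ms, memory_mb,
                          price_per_gb_second, price_per_request=0.0,
                          billing_increment_ms=100):
    """Rough monthly cost for a function billed in fixed time increments."""
    # Each invocation is rounded up to the next billing increment.
    increments = math.ceil(avg_duration_ms / billing_increment_ms)
    billed_ms = increments * billing_increment_ms
    # Compute is charged per GB-second: duration times configured memory.
    gb_seconds = invocations * (billed_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * price_per_gb_second + invocations * price_per_request


# A report generator that runs about three times a week (roughly 13 runs
# a month), takes 4.2 seconds per run, and is configured with 512 MB.
cost = estimate_monthly_cost(
    invocations=13,
    avg_duration_ms=4200,
    memory_mb=512,
    price_per_gb_second=0.00002,  # placeholder rate, NOT a real AWS price
)
print(f"Estimated monthly compute cost: ${cost:.6f}")
```

The useful comparison is against an always-on server: a bursty job like this is only billed for the seconds it actually runs, so the estimate stays tiny no matter how the placeholder rate is adjusted within reason.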
So some stuff, like if you're using API Gateway with Lambda, that actually can be a lot more expensive than Lambda will be. But you don't have to worry about redundancy, especially if you need it, because otherwise you have to run a minimum of two EC2 servers just to keep them both up for an AZ outage kind of situation. You don't have to worry about that with Lambda. So anything with lower usage, 100%. If it's bursty, 100% use Lambda. If it's one of those things where you just don't have many dependencies, then Lambda is a really good choice as well. And there's especially infrastructure management. I think Werner Vogels wrote something recently about how serverless-driven infrastructure automation is going to be the really key point in making places that are using cloud use cloud more effectively. And so that's one group of people. That's a big group of people. If you're a big company and you already use AWS and you're not getting everything out of it that you thought you would get, sometimes there are serverless use cases that already exist out there. There's a serverless application repo that AWS provides, and AWS Config integrations, so you can trigger a serverless action based off of some other resource action. So let's say that your auto scaling group scaled up and you wanted to notify somebody; there are so many things you could do with it. It's super useful for that. But even if you're coming at it from a blank slate and you want to create something, there are a lot of really good use cases for serverless: if, like I said, you're not really sure how it's going to scale, you don't want to deal with redundancy, and it fits into a fairly well-defined mold, you know, this is pretty much all Python and it works with minimal dependencies.
Then it's a really good starting place for that.

Nate: [00:12:29] You know, you mentioned earlier that serverless is very good for when you have bursty services, in that if you were to do it on EC2 and also wanted that redundancy, you're saying you'd have to run at least two EC2 instances 24 hours a day, and you're paying for those. Plus you're also going to pay for API Gateway. Do you pay hourly for API Gateway?

Steph: [00:12:53] API Gateway? It would work the same either way, but you would pay for, in that case, a load balancer.

Nate: [00:12:59] What is API Gateway? Do you use that for serverless?

Steph: [00:13:02] All the time. So API Gateway?

Nate: [00:13:04] Yeah. Tell us the elements of a typical serverless stack. So I understand there's Lambda, for example. Maybe you use CloudFront in front of your Lambda functions, which maybe store images in S3. Is that a typical stack? And then can you explain what each of those services are, and how you would do that?

Steph: [00:13:22] Yeah, so you're on the right track here. So, okay. A good way to think about it is, AWS has published something, with a lot of documentation on it, called the serverless application management standard. So, SAM. And if you look at that, it actually defines the core units of serverless applications, which are the function, an API if you want one, and basically any other permission-type resources. So in your case, let's say, a really basic tutorial that AWS provides is: someone wants to upload an image for their profile, and you want to scale it down to a smaller image before you store it in S3, just so they're all the same size and it saves you a ton, all that. So if you're creating something like that, the AWS resources that you would need are basically an API Gateway, which
acts as basically the definition of your API schema. So if you've ever used Swagger or OpenAPI, these standards where you basically just define, in JSON, you know, it's a REST API, you do GET here, POST here, this resource name. That's a standard that you might see outside of AWS a lot. And so API Gateway is basically a way to implement that same standard so that it works with AWS. So that's how you can think of API Gateway. It also manages stuff like authentication integration. So if you want to enable OAuth or something on something, you could set that up at the API Gateway level.

Nate: [00:14:55] So if you had API Gateway set up, then is that basically a web server hosted by Amazon?

Steph: [00:15:03] Yeah, that's basically it.

Nate: [00:15:05] And so then your API Gateway is just assigned, essentially randomly, a DNS name by Amazon. If you wanted to have a custom DNS name for your API Gateway, how do you do that?

Steph: [00:15:21] Oh, it's just a setting. So what you could do, if you already have a domain name, right, is use Route 53, which is AWS's domain name management service, to basically point that domain to the API Gateway.

Nate: [00:15:35] So you'd use Route 53, you'd configure your DNS to have Route 53 point a specific DNS name to your API Gateway, and your API Gateway would be like a web server that also handles authentication and AWS integration. Okay, got it.

Steph: [00:15:51] Yeah, that's a good breakdown of how that works. So that's your first kind of half of how people commonly trigger Lambdas. And that's not the only way to trigger it, but it's a very common way to do it. So what happens is, when the API Gateway is configured, part of it is you set what happens when the method is invoked. So there's a REST API as a type of API Gateway that people use a lot. There are a few others, like a WebSocket one, which is pretty cool for streaming uses, and they're always adding new options to it.
So it's a really neat service. So you would have that kind of input go into your API Gateway, and it would decide where to route it, right? In a lot of cases here, you might say that the Lambda function is where it gets routed to. That's considered the integration. And so basically API Gateway passes it all of the stuff from the request that you want to pass it. So, you know, I want to give it the content that was uploaded. I want to give it the IP address it originally came from. Whatever you want to give it.

Nate: [00:16:47] What backend would people use for API Gateway other than Lambda? Like, would you use an API Gateway in front of an EC2 instance?

Steph: [00:16:56] You could, but I would probably use a load balancer or application load balancer and that kind of thing. There are a lot of things you can integrate it with. Another cool one is AWS API calls. It can proxy, so it can directly take input from an API and send it to a specific API call if you want to do that. That's kind of advanced usage, but Lambdas are what I see as the go-to right now.

Nate: [00:17:20] So the basic stack that we're looking at is: you use API Gateway in front of your Lambda function, and then your Lambda function basically does the work, which is either writing to some sort of storage or calculating some sort of response. You mentioned earlier that the Lambda function can be fronted by an API if you want one. And then you mentioned there are other ways that you can trigger these Lambda functions. Can you talk a little bit about what some of those other ways are?

Steph: [00:17:48] Yeah, so actually those are really cool. You can trigger it based off of basically any type of CloudWatch event; that's a big one. CloudWatch is basically a monitoring and auditing kind of service that AWS provides. So you can set triggers that go off when alarms are set.
It could be usage, it could be, hey, somebody logged in from an IP address that we don't recognize. You could do some really cool stuff with CloudWatch events specifically. And so those are one that I think, for management purposes, are really powerful to leverage. But you can also do it off of S3 events, which are really cool. Let's say it was a CI build, right? You're doing CI builds and you're putting your artifacts into an S3 bucket, so you know this is release version 1.1 or whatever, right? You put it into an S3 bucket. You can hook it up so that whenever something gets put into that S3 bucket, another action takes place. So you can make it so that whenever we upload a release, it notifies these people, sends an email, or you can make it as complicated as you want. You can make it trigger a different part of your build stage. If you have things that are outside of AWS, you can have it trigger from there. There are a lot of really cool, direct kinds of things that you don't need an API for. S3 is a good one. The notification service, SNS, which is used within AWS a lot, can be used. The queuing service AWS provides, called SQS, works with it too. And also just scheduled events, which I really like because it replaces the need for a cron server. So if you have things that run, you know, every Tuesday or whatever, you can trigger your Lambda to do that from just one configuration point. You don't have to deal with anything more complicated than that.

Nate: [00:19:38] I feel like that gives me a pretty good grounding in the ecosystem, in the setting. Maybe we could talk a little bit more about tools and tooling.
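The S3 trigger Steph describes, where uploading a release artifact kicks off a notification, might look roughly like the sketch below. The event shape follows the S3 notification format (a Records list carrying bucket and object key); the bucket name, the key, and the notify helper are hypothetical, and a real function would probably send the message through SNS or email rather than print it.

```python
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if s3 is None:
            continue  # skip records that did not come from S3
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events, so decode them.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def notify(bucket, key):
    # Hypothetical stand-in for a real notification (email, SNS, chat).
    print(f"New release artifact: s3://{bucket}/{key}")


def handler(event, context):
    objects = uploaded_objects(event)
    for bucket, key in objects:
        notify(bucket, key)
    return {"processed": len(objects)}


# A trimmed, made-up event for trying the handler without AWS.
sample_event = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "s3": {
                "bucket": {"name": "release-artifacts"},
                "object": {"key": "releases/v1.1/my+app.zip"},
            },
        }
    ]
}
result = handler(sample_event, None)
```

Keeping the parsing in its own function makes it easy to test against saved sample events without deploying anything.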
Yeah, so I know that in the JavaScript world, in the Node world, they have the Serverless Framework, which is basically this abstraction over, I think it's over Lambda and, you know, Azure Functions and Google Cloud probably too. Do they have a serverless framework for Python, or is there a framework that you end up using? Or do you generally just write straight on top of Lambda?

Steph: [00:20:06] So that's a great question, and I definitely do recommend that, even though there is a front end where you could just start typing code in and making the Lambda work, it's definitely better to have some sort of framework that integrates with wherever you store your code and test it and that kind of thing. So Serverless is a really big one, and it's kind of annoying, because serverless also refers to the greater ecosystem of code that runs without managing underlying servers. But in this particular case, Serverless is also a third-party company and tooling, and it does work for Python. It works for a whole bunch. That's kind of like the serverless equivalent, in my head, of Terraform: something that is meant to be generic, but it offers a lot of value to people just getting started. If you just want to put something in your README that says, here's how to deploy this from GitHub, Serverless is a cool tool for that. I don't personally like building with it myself, just because I find that SAM, which is the Serverless Application Model, I think I said management earlier, but it's actually model, I just looked that up, has everything I really want for AWS, and I get more fine-grained control. I don't like having too much abstraction, and I also don't like
when you have to install something and something changes between versions and that changes the way your infrastructure gets deployed. That's a pet peeve of mine, and that's why I don't really use Terraform too much, for the same reason. When you're operating really in one world, which in my case is AWS, I just don't get the value out of that. Right. But with the Serverless Application Model, they have a whole SAM CLI, they have a bunch of tools coming out, and so many examples on their GitHub repos as well. I find that it's got really everything I want to use, plus CloudFormation plugs right into that. So if I need to do anything outside of the normal serverless kind of world, I can do that. So it's better to use Serverless than to not use anything at all. I think it's a good tool and a really good way to get used to it and get started. But at least in my case, it really matters to have super consistent deployments where I'm sharing between people and accounts and all of that, and I find that SAM really gives me the best of both worlds.

Amelia: [00:22:17] So, as far as I understand it, serverless is a fairly new concept.

Steph: [00:22:22] You know, it's one of those things that's catching on recently. I feel like Google App Engine came out a long time ago, and it was kind of a niche thing for a while, but recently we're starting to see bigger enterprises, people who might not necessarily want bleeding-edge stuff, start to accept that serverless is going to be the future. And that's why we're seeing all this stuff come up, and it's actually really exciting. But the good thing is it's been around long enough that a lot of the actual tooling and the architecture patterns that you will use are mature. They've been used for years. There are sites you've been using for a long time that you don't know are serverless on the back end, because it's one of those things that doesn't really affect you unless you're working on it.
Right. It's new to a lot of people, but I think it's in a good spot where it's more approachable than it used to be.

Nate: [00:23:10] When you say that there are a lot of standard patterns, maybe we could talk about some of those. So when you write a Lambda function in code, either with Python or JavaScript or whatever, well, let's say Python, because you use Python primarily, right? Well, maybe we could talk a little bit about that. Why do you prefer Python?

Steph: [00:23:26] Yeah, so just coming from my background, which is, like I said, I did some support, did some straight DevOps, kind of a more sysadmin-y, before the world became a more interesting place, kind of background. Python is just one of those tools that is installed on like every Linux server in the world and works kind of predictably. Enough people know it that it's not too hard to share between people who may not be, you know, super advanced developers, right? A lot of the people I work with have varying levels of skill, and Python's one of those languages you can start to pick up pretty quickly. And it's not too foreign to people coming from other languages as well. So it's just a practicality thing for a lot of it. But also, a lot of the tooling around DevOps-type stuff is in Python, like Ansible for configuration management, a super useful tool. You know, it's all Python. So really, there are a lot of good reasons to use Python. In my world, it's one of those things where you don't have to use one specific language, but Python has what I need and I can work with it pretty quickly. The ecosystem's developed. There are still a lot of people who use it, and it's a good tool for what I have to do.

Nate: [00:24:35] Yeah, there's tons, and I was looking at the metrics. I think Python last year was one of the fastest growing programming languages too.
There's a lot of new people coming into Python.

Steph: [00:24:44] And a lot of it is data science people too, right? People who may not necessarily have a strong programming background, but the tooling they need is already there in Python. There's community, and it helps that it's not as scary looking as some other languages, frankly. You know.

Nate: [00:24:58] And what are some of the other cloud libraries that Python has? Like, I've seen one that's called Boto.

Steph: [00:25:03] Boto is the one that Amazon provides as their SDK, basically. And so every Lambda comes bundled with Boto 3, you know, by default. There was an older version of Boto for Python 2, but Boto 3 is the main one everyone uses now. So yeah, Boto is great. I use it extensively. It's pretty easy to use, with a lot of documentation, a lot of examples, and a lot of people answering questions about it on Stack Overflow. But really, every language does have an SDK for AWS these days, and they all pretty much work the same way, because they're all just based off of the AWS APIs, and all the APIs are well defined and pretty stable. So it's not too much of a stretch to use any other language, but Boto's the big one. The requests library in Python is super useful, just because it makes it easier to deal with interacting with APIs. It's all about, you know, HTTP requests and all that. Some of the new Python 3 libraries are really good as well, just because they've improved. It used to be that with Python 2, you know, there was urllib for processing requests, and it was just not as easy to use. So people would always bundle a third-party tool, like requests, but it's getting better. Also, in Python there are some different options for testing, pytest and unittest, and really there's just a bunch of libraries that are well maintained by the community.
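As a small illustration of the standard-library route mentioned above, here is a sketch of a JSON POST built with urllib.request from Python 3, with no third-party packages. The URL and payload are made up, and the snippet only constructs the request; post_json would actually send it, so it is defined but not called here.

```python
import json
import urllib.request


def build_json_post(url, payload):
    """Build (but do not send) a JSON POST request using only the stdlib."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(url, payload, timeout=10):
    """Send the request and decode a JSON response body."""
    request = build_json_post(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# example.com is a placeholder endpoint, not a real API.
request = build_json_post("https://example.com/api/reports", {"name": "weekly"})
print(request.get_method(), request.full_url)
```

The requests library is still shorter for anything beyond simple calls; the trade-off is one more dependency to package and keep up to date.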
There's a kazillion on PyPI, but I try to keep outside dependencies in Python to a total minimum because, again, I just don't like it when things change underneath me and affect how things function. So it's one of those things where I can do a lot without installing third-party libraries, and wherever I can avoid it, I do.

Nate: [00:26:47] So let's talk a little bit about these patterns that you have. Lambda functions generally have a pretty well-defined structure, and that convention makes it somewhat straightforward to write each function. Can you talk a little bit about, I don't know, the anatomy of a Lambda function?

Steph: [00:27:05] Yeah, so at its basic core, the most important thing that every Lambda function in the world is going to have is something called a handler. The handler is basically the function that is called to begin execution. Anytime you are calling a Lambda function, it's called invoking it. When it's invoked, it gets sent two parameters. One is the event, which is basically just the data that defines, hey, this is the stuff you need to act on. And the other is something called context. A lot of the time you never touch the context object, but it's useful too, because AWS provides it with every Lambda, and it's basically like, hey, this is the ID of the currently running Lambda function, this is where you're running, this is the Lambda's name. So for logging stuff, context can be really useful. Or for stuff where your function code may need to know something about where it is, you can save yourself time; you don't have to use an environment variable, sometimes, if you can look in the context object. So at the core, you have at least a file. You can name it whatever you want; a lot of people call it index. And then within that file you define a function called handler. Again, it doesn't have to be called handler, but
that makes it easy to know which one it is, and it takes that event and context. And so really, if that's all you have, you can literally just have your Lambda be one Python file where you say def handler, which takes, you know, the event and context, and then returns something. And that can be it, as long as you define index.handler as your handler resource. That's a lot of words, but basically, when you define your Lambda within AWS, the required parameter is the handler, which is the name of the file, then a dot, and then the name of the handler function. So that's it at its most basic. Every Lambda has that, but then you start scoping it out so you can actually organize your code decently. And then it's just like any other Python application, really. Do you have a README? Do you want to use a requirements.txt file to define the exact third-party libraries that you're going to be using? That's really useful. And if you're defining it with SAM, which I really recommend, then there's a file called template.yaml, and that just contains the actual AWS resource definition, as well as any CloudFormation-defined resources that you're using. So you can make template.yaml the infrastructure as code, and then everything else is just the code as code.

Nate: [00:29:36] Okay. So unpacking that a little bit, you'll invoke this function and there'll basically be two arguments. One is the particular event that's coming in, and then it'll also have the context, which is sort of metadata about the context in which this request is running. You mentioned some of the things that come in the context, like what region you're in or what the name of the function is. What are some of the parameters in the event object?

Steph: [00:30:02] So the interesting thing about the event object is, it can be anything.
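The anatomy described above fits in a few lines. This is a minimal sketch, assuming the file is saved as index.py so that the handler setting would be index.handler. The two context attributes used, function_name and aws_request_id, are ones the Lambda Python runtime provides; the fake context object and its values exist only so the function can be tried locally.

```python
import json
from types import SimpleNamespace


def handler(event, context):
    """Entry point: Lambda calls this with the event and a context object."""
    # The context carries metadata about the running function, which is
    # handy for logging without reading environment variables.
    print(json.dumps({
        "function": context.function_name,
        "request_id": context.aws_request_id,
        "event_keys": sorted(event),
    }))
    # What to return is up to the function; Lambda does not dictate it.
    return {"ok": True, "received": event}


# Local stand-in for the context object (made-up values).
fake_context = SimpleNamespace(
    function_name="demo-function",
    aws_request_id="local-test",
)
result = handler({"name": "world"}, fake_context)
```

Everything else in a project (README, requirements.txt, template.yaml) is organization around this one function.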
It just has to be basically a Python dictionary, or, you know, you could think of it like JSON, right? So it's not predefined, and Lambda itself doesn't care what the event is. That's all up to your code: to decide what is a valid event and how to act on it. With API Gateway, if you're using that, there are a lot of example events that API Gateway will send, and if you ever look at the test events for Lambda, you'll see a lot of templates, which are just JSON files with the expected inputs. But really it can be anything.

Nate: [00:30:41] So the way that Lambda's structured is that API Gateway will typically pass in an event that's maybe like, the request was a POST request, and then it has these query parameters or headers attached to it. And all of that would be within the request object. But the event could also be, like you mentioned, a CloudWatch event that could come in. So you basically just have to configure your handler to handle any of the events you expect that handler to receive.

Steph: [00:31:07] Yeah, exactly.

Nate: [00:31:09] So let's talk a little bit more about the development tooling. How in the world do you test these sorts of things? Do you have to deploy every single time? Tell us about the development tooling that you use to test along the way.

Steph: [00:31:22] Yeah. So, um, one of the great things about SAM, and there are some other tools for this as well, is that it lets you test your Lambdas locally before you deploy, if you want. And the way that it does that is, I mentioned earlier that Lambda is really, at its core, a container, like a Docker container running on a server somewhere. So it just creates a Docker container that behaves exactly like a Lambda would, and it sends your events.
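The kind of API Gateway event Nate describes, a POST with query parameters and headers, can be handled along these lines. This is a sketch that assumes the REST API proxy integration shape, where the event carries httpMethod, queryStringParameters, and headers, and the function returns a statusCode and a string body. The test event is trimmed to a few fields and its values are invented.

```python
import json


def handler(event, context):
    """Respond to an API Gateway proxy-style event."""
    method = event.get("httpMethod")
    # queryStringParameters is null when the request has no query string.
    params = event.get("queryStringParameters") or {}

    if method != "POST":
        return {
            "statusCode": 405,
            "body": json.dumps({"error": "method not allowed"}),
        }

    name = params.get("name", "anonymous")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"hello": name}),
    }


# A hand-written test event, the sort of thing that could be saved as a
# JSON file in the repo and replayed against the function.
test_event = {
    "httpMethod": "POST",
    "queryStringParameters": {"name": "amelia"},
    "headers": {"Content-Type": "application/json"},
    "body": None,
}
response = handler(test_event, None)
```

Because the event is only a dictionary, calling the handler directly like this is a quick first check before running it in a Lambda-like container.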
So you would just define basically a JSON with the expected data from either API Gateway or whatever, right? You make a test one, and then it will send it to that. It'll build it on demand for you, and you test it all locally with Docker. When you like it, you use the same tool and it'll package it up and deploy it for you. So yeah, it's actually not too bad to test locally at all.

Nate: [00:32:05] So you create JSON files of the events that you want it to handle, and then you just invoke it with those particular events.

Steph: [00:32:12] Yeah, so basically, if I created a test event, I would save it to my repo as tests/api-gateway-event.json and put in the data I expect. And then the command is sam local invoke, and I would give it the file path to the JSON, and it would process it. I'd see the exact same output that I would expect to see from Lambda. So it'll say, hey, this took this many milliseconds to invoke, the response code was this, this is what was printed. So it's really useful. It's almost one to one with what you would get from Amazon Lambda output.

Amelia: [00:32:50] And then to update your Lambda functions, do you have to go inside the AWS GUI, or can you do that from the command line?

Steph: [00:32:57] Yeah, no, you can do that from the command line with SAM as well. So there's a sam package and a sam deploy command. It's useful if you need to use basically any type of CI testing service to manage your deployments or anything like that. Then you can package it and send the package to whatever you're using, like GitLab or something, right, for further validation, and then have GitLab deploy it. Like if you don't want people to have deploy credentials on their local machine, that's the reason it's broken up into two steps there. But basically you just run a command, sam deploy, and what it does is it goes out to Amazon.
It says, hey, update the Lambda to point to this as the new resource artifact to be invoked. And if you're using the versioning feature, which I think is enabled by default, it actually just adds another version of the Lambda, so that if you need to roll back, you can just go to the previous one, which is really useful sometimes.

Nate: [00:33:54] So let's talk a little bit about deployment. One of the things that I think is stressful when you are deploying Lambda functions is, I have no idea how much it's going to cost. How much is it going to cost to launch something, and how much am I going to pay? I guess maybe you can calculate it if you estimate the number of requests you think you're going to get, but how do you approach that when you're writing a new function?

Steph: [00:34:18] Yeah, so the first thing I look at is, what's the minimum timeout, basically, and what's the minimum memory usage? The number of invocations is a big factor too, right? So if you have free tier, I think it's like a million invocations you get, but that's assuming under a hundred milliseconds each. When you just deploy it, there's no cost for just deploying it. You don't get charged until it's invoked. If you're storing an artifact in S3, there's a little cost for keeping it in S3, but it's usually really, really minimal. So the big thing is really, how many times are you invoking it? Is it over a million times, or are you not on free tier? The costs, like I said, get batched together, and it's actually really pretty cheap just in terms of number of invocations. The bigger place where you can normally save costs is: is it over-provisioned for how much memory you give it? Right. I think the smallest unit you can give it is 128 megabytes, and it can go up to like two gigabytes, maybe more now.
So if you have it set where, oh, I want it to use this much memory, and it's really never going to use that much memory, that's kind of wasteful. Or, you know, if it does use that much, something's wrong.

Nate: [00:35:25] Because you configure beforehand, like, we're going to use a max of 128 megabytes of memory, and then it's allocated on invocation or something like that. And then if you set it too high, you're going to pay more than you need to. Is that right?

Steph: [00:35:40] Yeah. Well, I think, and I'll have to double check, it actually just shows you how much memory you used each time a Lambda is invoked. So you can sort of measure if it's getting near that, or if you think you need more. It might give an error if it isn't able to complete. But in general, I haven't had many cases where the memory has been the limiting factor. I will say that the timeout can sometimes get you, because a Lambda could be processing forever. Like with API Gateway, a lot of times API Gateway has its own sort of timeout, which I think is like 30 seconds to respond. And say your Lambda is set so that you give it five minutes to process. If you programmed something wrong and there's a loop somewhere and it's going on forever, it'll waste five minutes computing. API Gateway will give up after 30 seconds, but you'll still be charged for the five minutes that the Lambda was doing its thing.

Nate: [00:36:29] So, it's like, I know that AWS's services and Lambda are created by world-class engineers. It's probably the highest performing infrastructure in the world. But as a user, sometimes it feels like this giant Rube Goldberg machine, and I have like no idea.
All of the different aspects that are involved in it, like, how do you manage that complexity? When you're trying to learn AWS, let's say someone who's listening to this wants to try to understand it, how do you go about making sense of all of that difficulty?

Steph: [00:37:02] You really have to go through a lot of the docs. Videos of people showing you how they did something aren't always the best, just because they kind of skirt around all the things that went wrong in the process, right? So it's really important just to look at the documentation for what all these features are before you use them. The marketing people make it sound like it's super easy to get going, and to a degree, it really is. It's easier than the alternative, right? It's where you put your complexity that's the problem.

Nate: [00:37:29] Yeah, and I think that part of the problem that I have with their docs is that they are trying to give you every possible path, because they're an infrastructure provider, and so they support these very complex use cases. And so it's like the world's most detailed choose your own adventure. It's like, oh, if you decide that you need to take this path, go here, or this one, path B, path C. There are so many different paths you can go down.
It's just a lot when you're first learning.
Steph: [00:37:58] It is, and sometimes the blog posts have better actual tutorial kinds of things for real use cases. So if you have a use case that is general enough, a lot of times you can just Google for it, and there'll be something that one of their solutions architects wrote up about how to actually do it from a, you know, user-friendly perspective. The thing with the options is that you need to be aware of them too, just because the way that they interact can be really important if you ever do something that's not been done before. And the reason why it's so powerful, and, you know, why it takes all these super smart people to set up and all this stuff, is actually because there are just so many variables that go into it, and you can do so much with it, that it's so easy to shoot yourself in the foot. It always has been, in a way, right? But it's just learning how to not shoot yourself in the foot and use it with the right agility. And once you get that down, it's really great.
Amelia: [00:38:46] So there are over a hundred AWS services. How do you personally find new services that you want to try out? Or how does anyone discover any of these different services?
Steph: [00:38:57] What I do is, you know, I get the emails from AWS whenever they release new ones, and I try to keep up to date with that. Sometimes I'll read blog posts that I see people writing about how they're using some of them. But honestly, a lot of it's just based on when I'm doing something, I keep an eye out for whether there's something I wished it did. Like, I use AWS Systems Manager a lot, which, you can think of it as sort of a config management and orchestration tool. Basically, it's a little agent.
You can install it on servers, and you can, you know, just automate patching and all this other little stuff that you would do with Chef or Puppet or other config management tools. And it seems like they keep announcing services that are really just tie-ins to existing ones, right? Which is like, oh, this one adds, you know, for instance, secrets management, and the Parameter Store with secrets. A lot of them are really just integrations with other AWS services, so it's not as much. The really core ones that everyone needs to know are, you know, EC2, of course; Lambda, which is so big; API Gateway; and CloudFormation, because it's basically the infrastructure-as-code format that is super useful just for structuring everything. And I guess S3 is the other one. Yeah.
Nate: [00:40:15] Let's talk about CloudFormation for a second. So earlier you said your Lambda function is typically going to have a template.yaml. Is that template.yaml CloudFormation code?
Steph: [00:40:26] So at its core, yes. But the way you write it is different. How it works is that the SAM templating language is defined to simplify what you would do with CloudFormation. In CloudFormation you have to put a gazillion variables in. And there are some ways to make that easier. Like, I really like using a Python library called Troposphere, where you can actually use Python to generate your CloudFormation templates for you. And it's really nice just because, you know, I'll need a loop for something, or I'll need to fetch a value from somewhere else, and it's great to have that kind of flexibility with it. The SAM template is specifically a transform, is what they call it, of CloudFormation, which means that it executes against the CloudFormation service. So the CloudFormation service receives that, kind of turns it into the core that it understands, and executes on it. So at the core of it, it is executing on top of CloudFormation.
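As a rough illustration of the transform idea Steph describes, the sketch below builds a tiny SAM template as a plain Python dictionary and prints it as JSON, a format CloudFormation also accepts. It uses only the standard library, not Troposphere. The resource name `HelloFunction`, the handler `app.handler`, the `src/` code path, the runtime string, and the `/hello` route are all hypothetical placeholders; the memory and timeout numbers simply echo figures mentioned earlier in the conversation.

```python
import json

# Hypothetical minimal SAM template, written as a Python dict.
# The "Transform" line is what marks this as a SAM template, so the
# CloudFormation service expands the serverless shorthand before deploying.
sam_template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Transform": "AWS::Serverless-2016-10-31",
    "Resources": {
        "HelloFunction": {  # assumed name, for illustration only
            "Type": "AWS::Serverless::Function",
            "Properties": {
                "Handler": "app.handler",   # assumed module.function
                "Runtime": "python3.12",    # assumed runtime
                "CodeUri": "src/",          # assumed code location
                "MemorySize": 128,          # megabytes
                "Timeout": 30,              # seconds
                "Events": {
                    "HelloApi": {
                        "Type": "Api",
                        "Properties": {"Path": "/hello", "Method": "get"},
                    }
                },
            },
        }
    },
}


def render(template):
    """Serialize the template dict to an indented JSON string."""
    return json.dumps(template, indent=2)


if __name__ == "__main__":
    print(render(sam_template))
```

Because the template is ordinary Python data, loops and looked-up values can be used while building it, which is the kind of flexibility mentioned above.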
You could usually create a mostly equivalent CloudFormation template, but there'd be more to it. And there are a lot of reasons why you would want to use SAM for serverless specifically, just because they add so many niceties around, you know, permissions management that you don't have to think about as much, and shortcuts, and it's just a lot easier to deal with, which is a nice change. But the power of CloudFormation is that if you wanted to do something that maybe SAM didn't support, that's outside the normal scope, you could just stick a CloudFormation resource definition in it, and it would work the same way, because it's running against it. It's one of those services that sometimes gets a bad rap because it's so complicated, but it's also super stable. It behaves in a super predictable way, and I think learning how to use that when I worked at AWS was really valuable.
Nate: [00:42:08] What tools do you use to manage security when you're configuring these things? So earlier you mentioned IAM, which is a... I don't know what it stands for.
Steph: [00:42:19] Identity and Access Management.
Nate: [00:42:20] Right. Which is like a configuration language, or configuration, where we can configure which accounts have access to certain resources. Let me give you an example. One question I have is, how do you make sure each system has the minimum level of permissions, and what tools do you use? So for example, I wrote this Lambda function a couple of weeks ago. I was just following some tutorial, and they said, yeah, make sure that you create this IAM role as one of the resources for launching this Lambda function. And I think, okay, that's great. But then, how do I pin down the permissions when I'm granting that function itself permission to grant new IAM roles? So I basically just had to give it root permission, at my skill level, because otherwise I wasn't able to
create IAM roles without the authority to create new roles, which just seems like root permissions.
Steph: [00:43:13] Yes. So, in some ways that's super risky, honestly, like super risky.
Nate: [00:43:17] Yeah. I'm going to need your help.
Steph: [00:43:19] But there are cases where you can limit it down with the right kind of definition. So IAM, it's really powerful, right? The original use case behind IAM roles was for servers, so that if you had an application server and a database server separately, you could give them separate IAM roles so that they could do different things. Like, you never really want your database server to interface directly with, you know, an S3 resource, but maybe you want your application server to do that or something. So it was nice because it really let you limit down the scope for servers, and you don't have to leave keys around if you do it that way. You don't have to keep keys anywhere on the server if you're using IAM roles to access that stuff. So anytime you're storing an AWS secret key on a server, or in a Lambda, you kind of did something wrong there, just because AWS doesn't really care about those keys. It just looks: is it a key? Then do it. But when you actually use IAM policies, you can say it has to be from this role, it has to be executed from, you know, this service. So it can make sure: is it Lambda that's the one doing this, or is it somebody trying to assume Lambda credentials? Right? There's so much you can do to limit it with IAM. So it was really good to learn about that. And all of the AWS certifications do focus on IAM specifically.
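One concrete form of "it has to be executed from this service" is a role's trust policy, which names who is allowed to assume the role. The sketch below writes such a policy as a Python dictionary and adds a small, purely illustrative helper, `allowed_services`, that lists the service principals a policy permits. The helper is a hypothetical convenience for reading the policy, not part of any AWS library.

```python
import json


def lambda_trust_policy():
    """Trust policy that lets only the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def allowed_services(policy):
    """Hypothetical helper: collect service principals from Allow statements."""
    services = []
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        service = statement.get("Principal", {}).get("Service", [])
        # "Service" may be a single string or a list of strings.
        services.extend([service] if isinstance(service, str) else service)
    return services


if __name__ == "__main__":
    policy = lambda_trust_policy()
    print(json.dumps(policy, indent=2))
    print(allowed_services(policy))
```

With a role set up this way, code running in the function gets temporary credentials from the role, so no long-lived secret key needs to be stored alongside it.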
So if anyone is thinking about taking an AWS certification course, a lot of them will introduce you to that and help a lot with understanding how to use those correctly. But for what you talked about, like, how do you deal with a function that creates a role for another function, right? What you would do in that kind of case is, there's an idea of IAM paths. So basically you can use them as namespacing for IAM permissions, right? So you can make an IAM role that can grant functions the ability to create roles only underneath its own namespace, within its own path.
Nate: [00:45:20] When you say namespaces, I mean, do they inherit the permissions that the parent permission has?
Steph: [00:45:28] It depends. So it doesn't inherit by itself. But let's say that I was making a build server, and on my build server we had to use a couple of different roles for different pieces of it, for different steps, because they used different services or something. So we would give them the top-level path of build. And then in my S3 bucket, I might say allow upload for anyone whose path had build in it. So that's the idea: you can limit, on the other side, what is allowed. And of course, it's one of those things where you want to, by default, blacklist as much as possible, and then whitelist what you can. But in reality it can be very hard to go through some of that stuff. So you just have to try, wherever you can, to minimize the risk potential and understand what's the worst case that could happen if someone came in and was able to use these credentials for something.
Amelia: [00:46:16] What are some of the other common things that people do wrong when they're new to AWS or DevOps?
Steph: [00:46:22] One thing I see a lot is people treating the environment variables for Lambdas as if they were private, like secrets.
So they think that if you put an API key in through an environment variable, that's kind of secure. But really, like, when I worked in AWS support, anyone would be able to see that if they were helping you out in your account. So it's not really a secure way to do that. You would need to use a service like Secrets Manager, or you'd have some kind of way where you would encrypt it before you put it in, and then the Lambda would decrypt it, right? So there are ways to get around that. But using environment variables as if they were secure, or storing secure things within your Git repositories that get pushed to AWS, is a really big thing that should be avoided. And we said, what else did you ever own?
Nate: [00:47:08] I'm pretty sure that I put an API key in mine before.
Steph: [00:47:11] So yeah, no, it's one of the things people do, and it's one of those things where, for a lot of people, you know, maybe nothing will go wrong and it's fine. But if you can just reduce the scope, then you don't have to worry about it, and it just makes things easier in the future.
Amelia: [00:47:27] What are the new hot things that are up and coming?
Steph: [00:47:30] So I'd say that there are more and more uses for Lambda at the edge for IoT integration, which is pretty cool. So basically, Lambda at the edge is where you can run your Lambdas on computers that are, you know, just think of it as like a Raspberry Pi. It's that kind of thing, right? So you could take a small computer and put it, you know, maybe where it doesn't have a completely consistent internet connection. So maybe if you're doing a smart vending machine or something. Think of it like that.
Then you could actually execute the Lambda logic directly there, and deploy it there and manage it from AWS whenever it does have a network connection. And basically, it just reduces latency a lot, and it lets you test your code locally and then deploy it out. So it's really cool for IoT stuff. There's been tons of stuff happening in the machine learning space on AWS, too much for me to even keep on top of. But a lot of the stuff around Alexa voices is super cool, like Polly. If you've played with an Alexa-type thing before, it's cool, but you could just write a little Lambda program to actually generate, you know, whatever you want it to say in different accents, different voices, on demand, and integrate it with your own thing, which is pretty cool. I mean, I haven't had a super great use case for that yet, but it's fun to play with.
Amelia: [00:48:48] I feel like a lot of the Internet of Things are like that.
Steph: [00:48:52] Oh, they totally are. They really are. But yeah, it's just one of the things you have to keep an eye out for.
Sometimes the things that excite me, because I'm dealing so much with enterprisey kind of stuff, are not really exciting to other people, because it's like, yay, patching has a way to lock it down to a specific version of this at this time. You know, it's not really exciting, but, yeah.
Nate: [00:49:14] And I think that's one of the things that's interesting talking to you. Like, I write web apps, so I think of serverless from a web app perspective: oh, I'm going to write an API that will, I don't know, fix my images on the way up or something. But a lot of the uses that you alluded to are using serverless for managing other parts of your infrastructure. Like, you've got a monitor on some EC2 instance that sends out a CloudWatch alert, and then something responds in some other way within your infrastructure. So that's really interesting.
Steph: [00:49:47] Yeah, no, it's just been really valuable for us. And like I said, I mentioned the IAM stuff. That's what makes it all possible, really.
Amelia: [00:49:52] So this is totally unrelated, but I'm always curious how people got into DevOps, because I do a lot of front-end development, and I feel like it's pretty easy to get into front-end web development, because a lot of people need websites. It's fairly easy to create a small website, so that's a really good gateway. But I've never, like, on the weekend wanted to spin up a server or any of this.
Steph: [00:50:19] Honestly, for me, a lot of it was my first job in college. I was basically part-time tech support slash sysadmin. And I always loved Linux. The reason I got into Linux in the first place is that I realized, when I was in high school, that I could get around a lot of the school's, you know, spy software that won't let you do fun stuff on the internet or with the software, if you just used a live-boot Linux USB. So part of it was just, I was using it.
So, you know, to get around stuff, just out of curiosity about that kind of stuff. But when I got my first job, that sysadmin type thing, it kind of became a necessity. Because, you know, when you have limited resources, it was me and another part-time person and one full-time person, and hundreds of people for whom we had to keep email and everything working. It becomes a necessity thing, because you realize that with all the stuff you had to do by hand back then, you can't keep track of it all. You can't keep it all secured with a few people. It's extremely hard. And so one way people dealt with that was, you know, offshoring or hiring other people to maintain it. But it was kind of cool at the time to realize that there's no reason I couldn't use the same stuff I was learning in my CS program about programming for my job, which was support and admin stuff. So I think I got introduced to Chef. That was the first tool where I was really like, wow, this changes everything. You know, because you would write little Ruby files to do configuration management, and then on your servers you'd run the Chef agent, and, you know, they'd all be configured exactly the same way. And it was testable. And there was all this really cool stuff you could do with Chef that, you know, I had been trying to do with, like, Bash scripts or just normal Python scripts. But then Chef kind of gave me that framework for it. And I got a job at AWS where one of the main components was supporting their AWS OpsWorks tool, which was basically managed Chef deployments. And so that was cool, because then I learned about how that works at super high scale, and what other things people use. And right before I actually, you know, got my first job as a full-time DevOps person was when they were releasing the beta for Lambda.
So I was in the little private beta for AWS employees, and we were all kind of just like, wow, this changes a lot. It'll make our jobs a lot easier. You know, in a way, it will reduce the need for some of it. But we were so overloaded all the time. And I feel like a lot of people from a  perspective know what it feels like: there's so much going on, and you can't keep track of it all, and you're overloaded all the time, and you just want it to be clean and not have to touch it and to do less work. DevOps was kind of the way forward. So that's really how I got into it.
Amelia: [00:52:54] That's awesome. Another thing I keep hearing is that a lot of DevOps tasks are slowly being automated. So how does the future of DevOps look if a lot of the things that we're doing by hand now will be automated in the future?
Steph: [00:53:09] Well, see, the thing about DevOps is, really, it's more of a goal. It's an ideal. A lot of people, if they're DevOps purists, will tell you that it means having a culture where there are no silos between developers and operations, and everyone knows how to deploy, and everyone knows how to do everything. But in reality, not everyone's a generalist. And being a generalist, in some ways, is kind of its own specialty, which is how I feel about the DevOps role that you see. So I think we'll see that with the DevOps role, people might go by different names for the same idea, which is basically reliability engineering. Like, Google has a whole book about site reliability engineering, which is the same kind of philosophy, right? You want to keep things running. You want to know where things are. You want to make things efficient at the infrastructure level. But the way that you do it is you use a lot of the same tools that developers use.
So I think that we'll see titles shift. Serverless architect is a big one that's coming up, and reliability engineering is big. And we may not see people say DevOps is their role as much, but I don't think the need for people who specialize in infrastructure and deployment and that kind of thing is going to go away. You might have to do more with less, right? Or there might be certain companies that just hire a bunch of them, like Google and Amazon, right? There are probably still going to be a lot of those people, but maybe they're not going to be working at your local place, because they're going to be working for the big players who actually develop the tools that are used for that. So I still think it's a great field. It might be getting a little harder to figure out where to enter, because there's so much competition and attention around the tools and resources that people use, but it's still a really great field overall. And if you just learn, you know, serverless or Kubernetes or something that's big right now, you can start to branch out, and it's still a really good place to make a career.
Nate: [00:54:59] Yeah. Kubernetes. Oh man, that's a whole nother podcast. We'll have to come back for that.
Steph: [00:55:02] Oh, it is. It is.
Nate: [00:55:04] So, Steph, tell us more about where we can learn more about you.
Steph: [00:55:07] Yeah. So I have a book coming out.
Nate: [00:55:10] Yes. Let's talk about the book.
Steph: [00:55:12] Yeah. So I'm releasing a book called Fullstack Serverless... See, I'm terrible. I should know exactly what the title is.
Nate: [00:55:18] I don't know exactly the title. Yeah. Fullstack Python with Serverless, or Fullstack Serverless with Python?
Steph: [00:55:27] Fullstack Python on Lambda.
Nate: [00:55:29] Oh yeah. Lambda. Not serverless.
Steph: [00:55:31] Yeah, that's correct. Python on Lambda. Right.
And that book really could take you from start to finish, to really understand it. I think if I had read this kind of book before learning it, it wouldn't have felt so, maybe, as some people find it, confusing, or like it's a black box where you don't know what's happening. Because really, at its core, with Lambda you can understand exactly everything that happens. It has a reason. You know, it's running on infrastructure that's not too different from people who run infrastructure on Docker or something, right? And the code that you write can be the same code that you might run on a server or on some other cloud provider. So the real things that I think the book has that may be kind of hard to find elsewhere: there's a lot of information about how you do proper testing and deployment, and how you manage your secrets correctly, so you aren't storing them in those environment variables. It has stuff about logging and monitoring, and all the different ways that you can trigger Lambda. So API Gateway, you know, that's a big one, but then I mentioned S3 and all those other ones. There are going to be examples of pretty much every way you can do that in that book. Stuff about optimizing cost and performance, and stuff about using SAM and the Serverless Application Repository, so you can actually publish Lambdas and share them and even sell them if you want to. So it's really start to finish, everything you need if you want to have something that you create from scratch in production. I don't think there's anything left out that you would need to know. I feel pretty confident about that.
Nate: [00:57:04] It's great. I think one of the things I love about it is that it's almost like the anti-version of the docs. Like we talked about earlier, the docs cover every possible use case. This talks about very specific production use cases in a very approachable, linear way.
You know, even though you can maybe find some tutorials online, like you mentioned, they're not always accurate in terms of how you actually do it or should do it. And so, yeah, I think your book so far has been really great at covering these production concerns in a linear way. All right. Well, Steph, it was great to have you.
Steph: [00:57:37] Thank you for having me. It was great talking to you both.

cloudonaut
#15 Advanced AWS Networking

cloudonaut

Play Episode Listen Later Mar 16, 2020 53:07


AWS offers shiny and powerful networking services. However, you should know about the pitfalls when designing advanced networking architectures for AWS. I will share some pitfalls that came to my attention when consulting clients to get the most out of AWS. You will learn how to answer the following questions: VPC Peering or Transit Gateway? NAT Gateway or Public Subnet? VPC Endpoints or NAT Gateway? CloudFront or Akamai, Cloudflare, Fastly ...? Route 53 Resolver or Public Hosted Zone?

cloudonaut
#7 How we run our blog cloudonaut.io

cloudonaut

Play Episode Listen Later Oct 29, 2019 35:30


We love simplicity! Our blog runs on CloudFront and S3, which is maintenance-free and handles traffic spikes easily. We use the static website generator Hexo to publish our content. Lambda@Edge handles redirects and generates optimized images on the fly. Instead of Google Analytics, we are using Athena and QuickSight to get statistics about our blog and posts.